How does sorting data improve performance and reduce the execution time of code? Here is an example.

When you are working with loops that process data, do you think sorting the data first can improve performance, that is, reduce the execution time of a program?

Yes. Sorting the data can improve the performance of a program. Have a look at the example below in Java.

import java.util.Arrays;
import java.util.Random;

public class Demo {
    public static void main(String[] args) {
        int size = 30000;
        int[] randomData = new int[size];
        Random r = new Random(0);
        // loop 1: fill the array with random values in the range -255..255
        for (int i = 0; i < size; i++) {
            randomData[i] = r.nextInt() % 256;
        }

        // Arrays.sort(randomData);

        long startTime = System.nanoTime();
        long totalValue = 0;
        // loop 2: repeatedly sum every value greater than 128
        for (int i = 0; i < 100000; i++) {
            for (int j = 0; j < randomData.length; j++) {
                if (randomData[j] > 128)
                    totalValue += randomData[j];
            }
        }
        long endTime = System.nanoTime();

        System.out.println("Execution Time : " + ((endTime - startTime) / 1000000000.0) + " sec");
        System.out.println("Total Value : " + totalValue);
    }
}

Running it with the sort line commented out, I got the following output:

Execution Time : 8.470835679 sec
Total Value : 141904900000

Now, uncomment the line "Arrays.sort(randomData);" in the above code and execute it once again. I got the following result when I ran it:

Execution Time : 1.612308632 sec
Total Value : 141904900000


What happened in the above code?

  1. Loop 1 assigned random data to an array of size 30000.
  2. Loop 2 calculated the sum of all the numbers greater than 128 in the random data, 100,000 times. Total iteration count: 100000 * 30000 = 3 billion.
  3. The execution time of loop 2 without sorting was about 8.4 sec, whereas with sorting it was about 1.6 sec.

How did sorting improve the performance?

Loop 2 has a conditional statement that checks whether the number is greater than 128. This if branch is executed 3 billion times in the loop. Every modern processor includes a branch predictor, which guesses the outcome of a branch based on its history and speculatively executes instructions accordingly. With random data, the branch outcome is essentially unpredictable, so the predictor guesses wrong often and the processor repeatedly has to discard speculative work. Once the data is sorted, the branch outcome becomes highly predictable (a long run of "false" followed by a long run of "true"), mispredictions become rare, and processing speeds up, which ultimately reduces the execution time.
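Another way to make the timing independent of the data's order is to remove the branch entirely. The sketch below (not from the original code; the method name branchlessSum is my own) replaces the if-branch with bit arithmetic: (v - 129) >> 31 is all ones when v <= 128 and zero otherwise, so inverting it gives a mask that keeps only the values greater than 128.

```java
public class BranchlessDemo {
    // Adds each element to the total only when it is > 128,
    // without any conditional branch inside the loop.
    static long branchlessSum(int[] data) {
        long total = 0;
        for (int v : data) {
            // (v - 129) >> 31 is -1 (all one bits) when v <= 128, else 0;
            // inverting yields a mask that is -1 exactly when v > 128.
            int mask = ~((v - 129) >> 31);
            total += v & mask;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] sample = {200, 50, 129, 128, 255};
        // Only 200, 129 and 255 pass the > 128 test: 200 + 129 + 255 = 584
        System.out.println(branchlessSum(sample));
    }
}
```

Because there is no branch to predict, this version runs at the same speed on sorted and unsorted input, at the cost of doing the mask arithmetic on every element.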

That's it. Now you know how to make use of branch prediction for better performance. If you have any experience with it, or any feedback or comments, please share them by commenting below.

