Problem 1

From CSI 702

Jump to: navigation, search

Contents

  1. Problem 1
    1. Description
    2. Expectation
    3. Code
    4. Comments
    5. Contributors

1. Problem 1

1.1. Description

How do I make this code run faster?

 do i = 1, 1000
   if (i < 100) then
     a(i) = 10
   else
     a(i) = 20
   endif
 enddo

1.2. Expectation

The expectation is that if the if statement were to be removed from the loop, the code should run a bit faster. Please note that branches in loops break vector pipelines. This can be done by breaking the original code into two loop statements. In doing so, we can eliminate the need for the branching logic. It is not expected that this change will improve the order of the method from O(C*n).

1.3. Code

Original code in Fortran:

 program hw1p1v1
 integer j
 do j = 1, 50000
   call algorithm
 enddo
 end
 subroutine algorithm()
 integer i, a(1000)
 do i = 1, 1000
   if (i < 100) then
     a(i) = 10
   else
     a(i) = 20
   endif
 enddo
 end

can also be written as

 subroutine algorithm()
 integer i, a(1000)
 if (i < 100) then
   do i = 1, 1000
      a(i) = 10
   enddo
 else
   do i = 1, 1000
      a(i) = 20
   enddo
 endif
 end

In the above subroutine, the if test is evaluated only once.

Breaking the original loop into two loops to remove the if statement in the subroutine gives:

 subroutine algorithm2()
 integer i, a(1000)
 do i = 1, 99
   a(i) = 10
 enddo
 do i = 100, 1000
   a(i) = 20
 enddo
 end

Original version in C:

 #include <stdlib.h>
 int allocatemem(int n,int **a){
   (*a) = (int*)malloc(n*sizeof(int));
   return 0;
 }
 int makeloop(int n, int m, int* a){
   int i;
   for (i=0;i<n;i++){
      if (i < m)
        a[i] = 10;
      else 
        a[i] = 20;
   }
 return 0;
 } 
 int main(int argc,char *argv[]){
   int n=atoi(argv[1]);
   int m=n/10;
   int *a;
   allocatemem(n,&a); 
   makeloop(n,m,a); 
   free(a);
   return 0;
 }

Modified version in C:

 #include <stdlib.h>
 int allocatemem(int n,int **a){
   (*a) = (int*)malloc(n*sizeof(int));
    return 0;
 }
 int makeloop(int n, int m, int* a){
   int i;
   for (i=0;i<m;i++)
      a[i] = 10;
   for (i=m;i<n;i++)
      a[i] = 20;
     return 0;
 } 
 int main(int argc, char *argv[]){
   int n=atoi(argv[1]);
   int m=n/10;
   int *a;
   allocatemem(n,&a); 
   makeloop(n,m,a); 
   free(a);
   return 0; 
 }

1.4. Comments

Image:P1.png

The figure shows the result timing vs. the cycles quantity n of each version of the program. Time is the output of gprof for the makeloop subroutine. Both version, original and modified, were compiled with the flags -O0 and -O2. It should also be noted that execution times varied wildly depending on the number of executions of the subroutine. This is presumably due to caching from the previous run.

1.5. Contributors

H. Le

Marcelo Raschi

Zach Firth

Ash Agarwal

Personal tools