parallel processing - Mandelbrot optimization in OpenMP


Well, I have parallelized a Mandelbrot program in C. I think I have done it correctly, but I can't get better times. My question is whether anyone has an idea for improving the code. I've been thinking about nested parallel regions between the outer and the inner for loops...
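As a point of comparison for the nested-region idea, OpenMP's `collapse` clause merges perfectly nested loops into a single iteration space without opening a second parallel region. The following is only a minimal sketch: `fill_grid` and the chunk size of 64 are hypothetical and stand in for the pixel loops of the post.

```c
#include <stddef.h>

/* Hypothetical helper (not from the post): each cell receives its own
   linear index, standing in for the per-pixel computation. */
void fill_grid(int *grid, int width, int height)
{
    /* collapse(2) turns the two loops into one space of width*height
       iterations, which is then handed out in chunks of 64. */
    #pragma omp parallel for collapse(2) schedule(dynamic, 64)
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            grid[(size_t)y * width + x] = y * width + x;
}
```

Whether this beats parallelizing only the outer loop depends on the image size and the machine, so it is worth measuring both.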

I also have doubts about whether it is more elegant or recommended to put all the pragmas on a single line, or to write separate pragmas (one `omp parallel` with the shared and private variables and the conditional, and one `pragma omp for` with `schedule(dynamic)`).

I also wonder whether constants can be used as private variables, because I think it is cleaner to have constants instead of defined variables.

I have also written a conditional (`if numcpu > 1`): with a single CPU it makes no sense to use a parallel region, so the program should fall back to a normal sequential execution.
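For reference, the conditional is written as an `if(...)` clause on the directive itself, with the parentheses around the expression rather than around the keyword. A minimal sketch, assuming a hypothetical `sum_range` function in place of the Mandelbrot loop:

```c
/* Hypothetical example (not from the post): sums 0..n-1. The loop runs
   in parallel only when numcpu > 1; otherwise one thread executes it. */
long sum_range(int n, int numcpu)
{
    long total = 0;
    /* Combined "parallel for" directive carrying an if clause. */
    #pragma omp parallel for reduction(+:total) if(numcpu > 1)
    for (int i = 0; i < n; i++)
        total += i;
    return total;
}
```

The result is the same in both cases; only the number of threads executing the loop changes.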

Finally, I have read that the best dynamic chunk size depends on the hardware and the system configuration... I have left it as a constant, so it can be changed.
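One way to experiment with the chunk size without recompiling is `schedule(runtime)`, which takes the schedule kind and chunk from the `OMP_SCHEDULE` environment variable. This is a sketch under assumptions: `uneven_work` is a hypothetical workload whose cost grows with the iteration index, loosely imitating rows of differing cost.

```c
/* Hypothetical workload (not from the post): iteration i costs i steps,
   so iterations are deliberately unbalanced. */
long uneven_work(int n)
{
    long total = 0;
    /* Schedule kind and chunk size are read from OMP_SCHEDULE at run time. */
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < i; k++)
            total += 1;
    }
    return total;
}
```

A program built from this could then be started as, for example, `OMP_SCHEDULE="dynamic,4" ./a.out` and again with a different chunk to compare timings.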

I also adapt the number of threads to the number of processors available.

    int main(int argc, char *argv[])
    {
        omp_set_dynamic(1);

        int xactual, yactual;
        // each iteration calculates: newz = oldz*oldz + p, where p is the current pixel and oldz starts at the origin
        double pr, pi;                            // real and imaginary part of the pixel p
        double newre, newim, oldre, oldim;        // real and imaginary parts of new and old z
        double zoom = 1, movex = -0.5, movey = 0; // you can change these to zoom and change position

        pixel_t *pixels = malloc(sizeof(pixel_t)*imageheight*imagewidth);
        clock_t begin, end;
        double time_spent;

        // note: clock() reports processor time, which on many systems is summed over all threads
        begin = clock();

        int numcpu;
        numcpu = omp_get_num_procs();

        //FILE *fp;
        printf("The number of processors we will use is: %d\n", numcpu);

        omp_set_num_threads(numcpu);

        // oldre and oldim are written by every thread, so they have to be private as well
        #pragma omp parallel shared(pixels, movex, movey, zoom) private(xactual, yactual, pr, pi, newre, newim, oldre, oldim) if(numcpu > 1)
        {
            //int xactual=0;
            //int yactual=0;
            #pragma omp for schedule(dynamic, chunk)
            // loop through every pixel
            for(yactual = 0; yactual < imageheight; yactual++)
                for(xactual = 0; xactual < imagewidth; xactual++)
                {
                    // calculate the initial real and imaginary part of z, based on the pixel location and the zoom and position values
                    pr = 1.5 * (xactual - imagewidth / 2) / (0.5 * zoom * imagewidth) + movex;
                    pi = (yactual - imageheight / 2) / (0.5 * zoom * imageheight) + movey;
                    newre = newim = oldre = oldim = 0; // these should start at 0,0
                    // "i" will represent the number of iterations
                    int i;
                    // start the iteration process
                    for(i = 0; i < iterations; i++)
                    {
                        // remember the value of the previous iteration
                        oldre = newre;
                        oldim = newim;
                        // the actual iteration: the real and imaginary part are calculated
                        newre = oldre * oldre - oldim * oldim + pr;
                        newim = 2 * oldre * oldim + pi;
                        // if the point is outside the circle with radius 2: stop
                        if((newre * newre + newim * newim) > 4) break;
                    }

                    // color(i % 256, 255, 255 * (i < maxiterations));
                    if(i == iterations)
                    {
                        // color(0, 0, 0); // black
                        pixels[yactual*imagewidth+xactual][0] = 0;
                        pixels[yactual*imagewidth+xactual][1] = 0;
                        pixels[yactual*imagewidth+xactual][2] = 0;
                    }
                    else
                    {
                        double z = sqrt(newre * newre + newim * newim);
                        int brightness = 256 * log2(1.75 + i - log2(log2(z))) / log2((double)iterations);

                        // color(brightness, brightness, 255)
                        pixels[yactual*imagewidth+xactual][0] = brightness;
                        pixels[yactual*imagewidth+xactual][1] = brightness;
                        pixels[yactual*imagewidth+xactual][2] = 255;
                    }
                }
        } // end of parallel region

        end = clock();

        time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
        fprintf(stderr, "elapsed time: %.2lf seconds.\n", time_spent);
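One possible reason the measured times do not improve is the timer itself: `clock()` returns processor time, and on many systems (Linux with glibc, for instance) that is accumulated across all threads, so a loop that finishes sooner in wall-clock terms can still report the same or a larger value. OpenMP provides `omp_get_wtime()` for wall-clock measurements. A minimal sketch, where `wall_seconds` is a hypothetical helper and the `clock()` branch is only a fallback for builds without OpenMP:

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not from the post): returns a timestamp in seconds. */
double wall_seconds(void)
{
#ifdef _OPENMP
    /* Wall-clock time as seen by the calling thread. */
    return omp_get_wtime();
#else
    /* Fallback without OpenMP: processor time, which is adequate for
       timing a single-threaded run. */
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}
```

Taking `double begin = wall_seconds();` before the parallel region and `wall_seconds() - begin` after it gives the elapsed time in seconds.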

You could extend the implementation to leverage SIMD extensions. As far as I know, the latest OpenMP standard includes vector constructs. Check out this article, which describes the new capabilities.
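As a rough illustration of the vector constructs, OpenMP 4.0 and later offer `#pragma omp simd`. The early `break` in the escape test gets in the way of vectorization, so the sketch below swaps in a branch-free variant that runs a fixed number of steps and freezes a pixel once it has escaped. Both function names are hypothetical, this is not taken from the linked article, and whether the compiler actually vectorizes the loop needs to be checked per compiler.

```c
/* Hypothetical scalar reference (not from the post): same loop as in the
   question, returning the iteration index at which the point escapes. */
int mandel_iters_scalar(double pr, double pi, int max_iter)
{
    double re = 0.0, im = 0.0;
    int i;
    for (i = 0; i < max_iter; i++) {
        double nre = re * re - im * im + pr;
        double nim = 2.0 * re * im + pi;
        re = nre;
        im = nim;
        if (re * re + im * im > 4.0) break;
    }
    return i;
}

/* Hypothetical SIMD-friendly row kernel: pr[x] holds the real part of
   each pixel in the row, pi is the shared imaginary part. */
void mandel_row_simd(const double *pr, double pi, int n, int max_iter, int *iters)
{
    #pragma omp simd
    for (int x = 0; x < n; x++) {
        double re = 0.0, im = 0.0;
        int active = 1;   /* 1 while the point has not escaped */
        int count = 0;    /* matches i in the scalar version */
        for (int k = 0; k < max_iter; k++) {
            double nre = re * re - im * im + pr[x];
            double nim = 2.0 * re * im + pi;
            int escaped = (nre * nre + nim * nim) > 4.0;
            /* Update z only while active, so escaped points stay frozen. */
            re = active ? nre : re;
            im = active ? nim : im;
            active = active && !escaped;
            count += active;
        }
        iters[x] = count;
    }
}
```

The trade-off is that every pixel pays for `max_iter` steps, so this only helps if the vector speedup outweighs the lost early exit.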

This whitepaper explains how SSE3 can be used when calculating the Mandelbrot set.

