Parallel processing - Mandelbrot optimization in OpenMP
Well, I have to parallelize a Mandelbrot program in C. I think I have done it well, but I can't get better times. My question is whether anyone has an idea to improve the code. I've been thinking about perhaps using nested parallel regions between the outer and the inner for loops...
I also have doubts about whether it is more elegant or recommended to put all the pragmas on a single line, or to write separate pragmas (one omp parallel with the shared and private variables and the conditional, and another pragma omp for with schedule dynamic).
I also wonder whether constants can be used as private variables, because I think it would be cleaner to have constants instead of defined variables.
I have also written a conditional (if numcpu > 1), because with a single processor it makes no sense to use a parallel region and the program should fall back to a normal sequential execution.
Finally, I have read that the dynamic chunk depends on the hardware and the system configuration... so I have left it as a constant that can be easily changed.
I also adapt the number of threads to the number of processors available.
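A minimal sketch of the options discussed above, not the original program: one combined `parallel for` pragma carrying the `if` clause, the thread count, and a dynamic schedule. The names `escape_count` and `mandelbrot_counts`, the chunk size of 4, and the fixed view (zoom 1, shifted by -0.5 on the real axis) are assumptions made for illustration. The `#ifdef _OPENMP` guards are there so that the same file also builds serially.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Escape-time count for the point (pr, pi): the index of the iteration
   at which |z| exceeds 2, or max_iter if it never does. */
static int escape_count(double pr, double pi, int max_iter)
{
    double re = 0.0, im = 0.0;
    int i;
    for (i = 0; i < max_iter; i++) {
        double new_re = re * re - im * im + pr;
        double new_im = 2.0 * re * im + pi;
        re = new_re;
        im = new_im;
        if (re * re + im * im > 4.0)
            break;
    }
    return i;
}

/* Fill counts[y * width + x] with the escape count of every pixel. */
void mandelbrot_counts(int *counts, int width, int height, int max_iter)
{
    assert(counts != NULL && width > 0 && height > 0);

    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_num_procs();
#endif

    /* Combined construct: a single pragma creates the team and splits the
       rows among the threads. The if clause keeps the loop serial when
       only one processor is available. Variables declared inside the
       loop body are private to each thread automatically. */
    #pragma omp parallel for schedule(dynamic, 4) if(nthreads > 1) num_threads(nthreads)
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            double pr = 1.5 * (x - width / 2) / (0.5 * width) - 0.5;
            double pi = (y - height / 2) / (0.5 * height);
            counts[y * width + x] = escape_count(pr, pi, max_iter);
        }
    }
}
```

With GCC this would be built with something like `gcc -O2 -fopenmp`. Declaring the per-pixel variables inside the loop removes the need for a long `private(...)` list, which is one way to sidestep the single-line versus separate-pragmas question.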
    int main(int argc, char *argv[])
    {
        omp_set_dynamic(1);
        int xactual, yactual;
        // each iteration calculates: newz = oldz*oldz + p, where p is the current pixel and oldz starts at the origin
        double pr, pi;                            // real and imaginary part of the pixel p
        double newre, newim, oldre, oldim;        // real and imaginary parts of new and old z
        double zoom = 1, movex = -0.5, movey = 0; // you can change these to zoom and change position
        pixel_t *pixels = malloc(sizeof(pixel_t)*imageheight*imagewidth);
        clock_t begin, end;
        double time_spent;
        begin = clock();
        int numcpu;
        numcpu = omp_get_num_procs();
        //FILE *fp;
        // message: "the number of processors we will use is: ..."
        printf("el número de procesadores que utilizaremos es: %d", numcpu);
        omp_set_num_threads(numcpu);

        // oldre and oldim are added to the private list; left shared, the threads would race on them
        #pragma omp parallel shared(pixels, movex, movey, zoom) private(xactual, yactual, pr, pi, newre, newim, oldre, oldim) if(numcpu > 1)
        {
            //int xactual=0;
            //int yactual=0;
            #pragma omp for schedule(dynamic, chunk)
            // loop through every pixel
            for(yactual = 0; yactual < imageheight; yactual++)
            for(xactual = 0; xactual < imagewidth; xactual++)
            {
                // calculate the initial real and imaginary part of z, based on the pixel location and the zoom and position values
                pr = 1.5 * (xactual - imagewidth / 2) / (0.5 * zoom * imagewidth) + movex;
                pi = (yactual - imageheight / 2) / (0.5 * zoom * imageheight) + movey;
                newre = newim = oldre = oldim = 0; // these should start at 0,0
                // "i" will represent the number of iterations
                int i;
                // start the iteration process
                for(i = 0; i < iterations; i++)
                {
                    // remember the value of the previous iteration
                    oldre = newre;
                    oldim = newim;
                    // the actual iteration: the real and imaginary part are calculated
                    newre = oldre * oldre - oldim * oldim + pr;
                    newim = 2 * oldre * oldim + pi;
                    // if the point is outside the circle with radius 2: stop
                    if((newre * newre + newim * newim) > 4) break;
                }
                // color(i % 256, 255, 255 * (i < maxiterations));
                if(i == iterations)
                {
                    //color(0, 0, 0); // black
                    pixels[yactual*imagewidth+xactual][0] = 0;
                    pixels[yactual*imagewidth+xactual][1] = 0;
                    pixels[yactual*imagewidth+xactual][2] = 0;
                }
                else
                {
                    double z = sqrt(newre * newre + newim * newim);
                    int brightness = 256 * log2(1.75 + i - log2(log2(z))) / log2((double)iterations);
                    //color(brightness, brightness, 255)
                    pixels[yactual*imagewidth+xactual][0] = brightness;
                    pixels[yactual*imagewidth+xactual][1] = brightness;
                    pixels[yactual*imagewidth+xactual][2] = 255;
                }
            }
        } // end of parallel region
        end = clock();
        time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
        fprintf(stderr, "elapsed time: %.2lf seconds.\n", time_spent);
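One thing worth checking in the timing code above: `clock()` reports processor time used by the process, and on many systems (Linux with glibc, for example) that time is summed over all threads, so a parallel run can look no faster than the serial one even when it finishes sooner. A minimal sketch of measuring wall-clock time with `omp_get_wtime()` instead follows. The helper `wall_seconds` and the stand-in workload `busy_sum` are hypothetical names, not part of the original program.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Seconds elapsed since an arbitrary fixed point in the past. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();   /* wall-clock time, independent of thread count */
#else
    /* Fallback when built without OpenMP: the run is single-threaded,
       so processor time is an acceptable approximation. */
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical workload standing in for the Mandelbrot loop. */
double busy_sum(int n)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += 1.0 / (1.0 + i);
    return sum;
}
```

Usage would be `double t0 = wall_seconds();` before the parallel region and `wall_seconds() - t0` after it, in place of the `clock()` difference.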
You could extend your implementation to leverage SIMD extensions. As far as I know, the latest OpenMP standard includes vector constructs. Check out this article, which describes the new capabilities.
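As a rough illustration of the idea, here is a sketch that uses the `simd` construct introduced in OpenMP 4.0 to advance several neighbouring points in lockstep. It assumes a compiler that honours the pragma (for GCC, `-fopenmp` or `-fopenmp-simd`); without it the pragma is ignored and the code still runs, only scalar. The function name `escape_counts_simd` and the lane count of 8 are assumptions, and the three input arrays must not overlap.

```c
#include <assert.h>

#define LANES 8

/* Escape counts for LANES points (pr[k], pi[k]), computed in lockstep.
   count[k] ends up as the index of the iteration at which point k
   escapes, or max_iter if it never does. */
void escape_counts_simd(const double *pr, const double *pi,
                        int *count, int max_iter)
{
    assert(pr != 0 && pi != 0 && count != 0);

    double re[LANES] = {0.0}, im[LANES] = {0.0};
    for (int k = 0; k < LANES; k++)
        count[k] = 0;

    for (int i = 0; i < max_iter; i++) {
        int active = 0;

        /* Branch-free body so that the lanes can be vectorized. */
        #pragma omp simd reduction(+:active)
        for (int k = 0; k < LANES; k++) {
            int alive = (count[k] == i);   /* has not escaped so far */
            double new_re = re[k] * re[k] - im[k] * im[k] + pr[k];
            double new_im = 2.0 * re[k] * im[k] + pi[k];
            int stays = alive && (new_re * new_re + new_im * new_im <= 4.0);

            /* Escaped lanes keep their last value and stop counting. */
            re[k] = stays ? new_re : re[k];
            im[k] = stays ? new_im : im[k];
            count[k] += stays;
            active += stays;
        }

        if (active == 0)   /* every lane has escaped: stop early */
            break;
    }
}
```

The early `break` of the scalar loop is replaced by a per-lane mask, so a block only finishes when its slowest point does. Whether this pays off depends on how similar the iteration counts of neighbouring pixels are.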
This whitepaper explains how SSE3 can be used when calculating the Mandelbrot set.