[llvm-dev] Automatic Insertion of OpenACC/OpenMP directives

Jonathan Roelofs via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 3 07:17:07 PST 2017



On 12/31/16 12:37 PM, Fernando Magno Quintao Pereira via llvm-dev wrote:
> Dear Mehdi,
>
>     I've changed your example a little bit:
>
> float saxpy(float a, float *x, float *y, int n) {
>  int j = 0;
>  for (int i = 0; i < n; ++i) {
>    y[j] = a*x[i] + y[I]; // Change 'I' into 'j'?
>    ++j;
>  }
> }
>
> I get this code below, once I replace 'I' with 'j'. We are copying n
> positions of both arrays, 'x' and 'y':
>
> float saxpy(float a, float *x, float *y, int n) {
>   int j = 0;
>
>   long long int AI1[6];
>   AI1[0] = n + -1;
>   AI1[1] = 4 * AI1[0];
>   AI1[2] = AI1[1] + 4;
>   AI1[3] = AI1[2] / 4;
>   AI1[4] = (AI1[3] > 0);
>   AI1[5] = (AI1[4] ? AI1[3] : 0);
>   #pragma acc data pcopy(x[0:AI1[5]],y[0:AI1[5]])
>   #pragma acc kernels
>   for (int i = 0; i < n; ++i) {
>     y[j] = a * x[i] + y[j];
>     ++j;
>   }

I'm not familiar with OpenACC, but doesn't this still have a loop-carried
dependence on j, and therefore isn't it incorrect to parallelize as
written?
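
To make the concern concrete, here is a minimal sketch in plain C. It assumes that j is a simple induction variable that always equals i, which holds for the loop quoted above. The function names, the void return type, and the const qualifier are my own choices for illustration and are not from the original code. Whether the tool or an OpenACC compiler performs this substitution automatically is not something I can confirm.

```c
#include <assert.h>

/* As quoted: j is incremented once per iteration, so every iteration
   depends on the value of j left by the previous one. */
void saxpy_with_j(float a, const float *x, float *y, int n) {
  assert(x && y);
  int j = 0;
  for (int i = 0; i < n; ++i) {
    y[j] = a * x[i] + y[j];
    ++j;
  }
}

/* Hypothetical rewrite: since j == i on every iteration, substitute i
   for j. The iterations no longer share any scalar state. */
void saxpy_without_j(float a, const float *x, float *y, int n) {
  assert(x && y);
  for (int i = 0; i < n; ++i) {
    y[i] = a * x[i] + y[i];
  }
}

/* Returns 1 if both variants produce identical results on a small input. */
int saxpy_variants_agree(void) {
  enum { N = 8 };
  float x[N], y1[N], y2[N];
  for (int i = 0; i < N; ++i) {
    x[i] = (float)i;
    y1[i] = y2[i] = (float)(2 * i);
  }
  saxpy_with_j(3.0f, x, y1, N);
  saxpy_without_j(3.0f, x, y2, N);
  for (int i = 0; i < N; ++i) {
    if (y1[i] != y2[i]) return 0;
  }
  return 1;
}
```

Run sequentially, the two versions compute the same thing. Only the second has iterations that are independent of each other, because nothing in it is carried from one iteration to the next.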


Jon

> }
>
> Regards,
>
> Fernando
>

-- 
Jon Roelofs
jonathan at codesourcery.com
CodeSourcery / Mentor Embedded

