[llvm-dev] Automatic Insertion of OpenACC/OpenMP directives
Jonathan Roelofs via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 3 07:17:07 PST 2017
On 12/31/16 12:37 PM, Fernando Magno Quintao Pereira via llvm-dev wrote:
> Dear Mehdi,
>
> I've changed your example a little bit:
>
> float saxpy(float a, float *x, float *y, int n) {
>   int j = 0;
>   for (int i = 0; i < n; ++i) {
>     y[j] = a*x[i] + y[I]; // Change 'I' into 'j'?
>     ++j;
>   }
> }
>
> Once I replace 'I' with 'j', I get the code below. We are copying n
> positions of both arrays, 'x' and 'y':
>
> float saxpy(float a, float *x, float *y, int n) {
>   int j = 0;
>
>   long long int AI1[6];
>   AI1[0] = n + -1;
>   AI1[1] = 4 * AI1[0];
>   AI1[2] = AI1[1] + 4;
>   AI1[3] = AI1[2] / 4;
>   AI1[4] = (AI1[3] > 0);
>   AI1[5] = (AI1[4] ? AI1[3] : 0);
>   #pragma acc data pcopy(x[0:AI1[5]],y[0:AI1[5]])
>   #pragma acc kernels
>   for (int i = 0; i < n; ++i) {
>     y[j] = a * x[i] + y[j];
>     ++j;
>   }
I'm not familiar with OpenACC, but doesn't this still have a
loop-carried dependence on j? If so, it isn't correctly parallelizable
as written.
Jon
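
To make the concern concrete, here is a minimal sketch in plain C, without
the pragmas. The function names saxpy_with_j and saxpy_indexed are
hypothetical, and the return type is changed to void as an assumption,
since the quoted function never returns a value. The first version reads
the j written by the previous iteration; the second relies on j being
equal to i throughout, which removes the cross-iteration state. Whether
an OpenACC compiler performs this substitution on its own is not
something this thread establishes.

```c
/* Form from the quoted code: j is incremented in every iteration, so
   iteration k needs the value of j produced by iteration k-1. That is
   the loop-carried dependence in question. */
void saxpy_with_j(float a, const float *x, float *y, int n) {
    int j = 0;
    for (int i = 0; i < n; ++i) {
        y[j] = a * x[i] + y[j];
        ++j;
    }
}

/* Rewritten form (an assumption about what a tool or programmer could
   do): j starts at 0 and grows by one per iteration, so j == i. After
   substituting i for j, each iteration touches only x[i] and y[i]. */
void saxpy_indexed(float a, const float *x, float *y, int n) {
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Run sequentially, both versions write the same values into y; only the
second has iterations that are independent of one another.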
> }
>
> Regards,
>
> Fernando
>
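
The AI1[] arithmetic in the quoted code appears to compute how many
elements to copy, clamped at zero, which matches the statement that n
positions of each array are copied. The sketch below mirrors those steps
one by one. The function name copy_extent and the variable names are
hypothetical, reading the constant 4 as sizeof(float) is an assumption,
and the widening cast on n is added here to keep the subtraction in
64-bit arithmetic.

```c
/* Mirrors AI1[0]..AI1[5] from the generated code, step by step. */
long long copy_extent(int n) {
    long long last_index  = (long long)n - 1;    /* AI1[0] = n + -1         */
    long long last_offset = 4 * last_index;      /* AI1[1]; 4 assumed to be
                                                    sizeof(float)           */
    long long byte_size   = last_offset + 4;     /* AI1[2]                  */
    long long elements    = byte_size / 4;       /* AI1[3]                  */
    return elements > 0 ? elements : 0;          /* AI1[4] and AI1[5]       */
}
```

Under these assumptions the result is n when n is positive and 0
otherwise, so the pcopy clauses cover x[0:n] and y[0:n].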
--
Jon Roelofs
jonathan at codesourcery.com
CodeSourcery / Mentor Embedded