[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Fri May 4 08:27:41 PDT 2012

Duncan Sands wrote:

>> As noted in the GEP FAQ, GEPs don't support variable-length arrays;
>
> that's not quite right. The problem is only with arrays of variable length
> arrays, and more generally with arrays where the element type has variable
> size (this occurs with Ada, which has all kinds of funky variable sized types,
> for example).

You're right, though of course I was quoting from the FAQ.

You worry about handling Ada...
I'm old and worry that we can't even handle Fortran well.

>Consider your examples:
>
>>    *void zap(long int n, long int A[]) {*
>>    *  for (unsigned int i = 0; i < n; i++)*
>>    *    A[1 + 2*i] = 0;*
>>    *}*
>
>>    *%arrayidx = getelementptr inbounds i64* %A, i64 %add3*

> Here GEP has no problem with this variable sized array.

Right.  I was trying to starting simple, to help illustrate what my
little code did.

>
>>    *void z1(long int n, long int A[][100][100]) {
>>    **  for (long int i = 0; i < n; i++)
>>    **    for (long int j = 0; j < n; j++)
>>    **      for (long int k = 0; k < n; k++)
>>    ****A[1 + 2*i][3 + 4*j][5 + 6*k] = 0;*
>>    }
>>
>>    *%arrayidx12 = getelementptr inbounds [100 x [100 x i64]]* %A, i64 %add109,
>
>
> Here neither.

Right again, another simple example.  In both cases, we've basically
got a vector of fixed-sized elements.  I'll note however, that because
of the general weakness of GEPs, I'm going to have to try and
delinearize everything that comes at me, 'cause I can't tell how
simple or complex the original array was in the source code.

>>    *void z2(long int n, long int A[][n][n][100][100]) { *
>>    *  for (long int i = 0; i < n; i++) *
>>    *    for (long int j = 0; j < n; j++) *
>>    *      for (long int k = 0; k < n; k++) *
>>    ***for (long int l = 0; l < n; l++) *
>>    ***for (long int m = 0; m < n; m++) *
>>    ***A[1 + 2*i][3 + 4*j][5 + 6*k][7 + 8*l][9 + 10*m] = 0; *
>>    }
>
>
> This is where you run into trouble, because this is an array with a variable
> sized element type.
>
> Currently GEP is designed so that the offset from the base pointer is an affine
> function *with constant multipliers*, eg: 3*x + 10*y.

I merely argue for allowing constant-but-unknown multipliers, i.e.,
parameters. They're still affine for purposes of dependence analysis
(by things like the Delta test or Polly), but oh so much more useful.

Multi-dimensional VLAs happen, especially in scientific code. How are
we going to express DGEMM (matrix multiplication)?  We could do it by
manually linearizing all the array references, or we could do it the
way god intended when he standardized C90, with VLAs. I'm arguing that
it would be wonderful to be able to analyze such code accurately.

Regardest,
Preston