[LLVMdev] Performance problems with FORTRAN allocatable arrays

Wonsun Ahn wonsun.ahn at gmail.com
Wed Feb 15 08:14:39 PST 2012


Hi Duncan,

Here is the test case:

------------------------------------- snip -------------------------------------

      MODULE LES3D_DATA
      IMPLICIT REAL*8 (A-H,O-Z)
      INTEGER IMAX, JMAX, KMAX, ND
      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:) :: QAV
      DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:,:) :: Q
      END MODULE LES3D_DATA

      PROGRAM LES3D

      USE LES3D_DATA
      IMPLICIT REAL*8(A-H,O-Z)

      READ(5,*) IMAX, JMAX, KMAX, ND
      ALLOCATE(Q(-2:IMAX+2,-2:JMAX+2,-2:KMAX+2,ND,2),
     >       QAV(-2:IMAX+2,-2:JMAX+2,-2:KMAX+2,ND))

      DO L = 1, 5
         DO K = 0, KMAX
            DO J = 0, JMAX
               DO I = 0, IMAX
                  QAV(I,J,K,L) = 2.0D0 * Q(I,J,K,L,1)
               END DO
            END DO
         END DO
      END DO

      stop

      END

------------------------------------- snip -------------------------------------

I compiled the above using the following commands:

llvm-gfortran -O3 <source name> -c -emit-llvm -o <bytecode name>

If you disassemble the bytecode, you'll see that the matrix address
calculations for QAV and Q are not hoisted out of the loop.

Thanks,
Wonsun

On Wed, Feb 15, 2012 at 8:34 AM, Duncan Sands <baldrick at free.fr> wrote:
> Hi Wonsun, can you please provide a testcase.
>
> Best wishes, Duncan.
>
>> I've noticed that LLVM does a bad job of optimizing array indexing
>> code for FORTRAN arrays declared using the ALLOCATABLE keyword.
>>
>> For example if you have something like the following:
>>
>> DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:) :: QAV
>> ...
>> ALLOCATE( QAV( -2:IMAX+2,-2:JMAX+2,-2:KMAX+2,ND) )
>> ...
>> DO L = 1, 5
>>     DO K = K1, K2
>>        DO J = J1, J2
>>           DO I = I1, I2
>>              II  =  I + IADD
>>              IBD = II - IBDD
>>              ICD = II + IBDD
>>
>>              QAV(I,J,K,L) = R6I * (2.0D0 * Q(IBD,J,K,L,N) +
>>>                                      5.0D0 * Q( II,J,K,L,N) -
>>>                                              Q(ICD,J,K,L,N))
>>           END DO
>>        END DO
>>     END DO
>> END DO
>>
>> Most of the code needed to calculate the address of QAV(I,J,K,L)
>> should be hoisted out of the loop since J, K, and L are constant
>> inside the loop. But I'm not seeing this happening because LLVM's
>> alias analysis cannot distinguish between the loads of the array
>> dimensions for QAV and the store to QAV(I,J,K,L). I've tried all the
>> alias analyses available in the standard distribution, including type
>> based analysis and scalar evolution. But if you think about it, the
>> array dimensions of QAV is 'metadata' and should not alias with any
>> actual accesses in the program. I've compiled the same code with GCC
>> and it was able to hoist most of the address calculations out as
>> expected. GCC was able to hoist address calculations for Q also.
>>
>> This is an actual piece of code in SPECCPU2006 437.leslie3d and the
>> loop I analyzed is in line 1630 of file tml.f. 437.leslie3d suffers
>> horrible performance problems because of this and similar problems.
>>
>> Is there anyway to enable this optimization? Is there a way to flag in
>> the IR that a particular locations is array dimension meta data?
>>
>> Thanks,
>> Wonsun
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




More information about the llvm-dev mailing list