[LLVMdev] Help with new backend: byte-sized loads being generated for 'int' array access

Tue Jun 10 08:54:18 PDT 2014

----- Original Message -----
> From: "Jeff Kuskin" <jk500500 at yahoo.com>
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, June 10, 2014 10:04:45 AM
> Subject: [LLVMdev] Help with new backend: byte-sized loads being generated for 'int' array access
> 
> First, apologies: I'm quite new to LLVM backend development, and I
> would very much appreciate any help from more experienced folks.
> 
> 
> I'm running into a problem in which byte-sized loads are _sometimes_
> being generated for a read access to an external array of 4-byte
> ints, depending on how the array is declared.
> 
> I am hoping someone can perhaps point me to possible sources of the
> problem in my backend code.  I would be happy to supply additional
> details; I'm trying to keep this message relatively short.
> 
> 
> 
> The issue arises when I compile the following C code:
> 
>      extern int EI[];
>      int MYFUNC() { return EI[1288]; }
> 
> 
> 
> I run the code through 'clang -emit-llvm' and end up with bitcode of:
> 
>     ; ModuleID = './clang2.c'
>     target datalayout = "e-m:e-p:32:32-i8:8:32-i16:16:32-i64:64-n32-S64"
>     target triple = "dgc"
>     @EI = external global [0 x i32]
>     ; Function Attrs: nounwind
>     define i32 @MYFUNC() #0 {
>     entry:
>       %0 = load i32* getelementptr inbounds ([0 x i32]* @EI, i32 0, i32 1288), align 1
>       ret i32 %0
>     }
> 
>     attributes #0 = { nounwind "less-precise-fpmad"="false"
>                    "no-frame-pointer-elim"="true"
>                    "no-frame-pointer-elim-non-leaf"
>                    "no-infs-fp-math"="false"
>                    "no-nans-fp-math"="false"
>                    "stack-protector-buffer-size"="8"
>                    "unsafe-fp-math"="false" "use-soft-float"="false"
>                    }
>     !llvm.ident = !{!0}
>     !0 = metadata !{metadata !"clang version 3.5.0 (209307)"}
> 
> 
> 
> When I then run the bitcode through llc, the memory load for the
> 'EI[1288]' reference is generated with a series of four byte-sized
> loads, followed by the appropriate shifting and OR'ing to get all
> the bytes into the proper place in the result.
> 
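
To make the expansion described above concrete, here is a minimal sketch in C of what the two code sequences compute. It assumes little-endian byte order (the leading "e" in the datalayout string above); the function names are hypothetical and this is not the code llc emits, only an illustration of the arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Expanded form: four byte-sized loads, then shifts and ORs to place
   each byte. Little-endian order is assumed: p[0] is the low byte. */
uint32_t load_i32_bytewise(const unsigned char *p) {
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Single word-sized load. memcpy keeps this well-defined C for any
   address; the result is in the byte order of the machine running it. */
uint32_t load_i32_word(const unsigned char *p) {
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

On a little-endian machine both functions return the same value for the same address. The byte-wise form works at any alignment, which is why it is the fallback for a load marked 'align 1', but it costs four loads plus the shifts and ORs instead of one instruction.
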
> This is not what I want, of course.  I want a single, word-sized load
> to be generated.  I have various sizes of load instructions defined
> in my TableGen file, an excerpt of which I've included at the end of
> this message.
> 
> Other backends built from the same source tree -- mipsel and xcore,
> for instance -- do indeed generate a single word-sized load, as
> expected, so I'm confident the problem is in my backend code.
> 
> What's interesting is that my backend *DOES* generate a single
> word-sized load if I make either of the following changes to the
> declaration of 'EI':
> 
>    (1) Provide an array size in the EI declaration:
>              extern int EI[5000];
> 
>        This yields the following in the bitcode, replacing the like
>        lines from above:
>            @EI = external global [5000 x i32]
>            ; Function Attrs: nounwind
>            define i32 @MYFUNC() #0 {
>            entry:
>              %0 = load i32* getelementptr inbounds ([5000 x i32]* @EI, i32 0, i32 1288), align 4
>              ret i32 %0
>            }
> 
> 
> 
>    (2) Change EI to be an int*:
>              extern int* EI;
> 
>        This yields the following in the bitcode:
>          @EI = external global i32*
>          ; Function Attrs: nounwind
>          define i32 @MYFUNC() #0 {
>          entry:
>            %0 = load i32** @EI, align 4
>            %arrayidx = getelementptr inbounds i32* %0, i32 1288
>            %1 = load i32* %arrayidx, align 4
>            ret i32 %1
>          }
> 
> 
> 
> I have tried a number of things to figure out this issue, but to no
> avail.  For some reason the 'EI[1288]' reference is being treated as
> possibly unaligned ('align 1' in the bitcode), but I can't figure out why.
> 

As for how the C is being translated into LLVM IR (why the load is 'align 1' in one case and 'align 4' in the others), you should ask on the cfe-dev list, not here.

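On a related point: if your target supports unaligned loads of 4-byte integers, then you need to override the *TargetLowering::allowsUnalignedMemoryAccesses callback for your target. The default implementation returns false, so the legalizer expands any load whose alignment is below the type's ABI alignment into smaller loads.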
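
As a rough illustration of why that callback matters, here is a simplified model in C of the decision made for each load. It is a sketch under stated assumptions, not LLVM's actual legalizer code: the enum, the function name, and the parameters are all hypothetical, and the real logic lives in the SelectionDAG legalizer and your target's TargetLowering subclass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; a simplified model, not LLVM's implementation. */
enum load_strategy { SINGLE_LOAD, EXPAND_TO_SMALLER_LOADS };

/* load_align:  alignment recorded on the load in the IR (1 or 4 above).
   abi_align:   ABI alignment of the loaded type (4 for i32 here).
   target_allows_unaligned: the answer the target's callback would give. */
enum load_strategy choose_load_strategy(unsigned load_align,
                                        unsigned abi_align,
                                        bool target_allows_unaligned) {
    assert(load_align > 0 && abi_align > 0);
    if (load_align >= abi_align)
        return SINGLE_LOAD;              /* sufficiently aligned */
    if (target_allows_unaligned)
        return SINGLE_LOAD;              /* hardware tolerates it */
    return EXPAND_TO_SMALLER_LOADS;      /* must be split up */
}
```

With the values from this thread, a load with 'align 1' and a callback that answers false gives EXPAND_TO_SMALLER_LOADS, while 'align 4' gives SINGLE_LOAD whatever the callback says. Answering true is only correct if the hardware really can perform the unaligned access.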

 -Hal

> 
> 
> 
> TD file excerpt (modeled after the MIPS .td file):
> 
> 
> def DGCAddrDefault :
>         ComplexPattern<iPTR, 2, "selectAddrDefault", [frameindex]>;
> def DGCAddrInt :
>         ComplexPattern<iPTR, 2, "selectAddrInt", [frameindex]>;
> 
> def DGCMemSrc : Operand<iPTR> {
>   let MIOperandInfo = (ops ptr_rc, i32imm);
>   let OperandType = "OPERAND_MEMORY";
> }
> 
> let canFoldAsLoad = 1,
>     mayLoad = 1 in
> {
> def LB : InstrDGC64_s__s_s<
>              (outs IntRegs:$rd),
>              (ins DGCMemSrc:$addr),
>              !strconcat("lb", "\t$rd, $addr"),
>              [(set i32:$rd, (sextloadi8 DGCAddrInt:$addr))],
>              0b10011, 0b000, 0, 0>;
> def LH : InstrDGC64_s__s_s<
>              (outs IntRegs:$rd),
>              (ins DGCMemSrc:$addr),
>              !strconcat("lh", "\t$rd, $addr"),
>              [(set i32:$rd, (sextloadi16 DGCAddrDefault:$addr))],
>              0b10011, 0b001, 0, 0>;
> def LBU : InstrDGC64_s__s_s<
>              (outs IntRegs:$rd),
>              (ins DGCMemSrc:$addr),
>              !strconcat("lbu", "\t$rd, $addr"),
>              [(set i32:$rd, (zextloadi8 DGCAddrDefault:$addr))],
>              0b10011, 0b100, 0, 0>;
> def LHU : InstrDGC64_s__s_s<
>              (outs IntRegs:$rd),
>              (ins DGCMemSrc:$addr),
>              !strconcat("lhu", "\t$rd, $addr"),
>              [(set i32:$rd, (zextloadi16 DGCAddrInt:$addr))],
>              0b10011, 0b101, 0, 0>;
> def LW : InstrDGC64_s__s_s<
>              (outs IntRegs:$rd),
>              (ins DGCMemSrc:$addr),
>              !strconcat("lw", "\t$rd, $addr"),
>              [(set i32:$rd, (load DGCAddrDefault:$addr))],
>              0b10011, 0b010, 0, 0>;
> }
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory