[LLVMdev] Issue with GetElementPtrInst in Instruction Combining pass

Pankaj Gode godepankaj at yahoo.com
Tue Apr 17 04:02:50 PDT 2012


Hi All,
 
I have been running into an issue when I enable the Instruction Combining pass for an application.
I read a similar post earlier:
http://old.nabble.com/Instruction-Combining-Pass-*Breaking*-Struct-Reads--td24253572.html 
With reference to the above case, my target data layout is defined as:
DataLayout("e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-a:32:32")
So I don't think the problem described in that post applies here.
 
The function "visitGetElementPtrInst" has a piece of code for handling bitcasts:
   /// See if we can simplify:
  ///   X = bitcast A* to B*
  ///   Y = gep X, <...constant indices...>
When I comment this code out, the problem goes away.
 
The application uses a buffer. Instead of allocating separate space for a struct such as "FRAME_DATA", it points the struct at a region of this buffer (I think this is meant to save memory).
 
Detailing the application further:
The buffer is an array of 16-bit words, declared as:
#define SAMPLE 1024
Word16 Data[4*SAMPLE];                  /*!< Output buffer */

The elements of this buffer are initialized to 0 in the main function:
        /* initialize time data buffer */
        for (i=0; i < 4*SAMPLE; i++){
                Data[i] = 0;
        }

This buffer is then passed to various functions, which use it to hold frame data at different points in time.
These functions use the buffer by assigning parts of it to the appropriate struct pointers (probably to reuse memory).
 
   frameDLt  = (FRAME_DATA*) &Data[MAX_SIZE];
   frameDRt  = (FRAME_DATA*) &Data[3*MAX_SIZE];

where frameDLt and frameDRt are struct pointers, declared as
"FRAME_DATA *frameDLt;" and "SBR_FRAME_DATA *frameDRt;"
 
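A minimal sketch of this kind of overlay, for illustration only. DEMO_FRAME and frame_view are hypothetical names, not taken from the application, and the sketch only checks size and alignment (the i32 store shown later is emitted with "align 4", so the overlaid address matters). It does not address C aliasing rules.

```c
#include <stddef.h>
#include <stdint.h>

typedef int16_t Word16;

/* Hypothetical stand-in for FRAME_DATA: only the shape matters here
   (a 16-bit member followed by a 32-bit member). */
typedef struct {
    Word16  nScaleFactors;
    int32_t coupling;
} DEMO_FRAME;

/* Returns a DEMO_FRAME view of buffer[index..], or NULL when the struct
   would not fit in the remaining elements or the address is not
   suitably aligned for the struct. */
static DEMO_FRAME *frame_view(Word16 *buffer, size_t length, size_t index)
{
    if (index > length)
        return NULL;
    if ((length - index) * sizeof(Word16) < sizeof(DEMO_FRAME))
        return NULL;
    uintptr_t addr = (uintptr_t)&buffer[index];
    if (addr % _Alignof(DEMO_FRAME) != 0)
        return NULL;
    return (DEMO_FRAME *)(void *)&buffer[index];
}
```

With a 4-byte-aligned buffer, an even index such as 2 yields a usable view, while an odd index such as 1 is rejected because the 32-bit member would be misaligned.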
The struct is defined as:
 
 %struct._FRAME_DATA = type { i16, %struct._FRAME_INFO, [5 x i16], [2 x i16], [5 x i32], i32, i16, [48 x i32], i16, [240 x i16], [10 x i16] }
 %struct._FRAME_INFO = type { i16, i16, [6 x i16], [5 x i16], i16, i16, [3 x i16] } 
 
The code generated when accessing "coupling", a member variable of the struct (an enum, i32 in the struct type above), without instruction combining is:
 
%coupling = getelementptr inbounds %struct._FRAME_DATA* %2, i32 0, i32 5, !dbg !575
  store i32 0, i32* %coupling, align 4, !dbg !575
 
 
And the code generated "with instruction combining" is:
%coupling = getelementptr inbounds i16* %timeData, i32 1060, !dbg !575
  %24 = bitcast i16* %coupling to i32*, !dbg !575
  store i32 0, i32* %24, align 4, !dbg !575
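To relate the two forms, it can help to compute the byte offset of field 5 directly. The sketch below mirrors the first six fields of the IR struct type in C and uses offsetof. The type and field names are placeholders. It assumes a host ABI where int16_t is 2-byte aligned and int32_t is 4-byte aligned; the real target layout (for example the "a:32:32" aggregate alignment in the data layout above) may place the fields differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors %struct._FRAME_INFO =
   { i16, i16, [6 x i16], [5 x i16], i16, i16, [3 x i16] }.
   Field names are placeholders, not the application's names. */
typedef struct {
    int16_t f0, f1;
    int16_t f2[6];
    int16_t f3[5];
    int16_t f4, f5;
    int16_t f6[3];
} MIRROR_FRAME_INFO;

/* Mirrors the first six fields of %struct._FRAME_DATA. */
typedef struct {
    int16_t           field0;     /* i16                 */
    MIRROR_FRAME_INFO field1;     /* %struct._FRAME_INFO */
    int16_t           field2[5];  /* [5 x i16]           */
    int16_t           field3[2];  /* [2 x i16]           */
    int32_t           field4[5];  /* [5 x i32]           */
    int32_t           field5;     /* i32, "coupling"     */
} MIRROR_FRAME_DATA;

/* Byte offset of field 5 from the start of the struct. */
static size_t coupling_byte_offset(void)
{
    return offsetof(MIRROR_FRAME_DATA, field5);
}

/* The same offset expressed in i16 elements. */
static size_t coupling_i16_index(void)
{
    return coupling_byte_offset() / sizeof(int16_t);
}
```

Under the stated host assumptions this gives 72 bytes, i.e. 36 i16 elements. The "i32 5" in the first GEP is a field index, not a byte or word count.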
 
 
FRAME_DATA is defined as:
typedef struct _FRAME_DATA
{
  Word16 nScaleFactors;                    /*!< total number of scalefactors in frame */
  FRAME_INFO frameInfo;                 /*!< time grid for current frame */
  Word16 domain_vec[MAX_ENVELOPES];  /*!< Bitfield containing direction of delta-coding for each envelope */
  Word16 domain_vec_noise[MAX_NOISE_ENVELOPES]; /*!< Same as above, but for noise envelopes */
  INVF_MODE sbr_invf_mode[MAX_INVF_BANDS]; /*!< Strength of filtering in transposer */
  COUPLING_MODE coupling;               /*!< Stereo-mode */ 
  Word16 ampResolutionCurrentFrame;        /*!< Amplitude resolution of envelope values (0: 1.5dB, 1: 3dB) */
  Flag addHarmonics[MAX_FREQ_COEFFS];   /*!< Flags for synthetic sine addition */
  Word16 maxQmfSubbandAac;       /*!< Solves the 'undefined x-over problem' for the enhancement */
  Word16 iEnvelope[MAX_NUM_ENVELOPE_VALUES];       /*!< Envelope data */
  Word16 sbrNoiseFloorLevel[MAX_NUM_NOISE_VALUES]; /*!< Noise envelope data */
}
FRAME_DATA;
COUPLING_MODE is an enum.

The element address calculated by the GEP differs between the two cases:
 
1. without instruction combining
the coupling member variable is at:
  %struct._FRAME_DATA* %2, i32 0, i32 5
 
i.e. at index 5 in FRAME_DATA, which is the 6th element, the coupling member variable.
Why is it "i32 5"? My understanding is that, since this structure has some elements of size i32, the other elements will be padded to i32 as per the C layout rules.
In terms of i16, the elements up to that point would be:
i16, i16, i16, [6 x i16], [5 x i16], i16, i16, [3 x i16], [5 x i16], [2 x i16], [5 x i32], i32, ....
31 words = 124 bytes (assuming it is aligned to i32).
 
2. with instruction combining
coupling is at:
i16* %timeData, i32 1060
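i.e. an index of 1060 in units of i16 (2120 bytes from %timeData), since GEP indices count elements of the pointee type rather than bytes.
This looks nowhere near what the version without instruction combining refers to.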
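The two GEPs use different base pointers and different units, so one way to compare them is to convert both to the same unit. The sketch below does that arithmetic. Both inputs used in the note afterwards are assumptions, not facts from this post: that MAX_SIZE is 1024 (only SAMPLE is given), and that field 5 sits at byte offset 72 under natural alignment. The helper names are hypothetical.

```c
#include <stddef.h>

/* Byte offset addressed by "getelementptr T* p, i32 index":
   the index is scaled by the size of T. */
static size_t gep_byte_offset(size_t index, size_t elem_size)
{
    return index * elem_size;
}

/* i16 index reached by starting at buffer element base_index and then
   moving field_byte_offset bytes into a struct overlaid there.
   Assumes 2-byte i16 elements and an even field_byte_offset. */
static size_t folded_i16_index(size_t base_index, size_t field_byte_offset)
{
    return base_index + field_byte_offset / 2;
}
```

If both assumptions held, folded_i16_index(1024, 72) would be 1060, matching the index in the combined GEP, and gep_byte_offset(1060, 2) would be 2120 bytes from the start of the buffer. Whether they hold for this target is not shown here and would need to be checked against the actual MAX_SIZE and struct layout.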
 
 
I suspect the problem lies in the way LLVM's instruction combining computes the offset in such a situation.
As I am not sure, I would like to understand this better.
 
 
Regards,
Pankaj