[LLVMdev] GEP instructions: is it possible to reverse-engineer array accesses?

Gabriel Rodríguez grodriguez at udc.es
Tue Oct 18 10:00:55 PDT 2011


Dear All, 


Lately I have been having a hard time understanding how Clang translates array
accesses into LLVM IR through the often misunderstood GEP (getelementptr) instruction.
I am trying to reverse-engineer array accesses to recover the number of dimensions
and the actual indices used in the original source, and I am beginning to wonder
whether this is possible at all. To illustrate some of my troubles, consider the
following code and the LLVM IR generated for it with 32-bit and 64-bit memory addresses:


-- 
original C: 



#define N 1000

int main(int argc, char **argv)
{
    int i, k;
    float aux, A[N][N];

    aux = A[k][i];
}


-- 
32-bit addresses LLVM IR (relevant part): 



%4 = load i32* %i, align 4 
%5 = load i32* %k, align 4 
%6 = getelementptr inbounds [1000 x [1000 x float]]* %A, i32 0, i32 %5 
%7 = getelementptr inbounds [1000 x float]* %6, i32 0, i32 %4 
%8 = load float* %7 
store float %8, float* %aux, align 4 
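
As far as I understand it, the two GEPs in the 32-bit version amount to the address arithmetic sketched below in C. This is only my reading, not something taken from the Clang sources. The helper names (gep_byte_offset, actual_byte_offset) are made up for illustration, and A is declared static here only so that the sketch does not need 4 MB of stack.

```c
#include <assert.h>
#include <stddef.h>

#define N 1000

static float A[N][N];

/* Hypothetical helper: byte offset that the two GEPs add to the base of A.
   First GEP  (i32 0, i32 k): skip k rows of type [N x float].
   Second GEP (i32 0, i32 i): skip i floats inside the selected row. */
static size_t gep_byte_offset(size_t k, size_t i)
{
    assert(k < N && i < N);               /* stay inside the array */
    size_t row_bytes = sizeof(float[N]);  /* size of one [N x float] row */
    return k * row_bytes + i * sizeof(float);
}

/* Byte offset the C compiler itself uses for A[k][i]. */
static size_t actual_byte_offset(size_t k, size_t i)
{
    return (size_t)((char *)&A[k][i] - (char *)A);
}
```

In both GEPs the leading 0 steps through the pointer operand itself without moving it; only the second index selects a row or an element.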


-- 
64-bit addresses LLVM IR (relevant part): 



%4 = load i32* %i, align 4 
%5 = load i32* %k, align 4 
%6 = getelementptr inbounds [1000 x [1000 x float]]* %A, i32 0, i32 0 
%7 = sext i32 %5 to i64 
%8 = getelementptr inbounds [1000 x float]* %6, i64 %7 
%9 = load float* %8 
store float %9, float* %aux, align 4 


-- 




Why does the 64-bit version use two leading zeros instead of one? I have read
http://llvm.org/docs/GetElementPtr.html, and either the explanation given there is
inaccurate or I cannot see how to apply it to this particular case.
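
The only reading I can come up with, and I may well be wrong, is that the first GEP of the 64-bit version corresponds to the C array-to-pointer decay (A becoming a pointer to its first row), and that the single-index GEP after it is plain pointer arithmetic on that row pointer. The sketch below covers only those two instructions (%6 and %8). The names row_t and row_pointer are hypothetical, and A is static only to keep it off the stack.

```c
#include <assert.h>
#include <stddef.h>

#define N 1000

typedef float row_t[N];   /* one row, i.e. [1000 x float] in the IR */

static float A[N][N];

/* Hypothetical helper mirroring %6 and %8 of the 64-bit listing. */
static row_t *row_pointer(long k)
{
    assert(k >= 0 && k < N);
    row_t *first_row = &A[0];   /* like GEP %A, 0, 0: pointer to row 0 */
    return first_row + k;       /* like GEP %6, %7: advance by k whole rows */
}
```

If this reading is right, the two zeros do not index anything: they only change the pointer type from "pointer to the whole array" to "pointer to a row".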


Besides, arrays can be represented and accessed in C code in an enormous variety of ways,
which leads to my final question: is it really possible to reverse-engineer array accesses?
If so, I would appreciate any insights.
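
Part of what makes me doubt it is the case where the source indexes a flat buffer by hand, e.g. B[k*N + i]. The sketch below (helper names linearize and delinearize are made up for illustration) shows that the original indices can be recovered from a linear index only when the row length is already known. With a wrong guess for the row length, the same linear index yields a different and equally plausible pair.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t k;   /* row index */
    size_t i;   /* column index */
} index2d;

/* Hypothetical helper: row-major linear index of element (k, i). */
static size_t linearize(size_t k, size_t i, size_t cols)
{
    assert(i < cols);
    return k * cols + i;
}

/* Hypothetical helper: recover (k, i), assuming the row length is known. */
static index2d delinearize(size_t linear, size_t cols)
{
    assert(cols > 0);
    index2d result = { linear / cols, linear % cols };
    return result;
}
```

For example, linear index 2005 maps back to (2, 5) for rows of 1000 elements, but to (20, 5) for rows of 100 elements.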




Thanks in advance, and best regards, 
Gabriel 