[LLVMdev] all LLVM Instructions that may write to memory -- other than StoreInst?

John Criswell criswell at illinois.edu
Fri Jan 21 15:22:11 PST 2011


On 1/21/11 5:00 PM, Chuck Zhao wrote:
> John,
>
> Thanks for the reply.
> I agree with your comment that the "memory" the LLVM spec refers to 
> doesn't include the stack.

It includes stack objects (memory allocated by the alloca instruction) 
but not the stack frame (e.g., spill slots).
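
As a rough illustration of that distinction (a sketch, not something from 
the original exchange): in the C code below, `slot` has its address taken, 
so a front end such as clang would typically represent it with an alloca, 
and the write through `p` shows up as an ordinary store in the IR.  The sum 
`a + b` is only an SSA value at the IR level; whether it ever occupies a 
spill slot is decided later by the code generator.  The function and 
variable names are made up for this example.

```c
/* Hypothetical helper: writes through a pointer. In LLVM IR this write is
   an explicit store instruction, so it is "memory" in the IR sense. */
void set_to(int *p, int v) {
    *p = v;
}

/* 'slot' is a stack object: its address escapes to set_to(), so it needs
   a real memory location (typically an alloca in the IR). The temporary
   'a + b' has no IR-level memory location at all. */
int sum_via_stack_object(int a, int b) {
    int slot;
    set_to(&slot, a + b);
    return slot;
}
```

Calling `sum_via_stack_object(2, 3)` returns 5; the point is where the 
intermediate values live, not the arithmetic.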

>
> Let me leverage a bit further:
>
> If I need to work on high-level IR (not machine dependent, not in the 
> code-gen stage), is it reasonable to assume that ALL LLVM IR
> instructions that have a result field have the potential to write to
> the stack?

Strictly speaking, I would go so far as to assume that any LLVM IR 
instruction can write to the stack frame.

>
>
> E.g.
>    <result> = add <ty> <op1>, <op2>          ; yields {ty}:result
>
>    br i1 <cond>, label <iftrue>, label <iffalse>
>    br label <dest>          ; Unconditional branch
>
> ADD can (potentially) write to the stack to store its result, while BR 
> will NEVER write to the stack because it doesn't have a result.

You might be able to get away with this on some platforms.  However, you 
can't assume this in general; the LLVM IR makes no guarantees at all 
about which instructions read and write the stack frame and which do 
not.  The branch could load its argument from the stack frame or from a 
global value pool.  On a VLIW machine, it could be packed into an 
instruction that also contains a read/write from/to the stack frame.  
Maybe the processor only supports indirect branch instructions.

Whether you should treat LLVM IR branches as writing to the stack frame 
depends on which hardware architecture you're targeting and what you're 
doing.  If you're counting memory accesses for a heuristic only on x86, 
then assuming that branches don't write to memory seems reasonable.  If 
you need an accurate count on all supported platforms, I'd look into 
analyzing the generated machine code.
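
A minimal sketch of the IR-level heuristic, assuming a hypothetical, 
simplified opcode model rather than the real LLVM API.  It treats only 
stores and calls (which cover intrinsics such as memcpy) as possible 
writers of IR-visible memory, and it deliberately ignores spill slots and 
anything else the code generator may add.  The enum and function names are 
invented for illustration, and the opcode list is intentionally incomplete.

```c
#include <stddef.h>

/* Hypothetical, simplified opcode set; not the real LLVM instruction list. */
enum toy_opcode { TOY_LOAD, TOY_STORE, TOY_ADD, TOY_BR, TOY_CALL };

/* Returns 1 if the opcode may write to memory that is visible at the IR
   level in this toy model. Calls are treated conservatively because the
   callee (for example a memcpy intrinsic) may write. */
int toy_may_write_ir_memory(enum toy_opcode op) {
    switch (op) {
    case TOY_STORE:
    case TOY_CALL:
        return 1;
    default:
        return 0;
    }
}

/* Counts the possible IR-level writers in a sequence of opcodes. */
size_t toy_count_writers(const enum toy_opcode *ops, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        count += (size_t)toy_may_write_ir_memory(ops[i]);
    }
    return count;
}
```

For a load/load/add/store sequence like the one quoted further down, this 
model reports one writer, the store.  In a real pass, LLVM's own 
`Instruction::mayWriteToMemory()` query is the place to start instead of a 
hand-written table like this one.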

-- John T.

>
>
> Thank you
>
> Chuck
>
>
>
>
> On 1/21/2011 5:33 PM, John Criswell wrote:
>> On 1/21/11 2:50 PM, Chuck Zhao wrote:
>>> I need to figure out all LLVM Instructions that may write to memory.
>>>
>>> In http://llvm.org/docs/tutorial/OCamlLangImpl7.html, it mentions that
>>> "In LLVM, all memory accesses are explicit with load/store 
>>> instructions, and it is carefully designed not to have (or need) an 
>>> "address-of" operator."
>>>
>>> I take this as "StoreInst is the only one that writes to memory".
>>
>> There are intrinsic functions which write to memory also, such as memcpy.
>>>
>>> However, this doesn't seem to be enough.
>>
>> Your observation is correct.  Strictly speaking, any instruction can 
>> write to memory after code generation because it may access a stack 
>> spill slot or a function parameter which the ABI places on the stack.
>>
>> When the Language Reference Manual talks about writing to memory, it 
>> is talking about writing to memory that is visible at the LLVM IR 
>> level.  The stack frame is invisible at the LLVM IR level.  Put 
>> another way, "memory" is a set of memory locations which can be 
>> explicitly accessed by LLVM load and store instructions and are not 
>> in SSA form; it is not all of the memory within the computer.
>>
>> If you're interested in finding instructions that write to RAM 
>> (including writes to stack spill slots), it may be better to work on 
>> Machine Instructions within the code generator framework.
>>
>> -- John T.
>>
>>
>>>
>>> Consider:
>>> ...
>>> int a, b, d;
>>> d = a + b;
>>> ...
>>>
>>> The above code is turned into LLVM IR:
>>>    %0 = load i32* @a, align 4
>>>    %1 = load i32* @b, align 4
>>>    %2 = add nsw i32 %1, %0
>>>    store i32 %2, i32* @d, align 4
>>>
>>> Is it possible that temps such as %0, %1 and/or %2 will NOT be register allocated later in the compilation stage, and thus be left in memory?
>>>
>>> The above code, when converted back to C level, looks like this:
>>> ...
>>>    unsigned int llvm_cbe_tmp__6;
>>>    unsigned int llvm_cbe_tmp__7;
>>>    unsigned int llvm_cbe_tmp__8;
>>>    unsigned int llvm_cbe_tmp__9;
>>>
>>>    llvm_cbe_tmp__6 = *(&a);
>>>    llvm_cbe_tmp__7 = *(&b);
>>>    llvm_cbe_tmp__8 = ((unsigned int )(((unsigned int )llvm_cbe_tmp__7) + ((unsigned int )llvm_cbe_tmp__6)));
>>>    *(&d) = llvm_cbe_tmp__8;
>>>    llvm_cbe_tmp__9 =  /*tail*/ printf(((&_OC_str.array[((signed int )0u)])), llvm_cbe_tmp__8);
>>> ...
>>>
>>> It seems the compiler-generated temps are _actually_ left on the stack, and writes to them are actually writes to stack memory (via load, add, ...).
>>>
>>>
>>>
>>> I am confused here.
>>> Could somebody help to clarify it?
>>>
>>> Thank you
>>>
>>> Chuck
>>>
>>>
>>>
>>
>
