[LLVMdev] RFC: SDNode Flags
Dan Gohman
gohman at apple.com
Mon Aug 3 10:38:47 PDT 2009
On Aug 3, 2009, at 9:30 AM, David Greene wrote:
> On Saturday 01 August 2009 15:12, Dan Gohman wrote:
>
>> LoadSDNode, which inherits from MemSDNode is the largest
>> SDNode. With the current SDNode allocation strategy, making it
>> bigger will increase the allocation needed for all nodes.
>
> Ok.
>
>
>>> new (N) LoadSDNode(..., isVolatile|isNonTemporal);
>>>
>>> Thoughts?
>>
>> This sounds reasonable. I'd suggest using names like MemRefFlags
>> rather
>> than isVolatileisNonTemporal, but that's just a detail :-).
>
> There's a pipe there. :)
Oops. I blame my font ;-).
>
>> My biggest concern is how this is encoded in the SDNode. It'd be
>> good to avoid making MemSDNode bigger, but there are a variety of
>> ways
>> that exiting bits can be made available. NodeType doesn't need all
>> 16
>> of
>> its bits, for example, and OperandsNeedDelete could be merged with
>> OperandList with a PointerIntPair if needed.
>
> *shudder* Undefined behavior? No thanks.
What undefined behavior? Is it PointerIntPair that's making you
shudder?
That's implementation-defined behavior. Very different :-). And it's
checked
by asserts.
>
> Right now, the lower five bits of SubclassData are used to encode
> various
> things for memory SDNodes. One of those is the volatile bit. This
> leaves
> 10 bits for alignment information, meaning we can represent
> alignments up to
> 2^1024, which seems like a bit much. :)
>
> What do you think about carving four more bits out of the alignment
> field
> so we have five flag/semantic bits and can represent alignments up
> to 2^64?
> That seems like plenty of alignment headroom.
Yes, that's fine too. If someone really wanted an alignment greater
than 1<<64, they would probably have other issues that would dwarf
this one.
>
>> How are you representing movnt in LLVM IR, with an intrinsic?
>
> Yes, it's an intrinsic inserted just before the load or store of
> interest.
> I'm not particularly happy with this solution because I'm not sure all
> transformation passes will preserve the ordering of
> intrinsic-followed-by-operation. Is there a better way to do this?
Crazy idea: how about an intrinsic just after the load or store of
interest?
It would be a sort of opposite of prefetch, saying "i'm not going to be
using the memory at this address for a while". It could use
IntrWriteArgMem like the regular prefetch does, which is usually
sufficient to keep it from wandering off, though it is a little
stronger than
is strictly necessary.
The other alternative is something like llvm.x86.sse.movnt.ps and
friends.
Dan
More information about the llvm-dev
mailing list