[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)

Fri Jul 17 13:16:54 PDT 2015

> On Jul 14, 2015, at 11:02 PM, Philip Reames <listmail at philipreames.com> wrote:
> 
> Comments inline.  I tried to reply to particular points of discussion downthread, so this is probably best read last.  
> 
> On 07/09/2015 02:04 PM, Juergen Ributzka wrote:
>> Hi @ll,
>> 
>> over the past year we gained more experience with the patchpoint/stackmap/statepoint intrinsics and it exposed limitations in the stackmap format.
>> The following proposal includes feedback and request from several interested parties and I would like to hear your feedback.
>> 
>> Missing correlation between functions and stackmap records:
>> Originally the client had to keep track of the ID to know which stackmap record belongs to which function, but this would stop working once functions are inlined.
>> The new format fixes that by adding a direct reference from the function to the stackmap records.
> +1
>> 
>> Call Size and Function Size:
>> These are additional information that are of interest and have been added to the format.
>> @Swaroop: Could you please provide a little more detailed explanation on the "Call Size" field and what exactly is there recorded. Is it just the call instruction
>> or also the materialization code for the address? For what is this used for?
> +1 subject to details being settled
>> 
>> Flat format:
>> We think moving to a flat form will make parsing easier, because every record has a fixed size and offsets can be calculated easily. 
> I'm still not convinced this is actually a good idea, but I have no strong objection either. 
>> 
>> The plan is to also
>> provide parsers for both stackmap versions (there is already one for the first format in tree) and a corresponding C-API to make it easier for clients to
>> adopt the new format. There is no plan to drop the original format and we will continue to support both formats. I will ask for feedback on the C API in a
>> separate RFC. 
> I strongly feel that trying to support multiple versions of the file format in tree is a bad idea.  It's a distraction, support burden, and provides extremely little value.  What's the value in supporting a parser for a format that ToT will no longer be able to emit?  Anyone who is actually consuming the old format would have to be using the old version in process anyways.

I don’t think supporting multiple version will be that hard and I don’t plan to keep all current and future version in tree forever. But we should provide a reasonable long transition period that allows clients to transition from one version to another - basically one release cycle.

>   
> 
> (Possible exception: Is this intended to simplify the object caching model?  If so, have we ever given any guarantees about object caching in the JIT across versions?  That seems risky at best..)
> 
> I'm also opposed to the complexity of C API, but I'll defer that until the separate RFC is actually posted.  
>> 
>> Another benefit we hope to achieve from this format is to optimize for size by uniquing entries - but that is independent optimization and not required.
>> 
>> More detailed frame record:
>> Clients require more information about the function frame, such as spilled registers, etc. The frame base register i.e. might change when dynamic stack
>> realignment is performed on X86.
>> 
>> 
>> If there is anything missing please let me know.
>> 
>> Thanks
>> 
>> Cheers,
>> Juergen
>> 
>> 
> Is the format Little endian, bit endian? native endian?

I would say whatever the object file format is.

>> Header v2 {
>>   uint8  : Stack Map Version (2)
>>   uint8  : Reserved [3] (0)
>>   uint32 : Constants Offset (bytes)
>>   uint32 : Frame Records Offset (bytes)
>>   uint32 : Frame Registers Offset (bytes)
>>   uint32 : StackMap Records Offset (bytes)
>>   uint32 : Locations Offset (bytes)
>>   uint32 : LiveOuts Offset (bytes)
> Let's reserve at least two uint32s for future field groups so that we can extend in a backwards compatible way.  Another option would be to steal one of the reserved bytes and require parsers to ignore sections they don't understand.  
> 
> Actually, so long as we document that there might be unknown bytes between Header.end and Constants.begin we should be fine.  It just needs to be *clearly* documented.

That was the idea. That way we could add new sections and old parsers would still work. They would just ignore those sections. Although with the addition of official parser in trees I am not sure how important this is anymore.

> 
> As an example, if we need to add the SymbolConstants section mentioned down thread, that shouldn't require an version change.

Once we release a LLVM version any change should require a version change.

>   
>> }
>> 
>> align to 8 bytes
>> Constants[] {
>>   uint64 : LargeConstant
>> }
>> 
>> align to 8 bytes
>> FrameRecord[] {
>>   uint64 : Function Address
>>   uint32 : Function Size
>>   uint32 : Stack Size
>>   uint16 : Flags {
>>     bool : HasFrame
> What does this mean?

If the function has a frame pointer.

>   
>>     bool : HasVariableSizeAlloca
>>     bool : HasStackRealignment
>>     bool : HasLiveOutInfo
>>     bool : Reserved [12]
> 
>>   }
>>   uint16 : Frame Base Register Dwarf RegNum
> I don't think this is actually sufficient for dynamic stack realignment.  Don't you end up with some things indexed off RSP and others indexed off RBP?  Actually, wait.. that's entirely handled within the record format.  What does tracking the Frame Base Register give us?  I think I'm missing a use case here.
>>   uint16 : Num Frame Registers
>>   uint16 : Frame Register Index
> Are these (begin, begin+length) offsets into the Frame Registers table?  If so, should we swap them?

Sure

>>   uint16 : Num StackMap Records
>>   uint16 : StackMap Record Index
>> }
>> 
>> align to 4 bytes
>> FrameRegister[] {
>>   uint16 : Dwarf RegNum
>>   int16  : Offset
>>   uint8  : Size in Bytes
>>   uint8  : Flags {
>>     bool : IsSpilled
>>     bool : Reserved [7]
>>   }
>> }
> This seems tied specifically to the assumption that registers are spilled on entry.  Given the recent work towards shrink wrapping, is that something we want to do?
> 
> p.s. What's with the IsSpilled bit?  Why would you have a record for an unspilled register?

This was done before shrink wrapping and pristine register support. The original problem was that the live-out set didn’t contain pristine registers and we have to add all callee-saved register to the live-out set too to make sure we didn’t miss anything. We could do better by recording which callee-save registers actually got spilled and which not.

>> 
>> align to 8 bytes
>> StackMapRecord[] {
>>   uint64 : PatchPoint ID
> "ID" (remove the patchpoint since statepoint now has one too)
sure

>>   uint32 : Instruction Offset
>>   uint8  : Call size (bytes)
> "Patchable size" per down thread discussion
>>   uint8  : Flags {
>>     bool : HasLiveOutInfo
>>     bool : Reserved [7]
>>   }
> Why do we need these flags?  Isn't this Num LiveOuts == 0?

A verification that the information was actually computed - we might have 0 live-outs too. We could also change this in the future to attach this information only to call-sites that have an attribute that request this information.

>>   uint16 : Num Locations
>>   uint16 : Location Index
>>   uint16 : Num LiveOuts
>>   uint16 : LiveOut Index
>> }
>> 
>> align to 4 bytes
>> Location[] {
>>   uint8  : Register | Direct | Indirect | Constant | ConstantIndex
>>   uint8  : Reserved (location flags)
>>   uint16 : Dwarf RegNum
>>   int32  : Offset or SmallConstant
>> }
>> 
>> align to 2 bytes
>> LiveOuts[] {
>>   uint16 : Dwarf RegNum
>>   uint8  : Reserved
>>   uint8  : Size in Bytes
>> }
>> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150717/6a9608d7/attachment.html>