[llvm-dev] LLVM as a back end for HHVM

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Tue Sep 8 09:35:00 PDT 2015


On 09/04/2015 11:36 AM, Brett Simmers via llvm-dev wrote:
> On 9/4/15 1:12 AM, Sanjoy Das via llvm-dev wrote:
>> Specifically on "Location records" --
>>
>> Is it legal for the optimizer to drop the `!locrec` metadata that you
>> attach to instructions?  The general convention in LLVM is that
>> dropping metadata should not affect correctness, and if the location
>> record information is not best-effort or optional then metadata is
>> perhaps not the best way to represent it.
>
> Unfortunately not - all of our uses of locrecs are required for 
> correctness.
This will need to be a function attribute or operand bundle when 
upstreamed then, but that's a pretty simple change to make.
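
For concreteness, the metadata form being discussed would look roughly 
like the sketch below (the exact payload HHVM attaches isn't shown in 
this thread, so the operand here is purely illustrative):

   call void @foo(i64 %val), !locrec !0
   ...
   !0 = !{i32 42}

Since any pass that rewrites or clones the call is free to drop that 
attachment, metadata alone can't carry correctness-critical information; 
an attribute or operand bundle doesn't have that problem.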
>
>>
>> We're currently developing a scheme called "operand bundles" (more
>> details at [1], patches at [2]) that can be used to tag calls and
>> invokes with arbitrary values in a way that they won't be dropped
>> by the optimizer.  The exact semantics of operand bundles are still
>> under discussion, and I'd like to make the mechanism broadly useful,
>> including for things like location records.  So it'd be great if you
>> could take a look to see if location records can be implemented using
>> operand bundles.  I was thinking of something like
>>
>>    musttail call void @foo(i64 %val) [ "locrec"(i32 42) ]
>>
>> where the "locrec"(i32 42) gets lowered to some suitable form in
>> SelectionDAG.
>
> That sounds like it should work. One of the ideas behind locrecs was 
> that they'd work with any instruction, not just calls. We currently 
> only use locrecs on call/invoke, and I can't think of anything we 
> haven't yet implemented that would benefit from locrecs on other 
> instructions (that may change in the future, of course).
Interesting.  What kinds of use cases are you imagining for locrecs on 
non-call instructions?  Are you thinking of things like implicit null 
and div-by-zero checks?  (The former is already supported in LLVM 
today.)  Or something else entirely?
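
For reference, the implicit null check support that exists today works 
by tagging the branch of an explicit check with !make.implicit metadata 
and letting the ImplicitNullChecks pass fold the check into a faulting 
memory access; roughly:

   %is_null = icmp eq i8* %ptr, null
   br i1 %is_null, label %throw, label %fast, !make.implicit !0
 fast:
   %val = load i8, i8* %ptr
   ...
   !0 = !{}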
>
>> I'm also curious about HHVM's notion of side exits -- how do you track
>> the abstract state you have to exit to?  Our primary use-case
>> for operand bundles is to track the abstract state of a thread (the
>> "interpreter state") that we need for side exits and asynchronous code
>> invalidation.
>
> All VM state syncing for side exits is explicit in the IR we lower to 
> LLVM (as a series of stores on an unlikely path), so we don't need 
> anything special from LLVM here. We use locrecs to update our jump 
> smashing stubs, so they know the address of the jump that entered the 
> stub and should be smashed.
On the VM state syncing side of things, I'm curious about the tradeoffs 
involved in the approach you've used.  I'm guessing from what you said 
that you're essentially pre-reserving a set of allocas for the spill 
locations, somehow registering those with your runtime once, then 
emitting stores down the unlikely path into those allocas.  Is that 
roughly right?
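
Concretely, I'm picturing something along these lines (names and layout 
entirely made up, just to check I have the right shape in mind):

 entry:
   %vm_sp.slot = alloca i64          ; spill slots pre-reserved and
   %vm_pc.slot = alloca i64          ; registered with the runtime once
   %local0.slot = alloca i64
   ...
   br i1 %guard, label %fast, label %side_exit

 side_exit:                          ; unlikely path
   store i64 %cur_sp, i64* %vm_sp.slot
   store i64 %cur_pc, i64* %vm_pc.slot
   store i64 %local0, i64* %local0.slot
   call void @exit_to_interpreter()
   unreachable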

How are you handling things like constants and duplicate values 
appearing in the VM state?  Our experience has been that constants are 
fairly common and so are duplicate values (particularly when combined 
with GC state).  It would seem like your frame sizes would be inflated 
if you had to pre-reserve space for each constant and each copy of a 
value.  Have you found this to be true?  If so, has it been problematic 
for you?
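
To make the arithmetic concrete: a VM state of, say, (the constant 42, 
%x, %x) would seem to need three reserved slots and three stores under 
that scheme, even though only one distinct live value is involved:

   %slot0 = alloca i64               ; reserved just to hold the constant
   %slot1 = alloca i64               ; first copy of %x
   %slot2 = alloca i64               ; second copy of %x
   ...
   store i64 42, i64* %slot0
   store i64 %x, i64* %slot1
   store i64 %x, i64* %slot2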

Philip

