[PATCH] IR: Move the slot tracker out of AsmWriter into a separate public module.

Wed Jun 17 15:51:07 PDT 2015

> On 2015 Jun 17, at 15:44, Sean Silva <chisophugis at gmail.com> wrote:
> 
> 
> 
> On Wed, Jun 17, 2015 at 2:15 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2015 Jun 17, at 14:03, Alex L <arphaman at gmail.com> wrote:
> >
> >
> >
> > 2015-06-17 13:52 GMT-07:00 Duncan P. N. Exon Smith <dexonsmith at apple.com>:
> >
> > > On 2015 Jun 17, at 13:20, Alex Lorenz <arphaman at gmail.com> wrote:
> > >
> > > Hi dexonsmith, bob.wilson, bogner,
> > >
> > > This patch moves the SlotTracker class out of AsmWriter.cpp into a separate module that's publicly accessible.
> > >
> > > This patch would be useful for MIR Serialization, in particular it would enable the MIR parser to parse metadata machine operands. The metadata machine operands are serialized using the familiar '!' <slot> notation, and the MIR parser has to be able to map from slot numbers to the actual metadata nodes. The SlotTracker class would allow the MIRParser to create this mapping.
> >
> > I can see that this would be useful for *writing* .mir files, but I
> > don't think you can safely use this for *reading* .mir files.
> >
> > Metadata slots can be assigned arbitrarily in an LLVM IR file, such as:
> >
> >     !named = !{!36, !72}
> >     !72 = !{!"string"}
> >     !36 = !{!72, !{!{}}}
> >
> > If you were to parse the module and then run the slot tracker, you'd get:
> >
> >     !named = !{!1, !2}
> >     !1 = !{!2, !3}
> >     !2 = !{!"string"}
> >     !3 = !{!4}
> >     !4 = !{}
> >
> > or something close to that.
> >
> > You couldn't safely take an already-parsed Module, run the slot-tracker
> > on it, and then parse machine functions that referenced metadata.  But
> > it sounds like that's what you're suggesting?
> >
> > That makes sense. Yeah, this patch wouldn't really work then.
> >
> >
> > Instead, I think you need to:
> >
> >  1. Yes, surface the slot tracker (exactly this patch), but for completely
> >     different reasons:  so that you can write out correct metadata numbers
> >     for metadata references within machine functions, to match the metadata
> >     that you wrote out for the LLVM IR.
> >  2. Use (1) so that the LLVM IR portion of the MIR and the machine
> >     functions are written out with the same slots.
> >
> > I don't need to surface the slot tracker then, as I can print out the correct metadata slot numbers by printing the metadata nodes as operands. Printing them that way creates the slot tracker and initializes it for the whole module, so the correct slot numbers are printed.
> 
> Oof, that sounds expensive.  Actually, I know it is: I made it expensive,
> since previously it was just about useless :).
> 
> This'll be fine for hand-written testcases, but if someone is debugging
> some crash from a big input, the MIR will take O(N^2) time to print (it
> needs to make slots for all metadata every time something prints out a
> metadata machine operand).
> 
> Given that the main goal right now *is* testcases, maybe this is okay,
> but please add a FIXME to make it efficient as a follow-up.
> 
> FWIW, llvm-diff has the same problem (which makes it nearly useless for LTO issues). Maybe we could have some way for the module to cache this, with the proviso that you must not further modify the module without invalidating the result?
> 
> -- Sean Silva

I think that would be too bug-prone.  Lots of passes dump things out in
DEBUG statements before/after making changes.

But there are some ideas I had a while ago that I never followed up on:

 1. Add API for a "lazy" numbering, which assigns slots in the order
    slots are requested.  This would make the numbering self-consistent
    within a given dump, but incomparable between separate dumps.
 2. Add API for passing in a SlotTracker.  Caller is in charge of only
    passing in the same slot tracker if it's still going to be valid.

Note that #2 easily solves the problem in MIR.  The trick is making it
easy to use elsewhere.
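
To make the two ideas concrete, here is a minimal sketch.  It is not
LLVM's actual API: `Node`, `LazySlotTracker`, and `printOperand` are
hypothetical names made up for illustration, and `Node` merely stands
in for a metadata node.  The tracker hands out slots in the order they
are first requested (#1), and the printer takes a caller-owned tracker
instead of building one per call (#2).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a metadata node; not an LLVM type.
struct Node {
  std::string Name;
};

// Hypothetical lazy tracker (idea #1): a node gets the next free slot
// the first time it is requested, and keeps that slot afterwards.
class LazySlotTracker {
  std::unordered_map<const Node *, unsigned> Slots;

public:
  unsigned getSlot(const Node *N) {
    auto It = Slots.find(N);
    if (It != Slots.end())
      return It->second;
    // Read the size before inserting so the new slot is the old count.
    unsigned Slot = static_cast<unsigned>(Slots.size());
    Slots.emplace(N, Slot);
    return Slot;
  }

  std::size_t size() const { return Slots.size(); }
};

// Hypothetical printer (idea #2): the caller owns the tracker and
// decides how long it stays valid; nothing is rebuilt per operand.
std::string printOperand(const Node &N, LazySlotTracker &Tracker) {
  return "!" + std::to_string(Tracker.getSlot(&N));
}
```

With one tracker shared across a dump, each operand costs a single
hash lookup rather than a walk over all metadata.  The numbers depend
on request order, though, so two dumps made with fresh trackers are
not comparable, which is the tradeoff noted in #1.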