[cfe-dev] [llvm-dev] [RFC][llvm-mca] Adding binary support to llvm-mca.

Mon Dec 17 20:37:41 PST 2018

Hi Andrea,

I’ve updated the patch here:
https://reviews.llvm.org/D54603/new/

I have not modified the description in the Phabricator.  Of course, if this solution is the way-forward, then I will be happy to update the change description.
I also modified the format of the symbol names (from what I previously describe in my last response below), to save room:  “.mca_code_region.<id>.<number>”.  Where “id” is the user’s specified identifier they want associated to the region (cosmetic), and “number” is a module unique number for the region.  I have tested this with non-executable objects, executable files, and multiple objects linking into a single executable.

Also, I am trimming this thread up (sorry, but it’s starting to get unwieldy and I think the mailing list bot is grumpy).

-Matt

From: Andrea Di Biagio <andrea.dibiagio at gmail.com>
Sent: Monday, December 17, 2018 10:12 AM
To: Davis, Matthew <Matthew.Davis at sony.com>
Cc: Simon Pilgrim <llvm-dev at redking.me.uk>; llvm-dev <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org
Subject: Re: [llvm-dev] [RFC][llvm-mca] Adding binary support to llvm-mca.

Hi Matt,

could you please update the associated patch too?

Thanks
-Andrea

On Mon, Dec 17, 2018 at 5:48 PM via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:
Adding the llvm-dev list, because my email client decides to remove certain lists when I reply-all... including the list that I intend to respond to.

> -----Original Message-----
> From: Davis, Matthew <Matthew.Davis at sony.com<mailto:Matthew.Davis at sony.com>>
> Sent: Monday, December 17, 2018 9:47 AM
> To: Davis, Matthew <Matthew.Davis at sony.com<mailto:Matthew.Davis at sony.com>>; llvm-dev at redking.me.uk<mailto:llvm-dev at redking.me.uk>
> Cc: cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
> Subject: RE: [llvm-dev] [RFC][llvm-mca] Adding binary support to llvm-mca.
>
> Just an update to this RFC.  (I thought this was going to be a short email... my
> apologies).
>
> One of the primary limitations described in this RFC (earlier in this thread), is that the
> patch for this RFC only handles linked executables.  This restriction is due to myself
> wanting to avoid handling relocations in a target specific manner, at least in the
> initial patch.  Especially so, since I want to keep the initial patch simple.  However,
> sometimes simple and practical are at odds with each other.  I envision the main use-
> case of llvm-mca, with binary support, is for analyzing .o files.  Probably analyzing .o
> more commonly than fully linked executables.  Without handling relocated objects,
> this patch seems rather useless.
>
> I started exploring an alternative solution this weekend to the aforementioned
> problem.  This alternative solution avoids having to handle relocations, but does give
> us support for object files (with relocated symbols) and linked executables.  The
> change is quite simple, and seems to be effective.  In short, we still generate
> intrinsics as discussed in the RFC, one to mark the start of a code region, and
> another to mark the end.  These intrinsics get lowered into local symbols.  The
> symbols are already  encoded with address information about their position in the
> object file.  What is different is that we ensure that these symbols have unique
> names and also encode the user provided ID value.
>
> Previously the labels  were named like: .Lmca_code_region_start_<number>, and
> similar for .Lmca_code_region_end.  The user id number and region size were
> encoded in the .mca_code_regions object file section.  Previously mca never looked
> at the symbol table.  But, In reality we can calculate the region size by using the
> symbols in the symbol table (look for the mca symbols), instead of relying on the
> information encoded in .mca_code_regions.  The alternative approach gets rid of
> that section entirely but achieves the same functionality by encoding the information
> in the symbol name. In short the alternative approach just parses the symbol table
> for MCA symbols, and the symbol names are encoded with the data we need.
>
> The newly proposed name is formatted
> as: .Lmca_code_region_start.<id>.<function>.<number>.  Similar for
> mca_code_region_end.  'function' is the function that the marker appears in, 'ID' is
> the user-specified ID (this is a value that users specify for easily identifying the code
> region under analysis... just cosmetic), and 'number' is a unique number to avoid any
> duplicate name conflicts.  The benefit of this alternative solution is that we can get
> rid of .mca_code_regions, and gather all of the information llvm-mca needs by
> parsing the symbol table looking for any symbols with the 'mca_code_region_start'
> and 'mca_code_region_end' format discussed above.  Of course, if the string table is
> stripped, then we will lose this data.  The main drawback from this alternative
> approach is that it relies on encoding symbol names and string processing on those
> names.   I'm somewhat biased against doing string parsing, but the code to perform
> this is simple and small, and more importantly it allows llvm-mca to handle linked or
> relocated object files.
>
> -Matt
>
>
>
> > -----Original Message-----
> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org<mailto:llvm-dev-bounces at lists.llvm.org>> On Behalf Of via llvm-dev
> > Sent: Tuesday, December 11, 2018 11:23 AM
> > To: llvm-dev at redking.me.uk<mailto:llvm-dev at redking.me.uk>; llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
> > Cc: cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] [RFC][llvm-mca] Adding binary support to llvm-mca.
> >
> > Thanks for the response Simon.  My reply is inline:
> >
> > > From: Simon Pilgrim <llvm-dev at redking.me.uk<mailto:llvm-dev at redking.me.uk>>
> > > Sent: Monday, December 10, 2018 1:40 PM
> > >
> > > Hi Matt,
> > >
> > > I can see a near future where perf-analysis tooling uses branch history
> > > profiler captures to determine how often loops/branches are taken and
> > > feeds that into llvm-mca, especially for hot/branchy loop analysis
> > > reports etc. Are you confident that your approach will be easily
> > > extendable for this?
> >
> > That is a very interesting use case.  The restriction of a code-region to a single
> block
> > is a limitation for any tools that want to analyze branches.  However, I believe
> > that it will be easy to lift this restriction (it's just a check in IR/Verifier).  This
> > limitation is not
> > expressed in the llvm-mca driver.
> >
> > If the information is coming from a profile report, then we'd most
> > likely need to extend the llvm-mca driver to accept profile reports.  Currently,
> > code regions, from the perspective of the  llvm-mca driver, are very simple. They
> are
> > just a collection of MCInst.  The binary support in this RFC+patch
> > disassembles just the address range from marker start address for
> > some specified number of bytes.  It might be useful to add another driver argument
> > so that a user (or tool) can specify, from the command line, a range of instructions
> > to
> > analyze.  I recently added a class for handling inputs to
> > llvm::mca::CodeRegionGenerator,
> > which is just responsible for taking some input and creating a list of MCInst that
> llvm-
> > mca uses.
> > We could subclass this to handle profile reports.
> >
> > > Similarly, being able to generally embed the profile markers in object
> > > libraries for reuse is going to be important for some people - I'd like
> > > to see more of a plan of how this will be achieved. I understand that it
> > > might not be easy for some exe formats.
> >
> > That is definitely a limitation.  This initial patch+RFC only handles linked
> > executables (i.e., the llvm-mca marker symbol addresses are resolved).
> > I'm working on a better solution so that this will not be a restriction.
> > In fact, I'll probably delay trying to land any patches until I solve relocations
> > (or use a different solution for identifying start/end addresses for llvm-mca code
> > regions).
> >
> > > Sorry if I'm being too critical, but I'm a bit worried that we end up
> > > with an initial implementation that will take a lot of reworking to meet
> > > our final aims.
> > >
> > > Thanks, Simon.
> >
> > I understand your criticisms and value your input. Thanks a ton!
> >
> > -Matt,
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181218/f4859425/attachment.html>