[cfe-dev] [llvm-dev] [RFC][llvm-mca] Adding binary support to llvm-mca.

Guillaume Chatelet via cfe-dev cfe-dev at lists.llvm.org
Mon Dec 10 07:15:46 PST 2018


+1 to what Clement said.
I believe the intrinsics are a better design to support many architectures.

IACA users are probably decorating their code with IACA_START / IACA_END
macros. One possibility is to provide a header that define these macros in
terms of the new intrinsics.

On Mon, Dec 10, 2018 at 3:59 PM Clement Courbet <courbet at google.com> wrote:

> Hi Matt/Andrea,
>
> I see pros and cons for IACA-style markers vs intrinsics.
> On the one hand, IACA-style markers are very magical, and not very visible
> in both the source and object code. Using IACA-style markers has the
> advantage that you can use llvm-mca as a drop-in replacement for IACA, or
> even to compare their outputs on the exact same binary. They also do not
> require tooling on the compiler side and allow comparing the output of
> several compilers.
>
On the other hand, IACA-style markers do not have a equivalent on other
> architectures, and I'm not sure inventing new ones is a good idea :) I
> think the latter makes them pretty much a no-go for llvm-mca as I don't
> think we'll want to teach each target how to parse code regions. That's
> much better handled in a target-agnostic way by the object. Intel got away
> with them because they only had to support one architecture.
>
> tl;dr: In the case of llvm-mca, I like your design better than the markers.
>
> In terms of future-proofness of only allowing regions within a basic
> block, are we confident we can actually ever simulate branches apart from
> "always taken, perfectly predicated" loop ? Even this simple need requires
> knowing quite a few details on the frontend. The current design could
> handle this use case with the addition of an external "loop mode" option to
> MCA. If there are no other strong use cases, I would advocate for
> experimental intrinsics unless people can contribute other example use
> cases.
>
> On Mon, Dec 3, 2018 at 11:38 PM Matt Davis <matthew.davis at sony.com> wrote:
>
>> Hi Andrea,
>>
>> On Mon, Dec 03, 2018 at 01:21:33PM +0000, Andrea Di Biagio wrote:
>> > So, I have been thinking a bit more about this whole design.
>> >
>> > The more I think about your suggested design, the more I am convinced
>> that
>> > we should do something more to support ranges in binary object files
>> too.
>> > My understanding is that the reason why we don't support object files in
>> > general, is because of the presence of relocations. That is because a
>> > region start marker is effectively symbol relative, and the symbol (a
>> > function) would be relocated in the final executable.
>> > You mentioned to me that resolving even a 'simple' symbol-relative
>> > relocation is not trivial, beause it requires specific knowledge about
>> the
>> > binary format, and the target (i.e. how relocations are encoded is
>> target
>> > specific). I am surprised that there is not a utility library for
>> resolving
>> > relocations.. but I am not familiar with that part of the compiler. I
>> was
>> > hoping that there was a target specific interface to use in this case...
>>
>> There might be a better way of resolving the relocs, but from what I saw
>> looking at llvm-objdump and other related tools, it seems that resolving
>> the relocated symbol is a target specific effort.  I also spent sometime
>> sniffing around ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp which also
>> performs the reloc resolution.  I should clarify that I too am not an
>> expert in llvm's utilities for performing symbol/reloc resolution, and
>> perhaps someone in the community can point me in the right direction.  I
>> can clearly see the reloc data in the object file via tools like
>> objdump; however, accessing the relocs via
>> llvm::object::ObjectFile::relocations() did not produce address values
>> that we could use (values of zero).
>>
>> I was hoping that, for a first pass at this patch, supporting just
>> executables would be okay.  That keeps this initial patch set simple,
>> and hopefully will encourage others to take a peek at it, since it's
>> less daunting than what it might otherwise be.  Of course, there is the
>> concern that this initial patch will lock us into a design that will be
>> more complicated to unravel later.
>>
>> > An alternative approach would require that you define your own
>> > "symbol-relative" reference. After all, ranges are just a sequences of
>> > instructions in a function. If a function symbol is described by the
>> symbol
>> > table, then you should be able to obtain its offset in the .text
>> section.
>> > So, you could potentially encode your own symbol+offset. However, the
>> > linker would not be able to understand your "custom relocation", and
>> > information about regions in the final elf would be basically broken.
>> > So,that would not be a solution...
>> >
>> > I don't know honestly what is the best approach to use in this case.
>> > As a compromise, it would not be a bad idea to add the ability to
>> specify
>> > ranges from command line. What do you think?
>> > Still, from a user point of view, the idea that we don't support object
>> > files in general sounds like a big limitation.
>>
>> I agree, only supporting executables is a limitation.  However, I'd
>> like to land the base support now and add in the additional
>> features/support after this large patch set lands.  But I can see
>> where landing the whole thing entirely also makes sense.
>>
>> > About the new experimental intrinsics: those would definitely work well
>> for
>> > the simple case where instructions are from the same basic block.
>> > However, some/most of the constraints that you plan to add will have to
>> > change if in future we decide to allow ranges that potentially cross
>> > multiple basic blocks. How will the rules/constraints on those new
>> > intrinsics change? I just want to make sure that the suggested design is
>> > future-proof.
>>
>> Since the llvm/clang parts of the code are just responsible for
>> collecting where a range starts/ends, I hope that we can remove some
>> of the baked-in constraints that are specified in IR/Verifier.cpp.
>> As you pointed out earlier in this thread, we might want to
>> introduce a dominance check if/when we lift the one-basic-block
>> restriction.
>>
>> -Matt
>>
>> >
>> > -Andrea
>> >
>> > On Tue, Nov 27, 2018 at 5:08 PM Andrea Di Biagio <
>> andrea.dibiagio at gmail.com>
>> > wrote:
>> >
>> > > Thanks for clarifying it Matt.
>> > >
>> > > In general, I quite like your suggested design.
>> > >
>> > > My only concern is about the semantic of the two new intrinsics. You
>> > > design doesn't allow mca ranges to span through multiple basic
>> blocks. That
>> > > constraint is acceptable for now, since llvm-mca doesn't know how to
>> deal
>> > > with control flow.
>> > > However, I am a bit concerned about what might happen in future if we
>> > > decide to let users specify code regions that span through multiple
>> basic
>> > > blocks. Basically, I don't particularly like the idea of changing the
>> > > semantic of already existing intrinsic. A design that already
>> accounts for
>> > > that particular scenario/future work would be ideal. That being said,
>> > > marking those new intrinsics as 'experimental' may be a good
>> compromise (at
>> > > least for now).
>> > >
>> > > So, I am quite happy overall with the direction of this RFC.
>> > > However, I am interesting to hear from other developers about your
>> > > suggested design.
>> > >
>> > > > This initial patch only targets ELF object files, and does not
>> handle
>> > > relocatable addresses. Since the start of a code region is
>> represented as
>> > > an
>> > > assembly label, and referenced in the .mca_code_regions section, that
>> > > address
>> > > is relocatable.
>> > >
>> > > This may be okay for now. However, it would be nice to remove that
>> > > constraint in future and add support to generic object files.
>> > >
>> > > -Andrea
>> > >
>> > > On Thu, Nov 22, 2018 at 7:21 PM <Matthew.Davis at sony.com> wrote:
>> > >
>> > >> I want to clarify a few restrictions of llvm-mca code regions that
>> this
>> > >> RFC proposes:
>> > >>
>> > >> 1) All llvm-mca code regions must start with an
>> > >> llvm.mca.code.region.start intrinsic and end with
>> > >> an llvm.mca.code.region.end intrinsic.  This rule is enforced at the
>> IR
>> > >> level in the IR verifier.
>> > >>
>> > >> 2) llvm-mca code regions cannot nest.  This restriction implies that
>> an
>> > >> llvm.mca.code.region.start
>> > >> must have a llvm.mca.code.region.end intrinsic without any other
>> llvm.mca
>> > >> start intrinsics
>> > >> between the two. The current implementation in the patch enforces
>> this
>> > >> restriction at the
>> > >> IR level via the IR Verifier.
>> > >>
>> > >> 3) An llvm-mca code region cannot span multiple basic blocks.
>> llvm-mca
>> > >> does not follow
>> > >> branches (yet).  Instead, a branch instruction is treated by llvm-mca
>> > >> like any other instruction.
>> > >> The current patch associated with this RFC does not enforce this
>> > >> restriction.  I plan on updating
>> > >> the patch to enforce that a code region can only belong to a single
>> basic
>> > >> block.  This is a simple
>> > >> check, ensuring that both the llvm.mca.code.region.start and
>> accompanying
>> > >> end intrinsics live
>> > >> in the same basic block. I imagine adding this check at the IR level
>> when
>> > >> we also verify points 1 and 2
>> > >> above.  That will keep the code-region verification logic isolated
>> to the
>> > >> IR verifier.  The start/end
>> > >> intrinsics should not have any uses, so I'm not sure that they would
>> be
>> > >> moved/sunk on behalf
>> > >> of any other instruction.  In other words, I do not imagine that a
>> start
>> > >> and end would be split
>> > >> apart due to later MI optimizations.  If I discover that such a case
>> > >> occurs, then I might add the
>> > >> basic-block check prior to emitting the code region data to the
>> object
>> > >> file.    Once  llvm-mca  is
>> > >> updated to handle branches, then we can remove this constraint.
>> > >>
>> > >> -Matt
>> > >>
>> > >> > -----Original Message-----
>> > >> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Matt
>> > >> Davis via llvm-
>> > >> > dev
>> > >> > Sent: Wednesday, November 21, 2018 8:47 AM
>> > >> > To: Andrea Di Biagio <andrea.dibiagio at gmail.com>
>> > >> > Cc: llvm-dev <llvm-dev at lists.llvm.org>; Di Biagio, Andrea
>> > >> > <Andrea.Dibiagio at sony.com>; cfe-dev at lists.llvm.org
>> > >> > Subject: Re: [llvm-dev] [RFC][llvm-mca] Adding binary support to
>> > >> llvm-mca.
>> > >> >
>> > >> > Hi Andrea,
>> > >> >
>> > >> > Thanks for your input.
>> > >> >
>> > >> > On Wed, Nov 21, 2018 at 12:43:52PM +0000, Andrea Di Biagio wrote:
>> > >> > [... snip ...]
>> > >> > > About the suggested design:
>> > >> > > I like the idea of being able to identify code regions using a
>> numeric
>> > >> > > identifier.
>> > >> > > However, what happens if a code region spans through multiple
>> basic
>> > >> blocks?
>> > >> >
>> > >> > The current patch does not take into consideration cases where the
>> > >> > region start and end intrinsics are placed in different basic
>> blocks.
>> > >> > Such would be the case if a region is defined to span multiple
>> blocks.
>> > >> > This would be similar to the current case where a user places a
>> > >> > #LLVM-MCA-BEGIN assembly comment in one block and an #LLVM-MCA-END
>> in
>> > >> > another.  However, as you point out below, if the user does this
>> in the
>> > >> > source code via intrinsics (just what this patch is proposing),
>> then
>> > >> > there is a chance that optimizations might change the layout of the
>> > >> > instructions and confuse the ordering of the MCA intrinsics.
>> > >> >
>> > >> > Since MCA does not follow branches (MCA just treats a branch as it
>> would
>> > >> > a non-branching instruction), it seems that a user should be aware
>> that
>> > >> > defining MCA code regions that span multiple blocks might result
>> in an
>> > >> > unexpected analysis.  While we do not discourage this, it seems
>> like
>> > >> > such a case will probably not produce an expected result for the
>> user.
>> > >> > We could introduce a warning, or automatically divide the regions
>> so
>> > >> > that a single region can only contain a single block.
>> > >> >
>> > >> > > My understanding is that code regions are not allowed to
>> overlap. So,
>> > >> it
>> > >> > > makes sense if ` __mca_code_region_end()` doesn't take an ID as
>> input.
>> > >> > > However, what if ` __mca_code_region_end()` ends in a different
>> basic
>> > >> block?
>> > >> > >
>> > >> > > `__mca_code_region_start()` has to always dominate `
>> > >> > > __mca_code_region_end()`. This is trivial to verify when both
>> calls
>> > >> are in
>> > >> > > a same basic block; however, we need to make sure that the
>> > >> relationship is
>> > >> > > still the same when the `end()` call is in a different basic
>> block.
>> > >> > > That would not be enough. I think we should also verify  that `
>> > >> > > __mca_code_region_end()` always post-dominates the call to
>> > >> > > `__mca_code_region_start()`.
>> > >> >
>> > >> > In any case this patch should probably check dominance of the
>> > >> > intrinsics, even though MCA does not follow branches and MCA does
>> not
>> > >> > not explicitly forbid a region from containing multiple blocks.
>> > >> >
>> > >> > >
>> > >> > > My question is: what happens with basic block reordering? We
>> don't
>> > >> know the
>> > >> > > layout of basic blocks until we reach code emission. How does it
>> work
>> > >> for
>> > >> > > regions that span through multiple basic blocks?. I think your
>> RFC
>> > >> should
>> > >> > > clarify this aspect.
>> > >> > >
>> > >> > > As a side note: at the moment, llvm-mca doesn't know how to deal
>> with
>> > >> > > branches. So, for simplicity we could force code regions to only
>> > >> contain
>> > >> > > instructions from a single basic block.
>> > >> > >
>> > >> > > However, In future we may want to teach llvm-mca how to analyze
>> > >> branchy
>> > >> > > code too. For example, we could introduce a simple control-flow
>> > >> analysis in
>> > >> > > llvm-mca, and use an external "branch trace" information (for
>> > >> example, a
>> > >> > > perf trace generated by an external tool) to decorate branches
>> with
>> > >> with
>> > >> > > branch probabilities (similarly to what we currently do in LLVM
>> with
>> > >> PGO).
>> > >> > > We could then use that knowledge to model branch prediction and
>> > >> simulate
>> > >> > > what happens in the presence of multiple branches.
>> > >> > >
>> > >> > > So, the idea of having regions that potentially span multiple
>> basic
>> > >> blocks
>> > >> > > is not bad in general. However, I think you should better clarify
>> > >> what are
>> > >> > > the constraints (at least, you should answer to my questions from
>> > >> before).
>> > >> >
>> > >> > I agree! Thanks for pointing that out.
>> > >> >
>> > >> > > If we decide to use those new intrinsics, then those should be
>> > >> experimental
>> > >> > > (at least to start).
>> > >> >
>> > >> > Agreed.
>> > >> >
>> > >> > -Matt
>> > >> >
>> > >> > > On Thu, Nov 15, 2018 at 11:07 PM via llvm-dev <
>> > >> llvm-dev at lists.llvm.org>
>> > >> > > wrote:
>> > >> > >
>> > >> > > > Introduction
>> > >> > > > -----------------
>> > >> > > > Currently llvm-mca only accepts assembly code as input. We
>> would
>> > >> like to
>> > >> > > > extend llvm-mca to support object files, allowing users to
>> analyze
>> > >> the
>> > >> > > > performance of binaries. The proposed changes (which involve
>> both
>> > >> > > > clang and llvm) optionally introduce an object file section,
>> but
>> > >> this can
>> > >> > > > be
>> > >> > > > stripped-out if desired.
>> > >> > > >
>> > >> > > > For the llvm-mca binary support feature to be useful, a user
>> needs
>> > >> to tell
>> > >> > > > llvm-mca which portions of their code they would like analyzed.
>> > >> Currently,
>> > >> > > > this is accomplished via assembly comments. However, assembly
>> > >> comments are
>> > >> > > > not
>> > >> > > > preserved in object files, and this has encouraged this RFC.
>> For the
>> > >> > > > proposed
>> > >> > > > binary support, we need to introduce changes to clang and llvm
>> to
>> > >> allow the
>> > >> > > > user's object code to be recognized by llvm-mca:
>> > >> > > >
>> > >> > > > * We need a way for a user to identify a region/block of code
>> they
>> > >> want
>> > >> > > >    analyzed by llvm-mca.
>> > >> > > > * We need the information defining the user's region of code
>> to be
>> > >> > > > maintained
>> > >> > > >    in the object file so that llvm-mca can analyze the desired
>> > >> region(s)
>> > >> > > > from the
>> > >> > > >    object file.
>> > >> > > >
>> > >> > > > We define a "code region" as a subset of a user's program that
>> is
>> > >> to be
>> > >> > > > analyzed via llvm-mca. The sequence of instructions to be
>> analyzed
>> > >> is
>> > >> > > > represented as a pair: <start, end> where the 'start' marks the
>> > >> beginning
>> > >> > > > of
>> > >> > > > the user's source code and 'end' terminates the sequence. The
>> > >> instructions
>> > >> > > > between 'start' and 'end' form the region that can be analyzed
>> by
>> > >> llvm-mca
>> > >> > > > at a
>> > >> > > > later time.
>> > >> > > >
>> > >> > > > Example
>> > >> > > > -----------
>> > >> > > > Before we go into the details of this proposed change, let's
>> first
>> > >> look at
>> > >> > > > a
>> > >> > > > simple example:
>> > >> > > >
>> > >> > > > // example.c -- Analyze a dot-product expression.
>> > >> > > > double test(double x, double y) {
>> > >> > > >    double result = 0.0;
>> > >> > > >    __mca_code_region_start(42);
>> > >> > > >    result += x * y;
>> > >> > > >    __mca_code_region_end();
>> > >> > > >    return result;
>> > >> > > > }
>> > >> > > >
>> > >> > > > In the example above, we have identified a code region, in this
>> > >> case a
>> > >> > > > single
>> > >> > > > dot-product expression. For the sake of brevity and simplicity,
>> > >> we've
>> > >> > > > chosen
>> > >> > > > a very simple example, but in reality a more complicated
>> example
>> > >> could use
>> > >> > > > multiple expressions. We have also denoted this region as
>> number
>> > >> 42. That
>> > >> > > > identifier is only for the user, and simplifies reading an
>> llvm-mca
>> > >> > > > analysis
>> > >> > > > report later.
>> > >> > > >
>> > >> > > > When this code is compiled, the region markers (the
>> mca_code_region
>> > >> > > > markers)
>> > >> > > > are transformed into assembly labels. While the markers are
>> > >> presented as
>> > >> > > > function calls, in reality they are no-ops.
>> > >> > > >
>> > >> > > > test:
>> > >> > > > pushq   %rbp
>> > >> > > > movq    %rsp, %rbp
>> > >> > > > movsd   %xmm0, -8(%rbp)
>> > >> > > > movsd   %xmm1, -16(%rbp)
>> > >> > > > .Lmca_code_region_start_0: # LLVM-MCA-START ID: 42
>> > >> > > > xorps   %xmm0, %xmm0
>> > >> > > > movsd   %xmm0, -24(%rbp)
>> > >> > > > movsd   -8(%rbp), %xmm0
>> > >> > > > mulsd   -16(%rbp), %xmm0
>> > >> > > > addsd   -24(%rbp), %xmm0
>> > >> > > > movsd   %xmm0, -24(%rbp)
>> > >> > > > .Lmca_code_region_end_0: # LLVM-MCA-END ID: 42
>> > >> > > > movsd   -24(%rbp), %xmm0
>> > >> > > > popq    %rbp
>> > >> > > > retq
>> > >> > > > .section        .mca_code_regions,"", at progbits
>> > >> > > > .quad   42
>> > >> > > > .quad   .Lmca_code_region_start_0
>> > >> > > > .quad   .Lmca_code_region_end_0-.Lmca_code_region_start_0
>> > >> > > >
>> > >> > > > The assembly has been trimmed to show the portions relevant to
>> this
>> > >> RFC.
>> > >> > > > Notice the labels enclose the user's defined region, and that
>> they
>> > >> > > > preserve the
>> > >> > > > user's arbitrary region identifier, the ever-so-important
>> region 42.
>> > >> > > >
>> > >> > > > In the object file section .mca_code_regions, we have noted the
>> > >> user's
>> > >> > > > region
>> > >> > > > identifier (.quad 42), start address, and region size. A more
>> > >> complicated
>> > >> > > > example can have multiple regions defined within a single
>> > >> .mca_code_regions
>> > >> > > > section. This section can be read by llvm-mca, allowing
>> llvm-mca to
>> > >> take
>> > >> > > > object files as input instead of assembly source.
>> > >> > > >
>> > >> > > > Details
>> > >> > > > ---------
>> > >> > > > We need a way for a user to identify a region/block of code
>> they
>> > >> want
>> > >> > > > analyzed
>> > >> > > > by llvm-mca. We solve this problem by introducing two
>> intrinsics
>> > >> that a
>> > >> > > > user can
>> > >> > > > specify, for identifying regions of code for analysis.
>> > >> > > >
>> > >> > > > The two intrinsics are: llvm.mca.code.regions.start and
>> > >> > > > llvm.mca.code.regions.end. A user can identify a code region by
>> > >> inserting
>> > >> > > > the
>> > >> > > > mca_code_region_start and mca_code_region_end markers. These
>> are
>> > >> simply
>> > >> > > > clang builtins and are transformed into the aforementioned
>> > >> intrinsics
>> > >> > > > during
>> > >> > > > compilation. The code between the intrinsics are what we call
>> "code
>> > >> > > > regions"
>> > >> > > > and are to be easily identifiable by llvm-mca; any code
>> between a
>> > >> start/end
>> > >> > > > pair can be analyzed by llvm-mca at a later time. A user can
>> define
>> > >> > > > multiple
>> > >> > > > non-overlapping code regions within their program.
>> > >> > > >
>> > >> > > > The llvm.mca.code.region.start intrinsic takes an integer
>> constant
>> > >> as its
>> > >> > > > only
>> > >> > > > argument. This argument is implemented as a metadata i32, and
>> is
>> > >> only used
>> > >> > > > when generating llvm-mca reports. This value allows a user to
>> more
>> > >> easily
>> > >> > > > identify a specific code region. llvm.mca.code.region.end
>> takes no
>> > >> > > > arguments.
>> > >> > > > Since we disallow nesting of regions, the first 'end' intrinsic
>> > >> lexically
>> > >> > > > following a 'start' intrinsic represents the end of that code
>> > >> region.
>> > >> > > >
>> > >> > > > Now that we have a solution for identifying regions for
>> analysis,
>> > >> we now
>> > >> > > > need a
>> > >> > > > way for preserving that information to be read at a later
>> time. To
>> > >> > > > accomplish
>> > >> > > > this we propose adding a new section (.mca_code_regions) to the
>> > >> object file
>> > >> > > > generated by llvm. During code generation, the start/end
>> intrinsics
>> > >> > > > described
>> > >> > > > above will be transformed into start/end labels in assembly.
>> When
>> > >> llvm
>> > >> > > > generates the object file from the user's code, these start/end
>> > >> labels
>> > >> > > > form a
>> > >> > > > pair of values identifying the start of the user's code
>> region, and
>> > >> size.
>> > >> > > > The
>> > >> > > > size represents the number of bytes between the start and end
>> > >> address of
>> > >> > > > the
>> > >> > > > labels. Note that the labels are emitted during assembly
>> printing.
>> > >> We hope
>> > >> > > > that these labels have no influence on code generation or
>> > >> basic-block
>> > >> > > > placement. However, the target assembler strategy for handling
>> > >> labels is
>> > >> > > > outside of our control.
>> > >> > > >
>> > >> > > > This proposed change affects the size of a binary, but only if
>> the
>> > >> user
>> > >> > > > calls
>> > >> > > > the start/end builtins mentioned above. The additional size of
>> the
>> > >> > > > .mca_code_regions section, which we imagine to be very small
>> (to
>> > >> the order
>> > >> > > > of a
>> > >> > > > few bytes), can trivially be stripped by tools like 'strip' or
>> > >> 'objcopy'.
>> > >> > > >
>> > >> > > > Implementation Status
>> > >> > > > ------------------------------
>> > >> > > > We currently have the proposed changes implemented at the url
>> > >> posted below.
>> > >> > > > This initial patch only targets ELF object files, and does not
>> > >> handle
>> > >> > > > relocatable addresses. Since the start of a code region is
>> > >> represented as
>> > >> > > > an
>> > >> > > > assembly label, and referenced in the .mca_code_regions
>> section,
>> > >> that
>> > >> > > > address
>> > >> > > > is relocatable. That value can be represented as
>> section-relative
>> > >> > > > relocatable
>> > >> > > > symbol (.text + addend), but we are not handling that case yet.
>> > >> Instead,
>> > >> > > > the
>> > >> > > > proposed changes only handle linked/executable object files.
>> > >> > > >
>> > >> > > > For purposes of review and to communicate the idea, the change
>> is
>> > >> > > > presented as a monolithic patch here:
>> > >> > > >
>> > >> > > > https://reviews.llvm.org/D54603
>> > >> > > >
>> > >> > > > The change is presented as a monolithic patch; however, if
>> accepted
>> > >> > > > the patch will be split into three smaller patches:
>> > >> > > > 1. The introduction of the builtins to clang.
>> > >> > > > 2. The llvm portion (the added intrinsics).
>> > >> > > > 3. The llvm-mca portion.
>> > >> > > >
>> > >> > > > Thanks!
>> > >> > > >
>> > >> > > > -Matt
>> > >> > > > _______________________________________________
>> > >> > > > LLVM Developers mailing list
>> > >> > > > llvm-dev at lists.llvm.org
>> > >> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> > >> > > >
>> > >> > _______________________________________________
>> > >> > LLVM Developers mailing list
>> > >> > llvm-dev at lists.llvm.org
>> > >> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> > >>
>> > >
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181210/a7947a16/attachment-0001.html>


More information about the cfe-dev mailing list