[llvm-dev] Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

Fri Sep 15 12:16:44 PDT 2017

Hi JinGu,

The initial selection DAG looks reasonable to me. Are you seeing a "cannot
select" error related to the extending load, or does the generated assembly
fail to implement the semantics you expect?

Jon

On Fri, Sep 15, 2017 at 8:00 PM, via llvm-dev <llvm-dev at lists.llvm.org>
wrote:

> Today's Topics:
>
>    1. What should a truncating store do? (Jon Chesterfield via llvm-dev)
>    2. Re: Question about
>       'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
>       (jingu at codeplay.com via llvm-dev)
>    3. DIVA - Debug Information Visual Analyser (Phil Camp via llvm-dev)
>    4. Re: Changes to 'ADJCALLSTACK*' and 'callseq_*' between LLVM
>       v4.0 and v5.0 (Serge Pavlov via llvm-dev)
>    5. Re: RFC: Trace-based layout. (Kyle Butt via llvm-dev)
>    6. Re: Question about
>       'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
>       (Demikhovsky, Elena via llvm-dev)
>    7. Re: What should a truncating store do?
>       (Friedman, Eli via llvm-dev)
>    8. Re: What should a truncating store do?
>       (Jon Chesterfield via llvm-dev)
>    9. Re: What should a truncating store do?
>       (Friedman, Eli via llvm-dev)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 15 Sep 2017 13:49:48 +0100
> From: Jon Chesterfield via llvm-dev <llvm-dev at lists.llvm.org>
> To: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: [llvm-dev] What should a truncating store do?
> Message-ID:
>         <CAOUYtQCN4KYLtmwmVjnCajsSfVKwSETAPZ1zaoYK9w=v3c26Tg at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> For example, truncating store of an i32 to i6. My assumption was that this
> should write the low six bits of the i32 to somewhere in memory.
>
> Should the top 24 bits of a corresponding 32 bit region of memory be
> unchanged, zero,  undefined?
>
> Should the two bits that would round the i6 up to a byte be preserved,
> zero, undefined?
>
> I can't write six bits directly so am trying to determine what set of
> bitwise ops to apply between a load and subsequent store to emulate the
> truncating store.
>
> Thanks!
>
> Jon
>
> ------------------------------
>
> Message: 2
> Date: Fri, 15 Sep 2017 15:45:05 +0100
> From: "jingu at codeplay.com via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>,
>         elena.demikhovsky at intel.com, daniel_l_sanders at apple.com
> Subject: Re: [llvm-dev] Question about
>         'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
> Message-ID: <5fdb722e-2682-ee03-871b-0f00ed1b5909 at codeplay.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Could someone comment on this, please?
>
> Thanks,
>
> JinGu Kang
>
>
> On 14/09/17 12:05, jingu at codeplay.com wrote:
> > Hi All,
> >
> > I have a question about splitting 'EXTRACT_VECTOR_ELT' with 'v2i1'.
> > Consider the following LLVM IR snippet:
> >
> > for.body:                                         ; preds = %entry, %for.cond
> >   %i.022 = phi i32 [ 0, %entry ], [ %inc, %for.cond ]
> >   %0 = icmp ne <2 x i32> %vecinit1, <i32 0, i32 -23>
> >   %1 = extractelement <2 x i1> %0, i32 %i.022
> >   %vecext4 = extractelement <2 x i32> %vecinit1, i32 %i.022
> >   %vecext5 = extractelement <2 x i32> <i32 0, i32 -23>, i32 %i.022
> >   %cmp6 = icmp ne i32 %vecext4, %vecext5
> >   %cmp7 = xor i1 %1, %cmp6
> >
> > ...
> >
> > The SelectionDAG before type legalization looks like this:
> >
> >   t0: ch = EntryToken
> >   t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
> >   t3: ch = ValueType:i32
> >       t5: i32,ch = CopyFromReg t2:1, Register:i32 %vreg1
> >     t7: i32 = AssertZext t5, ValueType:ch:i1
> >   t8: v2i32 = BUILD_VECTOR t2, t7
> >   t11: v2i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<-23>
> >   t15: i32,ch = CopyFromReg t0, Register:i32 %vreg2
> >           t22: i32 = add t15, Constant:i32<1>
> >         t24: ch = CopyToReg t0, Register:i32 %vreg3, t22
> >         t27: ch = CopyToReg t0, Register:i32 %vreg8, Constant:i32<-1>
> >       t31: ch = TokenFactor t24, t27
> >             t13: v2i1 = setcc t8, t11, setne:ch
> >           t16: i1 = extract_vector_elt t13, t15
> >             t17: i32 = extract_vector_elt t8, t15
> >             t18: i32 = extract_vector_elt t11, t15
> >           t19: i1 = setcc t17, t18, setne:ch
> >         t20: i1 = xor t16, t19
> >
> > ...
> >
> > I have not added any vector register class, so 'DAGTypeLegalizer' tries
> > to split "t16: i1 = extract_vector_elt t13, t15" because t13's result
> > type is 'v2i1'. If the vector element is smaller than 8 bits,
> > 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT()' extends the elements
> > to 8 bits and stores them on the stack. Finally, the function generates
> > an 'ExtLoad' to load the requested element. For elements smaller than
> > 8 bits, I think this could be wrong: it looks like it needs just a
> > 'Load', or a "Load and Truncate", to match the result type of
> > 'EXTRACT_VECTOR_ELT'. What do you think? If I have missed something,
> > please let me know.
> >
> > Thanks,
> >
> > JinGu Kang
> >
>
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 15 Sep 2017 16:38:48 +0100
> From: Phil Camp via llvm-dev <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Subject: [llvm-dev] DIVA - Debug Information Visual Analyser
> Message-ID: <5b25cc76-bbd9-515c-b984-34a03dd1cd2a at flametop.co.uk>
> Content-Type: text/plain; charset="utf-8"; Format="flowed"
>
> DIVA, the Debug Information Visual Analyser, was presented at the 2017
> European LLVM Developers Meeting
> (https://www.youtube.com/watch?v=SwtpXaCk2bE).
>
> The DIVA binaries have been available since March, I am pleased to
> announce that the source code is now available on GitHub.
> https://github.com/SNSystems/DIVA
>
> DIVA is a command line tool that processes DWARF debug information
> contained within ELF files and prints the semantics of that debug
> information. The DIVA output is designed to be understandable by
> software programmers without any low-level compiler or DWARF knowledge;
> as such, it can be used to report debug information bugs to the compiler
> provider. DIVA's output can also be used as the input to DWARF tests, to
> compare the debug information generated from multiple compilers, from
> different versions of the same compiler, from different compiler
> switches and from the use of different DWARF specifications (i.e. DWARF
> 3, 4 and 5). DIVA will be used on the LLVM project to test and validate
> the output of clang to help improve the quality of the debug experience.
>
> Phil Camp
>
> SN Systems
>
>
> ------------------------------
>
> Message: 4
> Date: Fri, 15 Sep 2017 23:39:44 +0700
> From: Serge Pavlov via llvm-dev <llvm-dev at lists.llvm.org>
> To: "Martin J. O'Riordan" <MartinO at theheart.ie>
> Cc: LLVM Developers <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Changes to 'ADJCALLSTACK*' and 'callseq_*'
>         between LLVM v4.0 and v5.0
> Message-ID:
>         <CACOhrX4VSKtYBubv9q5kFd=btSWe5k6eEQSOYEo8c4uB2O27Rw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Martin,
>
> The CALLSEQ_START pseudo was changed in r302527; the commit message
> contains details on the changes. However, CALLSEQ_END was not modified.
> If you changed ADJCALLSTACKUP to add an additional argument, that may
> result in an error.
>
> Thanks,
> --Serge
>
> 2017-09-15 19:09 GMT+07:00 Martin J. O'Riordan via llvm-dev <
> llvm-dev at lists.llvm.org>:
>
> > Hi LLVM-Devs,
> >
> > I have managed to complete updating our sources from LLVM v4.0 to v5.0,
> > but I am getting selection errors for 'callseq_end'.  I am aware that
> > the 'ADJCALLSTACKUP' and 'ADJCALLSTACKDOWN' patterns have changed, and
> > have added an additional argument to the TD descriptions for these.
> >
> > There are interactions with 'ISD::CALL' and 'ISD::RET_FLAG', but so far
> > as I can tell I have revised these in the same way as the in-tree
> > targets have adjusted their sources.
> >
> > The error I am seeing is:
> >
> >   fatal error: error in backend: Cannot select: 0x15c9bbe00: ch,glue = callseq_end 0x15c9bbd98, TargetConstant:i32<0>, TargetGlobalAddress:i32<void (i8*, i32, i8*, i8*)* @__assert_func> 0, 0x15c9bbd98:1
> >     0x15c9bb920: i32 = TargetConstant<0>
> >     0x15c9bb8b8: i32 = TargetGlobalAddress<void (i8*, i32, i8*, i8*)* @__assert_func> 0
> >     0x15c9bbd98: ch,glue = MYISD::CALL 0x15c9bbcc8, TargetGlobalAddress:i32<void (i8*, i32, i8*, i8*)* @__assert_func> 0, Register:i32 %I18, Register:i32 %I17, Register:i32 %I16, Register:i32 %I15, RegisterMask:Untyped, 0x15c9bbcc8:1
> >       0x15c9bb8b8: i32 = TargetGlobalAddress<void (i8*, i32, i8*, i8*)* @__assert_func> 0
> >       0x15c9bb9f0: i32 = Register %I18
> >       0x15c9bbac0: i32 = Register %I17
> >       0x15c9bbb90: i32 = Register %I16
> >       0x15c9bbc60: i32 = Register %I15
> >       0x15c9bbd30: Untyped = RegisterMask
> >       0x15c9bbcc8: ch,glue = CopyToReg 0x15c9bbbf8, Register:i32 %I15, 0x15c9bb718, 0x15c9bbbf8:1
> >         0x15c9bbc60: i32 = Register %I15
> >         0x15c9bb718: i32,ch,glue = CopyFromReg 0x15c9bb648:1, Register:i32 %vreg2, 0x15c9bb648:1
> >           0x15c9bb6b0: i32 = Register %vreg2
> >         0x15c9bbbf8: ch,glue = CopyToReg 0x15c9bbb28, Register:i32 %I16, Constant:i32<0>, 0x15c9bbb28:1
> >           0x15c9bbb90: i32 = Register %I16
> >           0x15c9bb850: i32 = Constant<0>
> >           0x15c9bbb28: ch,glue = CopyToReg 0x15c9bba58, Register:i32 %I17, 0x15c9bb648, 0x15c9bba58:1
> >             0x15c9bbac0: i32 = Register %I17
> >             0x15c9bb648: i32,ch,glue = CopyFromReg 0x15c9bb578:1, Register:i32 %vreg1, 0x15c9bb578:1
> >               0x15c9bb5e0: i32 = Register %vreg1
> >             0x15c9bba58: ch,glue = CopyToReg 0x15c9bb988, Register:i32 %I18, 0x15c9bb578
> >               0x15c9bb9f0: i32 = Register %I18
> >               0x15c9bb578: i32,ch,glue = CopyFromReg 0x15c967b38, Register:i32 %vreg0
> >                 0x15c9bb510: i32 = Register %vreg0
> >
> > My TD for this has:
> >
> >   def SDT_MYCallSeqStart : SDCallSeqStart<[SDTCisVT<0, i32>, SDTCisVT<1, i32>]>;
> >   def SDT_MYCallSeqEnd   : SDCallSeqStart<[SDTCisVT<0, i32>, SDTCisVT<1, i32>]>;
> >   def MYCallseqStart     : SDNode<"ISD::CALLSEQ_START", SDT_MYCallSeqStart,
> >                                   [SDNPHasChain, SDNPOutGlue]>;
> >   def MYCallseqEnd       : SDNode<"ISD::CALLSEQ_END",   SDT_MYCallSeqEnd,
> >                                   [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue]>;
> >
> >   def SDT_MYCall         : SDTypeProfile<0, 1, [SDTCisVT<0, i32>]>;
> >   def SDT_MYRet          : SDTypeProfile<0, 0, []>;
> >   def MYcall             : SDNode<"MYISD::CALL",     SDT_MYCall,
> >                                   [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
> >                                    SDNPVariadic]>;
> >   def MYret              : SDNode<"MYISD::RET_FLAG", SDTNone,
> >                                   [SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
> >
> >   let hasCtrlDep = 1, hasSideEffects = 1 in {
> >     def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),
> >                                   [(MYCallseqStart timm:$amt1, timm:$amt2)]>;
> >     def ADJCALLSTACKUP   : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),
> >                                   [(MYCallseqEnd timm:$amt1, timm:$amt2)]>;
> >   }
> >
> >   def: Pat<(MYret), (JMP_Ret (i32 LR))>;
> >
> > The function that is failing does warn - "warning: function declared
> > 'noreturn' should not return [-Winvalid-noreturn]", and it does seem to
> > return.  In fact it invokes a custom builtin which does not actually
> > return. In the past I have just ignored this warning.
> >
> > Any hints that might help me to make the necessary adaptations to fix
> > this?
> >
> > Thanks in advance,
> >
> >         MartinO
> >
> > PS: I won't be able to reply until Monday as I will be away for the
> > weekend
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>
> ------------------------------
>
> Message: 5
> Date: Fri, 15 Sep 2017 10:00:11 -0700
> From: Kyle Butt via llvm-dev <llvm-dev at lists.llvm.org>
> To: Sean Silva <chisophugis at gmail.com>
> Cc: LLVM Developers <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] RFC: Trace-based layout.
> Message-ID:
>         <CABeP02Ar0toCzHnax2EdGyGu8Bukq6PGEeoTy0CmSi0Dg8yneQ at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> It is essentially block layout algorithm 2 here, with limited non-greedy
> lookahead. (The triangle detection)
> https://www.ece.cmu.edu/~ece447/s13/lib/exe/fetch.php?media=p16-pettis.pdf
>
> On Thu, Sep 14, 2017 at 7:24 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
> > Is this an existing published algorithm? Do you have a link to a paper?
> >
> > -- Sean Silva
> >
> > On Thu, Sep 14, 2017 at 6:53 PM, Kyle Butt via llvm-dev <
> > llvm-dev at lists.llvm.org> wrote:
> >
> >> I plan on rewriting the block placement algorithm to proceed by traces.
> >>
> >> A trace is a chain of blocks where each block in the chain may fall
> >> through to the successor in the chain.
> >>
> >> The overall algorithm would be to first produce traces for a function,
> >> and then order those traces to try to get cache locality.
> >>
> >> Currently block placement uses a greedy single-step approach to layout.
> >> It produces chains working from inner to outer loops. Unlike a trace, a
> >> chain may contain non-fallthrough edges. This causes problems with loop
> >> layout. The main problems with loop layout are loop rotation and cold
> >> blocks in a loop.
> >>
> >> Overview of proposed solution:
> >>
> >> Phase 1:
> >> Greedily produce a set of traces through the function. A trace is a list
> >> of blocks with each block in the list falling through (possibly
> >> conditionally) to the next block in the list. Loop rotation will occur
> >> naturally in this phase via the triangle replacement algorithm below.
> >> Handling single-trace loops requires a tweak; see the detailed design.
> >>
> >> Phase 2:
> >> After producing what we believe are the best traces, they need to be
> >> ordered. They will be ordered topologically, except that traces that are
> >> cold enough (as measured by their warmest block) will be floated later.
> >> This may push them out of a loop or to the end of the function.
> >>
> >> Detailed Design
> >>
> >> Note: whenever an edge is used as a number, I am referring to the edge
> >> frequency.
> >>
> >> Phase 1: Producing traces
> >> Traces are produced according to the following algorithm:
> >>  * Sort the edges according to weight, stable-sorting them according to
> >>    the incoming block and edge ordering.
> >>  * Place each block in a trace of length 1.
> >>  * For each edge in order:
> >>     * If the source is at the end of a trace, and the target is at the
> >>       beginning of a trace, glue those 2 traces into 1 longer trace.
> >>     * If an edge has a target or source in the middle of another trace,
> >>       consider tail duplication. The benefit calculation is the same as
> >>       the existing code.
> >>     * If an edge has a source or target in the middle, check them to see
> >>       if they can be replaced as a triangle. (Triangle replacement
> >>       described below)
> >>       * Compare the benefit of choosing the edge, along with any
> >>         triangles found, with the cost of breaking the existing edges.
> >>         * If it is a net benefit, perform the switch.
> >>  * Triangle checking:
> >>     Consider a trace in 2 parts: A1->A2, and the current edge under
> >>     consideration is A1->B (the case for C->A2 is the mirror, and both
> >>     may need to be done)
> >>     * First find the best alternative C->B
> >>     * Check for an alternative for A2: D->A2
> >>     * Find D's best alternative: D->E
> >>     * Compare the frequencies: A1->A2 + C->B + D->E vs A1->B + D->A2
> >>     * If the 2nd sum is bigger, do the switch.
> >>  * Loop Rotation Tweak:
> >>     If A contains a backedge A2->A1, then when considering A1->B or
> >>     C->A2, we can include that backedge in the gain:
> >>     A1->A2 + C->D + E->B vs A1->B + C->A2 + A2->A1
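
The gluing step of Phase 1 can be sketched in Python as follows. This is only a minimal sketch of the first bullet under "For each edge": tail duplication and triangle replacement are left out, "sort by weight" is assumed to mean heaviest edge first, and the function name `build_traces` and the sample CFG are hypothetical.

```python
def build_traces(blocks, edges):
    """Greedy gluing only. `edges` is a list of (source, target, frequency)."""
    # Every block starts out in its own trace of length 1.
    trace_of = {block: [block] for block in blocks}

    # sorted() is stable, so edges of equal weight keep their input order.
    # Heaviest-first is an assumption about what "sort by weight" means.
    for src, dst, _freq in sorted(edges, key=lambda edge: -edge[2]):
        src_trace, dst_trace = trace_of[src], trace_of[dst]
        if src_trace is dst_trace:
            continue  # both ends already in one trace (e.g. a backedge)
        # Glue only when src ends its trace and dst starts its trace.
        if src_trace[-1] == src and dst_trace[0] == dst:
            src_trace.extend(dst_trace)
            for block in dst_trace:
                trace_of[block] = src_trace

    # Collect each distinct trace once, in order of first appearance.
    seen, traces = set(), []
    for block in blocks:
        trace = trace_of[block]
        if id(trace) not in seen:
            seen.add(id(trace))
            traces.append(trace)
    return traces


# Hypothetical diamond CFG with a hot left arm and a cold right arm.
blocks = ["entry", "A", "B", "C", "D"]
edges = [
    ("entry", "A", 100),
    ("A", "B", 90),
    ("A", "C", 10),
    ("B", "D", 90),
    ("C", "D", 10),
]
print(build_traces(blocks, edges))
```

In this example the hot path ends up in one trace and the cold block C is left on its own, because both of its edges touch the middle of the hot trace; those are the cases the proposal hands to tail duplication or triangle replacement.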
> >>
> >> Phase 2: Order traces.
> >> First we compute the frequency of a trace by finding the max frequency
> >> of any of its blocks.
> >> Then we attempt to place the traces topologically. When a trace cannot
> >> be placed topologically, we prefer warmer traces first.
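
Phase 2 might look like the following Python sketch. The proposal only says that traces are placed topologically and that warmer traces win when that is not possible; picking the warmest among the ready traces, the tie-break on trace index, and all names used here are assumptions. The separate rule for floating sufficiently cold traces later is not modelled.

```python
def order_traces(traces, block_freq, edges):
    """Order traces topologically, falling back to warmest-first.

    `traces` is a list of block lists, `block_freq` maps block -> frequency,
    and `edges` is a list of (source, target, frequency).
    """
    # A trace is as warm as its warmest block.
    freq = [max(block_freq[block] for block in trace) for trace in traces]

    # Predecessor traces, derived from edges that cross between traces.
    owner = {block: i for i, trace in enumerate(traces) for block in trace}
    preds = [set() for _ in traces]
    for src, dst, _freq in edges:
        if owner[src] != owner[dst]:
            preds[owner[dst]].add(owner[src])

    placed, order = set(), []
    remaining = set(range(len(traces)))
    while remaining:
        # Ready traces have all of their predecessor traces placed already.
        ready = [i for i in remaining if preds[i] <= placed]
        # If nothing is ready (a cycle between traces), consider them all.
        candidates = ready or list(remaining)
        # Warmest first; ties go to the lower trace index (an assumption).
        best = max(candidates, key=lambda i: (freq[i], -i))
        order.append(best)
        placed.add(best)
        remaining.remove(best)
    return [traces[i] for i in order]


# Hypothetical input: a hot trace and a cold one that feed each other.
traces = [["entry", "A", "B", "D"], ["C"]]
block_freq = {"entry": 100, "A": 100, "B": 90, "C": 10, "D": 100}
edges = [("A", "C", 10), ("C", "D", 10)]
print(order_traces(traces, block_freq, edges))
```

Here neither trace can be placed topologically at first, since A->C and C->D make each a predecessor of the other, so the warmer trace goes first.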
> >>
> >> Questions and comments welcome.
> >>
> >
>
> ------------------------------
>
> Message: 6
> Date: Fri, 15 Sep 2017 17:42:23 +0000
> From: "Demikhovsky, Elena via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "jingu at codeplay.com" <jingu at codeplay.com>,
>         "daniel_l_sanders at apple.com" <daniel_l_sanders at apple.com>
> Cc: "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Question about
>         'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
> Message-ID:
>         <A0DC88CEB3010344830D52D66533DA8E5EE2F88D at hasmsx108.ger.corp.intel.com>
>
> Content-Type: text/plain; charset="utf-8"
>
> > extends the elements to 8bit and stores them on stack.
> The store is responsible for the zero-extension. This is the policy...
>
> -  Elena
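
To make the policy concrete, here is a small Python model (a sketch, not the legalizer's code) of extracting an element from a <2 x i1> through a stack slot: each i1 is stored zero-extended to a byte, the selected byte is loaded, and the result is truncated back to i1. The helper names and the sample input vector are hypothetical.

```python
def setcc_ne(lhs, rhs):
    """Element-wise 'icmp ne' producing a list of i1 values (0 or 1)."""
    return [int(a != b) for a, b in zip(lhs, rhs)]


def extract_i1_via_stack(vec_i1, index):
    """Model of extracting one i1 through a byte-per-element stack slot."""
    stack = bytearray(len(vec_i1))
    for i, bit in enumerate(vec_i1):
        stack[i] = bit & 1  # store: the i1 is zero-extended to i8
    loaded = stack[index]  # load the byte that holds the element
    return loaded & 1  # truncate back to the i1 result type


vecinit1 = [5, -23]  # made-up input vector
constant = [0, -23]  # the constant vector from the IR snippet
mask = setcc_ne(vecinit1, constant)
for i in range(2):
    vector_path = extract_i1_via_stack(mask, i)
    scalar_path = int(vecinit1[i] != constant[i])
    # Mirrors %cmp7 = xor i1 %1, %cmp6 from the IR snippet.
    print(i, vector_path ^ scalar_path)
```

Because the stored byte has zeros above bit 0, the loaded value already equals the i1, so the vector path and the scalar compare agree in this model and the xor is 0 for both indices.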
>
>
> ------------------------------
>
> Message: 7
> Date: Fri, 15 Sep 2017 10:55:14 -0700
> From: "Friedman, Eli via llvm-dev" <llvm-dev at lists.llvm.org>
> To: Jon Chesterfield <jonathanchesterfield at gmail.com>, llvm-dev
>         <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] What should a truncating store do?
> Message-ID: <a0b1d63b-d177-beff-899e-420e8f2c0798 at codeaurora.org>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> On 9/15/2017 5:49 AM, Jon Chesterfield via llvm-dev wrote:
> > For example, truncating store of an i32 to i6. My assumption was that
> > this should write the low six bits of the i32 to somewhere in memory.
> >
> > Should the top 24 bits of a corresponding 32 bit region of memory be
> > unchanged, zero,  undefined?
>
> Unchanged.
>
> > Should the two bits that would round the i6 up to a byte be preserved,
> > zero, undefined?
>
> Zero.  Legalization will normally handle this for you, though, by
> transforming it to an i8 store.
>
> -Eli
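
As a rough illustration of the two answers above (not LLVM code), the Python sketch below models a truncating store of an i32 to i6 as a load, mask, store over a 32-bit word. The function name `truncstore_i6_via_word`, the `bytearray` standing in for memory, and the little-endian layout are all assumptions made for the example.

```python
import struct


def truncstore_i6_via_word(memory: bytearray, addr: int, value: int) -> None:
    """Toy model: emulate 'truncating store i32 -> i6' with a 32-bit
    read-modify-write.

    Assumes a little-endian target, so the i6 lands in the byte at `addr`.
    """
    # Load the full 32-bit word that contains the destination byte.
    (word,) = struct.unpack_from("<I", memory, addr)
    # Clear the low byte only; the other 24 bits must stay unchanged.
    word &= 0xFFFFFF00
    # Insert the low six bits of the value; the two padding bits stay zero.
    word |= value & 0x3F
    # Write the whole word back.
    struct.pack_into("<I", memory, addr, word)


mem = bytearray(b"\xff\xff\xff\xff")
truncstore_i6_via_word(mem, 0, 0x12345678)
print(mem.hex())  # 38ffffff
```

A wide read-modify-write like this rewrites the neighbouring bytes with the values it loaded, so it matches a plain byte store only when nothing else writes those bytes concurrently.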
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project
>
>
>
> ------------------------------
>
> Message: 8
> Date: Fri, 15 Sep 2017 19:30:20 +0100
> From: Jon Chesterfield via llvm-dev <llvm-dev at lists.llvm.org>
> To: "Friedman, Eli" <efriedma at codeaurora.org>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] What should a truncating store do?
> Message-ID:
>         <CAOUYtQBoArROmMx1Ke0jFxpsQ2ztFqtNxgbLzWVvycs0Ls72eA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Interesting, thank you. I expected both answers to be "unchanged" so was
> surprised by the zero extend in the legaliser.
>
> The motivation here is that it's faster for us to load N bytes, apply
> whatever masks are necessary to reproduce the truncating store then store
> all N bytes. This is only a good plan if there's no change to the semantics
> :)
>
> Are scalar integer types zero extended to the next multiple of 8 or to the
> next power of 2 greater than 7? For example, i17 => i24 or i17 => i32?
>
> I think this means truncating stores of vector types will introduce zero
> bits at the end of each element instead of grouping all the zeros at the
> end. For example, <i6 63, i6 63> writes to sixteen bits as
> 0b0011111100111111, not as 0b0000111111111111?
>
>
> Thanks!
>
> Jon
>
>
>
>
> ------------------------------
>
> Message: 9
> Date: Fri, 15 Sep 2017 11:41:14 -0700
> From: "Friedman, Eli via llvm-dev" <llvm-dev at lists.llvm.org>
> To: Jon Chesterfield <jonathanchesterfield at gmail.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] What should a truncating store do?
> Message-ID: <8a9c81d9-9c89-9956-c269-d3057a71b451 at codeaurora.org>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> On 9/15/2017 11:30 AM, Jon Chesterfield wrote:
> > Interesting, thank you. I expected both answers to be "unchanged" so
> > was surprised by the zero extend in the legaliser.
> >
> > The motivation here is that it's faster for us to load N bytes, apply
> > whatever masks are necessary to reproduce the truncating store then
> > store all N bytes. This is only a good plan if there's no change to
> > the semantics :)
>
> See http://llvm.org/docs/LangRef.html#store-instruction .  In general,
> you have to be careful to avoid data races, but that might not apply to
> your target.
>
> > Are scalar integer types zero extended to the next multiple of 8 or to
> > the next power of 2 greater than 7? For example, i17 => i24 or i17 =>
> > i32?
>
> Multiple of 8.
>
> > I think this means truncating stores of vector types will introduce
> > zero bits at the end of each element instead of grouping all the zeros
> > at the end. For example, <i6 63, i6 63> writes to sixteen bits as
> > 0b0011111100111111, not as 0b0000111111111111?
>
> Vector types are tightly packed, so <8 x i1> is 1 byte, not 8 bytes.
>
> -Eli
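
The two sizing rules in this reply can be written down as a short Python sketch. The function names are hypothetical, and the packed bit pattern assumes element 0 sits in the lowest bits with the padding in the high bits, which the thread does not spell out.

```python
def scalar_store_bits(bits: int) -> int:
    """Scalar integers are extended to the next multiple of 8 bits."""
    return (bits + 7) // 8 * 8


def packed_vector_bytes(count: int, elt_bits: int) -> int:
    """Vectors are tightly packed: round the total bit count up to bytes."""
    return (count * elt_bits + 7) // 8


def pack_elements(elements, elt_bits: int) -> int:
    """Pack elements back to back, element 0 in the lowest bits (assumed)."""
    packed = 0
    for i, value in enumerate(elements):
        packed |= (value & ((1 << elt_bits) - 1)) << (i * elt_bits)
    return packed


print(scalar_store_bits(17))  # 24
print(packed_vector_bytes(8, 1))  # 1
print(format(pack_elements([63, 63], 6), "016b"))  # 0000111111111111
```

Under these assumptions the <i6 63, i6 63> example occupies twelve contiguous bits, with no zero bits between the two elements.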
>
>
>
>
>
> ------------------------------
>
> End of llvm-dev Digest, Vol 159, Issue 57
> *****************************************
>