<div dir="ltr">Hi JinGu,<div><br></div><div>The initial SelectionDAG looks reasonable to me. Are you seeing a "Cannot select" error related to the extending load, or does the generated assembly fail to implement the semantics you expect?</div><div><br></div><div><div class="gmail_extra">Jon</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Sep 15, 2017 at 8:00 PM, via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Today's Topics:<br>

<br>

   1. What should a truncating store do? (Jon Chesterfield via llvm-dev)<br>

   2. Re: Question about<br>

      'DAGTypeLegalizer::SplitVecOp_<wbr>EXTRACT_VECTOR_ELT'<br>

      (<a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a> via llvm-dev)<br>

   3. DIVA - Debug Information Visual Analyser (Phil Camp via llvm-dev)<br>

   4. Re: Changes to 'ADJCALLSTACK*' and 'callseq_*' between LLVM<br>

      v4.0 and v5.0 (Serge Pavlov via llvm-dev)<br>

   5. Re: RFC: Trace-based layout. (Kyle Butt via llvm-dev)<br>

   6. Re: Question about<br>

      'DAGTypeLegalizer::SplitVecOp_<wbr>EXTRACT_VECTOR_ELT'<br>

      (Demikhovsky, Elena via llvm-dev)<br>

   7. Re: What should a truncating store do?<br>

      (Friedman, Eli via llvm-dev)<br>

   8. Re: What should a truncating store do?<br>

      (Jon Chesterfield via llvm-dev)<br>

   9. Re: What should a truncating store do?<br>

      (Friedman, Eli via llvm-dev)<br>

<br>

<br>

------------------------------<wbr>------------------------------<wbr>----------<br>

<br>

Message: 1<br>

Date: Fri, 15 Sep 2017 13:49:48 +0100<br>

From: Jon Chesterfield via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

Subject: [llvm-dev] What should a truncating store do?<br>

Message-ID:<br>

        <<wbr>CAOUYtQCN4KYLtmwmVjnCajsSfVKwS<wbr>ETAPZ1zaoYK9w=<a href="mailto:v3c26Tg@mail.gmail.com">v3c26Tg@mail.<wbr>gmail.com</a>><br>

Content-Type: text/plain; charset="utf-8"<br>

<br>

For example, truncating store of an i32 to i6. My assumption was that this<br>

should write the low six bits of the i32 to somewhere in memory.<br>

<br>

Should the top 24 bits of a corresponding 32 bit region of memory be<br>

unchanged, zero,  undefined?<br>

<br>

Should the two bits that would round the i6 up to a byte be preserved,<br>

zero, undefined?<br>

<br>

I can't write six bits directly so am trying to determine what set of<br>

bitwise ops to apply between a load and subsequent store to emulate the<br>

truncating store.<br>

<br>

Thanks!<br>

<br>

Jon<br>


<br>

------------------------------<br>

<br>

Message: 2<br>

Date: Fri, 15 Sep 2017 15:45:05 +0100<br>

From: "<a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a> via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: "<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>,<br>

        <a href="mailto:elena.demikhovsky@intel.com">elena.demikhovsky@intel.com</a>, <a href="mailto:daniel_l_sanders@apple.com">daniel_l_sanders@apple.com</a><br>

Subject: Re: [llvm-dev] Question about<br>

        'DAGTypeLegalizer::SplitVecOp_<wbr>EXTRACT_VECTOR_ELT'<br>

Message-ID: <<a href="mailto:5fdb722e-2682-ee03-871b-0f00ed1b5909@codeplay.com">5fdb722e-2682-ee03-871b-<wbr>0f00ed1b5909@codeplay.com</a>><br>

Content-Type: text/plain; charset=utf-8; format=flowed<br>

<br>

Could someone comment on this, please?<br>

<br>

Thanks,<br>

<br>

JinGu Kang<br>

<br>

<br>

On 14/09/17 12:05, <a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a> wrote:<br>

> Hi All,<br>

><br>

> I have a question about splitting 'EXTRACT_VECTOR_ELT' with 'v2i1'. I<br>

> have the following LLVM IR code snippet:<br>

><br>

> llvm IR code snippet:<br>

><br>

> for.body:                                         ; preds = %entry,<br>

> %for.cond<br>

>   %i.022 = phi i32 [ 0, %entry ], [ %inc, %for.cond ]<br>

>   %0 = icmp ne <2 x i32> %vecinit1, <i32 0, i32 -23><br>

>   %1 = extractelement <2 x i1> %0, i32 %i.022<br>

>   %vecext4 = extractelement <2 x i32> %vecinit1, i32 %i.022<br>

>   %vecext5 = extractelement <2 x i32> <i32 0, i32 -23>, i32 %i.022<br>

>   %cmp6 = icmp ne i32 %vecext4, %vecext5<br>

>   %cmp7 = xor i1 %1, %cmp6<br>

><br>

> ...<br>

><br>

> and the SelectionDAG before TypeLegalizer is like this.<br>

><br>

>   t0: ch = EntryToken<br>

>   t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0<br>

>   t3: ch = ValueType:i32<br>

>       t5: i32,ch = CopyFromReg t2:1, Register:i32 %vreg1<br>

>     t7: i32 = AssertZext t5, ValueType:ch:i1<br>

>   t8: v2i32 = BUILD_VECTOR t2, t7<br>

>   t11: v2i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<-23><br>

>   t15: i32,ch = CopyFromReg t0, Register:i32 %vreg2<br>

>           t22: i32 = add t15, Constant:i32<1><br>

>         t24: ch = CopyToReg t0, Register:i32 %vreg3, t22<br>

>         t27: ch = CopyToReg t0, Register:i32 %vreg8, Constant:i32<-1><br>

>       t31: ch = TokenFactor t24, t27<br>

>             t13: v2i1 = setcc t8, t11, setne:ch<br>

>           t16: i1 = extract_vector_elt t13, t15<br>

>             t17: i32 = extract_vector_elt t8, t15<br>

>             t18: i32 = extract_vector_elt t11, t15<br>

>           t19: i1 = setcc t17, t18, setne:ch<br>

>         t20: i1 = xor t16, t19<br>

><br>

> ...<br>

><br>

> I have not added any vector register class, so 'DAGTypeLegalizer' tries<br>

> to split "t16: i1 = extract_vector_elt t13, t15" because t13's<br>

> result type is 'v2i1'. If the vector element is smaller than<br>

> 8 bits, the 'DAGTypeLegalizer::SplitVecOp_<wbr>EXTRACT_VECTOR_ELT()' function<br>

> extends the elements to 8 bits and stores them on the stack. Finally, the<br>

> function generates an 'ExtLoad' to load the requested element. But if the<br>

> element is smaller than 8 bits, I think this could be wrong. It looks<br>

> as though it needs just a 'Load' or a "Load and Truncate" to match the result type<br>

> of 'EXTRACT_VECTOR_ELT'. What do you think? If I have missed<br>

> something, please let me know.<br>

><br>

> Thanks,<br>

><br>

> JinGu Kang<br>

><br>

<br>

<br>

<br>

------------------------------<br>

<br>

Message: 3<br>

Date: Fri, 15 Sep 2017 16:38:48 +0100<br>

From: Phil Camp via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

Subject: [llvm-dev] DIVA - Debug Information Visual Analyser<br>

Message-ID: <<a href="mailto:5b25cc76-bbd9-515c-b984-34a03dd1cd2a@flametop.co.uk">5b25cc76-bbd9-515c-b984-<wbr>34a03dd1cd2a@flametop.co.uk</a>><br>

Content-Type: text/plain; charset="utf-8"; Format="flowed"<br>

<br>

DIVA, the Debug Information Visual Analyser, was presented at the 2017<br>

European LLVM Developers Meeting<br>

(<a href="https://www.youtube.com/watch?v=SwtpXaCk2bE" rel="noreferrer" target="_blank">https://www.youtube.com/<wbr>watch?v=SwtpXaCk2bE</a>).<br>

<br>

The DIVA binaries have been available since March; I am pleased to<br>

announce that the source code is now available on GitHub.<br>

<a href="https://github.com/SNSystems/DIVA" rel="noreferrer" target="_blank">https://github.com/SNSystems/<wbr>DIVA</a><br>

<br>

DIVA is a command line tool that processes DWARF debug information<br>

contained within ELF files and prints the semantics of that debug<br>

information. The DIVA output is designed to be understandable by<br>

software programmers without any low-level compiler or DWARF knowledge;<br>

as such, it can be used to report debug information bugs to the compiler<br>

provider. DIVA's output can also be used as the input to DWARF tests, to<br>

compare the debug information generated from multiple compilers, from<br>

different versions of the same compiler, from different compiler<br>

switches and from the use of different DWARF specifications (i.e. DWARF<br>

3, 4 and 5). DIVA will be used on the LLVM project to test and validate<br>

the output of clang to help improve the quality of the debug experience.<br>

<br>

Phil Camp<br>

<br>

SN Systems<br>

<br>


<br>

------------------------------<br>

<br>

Message: 4<br>

Date: Fri, 15 Sep 2017 23:39:44 +0700<br>

From: Serge Pavlov via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: "Martin J. O'Riordan" <<a href="mailto:MartinO@theheart.ie">MartinO@theheart.ie</a>><br>

Cc: LLVM Developers <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

Subject: Re: [llvm-dev] Changes to 'ADJCALLSTACK*' and 'callseq_*'<br>

        between LLVM v4.0 and v5.0<br>

Message-ID:<br>

        <CACOhrX4VSKtYBubv9q5kFd=<a href="mailto:btSWe5k6eEQSOYEo8c4uB2O27Rw@mail.gmail.com">btSWe<wbr>5k6eEQSOYEo8c4uB2O27Rw@mail.<wbr>gmail.com</a>><br>

Content-Type: text/plain; charset="utf-8"<br>

<br>

Hi Martin,<br>

<br>

The CALLSEQ_START pseudo was changed in r302527; the commit message contains<br>

details on the changes.<br>

However, CALLSEQ_END was not modified. If you changed<br>

ADJCALLSTACKUP to add an<br>

additional argument, that may result in the error.<br>

<br>

Thanks,<br>

--Serge<br>

<br>

2017-09-15 19:09 GMT+07:00 Martin J. O'Riordan via llvm-dev <<br>

<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>:<br>

<br>

> Hi LLVM-Devs,<br>

><br>

> I have managed to complete updating our sources from LLVM v4.0 to v5.0, but<br>

> I am getting selection errors for 'callseq_end'.  I am aware that the<br>

> 'ADJCALLSTACKUP' and 'ADJCALLSTACKDOWN' patterns have changed, and have<br>

> added an additional argument to the TD descriptions for these.<br>

><br>

> There are interactions with 'ISD::CALL' and 'ISD::RET_FLAG', but so far as<br>

> I<br>

> can tell I have revised these in the same way as the in-tree targets have<br>

> adjusted their sources.<br>

><br>

> The error I am seeing is:<br>

><br>

>   fatal error: error in backend: Cannot select: 0x15c9bbe00: ch,glue =<br>

> callseq_end 0x15c9bbd98, TargetConstant:i32<0>,<br>

> TargetGlobalAddress:i32<void<br>

> (i8*, i32, i8*, i8*)* @__assert_func> 0, 0x15c9bbd98:1<br>

>     0x15c9bb920: i32 = TargetConstant<0><br>

>     0x15c9bb8b8: i32 = TargetGlobalAddress<void (i8*, i32, i8*, i8*)*<br>

> @__assert_func> 0<br>

>     0x15c9bbd98: ch,glue = MYISD::CALL 0x15c9bbcc8,<br>

> TargetGlobalAddress:i32<void (i8*, i32, i8*, i8*)* @__assert_func> 0,<br>

> Register:i32 %I18, Register:i32 %I17, Register:i32 %I16, Register:i32 %I15,<br>

> RegisterMask:Untyped, 0x15c9bbcc8:1<br>

>       0x15c9bb8b8: i32 = TargetGlobalAddress<void (i8*, i32, i8*, i8*)*<br>

> @__assert_func> 0<br>

>       0x15c9bb9f0: i32 = Register %I18<br>

>       0x15c9bbac0: i32 = Register %I17<br>

>       0x15c9bbb90: i32 = Register %I16<br>

>       0x15c9bbc60: i32 = Register %I15<br>

>       0x15c9bbd30: Untyped = RegisterMask<br>

>       0x15c9bbcc8: ch,glue = CopyToReg 0x15c9bbbf8, Register:i32 %I15,<br>

> 0x15c9bb718, 0x15c9bbbf8:1<br>

>         0x15c9bbc60: i32 = Register %I15<br>

>         0x15c9bb718: i32,ch,glue = CopyFromReg 0x15c9bb648:1, Register:i32<br>

> %vreg2, 0x15c9bb648:1<br>

>           0x15c9bb6b0: i32 = Register %vreg2<br>

>         0x15c9bbbf8: ch,glue = CopyToReg 0x15c9bbb28, Register:i32 %I16,<br>

> Constant:i32<0>, 0x15c9bbb28:1<br>

>           0x15c9bbb90: i32 = Register %I16<br>

>           0x15c9bb850: i32 = Constant<0><br>

>           0x15c9bbb28: ch,glue = CopyToReg 0x15c9bba58, Register:i32 %I17,<br>

> 0x15c9bb648, 0x15c9bba58:1<br>

>             0x15c9bbac0: i32 = Register %I17<br>

>             0x15c9bb648: i32,ch,glue = CopyFromReg 0x15c9bb578:1,<br>

> Register:i32 %vreg1, 0x15c9bb578:1<br>

>               0x15c9bb5e0: i32 = Register %vreg1<br>

>             0x15c9bba58: ch,glue = CopyToReg 0x15c9bb988, Register:i32<br>

> %I18,<br>

> 0x15c9bb578<br>

>               0x15c9bb9f0: i32 = Register %I18<br>

>               0x15c9bb578: i32,ch,glue = CopyFromReg 0x15c967b38,<br>

> Register:i32 %vreg0<br>

>                 0x15c9bb510: i32 = Register %vreg0<br>

><br>

> My TD for this has:<br>

><br>

>   def SDT_MYCallSeqStart : SDCallSeqStart<[SDTCisVT<0, i32>, SDTCisVT<1,<br>

> i32>]>;<br>

>   def SDT_MYCallSeqEnd   : SDCallSeqStart<[SDTCisVT<0, i32>, SDTCisVT<1,<br>

> i32>]>;<br>

>   def MYCallseqStart     : SDNode<"ISD::CALLSEQ_START", SDT_MYCallSeqStart,<br>

>                                   [SDNPHasChain, SDNPOutGlue]>;<br>

>   def MYCallseqEnd       : SDNode<"ISD::CALLSEQ_END",   SDT_MYCallSeqEnd,<br>

>                                   [SDNPHasChain, SDNPOptInGlue,<br>

> SDNPOutGlue]>;<br>

><br>

>   def SDT_MYCall         : SDTypeProfile<0, 1, [SDTCisVT<0, i32>]>;<br>

>   def SDT_MYRet          : SDTypeProfile<0, 0, []>;<br>

>   def MYcall             : SDNode<"MYISD::CALL",     SDT_MYCall,<br>

>                                   [SDNPHasChain, SDNPOptInGlue,<br>

> SDNPOutGlue,<br>

> SDNPVariadic]>;<br>

>   def MYret              : SDNode<"MYISD::RET_FLAG", SDTNone,<br>

>                                   [SDNPHasChain, SDNPOptInGlue,<br>

> SDNPVariadic]>;<br>

><br>

>   let hasCtrlDep = 1, hasSideEffects = 1 in {<br>

>     def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),<br>

>                                   [(MYCallseqStart timm:$amt1,<br>

> timm:$amt2)]>;<br>

>     def ADJCALLSTACKUP   : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),<br>

>                                   [(MYCallseqEnd timm:$amt1, timm:$amt2)]>;<br>

>   }<br>

><br>

>   def: Pat<(MYret), (JMP_Ret (i32 LR))>;<br>

><br>

> The function that is failing does warn - "warning: function declared<br>

> 'noreturn' should not return [-Winvalid-noreturn]", and it does seem to<br>

> return.  In fact it invokes a custom builtin which does not actually<br>

> return.<br>

> In the past I have just ignored this warning.<br>

><br>

> Any hints that might help me to make the necessary adaptations to fix this?<br>

><br>

> Thanks in advance,<br>

><br>

>         MartinO<br>

><br>

> PS: I won't be able to reply until Monday as I will be away for the weekend<br>

><br>

><br>

> ______________________________<wbr>_________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

><br>


<br>

------------------------------<br>

<br>

Message: 5<br>

Date: Fri, 15 Sep 2017 10:00:11 -0700<br>

From: Kyle Butt via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: Sean Silva <<a href="mailto:chisophugis@gmail.com">chisophugis@gmail.com</a>><br>

Cc: LLVM Developers <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

Subject: Re: [llvm-dev] RFC: Trace-based layout.<br>

Message-ID:<br>

        <<a href="mailto:CABeP02Ar0toCzHnax2EdGyGu8Bukq6PGEeoTy0CmSi0Dg8yneQ@mail.gmail.com">CABeP02Ar0toCzHnax2EdGyGu8Buk<wbr>q6PGEeoTy0CmSi0Dg8yneQ@mail.<wbr>gmail.com</a>><br>

Content-Type: text/plain; charset="utf-8"<br>

<br>

It is essentially block layout algorithm 2 from the paper below, with limited non-greedy<br>

lookahead (the triangle detection).<br>

<a href="https://www.ece.cmu.edu/~ece447/s13/lib/exe/fetch.php?media=p16-pettis.pdf" rel="noreferrer" target="_blank">https://www.ece.cmu.edu/~<wbr>ece447/s13/lib/exe/fetch.php?<wbr>media=p16-pettis.pdf</a><br>

<br>

On Thu, Sep 14, 2017 at 7:24 PM, Sean Silva <<a href="mailto:chisophugis@gmail.com">chisophugis@gmail.com</a>> wrote:<br>

<br>

> Is this an existing published algorithm? Do you have a link to a paper?<br>

><br>

> -- Sean Silva<br>

><br>

> On Thu, Sep 14, 2017 at 6:53 PM, Kyle Butt via llvm-dev <<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

><br>

>> I plan on rewriting the block placement algorithm to proceed by traces.<br>

>><br>

>> A trace is a chain of blocks where each block in the chain may fall<br>

>> through to<br>

>> the successor in the chain.<br>

>><br>

>> The overall algorithm would be to first produce traces for a function,<br>

>> and then<br>

>> order those traces to try and get cache locality.<br>

>><br>

>> Currently block placement uses a greedy single step approach to layout. It<br>

>> produces chains working from inner to outer loops. Unlike a trace, a<br>

>> chain may<br>

>> contain non-fallthrough edges. This causes problems with loop layout. The<br>

>> main<br>

>> problems with loop layout are: loop rotation and cold blocks in a loop.<br>

>><br>

>> Overview of proposed solution:<br>

>><br>

>> Phase 1:<br>

>> Greedily produce a set of traces through the function. A trace is a list<br>

>> of<br>

>> blocks with each block in the list falling through (possibly<br>

>> conditionally) to<br>

>> the next block in the list. Loop rotation will occur naturally in this<br>

>> phase via<br>

>> the triangle replacement algorithm below. Handling single trace loops<br>

>> requires a<br>

>> tweak, see the detailed design.<br>

>><br>

>> Phase 2:<br>

>> After producing what we believe are the best traces, they need to be<br>

>> ordered.<br>

>> They will be ordered topologically, except that traces that are cold<br>

>> enough (as<br>

>> measured by their warmest block) will be floated later. This may push<br>

>> them out<br>

>> of a loop or to the end of the function.<br>

>><br>

>> Detailed Design<br>

>><br>

>> Note whenever an edge is used as a number, I am referring to the edge<br>

>> frequency.<br>

>><br>

>> Phase 1: Producing traces<br>

>> Traces are produced according to the following algorithm:<br>

>>  * Sort the edges according to weight, stable-sorting them according to the<br>

>> incoming<br>

>> block and edge ordering.<br>

>>  * Place each block in a trace of length 1.<br>

>>  * For each edge in order:<br>

>>     * If the source is at the end of a trace, and the target is at the<br>

>> beginning<br>

>>       of a trace, glue those 2 traces into 1 longer trace.<br>

>>     * If an edge has a target or source in the middle of another trace,<br>

>> consider<br>

>>       tail duplication. The benefit calculation is the same as the<br>

>> existing<br>

>>       code.<br>

>>     * If an edge has a source or target in the middle, check them to see<br>

>> if they<br>

>>       can be replaced as a triangle. (Triangle replacement described<br>

>> below)<br>

>>       * Compare the benefit of choosing the edge, along with any triangles<br>

>>         found, with the cost of breaking the existing edges.<br>

>>         * If it is a net benefit, perform the switch.<br>

>>  * Triangle checking:<br>

>>     Consider a trace in 2 parts: A1->A2, and the current edge under<br>

>> consideration<br>

>>     is A1->B (the case for C->A2 is mirror, and both may need to be done)<br>

>>     * First find the best alternative C->B<br>

>>     * Check for an alternative for A2: D->A2<br>

>>     * Find D's best Alternative: D->E<br>

>>     * Compare the frequencies: A1->A2 + C->B + D->E vs A1->B + D->A2<br>

>>     * If the 2nd sum is bigger, do the switch.<br>

>>   * Loop Rotation Tweak:<br>

>>     If A contains a backedge A2->A1, then when considering A1->B or<br>

>> C->A2, we<br>

>>     can include that backedge in the gain:<br>

>>     A1->A2 + C->D + E->B vs A1->B + C->A2 + A2->A1<br>

>><br>

>> Phase 2: Order traces.<br>

>> First we compute the frequency of a trace by finding the max frequency of<br>

>> any of<br>

>> its blocks.<br>

>> Then we attempt to place the traces topologically. When a trace cannot be<br>

>> placed<br>

>> topologically, we prefer warmer traces first.<br>

>><br>

>> Questions and comments welcome.<br>

>><br>

>> ______________________________<wbr>_________________<br>

>> LLVM Developers mailing list<br>

>> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

>><br>

>><br>

><br>
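<br>
The greedy gluing step of Phase 1 in the proposal quoted above can be sketched as follows. This is only a minimal model, assuming edges are given as (source, target, frequency) tuples; the name build_traces is hypothetical, and tail duplication, triangle replacement and the loop rotation tweak are deliberately left out.<br>
<pre>
```python
def build_traces(blocks, edges):
    """blocks: list of block names; edges: list of (source, target, frequency).
    Returns a list of traces, each a list of block names."""
    trace_of = {b: [b] for b in blocks}        # every block starts in a trace of length 1
    # Hottest edges first; sorted() is stable, so ties keep their input order.
    for src, dst, _freq in sorted(edges, key=lambda e: -e[2]):
        head, tail = trace_of[src], trace_of[dst]
        if head is tail:
            continue                           # both ends already share a trace
        if head[-1] == src and tail[0] == dst: # source ends a trace, target begins one
            head.extend(tail)                  # glue the two traces into one
            for b in tail:
                trace_of[b] = head
    # Collect each distinct trace once, in order of first block appearance.
    seen, traces = set(), []
    for b in blocks:
        t = trace_of[b]
        if id(t) not in seen:
            seen.add(id(t))
            traces.append(t)
    return traces
```
</pre>
For a diamond CFG with a hot path through one side, the hot blocks end up in one trace and the cold block is left in a trace of its own, which Phase 2 would then be free to place later.<br>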


<br>

------------------------------<br>

<br>

Message: 6<br>

Date: Fri, 15 Sep 2017 17:42:23 +0000<br>

From: "Demikhovsky, Elena via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: "<a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a>" <<a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a>>,<br>

        "<a href="mailto:daniel_l_sanders@apple.com">daniel_l_sanders@apple.com</a>" <<a href="mailto:daniel_l_sanders@apple.com">daniel_l_sanders@apple.com</a>><br>

Cc: "<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

Subject: Re: [llvm-dev] Question about<br>

        'DAGTypeLegalizer::SplitVecOp_<wbr>EXTRACT_VECTOR_ELT'<br>

Message-ID:<br>

        <<a href="mailto:A0DC88CEB3010344830D52D66533DA8E5EE2F88D@hasmsx108.ger.corp.intel.com">A0DC88CEB3010344830D52D66533D<wbr>A8E5EE2F88D@hasmsx108.ger.<wbr>corp.intel.com</a>><br>

<br>

Content-Type: text/plain; charset="utf-8"<br>

<br>

> extends the elements to 8 bits and stores them on the stack.<br>

The store is responsible for the zero-extension. This is the policy...<br>
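<br>
To make the policy concrete, here is a minimal model of the sequence described in the question: each sub-byte element is stored in its own byte with the padding bits zeroed, so the later extending load of a single byte already yields the element value. This is a sketch of the behaviour as described in this thread, not LLVM code; the function names are hypothetical.<br>
<pre>
```python
def spill_vector_as_bytes(elements, elt_bits):
    """Store each sub-byte element in its own byte, zero-extended to 8 bits."""
    assert 0 < elt_bits <= 8
    mask = (1 << elt_bits) - 1
    return bytes(elt & mask for elt in elements)  # the store clears the padding bits

def extract_element(stack_slot, index):
    """Extending load of one byte; the high bits are already zero from the store."""
    return stack_slot[index]
```
</pre>
With a v2i1 value such as [1, 0], the stack slot holds the bytes 0x01 and 0x00, and extracting index 0 or 1 returns 1 or 0 without any further masking.<br>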

<br>

-  Elena<br>

<br>

<br>

-----Original Message-----<br>

From: <a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a> [mailto:<a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a>]<br>

Sent: Friday, September 15, 2017 17:45<br>

To: <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>; Demikhovsky, Elena <<a href="mailto:elena.demikhovsky@intel.com">elena.demikhovsky@intel.com</a>>; <a href="mailto:daniel_l_sanders@apple.com">daniel_l_sanders@apple.com</a><br>

Subject: Re: Question about 'DAGTypeLegalizer::SplitVecOp_<wbr>EXTRACT_VECTOR_ELT'<br>

<br>

Could someone comment on this, please?<br>

<br>

Thanks,<br>

<br>

JinGu Kang<br>

<br>

<br>

On 14/09/17 12:05, <a href="mailto:jingu@codeplay.com">jingu@codeplay.com</a> wrote:<br>

> Hi All,<br>

><br>

> I have a question about splitting 'EXTRACT_VECTOR_ELT' with 'v2i1'. I<br>

> have the following LLVM IR code snippet:<br>

><br>

> llvm IR code snippet:<br>

><br>

> for.body:                                         ; preds = %entry,<br>

> %for.cond<br>

>   %i.022 = phi i32 [ 0, %entry ], [ %inc, %for.cond ]<br>

>   %0 = icmp ne <2 x i32> %vecinit1, <i32 0, i32 -23><br>

>   %1 = extractelement <2 x i1> %0, i32 %i.022<br>

>   %vecext4 = extractelement <2 x i32> %vecinit1, i32 %i.022<br>

>   %vecext5 = extractelement <2 x i32> <i32 0, i32 -23>, i32 %i.022<br>

>   %cmp6 = icmp ne i32 %vecext4, %vecext5<br>

>   %cmp7 = xor i1 %1, %cmp6<br>

><br>

> ...<br>

><br>

> and the SelectionDAG before TypeLegalizer is like this.<br>

><br>

>   t0: ch = EntryToken<br>

>   t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0<br>

>   t3: ch = ValueType:i32<br>

>       t5: i32,ch = CopyFromReg t2:1, Register:i32 %vreg1<br>

>     t7: i32 = AssertZext t5, ValueType:ch:i1<br>

>   t8: v2i32 = BUILD_VECTOR t2, t7<br>

>   t11: v2i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<-23><br>

>   t15: i32,ch = CopyFromReg t0, Register:i32 %vreg2<br>

>           t22: i32 = add t15, Constant:i32<1><br>

>         t24: ch = CopyToReg t0, Register:i32 %vreg3, t22<br>

>         t27: ch = CopyToReg t0, Register:i32 %vreg8, Constant:i32<-1><br>

>       t31: ch = TokenFactor t24, t27<br>

>             t13: v2i1 = setcc t8, t11, setne:ch<br>

>           t16: i1 = extract_vector_elt t13, t15<br>

>             t17: i32 = extract_vector_elt t8, t15<br>

>             t18: i32 = extract_vector_elt t11, t15<br>

>           t19: i1 = setcc t17, t18, setne:ch<br>

>         t20: i1 = xor t16, t19<br>

><br>

> ...<br>

><br>

> I have not added any vector register class, so 'DAGTypeLegalizer' tries<br>

> to split "t16: i1 = extract_vector_elt t13, t15" because t13's<br>

> result type is 'v2i1'. If the vector element is smaller than<br>

> 8 bits, the 'DAGTypeLegalizer::SplitVecOp_<wbr>EXTRACT_VECTOR_ELT()' function<br>

> extends the elements to 8 bits and stores them on the stack. Finally, the<br>

> function generates an 'ExtLoad' to load the requested element. But if the<br>

> element is smaller than 8 bits, I think this could be wrong. It looks<br>

> as though it needs just a 'Load' or a "Load and Truncate" to match the result type<br>

> of 'EXTRACT_VECTOR_ELT'. What do you think? If I have missed<br>

> something, please let me know.<br>

><br>

> Thanks,<br>

><br>

> JinGu Kang<br>

><br>

<br>

------------------------------<wbr>------------------------------<wbr>---------<br>

Intel Israel (74) Limited<br>

<br>

This e-mail and any attachments may contain confidential material for<br>

the sole use of the intended recipient(s). Any review or distribution<br>

by others is strictly prohibited. If you are not the intended<br>

recipient, please contact the sender and delete all copies.<br>

<br>

------------------------------<br>

<br>

Message: 7<br>

Date: Fri, 15 Sep 2017 10:55:14 -0700<br>

From: "Friedman, Eli via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: Jon Chesterfield <<a href="mailto:jonathanchesterfield@gmail.com">jonathanchesterfield@gmail.<wbr>com</a>>, llvm-dev<br>

        <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

Subject: Re: [llvm-dev] What should a truncating store do?<br>

Message-ID: <<a href="mailto:a0b1d63b-d177-beff-899e-420e8f2c0798@codeaurora.org">a0b1d63b-d177-beff-899e-<wbr>420e8f2c0798@codeaurora.org</a>><br>

Content-Type: text/plain; charset=utf-8; format=flowed<br>

<br>

On 9/15/2017 5:49 AM, Jon Chesterfield via llvm-dev wrote:<br>

> For example, truncating store of an i32 to i6. My assumption was that<br>

> this should write the low six bits of the i32 to somewhere in memory.<br>

><br>

> Should the top 24 bits of a corresponding 32 bit region of memory be<br>

> unchanged, zero,  undefined?<br>

<br>

Unchanged.<br>

<br>

> Should the two bits that would round the i6 up to a byte be preserved,<br>

> zero, undefined?<br>

<br>

Zero.  Legalization will normally handle this for you, though, by<br>

transforming it to an i8 store.<br>
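<br>
As an illustration of the two answers above, the load, mask, store emulation of a truncating store to a sub-byte type can be modelled like this. It is a minimal sketch, assuming the enclosing region is a 32-bit word and that byte_index counts bytes from the least significant end; the function name is hypothetical.<br>
<pre>
```python
def truncstore_via_rmw(word32, value, bits, byte_index=0):
    """Model a truncating store of `value` to an iN type with N = `bits` (at most 8),
    emulated as a read-modify-write of a 32-bit word."""
    assert 0 < bits <= 8 and 0 <= byte_index < 4
    low_bits = value & ((1 << bits) - 1)   # low N bits; padding up to the byte is zero
    shift = 8 * byte_index                 # position of the stored byte (assumed layout)
    cleared = word32 & ~(0xFF << shift) & 0xFFFFFFFF  # clear only the byte being stored
    return cleared | (low_bits << shift)   # the other three bytes are unchanged
```
</pre>
Storing an i6 into a word of all ones gives 0xFFFFFF3F: the top 24 bits are untouched and the two padding bits of the stored byte are zero.<br>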

<br>

-Eli<br>

<br>

--<br>

Employee of Qualcomm Innovation Center, Inc.<br>

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project<br>

<br>

<br>

<br>

------------------------------<br>

<br>

Message: 8<br>

Date: Fri, 15 Sep 2017 19:30:20 +0100<br>

From: Jon Chesterfield via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: "Friedman, Eli" <<a href="mailto:efriedma@codeaurora.org">efriedma@codeaurora.org</a>><br>

Cc: llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

Subject: Re: [llvm-dev] What should a truncating store do?<br>

Message-ID:<br>

        <<a href="mailto:CAOUYtQBoArROmMx1Ke0jFxpsQ2ztFqtNxgbLzWVvycs0Ls72eA@mail.gmail.com">CAOUYtQBoArROmMx1Ke0jFxpsQ2zt<wbr>FqtNxgbLzWVvycs0Ls72eA@mail.<wbr>gmail.com</a>><br>

Content-Type: text/plain; charset="utf-8"<br>

<br>

Interesting, thank you. I expected both answers to be "unchanged" so was<br>

surprised by the zero extend in the legaliser.<br>

<br>

The motivation here is that it's faster for us to load N bytes, apply<br>

whatever masks are necessary to reproduce the truncating store then store<br>

all N bytes. This is only a good plan if there's no change to the semantics<br>

:)<br>

<br>

Are scalar integer types zero extended to the next multiple of 8 or to the<br>

next power of 2 greater than 7? For example, i17 => i24 or i17 => i32?<br>

<br>

I think this means truncating stores of vector types will introduce zero<br>

bits at the end of each element instead of grouping all the zeros at the end.<br>

For example, <i6 63, i6 63> writes to sixteen bits as 0b0011111100111111,<br>

not as 0b0000111111111111?<br>

<br>

<br>

Thanks!<br>

<br>

Jon<br>

<br>

<br>

<br>

On Fri, Sep 15, 2017 at 6:55 PM, Friedman, Eli <<a href="mailto:efriedma@codeaurora.org">efriedma@codeaurora.org</a>><br>

wrote:<br>

<br>

> On 9/15/2017 5:49 AM, Jon Chesterfield via llvm-dev wrote:<br>

><br>

>> For example, truncating store of an i32 to i6. My assumption was that<br>

>> this should write the low six bits of the i32 to somewhere in memory.<br>

>><br>

>> Should the top 24 bits of a corresponding 32 bit region of memory be<br>

>> unchanged, zero,  undefined?<br>

>><br>

><br>

> Unchanged.<br>

><br>

> Should the two bits that would round the i6 up to a byte be preserved,<br>

>> zero, undefined?<br>

>><br>

><br>

> Zero.  Legalization will normally handle this for you, though, by<br>

> transforming it to an i8 store.<br>

><br>

> -Eli<br>

><br>

> --<br>

> Employee of Qualcomm Innovation Center, Inc.<br>

> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux<br>

> Foundation Collaborative Project<br>

><br>

><br>


<br>

------------------------------<br>

<br>

Message: 9<br>

Date: Fri, 15 Sep 2017 11:41:14 -0700<br>

From: "Friedman, Eli via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

To: Jon Chesterfield <<a href="mailto:jonathanchesterfield@gmail.com">jonathanchesterfield@gmail.<wbr>com</a>><br>

Cc: llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

Subject: Re: [llvm-dev] What should a truncating store do?<br>

Message-ID: <<a href="mailto:8a9c81d9-9c89-9956-c269-d3057a71b451@codeaurora.org">8a9c81d9-9c89-9956-c269-<wbr>d3057a71b451@codeaurora.org</a>><br>

Content-Type: text/plain; charset=utf-8; format=flowed<br>

<br>

On 9/15/2017 11:30 AM, Jon Chesterfield wrote:<br>

> Interesting, thank you. I expected both answers to be "unchanged" so<br>

> was surprised by the zero extend in the legaliser.<br>

><br>

> The motivation here is that it's faster for us to load N bytes, apply<br>

> whatever masks are necessary to reproduce the truncating store then<br>

> store all N bytes. This is only a good plan if there's no change to<br>

> the semantics :)<br>

<br>

See <a href="http://llvm.org/docs/LangRef.html#store-instruction" rel="noreferrer" target="_blank">http://llvm.org/docs/LangRef.<wbr>html#store-instruction</a> .  In general,<br>

you have to be careful to avoid data races, but that might not apply to<br>

your target.<br>

<br>

> Are scalar integer types zero extended to the next multiple of 8 or to<br>

> the next power of 2 greater than 7? For example, i17 => i24 or i17 => i32?<br>

<br>

Multiple of 8.<br>
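<br>
In other words, the width written for a scalar is the bit width rounded up to whole bytes, not to a power of two. A one-line sketch (the function name is hypothetical):<br>
<pre>
```python
def store_size_in_bits(bits):
    """Width written for an iN scalar: N rounded up to the next multiple of 8."""
    return 8 * ((bits + 7) // 8)
```
</pre>
So i17 occupies 24 bits, i6 occupies 8 bits, and i32 stays at 32 bits.<br>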

<br>

> I think this means truncating stores of vector types will introduce<br>

> zero bits at the end of each element instead of grouping all the zeros at<br>

> the end. For example, <i6 63, i6 63> writes to sixteen bits as<br>

> 0b0011111100111111, not as 0b0000111111111111?<br>

<br>

Vector types are tightly packed, so <8 x i1> is 1 byte, not 8 bytes.<br>
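<br>
A small model of tight packing, for illustration only. It assumes element 0 lands in the lowest bits, which is an assumption of this sketch rather than something stated in the thread; the function name is hypothetical.<br>
<pre>
```python
def pack_vector(elements, elt_bits):
    """Pack elements back to back with no per-element padding.
    Returns (packed integer, size in bytes)."""
    packed = 0
    mask = (1 << elt_bits) - 1
    for i, elt in enumerate(elements):
        packed |= (elt & mask) << (i * elt_bits)  # element 0 in the lowest bits (assumed)
    total_bits = elt_bits * len(elements)
    return packed, (total_bits + 7) // 8          # padding only after the last element
```
</pre>
Two i6 elements of 63 pack to 0b0000111111111111 in two bytes, and eight i1 elements fit in a single byte.<br>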

<br>

-Eli<br>

<br>

--<br>

Employee of Qualcomm Innovation Center, Inc.<br>

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project<br>

<br>

<br>

<br>

------------------------------<br>

<br>


<br>

<br>

------------------------------<br>

<br>

End of llvm-dev Digest, Vol 159, Issue 57<br>

******************************<wbr>***********<br>

</blockquote></div><br></div></div></div>