<div dir="ltr">Elena, you seem to have missed where I specifically said that these kinds of refactorings should not go in without precommit review, since it is not at all clear what direction this code is actually going in, and since you don't understand the testing infrastructure that was built up for vector shuffle lowering.<div><br></div><div>I'll try to be more explicit: do not commit without approval, and do not approve patches for commit regarding x86 vector shuffle lowering.</div><div><br></div><div>I know this is a bit harsh, but here is my problem: you have not addressed my comments or my concerns, you aren't following the latest best practices for testing these changes and don't seem to understand those testing practices, and you seem to be building the AVX-512 shuffle lowering in a way I disagree with design-wise and committing it without review. Other than insisting on precommit review, I don't know how to realistically correct this without building up more bad code in the x86 backend that will have to be replaced later. I'm CC-ing some other folks to try and make sure I'm not misunderstanding what is going on here, and to double-check that this is the correct response.</div><div><br></div><div><br></div><div>Assuming that folks generally agree, and because we don't seem to be making progress with post-commit review, I suggest the following course of action:</div><div><br></div><div>1) Revert the functional changes to the vector shuffle lowering code in X86ISelLowering.cpp until the design has been discussed.</div><div><br></div><div>2) Start a discussion on llvmdev for how to design the AVX-512 shuffle lowering based on the new general shuffle lowering infrastructure. This follows exactly the process that *I* used to start working on vector shuffle lowering.</div><div><br></div><div>3) Based on the discussion in #2, it should be clear what the correct initial patch will look like. 
Post that for precommit review, preferably (since I'm likely one of a couple of good reviewers) using Phabricator.</div><div><br></div><div><br></div><div>Some more comments inline on specifics:</div><div><br></div><div><div class="gmail_quote"><div dir="ltr">On Thu, Jun 4, 2015 at 1:03 AM Demikhovsky, Elena &lt;<a href="mailto:elena.demikhovsky@intel.com">elena.demikhovsky@intel.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Chandler,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I re-generated the previously deleted test file test/CodeGen/X86/vector-shuffle-512-v8.ll and put it back into the repository.</span></p></div></div></blockquote><div>I don't understand why you are continuing to use the other tests, though.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">As far as I understand, vector-shuffle-512-v8.ll does not check the correctness of the shuffles; it just checks that nothing fails.</span></p></div></div></blockquote><div>That is pretty clearly not the case. The test is checking for specific instruction sequences on each pattern.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">And it should be re-generated after each optimization, right?</span></p></div></div></blockquote><div>Yes, and most specifically the diff should show exactly what the change entailed.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div>
<div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Demikhovsky, Elena
<br>
<b>Sent:</b> Thursday, June 04, 2015 09:02</span> <span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""><br></span></p></div></div></div></div><div lang="EN-US" link="blue" vlink="purple"><p class="MsoNormal"><u></u></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hi Chandler,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">First of all, I did not add any functional changes to AVX2. So the AVX2 code state did not change at all.</span></p></div></blockquote><div>You made the code significantly more complex and harder to understand. Perhaps this complexity is worthwhile to share code with AVX-512, but that isn't clear to me because we don't actually have a design for how AVX-512 shuffle lowering should work yet.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">SHUFPD is common logic for AVX2 and AVX-512 and I put it in a separate function, like you did with SHUFPS.</span></p></div></blockquote><div>These are not necessarily equivalent. They may be, they may not be. SHUFPD has common logic between SSE2 and AVX2 as well, and yet the code was clearer without trying to overtly share it.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I worked on AVX-512 shuffles, 32 and 64-bit elements. Now the functions lowerV8X64VectorShuffle() and lowerV16X32VectorShuffle() are short and clear.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Each AVX-512 shuffle is now replaced with 1 instruction, instead of the 5-6 that were needed before. This is the main benefit of AVX-512: we don’t need more than one
instruction.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The new ISA provides many instructions that shuffle 512-bit vectors. In the worst case I put an instruction with variable permutations, which loads indices
from memory.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">In order to avoid the load, I’m trying to match PSHUFD, SHUFPS, VPERMIL, VALIGN patterns. I added a lit test for each pattern.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The work on V8X64 and V16X32 is almost completed. We also have 128-bit shuffles that will be implemented soon and may give some code improvements.</span></p></div></blockquote><div><br></div><div>I'm sorry that I didn't review each of these patches as they went in, but they didn't go for precommit review, and this is where I started diving into the code.</div><div><br></div><div>I think that the overall design first needs a discussion with the community, and some agreement that it is the correct design. Your patches, the amount of comments, the complete *deletion* of tests, and the failure to add new tests that follow the existing pattern for all other vector shuffle testing make it very hard for me to conclude that these patches are in any way "obvious" and reasonable to commit without review.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I’m going to look at your utility. (It is still not clear to me why we need it.)</span></p></div></blockquote><div>Note that there was code review for the utility and discussion in the community about it. If you disagree with the usage of it or the design of the vector shuffle tests, then raise that as a separate point of discussion.</div><div><br></div><div>What I find really unacceptable is simply implementing AVX-512 in a different manner, with a different approach to testing than the rest of the vector shuffle lowering, and with no discussion with the other significant contributors to the x86 backend. That is what has happened here.</div></div></div></div>