[PATCH] D24681: Optimize patterns of vectorized interleaved memory accesses for X86.

Fri Oct 7 17:10:37 PDT 2016

Farhana added inline comments.

================
Comment at: test/CodeGen/X86/x86-interleaved-access.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-pc-linux  -mattr=+avx < %s | FileCheck %s --check-prefix=AVX --check-prefix=AVX1
----------------
RKSimon wrote:
> Farhana wrote:
> > Farhana wrote:
> > > RKSimon wrote:
> > > > Farhana wrote:
> > > > > RKSimon wrote:
> > > > > > Is this actually true? The checks below don't look like what the script would generate.
> > > > > Hi Simon,
> > > > > I am not sure whether I understand your concern. Which checks are you talking about?
> > > > > Farhana
> > > > The 'AVX-NEXT'/'AVX1-NEXT'/'AVX2-NEXT' checks - the update script would generate quite a bit more than what is shown below.
> > > If I understand your comment correctly, you are saying the optimization will generate more instructions than the test checks for. Yes, it only checks for the must-have instructions, because the rest can be optimized away depending on the uses.
> > Hi Simon,
> > 
> > I think I understand your question now (Dave helped me).
> > 
> > You are right: the script update_llc_test_checks.py generates quite a few more checks than what I have here. Yes, the checks are not auto-generated by the script. I got rid of the NOTE.
> > 
> > But now I am wondering whether I should have used the script or not. I did not want to include all the checks because, in my opinion, that would be unnecessary in this case; checking for the first few instructions would be enough to ensure the behavior.
> > 
> > Let me know if you think it's good practice to always use the script...
> > 
> Generally yes, the script output is great as it's easy to regenerate, it means you're not hiding anything, and it's easier to grok the entire codegen.
> 
> There are plenty of cases where bulky codesize is just too off-putting and CHECKs should be more selective, but if the codesize could be reduced in the future I'd tend to include it, as it's very useful to show the delta.
> 
> But for these interleave cases I think it'd be useful, especially as we don't have any other reference examples of x86 interleave codegen at present. If it means you need to split the tests into multiple files (we often have 128 / 256 / 512 versions of test files), so be it.
Sounds good. 

Right now, I don't need to split the file; maybe in the future I will have to consider it.

https://reviews.llvm.org/D24681