[PATCH] D130618: [AArch64][LoopVectorize] Enable tail-folding of simple loops on neoverse-v1

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 2 08:54:14 PDT 2023


david-arm marked 5 inline comments as done.
david-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:3536
 
-  return TailFoldingOptionLoc.satisfies(ST->getSVETailFoldingDefaultOpts(),
-                                        Required);
+  if (!TailFoldingOptionLoc.satisfies(ST->getSVETailFoldingDefaultOpts(),
+                                      Required))
----------------
Hi @dmgreen, I've tried to address your concerns about using tail-folding for very tight loops here. It's a little crude, but I've introduced an instruction threshold, below which we don't tail-fold. For the godbolt example you gave we won't use tail-folding, but for the SPEC2017 benchmarks (and many others) we will.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:60-61
+  uint8_t DefaultBits = TFDisabled;
+  std::string OrigVal;
+  SmallVector<StringRef, 4> TailFoldTypes;
+
----------------
david-arm wrote:
> sdesmalen wrote:
> > sdesmalen wrote:
> > > nit: Could you rename this to UnparsedOptionString or something?
> > Rather than storing the string and substrings, can you just do the parsing as part of the constructor of TailFoldingOption so that you only need to store the bits?
> I seem to remember there was a problem with ordering where we had to ensure we got the same behaviour for each version of this command line:
> 
> `-mcpu=native -mllvm -sve-tail-folding=default`
> 
> and
> 
> `-mllvm -sve-tail-folding=default -mcpu=native`
> 
> which is tested in sve-tail-folding-option.ll. I think it meant that we could only deal with the options after the entire command line has been parsed, so we couldn't do it in the constructor. I'll double check though as I may have remembered incorrectly!
Hi @sdesmalen, so this isn't possible because (as mentioned above) you can't guarantee that the construction (in this case the `operator=`) of the option happens before setting the CPU with `-mcpu=neoverse-v1`. So you have to construct the bits only after you're guaranteed to have parsed *all* options on the command line. This is why we have to dynamically reconstruct the bits each time.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:90
+      else
+        llvm_unreachable("No! That's impossible!");
+    }
----------------
sdesmalen wrote:
> Instead of an llvm_unreachable, do you want to print the error message here?
So we now do report an error when parsing the option with the `operator=` flag below, which means that this else case should really be unreachable now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130618/new/

https://reviews.llvm.org/D130618



More information about the llvm-commits mailing list