<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Oct 3, 2016, at 2:17 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" class="">hfinkel@anl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">----- Original Message -----</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">From: "Volkan Keles via llvm-commits" <<a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a>><br class="">To: <a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a><br class="">Sent: Monday, October 3, 2016 5:31:35 AM<br class="">Subject: [llvm] r283099 - Add new target hooks for LoadStoreVectorizer<br class=""><br class="">Author: volkan<br class="">Date: Mon Oct  3 05:31:34 2016<br class="">New Revision: 283099<br class=""><br 
class="">URL: <a href="http://llvm.org/viewvc/llvm-project?rev=283099&view=rev" class="">http://llvm.org/viewvc/llvm-project?rev=283099&view=rev</a><br class="">Log:<br class="">Add new target hooks for LoadStoreVectorizer<br class=""><br class="">Summary: Added 6 new target hooks for the vectorizer in order to<br class="">filter types, handle size constraints and decide how to split<br class="">chains.<br class=""><br class="">Reviewers: tstellarAMD, arsenm<br class=""><br class="">Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle<br class=""><br class="">Differential Revision: <a href="https://reviews.llvm.org/D24727" class="">https://reviews.llvm.org/D24727</a><br class=""><br class="">Modified:<br class="">   llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h<br class="">   llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h<br class="">   llvm/trunk/lib/Analysis/TargetTransformInfo.cpp<br class="">   llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp<br class="">   llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h<br class="">   llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp<br class=""><br class="">Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h<br class="">URL:<br class=""><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=283099&r1=283098&r2=283099&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=283099&r1=283098&r2=283099&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)<br class="">+++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Mon Oct  3<br class="">05:31:34 2016<br class="">@@ -466,10 +466,6 @@ public:<br class="">  /// \return The width of the largest scalar or vector register<br class="">  type.<br class="">  unsigned 
getRegisterBitWidth(bool Vector) const;<br class=""><br class="">-  /// \return The bitwidth of the largest vector type that should be<br class="">used to<br class="">-  /// load/store in the given address space.<br class="">-  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) const;<br class="">-<br class="">  /// \return The size of a cache line in bytes.<br class="">  unsigned getCacheLineSize() const;<br class=""><br class="">@@ -620,6 +616,38 @@ public:<br class="">  bool areInlineCompatible(const Function *Caller,<br class="">                           const Function *Callee) const;<br class=""><br class="">+  /// \returns The bitwidth of the largest vector type that should<br class="">be used to<br class="">+  /// load/store in the given address space.<br class="">+  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) const;<br class="">+<br class="">+  /// \returns True if the load instruction is legal to vectorize.<br class="">+  bool isLegalToVectorizeLoad(LoadInst *LI) const;<br class="">+<br class="">+  /// \returns True if the store instruction is legal to vectorize.<br class="">+  bool isLegalToVectorizeStore(StoreInst *SI) const;<br class=""></blockquote><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Should all of the vectorizers be updated to use these?</span><br style="font-family: Helvetica; font-size: 12px; font-style: 
normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div><br class=""></div><div>They can use these hooks, but they work on other instructions as well. I think one function for all of them might be better.</div><br class=""><blockquote type="cite" class=""><div class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">+<br class="">+  /// \returns True if it is legal to vectorize the given load<br class="">chain.<br class="">+  bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes,<br class="">+                                   unsigned Alignment,<br class="">+                                   unsigned AddrSpace) const;<br class="">+<br class="">+  /// \returns True if it is legal to vectorize the given store<br class="">chain.<br class="">+  bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes,<br class="">+                                    unsigned Alignment,<br class="">+                                    unsigned AddrSpace) const;<br class=""></blockquote><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: 
start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">For these, please enhance the comments to explain what a "load chain" and "store chain" are.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div><br class=""></div><div>I will.</div><div><br class=""></div><div>Thanks,</div><div>Volkan</div><br class=""><blockquote type="cite" class=""><div class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Thanks again,</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: 
none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Hal</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">+<br class="">+  /// \returns The new vector factor value if the target doesn't<br class="">support \p<br class="">+  /// SizeInBytes loads or has a better vector factor.<br class="">+  unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,<br class="">+                               unsigned ChainSizeInBytes,<br class="">+                               VectorType *VecTy) const;<br class="">+<br class="">+  /// \returns The new vector factor value if the target doesn't<br class="">support \p<br class="">+  /// SizeInBytes 
stores or has a better vector factor.<br class="">+  unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,<br class="">+                                unsigned ChainSizeInBytes,<br class="">+                                VectorType *VecTy) const;<br class="">+<br class="">  /// @}<br class=""><br class="">private:<br class="">@@ -695,7 +723,6 @@ public:<br class="">                            Type *Ty) = 0;<br class="">  virtual unsigned getNumberOfRegisters(bool Vector) = 0;<br class="">  virtual unsigned getRegisterBitWidth(bool Vector) = 0;<br class="">-  virtual unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) =<br class="">0;<br class="">  virtual unsigned getCacheLineSize() = 0;<br class="">  virtual unsigned getPrefetchDistance() = 0;<br class="">  virtual unsigned getMinPrefetchStride() = 0;<br class="">@@ -748,6 +775,21 @@ public:<br class="">                                                   Type<br class="">                                                   *ExpectedType) =<br class="">                                                   0;<br class="">  virtual bool areInlineCompatible(const Function *Caller,<br class="">                                   const Function *Callee) const =<br class="">                                   0;<br class="">+  virtual unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace)<br class="">const = 0;<br class="">+  virtual bool isLegalToVectorizeLoad(LoadInst *LI) const = 0;<br class="">+  virtual bool isLegalToVectorizeStore(StoreInst *SI) const = 0;<br class="">+  virtual bool isLegalToVectorizeLoadChain(unsigned<br class="">ChainSizeInBytes,<br class="">+                                           unsigned Alignment,<br class="">+                                           unsigned AddrSpace) const<br class="">= 0;<br class="">+  virtual bool isLegalToVectorizeStoreChain(unsigned<br class="">ChainSizeInBytes,<br class="">+                                            unsigned Alignment,<br class="">+   
                                         unsigned AddrSpace)<br class="">const = 0;<br class="">+  virtual unsigned getLoadVectorFactor(unsigned VF, unsigned<br class="">LoadSize,<br class="">+                                       unsigned ChainSizeInBytes,<br class="">+                                       VectorType *VecTy) const = 0;<br class="">+  virtual unsigned getStoreVectorFactor(unsigned VF, unsigned<br class="">StoreSize,<br class="">+                                        unsigned ChainSizeInBytes,<br class="">+                                        VectorType *VecTy) const =<br class="">0;<br class="">};<br class=""><br class="">template <typename T><br class="">@@ -890,10 +932,6 @@ public:<br class="">    return Impl.getRegisterBitWidth(Vector);<br class="">  }<br class=""><br class="">-  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) override {<br class="">-    return Impl.getLoadStoreVecRegBitWidth(AddrSpace);<br class="">-  }<br class="">-<br class="">  unsigned getCacheLineSize() override {<br class="">    return Impl.getCacheLineSize();<br class="">  }<br class="">@@ -993,6 +1031,37 @@ public:<br class="">                           const Function *Callee) const override {<br class="">    return Impl.areInlineCompatible(Caller, Callee);<br class="">  }<br class="">+  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) const<br class="">override {<br class="">+    return Impl.getLoadStoreVecRegBitWidth(AddrSpace);<br class="">+  }<br class="">+  bool isLegalToVectorizeLoad(LoadInst *LI) const override {<br class="">+    return Impl.isLegalToVectorizeLoad(LI);<br class="">+  }<br class="">+  bool isLegalToVectorizeStore(StoreInst *SI) const override {<br class="">+    return Impl.isLegalToVectorizeStore(SI);<br class="">+  }<br class="">+  bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes,<br class="">+                                   unsigned Alignment,<br class="">+                                   unsigned AddrSpace) 
const<br class="">override {<br class="">+    return Impl.isLegalToVectorizeLoadChain(ChainSizeInBytes,<br class="">Alignment,<br class="">+                                            AddrSpace);<br class="">+  }<br class="">+  bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes,<br class="">+                                    unsigned Alignment,<br class="">+                                    unsigned AddrSpace) const<br class="">override {<br class="">+    return Impl.isLegalToVectorizeStoreChain(ChainSizeInBytes,<br class="">Alignment,<br class="">+                                             AddrSpace);<br class="">+  }<br class="">+  unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,<br class="">+                               unsigned ChainSizeInBytes,<br class="">+                               VectorType *VecTy) const override {<br class="">+    return Impl.getLoadVectorFactor(VF, LoadSize, ChainSizeInBytes,<br class="">VecTy);<br class="">+  }<br class="">+  unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,<br class="">+                                unsigned ChainSizeInBytes,<br class="">+                                VectorType *VecTy) const override {<br class="">+    return Impl.getStoreVectorFactor(VF, StoreSize,<br class="">ChainSizeInBytes, VecTy);<br class="">+  }<br class="">};<br class=""><br class="">template <typename T><br class=""><br class="">Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h<br class="">URL:<br class=""><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=283099&r1=283098&r2=283099&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=283099&r1=283098&r2=283099&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h<br 
class="">(original)<br class="">+++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Mon<br class="">Oct  3 05:31:34 2016<br class="">@@ -290,8 +290,6 @@ public:<br class=""><br class="">  unsigned getRegisterBitWidth(bool Vector) { return 32; }<br class=""><br class="">-  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) { return<br class="">128; }<br class="">-<br class="">  unsigned getCacheLineSize() { return 0; }<br class=""><br class="">  unsigned getPrefetchDistance() { return 0; }<br class="">@@ -393,6 +391,36 @@ public:<br class="">           (Caller->getFnAttribute("target-features") ==<br class="">            Callee->getFnAttribute("target-features"));<br class="">  }<br class="">+<br class="">+  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) const {<br class="">return 128; }<br class="">+<br class="">+  bool isLegalToVectorizeLoad(LoadInst *LI) const { return true; }<br class="">+<br class="">+  bool isLegalToVectorizeStore(StoreInst *SI) const { return true; }<br class="">+<br class="">+  bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes,<br class="">+                                   unsigned Alignment,<br class="">+                                   unsigned AddrSpace) const {<br class="">+    return true;<br class="">+  }<br class="">+<br class="">+  bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes,<br class="">+                                    unsigned Alignment,<br class="">+                                    unsigned AddrSpace) const {<br class="">+    return true;<br class="">+  }<br class="">+<br class="">+  unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,<br class="">+                               unsigned ChainSizeInBytes,<br class="">+                               VectorType *VecTy) const {<br class="">+    return VF;<br class="">+  }<br class="">+<br class="">+  unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,<br class="">+                                unsigned 
ChainSizeInBytes,<br class="">+                                VectorType *VecTy) const {<br class="">+    return VF;<br class="">+  }<br class="">};<br class=""><br class="">/// \brief CRTP base class for use as a mix-in that aids<br class="">implementing<br class=""><br class="">Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=283099&r1=283098&r2=283099&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original)<br class="">+++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Mon Oct  3<br class="">05:31:34 2016<br class="">@@ -251,10 +251,6 @@ unsigned TargetTransformInfo::getRegiste<br class="">  return TTIImpl->getRegisterBitWidth(Vector);<br class="">}<br class=""><br class="">-unsigned TargetTransformInfo::getLoadStoreVecRegBitWidth(unsigned<br class="">AS) const {<br class="">-  return TTIImpl->getLoadStoreVecRegBitWidth(AS);<br class="">-}<br class="">-<br class="">unsigned TargetTransformInfo::getCacheLineSize() const {<br class="">  return TTIImpl->getCacheLineSize();<br class="">}<br class="">@@ -423,6 +419,44 @@ bool TargetTransformInfo::areInlineCompa<br class="">  return TTIImpl->areInlineCompatible(Caller, Callee);<br class="">}<br class=""><br class="">+unsigned TargetTransformInfo::getLoadStoreVecRegBitWidth(unsigned<br class="">AS) const {<br class="">+  return TTIImpl->getLoadStoreVecRegBitWidth(AS);<br class="">+}<br class="">+<br class="">+bool TargetTransformInfo::isLegalToVectorizeLoad(LoadInst *LI) const<br class="">{<br class="">+  return TTIImpl->isLegalToVectorizeLoad(LI);<br class="">+}<br class="">+<br class="">+bool TargetTransformInfo::isLegalToVectorizeStore(StoreInst *SI)<br class="">const {<br class="">+  return TTIImpl->isLegalToVectorizeStore(SI);<br class="">+}<br class="">+<br 
class="">+bool TargetTransformInfo::isLegalToVectorizeLoadChain(<br class="">+    unsigned ChainSizeInBytes, unsigned Alignment, unsigned<br class="">AddrSpace) const {<br class="">+  return TTIImpl->isLegalToVectorizeLoadChain(ChainSizeInBytes,<br class="">Alignment,<br class="">+                                              AddrSpace);<br class="">+}<br class="">+<br class="">+bool TargetTransformInfo::isLegalToVectorizeStoreChain(<br class="">+    unsigned ChainSizeInBytes, unsigned Alignment, unsigned<br class="">AddrSpace) const {<br class="">+  return TTIImpl->isLegalToVectorizeStoreChain(ChainSizeInBytes,<br class="">Alignment,<br class="">+                                               AddrSpace);<br class="">+}<br class="">+<br class="">+unsigned TargetTransformInfo::getLoadVectorFactor(unsigned VF,<br class="">+                                                  unsigned LoadSize,<br class="">+                                                  unsigned<br class="">ChainSizeInBytes,<br class="">+                                                  VectorType *VecTy)<br class="">const {<br class="">+  return TTIImpl->getLoadVectorFactor(VF, LoadSize,<br class="">ChainSizeInBytes, VecTy);<br class="">+}<br class="">+<br class="">+unsigned TargetTransformInfo::getStoreVectorFactor(unsigned VF,<br class="">+                                                   unsigned<br class="">StoreSize,<br class="">+                                                   unsigned<br class="">ChainSizeInBytes,<br class="">+                                                   VectorType<br class="">*VecTy) const {<br class="">+  return TTIImpl->getStoreVectorFactor(VF, StoreSize,<br class="">ChainSizeInBytes, VecTy);<br class="">+}<br class="">+<br class="">TargetTransformInfo::Concept::~Concept() {}<br class=""><br class="">TargetIRAnalysis::TargetIRAnalysis() : TTICallback(&getDefaultTTI)<br class="">{}<br class=""><br class="">Modified: 
llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp?rev=283099&r1=283098&r2=283099&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp<br class="">(original)<br class="">+++ llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp Mon<br class="">Oct  3 05:31:34 2016<br class="">@@ -80,7 +80,7 @@ unsigned AMDGPUTTIImpl::getRegisterBitWi<br class="">  return Vector ? 0 : 32;<br class="">}<br class=""><br class="">-unsigned AMDGPUTTIImpl::getLoadStoreVecRegBitWidth(unsigned<br class="">AddrSpace) {<br class="">+unsigned AMDGPUTTIImpl::getLoadStoreVecRegBitWidth(unsigned<br class="">AddrSpace) const {<br class="">  switch (AddrSpace) {<br class="">  case AMDGPUAS::GLOBAL_ADDRESS:<br class="">  case AMDGPUAS::CONSTANT_ADDRESS:<br class=""><br class="">Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h?rev=283099&r1=283098&r2=283099&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h<br class="">(original)<br class="">+++ llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h Mon Oct<br class="">3 05:31:34 2016<br class="">@@ -82,7 +82,7 @@ public:<br class=""><br class="">  unsigned getNumberOfRegisters(bool Vector);<br class="">  unsigned getRegisterBitWidth(bool Vector);<br class="">-  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace);<br class="">+  unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) const;<br class="">  unsigned getMaxInterleaveFactor(unsigned VF);<br class=""><br class="">  int getArithmeticInstrCost(<br class=""><br 
class="">Modified: llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp?rev=283099&r1=283098&r2=283099&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp<br class="">(original)<br class="">+++ llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp Mon<br class="">Oct  3 05:31:34 2016<br class="">@@ -429,10 +429,13 @@ void Vectorizer::eraseInstructions(Array<br class="">std::pair<ArrayRef<Instruction *>, ArrayRef<Instruction *>><br class="">Vectorizer::splitOddVectorElts(ArrayRef<Instruction *> Chain,<br class="">                               unsigned ElementSizeBits) {<br class="">-  unsigned ElemSizeInBytes = ElementSizeBits / 8;<br class="">-  unsigned SizeInBytes = ElemSizeInBytes * Chain.size();<br class="">-  unsigned NumRight = (SizeInBytes % 4) / ElemSizeInBytes;<br class="">-  unsigned NumLeft = Chain.size() - NumRight;<br class="">+  unsigned ElementSizeBytes = ElementSizeBits / 8;<br class="">+  unsigned SizeBytes = ElementSizeBytes * Chain.size();<br class="">+  unsigned NumLeft = (SizeBytes - (SizeBytes % 4)) /<br class="">ElementSizeBytes;<br class="">+  if (NumLeft == Chain.size())<br class="">+    --NumLeft;<br class="">+  else if (NumLeft == 0)<br class="">+    NumLeft = 1;<br class="">  return std::make_pair(Chain.slice(0, NumLeft),<br class="">  Chain.slice(NumLeft));<br class="">}<br class=""><br class="">@@ -540,6 +543,10 @@ Vectorizer::collectInstructions(BasicBlo<br class="">      if (!LI->isSimple())<br class="">        continue;<br class=""><br class="">+      // Skip if it's not legal.<br class="">+      if (!TTI.isLegalToVectorizeLoad(LI))<br class="">+        continue;<br class="">+<br class="">      Type *Ty = LI->getType();<br class="">      if 
(!VectorType::isValidElementType(Ty->getScalarType()))<br class="">        continue;<br class="">@@ -565,8 +572,6 @@ Vectorizer::collectInstructions(BasicBlo<br class="">          }))<br class="">        continue;<br class=""><br class="">-      // TODO: Target hook to filter types.<br class="">-<br class="">      // Save the load locations.<br class="">      Value *ObjPtr = GetUnderlyingObject(Ptr, DL);<br class="">      LoadRefs[ObjPtr].push_back(LI);<br class="">@@ -575,6 +580,10 @@ Vectorizer::collectInstructions(BasicBlo<br class="">      if (!SI->isSimple())<br class="">        continue;<br class=""><br class="">+      // Skip if it's not legal.<br class="">+      if (!TTI.isLegalToVectorizeStore(SI))<br class="">+        continue;<br class="">+<br class="">      Type *Ty = SI->getValueOperand()->getType();<br class="">      if (!VectorType::isValidElementType(Ty->getScalarType()))<br class="">        continue;<br class="">@@ -719,6 +728,7 @@ bool Vectorizer::vectorizeStoreChain(<br class="">  unsigned VecRegSize = TTI.getLoadStoreVecRegBitWidth(AS);<br class="">  unsigned VF = VecRegSize / Sz;<br class="">  unsigned ChainSize = Chain.size();<br class="">+  unsigned Alignment = getAlignment(S0);<br class=""><br class="">  if (!isPowerOf2_32(Sz) || VF < 2 || ChainSize < 2) {<br class="">    InstructionsProcessed->insert(Chain.begin(), Chain.end());<br class="">@@ -741,17 +751,11 @@ bool Vectorizer::vectorizeStoreChain(<br class="">  Chain = NewChain;<br class="">  ChainSize = Chain.size();<br class=""><br class="">-  // Store size should be 1B, 2B or multiple of 4B.<br class="">-  // TODO: Target hook for size constraint?<br class="">+  // Check if it's legal to vectorize this chain. 
If not, split the<br class="">chain and<br class="">+  // try again.<br class="">  unsigned EltSzInBytes = Sz / 8;<br class="">  unsigned SzInBytes = EltSzInBytes * ChainSize;<br class="">-  if (SzInBytes > 2 && SzInBytes % 4 != 0) {<br class="">-    DEBUG(dbgs() << "LSV: Size should be 1B, 2B "<br class="">-                    "or multiple of 4B. Splitting.\n");<br class="">-    if (SzInBytes == 3)<br class="">-      return vectorizeStoreChain(Chain.slice(0, ChainSize - 1),<br class="">-                                 InstructionsProcessed);<br class="">-<br class="">+  if (!TTI.isLegalToVectorizeStoreChain(SzInBytes, Alignment, AS)) {<br class="">    auto Chains = splitOddVectorElts(Chain, Sz);<br class="">    return vectorizeStoreChain(Chains.first, InstructionsProcessed)<br class="">    |<br class="">           vectorizeStoreChain(Chains.second,<br class="">           InstructionsProcessed);<br class="">@@ -765,13 +769,15 @@ bool Vectorizer::vectorizeStoreChain(<br class="">  else<br class="">    VecTy = VectorType::get(StoreTy, Chain.size());<br class=""><br class="">-  // If it's more than the max vector size, break it into two<br class="">pieces.<br class="">-  // TODO: Target hook to control types to split to.<br class="">-  if (ChainSize > VF) {<br class="">-    DEBUG(dbgs() << "LSV: Vector factor is too big."<br class="">+  // If it's more than the max vector size or the target has a<br class="">better<br class="">+  // vector factor, break it into two pieces.<br class="">+  unsigned TargetVF = TTI.getStoreVectorFactor(VF, Sz, SzInBytes,<br class="">VecTy);<br class="">+  if (ChainSize > VF || (VF != TargetVF && TargetVF < ChainSize)) {<br class="">+    DEBUG(dbgs() << "LSV: Chain doesn't match with the vector<br class="">factor."<br class="">                    " Creating two separate arrays.\n");<br class="">-    return vectorizeStoreChain(Chain.slice(0, VF),<br class="">InstructionsProcessed) |<br class="">-           
vectorizeStoreChain(Chain.slice(VF),<br class="">InstructionsProcessed);<br class="">+    return vectorizeStoreChain(Chain.slice(0, TargetVF),<br class="">+                               InstructionsProcessed) |<br class="">+           vectorizeStoreChain(Chain.slice(TargetVF),<br class="">InstructionsProcessed);<br class="">  }<br class=""><br class="">  DEBUG({<br class="">@@ -784,9 +790,6 @@ bool Vectorizer::vectorizeStoreChain(<br class="">  // whether we succeed below.<br class="">  InstructionsProcessed->insert(Chain.begin(), Chain.end());<br class=""><br class="">-  // Check alignment restrictions.<br class="">-  unsigned Alignment = getAlignment(S0);<br class="">-<br class="">  // If the store is going to be misaligned, don't vectorize it.<br class="">  if (accessIsMisaligned(SzInBytes, AS, Alignment)) {<br class="">    if (S0->getPointerAddressSpace() != 0)<br class="">@@ -873,6 +876,7 @@ bool Vectorizer::vectorizeLoadChain(<br class="">  unsigned VecRegSize = TTI.getLoadStoreVecRegBitWidth(AS);<br class="">  unsigned VF = VecRegSize / Sz;<br class="">  unsigned ChainSize = Chain.size();<br class="">+  unsigned Alignment = getAlignment(L0);<br class=""><br class="">  if (!isPowerOf2_32(Sz) || VF < 2 || ChainSize < 2) {<br class="">    InstructionsProcessed->insert(Chain.begin(), Chain.end());<br class="">@@ -895,16 +899,11 @@ bool Vectorizer::vectorizeLoadChain(<br class="">  Chain = NewChain;<br class="">  ChainSize = Chain.size();<br class=""><br class="">-  // Load size should be 1B, 2B or multiple of 4B.<br class="">-  // TODO: Should size constraint be a target hook?<br class="">+  // Check if it's legal to vectorize this chain. 
If not, split the<br class="">chain and<br class="">+  // try again.<br class="">  unsigned EltSzInBytes = Sz / 8;<br class="">  unsigned SzInBytes = EltSzInBytes * ChainSize;<br class="">-  if (SzInBytes > 2 && SzInBytes % 4 != 0) {<br class="">-    DEBUG(dbgs() << "LSV: Size should be 1B, 2B "<br class="">-                    "or multiple of 4B. Splitting.\n");<br class="">-    if (SzInBytes == 3)<br class="">-      return vectorizeLoadChain(Chain.slice(0, ChainSize - 1),<br class="">-                                InstructionsProcessed);<br class="">+  if (!TTI.isLegalToVectorizeLoadChain(SzInBytes, Alignment, AS)) {<br class="">    auto Chains = splitOddVectorElts(Chain, Sz);<br class="">    return vectorizeLoadChain(Chains.first, InstructionsProcessed) |<br class="">           vectorizeLoadChain(Chains.second, InstructionsProcessed);<br class="">@@ -918,22 +917,20 @@ bool Vectorizer::vectorizeLoadChain(<br class="">  else<br class="">    VecTy = VectorType::get(LoadTy, Chain.size());<br class=""><br class="">-  // If it's more than the max vector size, break it into two<br class="">pieces.<br class="">-  // TODO: Target hook to control types to split to.<br class="">-  if (ChainSize > VF) {<br class="">-    DEBUG(dbgs() << "LSV: Vector factor is too big. 
"<br class="">-                    "Creating two separate arrays.\n");<br class="">-    return vectorizeLoadChain(Chain.slice(0, VF),<br class="">InstructionsProcessed) |<br class="">-           vectorizeLoadChain(Chain.slice(VF),<br class="">InstructionsProcessed);<br class="">+  // If it's more than the max vector size or the target has a<br class="">better<br class="">+  // vector factor, break it into two pieces.<br class="">+  unsigned TargetVF = TTI.getLoadVectorFactor(VF, Sz, SzInBytes,<br class="">VecTy);<br class="">+  if (ChainSize > VF || (VF != TargetVF && TargetVF < ChainSize)) {<br class="">+    DEBUG(dbgs() << "LSV: Chain doesn't match with the vector<br class="">factor."<br class="">+                    " Creating two separate arrays.\n");<br class="">+    return vectorizeLoadChain(Chain.slice(0, TargetVF),<br class="">InstructionsProcessed) |<br class="">+           vectorizeLoadChain(Chain.slice(TargetVF),<br class="">InstructionsProcessed);<br class="">  }<br class=""><br class="">  // We won't try again to vectorize the elements of the chain,<br class="">  regardless of<br class="">  // whether we succeed below.<br class="">  InstructionsProcessed->insert(Chain.begin(), Chain.end());<br class=""><br class="">-  // Check alignment restrictions.<br class="">-  unsigned Alignment = getAlignment(L0);<br class="">-<br class="">  // If the load is going to be misaligned, don't vectorize it.<br class="">  if (accessIsMisaligned(SzInBytes, AS, Alignment)) {<br class="">    if (L0->getPointerAddressSpace() != 0)<br class=""><br class=""><br class="">_______________________________________________<br class="">llvm-commits mailing list<br class="">llvm-commits@lists.llvm.org<br class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits<br class=""><br class=""></blockquote><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: 
start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">--<span class="Apple-converted-space"> </span></span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Hal Finkel</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Lead, Compiler Technology and Programming Languages</span><br style="font-family: 
Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Leadership Computing Facility</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Argonne National Laboratory</span></div></blockquote></div><br class=""></body></html>