[PATCH] D33243: [Atomics][LoopIdiom] Recognize unordered atomic memcpy

Daniel Neilson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 1 07:55:37 PDT 2017


dneilson added inline comments.


================
Comment at: lib/Transforms/Scalar/LoopIdiomRecognize.cpp:1012
+    unsigned StoreSizeBits = StoreSize * 8;
+    if (StoreSizeBits > TTI->getRegisterBitWidth(false))
+      return false;
----------------
anna wrote:
> dneilson wrote:
> > This is the replacement due to the inability to use RTLIB in ScalarOpt due to a circular dependency.  It's not ideal as it's theoretically possible that larger width versions of the intrinsic's lib call will exist (ex: ones that are implemented to use vector regs for the load/stores), but we won't be able to exploit any that are wider than scalar register width.
> > 
> Is it possible that even if the `StoreSizeBits` is within the max `registerBitWidth` for the target arch, we do not have the corresponding lib call?
> Could you please check if `TLI` has the information you require or can be modified to do so - it seems to have `memset`, `memcpy` etc. 
Theoretically it's possible, yes. Lib calls for sizes 1, 2, 4, 8, and 16 bytes are currently defined. It a platform doesn't have one of those sizes's definition available and we create the call, then the result will be a link error -- which seems to be intentional per the discussion around the design of the element atomic memcpy. If scalar registers with more than 16 bytes are available on the platform, then we'd end up creating the intrinsic call and then blowing up when lowering it to a libcall in SelectionDAGBuilder -- this is the part I'd like to avoid.

TLI seems to be concerned with system lib calls (libc, libm, etc), and not the __llvm_ lib calls, so I don't think that's a suitable place for this.


https://reviews.llvm.org/D33243





More information about the llvm-commits mailing list