[PATCH] D122105: [SystemZ] Patchset for expanding memcpy/memset using at most 2 stores.

Jonas Paulsson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 20 12:02:30 PDT 2022


jonpa created this revision.
jonpa added a reviewer: uweigand.
Herald added subscribers: steven.zhang, hiraditya.
Herald added a project: All.
jonpa requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

This is my proposal for a combined patch that expands memcpys/memsets of more than 16 and up to 32 bytes in length, handles the conversion to VREP in combineSTORE(), and rejects big displacements for vector types in isLegalAddressingMode(), which helps the DAGCombiner handle the multiple uses of the addresses used by the load/store sequences. In addition, GEP offset splitting is enabled, as it seems sensible to do this when the big displacements are rejected, even though this is not really related to the memcpys/memsets. It would probably be possible to do only the memop expansions and their handling, but for a performance measurement I think we could do the whole set and expect no regressions, and it would make sense generally...
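As a rough illustration of the length range above, here is a minimal standalone sketch. The helper names are hypothetical and do not come from the patch, and it assumes 16-byte vector stores; the actual expansion may well use a smaller store for the tail.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): true when a memcpy/memset of
// Len bytes is in the range described above (more than 16, at most 32).
constexpr bool isInExpandedRange(uint64_t Len) {
  return Len > 16 && Len <= 32;
}

// Assumption: each store covers at most 16 bytes (one vector register).
// Under that assumption, this is how many stores cover Len bytes.
constexpr uint64_t numVectorStores(uint64_t Len) {
  return (Len + 15) / 16;
}

// Every length in the expanded range needs exactly two such stores.
inline void checkExpandedRange() {
  for (uint64_t Len = 17; Len <= 32; ++Len)
    assert(isInExpandedRange(Len) && numVectorStores(Len) == 2);
}
```

Under these assumptions, 32 bytes is the largest length that still fits the limit of 2 stores, and 16 bytes or less is left to a single MVC.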

This is based on https://reviews.llvm.org/D120277 "[SystemZ] Expand some memcpys/memsets into Load/Store sequences." and https://reviews.llvm.org/D120531 "[SystemZ] Use VREP for storing replicated regs/immediates." with a limit of 2 stores per memcpy/memset and handling of replication in combineSTORE().

Tests:

- combineSTORE() general handling of replicated values: store-replicated-vals.ll
- Memcpy expansions: memcpy-03.ll
- Memset expansions: memset-08.ll
- Splitting big GEP offsets in CGP: codegenprepare-gepoffs-split.ll
- Rejecting big displacements for vector types in isLegalAddressingMode(): dag-combine-06.ll
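To make the "big displacements" check concrete, here is a minimal standalone sketch. It assumes that SystemZ vector loads/stores only encode a 12-bit unsigned displacement while scalar accesses can use a 20-bit signed one; the function names are hypothetical and this is not the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: vector memory accesses encode an unsigned 12-bit displacement.
constexpr bool fitsUnsigned12(int64_t Disp) {
  return Disp >= 0 && Disp < 4096;
}

// Assumption: scalar memory accesses can use a signed 20-bit displacement.
constexpr bool fitsSigned20(int64_t Disp) {
  return Disp >= -524288 && Disp < 524288;
}

// Hypothetical stand-in for the displacement part of an addressing-mode
// legality query: vector accesses get the stricter range.
constexpr bool isLegalDisplacement(bool IsVectorAccess, int64_t Disp) {
  return IsVectorAccess ? fitsUnsigned12(Disp) : fitsSigned20(Disp);
}

inline void checkDisplacements() {
  assert(isLegalDisplacement(/*IsVectorAccess=*/true, 4095));
  assert(!isLegalDisplacement(/*IsVectorAccess=*/true, 4096));
  assert(isLegalDisplacement(/*IsVectorAccess=*/false, 4096));
}
```

With a rule of this shape, an offset such as 4096 is reported as illegal for a vector access, so the address computation has to be materialized separately, which is where splitting big GEP offsets becomes relevant.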

combineSTORE():

- isOnlyUsedByStores() is extended to also handle a BuildVectorSDNode. This is needed to handle tests memset-08.ll/reg21()-reg24() optimally, but it is NFC on benchmarks. I guess these cases might as well be handled properly.
- The handling in combineSTORE() is somewhat involved, as it handles not just one case but scalar and vector replication of both immediates and registers. It makes sense to me to have this, as it eliminates scalar multiplications in cases other than memcpy/memset expansions as well, but as mentioned before, it would be possible to change SelectionDAGBuilder to emit the splat directly instead, which would then handle just the memcpy/memset expansions in a simpler way.
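For context on the scalar multiplications mentioned above: a byte can be replicated across a wider scalar by multiplying with a constant whose bytes are all 0x01, which is presumably the kind of pattern the combine recognizes. A minimal standalone sketch follows; the function names are hypothetical and not from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Replicate a byte into all 8 bytes of a 64-bit value by multiplication.
// No carries occur between byte lanes because each partial product is < 256.
constexpr uint64_t splatByteByMultiply(uint8_t B) {
  return static_cast<uint64_t>(B) * 0x0101010101010101ULL;
}

// The same replication written with shifts and ORs, for comparison.
constexpr uint64_t splatByteByShifts(uint8_t B) {
  uint64_t V = B;
  V |= V << 8;   // 2 copies
  V |= V << 16;  // 4 copies
  V |= V << 32;  // 8 copies
  return V;
}

// Both forms agree for every possible byte value.
inline void checkSplat() {
  for (unsigned I = 0; I < 256; ++I)
    assert(splatByteByMultiply(static_cast<uint8_t>(I)) ==
           splatByteByShifts(static_cast<uint8_t>(I)));
}
```

A replicating vector instruction can produce the same bit pattern directly from the byte, so the multiply is not needed when the value is only stored.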

Common code changes:

- Check for MVT::Untyped in findOptimalMemOpLowering(), in which case the memop is not expanded to loads/stores. For SystemZ, this would be in cases where a single MVC of length 16 or less could be used.


https://reviews.llvm.org/D122105

Files:
  llvm/include/llvm/CodeGen/TargetLowering.h
  llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
  llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
  llvm/lib/Target/SystemZ/SystemZISelLowering.h
  llvm/test/CodeGen/SystemZ/codegenprepare-gepoffs-split.ll
  llvm/test/CodeGen/SystemZ/dag-combine-06.ll
  llvm/test/CodeGen/SystemZ/memcpy-03.ll
  llvm/test/CodeGen/SystemZ/memset-08.ll
  llvm/test/CodeGen/SystemZ/store-replicated-vals.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D122105.416795.patch
Type: text/x-patch
Size: 43854 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220320/68506ed0/attachment.bin>

