[PATCH] D116930: [DAGCombine] Fold SRA of a load into a narrower sign-extending load
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 11 05:36:51 PST 2022
spatel added a comment.
Nice transform! Might be worth including an IR proof in the description to make it clearer:
https://alive2.llvm.org/ce/z/yYuasA
================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12142
+ // accessing any of the loaded bytes.
+ ShAmt = N1C->getZExtValue();
+ uint64_t MemoryWidth = LN0->getMemoryVT().getScalarSizeInBits();
----------------
The various shift-amount variables are difficult to follow. This is really a question about the existing code - can we make it clearer (through renaming or code comments) how "ShAmt", "ShiftAmt" and "ShLeftAmt" differ?
If we can improve that, could we also pull the bailout conditions into a small helper/lambda so the SRA and SRL paths share them instead of duplicating code?
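For example, a shared bailout along these lines might work for both opcodes (just a sketch - the lambda name, parameters, and exact conditions are placeholders, not what reduceLoadWidth currently has):
```
// Sketch: one early-exit check shared by the SRA and SRL narrowing paths.
auto CanNarrowShiftedLoad = [&](LoadSDNode *Load, uint64_t ShiftAmtBits,
                                unsigned NarrowBits) -> bool {
  uint64_t MemoryWidth = Load->getMemoryVT().getScalarSizeInBits();
  // If the shift amount covers the whole memory width, we are not accessing
  // any of the loaded bytes (for SRA we would only be splatting the sign bit).
  if (ShiftAmtBits >= MemoryWidth)
    return false;
  // The narrowed access must stay within the bytes of the original load.
  return ShiftAmtBits + NarrowBits <= MemoryWidth;
};
```
Then the two paths would only differ in how they build the narrower extending load.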
================
Comment at: llvm/test/CodeGen/X86/combine-sra-load.ll:74-75
; Negative test. All bits loaded from memory are shifted out, so we can fold
; away the shift.
define i32 @sra_of_sextload_no_fold(i16* %p) {
----------------
This comment is not accurate. We are replicating (splatting) the sign bit of the loaded i16 across 32 bits, so there is still a shift.
In IR, instcombine would transform this into:
```
define i32 @sra_of_sextload_no_fold(i16* %p) {
%load = load i16, i16* %p, align 2
%1 = ashr i16 %load, 15
%shift = sext i16 %1 to i32
ret i32 %shift
}
```
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116930/new/
https://reviews.llvm.org/D116930