[PATCH] D105253: GlobalISel: Handle lowering non-power-of-2 extloads
    Matt Arsenault via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Thu Jul  1 11:20:07 PDT 2021

arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir:15995-16020
+    ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+    ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+    ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+    ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+    ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+    ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+    ; GFX9-MESA: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
----------------
foad wrote:
> Why is this so much more convoluted than the GFX9-HSA case? The loads look the same, it's just all the shifting and ORing afterwards that looks crazy here.
This is an artifact from the terrible way we currently handle unaligned accesses. We treat it as a narrowScalar action, which doesn't really make sense. I'm trying to move towards making widenScalar/narrowScalar only touch the register size, and leave the memory access alone. Unaligned access decomposition is a kind of lowering, and only tangentially related to the register types needed after legalization.
The HSA case enables unaligned access and the Mesa case doesn't, so in the Mesa case we start out by reporting that we need to narrow the s32 result to s24. When that load is legalized, it ends up producing this mess. Once lowering handles alignment decomposition, the two cases should look the same.
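For readers unfamiliar with what "alignment decomposition" amounts to, here is a minimal arithmetic sketch. It is not the LegalizerHelper code and not what the patch emits. It only models the idea that a 24-bit little-endian load can be rebuilt from smaller loads combined with shifts and ORs. The function names and the byte-buffer memory model are hypothetical, chosen for illustration.

```python
# Illustrative model only: memory is a bytes object, values are little-endian.
# Function names are hypothetical and do not correspond to any LLVM API.

def load_s24_bytewise(mem: bytes, addr: int) -> int:
    """Rebuild a 24-bit value from three 1-byte loads, shifts, and ORs."""
    b0 = mem[addr]          # bits 0..7
    b1 = mem[addr + 1]      # bits 8..15
    b2 = mem[addr + 2]      # bits 16..23
    return b0 | (b1 << 8) | (b2 << 16)


def load_s24_split(mem: bytes, addr: int) -> int:
    """Rebuild the same value from one 2-byte load and one 1-byte load."""
    lo = int.from_bytes(mem[addr:addr + 2], "little")  # bits 0..15
    hi = mem[addr + 2]                                  # bits 16..23
    return lo | (hi << 16)
```

Both functions return the same value for any address, which is the sense in which the decomposition is a lowering of the memory access rather than a change to the register type.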
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105253/new/
https://reviews.llvm.org/D105253
    
    