[PATCH] D55570: [AMDGPU] Improve SDWA generation for V_OR_B32_E32.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 1 09:33:17 PST 2019


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:694
     auto ValSrc = Src1;
     auto Imm = foldToImm(*Src0);
 
----------------
Seems like a bad use of auto


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:707
+      Msk = WORD_0;
+    else if (*Imm == 0x0ffff0000 || *Imm == -65536)
+      Msk = WORD_1;
----------------
ronlieb wrote:
> arsenm wrote:
> > These are the same thing
> Actually, these are not always the same in the LLVM IR for immediate constants. When I dump out the Imm value, one can see:
> 
> This is from sdwa-ors.mir
> IMM 4294901760
> ffff0000
> 
> and this is from load-log16.ll
> IMM -65536
> ffffffffffff0000
> 
> It would probably be easier to simply preserve the low 32 bits of the Imm, which would allow me to get rid of the two additional expressions
> || *Imm == -65536
> and
> || *Imm == -16777216
> 
> 
This is because foldToImm returns int64_t and somebody didn't sign extend this constant properly somewhere. You could truncate the constant, but it probably should have been sign extended in the first place.
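
To illustrate the mismatch discussed above, here is a minimal standalone sketch. It is not code from the patch: the helper name low32 is hypothetical, and the only assumption taken from the thread is that the immediate arrives as an int64_t. It shows that the two dumped values differ as 64-bit integers but agree once only the low 32 bits are kept, and the same holds for the -16777216 case.

```cpp
#include <cstdint>

// Hypothetical helper (not part of SIPeepholeSDWA.cpp): keep only the
// low 32 bits of a 64-bit immediate, discarding any sign extension.
constexpr uint32_t low32(int64_t Imm) { return static_cast<uint32_t>(Imm); }

// The two values from the thread are different 64-bit integers:
//   4294901760 == 0x00000000ffff0000 (zero extended)
//   -65536     == 0xffffffffffff0000 (sign extended)
static_assert(int64_t{4294901760} != int64_t{-65536},
              "the raw 64-bit values differ");

// After truncation both collapse to the same 32-bit pattern, so a single
// comparison covers both spellings.
static_assert(low32(4294901760) == 0xffff0000u, "zero-extended form");
static_assert(low32(-65536) == 0xffff0000u, "sign-extended form");

// The same reasoning applies to the other constant mentioned above.
static_assert(low32(-16777216) == 0xff000000u, "sign-extended 0xff000000");
```

Truncating at the comparison site is only one option. If the constant is instead sign extended consistently where it is created, the checks could compare against the negative value alone.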


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55570/new/

https://reviews.llvm.org/D55570




