[PATCH] D93708: [AMDGPU] Add a new Clamp Pattern to the GlobalISel Path.
    Thomas Symalla via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Wed Jan 13 06:41:42 PST 2021
    
    
  
tsymalla marked 6 inline comments as done.
tsymalla added a comment.
No, I doubt there is support for this in the other path.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp:71
+
+  LLVM_DEBUG(dbgs() << "Matching Clamp i64 to i16\n");
+
----------------
arsenm wrote:
> Probably shouldn't print here especially when failure is likely. Ideally we would get a message from the generated combiner based on the defined combine name
I removed this message. There aren't any other debug outputs in the other Legalize combines, too.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp:137-141
+  auto CvtPk = B.buildInstr(CvtOpcode);
+  CvtPk.addDef(CvtDst);
+  CvtPk.addReg(Hi32);
+  CvtPk.addReg(Lo32);
+  CvtPk.setMIFlags(MI.getFlags());
----------------
Petar.Avramovic wrote:
> You can replace this block with `B.buildInstr(CvtOpcode, {CvtDst}, {Hi32, Lo32}, MI.getFlags());`
Thanks. I removed this for the other `buildInstr`, too.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp:201
       : CombinerInfo(/*AllowIllegalOps*/ true, /*ShouldLegalizeIllegal*/ false,
-                     /*LegalizerInfo*/ nullptr, EnableOpt, OptSize, MinSize),
+                     /*LegalizerInfo*/ LI, EnableOpt, OptSize, MinSize),
         KB(KB), MDT(MDT) {
----------------
Petar.Avramovic wrote:
> Why do you need LegalizerInfo? CombinerHelper::isLegalOrBeforeLegalizer relies on LI being nullptr before legalizer.
Thanks. This is not needed here, so I removed it.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp:134
+
+  constexpr unsigned int CvtOpcode = AMDGPU::V_CVT_PK_I16_I32_e64;
+  assert(MI.getOpcode() != CvtOpcode);
----------------
arsenm wrote:
> tsymalla wrote:
> > arsenm wrote:
> > > It would be better to use a generic operation pre-regbank select rather than forcing register classes here
> > How would I go for this?
> You can follow the example of the other G_AMDGPU_* pseudos (e.g. G_AMDGPU_CVT_UBYTE0). I'm a bit ambivalent on whether we should do this in this case, as there are still some unanswered questions about how regbankselect should work. However, so far we've been using pseudos like this
Thanks. With some help I was able to get this working and learned something new. This should now use the new pseudo and select it into the corresponding machine instruction.
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93708/new/
https://reviews.llvm.org/D93708
    
    
More information about the llvm-commits
mailing list