[PATCH] D145220: [X86][InstCombine] Simplify some `pternlog` intrinsics

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 3 14:07:55 PST 2023


craig.topper added inline comments.


================
Comment at: llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp:585
+
+  bool ABIsConst = isa<Constant>(A) && isa<Constant>(B);
+  bool ACIsConst = isa<Constant>(A) && isa<Constant>(C);
----------------
goldstein.w.n wrote:
> craig.topper wrote:
> > Do we need to make sure we are looking at vectors of ConstantInts rather than vectors of ConstantExpr?
> Will the ALU not fold with `ConstantExpr`? How can I test that?
ConstantExpr can involve things that are conceptually constant but aren’t known to the compiler. Like the address of a global variable. SelectionDAG will expand ConstantExprs into instructions.


================
Comment at: llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp:588
+  bool BCIsConst = isa<Constant>(B) && isa<Constant>(C);
+  bool ABCIsConst = isa<Constant>(A) && isa<Constant>(B) && isa<Constant>(C);
+
----------------
goldstein.w.n wrote:
> goldstein.w.n wrote:
> > craig.topper wrote:
> > > craig.topper wrote:
> > > > If everything is const can we someone use the immediate as a lookup table instead of manually describing every case?
> > > somehow* not someone
> > > If everything is const can we someone use the immediate as a lookup table instead of manually describing every case?
> > 
> > Yes although there are some IMM where we only need AC/AB/BC so think we would still want to check the IMM.
> > 
> > Are you concerned about compile time overhead? If not keeping the same logic for const / non-const seems simpler and less error prone.
> > > If everything is const can we someone use the immediate as a lookup table instead of manually describing every case?
> > 
> > Yes although there are some IMM where we only need AC/AB/BC so think we would still want to check the IMM.
> > 
> > Are you concerned about compile time overhead? If not keeping the same logic for const / non-const seems simpler and less error prone.
> 
> If all three are const we can just implement it with a loop above all the bits. Its just:
> ```
> for(i : bitwidth()) {
> res[i] = (imm >> ((A[i] << 2) + (B[i] << 1) + (C[i] << 0))) & 1
> }
> ```
> but still issue of we will miss cases where `C` is non-constant but also truly unneeded.
> 
I really want there to be some nice algorithm to do this optimization. It’s not easy to see that all 256 entries in the switch are correct. We have to trust that you transcribed them from Intel’s table correctly or we need to check that ourselves.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145220/new/

https://reviews.llvm.org/D145220



More information about the llvm-commits mailing list