[PATCH] D34141: [X86] Recognize constant arrays with special values and replace loads from it with subtract and shift instructions, which then may be replaced by BZHI machine instruction.

Wed Jun 21 10:14:37 PDT 2017

RKSimon added a comment.

In https://reviews.llvm.org/D34141#786536, @delena wrote:

> In https://reviews.llvm.org/D34141#786514, @RKSimon wrote:
>
> > While I don't want to end up forcing you to over-engineer this, it seems this pass is currently very specific and should be generalised to make it straightforward to work for constants arrays of other patterns that are easily re-materializable (upper/lower masks, single bit masks, etc.) - "SubstituteLoadWithRematerializationPass", possibly driven by the TTI cost models?
>
>
> I agree that the name of the pass should be more generic. 
>  Do you mean renaming by "generalization" ? Or you propose to add this transformation for all targets and ask TTI about profitability of this transformation?

By generalization I mean setup the pass so that other patterns can be easily added in the future (e.g. 0xFF, 0xFE, ..., 0x80, 0x00 high 'sign' masks are common too).

I think its OK to keep this as X86 only initially, but it shouldn't be BMI2 only - checking the cost of a sub + shift compared to a constant pool load should work alright (even better if PR31274 ever gets addressed).

================
Comment at: include/llvm/CodeGen/Passes.h:423

+  FunctionPass *createSubstituteLoadWithSubShrPass();
+
----------------
Comment describing the pass.

================
Comment at: lib/CodeGen/SubstituteLoadWithSubShr.cpp:86
+              dyn_cast<ConstantInt>(Init->getAggregateElement(i));
+          if (Elem->getZExtValue() != (((uint64_t)1 << i) - 1))
+            return nullptr;
----------------
dyn_cast<> can fail - add a check for null:
```
if (!Elem || Elem->getZExtValue() != (((uint64_t)1 << i) - 1))
```

================
Comment at: lib/CodeGen/SubstituteLoadWithSubShr.cpp:102
+        APInt ElemSize =
+            APInt(ElemTy->getScalarSizeInBits(), ElemTy->getScalarSizeInBits());
+        Constant *ElemSizeConst = Constant::getIntegerValue(ElemTy, ElemSize);
----------------
Duplicate code - pull out
```
unsigned EltSizeInBits = ElemTy->getScalarSizeInBits();
APInt ElemSize = APInt(EltSizeInBits, EltSizeInBits);
```

================
Comment at: test/CodeGen/X86/replace-load-with-sub-shr.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+bmi2 | FileCheck %s
+
----------------
Test without bmi2 as well - the pass currently runs on that too.

https://reviews.llvm.org/D34141