[PATCH] InstCombine: Improve the result bitvect type when folding (cmp pred (load (gep GV, i)) C) to a bit test.

Thu Mar 21 09:39:31 PDT 2013

Hi Duncan !

On Thu, Mar 21, 2013 at 2:57 PM, Duncan Sands <baldrick at free.fr> wrote:
> Hi Ahmed,
>
>
>> This patch makes InstCombine choose better types when folding (cmp
>> pred (load (gep GV, i)) C) to a bit test (((bitvector >> i) & 1) !=
>> 0).
>>
>> As the commit says:
>>
>>> The original code used i32, and i64 if legal. This introduced unneeded
>>> casts when they aren't legal, or when the index variable i has another
>>> type. In order of preference: try to use i's type, looking past a cast
>>> if there is one;  use the smallest fitting legal type (using an added
>>> DataLayout method); default to i32.
>>> A testcase checks that this works when everything is i16.
>>
>>
>> I'm not convinced about adding DataLayout::getSmallestLegalIntType,
>> but it seemed like the natural way to do it (mirrorring
>> isLegalInteger).
>
>
>
>> --- a/include/llvm/IR/DataLayout.h
>> +++ b/include/llvm/IR/DataLayout.h
>> @@ -350,6 +350,9 @@ public:
>>    /// type.
>>    Type *getIntPtrType(Type *) const;
>>
>> +  /// getSmallestLegalIntType - Return an integer type with size
>
>
> This comment is incomplete.
>
>> +  Type *getSmallestLegalIntType(LLVMContext &C, unsigned Width = 0)
>> const;
>> +
>>    /// getIndexedOffset - return the offset from the beginning of the type
>> for
>>    /// the specified indices.  This is used to implement getelementptr.
>>    uint64_t getIndexedOffset(Type *Ty, ArrayRef<Value *> Indices) const;
>
>
>> --- a/lib/Transforms/InstCombine/InstCombineCompares.cpp
>> +++ b/lib/Transforms/InstCombine/InstCombineCompares.cpp
>> @@ -443,20 +443,35 @@ FoldCmpLoadFromIndexedGlobal(GetElementPtrInst *GEP,
>> GlobalVariable *GV,
>>    }
>>
>>
>> -  // If a 32-bit or 64-bit magic bitvector captures the entire comparison
>> state
>> +  // If a magic bitvector captures the entire comparison state
>>    // of this load, replace it with computation that does:
>>    //   ((magic_cst >> i) & 1) != 0
>> -  if (ArrayElementCount <= 32 ||
>> -      (TD && ArrayElementCount <= 64 && TD->isLegalInteger(64))) {
>> -    Type *Ty;
>> -    if (ArrayElementCount <= 32)
>> +  {
>> +    Type *Ty = 0;
>> +
>> +    // Default to either a legal type, or i32
>> +    if (TD)
>> +      Ty = TD->getSmallestLegalIntType(Init->getContext(),
>> ArrayElementCount);
>> +    else if (ArrayElementCount <= 32)
>>        Ty = Type::getInt32Ty(Init->getContext());
>> -    else
>> -      Ty = Type::getInt64Ty(Init->getContext());
>> -    Value *V = Builder->CreateIntCast(Idx, Ty, false);
>> -    V = Builder->CreateLShr(ConstantInt::get(Ty, MagicBitvector), V);
>> -    V = Builder->CreateAnd(ConstantInt::get(Ty, 1), V);
>> -    return new ICmpInst(ICmpInst::ICMP_NE, V, ConstantInt::get(Ty, 0));
>> +
>> +    // Use Idx' type; if it's a cast, look past it
>
>
> Given that you might have already determined a value for Ty, shouldn't this
> logic only apply when Ty is null?  If not, then shouldn't you first check
> this
> case and only do the above Ty logic afterwards (if Ty is null) in order to
> reduce unnecessary work.  Also I think you mean "Idx" in the comment not
> "Idx'".  Finally, why look through casts of Idx here, why not just let the
> optimizers zap the extra cast if they think it is a good idea?
>
>> +    CastInst *IdxC = dyn_cast<CastInst>(Idx);
>> +    if (IdxC && IdxC->isIntegerCast() &&
>> +        ArrayElementCount <= IdxC->getSrcTy()->getIntegerBitWidth() &&
>> +        ShouldOptimizeCast(IdxC->getOpcode(), IdxC->getOperand(0),
>> +                           Idx->getType())) {
>> +      Ty = IdxC->getSrcTy();
>> +      Idx = IdxC->getOperand(0);
>> +    } else if (ArrayElementCount <= Idx->getType()->getIntegerBitWidth())
>> +      Ty = Idx->getType();
>
>
> Style: the above line should be in curly brackets {} because the preceding
> "if" branch was.
>
>> +
>> +    if (Ty != 0) {
>> +      Value *V = Builder->CreateIntCast(Idx, Ty, false);
>> +      V = Builder->CreateLShr(ConstantInt::get(Ty, MagicBitvector), V);
>> +      V = Builder->CreateAnd(ConstantInt::get(Ty, 1), V);
>> +      return new ICmpInst(ICmpInst::ICMP_NE, V, ConstantInt::get(Ty, 0));
>> +    }
>>    }
>>
>>    return 0;
>
>
> Lastly, I don't think your testcase tests all cases, for example it doesn't
> look like it tests the case when Idx is a cast, so can you please beef it
> up.

The cast was implicitly added as InstCombine casts the getelementptr's
operands to the DataLayout pointer type (here i64). I originally added
the cast-aware part because it avoided introducing pointer-sized
arithmetic when keeping the index type is enough, but I'm not sure
what's the best thing to do here.
Anyway, I updated the patch to no longer take into account casted
indices (among the other fixes, sorry for the hastiness!).
Also, I replaced the testcase with a cleaner one, next to its i32 equivalent.

Thanks!

> Best wishes, Duncan.
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-InstCombine-Improve-the-result-bitvect-type-when-fol.patch
Type: application/octet-stream
Size: 4887 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130321/4af52ca8/attachment.obj>