[llvm] r204076 - Use range metadata instead of introducing selects.

Dan Gohman dan433584 at gmail.com
Tue Mar 25 06:49:44 PDT 2014


Hi Lang,

I can reproduce the performance regression on fourinarow, at least. With
the patch, the code size and static instruction count of the benchmark's
one embarassingly-hot function is lower, the dynamic instruction count is
lower, and the stack frame is smaller, but it still runs slower.
Instruction selection is basically the same, except that there are fewer
cmovs. There appears to be a minor difference in instruction scheduling in
the hot function. The regression disappeared when I experimented with
non-default values for -pre-RA-sched. However, I'm not prepared for the
adventure of changing the instruction scheduler's heuristics at this time,
so I'll just let this patch go for now.

Sorry for the trouble,

Dan


On Sat, Mar 22, 2014 at 9:32 PM, Lang Hames <lhames at gmail.com> wrote:

> Hi Dan,
>
> I've reverted this in r204558. Sorry for the hassle, but I didn't want to
> let too much pile up on top of it given how serious the regressions were.
>
> It should be easy to reproduce the regressions by running the nightly test
> suite on x86-64. If you don't see the same issues I saw please let me know
> and we can compare notes.
>
> Cheers,
> Lang.
>
>
> On Fri, Mar 21, 2014 at 11:21 PM, Lang Hames <lhames at gmail.com> wrote:
>
>> Hi Dan,
>>
>> We had quite a few regressions due to this on both arm and x86-64. The
>> extreme case was MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow,
>> which regressed by 20% on x86-64 at -O3 -flto.
>>
>> Is it possible to fix this, or revert it until the performance issue can
>> be worked out?
>>
>> Cheers,
>> Lang.
>>
>>
>> On Mon, Mar 17, 2014 at 12:57 PM, Dan Gohman <dan433584 at gmail.com> wrote:
>>
>>> Author: djg
>>> Date: Mon Mar 17 14:57:04 2014
>>> New Revision: 204076
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=204076&view=rev
>>> Log:
>>> Use range metadata instead of introducing selects.
>>>
>>> When GlobalOpt has determined that a GlobalVariable only ever has two
>>> values,
>>> it would convert the GlobalVariable to a boolean, and introduce
>>> SelectInsts
>>> at every load, to choose between the two possible values. These
>>> SelectInsts
>>> introduce overhead and other unpleasantness.
>>>
>>> This patch makes GlobalOpt just add range metadata to loads from such
>>> GlobalVariables instead. This enables the same main optimization (as
>>> seen in
>>> test/Transforms/GlobalOpt/integer-bool.ll), without introducing selects.
>>>
>>> The main downside is that it doesn't get the memory savings of shrinking
>>> such
>>> GlobalVariables, but this is expected to be negligible.
>>>
>>> Added:
>>>     llvm/trunk/test/Transforms/GlobalOpt/twostore-gv-range.ll
>>> Modified:
>>>     llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp
>>>     llvm/trunk/test/Transforms/GlobalOpt/integer-bool.ll
>>>
>>> Modified: llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp?rev=204076&r1=204075&r2=204076&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp (original)
>>> +++ llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp Mon Mar 17 14:57:04 2014
>>> @@ -1588,18 +1588,16 @@ static bool OptimizeOnceStoredGlobal(Glo
>>>    return false;
>>>  }
>>>
>>> -/// TryToShrinkGlobalToBoolean - At this point, we have learned that
>>> the only
>>> +/// TryToAddRangeMetadata - At this point, we have learned that the only
>>>  /// two values ever stored into GV are its initializer and OtherVal.
>>>  See if we
>>> -/// can shrink the global into a boolean and select between the two
>>> values
>>> -/// whenever it is used.  This exposes the values to other scalar
>>> optimizations.
>>> -static bool TryToShrinkGlobalToBoolean(GlobalVariable *GV, Constant
>>> *OtherVal) {
>>> +/// can annotate loads from it with range metadata describing this.
>>> +/// This exposes the values to other scalar optimizations.
>>> +static bool TryToAddRangeMetadata(GlobalVariable *GV, Constant
>>> *OtherVal) {
>>>    Type *GVElType = GV->getType()->getElementType();
>>>
>>> -  // If GVElType is already i1, it is already shrunk.  If the type of
>>> the GV is
>>> -  // an FP value, pointer or vector, don't do this optimization because
>>> a select
>>> -  // between them is very expensive and unlikely to lead to later
>>> -  // simplification.  In these cases, we typically end up with "cond ?
>>> v1 : v2"
>>> -  // where v1 and v2 both require constant pool loads, a big loss.
>>> +  // If GVElType is already i1, it already has a minimal range. If the
>>> type of
>>> +  // the GV is an FP value, pointer or vector, don't do this
>>> optimization
>>> +  // because range metadata is currently only supported on scalar
>>> integers.
>>>    if (GVElType == Type::getInt1Ty(GV->getContext()) ||
>>>        GVElType->isFloatingPointTy() ||
>>>        GVElType->isPointerTy() || GVElType->isVectorTy())
>>> @@ -1611,81 +1609,53 @@ static bool TryToShrinkGlobalToBoolean(G
>>>      if (!isa<LoadInst>(U) && !isa<StoreInst>(U))
>>>        return false;
>>>
>>> -  DEBUG(dbgs() << "   *** SHRINKING TO BOOL: " << *GV);
>>> -
>>> -  // Create the new global, initializing it to false.
>>> -  GlobalVariable *NewGV = new
>>> GlobalVariable(Type::getInt1Ty(GV->getContext()),
>>> -                                             false,
>>> -
>>> GlobalValue::InternalLinkage,
>>> -
>>>  ConstantInt::getFalse(GV->getContext()),
>>> -                                             GV->getName()+".b",
>>> -                                             GV->getThreadLocalMode(),
>>> -
>>> GV->getType()->getAddressSpace());
>>> -  GV->getParent()->getGlobalList().insert(GV, NewGV);
>>> -
>>>    Constant *InitVal = GV->getInitializer();
>>>    assert(InitVal->getType() != Type::getInt1Ty(GV->getContext()) &&
>>> -         "No reason to shrink to bool!");
>>> +         "No reason to add range metadata!");
>>> +
>>> +  // The MD_range metadata only supports absolute integer constants.
>>> +  if (!isa<ConstantInt>(InitVal) || !isa<ConstantInt>(OtherVal))
>>> +    return false;
>>>
>>> -  // If initialized to zero and storing one into the global, we can use
>>> a cast
>>> -  // instead of a select to synthesize the desired value.
>>> -  bool IsOneZero = false;
>>> -  if (ConstantInt *CI = dyn_cast<ConstantInt>(OtherVal))
>>> -    IsOneZero = InitVal->isNullValue() && CI->isOne();
>>> -
>>> -  while (!GV->use_empty()) {
>>> -    Instruction *UI = cast<Instruction>(GV->user_back());
>>> -    if (StoreInst *SI = dyn_cast<StoreInst>(UI)) {
>>> -      // Change the store into a boolean store.
>>> -      bool StoringOther = SI->getOperand(0) == OtherVal;
>>> -      // Only do this if we weren't storing a loaded value.
>>> -      Value *StoreVal;
>>> -      if (StoringOther || SI->getOperand(0) == InitVal) {
>>> -        StoreVal = ConstantInt::get(Type::getInt1Ty(GV->getContext()),
>>> -                                    StoringOther);
>>> +  DEBUG(dbgs() << "   *** ADDING RANGE METADATA: " << *GV);
>>> +
>>> +  for (User *U : GV->users()) {
>>> +    Instruction *UI = cast<Instruction>(U);
>>> +    if (LoadInst *LI = dyn_cast<LoadInst>(UI)) {
>>> +      // If we already have a range, don't add a new one, so that
>>> GlobalOpt
>>> +      // terminates. In theory, we could merge the two ranges.
>>> +      if (LI->getMetadata(LLVMContext::MD_range))
>>> +        return false;
>>> +      // Add range metadata to the load. Range metadata can represent
>>> multiple
>>> +      // ranges, but they must be discontiguous, so we have two cases:
>>> the case
>>> +      // where the values are adjacent, in which case we add one range,
>>> and the
>>> +      // case where they're not, in which case we add two.
>>> +      APInt Min = cast<ConstantInt>(InitVal)->getValue();
>>> +      APInt Max = cast<ConstantInt>(OtherVal)->getValue();
>>> +      if (Max.slt(Min))
>>> +        std::swap(Min, Max);
>>> +      APInt Min1 = Min + 1;
>>> +      APInt Max1 = Max + 1;
>>> +      if (Min1 == Max) {
>>> +        Value *Vals[] = {
>>> +          ConstantInt::get(GV->getContext(), Min),
>>> +          ConstantInt::get(GV->getContext(), Max1),
>>> +        };
>>> +        MDNode *MD = MDNode::get(LI->getContext(), Vals);
>>> +        LI->setMetadata(LLVMContext::MD_range, MD);
>>>        } else {
>>> -        // Otherwise, we are storing a previously loaded copy.  To do
>>> this,
>>> -        // change the copy from copying the original value to just
>>> copying the
>>> -        // bool.
>>> -        Instruction *StoredVal = cast<Instruction>(SI->getOperand(0));
>>> -
>>> -        // If we've already replaced the input, StoredVal will be a
>>> cast or
>>> -        // select instruction.  If not, it will be a load of the
>>> original
>>> -        // global.
>>> -        if (LoadInst *LI = dyn_cast<LoadInst>(StoredVal)) {
>>> -          assert(LI->getOperand(0) == GV && "Not a copy!");
>>> -          // Insert a new load, to preserve the saved value.
>>> -          StoreVal = new LoadInst(NewGV, LI->getName()+".b", false, 0,
>>> -                                  LI->getOrdering(),
>>> LI->getSynchScope(), LI);
>>> -        } else {
>>> -          assert((isa<CastInst>(StoredVal) ||
>>> isa<SelectInst>(StoredVal)) &&
>>> -                 "This is not a form that we understand!");
>>> -          StoreVal = StoredVal->getOperand(0);
>>> -          assert(isa<LoadInst>(StoreVal) && "Not a load of NewGV!");
>>> -        }
>>> +        Value *Vals[] = {
>>> +          ConstantInt::get(GV->getContext(), Min),
>>> +          ConstantInt::get(GV->getContext(), Min1),
>>> +          ConstantInt::get(GV->getContext(), Max),
>>> +          ConstantInt::get(GV->getContext(), Max1),
>>> +        };
>>> +        MDNode *MD = MDNode::get(LI->getContext(), Vals);
>>> +        LI->setMetadata(LLVMContext::MD_range, MD);
>>>        }
>>> -      new StoreInst(StoreVal, NewGV, false, 0,
>>> -                    SI->getOrdering(), SI->getSynchScope(), SI);
>>> -    } else {
>>> -      // Change the load into a load of bool then a select.
>>> -      LoadInst *LI = cast<LoadInst>(UI);
>>> -      LoadInst *NLI = new LoadInst(NewGV, LI->getName()+".b", false, 0,
>>> -                                   LI->getOrdering(),
>>> LI->getSynchScope(), LI);
>>> -      Value *NSI;
>>> -      if (IsOneZero)
>>> -        NSI = new ZExtInst(NLI, LI->getType(), "", LI);
>>> -      else
>>> -        NSI = SelectInst::Create(NLI, OtherVal, InitVal, "", LI);
>>> -      NSI->takeName(LI);
>>> -      LI->replaceAllUsesWith(NSI);
>>>      }
>>> -    UI->eraseFromParent();
>>>    }
>>>
>>> -  // Retain the name of the old global variable. People who are
>>> debugging their
>>> -  // programs may expect these variables to be named the same.
>>> -  NewGV->takeName(GV);
>>> -  GV->eraseFromParent();
>>>    return true;
>>>  }
>>>
>>> @@ -1839,11 +1809,10 @@ bool GlobalOpt::ProcessInternalGlobal(Gl
>>>                                   DL, TLI))
>>>        return true;
>>>
>>> -    // Otherwise, if the global was not a boolean, we can shrink it to
>>> be a
>>> -    // boolean.
>>> +    // Otherwise, if the global was not a boolean, we can add range
>>> metadata.
>>>      if (Constant *SOVConstant = dyn_cast<Constant>(GS.StoredOnceValue))
>>> {
>>>        if (GS.Ordering == NotAtomic) {
>>> -        if (TryToShrinkGlobalToBoolean(GV, SOVConstant)) {
>>> +        if (TryToAddRangeMetadata(GV, SOVConstant)) {
>>>            ++NumShrunkToBool;
>>>            return true;
>>>          }
>>>
>>> Modified: llvm/trunk/test/Transforms/GlobalOpt/integer-bool.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalOpt/integer-bool.ll?rev=204076&r1=204075&r2=204076&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Transforms/GlobalOpt/integer-bool.ll (original)
>>> +++ llvm/trunk/test/Transforms/GlobalOpt/integer-bool.ll Mon Mar 17
>>> 14:57:04 2014
>>> @@ -1,20 +1,21 @@
>>>  ; RUN: opt < %s -S -globalopt -instcombine | FileCheck %s
>>> -;; check that global opt turns integers that only hold 0 or 1 into
>>> bools.
>>> +;; check that global opt annotates loads from global variales that only
>>> hold 0 or 1
>>> +;; such that instcombine can optimize accordingly.
>>>
>>>  @G = internal addrspace(1) global i32 0
>>>  ; CHECK: @G
>>>  ; CHECK: addrspace(1)
>>> -; CHECK: global i1 false
>>> +; CHECK: global i32 0
>>>
>>>  define void @set1() {
>>>    store i32 0, i32 addrspace(1)* @G
>>> -; CHECK: store i1 false
>>> +; CHECK: store i32 0
>>>    ret void
>>>  }
>>>
>>>  define void @set2() {
>>>    store i32 1, i32 addrspace(1)* @G
>>> -; CHECK: store i1 true
>>> +; CHECK: store i32 1
>>>    ret void
>>>  }
>>>
>>>
>>> Added: llvm/trunk/test/Transforms/GlobalOpt/twostore-gv-range.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalOpt/twostore-gv-range.ll?rev=204076&view=auto
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Transforms/GlobalOpt/twostore-gv-range.ll (added)
>>> +++ llvm/trunk/test/Transforms/GlobalOpt/twostore-gv-range.ll Mon Mar 17
>>> 14:57:04 2014
>>> @@ -0,0 +1,52 @@
>>> +; RUN: opt < %s -S -globalopt | FileCheck %s
>>> +;; check that global opt annotates loads from global variales that have
>>> +;; constant values stored to them.
>>> +
>>> + at G = internal global i32 5
>>> + at H = internal global i32 7
>>> + at I = internal global i32 17
>>> + at J = internal global i32 29
>>> + at K = internal global i32 31
>>> +
>>> +define void @set() {
>>> +  store i32 6, i32* @G
>>> +  store i32 13, i32* @H
>>> +  store i32 16, i32* @I
>>> +  store i32 29, i32* @J
>>> +  store i32 -37, i32* @K
>>> +  ret void
>>> +}
>>> +
>>> +define i32 @getG() {
>>> +; CHECK: %t = load i32* @G, !range [[G:![0-9]+]]
>>> +  %t = load i32* @G
>>> +  ret i32 %t
>>> +}
>>> +define i32 @getH() {
>>> +; CHECK: %t = load i32* @H, !range [[H:![0-9]+]]
>>> +  %t = load i32* @H
>>> +  ret i32 %t
>>> +}
>>> +
>>> +define i32 @getI() {
>>> +; CHECK: %t = load i32* @I, !range [[I:![0-9]+]]
>>> +  %t = load i32* @I
>>> +  ret i32 %t
>>> +}
>>> +
>>> +define i32 @getJ() {
>>> +; CHECK: ret i32 29
>>> +  %t = load i32* @J
>>> +  ret i32 %t
>>> +}
>>> +
>>> +define i32 @getK() {
>>> +; CHECK: %t = load i32* @K, !range [[K:![0-9]+]]
>>> +  %t = load i32* @K
>>> +  ret i32 %t
>>> +}
>>> +
>>> +; CHECK: [[G]] = metadata !{i32 5, i32 7}
>>> +; CHECK: [[H]] = metadata !{i32 7, i32 8, i32 13, i32 14}
>>> +; CHECK: [[I]] = metadata !{i32 16, i32 18}
>>> +; CHECK: [[K]] = metadata !{i32 -37, i32 -36, i32 31, i32 32}
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140325/29aa91e4/attachment.html>


More information about the llvm-commits mailing list