[llvm-commits] [PATCH] Memory footprint improvement of pass GlobalOpt (refined version)
Nick Lewycky
nicholas at mxc.ca
Thu Dec 6 13:05:13 PST 2012
Sheng Zhou wrote:
> Hi Nick,
>
> Any further feedback for this patch?
Yes. Thanks for the ping.
I'm not convinced you need a SetVector. The issue is:
> + if (Obsolete[i]->use_empty())
which requires that you iterate in the right order. What if you used "if
(Obsolete[i]->isConstantUsed())" ? That function will look at the
Constant's users recursively and save you from worrying about the
iteration order of Obsolete.
Come to think of it, you could also call
"Obsolete[i]->removeDeadConstantUsers();" then test use_empty(), and get
the right effect.
> + // This is to improve the memory footprint, because current
> algorithm is to
> + // one by one hoist and store the leaf element constant into the
> global array,
The description of "current algorithm" becomes obsolete the moment you
commit this patch. :) You should instead just describe that you want to
minimize the number of intermediate constants created.
Nick
>
>
> -Sheng
>
> On Mon, Dec 3, 2012 at 3:17 PM, Zhou Sheng <zhousheng00 at gmail.com
> <mailto:zhousheng00 at gmail.com>> wrote:
>
> Hi nicholas,
>
> Refined the patch according to Nick & Andy's feedback.
>
> The attached patch is to improve the memory footprint of pass
> GlobalOpt. Please kindly let me know if the patch is good to commit.
> I created some case (attached in my previous email) to repeat the
> issue, on which 'opt -globalopt' consumes 3.2GB memory and is stuck
> for long time.
> With my patch the memory footprint is reduced to 300MB and GlobalOpt
> transform finishes in 3mins. (My machine info: 16 Intel(R) Xeon(R)
> CPU E5620 at 2.40GHz, 16GB)
>
> The big memory footprint cause is that current GlobalOpt one by one
> hoists and stores the leaf element constant into the global array,
> in each iteration, it recreates the global array initializer
> constant and leave the old initializer alone. This may result in
> many obsolete constants left.
> For example: we have global array @rom = global [16 x i32]
> zeroinitializer
> After the first element value is hoisted and installed: @rom =
> global [16 x i32] [ 1, 0, 0, ... ]
> After the second element value is installed: @rom = global [16 x
> 32] [ 1, 2, 0, 0, ... ] // here the previous initializer is
> obsolete
> ...
> When the transform is done, we have 15 obsolete initializers left
> useless.
>
> My solution is to use a container to record those obsolete constants
> created in each iteration in EvaluateStoreInto(),
> and then destroy them in the end of CommitValueTo() which is the
> caller of EvaluateStoreInto().
>
> Note here I choose to use SetVector as container because we need to
> keep the insertion order of the obsolete constants, as the global
> variable initializer is created in hierachical way which is bottom
> up. Therefore, we need to delete the obsolete ones in topdown way.
> For example:
>
> @rom = global [16 x [1024 x i32]] [[1, 2, 3, ...,1024 ] [2, 4,
> ...2048] ... ]
>
> For the above global variable, we also generate 16 constants [1024 x
> i32] before we generate the initializer every time, so when we
> delete the obsolete initializer, we need to make sure we delete the
> 16 [1024xi32] after [16 x [1024 x i32]] is deleted, otherwise there
> will be non-use-empty issue.
>
> http://llvm-reviews.chandlerc.com/D155
>
> Files:
> lib/Transforms/IPO/GlobalOpt.cpp
>
> Index: lib/Transforms/IPO/GlobalOpt.cpp
> ===================================================================
> --- lib/Transforms/IPO/GlobalOpt.cpp
> +++ lib/Transforms/IPO/GlobalOpt.cpp
> @@ -36,6 +36,7 @@
> #include "llvm/ADT/DenseMap.h"
> #include "llvm/ADT/SmallPtrSet.h"
> #include "llvm/ADT/SmallVector.h"
> +#include "llvm/ADT/SetVector.h"
> #include "llvm/ADT/Statistic.h"
> #include "llvm/ADT/STLExtras.h"
> #include <algorithm>
> @@ -2398,7 +2399,8 @@
> /// initializer. This returns 'Init' modified to reflect 'Val'
> stored into it.
> /// At this point, the GEP operands of Addr [0, OpNo) have been
> stepped into.
> static Constant *EvaluateStoreInto(Constant *Init, Constant *Val,
> - ConstantExpr *Addr, unsigned OpNo) {
> + ConstantExpr *Addr, unsigned OpNo,
> + SetVector<Constant*> &Obsolete) {
> // Base case of the recursion.
> if (OpNo == Addr->getNumOperands()) {
> assert(Val->getType() == Init->getType() && "Type mismatch!");
> @@ -2415,7 +2417,9 @@
> ConstantInt *CU = cast<ConstantInt>(Addr->getOperand(OpNo));
> unsigned Idx = CU->getZExtValue();
> assert(Idx < STy->getNumElements() && "Struct index out of
> range!");
> - Elts[Idx] = EvaluateStoreInto(Elts[Idx], Val, Addr, OpNo+1);
> + if (Elts[Idx]->getType()->isAggregateType())
> + Obsolete.insert(Elts[Idx]);
> + Elts[Idx] = EvaluateStoreInto(Elts[Idx], Val, Addr, OpNo+1,
> Obsolete);
>
> // Return the modified struct.
> return ConstantStruct::get(STy, Elts);
> @@ -2435,8 +2439,11 @@
> Elts.push_back(Init->getAggregateElement(i));
>
> assert(CI->getZExtValue() < NumElts);
> + Constant *OrigElem = Elts[CI->getZExtValue()];
> + if (OrigElem->getType()->isAggregateType())
> + Obsolete.insert(OrigElem);
> Elts[CI->getZExtValue()] =
> - EvaluateStoreInto(Elts[CI->getZExtValue()], Val, Addr, OpNo+1);
> + EvaluateStoreInto(OrigElem, Val, Addr, OpNo+1, Obsolete);
>
> if (Init->getType()->isArrayTy())
> return ConstantArray::get(cast<ArrayType>(InitTy), Elts);
> @@ -2452,9 +2459,33 @@
> return;
> }
>
> + // Collect obsolete constants created during EvaluateStoreInto()
> invoke,
> + // and then destroy them in the end of this function.
> + // This is to improve the memory footprint, because current
> algorithm is to
> + // one by one hoist and store the leaf element constant into the
> global array,
> + // in each iteration, it recreates the global array initializer
> constant and
> + // leave the old initializer alone. This may result in many
> obsolete constants
> + // left still. In some case, with big global array:
> + // @rom = [16 x [2048 x [5 x i32]]];
> + // it will consume over 3.2GB memory, and the transform will get
> stuck.
> + //
> + // Here we choose to use SetVector as container because we need
> to keep
> + // the insertion order of the obsolete constants as the global
> variable
> + // initializer is created in hierachical way which is bottom up.
> So we
> + // need to delete them in topdown way to avoid non-use-empty issue.
> + SetVector<Constant*> Obsolete;
> ConstantExpr *CE = cast<ConstantExpr>(Addr);
> GlobalVariable *GV = cast<GlobalVariable>(CE->getOperand(0));
> - GV->setInitializer(EvaluateStoreInto(GV->getInitializer(), Val,
> CE, 2));
> + Constant *OrigInit = GV->getInitializer();
> + if (OrigInit->getType()->isAggregateType())
> + Obsolete.insert(OrigInit);
> + Constant *Init = EvaluateStoreInto(OrigInit, Val, CE, 2, Obsolete);
> + GV->setInitializer(Init);
> +
> + for (unsigned i = 0, e = Obsolete.size(); i != e; ++i) {
> + if (Obsolete[i]->use_empty())
> + Obsolete[i]->destroyConstant();
> + }
> }
>
> namespace {
>
>
More information about the llvm-commits
mailing list