[llvm-commits] [llvm] r158358 - in /llvm/trunk: include/llvm/Constants.h include/llvm/Instruction.h lib/Transforms/Scalar/Reassociate.cpp lib/VMCore/Constants.cpp lib/VMCore/Instruction.cpp test/Transforms/Reassociate/repeats.ll

Wed Jun 13 06:21:16 PDT 2012

OK, I'll try to fix it tomorrow then. I think it would be good to make
SmallMap handle non-POD types by default, and to define a SmallMapPod
that assumes keys and values are POD.

-Stepan.
Duncan Sands wrote:
> Hi Stepan,
>
>> Hi Duncan. Values and keys may have any type, but there should be a
>> DenseMapInfo defined for the key. So this case looks like a bug. I'm
>> on vacation until 21st June, and until then my activity will be low.
>> After that I'll try to fix it.
>
> thanks!
>
>> It would help me if you tried using FlatArrayMap directly instead of
>> SmallMap. If the error still reproduces, then the bug is in FlatArrayMap.
>
> The problem is in FlatArrayMap. It uses memcpy to copy things around,
> which is wrong for types with constructors/destructors.
>
> Ciao, Duncan.
>
>>
>> -Stepan.
>>
>> Duncan Sands wrote:
>>> Hi Matt, thanks for the great test case. It seems to be a bug in
>>> SmallMap: it assumes that the key has POD type. Alternatively, maybe
>>> the key is required to be a POD type but I don't see that documented
>>> anywhere. Hopefully Stepan can clarify. In the meantime I will change
>>> the code to use a std::map.
>>>
>>> Ciao, Duncan.
>>>
>>> On 12/06/12 20:53, Matt Beaumont-Gay wrote:
>>>> Here's the reduced input:
>>>>
>>>> typedef __uint128_t widelimb;
>>>> typedef unsigned long long felem[4];
>>>> typedef __uint128_t widefelem[7];
>>>> static void felem_square(widefelem out, const felem in) {
>>>> out[6] = ((widelimb) in[3]) * in[3];
>>>> }
>>>> void ec_GFp_nistp224_point_get_affine_coordinates() {
>>>> felem z2;
>>>> widefelem tmp;
>>>> felem_square(tmp, z2);
>>>> }
>>>>
>>>> (command line: clang -cc1 -emit-obj -O1 -o /dev/null /tmp/crasher.i)
>>>>
>>>> On Tue, Jun 12, 2012 at 11:17 AM, Matt Beaumont-Gay
>>>> <matthewbg at google.com> wrote:
>>>>> This might also help:
>>>>>
>>>>> ==31759== ERROR: AddressSanitizer attempting double-free on
>>>>> 0x7f4fc6415580:
>>>>> #0 0x40b3c02 operator delete[]()
>>>>> #1 0x4006c54 llvm::APInt::AssignSlowCase()
>>>>> #2 0x39048eb std::pair<>::operator=()
>>>>> #3 0x39045c9 llvm::FlatArrayMap<>::insertInternal()
>>>>> #4 0x3902ee3 llvm::FlatArrayMap<>::insert()
>>>>> #5 0x3902a0c llvm::MultiImplMap<>::insert()
>>>>> #6 0x39013d1 llvm::MultiImplMap<>::operator[]()
>>>>> #7 0x38f12bf LinearizeExprTree()
>>>>> #8 0x38eff0c (anonymous
>>>>> namespace)::Reassociate::ReassociateExpression()
>>>>> #9 0x38ef101 (anonymous namespace)::Reassociate::OptimizeInst()
>>>>> #10 0x38ee154 (anonymous namespace)::Reassociate::runOnFunction()
>>>>> #11 0x3f02ac5 llvm::FPPassManager::runOnFunction()
>>>>> #12 0x3b9f498 (anonymous namespace)::CGPassManager::RunPassOnSCC()
>>>>> #13 0x3b9ed0e (anonymous
>>>>> namespace)::CGPassManager::RunAllPassesOnSCC()
>>>>> #14 0x3b9e4e2 (anonymous namespace)::CGPassManager::runOnModule()
>>>>> #15 0x3f0328c llvm::MPPassManager::runOnModule()
>>>>> #16 0x3f03c96 llvm::PassManagerImpl::run()
>>>>> #17 0x3f03e99 llvm::PassManager::run()
>>>>> #18 0x14b92a1 (anonymous
>>>>> namespace)::EmitAssemblyHelper::EmitAssembly()
>>>>> #19 0x14b8df7 clang::EmitBackendOutput()
>>>>> #20 0x14b25b3 clang::BackendConsumer::HandleTranslationUnit()
>>>>> #21 0x2193ba2 clang::ParseAST()
>>>>> #22 0x14b06b4 clang::CodeGenAction::ExecuteAction()
>>>>> #23 0x2005fb3 clang::FrontendAction::Execute()
>>>>> #24 0x1e960fa clang::CompilerInstance::ExecuteAction()
>>>>> #25 0x14abf1a clang::ExecuteCompilerInvocation()
>>>>> #26 0x148f499 cc1_main()
>>>>> #27 0x14a2b9b main
>>>>> #28 0x7f4fc9656d5d __libc_start_main
>>>>> 0x7f4fc6415580 is located 0 bytes inside of 16-byte region
>>>>> [0x7f4fc6415580,0x7f4fc6415590)
>>>>> freed by thread T0 here:
>>>>> #0 0x40b3c02 operator delete[]()
>>>>> #1 0x39022bf llvm::FlatArrayMap<>::erase()
>>>>> #2 0x3902200 llvm::MultiImplMap<>::erase()
>>>>> #3 0x38f0ffc LinearizeExprTree()
>>>>> #4 0x38eff0c (anonymous
>>>>> namespace)::Reassociate::ReassociateExpression()
>>>>> #5 0x38ef101 (anonymous namespace)::Reassociate::OptimizeInst()
>>>>> #6 0x38ee154 (anonymous namespace)::Reassociate::runOnFunction()
>>>>> #7 0x3f02ac5 llvm::FPPassManager::runOnFunction()
>>>>> #8 0x3b9f498 (anonymous namespace)::CGPassManager::RunPassOnSCC()
>>>>> #9 0x3b9ed0e (anonymous namespace)::CGPassManager::RunAllPassesOnSCC()
>>>>> #10 0x3b9e4e2 (anonymous namespace)::CGPassManager::runOnModule()
>>>>> #11 0x3f0328c llvm::MPPassManager::runOnModule()
>>>>> #12 0x3f03c96 llvm::PassManagerImpl::run()
>>>>> #13 0x3f03e99 llvm::PassManager::run()
>>>>> #14 0x14b92a1 (anonymous
>>>>> namespace)::EmitAssemblyHelper::EmitAssembly()
>>>>> #15 0x14b8df7 clang::EmitBackendOutput()
>>>>> #16 0x14b25b3 clang::BackendConsumer::HandleTranslationUnit()
>>>>> #17 0x2193ba2 clang::ParseAST()
>>>>> #18 0x14b06b4 clang::CodeGenAction::ExecuteAction()
>>>>> #19 0x2005fb3 clang::FrontendAction::Execute()
>>>>> #20 0x1e960fa clang::CompilerInstance::ExecuteAction()
>>>>> #21 0x14abf1a clang::ExecuteCompilerInvocation()
>>>>> #22 0x148f499 cc1_main()
>>>>> #23 0x14a2b9b main
>>>>> #24 0x7f4fc9656d5d __libc_start_main
>>>>> previously allocated by thread T0 here:
>>>>> #0 0x40b3a82 operator new[]()
>>>>> #1 0x4005c7f getMemory()
>>>>> #2 0x4006a2f llvm::APInt::AssignSlowCase()
>>>>> #3 0x38f12d2 LinearizeExprTree()
>>>>> #4 0x38eff0c (anonymous
>>>>> namespace)::Reassociate::ReassociateExpression()
>>>>> #5 0x38ef101 (anonymous namespace)::Reassociate::OptimizeInst()
>>>>> #6 0x38ee154 (anonymous namespace)::Reassociate::runOnFunction()
>>>>> #7 0x3f02ac5 llvm::FPPassManager::runOnFunction()
>>>>> #8 0x3b9f498 (anonymous namespace)::CGPassManager::RunPassOnSCC()
>>>>> #9 0x3b9ed0e (anonymous namespace)::CGPassManager::RunAllPassesOnSCC()
>>>>> #10 0x3b9e4e2 (anonymous namespace)::CGPassManager::runOnModule()
>>>>> #11 0x3f0328c llvm::MPPassManager::runOnModule()
>>>>> #12 0x3f03c96 llvm::PassManagerImpl::run()
>>>>> #13 0x3f03e99 llvm::PassManager::run()
>>>>> #14 0x14b92a1 (anonymous
>>>>> namespace)::EmitAssemblyHelper::EmitAssembly()
>>>>> #15 0x14b8df7 clang::EmitBackendOutput()
>>>>> #16 0x14b25b3 clang::BackendConsumer::HandleTranslationUnit()
>>>>> #17 0x2193ba2 clang::ParseAST()
>>>>> #18 0x14b06b4 clang::CodeGenAction::ExecuteAction()
>>>>> #19 0x2005fb3 clang::FrontendAction::Execute()
>>>>> #20 0x1e960fa clang::CompilerInstance::ExecuteAction()
>>>>> #21 0x14abf1a clang::ExecuteCompilerInvocation()
>>>>> #22 0x148f499 cc1_main()
>>>>> #23 0x14a2b9b main
>>>>>
>>>>> On Tue, Jun 12, 2012 at 11:04 AM, Matt Beaumont-Gay
>>>>> <matthewbg at google.com> wrote:
>>>>>> This seems to have caused some heap corruption when building OpenSSL
>>>>>> at -O1:
>>>>>>
>>>>>> #14 0x00007f45009c7806 in malloc_printerr (action=3,
>>>>>> str=0x7f4500a9b2f0 "double free or corruption (fasttop)",
>>>>>> ptr=<optimized out>) at malloc.c:6266
>>>>>> #15 0x00007f45009ce0d3 in *__GI___libc_free (mem=<optimized out>)
>>>>>> at malloc.c:3738
>>>>>> #16 0x0000000002821bb9 in llvm::APInt::AssignSlowCase
>>>>>> (this=0x7fffa46d55b8,
>>>>>> RHS=...) at llvm/lib/Support/APInt.cpp:143
>>>>>> #17 0x00000000008f1f59 in llvm::APInt::operator=
>>>>>> (this=0x7fffa46d55b8, RHS=...)
>>>>>> at llvm/include/llvm/ADT/APInt.h:595
>>>>>> #18 0x0000000002401d5e in std::pair<llvm::Value*,
>>>>>> llvm::APInt>::operator= (
>>>>>> this=0x7fffa46d55b0)
>>>>>> at
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/4.4/../../../../include/c++/4.4/bits/stl_pair.h:67
>>>>>>
>>>>>>
>>>>>>
>>>>>> #19 0x0000000002402ae6 in llvm::FlatArrayMap<llvm::Value*,
>>>>>> llvm::APInt, 8u>::insertInternal (this=0x7fffa46d5598, Ptr=0x4767698,
>>>>>> Val=<error reading variable: Unhandled dwarf expression opcode
>>>>>> 0x0>,
>>>>>> Item=@0x7fffa46d4fd8: 0x4767698)
>>>>>> at llvm/include/llvm/ADT/FlatArrayMap.h:117
>>>>>> #20 0x0000000002401fed in llvm::FlatArrayMap<llvm::Value*,
>>>>>> llvm::APInt, 8u>::insert (this=0x7fffa46d5598, KV=...)
>>>>>> at llvm/include/llvm/ADT/FlatArrayMap.h:188
>>>>>> #21 0x0000000002401e57 in
>>>>>> llvm::MultiImplMap<llvm::FlatArrayMap<llvm::Value*, llvm::APInt, 8u>,
>>>>>> llvm::DenseMap<llvm::Value*, llvm::APInt,
>>>>>> llvm::DenseMapInfo<llvm::Value*> >, 8u, false,
>>>>>> llvm::MultiImplMapIteratorsFactory<llvm::FlatArrayMap<llvm::Value*,
>>>>>> llvm::APInt, 8u>, llvm::DenseMap<llvm::Value*, llvm::APInt,
>>>>>> llvm::DenseMapInfo<llvm::Value*> > > >::insert (this=0x7fffa46d5598,
>>>>>> KV=...)
>>>>>> at llvm/include/llvm/ADT/MultiImplMap.h:218
>>>>>> #22 0x0000000002400e81 in
>>>>>> llvm::MultiImplMap<llvm::FlatArrayMap<llvm::Value*, llvm::APInt, 8u>,
>>>>>> llvm::DenseMap<llvm::Value*, llvm::APInt,
>>>>>> llvm::DenseMapInfo<llvm::Value*> >, 8u, false,
>>>>>> llvm::MultiImplMapIteratorsFactory<llvm::FlatArrayMap<llvm::Value*,
>>>>>> llvm::APInt, 8u>, llvm::DenseMap<llvm::Value*, llvm::APInt,
>>>>>> llvm::DenseMapInfo<llvm::Value*> > > >::operator[]
>>>>>> (this=0x7fffa46d5598,
>>>>>> Key=@0x7fffa46d54a0: 0x4767698)
>>>>>> at llvm/include/llvm/ADT/MultiImplMap.h:281
>>>>>> #23 0x00000000023f6626 in LinearizeExprTree (I=0x47679b0, Ops=...)
>>>>>> at llvm/lib/Transforms/Scalar/Reassociate.cpp:589
>>>>>>
>>>>>> I'll throw delta at it and give you a real bug report soon.
>>>>>>
>>>>>> On Tue, Jun 12, 2012 at 7:33 AM, Duncan Sands<baldrick at free.fr>
>>>>>> wrote:
>>>>>>> Author: baldrick
>>>>>>> Date: Tue Jun 12 09:33:56 2012
>>>>>>> New Revision: 158358
>>>>>>>
>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=158358&view=rev
>>>>>>> Log:
>>>>>>> Now that Reassociate's LinearizeExprTree can look through arbitrary
>>>>>>> expression topologies, it is quite possible for a leaf node to have
>>>>>>> huge multiplicity, for example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ...
>>>>>>> rapidly gives a value which is x raised to a vast power (the
>>>>>>> multiplicity, or weight, of x). This patch fixes the computation of
>>>>>>> weights by correctly computing them no matter how big they are,
>>>>>>> rather than just overflowing and getting a wrong value. It turns out
>>>>>>> that the weight for a value never needs more bits to represent than
>>>>>>> the value itself, so it is enough to represent weights as APInts of
>>>>>>> the same bitwidth and do the right overflow-avoiding dance steps when
>>>>>>> computing weights. As a side-effect it reduces the number of
>>>>>>> multiplies needed in some cases of large powers. While there, in view
>>>>>>> of external uses (eg by the vectorizer) I made LinearizeExprTree
>>>>>>> static, pushing the rank computation out into users. This is progress
>>>>>>> towards fixing PR13021.
>>>>>>>
>>>>>>> Added:
>>>>>>> llvm/trunk/test/Transforms/Reassociate/repeats.ll
>>>>>>> Modified:
>>>>>>> llvm/trunk/include/llvm/Constants.h
>>>>>>> llvm/trunk/include/llvm/Instruction.h
>>>>>>> llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp
>>>>>>> llvm/trunk/lib/VMCore/Constants.cpp
>>>>>>> llvm/trunk/lib/VMCore/Instruction.cpp
>>>>>>>
>>>>>>> Modified: llvm/trunk/include/llvm/Constants.h
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Constants.h?rev=158358&r1=158357&r2=158358&view=diff
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --- llvm/trunk/include/llvm/Constants.h (original)
>>>>>>> +++ llvm/trunk/include/llvm/Constants.h Tue Jun 12 09:33:56 2012
>>>>>>> @@ -917,6 +917,11 @@
>>>>>>> return getLShr(C1, C2, true);
>>>>>>> }
>>>>>>>
>>>>>>> +  /// getBinOpIdentity - Return the identity for the given binary operation,
>>>>>>> +  /// i.e. a constant C such that X op C = X and C op X = X for every X. It
>>>>>>> +  /// is an error to call this for an operation that doesn't have an identity.
>>>>>>> +  static Constant *getBinOpIdentity(unsigned Opcode, Type *Ty);
>>>>>>> +
>>>>>>> /// Transparently provide more efficient getOperand methods.
>>>>>>> DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Constant);
>>>>>>>
>>>>>>>
>>>>>>> Modified: llvm/trunk/include/llvm/Instruction.h
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Instruction.h?rev=158358&r1=158357&r2=158358&view=diff
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --- llvm/trunk/include/llvm/Instruction.h (original)
>>>>>>> +++ llvm/trunk/include/llvm/Instruction.h Tue Jun 12 09:33:56 2012
>>>>>>> @@ -215,6 +215,27 @@
>>>>>>> bool isCommutative() const { return isCommutative(getOpcode()); }
>>>>>>> static bool isCommutative(unsigned op);
>>>>>>>
>>>>>>> + /// isIdempotent - Return true if the instruction is idempotent:
>>>>>>> + ///
>>>>>>> + /// Idempotent operators satisfy: x op x === x
>>>>>>> + ///
>>>>>>> + /// In LLVM, the And and Or operators are idempotent.
>>>>>>> + ///
>>>>>>> + bool isIdempotent() const { return isIdempotent(getOpcode()); }
>>>>>>> + static bool isIdempotent(unsigned op);
>>>>>>> +
>>>>>>> + /// isNilpotent - Return true if the instruction is nilpotent:
>>>>>>> + ///
>>>>>>> + /// Nilpotent operators satisfy: x op x === Id,
>>>>>>> + ///
>>>>>>> + /// where Id is the identity for the operator, i.e. a constant such that
>>>>>>> + /// x op Id === x and Id op x === x for all x.
>>>>>>> + ///
>>>>>>> + /// In LLVM, the Xor operator is nilpotent.
>>>>>>> + ///
>>>>>>> + bool isNilpotent() const { return isNilpotent(getOpcode()); }
>>>>>>> + static bool isNilpotent(unsigned op);
>>>>>>> +
>>>>>>> /// mayWriteToMemory - Return true if this instruction may
>>>>>>> modify memory.
>>>>>>> ///
>>>>>>> bool mayWriteToMemory() const;
>>>>>>>
>>>>>>> Modified: llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp?rev=158358&r1=158357&r2=158358&view=diff
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --- llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp (original)
>>>>>>> +++ llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp Tue Jun 12
>>>>>>> 09:33:56 2012
>>>>>>> @@ -143,7 +143,6 @@
>>>>>>> Value *buildMinimalMultiplyDAG(IRBuilder<> &Builder,
>>>>>>> SmallVectorImpl<Factor>
>>>>>>> &Factors);
>>>>>>> Value *OptimizeMul(BinaryOperator *I,
>>>>>>> SmallVectorImpl<ValueEntry> &Ops);
>>>>>>> - void LinearizeExprTree(BinaryOperator *I,
>>>>>>> SmallVectorImpl<ValueEntry> &Ops);
>>>>>>> Value *RemoveFactorFromExpression(Value *V, Value *Factor);
>>>>>>> void EraseInst(Instruction *I);
>>>>>>> void OptimizeInst(Instruction *I);
>>>>>>> @@ -251,10 +250,148 @@
>>>>>>> return Res;
>>>>>>> }
>>>>>>>
>>>>>>> +/// CarmichaelShift - Returns k such that lambda(2^Bitwidth) = 2^k, where lambda
>>>>>>> +/// is the Carmichael function. This means that x^(2^k) === 1 mod 2^Bitwidth for
>>>>>>> +/// every odd x, i.e. x^(2^k) = 1 for every odd x in Bitwidth-bit arithmetic.
>>>>>>> +/// Note that 0 <= k < Bitwidth, and if Bitwidth > 3 then x^(2^k) = 0 for every
>>>>>>> +/// even x in Bitwidth-bit arithmetic.
>>>>>>> +static unsigned CarmichaelShift(unsigned Bitwidth) {
>>>>>>> +  if (Bitwidth < 3)
>>>>>>> +    return Bitwidth - 1;
>>>>>>> +  return Bitwidth - 2;
>>>>>>> +}
>>>>>>> +
>>>>>>> +/// IncorporateWeight - Add the extra weight 'RHS' to the existing weight 'LHS',
>>>>>>> +/// reducing the combined weight using any special properties of the operation.
>>>>>>> +/// The existing weight LHS represents the computation X op X op ... op X where
>>>>>>> +/// X occurs LHS times. The combined weight represents X op X op ... op X with
>>>>>>> +/// X occurring LHS + RHS times. If op is "Xor" for example then the combined
>>>>>>> +/// operation is equivalent to X if LHS + RHS is odd, or 0 if LHS + RHS is even;
>>>>>>> +/// the routine returns 1 in LHS in the first case, and 0 in LHS in the second.
>>>>>>> +static void IncorporateWeight(APInt &LHS, const APInt &RHS, unsigned Opcode) {
>>>>>>> +  // If we were working with infinite precision arithmetic then the combined
>>>>>>> +  // weight would be LHS + RHS. But we are using finite precision arithmetic,
>>>>>>> +  // and the APInt sum LHS + RHS may not be correct if it wraps (it is correct
>>>>>>> +  // for nilpotent operations and addition, but not for idempotent operations
>>>>>>> +  // and multiplication), so it is important to correctly reduce the combined
>>>>>>> +  // weight back into range if wrapping would be wrong.
>>>>>>> +
>>>>>>> +  // If RHS is zero then the weight didn't change.
>>>>>>> +  if (RHS.isMinValue())
>>>>>>> +    return;
>>>>>>> +  // If LHS is zero then the combined weight is RHS.
>>>>>>> +  if (LHS.isMinValue()) {
>>>>>>> +    LHS = RHS;
>>>>>>> +    return;
>>>>>>> +  }
>>>>>>> +  // From this point on we know that neither LHS nor RHS is zero.
>>>>>>> +
>>>>>>> +  if (Instruction::isIdempotent(Opcode)) {
>>>>>>> +    // Idempotent means X op X === X, so any non-zero weight is equivalent to a
>>>>>>> +    // weight of 1. Keeping weights at zero or one also means that wrapping is
>>>>>>> +    // not a problem.
>>>>>>> +    assert(LHS == 1 && RHS == 1 && "Weights not reduced!");
>>>>>>> +    return; // Return a weight of 1.
>>>>>>> +  }
>>>>>>> +  if (Instruction::isNilpotent(Opcode)) {
>>>>>>> +    // Nilpotent means X op X === 0, so reduce weights modulo 2.
>>>>>>> +    assert(LHS == 1 && RHS == 1 && "Weights not reduced!");
>>>>>>> +    LHS = 0; // 1 + 1 === 0 modulo 2.
>>>>>>> +    return;
>>>>>>> +  }
>>>>>>> +  if (Opcode == Instruction::Add) {
>>>>>>> +    // TODO: Reduce the weight by exploiting nsw/nuw?
>>>>>>> +    LHS += RHS;
>>>>>>> +    return;
>>>>>>> +  }
>>>>>>> +
>>>>>>> +  assert(Opcode == Instruction::Mul && "Unknown associative operation!");
>>>>>>> +  unsigned Bitwidth = LHS.getBitWidth();
>>>>>>> +  // If CM is the Carmichael number then a weight W satisfying W >= CM+Bitwidth
>>>>>>> +  // can be replaced with W-CM. That's because x^W=x^(W-CM) for every Bitwidth
>>>>>>> +  // bit number x, since either x is odd in which case x^CM = 1, or x is even in
>>>>>>> +  // which case both x^W and x^(W - CM) are zero. By subtracting off multiples
>>>>>>> +  // of CM like this weights can always be reduced to the range [0, CM+Bitwidth)
>>>>>>> +  // which by a happy accident means that they can always be represented using
>>>>>>> +  // Bitwidth bits.
>>>>>>> +  // TODO: Reduce the weight by exploiting nsw/nuw? (Could do much better than
>>>>>>> +  // the Carmichael number).
>>>>>>> +  if (Bitwidth > 3) {
>>>>>>> +    /// CM - The value of Carmichael's lambda function.
>>>>>>> +    APInt CM = APInt::getOneBitSet(Bitwidth, CarmichaelShift(Bitwidth));
>>>>>>> +    // Any weight W >= Threshold can be replaced with W - CM.
>>>>>>> +    APInt Threshold = CM + Bitwidth;
>>>>>>> +    assert(LHS.ult(Threshold) && RHS.ult(Threshold) && "Weights not reduced!");
>>>>>>> +    // For Bitwidth 4 or more the following sum does not overflow.
>>>>>>> +    LHS += RHS;
>>>>>>> +    while (LHS.uge(Threshold))
>>>>>>> +      LHS -= CM;
>>>>>>> +  } else {
>>>>>>> +    // To avoid problems with overflow do everything the same as above but using
>>>>>>> +    // a larger type.
>>>>>>> +    unsigned CM = 1U << CarmichaelShift(Bitwidth);
>>>>>>> +    unsigned Threshold = CM + Bitwidth;
>>>>>>> +    assert(LHS.getZExtValue() < Threshold && RHS.getZExtValue() < Threshold &&
>>>>>>> +           "Weights not reduced!");
>>>>>>> +    unsigned Total = LHS.getZExtValue() + RHS.getZExtValue();
>>>>>>> +    while (Total >= Threshold)
>>>>>>> +      Total -= CM;
>>>>>>> +    LHS = Total;
>>>>>>> +  }
>>>>>>> +}
>>>>>>> +
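The weight reduction for Mul in the quoted IncorporateWeight can be sanity-checked with plain integers. The sketch below is illustrative only and not part of the patch: it uses uint64_t instead of APInt, is limited to Bitwidth <= 32, and the helper names powWrapped, reduceMulWeight and reductionPreservesPowers are hypothetical. It checks by brute force that subtracting CM from any weight W >= CM + Bitwidth never changes x^W in Bitwidth-bit arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors CarmichaelShift from the patch: k with lambda(2^Bitwidth) = 2^k.
static unsigned carmichaelShift(unsigned Bitwidth) {
  assert(Bitwidth >= 1 && "Zero-width integers not supported!");
  return Bitwidth < 3 ? Bitwidth - 1 : Bitwidth - 2;
}

// Hypothetical helper: X^W in Bitwidth-bit wrapping arithmetic, computed by
// plain repeated multiplication. Assumes Bitwidth <= 32 so products fit.
static uint64_t powWrapped(uint64_t X, uint64_t W, unsigned Bitwidth) {
  const uint64_t Mask = (uint64_t(1) << Bitwidth) - 1;
  uint64_t Result = 1;
  for (uint64_t i = 0; i != W; ++i)
    Result = (Result * X) & Mask;
  return Result;
}

// Hypothetical helper: reduce a multiplication weight into [0, CM + Bitwidth)
// by subtracting off multiples of CM, as the quoted comment describes.
static uint64_t reduceMulWeight(uint64_t W, unsigned Bitwidth) {
  const uint64_t CM = uint64_t(1) << carmichaelShift(Bitwidth);
  const uint64_t Threshold = CM + Bitwidth;
  while (W >= Threshold)
    W -= CM;
  return W;
}

// Exhaustively check that the reduction preserves X^W for every Bitwidth-bit
// value X and every weight W up to MaxWeight.
static bool reductionPreservesPowers(unsigned Bitwidth, uint64_t MaxWeight) {
  for (uint64_t X = 0; X != (uint64_t(1) << Bitwidth); ++X)
    for (uint64_t W = 0; W <= MaxWeight; ++W)
      if (powWrapped(X, W, Bitwidth) !=
          powWrapped(X, reduceMulWeight(W, Bitwidth), Bitwidth))
        return false;
  return true;
}
```

For 8-bit values, for example, CM is 64 and the threshold is 72, so a call such as reductionPreservesPowers(8, 200) compares every weight up to 200 against its reduced form for all 256 values.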
>>>>>>> +/// EvaluateRepeatedConstant - Compute C op C op ... op C where the constant C
>>>>>>> +/// is repeated Weight times.
>>>>>>> +static Constant *EvaluateRepeatedConstant(unsigned Opcode, Constant *C,
>>>>>>> +                                          APInt Weight) {
>>>>>>> +  // For addition the result can be efficiently computed as the product of the
>>>>>>> +  // constant and the weight.
>>>>>>> +  if (Opcode == Instruction::Add)
>>>>>>> +    return ConstantExpr::getMul(C, ConstantInt::get(C->getContext(), Weight));
>>>>>>> +
>>>>>>> +  // The weight might be huge, so compute by repeated squaring to ensure that
>>>>>>> +  // compile time is proportional to the logarithm of the weight.
>>>>>>> +  Constant *Result = 0;
>>>>>>> +  Constant *Power = C; // Successively C, C op C, (C op C) op (C op C) etc.
>>>>>>> +  // Visit the bits in Weight.
>>>>>>> +  while (Weight != 0) {
>>>>>>> +    // If the current bit in Weight is non-zero do Result = Result op Power.
>>>>>>> +    if (Weight[0])
>>>>>>> +      Result = Result ? ConstantExpr::get(Opcode, Result, Power) : Power;
>>>>>>> +    // Move on to the next bit if any more are non-zero.
>>>>>>> +    Weight = Weight.lshr(1);
>>>>>>> +    if (Weight.isMinValue())
>>>>>>> +      break;
>>>>>>> +    // Square the power.
>>>>>>> +    Power = ConstantExpr::get(Opcode, Power, Power);
>>>>>>> +  }
>>>>>>> +
>>>>>>> +  assert(Result && "Only positive weights supported!");
>>>>>>> +  return Result;
>>>>>>> +}
>>>>>>> +
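The repeated-squaring loop in the quoted EvaluateRepeatedConstant can be tried out on ordinary integers. This is a minimal sketch under stated assumptions, not the patch's code: it works on uint32_t values instead of Constant*, takes the operation as a callable, and the names evaluateRepeated, mulOp and xorOp are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for EvaluateRepeatedConstant: computes
// C op C op ... op C with C repeated Weight times, using a number of
// Combine calls proportional to the number of bits in Weight.
template <typename Op>
static uint32_t evaluateRepeated(Op Combine, uint32_t C, uint64_t Weight) {
  assert(Weight != 0 && "Only positive weights supported!");
  bool HaveResult = false; // Plays the role of the null check on Result.
  uint32_t Result = 0;
  uint32_t Power = C; // Successively C, C op C, (C op C) op (C op C) etc.
  while (Weight != 0) {
    // If the current bit of Weight is set, fold Power into Result.
    if (Weight & 1) {
      Result = HaveResult ? Combine(Result, Power) : Power;
      HaveResult = true;
    }
    // Move on to the next bit, stopping once no set bits remain.
    Weight >>= 1;
    if (Weight == 0)
      break;
    // Square the power.
    Power = Combine(Power, Power);
  }
  return Result;
}

// Example operations in 32-bit wrapping arithmetic.
static uint32_t mulOp(uint32_t A, uint32_t B) { return A * B; }
static uint32_t xorOp(uint32_t A, uint32_t B) { return A ^ B; }
```

Calling evaluateRepeated(mulOp, 3, 5) combines 3 with itself five times, and evaluateRepeated(xorOp, X, W) gives X for odd W and 0 for even W, which matches the nilpotent behaviour the patch relies on.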
>>>>>>> +typedef std::pair<Value*, APInt> RepeatedValue;
>>>>>>> +
>>>>>>> /// LinearizeExprTree - Given an associative binary expression,
>>>>>>> return the leaf
>>>>>>> -/// nodes in Ops. The original expression is the same as Ops[0]
>>>>>>> op ... Ops[N].
>>>>>>> -/// Note that a node may occur multiple times in Ops, but if so
>>>>>>> all occurrences
>>>>>>> -/// are consecutive in the vector.
>>>>>>> +/// nodes in Ops along with their weights (how many times the leaf
>>>>>>> occurs). The
>>>>>>> +/// original expression is the same as
>>>>>>> +/// (Ops[0].first op Ops[0].first op ... Ops[0].first)<-
>>>>>>> Ops[0].second times
>>>>>>> +/// op
>>>>>>> +/// (Ops[1].first op Ops[1].first op ... Ops[1].first)<-
>>>>>>> Ops[1].second times
>>>>>>> +/// op
>>>>>>> +/// ...
>>>>>>> +/// op
>>>>>>> +/// (Ops[N].first op Ops[N].first op ... Ops[N].first)<-
>>>>>>> Ops[N].second times
>>>>>>> +///
>>>>>>> +/// Note that the values Ops[0].first, ..., Ops[N].first are all
>>>>>>> distinct, and
>>>>>>> +/// they are all non-constant except possibly for the last one,
>>>>>>> which if it is
>>>>>>> +/// constant will have weight one (Ops[N].second === 1).
>>>>>>> +///
>>>>>>> +/// This routine may modify the function, in which case it returns
>>>>>>> 'true'. The
>>>>>>> +/// changes it makes may well be destructive, changing the value
>>>>>>> computed by 'I'
>>>>>>> +/// to something completely different. Thus if the routine
>>>>>>> returns 'true' then
>>>>>>> +/// you MUST either replace I with a new expression computed from
>>>>>>> the Ops array,
>>>>>>> +/// or use RewriteExprTree to put the values back in.
>>>>>>> ///
>>>>>>> /// A leaf node is either not a binary operation of the same kind
>>>>>>> as the root
>>>>>>> /// node 'I' (i.e. is not a binary operator at all, or is, but
>>>>>>> with a different
>>>>>>> @@ -276,7 +413,7 @@
>>>>>>> /// + * | F, G
>>>>>>> ///
>>>>>>> /// The leaf nodes are C, E, F and G. The Ops array will contain
>>>>>>> (maybe not in
>>>>>>> -/// that order) C, E, F, F, G, G.
>>>>>>> +/// that order) (C, 1), (E, 1), (F, 2), (G, 2).
>>>>>>> ///
>>>>>>> /// The expression is maximal: if some instruction is a binary
>>>>>>> operator of the
>>>>>>> /// same kind as 'I', and all of its uses are non-leaf nodes of
>>>>>>> the expression,
>>>>>>> @@ -287,7 +424,8 @@
>>>>>>> /// order to ensure that every non-root node in the expression
>>>>>>> has *exactly one*
>>>>>>> /// use by a non-leaf node of the expression. This destruction
>>>>>>> means that the
>>>>>>> /// caller MUST either replace 'I' with a new expression or use
>>>>>>> something like
>>>>>>> -/// RewriteExprTree to put the values back in.
>>>>>>> +/// RewriteExprTree to put the values back in if the routine
>>>>>>> indicates that it
>>>>>>> +/// made a change by returning 'true'.
>>>>>>> ///
>>>>>>> /// In the above example either the right operand of A or the
>>>>>>> left operand of B
>>>>>>> /// will be replaced by undef. If it is B's operand then this
>>>>>>> gives:
>>>>>>> @@ -310,9 +448,14 @@
>>>>>>> /// of the expression) if it can turn them into binary operators
>>>>>>> of the right
>>>>>>> /// type and thus make the expression bigger.
>>>>>>>
>>>>>>> -void Reassociate::LinearizeExprTree(BinaryOperator *I,
>>>>>>> - SmallVectorImpl<ValueEntry>
>>>>>>> &Ops) {
>>>>>>> +static bool LinearizeExprTree(BinaryOperator *I,
>>>>>>> + SmallVectorImpl<RepeatedValue> &Ops) {
>>>>>>> DEBUG(dbgs()<< "LINEARIZE: "<< *I<< '\n');
>>>>>>> + unsigned Bitwidth =
>>>>>>> I->getType()->getScalarType()->getPrimitiveSizeInBits();
>>>>>>> + unsigned Opcode = I->getOpcode();
>>>>>>> + assert(Instruction::isAssociative(Opcode)&&
>>>>>>> + Instruction::isCommutative(Opcode)&&
>>>>>>> + "Expected an associative and commutative operation!");
>>>>>>>
>>>>>>> // Visit all operands of the expression, keeping track of their
>>>>>>> weight (the
>>>>>>> // number of paths from the expression root to the operand, or
>>>>>>> if you like
>>>>>>> @@ -324,9 +467,9 @@
>>>>>>> // with their weights, representing a certain number of paths to
>>>>>>> the operator.
>>>>>>> // If an operator occurs in the worklist multiple times then we
>>>>>>> found multiple
>>>>>>> // ways to get to it.
>>>>>>> - SmallVector<std::pair<BinaryOperator*, unsigned>, 8> Worklist;
>>>>>>> // (Op, Weight)
>>>>>>> - Worklist.push_back(std::make_pair(I, 1));
>>>>>>> - unsigned Opcode = I->getOpcode();
>>>>>>> + SmallVector<std::pair<BinaryOperator*, APInt>, 8> Worklist; //
>>>>>>> (Op, Weight)
>>>>>>> + Worklist.push_back(std::make_pair(I, APInt(Bitwidth, 1)));
>>>>>>> + bool MadeChange = false;
>>>>>>>
>>>>>>> // Leaves of the expression are values that either aren't the
>>>>>>> right kind of
>>>>>>> // operation (eg: a constant, or a multiply in an add tree), or
>>>>>>> are, but have
>>>>>>> @@ -343,7 +486,7 @@
>>>>>>>
>>>>>>> // Leaves - Keeps track of the set of putative leaves as well as
>>>>>>> the number of
>>>>>>> // paths to each leaf seen so far.
>>>>>>> - typedef SmallMap<Value*, unsigned, 8> LeafMap;
>>>>>>> + typedef SmallMap<Value*, APInt, 8> LeafMap;
>>>>>>> LeafMap Leaves; // Leaf -> Total weight so far.
>>>>>>> SmallVector<Value*, 8> LeafOrder; // Ensure deterministic leaf
>>>>>>> output order.
>>>>>>>
>>>>>>> @@ -351,13 +494,12 @@
>>>>>>> SmallPtrSet<Value*, 8> Visited; // For sanity checking the
>>>>>>> iteration scheme.
>>>>>>> #endif
>>>>>>> while (!Worklist.empty()) {
>>>>>>> - std::pair<BinaryOperator*, unsigned> P =
>>>>>>> Worklist.pop_back_val();
>>>>>>> + std::pair<BinaryOperator*, APInt> P = Worklist.pop_back_val();
>>>>>>> I = P.first; // We examine the operands of this binary operator.
>>>>>>> - assert(P.second>= 1&& "No paths to here, so how did we get
>>>>>>> here?!");
>>>>>>>
>>>>>>> for (unsigned OpIdx = 0; OpIdx< 2; ++OpIdx) { // Visit operands.
>>>>>>> Value *Op = I->getOperand(OpIdx);
>>>>>>> - unsigned Weight = P.second; // Number of paths to this operand.
>>>>>>> + APInt Weight = P.second; // Number of paths to this operand.
>>>>>>> DEBUG(dbgs()<< "OPERAND: "<< *Op<< " ("<< Weight<< ")\n");
>>>>>>> assert(!Op->use_empty()&& "No uses, so how did we get to
>>>>>>> it?!");
>>>>>>>
>>>>>>> @@ -389,7 +531,7 @@
>>>>>>> assert(Visited.count(Op)&& "In leaf map but not visited!");
>>>>>>>
>>>>>>> // Update the number of paths to the leaf.
>>>>>>> - It->second += Weight;
>>>>>>> + IncorporateWeight(It->second, Weight, Opcode);
>>>>>>>
>>>>>>> // The leaf already has one use from inside the
>>>>>>> expression. As we want
>>>>>>> // exactly one such use, drop this new use of the leaf.
>>>>>>> @@ -450,21 +592,44 @@
>>>>>>>
>>>>>>> // The leaves, repeated according to their weights, represent
>>>>>>> the linearized
>>>>>>> // form of the expression.
>>>>>>> + Constant *Cst = 0; // Accumulate constants here.
>>>>>>> for (unsigned i = 0, e = LeafOrder.size(); i != e; ++i) {
>>>>>>> Value *V = LeafOrder[i];
>>>>>>> LeafMap::iterator It = Leaves.find(V);
>>>>>>> if (It == Leaves.end())
>>>>>>> - // Leaf already output, or node initially thought to be a
>>>>>>> leaf wasn't.
>>>>>>> + // Node initially thought to be a leaf wasn't.
>>>>>>> continue;
>>>>>>> assert(!isReassociableOp(V, Opcode)&& "Shouldn't be a leaf!");
>>>>>>> - unsigned Weight = It->second;
>>>>>>> - assert(Weight> 0&& "No paths to this value!");
>>>>>>> - // FIXME: Rather than repeating values Weight times, use a
>>>>>>> vector of
>>>>>>> - // (ValueEntry, multiplicity) pairs.
>>>>>>> - Ops.append(Weight, ValueEntry(getRank(V), V));
>>>>>>> + APInt Weight = It->second;
>>>>>>> + if (Weight.isMinValue())
>>>>>>> + // Leaf already output or weight reduction eliminated it.
>>>>>>> + continue;
>>>>>>> // Ensure the leaf is only output once.
>>>>>>> - Leaves.erase(It);
>>>>>>> + It->second = 0;
>>>>>>> + // Glob all constants together into Cst.
>>>>>>> + if (Constant *C = dyn_cast<Constant>(V)) {
>>>>>>> + C = EvaluateRepeatedConstant(Opcode, C, Weight);
>>>>>>> + Cst = Cst ? ConstantExpr::get(Opcode, Cst, C) : C;
>>>>>>> + continue;
>>>>>>> + }
>>>>>>> + // Add non-constant
>>>>>>> + Ops.push_back(std::make_pair(V, Weight));
>>>>>>> + }
>>>>>>> +
>>>>>>> + // Add any constants back into Ops, all globbed together and
>>>>>>> reduced to having
>>>>>>> + // weight 1 for the convenience of users.
>>>>>>> + if (Cst&& Cst != ConstantExpr::getBinOpIdentity(Opcode,
>>>>>>> I->getType()))
>>>>>>> + Ops.push_back(std::make_pair(Cst, APInt(Bitwidth, 1)));
>>>>>>> +
>>>>>>> + // For nilpotent operations or addition there may be no
>>>>>>> operands, for example
>>>>>>> + // because the expression was "X xor X" or consisted of
>>>>>>> 2^Bitwidth additions:
>>>>>>> + // in both cases the weight reduces to 0 causing the value to be
>>>>>>> skipped.
>>>>>>> + if (Ops.empty()) {
>>>>>>> + Constant *Identity = ConstantExpr::getBinOpIdentity(Opcode,
>>>>>>> I->getType());
>>>>>>> + Ops.push_back(std::make_pair(Identity, APInt(Bitwidth, 1)));
>>>>>>> }
>>>>>>> +
>>>>>>> + return MadeChange;
>>>>>>> }
>>>>>>>
>>>>>>> // RewriteExprTree - Now that the operands for this expression
>>>>>>> tree are
>>>>>>> @@ -775,8 +940,15 @@
>>>>>>> BinaryOperator *BO = isReassociableOp(V, Instruction::Mul);
>>>>>>> if (!BO) return 0;
>>>>>>>
>>>>>>> + SmallVector<RepeatedValue, 8> Tree;
>>>>>>> + MadeChange |= LinearizeExprTree(BO, Tree);
>>>>>>> SmallVector<ValueEntry, 8> Factors;
>>>>>>> - LinearizeExprTree(BO, Factors);
>>>>>>> + Factors.reserve(Tree.size());
>>>>>>> + for (unsigned i = 0, e = Tree.size(); i != e; ++i) {
>>>>>>> + RepeatedValue E = Tree[i];
>>>>>>> + Factors.append(E.second.getZExtValue(),
>>>>>>> + ValueEntry(getRank(E.first), E.first));
>>>>>>> + }
>>>>>>>
>>>>>>> bool FoundFactor = false;
>>>>>>> bool NeedsNegate = false;
>>>>>>> @@ -1439,8 +1611,15 @@
>>>>>>>
>>>>>>> // First, walk the expression tree, linearizing the tree,
>>>>>>> collecting the
>>>>>>> // operand information.
>>>>>>> + SmallVector<RepeatedValue, 8> Tree;
>>>>>>> + MadeChange |= LinearizeExprTree(I, Tree);
>>>>>>> SmallVector<ValueEntry, 8> Ops;
>>>>>>> - LinearizeExprTree(I, Ops);
>>>>>>> + Ops.reserve(Tree.size());
>>>>>>> + for (unsigned i = 0, e = Tree.size(); i != e; ++i) {
>>>>>>> + RepeatedValue E = Tree[i];
>>>>>>> + Ops.append(E.second.getZExtValue(),
>>>>>>> + ValueEntry(getRank(E.first), E.first));
>>>>>>> + }
>>>>>>>
>>>>>>> DEBUG(dbgs()<< "RAIn:\t"; PrintOps(I, Ops); dbgs()<< '\n');
>>>>>>>
>>>>>>>
>>>>>>> Modified: llvm/trunk/lib/VMCore/Constants.cpp
>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=158358&r1=158357&r2=158358&view=diff
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/lib/VMCore/Constants.cpp (original)
>>>>>>> +++ llvm/trunk/lib/VMCore/Constants.cpp Tue Jun 12 09:33:56 2012
>>>>>>> @@ -2007,6 +2007,26 @@
>>>>>>> isExact ? PossiblyExactOperator::IsExact : 0);
>>>>>>> }
>>>>>>>
>>>>>>> +/// getBinOpIdentity - Return the identity for the given binary operation,
>>>>>>> +/// i.e. a constant C such that X op C = X and C op X = X for every X. It
>>>>>>> +/// is an error to call this for an operation that doesn't have an identity.
>>>>>>> +Constant *ConstantExpr::getBinOpIdentity(unsigned Opcode, Type *Ty) {
>>>>>>> + switch (Opcode) {
>>>>>>> + default:
>>>>>>> + llvm_unreachable("Not a binary operation with identity");
>>>>>>> + case Instruction::Add:
>>>>>>> + case Instruction::Or:
>>>>>>> + case Instruction::Xor:
>>>>>>> + return Constant::getNullValue(Ty);
>>>>>>> +
>>>>>>> + case Instruction::Mul:
>>>>>>> + return ConstantInt::get(Ty, 1);
>>>>>>> +
>>>>>>> + case Instruction::And:
>>>>>>> + return Constant::getAllOnesValue(Ty);
>>>>>>> + }
>>>>>>> +}
>>>>>>> +
>>>>>>> // destroyConstant - Remove the constant from the constant table...
>>>>>>> //
>>>>>>> void ConstantExpr::destroyConstant() {
>>>>>>>
>>>>>>> Modified: llvm/trunk/lib/VMCore/Instruction.cpp
>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Instruction.cpp?rev=158358&r1=158357&r2=158358&view=diff
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/lib/VMCore/Instruction.cpp (original)
>>>>>>> +++ llvm/trunk/lib/VMCore/Instruction.cpp Tue Jun 12 09:33:56 2012
>>>>>>> @@ -395,6 +395,29 @@
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> +/// isIdempotent - Return true if the instruction is idempotent:
>>>>>>> +///
>>>>>>> +/// Idempotent operators satisfy: x op x === x
>>>>>>> +///
>>>>>>> +/// In LLVM, the And and Or operators are idempotent.
>>>>>>> +///
>>>>>>> +bool Instruction::isIdempotent(unsigned Opcode) {
>>>>>>> + return Opcode == And || Opcode == Or;
>>>>>>> +}
>>>>>>> +
>>>>>>> +/// isNilpotent - Return true if the instruction is nilpotent:
>>>>>>> +///
>>>>>>> +/// Nilpotent operators satisfy: x op x === Id,
>>>>>>> +///
>>>>>>> +/// where Id is the identity for the operator, i.e. a constant such that
>>>>>>> +/// x op Id === x and Id op x === x for all x.
>>>>>>> +///
>>>>>>> +/// In LLVM, the Xor operator is nilpotent.
>>>>>>> +///
>>>>>>> +bool Instruction::isNilpotent(unsigned Opcode) {
>>>>>>> + return Opcode == Xor;
>>>>>>> +}
>>>>>>> +
>>>>>>> Instruction *Instruction::clone() const {
>>>>>>> Instruction *New = clone_impl();
>>>>>>> New->SubclassOptionalData = SubclassOptionalData;
>>>>>>>
>>>>>>> Added: llvm/trunk/test/Transforms/Reassociate/repeats.ll
>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Reassociate/repeats.ll?rev=158358&view=auto
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/test/Transforms/Reassociate/repeats.ll (added)
>>>>>>> +++ llvm/trunk/test/Transforms/Reassociate/repeats.ll Tue Jun 12 09:33:56 2012
>>>>>>> @@ -0,0 +1,252 @@
>>>>>>> +; RUN: opt < %s -reassociate -S | FileCheck %s
>>>>>>> +
>>>>>>> +; Tests involving repeated operations on the same value.
>>>>>>> +
>>>>>>> +define i8 @nilpotent(i8 %x) {
>>>>>>> +; CHECK: @nilpotent
>>>>>>> + %tmp = xor i8 %x, %x
>>>>>>> + ret i8 %tmp
>>>>>>> +; CHECK: ret i8 0
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i2 @idempotent(i2 %x) {
>>>>>>> +; CHECK: @idempotent
>>>>>>> + %tmp1 = and i2 %x, %x
>>>>>>> + %tmp2 = and i2 %tmp1, %x
>>>>>>> + %tmp3 = and i2 %tmp2, %x
>>>>>>> + ret i2 %tmp3
>>>>>>> +; CHECK: ret i2 %x
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i2 @add(i2 %x) {
>>>>>>> +; CHECK: @add
>>>>>>> + %tmp1 = add i2 %x, %x
>>>>>>> + %tmp2 = add i2 %tmp1, %x
>>>>>>> + %tmp3 = add i2 %tmp2, %x
>>>>>>> + ret i2 %tmp3
>>>>>>> +; CHECK: ret i2 0
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i2 @cst_add() {
>>>>>>> +; CHECK: @cst_add
>>>>>>> + %tmp1 = add i2 1, 1
>>>>>>> + %tmp2 = add i2 %tmp1, 1
>>>>>>> + ret i2 %tmp2
>>>>>>> +; CHECK: ret i2 -1
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i8 @cst_mul() {
>>>>>>> +; CHECK: @cst_mul
>>>>>>> + %tmp1 = mul i8 3, 3
>>>>>>> + %tmp2 = mul i8 %tmp1, 3
>>>>>>> + %tmp3 = mul i8 %tmp2, 3
>>>>>>> + %tmp4 = mul i8 %tmp3, 3
>>>>>>> + ret i8 %tmp4
>>>>>>> +; CHECK: ret i8 -13
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i3 @foo3x5(i3 %x) {
>>>>>>> +; Can be done with two multiplies.
>>>>>>> +; CHECK: @foo3x5
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i3 %x, %x
>>>>>>> + %tmp2 = mul i3 %tmp1, %x
>>>>>>> + %tmp3 = mul i3 %tmp2, %x
>>>>>>> + %tmp4 = mul i3 %tmp3, %x
>>>>>>> + ret i3 %tmp4
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i3 @foo3x6(i3 %x) {
>>>>>>> +; Can be done with two multiplies.
>>>>>>> +; CHECK: @foo3x6
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i3 %x, %x
>>>>>>> + %tmp2 = mul i3 %tmp1, %x
>>>>>>> + %tmp3 = mul i3 %tmp2, %x
>>>>>>> + %tmp4 = mul i3 %tmp3, %x
>>>>>>> + %tmp5 = mul i3 %tmp4, %x
>>>>>>> + ret i3 %tmp5
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i3 @foo3x7(i3 %x) {
>>>>>>> +; Can be done with two multiplies.
>>>>>>> +; CHECK: @foo3x7
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i3 %x, %x
>>>>>>> + %tmp2 = mul i3 %tmp1, %x
>>>>>>> + %tmp3 = mul i3 %tmp2, %x
>>>>>>> + %tmp4 = mul i3 %tmp3, %x
>>>>>>> + %tmp5 = mul i3 %tmp4, %x
>>>>>>> + %tmp6 = mul i3 %tmp5, %x
>>>>>>> + ret i3 %tmp6
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x8(i4 %x) {
>>>>>>> +; Can be done with two multiplies.
>>>>>>> +; CHECK: @foo4x8
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + ret i4 %tmp7
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x9(i4 %x) {
>>>>>>> +; Can be done with three multiplies.
>>>>>>> +; CHECK: @foo4x9
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + %tmp8 = mul i4 %tmp7, %x
>>>>>>> + ret i4 %tmp8
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x10(i4 %x) {
>>>>>>> +; Can be done with three multiplies.
>>>>>>> +; CHECK: @foo4x10
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + %tmp8 = mul i4 %tmp7, %x
>>>>>>> + %tmp9 = mul i4 %tmp8, %x
>>>>>>> + ret i4 %tmp9
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x11(i4 %x) {
>>>>>>> +; Can be done with four multiplies.
>>>>>>> +; CHECK: @foo4x11
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + %tmp8 = mul i4 %tmp7, %x
>>>>>>> + %tmp9 = mul i4 %tmp8, %x
>>>>>>> + %tmp10 = mul i4 %tmp9, %x
>>>>>>> + ret i4 %tmp10
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x12(i4 %x) {
>>>>>>> +; Can be done with two multiplies.
>>>>>>> +; CHECK: @foo4x12
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + %tmp8 = mul i4 %tmp7, %x
>>>>>>> + %tmp9 = mul i4 %tmp8, %x
>>>>>>> + %tmp10 = mul i4 %tmp9, %x
>>>>>>> + %tmp11 = mul i4 %tmp10, %x
>>>>>>> + ret i4 %tmp11
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x13(i4 %x) {
>>>>>>> +; Can be done with three multiplies.
>>>>>>> +; CHECK: @foo4x13
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + %tmp8 = mul i4 %tmp7, %x
>>>>>>> + %tmp9 = mul i4 %tmp8, %x
>>>>>>> + %tmp10 = mul i4 %tmp9, %x
>>>>>>> + %tmp11 = mul i4 %tmp10, %x
>>>>>>> + %tmp12 = mul i4 %tmp11, %x
>>>>>>> + ret i4 %tmp12
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x14(i4 %x) {
>>>>>>> +; Can be done with three multiplies.
>>>>>>> +; CHECK: @foo4x14
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + %tmp8 = mul i4 %tmp7, %x
>>>>>>> + %tmp9 = mul i4 %tmp8, %x
>>>>>>> + %tmp10 = mul i4 %tmp9, %x
>>>>>>> + %tmp11 = mul i4 %tmp10, %x
>>>>>>> + %tmp12 = mul i4 %tmp11, %x
>>>>>>> + %tmp13 = mul i4 %tmp12, %x
>>>>>>> + ret i4 %tmp13
>>>>>>> +}
>>>>>>> +
>>>>>>> +define i4 @foo4x15(i4 %x) {
>>>>>>> +; Can be done with four multiplies.
>>>>>>> +; CHECK: @foo4x15
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: mul
>>>>>>> +; CHECK-NEXT: ret
>>>>>>> + %tmp1 = mul i4 %x, %x
>>>>>>> + %tmp2 = mul i4 %tmp1, %x
>>>>>>> + %tmp3 = mul i4 %tmp2, %x
>>>>>>> + %tmp4 = mul i4 %tmp3, %x
>>>>>>> + %tmp5 = mul i4 %tmp4, %x
>>>>>>> + %tmp6 = mul i4 %tmp5, %x
>>>>>>> + %tmp7 = mul i4 %tmp6, %x
>>>>>>> + %tmp8 = mul i4 %tmp7, %x
>>>>>>> + %tmp9 = mul i4 %tmp8, %x
>>>>>>> + %tmp10 = mul i4 %tmp9, %x
>>>>>>> + %tmp11 = mul i4 %tmp10, %x
>>>>>>> + %tmp12 = mul i4 %tmp11, %x
>>>>>>> + %tmp13 = mul i4 %tmp12, %x
>>>>>>> + %tmp14 = mul i4 %tmp13, %x
>>>>>>> + ret i4 %tmp14
>>>>>>> +}
>>>>>>>
>>>>>>>
>>>
>>
>>
>