[llvm] r182976 - Reapply with r182909 with a fix to the calculation of the new indices for
Nick Lewycky
nlewycky at google.com
Mon Jun 3 15:42:49 PDT 2013
On 30 May 2013 20:36, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Nick,
>
> Thanks for working on this. I have a few comments.
>
>
> URL: http://llvm.org/viewvc/llvm-project?rev=182976&view=rev
> Log:
> Reapply with r182909 with a fix to the calculation of the new indices for
> insertelement instructions.
>
>
> Can you use the commit message to describe what the patch does ?
>
>
> }
>
> +/// Return true if we can evaluate the specified expression tree if the
> vector
> +/// elements were shuffled in a different order.
> +static bool CanEvaluateShuffled(Value *V, ArrayRef<int> Mask,
> + unsigned Depth = 100) {
>
>
> Depth = 100 ?? This is really high. It should be a much lower threshold.
> This can potentially search 2^100 paths because you are calling this
> recursion twice on binary operands. Please limit this to something like 5.
>
So I've updated it to 5 as you asked, but two points. LLVM has made a
trade-off in many places (loop unrolling and the inliner's cost analysis
for instance) that we spend more compile time optimizing code which uses
vectors. Second, in order for this 2^n to ever happen, the function needs
to have 2^n instructions in it. Reduced to a mere chain of five
instructions, I hope the optimization still fires on real-world code, such
as your backwards-iterating example where the loop vectorizer emits
redundant shuffles. I don't want to lose the point of this optimization.
> + case Instruction::InsertElement: {
> + ConstantInt *CI = dyn_cast<ConstantInt>(I->getOperand(2));
> + if (!CI) return false;
> + int ElementNumber = CI->getLimitedValue();
> +
> + // Verify that 'CI' does not occur twice in Mask. A single
> 'insertelement'
> + // can't put an element into multiple indices.
> + bool SeenOnce = false;
> + for (int i = 0, e = Mask.size(); i != e; ++i) {
> + if (Mask[i] == ElementNumber) {
> + if (SeenOnce)
> + return false;
> + SeenOnce = true;
> + }
> + }
> + return CanEvaluateShuffled(I->getOperand(0), Mask, Depth-1);
> + }
> + }
> + return false;
>
>
> I would like to see more tests for the code that deals with insertelements.
>
There was one case I wasn't testing, but as of r183080 I think we have them
all?
What happens if only some of the lanes are inserted ? Does it ever fail ?
>
Hm? This case is covered in the tests.
> }
>
> + if (isa<UndefValue>(RHS) &&
> + // This isn't necessary for correctness, but the comment block below
> + // claims that there are cases where folding two shuffles into one
> would
> + // cause worse codegen on some targets.
> + !isa<ShuffleVectorInst>(LHS) &&
> + CanEvaluateShuffled(LHS, Mask)) {
>
>
> What ?
>
Good spotting. Now that the patch doesn't do shuffle-shuffle folding, I'll
go ahead and remove the extra check and comment block.
Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130603/0a5281a8/attachment.html>
More information about the llvm-commits
mailing list