RFC: LoopEditor, a high-level loop transform toolkit

Tue Jul 28 12:49:00 PDT 2015

Hi Michael,

+llvmdev,Hal,Nadav

For testing, I was currently thinking of a two pronged approach. Lit tests
as you suggest with a dummy pass, probably with command line options to
define what transform to do, and unit tests to test the delegate behaviour
and return values.

I'll try and produce a mega patch with at least the loop vectoriser moved
over, then split it up again after review.

Cheers,

James

On Tue, 28 Jul 2015 at 20:27, Michael Zolotukhin <mzolotukhin at apple.com>
wrote:

> BTW, I think this thread should be on LLVM-dev, and I’d also CC Arnold,
> Hal, and Nadav - their feedback would be very valuable here.
>
> Michael
>
> On Jul 28, 2015, at 12:21 PM, Mikhail Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
> Hi James,
>
> Thanks for your comments!
>
> On Jul 28, 2015, at 1:08 AM, James Molloy <james at jamesmolloy.co.uk> wrote:
>
> Hi Michael,
>
> Thanks for the feedback.
>
> > As a show-case, how does it help to simplify the aforementioned
> createEmptyBlock?
>
> That's a good example. If you look at http://reviews.llvm.org/D11530 line
> 916 in the .cpp file, there's a function "versionWidenAndInterleaveLoop"
> that does basically createEmptyBlock(). It also does more stuff - it
> performs the loop interleave too - but as an example showcase it should do.
>
> For the vectorizer, I think the cleanest thing would be to add a delegate
> hook that's called whenever an instruction is about to be cloned. Then, the
> vectorizer could override that hook and provide its own implementation
> (instead of cloning an instruction completely, it'd replace it with a
> vector variant. So the entire main body of the vectorizer would look
> something like this:
>
> struct Delegate : public LoopEditor::Delegate {
>   Value *hookCloneInstruction(Instruction *OldI, IRBuilder<> &IRB) {
>     if (isControlFlowInst(OldI)) return IRB.CreateExtractElement; // Don't
> vectorize branches
>     return createVectorVersionOfInstruction(OldI, VF, IRB); // Defined
> somewhere in LoopVectorize
>   }
> };
> BasicBlock *PredBB;
> Delegate D;
> LoopEditor LE(L, DT, SE, LI);
> LoopEditor VectorLE = LE.versionWidenAndInterleaveLoop(UF, PredBB, &D);
> VectorLE.widen(VF); // Widen further, so it's widened by VF*UF. Only
> affects induction variable steps and trip count.
>
> How does that look to you?
>
> That looks good, I always wanted to factor this stuff out too:)
>
>
> > Also, I think it’s very important to explicitly define requirements for
> loops we can handle, and what we do with other loops. E.g. what do we do if
> a loop has multiple back-edges? What do we do if a loop body has
> complicated control-flow?
>
> Good point, agreed. I think we need to handle loops in LoopSimplify form
> and LCSSA form at least. Some functions may not be able to handle arbitrary
> control flow but most should be agnostic to it (for example making sure
> control flow is correct in a vectorized loop, should be the user's job not
> LoopEditor's). But defining exactly what loop type each function supports
> and ensuring it does support that is a good idea.
>
> I'm also trying to work out what's the best way of getting this in-tree.
> It's a lot of code and I don't want to dump it in one go, but getting
> pre-commit review on every little step would make it take forever. Do you
> have any suggestions as to methodology? I was thinking getting review on
> the header (API), then committing that with stub implementations and
> implementing them bit by bit from there...
>
> I agree, and this way seems reasonable to me. However, it's really
> tempting to see what we end up with before we commit to it - so, if you
> have some final patch with this API applied to most of its users (even if
> this patch is huge), it would be cool if you share it.
>
> Also, how will we test it? Should we create a dummy-pass that will invoke
> these transforms on a given loop (without any cost models, just doing what
> you ask it to do)?
>
> Michael
>
>
> Cheers,
>
> James
>
> On Mon, 27 Jul 2015 at 20:23 Michael Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
>> Hi James,
>>
>> On Jul 27, 2015, at 10:04 AM, James Molloy <james at jamesmolloy.co.uk>
>> wrote:
>>
>> Hi all,
>>
>> LLVM currently lacks a single coherent way to edit, modify and transform
>> loops. We have the Loop abstraction that provides some functionality, and
>> we have a cluster of functions and abstractions in LoopUtils, but each is
>> fairly cumbersome to use. There's also VersionedLoop, but that's quite
>> specific and not really extendable/composable.
>>
>> I've recently been trying to do some prototyping of a high level loop
>> optimization (unroll-and-jam) and ended up with quite a bit of spaghetti
>> code doing what should be simple transforms (as another example, look at
>> LoopVectorize::createEmptyBlock()!)
>>
>> Totally agree here!
>>
>>
>> So I decided to prototype an API for performing more high level
>> operations on LLVM Loops - cloning, widening, interleaving, stitching the
>> output of one loop into another loop, setting loop trip counts, peeling...
>> and all while keeping loopinfo, domtree and SCEV updated.
>>
>> I've uploaded my current prototype to Phab here:
>> http://reviews.llvm.org/D11530
>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D11530&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=3t7gCOgkqnA-65i6cez_eslTon3hxx5XJKhMBw98kv0&s=yCaxJfCogeqwUH-wlsdHFPHktniLwG3-cZnq1yciR04&e=>
>>
>>
>> It's still at the prototype stage so there are lots of things that need
>> fixing - sorting out the testing story being top of my list. But I'm
>> looking for feedback on:
>>   * Is this a useful abstraction? Is it something the community would
>> like to see in-tree (eventually)?
>>
>> I think it could be very useful - depending on how easy it would be to
>> use. As a show-case, how does it help to simplify the aforementioned
>> createEmptyBlock?
>>
>>   * Does the API design make sense? Are there obvious things I've missed
>> that people need?
>>
>> From the first glance, it seems reasonable. One probably missing thing:
>> we have peelAfter, but we may also need peelBefore (it would be helpful if
>> we create prologues for alignment for instance).
>>
>> Also, I think it’s very important to explicitly define requirements for
>> loops we can handle, and what we do with other loops. E.g. what do we do if
>> a loop has multiple back-edges? What do we do if a loop body has
>> complicated control-flow?
>>
>> Thanks,
>> Michael
>>
>>
>> I've thought about the use cases of the loop vectorizer, loop
>> distribution, loop unrolling, loop versioning, and I think the current
>> design can solve their use cases fairly elegantly. I've used this API to
>> write a fairly sophisticated unroll-and-jam transform in very few lines too.
>>
>> Any feedback would be gratefully appreciated!
>>
>> Cheers,
>>
>> James
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150728/38ac043c/attachment.html>