[llvm-dev] Using [GlobalISel] to provide peephole optimizations
via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 23 04:51:02 PDT 2019
Hi,
GlobalISel is fantastic, but obviously lacks a lot of the transforms that
make SelectionDAG so good. Whilst it's plenty usable, you'll find yourself
wanting/needing to add a lot of little manual transforms to clean things up.
I know of the RFC for a new Combiner with its own syntax
(https://reviews.llvm.org/D54286 is the latest I can find of it), but after
adding my Nth hand-coded pass for a niggling but important transform, and
then needing to add more cases to FoldImmediate/MemoryOperand/OptimizeLoad
after it, I wondered how hard it would be to allow GlobalISel to reselect
machine patterns, e.g. after they've been made available by other passes.
What I was thinking is, in addition to anything else that's coming, allowing
Instructions to exist on the input side of Pat<>, and using the same
InstructionSelector we already have to reselect.
To my surprise... not many changes are required to seemingly make this work:
// fold loads into compare instructions
def : Pat<(CPw_sr i32:$k, (MOVw_wf iPTR:$s)), (CPw_sf i32:$k, iPTR:$s)>;
And it looks like SDNodeXForms will work off the bat, along with complex
renderers. The main catches are constants that require checking (due to
G_CONSTANT being handled differently from immediates), along with needing to
add checks that the instruction has the same implicits, and that they're
dead where appropriate. But it seems viable and is definitely easy to use, so
I'm just wondering: is this something that's being considered, or that is
appealing to people? And/or is the restriction of not allowing Instructions
on the LHS quite an intentional design decision?
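To make the idea concrete, here is a toy model in Python of what a pattern
like the CPw_sr/CPw_sf one above amounts to. It is only a sketch and not the
proof-of-concept or any LLVM API: the Inst class, the RULES table and the
reselect function are hypothetical names made up for illustration, and the
single-use and dead-implicit checks are simplified stand-ins for the checks
mentioned above. It also ignores everything else a real implementation would
have to verify (intervening stores, for instance).

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    opcode: str
    defs: list                     # virtual registers written
    uses: list                     # operands read, in order
    implicit_defs: dict = field(default_factory=dict)  # name -> is it dead?


# Hypothetical rule table: (outer opcode, operand index, inner opcode) -> new opcode.
# The single entry mirrors the Pat<> above.
RULES = {("CPw_sr", 1, "MOVw_wf"): "CPw_sf"}


def reselect(block):
    """Fold single-use producers into their consumer where a rule matches."""
    def_index = {reg: n for n, inst in enumerate(block) for reg in inst.defs}
    use_count = {}
    for inst in block:
        for reg in inst.uses:
            use_count[reg] = use_count.get(reg, 0) + 1

    folded = set()   # indices of producers absorbed into a consumer
    result = []
    for inst in block:
        for idx, reg in enumerate(inst.uses):
            n = def_index.get(reg)
            if n is None:
                continue
            inner = block[n]
            new_opcode = RULES.get((inst.opcode, idx, inner.opcode))
            if new_opcode is None:
                continue
            # The producer may only disappear if nothing else reads its result.
            if use_count[reg] != 1 or len(inner.defs) != 1:
                continue
            # Any implicit defs of the producer must be dead.
            if not all(inner.implicit_defs.values()):
                continue
            # Splice the producer's operands in place of the folded register.
            # Assumption: the new opcode has the same implicits as the old one.
            uses = inst.uses[:idx] + inner.uses + inst.uses[idx + 1:]
            inst = Inst(new_opcode, inst.defs, uses, inst.implicit_defs)
            folded.add(n)
            break
        result.append(inst)
    return [inst for n, inst in enumerate(result) if n not in folded]


block = [
    Inst("MOVw_wf", ["%r"], ["%s"]),
    Inst("CPw_sr", [], ["%k", "%r"], {"SR": False}),
]
print([(i.opcode, i.uses) for i in reselect(block)])
```

With the two-instruction block above, the load is absorbed and a single
CPw_sf reading %k and %s remains. If %r had a second user, the block would be
left untouched.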
Because it seems that this would provide some value even for those not using
GlobalISel as their primary selector, just as a way of quickly describing
peephole optimizations and leveraging the very nifty little VM there to
implement them. In theory, a lot of pattern fragments could even be added
automatically, by comparing pattern fragments and the machine opcodes they
represent - giving a free automatically generated "foldImmediate", among
other things.
A diff of the proof-of-concept can be found here:
https://paste.ee/p/aDHIg. Note, though, that it's really just a curiosity to
get some conversation going, that other Selectors will reject these patterns
(I haven't added their escape routes), and that you're likely to break it in
many different ways. And of course these patterns should exist in their own
table, and Reselect ought to return the newly selected instruction.
But it's neat enough for me as is to get some useful patterns added already,
with these caveats in mind.
Any thoughts?
Regards,
Alex Davies
PS Apologies if the diff doesn't work first go - I had to wrestle with quite
a few out-of-tree changes to get things in order. Note that it also includes
a fix for https://bugs.llvm.org/show_bug.cgi?id=42032. Oh, and if there's
already a peephole pattern emitter that I've missed, please do let me know,
I'll easily survive the shame. :)