[PATCH][ISel] Keep matching state consistent when folding during X86 address match
anemet at apple.com
Fri Oct 3 13:11:37 PDT 2014
On Oct 3, 2014, at 11:18 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Adam,
> LGTM with some minor fixes.
> + // Some paranoid early-returns here even though we only install this
> + // update listener during matching a complex patterns so none of these
> + // should ever occur.
> + if (!E || E->isMachineOpcode())
> + return;
> Should this be an assert then?
Thanks for the review.
As we discussed offline, it’s not technically wrong for these to occur. For example, I initially had the listener set up for the entire function scope in SelectCodeCommon so both of these would occur. I updated the comment to explain this a bit better:
// Some early-returns here to avoid the search if we deleted the node or
// if the update comes from MorphNodeTo (MorphNodeTo is the last thing we
// do, so it's unnecessary to update matching state at that point).
// Neither of these can occur currently because we only install this
// update listener during matching a complex patterns.
> + /// \brief Address-mode matching performs shift-of-and to and-of-shift
> + /// reassocication in order to expose more scaled addressing
> + /// oppportunities.
> - reassocication -> reassociation
> - oppportunities -> opportunities
> + // If target can modify DAG during matching, keep the matching state
> + // consisent.
> typo: consisent -> consistent.
> +; CHECK-NOT: and
> +; The shift (leal) should be folded into the scale of the address in the load.
> +; CHECK-NOT: leal
> Add a CHECK-LABEL at the start.
Checked in as r219009.
> On Oct 3, 2014, at 9:56 AM, Adam Nemet <anemet at apple.com> wrote:
>> Ping. Can someone please look at this? It’s getting pretty important for us.
>> On Sep 26, 2014, at 1:33 PM, Adam Nemet <anemet at apple.com> wrote:
>>> In the X86 backend, matching an address is initiated by the 'addr' complex
>>> pattern and its friends. During this process we may reassociate and-of-shift
>>> into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
>>> shift into the scale of the address.
>>> However as demonstrated by the testcase, this can trigger CSE of not only the
>>> shift and the AND which the code is prepared for but also the underlying load
>>> node. In the testcase this node is sitting in the RecordedNode and MatchScope
>>> data structures of the matcher and becomes a deleted node upon CSE. Returning
>>> from the complex pattern function, we try to access it again hitting an assert
>>> because the node is no longer a load even though this was checked before.
>>> Now obviously changing the DAG this late is bending the rules but I think it
>>> makes sense somewhat. Outside of addresses we prefer and-of-shift because it
>>> may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
>>> example because it create a non-canonical node). We currently don't recognize
>>> addresses during DAGCombiner where arguably this canonicalization should be
>>> performed. On the other hand, having this in the matcher allows us to cover
>>> all the cases where an address can be used in an instruction.
>>> I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
>>> the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible
>>> for initiating the recursive CSE on users
>>> (http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
>>> is not strictly necessary since the shift is hooked into the visited user. Of
>>> course it's safer to keep the DAG consistent at all times (e.g. for accurate
>>> number of uses, etc.).
>>> So rather than changing the fundamentals, I've decided to continue along the
>>> previous patches and detect the CSE. This patch installs a very targeted
>>> DAGUpdateListener for the duration of a complex-pattern match and updates the
>>> matching state accordingly. (Previous patches used HandleSDNode to detect the
>>> CSE but that's not practical here). The listener is only installed on X86.
>>> I tested that there is no measurable overhead due to this while running
>>> through the spec2k BC files with llc. The only thing we pay for is the
>>> creation of the listener. The callback never ever triggers in spec2k since
>>> this is a corner case.
>>> Fixes rdar://problem/18206171
>>> Please let me know if this makes sense.
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
More information about the llvm-commits