[PATCH] D27861: [DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2.
Artur Pilipenko via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 16 04:45:24 PST 2017
apilipenko marked an inline comment as done.
apilipenko added inline comments.
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4446
+ return Index >= NarrowByteWidth
+ ? ByteProvider::getConstantZero()
+ : calculateByteProvider(NarrowOp, Index);
> So, one path of what this is doing is essentially computing known-zero bits. I wonder if we should really be hand rolling that or whether we should instead use computeKnownBits. We could likely use computeKnownBits in the OR logic and then only handle recursing to find loads in the rest of these paths... Thoughts? maybe not worth it, hard to tell.
It's definitely an option, but I can't justify the change. In follow up changes I'm going to exploit the information about zero bytes in DAGCombiner::MatchLoadCombine to handle partially available patterns. That will introduce another point where I'll need to use computeKnownBits.
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4485-4492
+ if (VT != MVT::i16 && VT != MVT::i32 && VT != MVT::i64)
+ return SDValue();
+ unsigned ByteWidth = VT.getSizeInBits() / 8;
+ // There is nothing to do here if the target can't load a value of this type
+ const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+ if (LegalOperations && !TLI.isOperationLegal(ISD::LOAD, VT))
> Why only do the legality check when in the legalize phase? When would we want to combine loads to a non-legal integer type?
> Is the goal here to combine loads into a too-wide illegal integer type to let the legalizer then split them into legal sized chunks? If so, that at least needs a comment. And in that case, why only up to 64-bit integers?
> Alternatively, you could do the legality check in all phases and just never combine stores when the merged value is wider than the legal integer size.
Yes, the goal is to combine to a possibly too-wide load and let the legalizer split it later. This enables us to combine load i64 by i8 to a couple of i32 loads on 32 bit targets. Will add a comment.
The i64 limitation is somewhat arbitrary just to limit the scope of the transformation. It can be lifted easily. (On the other hand with the newly introduced depth limit in calculateByteProvider we won't be able to fold patterns much wider than i64)
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4573-4577
+ // Check that a load of the wide type is both allowed and fast on the target
+ bool Fast = false;
+ bool Allowed = TLI.allowsMemoryAccess(*DAG.getContext(), DAG.getDataLayout(),
+ VT, FirstLoad->getAddressSpace(),
+ FirstLoad->getAlignment(), &Fast);
> Shouldn't this check come first, up with the legalization stuff?
allowsMemoryAccess needs to know the address, addrspace and alignment specifically. We don't know it before we do all the computations above.
More information about the llvm-commits