[llvm-dev] Load combine pass
Sanjoy Das via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 29 10:56:44 PDT 2016
David Chisnall wrote:
> Nope, we’re not using the address sanitiser. Our architecture
> supports byte-granularity bounds checking in hardware.
I mentioned address sanitizer because, at the time, I thought your
architecture would have to prohibit the same kinds of transforms that
address sanitizer has to prohibit.
However, on second thought, I think I have a counter-example to my
statement above -- I suppose your architecture only checks bounds and
not that the location being loaded from is initialized?
> Note that even without this, for pure MIPS code without our
> extensions, load widening generates significantly worse code than when
> it doesn’t happen. I’m actually finding it difficult to come up with
> a microarchitecture where a 16-bit load followed by an 8-bit load from
> the same cache line would give worse performance than a 32-bit load, a
> mask and a shift. In an in-order design, it’s more instructions to do
> the same work, and therefore slower. In an out-of-order design, the
> two loads within the cache line will likely be dispatched
> simultaneously and you’ll have less pressure on the register rename
That makes sense, but what do you think of Artur's suggestion of
catching only the obvious patterns? That is, catching only cases like
i16* ptr = ...
i32 val = ptr[0] | (ptr[1] << 16);
==> // subject to endianness
i16* ptr = ...
i32 val = *(i32*) ptr;
To me that seems like a win (or at least, not a loss) on any
architecture. However, I will admit that I've only ever worked on x86
so I have a lot of blind spots here.
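
For concreteness, here is a rough C sketch of the pattern above. It is only an illustration, not what any pass actually emits: the function names are made up, memcpy stands in for the single wide load so the C itself stays well-defined, and it assumes the host is either plainly little-endian or plainly big-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void) {
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Narrow form: two 16-bit loads, with ptr[0] as the low half. */
static uint32_t load_narrow_lo_first(const uint16_t *ptr) {
    return (uint32_t)ptr[0] | ((uint32_t)ptr[1] << 16);
}

/* Narrow form with ptr[0] as the high half. */
static uint32_t load_narrow_hi_first(const uint16_t *ptr) {
    return ((uint32_t)ptr[0] << 16) | (uint32_t)ptr[1];
}

/* Wide form: one 32-bit load from the same address.
 * memcpy avoids alignment and strict-aliasing problems in C. */
static uint32_t load_wide(const uint16_t *ptr) {
    uint32_t val;
    memcpy(&val, ptr, sizeof val);
    return val;
}

/* Checks that the wide load matches the narrow form that is
 * appropriate for this host's byte order. */
static void check_combine(const uint16_t *ptr) {
    uint32_t expected = host_is_little_endian()
                            ? load_narrow_lo_first(ptr)
                            : load_narrow_hi_first(ptr);
    assert(load_wide(ptr) == expected);
}
```

On a little-endian host the lo-first form is the one that can become a single load; on a big-endian host it is the hi-first form. That is all the "subject to endianness" comment is meant to say.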