[llvm-dev] Load combine pass
Sanjoy Das via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 29 10:56:44 PDT 2016
Hi David,
David Chisnall wrote:
> Nope, we’re not using the address sanitiser. Our architecture
> supports byte-granularity bounds checking in hardware.
I mentioned address sanitizer since (then) I thought your architecture
would have to prohibit the same kinds of transforms that address
sanitizer has to prohibit.
However, on second thought, I think I have a counter-example to my
statement above -- I suppose your architecture only checks bounds and
not that the location being loaded from is initialized?
> Note that even without this, for pure MIPS code without our
> extensions, load widening generates significantly worse code than when
> it doesn’t happen. I’m actually finding it difficult to come up with
> a microarchitecture where a 16-bit load followed by an 8-bit load from
> the same cache line would give worse performance than a 32-bit load, a
> mask and a shift. In an in-order design, it’s more instructions to do
> the same work, and therefore slower. In an out-of-order design, the
> two loads within the cache line will likely be dispatched
> simultaneously and you’ll have less pressure on the register rename
> engine.
That makes sense, but what do you think of Artur's suggestion of
catching only the obvious patterns? That is, catching only cases like
    i16* ptr = ...
    i32 val = ptr[0] | (ptr[1] << 16);
    ==> // subject to endianness
    i16* ptr = ...
    i32 val = *(i32*) ptr;
To me that seems like a win (or at least, not a loss) on any
architecture. However, I will admit that I've only ever worked on x86
so I have a lot of blind spots here.
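
For concreteness, here is a rough sketch of that pattern in plain C rather
than IR. The function names are made up for illustration, and memcpy stands
in for the pointer cast so the C version stays clear of alignment and
strict-aliasing problems. It is only meant to show why the rewrite depends
on byte order, not how a pass would implement it.

```c
#include <stdint.h>
#include <string.h>

/* "Before" form: two adjacent 16-bit loads combined with a shift and an or. */
static uint32_t load_halves(const uint16_t *ptr) {
    return (uint32_t)ptr[0] | ((uint32_t)ptr[1] << 16);
}

/* "After" form: a single 32-bit load from the same address.
   memcpy is used instead of *(uint32_t *)ptr (an assumption made for
   this sketch) to avoid undefined behaviour in C. */
static uint32_t load_word(const uint16_t *ptr) {
    uint32_t val;
    memcpy(&val, ptr, sizeof val);
    return val;
}

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void) {
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

With a buffer holding {0x1234, 0xABCD}, load_halves returns 0xABCD1234 on
any host. load_word returns the same value only on a little-endian host. On
a big-endian host it returns 0x1234ABCD, so the matching "before" form there
would be (ptr[0] << 16) | ptr[1].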
Thanks!
-- Sanjoy