[llvm] r244601 - [X86] Allow merging of immediates within a basic block for code size savings

Thu Aug 13 14:25:19 PDT 2015

On Thu, Aug 13, 2015 at 1:07 PM, Ansari, Zia <zia.ansari at intel.com> wrote:

> Regarding 24449, the optimization would be nice to do, as long as we’re
> careful we don’t create additional hazards with the larger memory
> instructions. Specifically, we don’t want to start splitting cache lines.

Hi Zia -

Crossing a cache line is not a constraint that we take into account today
when merging. Can you give us an idea of the penalty? Sorry if this is
in the Opt Guide and I've overlooked it. Possibly related: we do split
unaligned 32-byte AVX memory accesses for SandyBridge (grep for Mem32Slow),
but not for any other chips.

I don't see a good mechanism for us to detect a cache line crossing when
we're doing these memory access merges. We could assume that anything with
alignment under the cache line size should not be merged, but that would
rule out almost all merging. E.g., it's very unlikely that we'd ever see a
4-byte access specified with alignment 64.
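
To make the alignment argument concrete, here is a minimal sketch, not code
from the tree. The names CacheLineBytes, allowMergeConservative, and
mayCrossCacheLine are hypothetical, and the 64-byte line size and
power-of-two alignments are assumptions. The first helper is the
conservative rule described above; the second is one possible weaker check
based only on the known size and alignment.

```cpp
#include <algorithm>
#include <cstdint>

// Assumed cache line size in bytes (64 is typical for current x86 chips).
constexpr std::uint64_t CacheLineBytes = 64;

// Conservative rule from the text: permit a merge only when the known
// alignment is at least a full cache line.
constexpr bool allowMergeConservative(std::uint64_t AlignBytes) {
  return AlignBytes >= CacheLineBytes;
}

// Hypothetical weaker check. An access of SizeBytes whose address is a
// multiple of AlignBytes (assumed to be a power of two) starts, in the
// worst case, min(AlignBytes, CacheLineBytes) bytes before the end of a
// line, so it can straddle a boundary only if it is larger than that.
constexpr bool mayCrossCacheLine(std::uint64_t SizeBytes,
                                 std::uint64_t AlignBytes) {
  return SizeBytes > std::min(AlignBytes, CacheLineBytes);
}
```

Under these assumptions, allowMergeConservative(4) is false, which is why
the conservative rule blocks nearly every merge. mayCrossCacheLine(8, 4) is
true, so merging two 4-byte accesses that are only 4-byte aligned cannot be
proven safe from alignment alone, while mayCrossCacheLine(8, 8) is false.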