[llvm] r244601 - [X86] Allow merging of immediates within a basic block for code size savings

Thu Aug 13 11:25:30 PDT 2015

Filed as:
https://llvm.org/bugs/show_bug.cgi?id=24447
https://llvm.org/bugs/show_bug.cgi?id=24448
https://llvm.org/bugs/show_bug.cgi?id=24449

The last one (PR24449) looks like the easiest one to solve and probably
offers the most upside, given that you're seeing mostly zeros being stored.

On Thu, Aug 13, 2015 at 9:21 AM, Sanjay Patel <spatel at rotateright.com>
wrote:

>
>
> On Wed, Aug 12, 2015 at 6:33 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>> For reference, `mov [mem],imm` is decoded into 2 micro-ops (see "Table 1.
>> Typical Instruction Mappings" in [SOG]) whereas `mov [mem],reg` is only 1
>> micro-op, so it is *preferable* to use a reg since it amortizes the cost of
>> the `mov-imm` micro-op across the stores.
>>
>
>
> Wow, I never noticed that line in the table. So whatever we do may have to
> be specialized further by micro-arch...
>
> But the Intel Perf guide has this gem at Rule 39:
> "Try to schedule μops that have no immediate immediately before or after
> μops with 32-bit immediates."
>
>  ...so maybe it's a no-brainer for everyone after all. :)
>
>
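
To put rough numbers on the amortization argument quoted above, here is a
back-of-the-envelope sketch. It uses only the micro-op counts stated in the
quote (2 for `mov [mem],imm`, 1 for `mov [mem],reg`) and assumes, as the
phrase "the `mov-imm` micro-op" implies, that loading the immediate into a
register costs 1 micro-op. The function names are made up for illustration,
and the model ignores register pressure, code size, and scheduling, so treat
it as arithmetic rather than a performance prediction.

```python
# Micro-op counts as given in the quoted message (from the [SOG] table it cites).
UOPS_STORE_IMM = 2  # mov [mem], imm
UOPS_STORE_REG = 1  # mov [mem], reg
UOPS_MOV_IMM = 1    # assumption: mov reg, imm is a single micro-op


def uops_immediate_stores(n: int) -> int:
    """Micro-ops for n stores that each encode the immediate directly."""
    return n * UOPS_STORE_IMM


def uops_register_stores(n: int) -> int:
    """Micro-ops for one mov reg, imm followed by n stores from that register."""
    if n == 0:
        return 0
    return UOPS_MOV_IMM + n * UOPS_STORE_REG


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, uops_immediate_stores(n), uops_register_stores(n))
```

Under these assumptions the two forms tie for a single store (2 micro-ops
each), and the register form is ahead from the second store onward: n stores
cost 2n micro-ops with immediates versus n + 1 through a register.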