Hi Quentin,

2014-10-02 0:21 GMT+07:00 Quentin Colombet <qcolombet at apple.com>:

>>> The constant hoisting pass does this kind of thing. Should we try to
>>> teach it to handle this kind of case?
>>
>> That would be interesting. However, this pass is x86-specific and can use
>> processor features (subregister structure, loading a 64-bit value with a
>> 32-bit move). Can these features be used by constant hoisting?
>
> Maybe. This pass has a bunch of target hooks, if I remember correctly.
> Juergen would know better :).

OK, looking forward to advice :)

>>> Moreover, this may be beneficial for code size, but I guess it is
>>> generally not beneficial for performance. Therefore, I believe this should
>>> be done for functions with the Os or Oz attributes only.
>>
>> Just curious, why? Moves from a register must be faster than moves from
>> memory.
>
> Yes, but those are moves from an immediate, which do not require memory at
> all.

On the other hand, it loads the memory bus.

> My performance concerns are:
> - Register pressure, like Rafael mentioned.
> - Additional scheduling dependencies.
> Going back to your example:
> This yields two independent chains of computation that can be scheduled
> independently. Moreover, you need just one register to realize this
> sequence.
>    mov $0, 0x4(%esi)
>    mov $0, 0x8(%esi)
> The two sequences of computations have now to wait for the first mov
> immediate. Moreover, this sequence requires 2 registers.
>    mov $0, %eax
>    mov %eax, 0x4(%esi)
>    mov %eax, 0x8(%esi)

Looks reasonable, thank you for the explanation.
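
To put rough numbers on the code-size side of this trade-off, here is a small sketch in Python. The byte counts are assumptions based on 32-bit x86 encodings with an 8-bit displacement off a base register such as %esi (7 bytes to store a 32-bit immediate, 3 bytes to store from a register, 5 bytes to load a 32-bit immediate into a register). The function names are made up for illustration and are not part of the pass.

```python
# Assumed IA-32 encoding sizes in bytes, base register + disp8 addressing.
STORE_IMM32 = 7  # mov $imm32, disp8(%reg): opcode + ModRM + disp8 + imm32
STORE_REG = 3    # mov %reg, disp8(%reg):   opcode + ModRM + disp8
LOAD_IMM32 = 5   # mov $imm32, %reg:        opcode + imm32


def size_with_immediates(n_stores):
    """Bytes needed when every store carries its own immediate."""
    return n_stores * STORE_IMM32


def size_with_register(n_stores):
    """Bytes needed when the constant is materialized once in a register."""
    return LOAD_IMM32 + n_stores * STORE_REG


def bytes_saved(n_stores):
    """Positive when the register form is the smaller of the two."""
    return size_with_immediates(n_stores) - size_with_register(n_stores)


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, size_with_immediates(n), size_with_register(n), bytes_saved(n))
```

Under these assumptions the two-store example above comes to 14 bytes with immediates versus 11 bytes with a register, while a single store is one byte larger in the register form. The sketch only counts bytes; it says nothing about register pressure or scheduling.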

>> Both gcc and icc use moves from a register when compiling with optimization.
>
> Sure. What I am saying is that, generally speaking, trading an
> immediate-to-register copy against a register-to-register copy does not
> sound beneficial to me.
> Apart from code size improvements, what kind of improvements are you
> seeing?

The main goal was code size. In fact, the main interest is optimizing
memset; the problem is described in

> Also, how big are those improvements?

Compiling the PHP distribution with and without this pass shows a size
reduction of about 0.3%.

