[llvm-dev] [RFC] llvm-mca: a static performance analysis tool

Sun Mar 4 12:51:10 PST 2018

> On Mar 2, 2018, at 9:30 AM, Andrew Trick <atrick at apple.com> wrote:
> 
> +Matthias
> 
>> On Mar 2, 2018, at 6:42 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>> 
>>> Known limitations on X86 processors
>>> -----------------------------------
>>> 
>>> 1) Partial register updates versus full register updates.
>>> <snip>
>> 
>> MachineOperand handles this. You just need to create the machine instrs.
>> 
>> Interesting. I couldn't find how to do it. It would be great if somebody could help me with this.
> 
> 
> I was thinking of APIs like MachineOperand::readsReg().
> 
> I guess if you’re only asking whether an instruction zeros the upper part of the register, that information *should* be available from MCInstr/MCRegisterInfo, but I’m not very familiar with the API.

I don’t think we have this information in an explicit form today:

- It’s usually not a correctness problem because we cannot really address the upper register parts independently on those targets.
- We work around some ISEL shortcomings via `SUBREG_TO_REG` (see TargetOpcode.def), which I consider a nasty hack: stating assumptions about the predecessor node violates the referential transparency that you would expect from SSA.
- Coalescing/regalloc is using `%vregX:sub32<undef> =` to represent this.

So today you are probably out of luck when coming from the MC side of things. I think adding an OperandFlag in MCInstrDesc would be a great idea and could be a first step towards retiring SUBREG_TO_REG.
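
To make the OperandFlag idea a bit more concrete, here is a rough, self-contained sketch. Nothing in it is existing LLVM API: the `sketch` namespace, the `ZeroesSuperReg` flag, and the `OperandInfo`/`InstrDesc` structs are hypothetical stand-ins for whatever would actually be added to the MC layer.

```cpp
#include <cstdint>
#include <vector>

namespace sketch {

// Hypothetical per-operand flag bits (assumption: not part of LLVM today).
enum OperandFlag : uint8_t {
  None = 0,
  // A def of this operand also zeroes the rest of its super-register.
  ZeroesSuperReg = 1 << 0,
};

// Hypothetical stand-in for per-operand information in an instruction
// descriptor.
struct OperandInfo {
  bool IsDef;
  uint8_t Flags;
};

// Hypothetical stand-in for an instruction descriptor.
struct InstrDesc {
  std::vector<OperandInfo> Operands;

  // True if operand OpIdx is a def that clobbers its whole super-register.
  bool defClobbersSuperReg(unsigned OpIdx) const {
    if (OpIdx >= Operands.size())
      return false;
    const OperandInfo &Op = Operands[OpIdx];
    return Op.IsDef && (Op.Flags & ZeroesSuperReg) != 0;
  }
};

} // namespace sketch
```

With something like this, a tool working purely on MC-level descriptors could ask the question per operand, without needing MachineInstrs.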

- Matthias

> 
> Matthias?
> 
> -Andy
> 
>> 1) Partial register updates versus full register updates.
>> 
>> On x86-64, a 32-bit GPR write fully updates the super-register. Example:
>>       add %edi, %eax   ## eax += edi
>> 
>> Here, register %eax aliases the lower half of 64-bit register %rax. On x86-64,
>> register %rax is fully updated by the 'add' (the upper half of %rax is zeroed).
>> Essentially, it "kills" any previous definition of (the upper half of) register
>> %rax.
>> 
>> On the other hand, 8/16 bit register writes only perform a so-called "partial
>> register update". Example:
>>       add %di, %ax     ## ax += di
>> 
>> Here, register %eax is only partially updated. To be more specific, the lower
>> half of %eax is set, and the upper half is left unchanged. There is also no
>> change in the upper 48 bits of register %rax.
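
The difference between the two kinds of write can be modelled in a few lines. This is only an illustrative sketch of the architectural semantics described above; `write32`, `write16`, and `write8` are hypothetical helper names, not LLVM or llvm-mca functions.

```cpp
#include <cstdint>

// Hypothetical helpers modelling x86-64 GPR write semantics.

// A 32-bit write zero-extends into the 64-bit register, so the old value
// (including its upper half) is discarded entirely.
constexpr uint64_t write32(uint64_t /*OldValue*/, uint32_t Value) {
  return static_cast<uint64_t>(Value);
}

// A 16-bit write merges into the old value: bits 63..16 are preserved.
constexpr uint64_t write16(uint64_t OldValue, uint16_t Value) {
  return (OldValue & ~UINT64_C(0xFFFF)) | Value;
}

// An 8-bit write to the low byte (e.g. %al) preserves bits 63..8.
constexpr uint64_t write8(uint64_t OldValue, uint8_t Value) {
  return (OldValue & ~UINT64_C(0xFF)) | Value;
}
```

Note that `write16` and `write8` take the old register value as a real input, whereas `write32` ignores it. That input is exactly the extra dependency on the previous definition that a partial register update carries.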
>> 
>> To get accurate performance analysis, the tool has to know which instructions
>> perform a partial register update, and which instructions fully update the
>> destination's super-register.
>> 
>> One way to expose this information is (again) via tablegen.  For example, we
>> could add a flag in the tablegen instruction class to tag instructions that
>> perform partial register updates. Something like this:
>> `bit hasPartialRegisterUpdate = 1`. However, this would force a `let
>> hasPartialRegisterUpdate = 0` on several instruction definitions.
>> 
>> Another approach is to have a MCSubtargetInfo hook similar to this:
>>     virtual bool updatesSuperRegisters(unsigned short opcode) { return false; }
>> 
>> Targets will be able to override this method if needed.  Again, this is just an
>> idea. But the plan is to have this fixed as a future development.
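
For illustration, the hook approach could look roughly like the sketch below. It is self-contained and does not use the real MCSubtargetInfo: the `sketch` namespace, the `SubtargetInfo` and `X86LikeSubtargetInfo` classes, and the opcode values are all hypothetical, and the `const` qualifier on the hook is my assumption rather than part of the proposal.

```cpp
namespace sketch {

// Hypothetical opcode numbers; real targets get these from tablegen.
enum Opcode : unsigned short { ADD16rr = 1, ADD32rr = 2, ADD64rr = 3 };

// Hypothetical stand-in for the subtarget interface.
class SubtargetInfo {
public:
  virtual ~SubtargetInfo() = default;

  // Default: assume a write never updates the rest of the super-register.
  virtual bool updatesSuperRegisters(unsigned short /*Opc*/) const {
    return false;
  }
};

// Hypothetical x86-like target: 32-bit GPR writes zero the upper half of
// the 64-bit super-register, so they count as full updates.
class X86LikeSubtargetInfo : public SubtargetInfo {
public:
  bool updatesSuperRegisters(unsigned short Opc) const override {
    switch (Opc) {
    case ADD32rr:
      return true;
    default:
      return false;
    }
  }
};

} // namespace sketch
```

The default stays conservative for targets that do not override it, which avoids touching every instruction definition the way a tablegen bit might.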
