[LLVMdev] TableGen syntax for matching a constant load

Sat Feb 26 18:12:00 PST 2011

On Feb 26, 2011, at 4:50 PM, Joerg Sonnenberger wrote:

> On Sun, Feb 27, 2011 at 01:29:25AM +0100, Joerg Sonnenberger wrote:
>> +let Predicates = [OptForSize] in {
>> +def : Pat<(store (i32 0), addr:$dst), (AND32mi8 addr:$dst, 0)>;
>> +def : Pat<(store (i64 -1), addr:$dst), (OR64mi8 addr:$dst, -1)>;
>> +}
> 
> All these patterns have one important downside. They are suboptimal if
> more than one store happens in a row. E.g. the 0 store is better
> expressed as xor followed by two register moves, if a register is
> available... This is most noticeable when memset() gets inlined.

Note that LLVM's -Os option does not mean quite the same thing as GCC's -Os flag.
It disables optimizations that increase code size without a clear performance gain.
It does not try to minimize code size at any cost.

When you said you weren't concerned about performance, I assumed you wouldn't be submitting patches. Sorry about the confusion.

Implementing constant stores as load+bitop+store is almost certainly not worth the small size win.

As for materializing (i32 -1) in 3 bytes instead of 5, but with 2 µ-ops instead of 1, I would like to see some performance numbers first. It might be cheap enough that it is worth it.

The MOV32ri instruction can be rematerialized and is considered to be as cheap as a move. That is not true for xorl+decl. Unfortunately, the register allocator currently doesn't know how to rematerialize multiple instructions, which means that the register containing -1 could get spilled. We really don't want that to happen.

Until the register allocator learns how to rematerialize multiple instructions, you would need to use a pseudo-instruction representing the xorl+decl pair. That instruction could be marked as rematerializable and even as cheap as a move.
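
A rough, untested sketch of what such a pseudo-instruction might look like in the X86 .td files. The name MOV32ones is hypothetical, and the code that expands the pseudo into the real xorl+decl pair after register allocation is assumed to exist elsewhere and is not shown:

    // Hypothetical pseudo standing in for "xorl $dst, $dst; decl $dst".
    // In 32-bit mode that pair encodes in 2 + 1 = 3 bytes, versus 5 bytes
    // for "movl $-1, $dst" (one opcode byte plus a 32-bit immediate).
    // Unlike MOV32ri, the pair clobbers EFLAGS, so that has to be declared.
    let Defs = [EFLAGS], Predicates = [OptForSize],
        isReMaterializable = 1, isAsCheapAsAMove = 1 in
    def MOV32ones : I<0, Pseudo, (outs GR32:$dst), (ins), "",
                      [(set GR32:$dst, -1)]>;

Because the pseudo is a single instruction with no register inputs, the allocator could recreate the -1 at each use instead of spilling it. Whether isAsCheapAsAMove is justified for a 2 µ-op sequence is exactly what the measurements would have to show.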

Then you can start measuring the performance impact ;-)

Thanks,
/jakob