[PATCH] Allow code generation of ARM usat/ssat instructions
weimingz at codeaurora.org
Mon Mar 16 18:15:35 PDT 2015
Hi apazos, mcrosier,
Currently, LLVM is unable to emit usat/ssat for code such as:
x = c > 255 ? 255 : (c < 0 ? 0 : c)
The corresponding IR is:
%cmp = icmp sgt i32 %c, 255
%cmp1 = icmp slt i32 %c, 0
%cond = select i1 %cmp1, i32 0, i32 %c
%cond5 = select i1 %cmp, i32 255, i32 %cond
ret i32 %cond5
The ARM backend currently compiles this to:
cmp r0, #0
mov r1, r0
movwlt r1, #0
cmp r0, #255
movwgt r1, #255
mov r0, r1
We expect only one instruction:
usat r0, #8, r0
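For reference, here is a minimal C sketch of why the single instruction is enough. The helper names clamp_u8 and usat_model are hypothetical and are not part of the patch; usat_model is only an assumed reference model of the unshifted form of usat, which saturates a signed 32-bit value to the unsigned range [0, 2^n - 1].

```c
#include <stdint.h>

/* The source-level pattern from the example: clamp c to [0, 255]. */
static int32_t clamp_u8(int32_t c) {
    return c > 255 ? 255 : (c < 0 ? 0 : c);
}

/* Hypothetical reference model of `usat rd, #n, rm` (no shift applied):
 * saturate a signed 32-bit value to [0, 2^n - 1]. Assumes 0 <= n <= 31. */
static int32_t usat_model(int32_t x, unsigned n) {
    int32_t max = (int32_t)((UINT32_C(1) << n) - 1);
    if (x < 0)
        return 0;      /* negative inputs saturate to the lower bound */
    if (x > max)
        return max;    /* large inputs saturate to 2^n - 1 */
    return x;          /* in-range inputs pass through unchanged */
}
```

Under this model, clamp_u8(c) and usat_model(c, 8) agree for every c, which is the equivalence the transformation relies on.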
This pass transforms compare-and-select sequences into the ARM usat/ssat saturating intrinsics. I implemented it as an IR-level transformation rather than a backend peephole because the pattern is easier to match at the IR level, and the pass could be shared by other targets that have similar instructions.
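To illustrate the kind of check such a transformation needs, here is a minimal sketch in C. It is not the code in the patch: classify_clamp, sat_kind, and the helper functions are hypothetical names, and the sketch assumes the pass has already recovered the constant clamp bounds [lo, hi] from the two compare/select pairs. It then decides whether those bounds match the unsigned range [0, 2^n - 1] of usat or the signed range [-2^(n-1), 2^(n-1) - 1] of ssat.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { SAT_NONE, SAT_USAT, SAT_SSAT } sat_kind;

/* True if v is a positive power of two. */
static bool is_pow2(int64_t v) {
    return v > 0 && (v & (v - 1)) == 0;
}

/* Base-2 logarithm of a value already known to be a power of two. */
static unsigned log2_exact(int64_t v) {
    unsigned k = 0;
    while (v > 1) {
        v >>= 1;
        ++k;
    }
    return k;
}

/* Hypothetical helper: classify the clamp range [lo, hi] of a 32-bit value.
 * On a match, *n receives the saturation immediate; otherwise *n is untouched. */
static sat_kind classify_clamp(int64_t lo, int64_t hi, unsigned *n) {
    int64_t span = hi + 1;  /* must be a power of two for either form */
    if (hi < 0 || hi > INT32_MAX || !is_pow2(span))
        return SAT_NONE;
    if (lo == 0) {          /* [0, 2^n - 1] matches usat #n */
        *n = log2_exact(span);
        return SAT_USAT;
    }
    if (lo == -span) {      /* [-2^(n-1), 2^(n-1) - 1] matches ssat #n */
        *n = log2_exact(span) + 1;
        return SAT_SSAT;
    }
    return SAT_NONE;
}
```

With these definitions, the bounds [0, 255] from the example classify as SAT_USAT with n = 8, and [-128, 127] classify as SAT_SSAT with n = 8; a range such as [0, 254] is rejected.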
Our testing shows a speedup of up to 4% on some benchmarks and no regressions.
Please help to review!