[cfe-dev] Spurious register spill with volatile function argument
Michael Clark via cfe-dev
cfe-dev at lists.llvm.org
Sat Mar 26 22:57:55 PDT 2016
Seems I had misused volatile. I removed ‘volatile’ from the function argument on test_0 and it prevented the spill through the stack.
I added volatile because I was trying to stop the compiler optimising away the call to test_0 (as it has no side effects), but it turned out that volatile was unnecessary and was a misuse of the qualifier, which is intended to indicate that storage may change outside the control of the compiler. However, it is an interesting case, as register arguments don’t have storage.
GCC, Clang folk, any ideas on why there is a stack spill for a volatile register argument passed in esi? Does volatile force the argument to have storage allocated on the stack? Is this a corner case in the C standard? In the x86_64 calling convention this argument only has a register, so technically it can’t change outside the control of the C “virtual machine”, and volatile has a vague meaning here.
This seems to be a case of interpreting the C standard in such a way as to make sure that a volatile argument “can be changed” outside the control of the C “virtual machine” by explicitly giving it a storage location on the stack. I think volatile scalar arguments are a special case, and that the volatile qualifier shouldn’t widen the scope beyond the register unless the argument actually *needs* storage to spill. This is not a volatile stack-scoped variable unless the C standard interprets ABI register parameters as actually having ‘storage’, so this is questionable. Maybe I should have gotten a warning, or the volatile type qualifier on a scalar register argument should have been ignored.
volatile for scalar function arguments seems to mean: “make this volatile and subject to change outside of the compiler” rather than being a qualifier for its storage (which is a register).
# gcc
test_0:
        mov     DWORD PTR [rsp-4], esi
        mov     ecx, DWORD PTR [rsp-4]
        mov     eax, edi
        cdq
        idiv    ecx
        mov     eax, edx
        ret
# clang
test_0:
        mov     dword ptr [rsp - 4], esi
        xor     edx, edx
        mov     eax, edi
        div     dword ptr [rsp - 4]
        mov     eax, edx
        ret
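For the original goal of keeping the call to test_0 alive, one option that does not involve a volatile parameter is to store the result into a volatile object, since that store is an observable side effect. The following is only a minimal sketch under that assumption; the names mod_plain, result_sink and keep_call_alive are hypothetical and do not appear in the test program. The noinline attribute on its own may not be enough, as a call to a side-effect-free function whose result is unused can still be dropped.

```c
/* Hypothetical names for illustration; not part of the original test program. */

/* A store to a volatile object is an observable side effect, so the
 * value stored into it has to be computed. */
static volatile int result_sink;

/* Same body as test_0, but with an ordinary (non-volatile) parameter,
 * so the compiler is free to keep p in its argument register. */
__attribute__((noinline))
int mod_plain(unsigned int k, int p)
{
    return k % p;
}

/* Consuming the result through the volatile sink keeps the computation
 * live without asking for storage for the parameter itself. */
int keep_call_alive(void)
{
    result_sink = mod_plain(1, 8191);
    return result_sink;
}
```

Whether the spill actually disappears still depends on the compiler and target, so the generated assembly should be checked with the same flags as above.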
/* Test program compiled on x86_64 with: cc -O3 -fomit-frame-pointer -masm=intel -S test.c -o test.S */

#include <stdio.h>
#include <limits.h>

static const int p = 8191;
static const int s = 13;

int __attribute__ ((noinline)) test_0(unsigned int k, volatile int p)
{
    return k % p;
}

int __attribute__ ((noinline)) test_1(unsigned int k)
{
    return k % p;
}

int __attribute__ ((noinline)) test_2(unsigned int k)
{
    int i = (k&p) + (k>>s);
    i = (i&p) + (i>>s);
    if (i>=p) i -= p;
    return i;
}

int main()
{
    test_0(1, 8191); /* control */
    for (int i = INT_MIN; i < INT_MAX; i++) {
        int r1 = test_1(i), r2 = test_2(i);
        if (r1 != r2) printf("%d %d %d\n", i, r1, r2);
    }
}
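The reduction in test_2 relies on 2^13 being congruent to 1 modulo 8191, so a value written as hi * 2^13 + lo is congruent to hi + lo. For a 32-bit input, two folds bring the value down to at most 8255, and one conditional subtraction finishes the job. The sketch below restates this with unsigned types and a sampled self-check so that it runs quickly; mod_m13 and check_mod_m13 are hypothetical names, and the sampling stride is an arbitrary choice rather than anything from the test program.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: k modulo the Mersenne prime 2^13 - 1 = 8191,
 * using the identity 2^13 == 1 (mod 8191). */
static uint32_t mod_m13(uint32_t k)
{
    const uint32_t m = 8191u;
    uint32_t i = (k & m) + (k >> 13);   /* first fold: at most 8191 + 524287 */
    i = (i & m) + (i >> 13);            /* second fold: at most 8255 */
    if (i >= m)
        i -= m;                         /* one subtraction is enough here */
    return i;
}

/* Sampled comparison against the % operator. The stride 65521 is an
 * arbitrary assumption that keeps the loop to roughly 65,000 iterations. */
static void check_mod_m13(void)
{
    for (uint64_t k = 0; k <= UINT32_MAX; k += 65521u)
        assert(mod_m13((uint32_t)k) == (uint32_t)k % 8191u);
    assert(mod_m13(UINT32_MAX) == UINT32_MAX % 8191u);
}
```

Calling check_mod_m13() from a small driver gives a quick sanity check; the exhaustive loop in main above remains the complete comparison of test_1 against test_2.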
> On 27 Mar 2016, at 2:32 PM, Andrew Waterman <andrew at sifive.com> wrote:
>
> It would be good to figure out how to get rid of the spurious register spills.
>
> The strength reduction optimization isn't always profitable on Rocket,
> as it increases instruction count and code size. The divider has an
> early out and for small numbers is quite fast.
>
> On Fri, Mar 25, 2016 at 5:43 PM, Michael Clark <michaeljclark at mac.com> wrote:
>> Bear in mind that I have no idea how many cycles an integer divide takes on the Rocket, so the optimisation may not be a win.
>>
>> Trying to read MuDiv in multiplier.scala, and will at some point run some timings in the cycle-accurate simulator.
>>
>> In either case, the spurious stack moves emitted by GCC are curious...
>>
>>> On 26 Mar 2016, at 9:42 AM, Michael Clark <michaeljclark at mac.com> wrote:
>>>
>>> Hi All,
>>>
>>> I have found an interesting case where an optimisation is not being applied by GCC on RISC-V. And also some strange assembly output from GCC on RISC-V.
>>>
>>> Both GCC and Clang appear to optimise division by a constant Mersenne prime on x86_64; however, GCC on RISC-V is not applying this optimisation.
>>>
>>> See test program and assembly output for these platforms:
>>>
>>> * GCC -O3 on RISC-V
>>> * GCC -O3 on x86_64
>>> * LLVM/Clang -O3 on x86_64
>>>
>>> Another strange observation is that GCC on RISC-V is moving a1 to a5 via a stack store followed by a stack load. Odd? GCC 5 also seems to be doing odd stuff with stack ‘moves’ on x86_64, moving esi to ecx via the stack (I think recent x86 micro-architectures treat the tip of the stack like an extended register file, so this may only have a small penalty on x86).
>>>
>>> See GCC on RISC-V is emitting this:
>>>
>>> test_0:
>>> add sp,sp,-16
>>> sw a1,12(sp)
>>> lw a5,12(sp)
>>> add sp,sp,16
>>> remuw a0,a0,a5
>>> jr ra
>>>
>>> instead of this:
>>>
>>> test_0:
>>> remuw a0,a0,a1
>>> jr ra
>>>
>>> Compiler devs, please read the test program and assembly output below. I have not yet tested LLVM/Clang on RISC-V; I will do that next. I have not had time to dig into compiler code yet...
>>>
>>> Regards,
>>> Michael.
>>>
>>>
>>> /* Test program */
>>>
>>> #include <stdio.h>
>>> #include <limits.h>
>>>
>>> static const int p = 8191;
>>> static const int s = 13;
>>>
>>> int __attribute__ ((noinline)) test_0(unsigned int k, volatile int p)
>>> {
>>> return k % p;
>>> }
>>>
>>> int __attribute__ ((noinline)) test_1(unsigned int k)
>>> {
>>> return k % p;
>>> }
>>>
>>> int __attribute__ ((noinline)) test_2(unsigned int k)
>>> {
>>> int i = (k&p) + (k>>s);
>>> i = (i&p) + (i>>s);
>>> if (i>=p) i -= p;
>>> return i;
>>> }
>>>
>>> int main()
>>> {
>>> test_0(1, 8191); /* control */
>>> for (int i = INT_MIN; i < INT_MAX; i++) {
>>> int r1 = test_1(i), r2 = test_2(i);
>>> if (r1 != r2) printf("%d %d %d\n", i, r1, r2);
>>> }
>>> }
>>>
>>>
>>>
>>> /* RISC-V GCC */
>>>
>>> $ riscv64-unknown-elf-gcc --version
>>> riscv64-unknown-elf-gcc (GCC) 5.2.0
>>>
>>> test_0:
>>> add sp,sp,-16
>>> sw a1,12(sp)
>>> lw a5,12(sp)
>>> add sp,sp,16
>>> remuw a0,a0,a5
>>> jr ra
>>> test_1:
>>> li a5,8192
>>> addw a5,a5,-1
>>> remuw a0,a0,a5
>>> ret
>>> test_2:
>>> li a3,8192
>>> addw a2,a3,-1
>>> and a4,a0,a2
>>> srlw a0,a0,13
>>> addw a5,a4,a0
>>> and a0,a5,a2
>>> sraw a5,a5,13
>>> addw a0,a0,a5
>>> addw a3,a3,-2
>>> ble a0,a3,.L5
>>> subw a0,a0,a2
>>> .L5:
>>> ret
>>>
>>>
>>> /* Linux x86_64 GCC */
>>>
>>> $ gcc --version
>>> gcc (Debian 5.2.1-23) 5.2.1 20151028
>>>
>>> test_0:
>>> mov DWORD PTR [rsp-4], esi
>>> mov ecx, DWORD PTR [rsp-4]
>>> mov eax, edi
>>> cdq
>>> idiv ecx
>>> mov eax, edx
>>> ret
>>> test_1:
>>> mov eax, edi
>>> mov rcx, rax
>>> mov rdx, rax
>>> sal rcx, 6
>>> sal rdx, 19
>>> add rdx, rcx
>>> add rax, rdx
>>> mov edx, edi
>>> shr rax, 32
>>> sub edx, eax
>>> shr edx
>>> add eax, edx
>>> shr eax, 12
>>> mov edx, eax
>>> sal edx, 13
>>> sub edx, eax
>>> sub edi, edx
>>> mov eax, edi
>>> ret
>>> test_2:
>>> mov eax, edi
>>> shr edi, 13
>>> and eax, 8191
>>> add eax, edi
>>> mov edx, eax
>>> sar eax, 13
>>> and edx, 8191
>>> add eax, edx
>>> lea edx, [rax-8191]
>>> cmp eax, 8191
>>> cmovge eax, edx
>>> ret
>>>
>>>
>>> /* Darwin x86_64 LLVM Clang */
>>>
>>> $ cc --version
>>> Apple LLVM version 7.3.0 (clang-703.0.29)
>>>
>>> _test_0:
>>> mov dword ptr [rsp - 4], esi
>>> xor edx, edx
>>> mov eax, edi
>>> div dword ptr [rsp - 4]
>>> mov eax, edx
>>> ret
>>> _test_1:
>>> mov eax, edi
>>> imul rax, rax, 524353
>>> shr rax, 32
>>> mov ecx, edi
>>> sub ecx, eax
>>> shr ecx
>>> add ecx, eax
>>> shr ecx, 12
>>> imul eax, ecx, 8191
>>> sub edi, eax
>>> mov eax, edi
>>> ret
>>> _test_2:
>>> mov eax, edi
>>> and eax, 8191
>>> mov ecx, edi
>>> shr ecx, 13
>>> add eax, ecx
>>> add ecx, edi
>>> and ecx, 8191
>>> shr eax, 13
>>> lea edx, [rcx + rax]
>>> cmp edx, 8190
>>> lea eax, [rcx + rax - 8191]
>>> cmovbe eax, edx
>>> ret
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+unsubscribe at groups.riscv.org.
>>> To post to this group, send email to sw-dev at groups.riscv.org.
>>> Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
>>> To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/2600D96D-94BC-4259-9D39-DE4993859281%40mac.com.