[cfe-dev] Converting float to int with FJCVTZS

Thu Mar 11 07:06:29 PST 2021

Hi, Craig!

Thanks for your reply. It seems I had indeed conflated some behavior here: x86_64 does not wrap around as such; it converts with a 64-bit cvttss2si and then truncates the 64-bit result to 32 bits.

Replacing fcvtzu with fjcvtzs does indeed produce the same result as x86_64 in this case. However, it does not generalize to other conversions from floating point to integer.
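For reference, here is a portable sketch of what FJCVTZS computes, based on a reading of Arm's description of the instruction rather than on anything stated in this thread: it takes a double, truncates toward zero, wraps the result modulo 2^32 into a signed 32-bit integer, and yields 0 for NaN and infinities. The function name fjcvtzs_like is hypothetical. Because the instruction only accepts a double and only produces a signed 32-bit result, it cannot directly stand in for wider conversions such as float to uint64_t.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: emulates the assumed FJCVTZS result in portable C++.
inline std::int32_t fjcvtzs_like(double value) {
  // NaN and infinities are assumed to convert to 0.
  if (!std::isfinite(value)) return 0;
  // Round toward zero, then reduce modulo 2^32. fmod is exact and the
  // result lies strictly between -2^32 and 2^32, so it fits in int64_t.
  const double wrapped = std::fmod(std::trunc(value), 4294967296.0);
  const auto bits = static_cast<std::uint32_t>(static_cast<std::int64_t>(wrapped));
  // uint32_t -> int32_t is modular from C++20 on and implementation-defined
  // before that; clang and gcc wrap in both cases.
  return static_cast<std::int32_t>(bits);
}
```

With this sketch, fjcvtzs_like(4294967808.0) yields 512, the same value the x86_64 build prints in the example from this message.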

So my proposed solution to always use fjcvtzs was not great. Might there be some other way to get similar behavior? I will try to use intrinsics to get the same behavior no matter the conversion, but the question remains whether this is possible with compiler flags instead of code changes.

For a motivating example, see https://godbolt.org/z/sjeE6M

#include <stdio.h>
#include <cstdint>

void cast(float value) {
  printf("uint32_t(%.2f) = %u\n", value, uint32_t(value));
}

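int main() {
  // 4294967808 = 2^32 + 512: exactly representable as a float, but out of
  // range for uint32_t, so the conversion in cast() is undefined behavior.
  cast(4294967808.);
}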

// output on x86_64:  uint32_t(4294967808.00) = 512
// output on aarch64: uint32_t(4294967808.00) = 4294967295

Replacing uint32_t(value) with __builtin_arm_jcvt(value) on aarch64 makes it behave like x86_64:

// output on aarch64: __builtin_arm_jcvt(4294967808.00) = 512

> On 10 Mar 2021, at 18:36, Craig Topper <craig.topper at gmail.com> wrote:
> 
> Oh I think I understand now. For float to unsigned int, x86-64 uses a 64-bit cvttss2si instruction and drops the upper 32 bits because there's no 32-bit unsigned conversion instruction without avx512.
> 
> So are you asking for AArch64 to also do a 64-bit conversion and truncate the result? Replacing a 32-bit fcvtzu with a 32-bit fjcvtzs wouldn't work would it?
> 
> ~Craig
> 
> 
> On Wed, Mar 10, 2021 at 9:26 AM Craig Topper <craig.topper at gmail.com> wrote:
> Hi Johannes,
> 
> I don't think cvttss2si wraps around. Instead it returns 0x80000000 for large values. "If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised. If this exception is masked, the indefinite integer value (80000000H or 80000000_00000000H if operand size is 64 bits) is returned."
> 
> Also isn't fcvtzu an unsigned conversion while cvttss2si and FJCVTZS are signed conversions? Am I missing something?
> 
> ~Craig
> 
> 
> On Wed, Mar 10, 2021 at 9:03 AM Johannes Hoff via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> Hi!
> 
> I'm working on a code base where a simulation needs to produce the exact same result on Aarch64 and x86_64 architectures.
> 
> This is indeed the case for the whole codebase, with one exception: Rounding floats to integers. Specifically, when we're in undefined behavior territory. In that case, you notice the difference between the emitted fcvtzu instruction on aarch64 (saturating cast) and cvttss2si on x86 (wrap-around).
> 
> Now, I know undefined behavior is not the main business of LLVM, but I wonder if it would be possible to ask it to emit FJCVTZS instead, which behaves like x86 outside of the integer range. Of course, this would be an opt-in flag.
> 
> What do you think? If it's not something that would be valuable for clang, do you have any pointers on how to patch it myself?
> 
> Of course, I can just use the compiler intrinsic __builtin_arm_jcvt to trigger this behavior, but then I need to be sure to catch all the places, and be sure that everyone on the team remembers to do the same in the future.
> 
> Thanks,
> Johannes
> 