[PATCH] D22898: AMDGPU: Fix ffloor for SI
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 18 11:20:12 PDT 2016
arsenm added a comment.
In https://reviews.llvm.org/D22898#518537, @mareko wrote:
> Yeah I know about the CI bug, but it's not important for OpenGL.
>
> 0x3ff0000000000000 is 1.0.
> 0x3fefffffffffffff isn't 1.0. It's the largest number smaller than 1.0, which is 0.9999999999999........... it can also be written as: bitcast(1.0, i32) - 1
> If you print 0x3fefffffffffffff with 6 decimal places, it will be rounded to 1.0 for the purpose of printing. I guess that's where the confusion comes from.
>
> I don't know why 1.0 passes OpenCL conformance and bitcast(1.0, i32) - 1 doesn't. I suggest you check what the closed compiler does in this case.
Closed OpenCL on the AMDIL path uses a library expansion for floor, and doesn't try to use any of these instructions.
It implements floor with a library expansion using bitops and doesn't attempt to use fract (though I don't think this is intentional). The custom floor lowering that I found was dead, and which is deleted in this patch, also passes conformance.
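
For anyone who wants to check the two bit patterns quoted above, here is a minimal sketch in Python. It is only an illustration, not code from the patch: the helper names (bits_to_double, double_to_bits) are made up for this example, and it uses a 64-bit integer view because both constants are double-precision patterns.

```python
import struct


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE-754 double (hypothetical helper)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def double_to_bits(value: float) -> int:
    """Reinterpret an IEEE-754 double as a 64-bit integer (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


ONE = bits_to_double(0x3FF0000000000000)        # exactly 1.0
BELOW_ONE = bits_to_double(0x3FEFFFFFFFFFFFFF)  # largest double below 1.0

if __name__ == "__main__":
    # The integer view of 1.0, minus one, gives the second pattern.
    print(hex(double_to_bits(1.0) - 1))
    # Six decimal places hide the difference; 17 significant digits do not.
    print("%.6f" % BELOW_ONE)
    print("%.17g" % BELOW_ONE)
    print(BELOW_ONE < ONE)
```

Printing with "%.6f" rounds the second value up to 1.000000, which matches the explanation in the quoted comment, while the 17-digit form shows that the value is strictly less than 1.0.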
https://reviews.llvm.org/D22898