[PATCH] D32520: Support __fp16 vectors
Akira Hatanaka via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Apr 25 22:51:49 PDT 2017
ahatanak created this revision.
Herald added subscribers: rengolin, aemerson.
Currently, clang miscompiles operations on __fp16 vectors.
For example, when the following code is compiled:
typedef __fp16 half4 __attribute__ ((vector_size (8)));
half4 hv0, hv1, hv2;
void test() {
  hv0 = hv1 + hv2;
}
clang generates the following IR on ARM64:
%1 = load <4 x half>, <4 x half>* @hv1, align 8
%2 = load <4 x half>, <4 x half>* @hv2, align 8
%3 = fadd <4 x half> %1, %2
store <4 x half> %3, <4 x half>* @hv0, align 8
This is incorrect: __fp16 values in C or C++ expressions must be promoted to float when __fp16 is not a natively supported arithmetic type (see GCC's documentation).
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html
The IR is incorrect on X86 too. The addition is done on <4 x i16> vectors:
%1 = load <4 x i16>, <4 x i16>* @hv1, align 8
%2 = load <4 x i16>, <4 x i16>* @hv2, align 8
%3 = add <4 x i16> %2, %1
store <4 x i16> %3, <4 x i16>* @hv0, align 8
This patch makes the changes needed in Sema and IRGen to generate the correct IR on targets that set HalfArgsAndReturns to true but don't support __fp16 natively (ARM and ARM64). It inserts implicit casts to promote __fp16 vector operands to float vectors and truncate the result back to a __fp16 vector.
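To illustrate the intended semantics, here is a minimal, portable sketch that emulates both behaviours on raw IEEE binary16 bit patterns. It is not code from the patch: the helper names (half_bits_to_float, float_to_half_bits, add_half4_promoted, add_half4_as_int) are hypothetical, the lanes are stored as uint16_t rather than __fp16, and the converters are simplified (subnormals are flushed to zero, infinities and NaNs are not handled).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a binary16 bit pattern to float.
   Assumption: only zero and normal values; subnormals decode as zero. */
static float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits = sign;
    if (exp != 0)
        bits |= ((exp + 112u) << 23) | (man << 13); /* rebias 15 -> 127 */
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Encode a float as binary16 with round-to-nearest-even.
   Assumption: too-small results flush to zero, too-large ones become
   infinity, and NaN inputs are not handled. */
static uint16_t float_to_half_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint16_t sign = (uint16_t)((bits >> 16) & 0x8000u);
    uint32_t mag  = bits & 0x7FFFFFFFu;
    if (mag == 0)
        return sign;
    mag += 0x0FFFu + ((mag >> 13) & 1u);        /* round to nearest even */
    int32_t exp = (int32_t)(mag >> 23) - 112;   /* rebias 127 -> 15 */
    if (exp <= 0)
        return sign;                            /* flush to zero */
    if (exp >= 31)
        return (uint16_t)(sign | 0x7C00u);      /* overflow to infinity */
    return (uint16_t)(sign | ((uint32_t)exp << 10) | ((mag >> 13) & 0x3FFu));
}

/* Promote each lane to float, add, then truncate back to half. */
static void add_half4_promoted(const uint16_t a[4], const uint16_t b[4],
                               uint16_t out[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = float_to_half_bits(half_bits_to_float(a[i]) +
                                    half_bits_to_float(b[i]));
}

/* What an integer add on the raw bits (the <4 x i16> case) computes. */
static void add_half4_as_int(const uint16_t a[4], const uint16_t b[4],
                             uint16_t out[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = (uint16_t)(a[i] + b[i]);
}

/* Worked example: 1.0 + 1.0 in every lane (1.0 is 0x3C00 in binary16). */
static void example(void) {
    const uint16_t ones[4] = {0x3C00, 0x3C00, 0x3C00, 0x3C00};
    uint16_t r[4];
    add_half4_promoted(ones, ones, r);
    assert(r[0] == 0x4000);                     /* 2.0 */
    add_half4_as_int(ones, ones, r);
    assert(r[0] == 0x7800);                     /* decodes to 32768.0 */
}
```

In this emulation, adding the bit patterns as integers turns 1.0 + 1.0 into 32768.0, while the promote, add, truncate sequence yields 2.0, which is the result the implicit casts are meant to guarantee.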
I plan to fix X86 and other targets that don't set HalfArgsAndReturns to true in another patch.
https://reviews.llvm.org/D32520
Files:
include/clang/Sema/Sema.h
lib/CodeGen/CGExprScalar.cpp
lib/Sema/SemaExpr.cpp
test/CodeGen/fp16vec-ops.c
test/Sema/fp16vec-sema.c
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D32520.96664.patch
Type: text/x-patch
Size: 25009 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20170426/6dafa7a8/attachment-0001.bin>