[LLVMbugs] [Bug 23305] New: Support __fp16 vectors
bugzilla-daemon at llvm.org
Tue Apr 21 15:22:49 PDT 2015
https://llvm.org/bugs/show_bug.cgi?id=23305
Bug ID: 23305
Summary: Support __fp16 vectors
Product: clang
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: -New Bugs
Assignee: unassignedclangbugs at nondot.org
Reporter: ahmed.bougacha at gmail.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
__fp16 is a storage-only type, and there are two CodeGen variants:
- soften to i16, promote using llvm.convert.to/from.fp16 (e.g., X86)
- when LangOptions::NativeHalfType or HalfArgsAndReturns is set, use the LLVM
  "half" type, promote using fpext/fptrunc (e.g., AArch64)
In both cases, we don't do the right thing for vectors.
On X86, this:
typedef __fp16 __attribute__((__ext_vector_type__(4))) v4f16;
void foo(v4f16 *a, v4f16 *b, v4f16 *c) {
  *c = *a + *b;
}
generates the very broken:
%3 = add <4 x i16> %1, %2
This is because Sema::UsualUnaryConversions doesn't apply to VectorTypes (see
Sema::CheckVectorOperands), so we never try to promote to v4f32 (as we would
promote __fp16 to f32).
Even if we decide to reject that code and never do the implicit promotion, the
alternative is also broken:
typedef __fp16 __attribute__((__ext_vector_type__(4))) v4f16;
typedef float __attribute__((__ext_vector_type__(4))) v4f32;
void foo(v4f16 *a, v4f16 *b, v4f16 *c) {
  *c = __builtin_convertvector(*a, v4f32);
}
Generates:
%2 = uitofp <4 x i16> %1 to <4 x float>
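To see why both the integer add and the uitofp above are wrong, here is a minimal sketch in portable C that treats a half value as its raw 16-bit pattern. The function names (half_bits_to_float, wrong_uitofp, wrong_int_add) are hypothetical and only for illustration; they are not clang or LLVM APIs. The decoder approximates what an fpext from half to float would produce, while the other two mimic what the softened i16 IR actually computes.

```c
#include <stdint.h>
#include <string.h>

/* Decode an IEEE 754 binary16 bit pattern into a float. This is a sketch of
 * the value an fpext from half to float is expected to yield. */
static float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t frac = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0x1Fu) {            /* infinity or NaN */
        bits = sign | 0x7F800000u | (frac << 13);
    } else if (exp != 0) {         /* normal: rebias exponent from 15 to 127 */
        bits = sign | ((exp + 112u) << 23) | (frac << 13);
    } else if (frac == 0) {        /* signed zero */
        bits = sign;
    } else {                       /* subnormal: renormalize the fraction */
        uint32_t e = 113u;
        while ((frac & 0x400u) == 0) { frac <<= 1; e--; }
        bits = sign | (e << 23) | ((frac & 0x3FFu) << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* What "uitofp i16 to float" computes: the bit pattern read as an integer. */
static float wrong_uitofp(uint16_t h) {
    return (float)h;
}

/* What "add i16" computes: an integer sum of the two bit patterns. */
static uint16_t wrong_int_add(uint16_t a, uint16_t b) {
    return (uint16_t)(a + b);
}
```

For example, half 1.0 has the bit pattern 0x3C00. Decoding it gives 1.0, but wrong_uitofp(0x3C00) gives 15360.0, and wrong_int_add(0x3C00, 0x3C00) gives 0x7800, which decodes to 32768.0 rather than 2.0.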
Even when "half" is used instead of i16 (AArch64, or after we migrate away from
the convert intrinsics), we generate IR without the promotion:
%3 = fadd <4 x half> %1, %2
This relies on the backend to do the promotion.
However, this has slightly different semantics, because LLVM works at the
instruction level, and clang at the expression level. Consider:
void foo(v4f16 *a, v4f16 *b, v4f16 *c) {
  *c = (*a + *b) + *c;
}
Doing the promotion in clang means the intermediate result is a v4f32. Doing
it in LLVM means the intermediate result is truncated back to v4f16, before
being extended again to v4f32.
This can give different results, and it's probably best to mirror the scalar
clang behavior of promoting entire expressions.
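The difference between the two promotion strategies can be sketched for a single lane in portable C. This is only an illustration under stated assumptions: round_to_half is a hypothetical helper that emulates an fptrunc to half followed by an fpext back to float, and it assumes finite inputs in the normal __fp16 range (no subnormal or overflow handling). The two promote_* functions are likewise hypothetical names, not anything clang or LLVM provides.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round a float to the nearest value representable in IEEE 754 binary16
 * (round-to-nearest-even) and return it as a float.
 * Assumption: x is zero or has a magnitude in the normal half range. */
static float round_to_half(float x) {
    assert(x == 0.0f ||
           (x >= 6.103515625e-05f && x <= 65504.0f) ||
           (x <= -6.103515625e-05f && x >= -65504.0f));
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    /* Drop the low 13 fraction bits, rounding ties to even. */
    uint32_t lsb = (bits >> 13) & 1u;
    bits += 0x0FFFu + lsb;
    bits &= ~(uint32_t)0x1FFFu;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Expression-level promotion: evaluate (a + b) + c entirely in float and
 * truncate once, when the result is stored. */
static float promote_whole_expression(float a, float b, float c) {
    return round_to_half((a + b) + c);
}

/* Instruction-level promotion: each add is promoted on its own, so the
 * intermediate sum is truncated to half before the second add. */
static float promote_per_instruction(float a, float b, float c) {
    float tmp = round_to_half(a + b);
    return round_to_half(tmp + c);
}
```

With a = 2048, b = 1, c = 1 (all exactly representable in half), the whole-expression version yields 2050, while the per-instruction version yields 2048: the intermediate 2049 is a tie that rounds to 2048, and the same happens again after adding c.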