[cfe-dev] Union Types
Pranav Bhandarkar
pranavb at codeaurora.org
Mon Apr 23 17:51:46 PDT 2012
Hi,
Consider the following simple testcase that uses a union.
****************************************************
typedef union {
  int w[2];
  long long int d;
} x; // 64 bits.

extern void bar(long long int a);

void foo(int *a) {
  x u;
  u.w[0] = *a;
  u.w[1] = *(a + 1);
  bar(u.d);
}
******************************************************
On Hexagon, the optimal code would be (pseudo-code):
************
r0 = load *a;
r1 = load *(a+1);
bar(); // u.d is passed in the register pair r1:r0
************
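To make the intent concrete, here is a rough C sketch of the value that should reach bar(): the two words combined directly, with no read-modify-write of a 64-bit temporary. The helper name pack_pair is hypothetical and not part of the testcase, and the sketch assumes a little-endian layout, so that w[0] is the low half (r0) and w[1] is the high half (r1).

```c
#include <stdint.h>

/* Hypothetical helper, not from the testcase: builds the 64-bit
   argument directly from two 32-bit words.
   Assumption: little-endian layout, so a[0] is the low half (r0)
   and a[1] is the high half (r1). */
uint64_t pack_pair(const uint32_t *a) {
    uint64_t lo = a[0];      /* first load  */
    uint64_t hi = a[1];      /* second load */
    return (hi << 32) | lo;  /* register pair r1:r0 viewed as one i64 */
}
```

On a target with 64-bit register pairs, the shift and the or should need no instructions at all, since the two halves already sit in the right registers.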
This can be done by the Scalar Replacement of Aggregates pass. However, when
I looked at that pass, I found that it processes the
following alloca:
*****
%u = alloca %union.x, align 8
*****
Where
%union.x = type { i64 }
So, scalar replacement converts this into a scalar of type i64 and, for the
two loads, generates two zero-extensions from 32 bits to 64 bits, which
defeats the optimization. The following is what scalar replacement
generates for u.w[0] = *a;
***********
%u = alloca i64
%0 = load i32* %a, align 4, !tbaa !0
%.in3 = load i64* %u
%1 = zext i32 %0 to i64
%mask4 = and i64 %.in3, -4294967296
%ins5 = or i64 %mask4, %1
store i64 %ins5, i64* %u
************
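For readers less used to the IR, the sequence above is roughly equivalent to the following C. This is only a sketch to show the read-modify-write that gets introduced; the function name insert_low32 is hypothetical, and the constant -4294967296 is written as its unsigned 64-bit equivalent.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the IR above: keep the high 32 bits
   of the old 64-bit value, then OR in the zero-extended new word. */
uint64_t insert_low32(uint64_t old, uint32_t word) {
    uint64_t widened = (uint64_t)word;            /* zext i32 to i64       */
    uint64_t kept = old & 0xFFFFFFFF00000000ULL;  /* and i64, -4294967296  */
    return kept | widened;                        /* or i64                */
}
```

Each store to a word of the union therefore turns into a load, a zext, a mask, an or and a store on the whole i64.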
IMHO if the type %union.x were "type { [2 x i32] }", these unnecessary "zext"s
would not be needed. So my question is: what is the best way to fix
this problem? Is the union type layout in clang a problem? IMHO, this
layout elides the information that there are two components at offsets 0 and
4 inside the union.
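The two components are easy to see from the C side. The small sketch below uses fixed-width types (an assumption, so that the sizes do not depend on the target's int and long long) and a hypothetical helper offset_of_word to report where each word lives inside the union.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as the union in the testcase, but with fixed-width types
   (an assumption made here so the layout is the same on every target). */
typedef union {
    int32_t w[2];
    int64_t d;
} x_fixed;

/* Hypothetical helper: byte offset of w[i] from the start of the union. */
size_t offset_of_word(size_t i) {
    x_fixed u;
    return (size_t)((const char *)&u.w[i] - (const char *)&u);
}
```

Here offset_of_word(0) is 0 and offset_of_word(1) is 4, which is exactly the information that "type { i64 }" no longer carries.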
TIA,
Pranav
Qualcomm Innovation Center, (QuIC) is a member of the Code Aurora Forum.