[llvm-bugs] [Bug 40736] New: incompatible return of small struct in 32-bit PowerPC BSD
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Feb 14 18:38:10 PST 2019
https://bugs.llvm.org/show_bug.cgi?id=40736
Bug ID: 40736
Summary: incompatible return of small struct in 32-bit PowerPC BSD
Product: libraries
Version: 7.0
Hardware: Macintosh
OS: OpenBSD
Status: NEW
Severity: normal
Priority: P
Component: Backend: PowerPC
Assignee: unassignedbugs at nondot.org
Reporter: kernigh at gmail.com
CC: llvm-bugs at lists.llvm.org, nemanja.i.ibm at gmail.com
clang 7.0.1 and gcc use incompatible conventions to return a small struct (of
up to 8 bytes) on my PowerBook G4 running OpenBSD/macppc, where gcc is the main
compiler and clang is a recent arrival. This causes my qt5 built with clang to
crash when it calls my libxcb built with gcc. Functions like
xcb_intern_atom() return a cookie as a 4-byte struct containing an unsigned
int.
llvm/lib/Target/PowerPC provides the RetCC_PPC convention, but I don't see
where it handles struct returns. llvm and clang return every struct in memory,
with the caller passing in r3 a pointer to the return area. gcc in OpenBSD
returns small structs (of up to 8 bytes) in registers r3 and r4. gcc in
NetBSD/macppc seems to behave like gcc in OpenBSD. I don't know what happens
in FreeBSD.
There are 2 versions of the ELF ABI (called "SVR4" in llvm):
- System V ABI: PowerPC Processor Supplement (1995)
- Power Architecture 32-bit ABI Supplement 1.0 (2011)
The 2011 supplement added features like secure PLT and thread-local storage.
To find it, search for Power-Arch-32-bit-ABI-supp-1.0-Unified.pdf
The later ABI from 2011, in section 3.2.5 Return Values, says:
> ATR-LINUX: Aggregates or unions of any length will be returned in a storage
> buffer allocated by the caller. The caller will pass the address of this
> buffer as a hidden first argument in r3, causing the first explicit argument
> to be passed in r4. This hidden argument is treated as a normal formal
> parameter, and corresponds to the first doubleword of the parameter save
> area.
>
> ATR-EABI: Aggregates or unions whose size is less than or equal to eight
> bytes shall be returned in r3 and r4, as if they were first stored in memory
> area and then the low-addressed word were loaded in r3 and the
> high-addressed word were loaded into r4. Bits beyond the last member of the
> structure or union are not defined.
Larger structs in ATR-EABI get returned in memory, as in ATR-LINUX. llvm and
clang follow ATR-LINUX for all sizes. gcc in OpenBSD is almost like ATR-EABI,
but (on my big-endian PowerPC) it puts the undefined bytes before the first
member, not "beyond the last member of the structure or union".
I compiled this code with -S in clang 7.0.1 and in gcc:
struct s1 { char c; };
struct s4 { char c[4]; };
struct s7 { char c[7]; };
struct s8 { char c[8]; };
struct s9 { char c[9]; };
struct sd { double d; };
struct s1 ret1(struct s1 *s) { return *s; }
struct s4 ret4(struct s4 *s) { return *s; }
struct s7 ret7(struct s7 *s) { return *s; }
struct s8 ret8(struct s8 *s) { return *s; }
struct s9 ret9(struct s9 *s) { return *s; }
struct sd retd(struct sd *s) { return *s; }
In clang, all these functions get the return area in r3, and the parameter s in
r4, so they copy the correct number of bytes from where r4 points to where r3
points. In gcc, all but ret9() get the parameter s in r3, and load bytes from
where r3 points into r3 itself, or into r3 and r4. Only ret9() in gcc uses the
return area like clang does.
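To illustrate why mixing the two conventions breaks, here is a rough sketch in C of what each compiler effectively does for ret4(). The names ret4_in_memory and ret4_in_register are hypothetical, and the mapping of C parameters to r3 and r4 is only described in the comments; this is a model of the behaviour reported above, not output from either compiler.

```c
#include <stdint.h>
#include <string.h>

struct s4 { char c[4]; };

/* Model of clang 7.0.1: the caller passes a hidden pointer to the
 * return area as the first argument (r3), so s arrives in r4. */
void ret4_in_memory(struct s4 *ret_area, const struct s4 *s)
{
    memcpy(ret_area, s, sizeof *s);
}

/* Model of gcc on OpenBSD: s arrives in r3, and the 4 bytes of the
 * struct come back in r3 itself, like "lwz %r3, 0(%r3)". */
uint32_t ret4_in_register(const struct s4 *s)
{
    uint32_t r3;
    memcpy(&r3, s, sizeof r3);
    return r3;
}
```

If a caller built for the first convention calls a function built for the second, the callee treats the return-area pointer in r3 as s, and the caller later reads a return area that was never written.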
I now show instructions from
$ egcc -O2 -fno-stack-protector -S retxam.c
This is gcc 8.2.0 with OpenBSD's patches. Older versions of gcc in NetBSD and
OpenBSD emit different instructions but seem to get the same result. I added my
own comments and changed bare register numbers like 3 to %r3.
ret1 in gcc:
lbz %r3, 0(%r3) # r3 = 0.0.0.c
ret4 in gcc:
lwz %r3, 0(%r3) # r3 = c0.c1.c2.c3
ret7 in gcc:
lhz %r8, 4(%r3) # r8 = 0.0.c4.c5
lwz %r7, 0(%r3) # r7 = c0.c1.c2.c3
rlwinm %r6, %r8, 8, 8, 15 # r6 = 0.c4.0.0
lbz %r4, 6(%r3) # r4 = 0.0.0.c6
slwi %r10, %r7, 24 # r10 = c3.0.0.0
rlwinm %r8, %r8, 8, 16, 23 # r8 = 0.0.c5.0
or %r9, %r10, %r6 # r9 = c3.c4.0.0
srwi %r3, %r7, 8 # r3 = 0.c0.c1.c2
or %r9, %r9, %r8 # r9 = c3.c4.c5.0
or %r4, %r9, %r4 # r4 = c3.c4.c5.c6
ret7 in gcc looks overly long. I would try to avoid the rotations by using
misaligned loads: lwz %r4, 3(%r3) and lhz *, 1(%r3)
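As a sanity check on the comments above, here is a small C model of the ret7 sequence. The helper functions only mimic lbz, lhz, lwz and rlwinm on a byte array in big-endian order; their C names and signatures are hypothetical. The second function is my guess at how the misaligned-load idea could be finished (combining a byte load with the halfword load); that combining step is an assumption and is not taken from any compiler output.

```c
#include <stdint.h>

/* Big-endian loads, modelling lbz/lhz/lwz on a byte array. */
static uint32_t lbz(const unsigned char *p) { return p[0]; }
static uint32_t lhz(const unsigned char *p)
{
    return ((uint32_t)p[0] << 8) | (uint32_t)p[1];
}
static uint32_t lwz(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* rlwinm: rotate left by sh, then keep IBM bits mb..me (bit 0 is the
 * most significant bit). Only 0 < sh < 32 and mb <= me are modelled. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    uint32_t rot = (rs << sh) | (rs >> (32 - sh));
    uint32_t mask = (0xFFFFFFFFu >> mb) & (0xFFFFFFFFu << (31 - me));
    return rot & mask;
}

/* Step-by-step model of the gcc sequence for ret7. */
void ret7_gcc(const unsigned char *s, uint32_t *r3_out, uint32_t *r4_out)
{
    uint32_t r8 = lhz(s + 4);            /* 0.0.c4.c5   */
    uint32_t r7 = lwz(s);                /* c0.c1.c2.c3 */
    uint32_t r6 = rlwinm(r8, 8, 8, 15);  /* 0.c4.0.0    */
    uint32_t r4 = lbz(s + 6);            /* 0.0.0.c6    */
    uint32_t r10 = r7 << 24;             /* c3.0.0.0    */
    uint32_t r9;
    r8 = rlwinm(r8, 8, 16, 23);          /* 0.0.c5.0    */
    r9 = r10 | r6;                       /* c3.c4.0.0   */
    *r3_out = r7 >> 8;                   /* 0.c0.c1.c2  */
    r9 |= r8;                            /* c3.c4.c5.0  */
    *r4_out = r9 | r4;                   /* c3.c4.c5.c6 */
}

/* Assumed shorter variant using loads at offsets 3 and 1. */
void ret7_misaligned(const unsigned char *s, uint32_t *r3_out, uint32_t *r4_out)
{
    *r4_out = lwz(s + 3);                    /* c3.c4.c5.c6 */
    *r3_out = (lbz(s) << 16) | lhz(s + 1);   /* 0.c0.c1.c2  */
}
```

Both functions should produce the same r3 and r4 for any 7-byte input, which is the point of the suggestion.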
ret8 in gcc:
lwz %r4, 4(%r3) # r4 = c4.c5.c6.c7
lwz %r3, 0(%r3) # r3 = c0.c1.c2.c3
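Taken together, the ret1, ret4, ret7 and ret8 listings suggest that gcc right-justifies the struct's bytes in r3 (for sizes up to 4) or in the r3:r4 pair (for sizes 5 to 8), while the ATR-EABI wording left-justifies them. The following C sketch models both packings for comparison. The names regpair, pack_eabi and pack_gcc_openbsd are hypothetical, undefined bytes are modelled as zero, and the gcc rule for sizes I did not test (2, 3, 5, 6) is an extrapolation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the r3/r4 return register pair. */
struct regpair { uint32_t r3, r4; };

/* Big-endian load of one 32-bit word. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* ATR-EABI as quoted: store the struct at the start of a memory area,
 * then load the low-addressed word into r3 and the next into r4. */
struct regpair pack_eabi(const unsigned char *s, size_t n)
{
    unsigned char buf[8] = { 0 };
    struct regpair rp;
    size_t i;
    assert(n >= 1 && n <= 8);
    for (i = 0; i < n; i++)
        buf[i] = s[i];
    rp.r3 = be32(buf);
    rp.r4 = be32(buf + 4);
    return rp;
}

/* gcc on OpenBSD as observed: the padding comes before the first byte. */
struct regpair pack_gcc_openbsd(const unsigned char *s, size_t n)
{
    unsigned char buf[8] = { 0 };
    struct regpair rp;
    size_t width, i;
    assert(n >= 1 && n <= 8);
    width = (n <= 4) ? 4 : 8;
    for (i = 0; i < n; i++)
        buf[width - n + i] = s[i];
    rp.r3 = be32(buf);
    rp.r4 = (width == 8) ? be32(buf + 4) : 0;  /* r4 unused for n <= 4 */
    return rp;
}
```

For a 7-byte struct the two models disagree in every byte position, and they agree only when the size is exactly 4 or 8.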
ret9 in gcc:
lwz %r7, 0(%r4) # r7 = c0.c1.c2.c3
lwz %r8, 4(%r4) # r8 = c4.c5.c6.c7
lbz %r10, 8(%r4) # r10 = 0.0.0.c8
stw %r7, 0(%r3) # r3[0..3] = c0.c1.c2.c3
stw %r8, 4(%r3) # r3[4..7] = c4.c5.c6.c7
stb %r10, 8(%r3) # r3[8] = c8
retd in gcc is exactly like ret8: it puts the struct's 8-byte double in r3 and
r4, not in a floating-point register.
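For illustration, this C sketch shows which words would land in r3 and r4 for struct sd under that behaviour. The function name retd_words is hypothetical. It assumes an IEEE 754 binary64 double and a host where double and uint64_t share the same byte order, and it models the big-endian target, where the low-addressed word holds the sign and exponent.

```c
#include <stdint.h>
#include <string.h>

struct sd { double d; };

/* Split the double's bit pattern into the two words that a big-endian
 * "lwz %r3, 0(...)" and "lwz %r4, 4(...)" would load. */
void retd_words(const struct sd *s, uint32_t *r3, uint32_t *r4)
{
    uint64_t bits;
    memcpy(&bits, &s->d, sizeof bits);
    *r3 = (uint32_t)(bits >> 32);  /* low-addressed word on big-endian */
    *r4 = (uint32_t)bits;          /* high-addressed word              */
}
```

For d = 1.0 this model gives r3 = 0x3FF00000 and r4 = 0x00000000.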
Right now, I can compile clang from llvm-project.git master, but I can't run
it, so I don't know how it returns structs. I am able to run OpenBSD's package
of clang 7.0.1.