<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - Xcode 8 possible incorrect output"

   href="https://bugs.llvm.org/show_bug.cgi?id=33640">33640</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Xcode 8 possible incorrect output

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>clang

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>unspecified

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>Macintosh

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>MacOS X

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>C++

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedclangbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>andreww@blackmagicdesign.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>dgregor@apple.com, llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=18731" name="attach_18731" title="tar of code used to generate llvm-ir & asm">attachment 18731</a> <a href="attachment.cgi?id=18731&action=edit" title="tar of code used to generate llvm-ir & asm">[details]</a></span>

tar of code used to generate llvm-ir & asm

I realise this is not strictly clang's problem, it's more of an Xcode problem

as it appears to only affect Xcode 7.3.1 and greater.

I tried reproducing this with clang 3.3 through 4.0 and it appears to work

there :-)

I'm really asking for advice on whether what Xcode's clang is doing is legal,

as I can't determine from the ABI spec whether this is OK.

What I am seeing is that for x86_64 asm, certain loads of adjacent 32-bit

values from a struct, in preparation for a function call, result in the 64-bit

registers not being truncated to 32 bits as the LLVM IR is instructing them to

be.

For instance, the simple test program I have attached generates the following

LLVM IR:

; Function Attrs: norecurse optsize ssp uwtable

define i32 @main() local_unnamed_addr #0 {

  %1 = tail call i8* @malloc(i64 32) #3

  %2 = bitcast i8* %1 to <2 x i8*>*

  store <2 x i8*> <i8* inttoptr (i64 3735928559 to i8*), i8* inttoptr (i64

3735928559 to i8*)>, <2 x i8*>* %2, align 8, !tbaa !2

  %3 = getelementptr inbounds i8, i8* %1, i64 16

  %4 = getelementptr inbounds i8, i8* %1, i64 24

  %5 = bitcast i8* %3 to <4 x i32>*

  store <4 x i32> <i32 170, i32 187, i32 204, i32 221>, <4 x i32>* %5, align 8,

!tbaa !6

  %6 = bitcast i8* %3 to i64*

  %7 = load i64, i64* %6, align 8

  %8 = trunc i64 %7 to i32

  %9 = lshr i64 %7, 32

  %10 = trunc i64 %9 to i32

  %11 = bitcast i8* %4 to i64*

  %12 = load i64, i64* %11, align 8

  %13 = trunc i64 %12 to i32

  %14 = lshr i64 %12, 32

  %15 = trunc i64 %14 to i32

  tail call void @_Z4bar1PaS_iiiii(i8* inttoptr (i64 3735928559 to i8*), i8*

inttoptr (i64 3735928559 to i8*), i32 %8, i32 %10, i32 %13, i32 %15, i32 %15)

#3

  ret i32 0

}

IIUC %8, %10, %13, & %15 should result in registers containing only the 32-bit

values from my struct.
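
To spell out my reading of the IR, here is a minimal C++ sketch of what each
load / trunc / lshr / trunc group computes. It is not part of the attachment:
the Fields struct and the split helper are made-up names for illustration, and
it assumes a little-endian target such as x86_64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the four adjacent 32-bit struct members.
struct Fields { std::uint32_t a, b, c, d; };

// The two 32-bit values recovered from one 64-bit load.
struct Pair { std::uint32_t lo, hi; };

// Mirrors the IR pattern: load i64, trunc to i32 (low half),
// then lshr by 32 and trunc to i32 (high half).
inline Pair split(const Fields& f, std::size_t offset) {
    std::uint64_t wide;
    std::memcpy(&wide, reinterpret_cast<const unsigned char*>(&f) + offset,
                sizeof wide);
    return Pair{static_cast<std::uint32_t>(wide),
                static_cast<std::uint32_t>(wide >> 32)};
}
```

With the members set to 170, 187, 204, 221, split(f, 0) yields 170 and 187 and
split(f, 8) yields 204 and 221, which is what I expect the callee to receive.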

The generated assembly tells a different story:

_main:

0000000000000000        pushq   %rbp

0000000000000001        movq    %rsp, %rbp

0000000000000004        subq    $0x10, %rsp

0000000000000008        movl    $0x20, %edi

000000000000000d        callq   0x12

0000000000000012        movaps  0x47(%rip), %xmm0

0000000000000019        movups  %xmm0, (%rax)

000000000000001c        movaps  0x4d(%rip), %xmm0

0000000000000023        movups  %xmm0, 0x10(%rax)

0000000000000027        movq    0x10(%rax), %rdx

000000000000002b        movq    0x18(%rax), %r8

000000000000002f        movq    %rdx, %rcx

0000000000000032        shrq    $0x20, %rcx

0000000000000036        movq    %r8, %r9

0000000000000039        shrq    $0x20, %r9

000000000000003d        movl    %r9d, (%rsp)

0000000000000041        movl    $0xdeadbeef, %edi

0000000000000046        movl    $0xdeadbeef, %esi

000000000000004b        callq   0x50

0000000000000050        xorl    %eax, %eax

0000000000000052        addq    $0x10, %rsp

0000000000000056        popq    %rbp

0000000000000057        retq

Registers r8 & rdx are each left containing two 32-bit values when the function

call is made.  r8 still holds in its high 32 bits the value that was shifted

into r9, and rdx still holds in its high 32 bits the value shifted into rcx.

This is causing some wacky results in a called library which I can't change.

It may have been compiled with Intel's compiler, and it does not load %r8d

when accessing the 32-bit parameter as specified by its function signature;

it accesses %r8.
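
To illustrate the mismatch, here is a small C++ model of the two ways a callee
could read such a register. This is only a sketch: pack, read_alias32 and
read_full64 are hypothetical helpers of mine, not anything from the library or
the attachment, and the uint64_t value merely stands in for a register like r8.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit register whose upper half was never cleared:
// the intended 32-bit argument sits in the low half, a stale value in the high half.
constexpr std::uint64_t pack(std::uint32_t lo, std::uint32_t hi) {
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// What a callee sees if it reads only the 32-bit alias (e.g. %r8d).
constexpr std::uint32_t read_alias32(std::uint64_t reg) {
    return static_cast<std::uint32_t>(reg);
}

// What a callee sees if it reads the whole register (e.g. %r8).
constexpr std::uint64_t read_full64(std::uint64_t reg) {
    return reg;
}
```

For pack(204, 221), read_alias32 gives 204, while read_full64 gives
0x000000DD000000CC, which would explain the results I see in the library.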

So, who is in the wrong here?  Should the caller be emitting clean 32-bit

registers over the x86_64 ABI or should the callee be accessing only the 32-bit

register aliases for those parameters specified as 32-bits?</pre>

        </div>

      </p>


    </body>

</html>