[LLVMbugs] [Bug 13438] New: X86 slow instruction selector incorrectly folds FS- and GS-relative loads

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Mon Jul 23 12:52:05 PDT 2012


             Bug #: 13438
           Summary: X86 slow instruction selector incorrectly folds FS-
                    and GS-relative loads
           Product: new-bugs
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: csdavec at swan.ac.uk
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

The following IR generates the correct asm with the fast instruction selector,
but incorrect asm with the slow instruction selector:

%struct.thread = type { i32, i32, i32, i32 }

define i32 @test() nounwind uwtable {
  %1 = load volatile %struct.thread* addrspace(256)* null
  %2 = getelementptr inbounds %struct.thread* %1, i64 0, i32 2
  %3 = load i32* %2, align 4, !tbaa !3
  ret i32 %3
}

I have verified that this is not modified during optimisation with llc
-print-after-all.  With llc -O0, the following code is produced:

    movq    %gs:0, %rax
    movl    8(%rax), %eax

Note that the gs-relative load occurs first (as in the IR) and then the
resulting pointer is used in arithmetic.  In contrast, the following is
generated at -O1:

    movq    %gs:0, %rax
    movl    %gs:8, %eax

Ignoring the fact that this generates a redundant load, the second load is
incorrect.  This would only be a valid transform if the base of the GS segment
is at linear address 0 (in which case there is no point using gs-relative
addressing anyway). 
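
To make the difference between the two sequences concrete, here is a minimal
sketch that models them on a toy flat memory.  This is only an illustration,
not LLVM code: the bytearray, the helper names, and the GS_BASE and THREAD
values are all hypothetical.

```python
# Toy model only: a flat bytearray stands in for the linear address space.
mem = bytearray(256)

def load(addr, size):
    """Little-endian unsigned load of `size` bytes at linear address `addr`."""
    return int.from_bytes(mem[addr:addr + size], "little")

def store(addr, size, value):
    """Little-endian store of `value` as `size` bytes at linear address `addr`."""
    mem[addr:addr + size] = value.to_bytes(size, "little")

def load_as_written(gs_base):
    """The -O0 sequence: fetch the pointer via GS, then do an ordinary load."""
    thread = load(gs_base + 0, 8)   # movq %gs:0, %rax
    return load(thread + 8, 4)      # movl 8(%rax), %eax

def load_as_folded(gs_base):
    """The -O1 sequence: the second load is made GS-relative as well."""
    return load(gs_base + 8, 4)     # movl %gs:8, %eax

GS_BASE = 64    # hypothetical base of the GS segment
THREAD = 128    # hypothetical linear address of the %struct.thread object

store(GS_BASE, 8, THREAD)        # gs:0 holds the pointer to the thread struct
store(THREAD + 8, 4, 0xAAAA)     # field 2 of the struct (byte offset 8)
store(GS_BASE + 8, 4, 0xBBBB)    # unrelated data that happens to live at gs:8

print(hex(load_as_written(GS_BASE)), hex(load_as_folded(GS_BASE)))
```

In this model the first sequence reads field 2 of the structure that gs:0
points to, while the folded sequence reads whatever is stored 8 bytes past the
segment base, which is unrelated to the structure.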

Substitute address space 257 for 256 and exactly the same bug occurs with
FS-relative addressing.

