[llvm-commits] [llvm] r144267 - in /llvm/trunk: lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp test/CodeGen/X86/2006-05-11-InstrSched.ll test/CodeGen/X86/2009-04-21-NoReloadImpDef.ll test/CodeGen/X86/change-compare-stride-1.ll test/CodeGen/X86/fold-pcmpeqd-0.ll test/CodeGen/X86/iv-users-in-other-loops.ll test/CodeGen/X86/lsr-loop-exit-cond.ll test/CodeGen/X86/lsr-reuse-trunc.ll test/CodeGen/X86/masked-iv-safe.ll test/CodeGen/X86/multiple-loop-post-inc.ll test/CodeGen/X86/sse2.ll test/CodeGen/X86/sse3.ll

Evan Cheng evan.cheng at apple.com
Thu Nov 10 22:53:54 PST 2011


Ah, I've heard about this issue. I guess there is no way around it for GCC-based compilers. I still don't see why this particular patch would expose it, though. Just luck?

Evan

On Nov 10, 2011, at 7:59 PM, Duncan Sands wrote:

> Hi Evan,
> 
>> That's really bizarre. It's obviously exposing an unrelated bug. Thanks for looking into this.
> 
> I think this is a case of PR11200: the code generators make decisions based on
> floating point computations.  On i386, GCC uses the floating point stack, while
> LLVM uses xmm registers, when building LLVM.  This results in codegen sometimes
> making different decisions due to different rounding.
> 
> I've disabled the part of the dragonegg bootstrap that compares object files
> produced by GCC-compiled LLVM with those produced by LLVM-compiled LLVM as a
> workaround.
> 
> Ciao, Duncan.
> 
>> 
>> Evan
>> 
>> On Nov 10, 2011, at 3:13 AM, Duncan Sands wrote:
>> 
>>> Hi Evan, this also broke the buildbot
>>> 
>>> http://lab.llvm.org:8011/builders/dragonegg-i386-linux/builds/365
>>> 
>>> I'm trying to work out what is wrong.
>>> 
>>> Ciao, Duncan.
>>> 
>>> On 10/11/11 08:43, Evan Cheng wrote:
>>>> Author: evancheng
>>>> Date: Thu Nov 10 01:43:16 2011
>>>> New Revision: 144267
>>>> 
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=144267&view=rev
>>>> Log:
>>>> Use a bigger hammer to fix PR11314 by disabling the "forcing two-address
>>>> instruction lower optimization" in the pre-RA scheduler.
>>>> 
>>>> The optimization, or rather the hack, was done before the MI use-list was
>>>> available. Now we should be able to implement it in a better way, perhaps in
>>>> the two-address pass until an MI scheduler is available.
>>>> 
>>>> Now that the scheduler has to backtrack to handle call sequences, adding
>>>> artificial scheduling constraints is just not safe. Furthermore, the hack
>>>> does not take all the other scheduling decisions into consideration, so it's
>>>> just as likely to pessimize code. So I view disabling this optimization as
>>>> goodness regardless of PR11314.
>>>> 
>>>> Modified:
>>>>     llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
>>>>     llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll
>>>>     llvm/trunk/test/CodeGen/X86/2009-04-21-NoReloadImpDef.ll
>>>>     llvm/trunk/test/CodeGen/X86/change-compare-stride-1.ll
>>>>     llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-0.ll
>>>>     llvm/trunk/test/CodeGen/X86/iv-users-in-other-loops.ll
>>>>     llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll
>>>>     llvm/trunk/test/CodeGen/X86/lsr-reuse-trunc.ll
>>>>     llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll
>>>>     llvm/trunk/test/CodeGen/X86/multiple-loop-post-inc.ll
>>>>     llvm/trunk/test/CodeGen/X86/sse2.ll
>>>>     llvm/trunk/test/CodeGen/X86/sse3.ll
>>>> 
>>>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp (original)
>>>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp Thu Nov 10 01:43:16 2011
>>>> @@ -89,6 +89,9 @@
>>>>  static cl::opt<bool> DisableSchedHeight(
>>>>    "disable-sched-height", cl::Hidden, cl::init(false),
>>>>    cl::desc("Disable scheduled-height priority in sched=list-ilp"));
>>>> +static cl::opt<bool> Disable2AddrHack(
>>>> +  "disable-2addr-hack", cl::Hidden, cl::init(true),
>>>> +  cl::desc("Disable scheduler's two-address hack"));
>>>> 
>>>>  static cl::opt<int> MaxReorderWindow(
>>>>    "max-sched-reorder", cl::Hidden, cl::init(6),
>>>> @@ -2628,7 +2631,8 @@
>>>>  void RegReductionPQBase::initNodes(std::vector<SUnit> &sunits) {
>>>>    SUnits = &sunits;
>>>>    // Add pseudo dependency edges for two-address nodes.
>>>> -  AddPseudoTwoAddrDeps();
>>>> +  if (!Disable2AddrHack)
>>>> +    AddPseudoTwoAddrDeps();
>>>>    // Reroute edges to nodes with multiple uses.
>>>>    if (!TracksRegPressure)
>>>>      PrescheduleNodesWithMultipleUses();
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll Thu Nov 10 01:43:16 2011
>>>> @@ -1,5 +1,5 @@
>>>>  ; RUN: llc < %s -march=x86 -mtriple=i386-linux-gnu -mattr=+sse2 -stats -realign-stack=0 |&\
>>>> -; RUN:     grep {asm-printer} | grep 34
>>>> +; RUN:     grep {asm-printer} | grep 35
>>>> 
>>>>  target datalayout = "e-p:32:32"
>>>>  define void @foo(i32* %mc, i32* %bp, i32* %ms, i32* %xmb, i32* %mpp, i32* %tpmm, i32* %ip, i32* %tpim, i32* %dpp, i32* %tpdm, i32* %bpi, i32 %M) nounwind {
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/2009-04-21-NoReloadImpDef.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2009-04-21-NoReloadImpDef.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/2009-04-21-NoReloadImpDef.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/2009-04-21-NoReloadImpDef.ll Thu Nov 10 01:43:16 2011
>>>> @@ -5,7 +5,6 @@
>>>> 
>>>>  ; CHECK: pextrw $14
>>>>  ; CHECK-NEXT: shrl $8
>>>> -; CHECK-NEXT: (%ebp)
>>>>  ; CHECK-NEXT: pinsrw
>>>> 
>>>>  define void @update(i8** %args_list) nounwind {
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/change-compare-stride-1.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/change-compare-stride-1.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/change-compare-stride-1.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/change-compare-stride-1.ll Thu Nov 10 01:43:16 2011
>>>> @@ -3,6 +3,10 @@
>>>>  ; Nested LSR is required to optimize this case.
>>>>  ; We do not expect to see this form of IR without -enable-iv-rewrite.
>>>> 
>>>> +; xfailed for now because the scheduler two-address hack has been disabled.
>>>> +; Now it's generating a leal -1 rather than a decq.
>>>> +; XFAIL: *
>>>> +
>>>>  define void @borf(i8* nocapture %in, i8* nocapture %out) nounwind {
>>>>  ; CHECK: borf:
>>>>  ; CHECK-NOT: inc
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-0.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-0.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-0.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-0.ll Thu Nov 10 01:43:16 2011
>>>> @@ -1,5 +1,7 @@
>>>> -; RUN: llc < %s -mtriple=i386-apple-darwin -mcpu=yonah -regalloc=linearscan | FileCheck --check-prefix=I386 %s
>>>>  ; RUN: llc < %s -mtriple=x86_64-apple-darwin | FileCheck --check-prefix=X86-64 %s
>>>> +; DISABLED: llc < %s -mtriple=i386-apple-darwin -mcpu=yonah -regalloc=linearscan | FileCheck --check-prefix=I386 %s
>>>> +
>>>> +; i386 test has been disabled when scheduler 2-addr hack is disabled.
>>>> 
>>>>  ; This testcase shouldn't need to spill the -1 value,
>>>>  ; so it should just use pcmpeqd to materialize an all-ones vector.
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/iv-users-in-other-loops.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/iv-users-in-other-loops.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/iv-users-in-other-loops.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/iv-users-in-other-loops.ll Thu Nov 10 01:43:16 2011
>>>> @@ -1,9 +1,8 @@
>>>>  ; RUN: llc < %s -march=x86-64 -enable-lsr-nested -o %t
>>>>  ; RUN: not grep inc %t
>>>>  ; RUN: grep dec %t | count 2
>>>> -; RUN: grep addq %t | count 12
>>>> +; RUN: grep addq %t | count 10
>>>>  ; RUN: not grep addb %t
>>>> -; RUN: not grep leaq %t
>>>>  ; RUN: not grep leal %t
>>>>  ; RUN: not grep movq %t
>>>> 
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll Thu Nov 10 01:43:16 2011
>>>> @@ -1,6 +1,7 @@
>>>>  ; RUN: llc -march=x86-64 < %s | FileCheck %s
>>>> 
>>>>  ; CHECK: decq
>>>> +; CHECK-NEXT: movl (
>>>>  ; CHECK-NEXT: jne
>>>> 
>>>>  @Te0 = external global [256 x i32]		; <[256 x i32]*> [#uses=5]
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/lsr-reuse-trunc.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lsr-reuse-trunc.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/lsr-reuse-trunc.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/lsr-reuse-trunc.ll Thu Nov 10 01:43:16 2011
>>>> @@ -4,13 +4,14 @@
>>>>  ; Full strength reduction wouldn't reduce register pressure, so LSR should
>>>>  ; stick with indexing here.
>>>> 
>>>> +; FIXME: This is worse off from disabling of scheduler 2-address hack.
>>>>  ; CHECK: movaps        (%{{rsi|rdx}},%rax,4), [[X3:%xmm[0-9]+]]
>>>> +; CHECK: leaq  4(%rax), %{{rcx|r9}}
>>>>  ; CHECK: cvtdq2ps
>>>>  ; CHECK: orps          {{%xmm[0-9]+}}, [[X4:%xmm[0-9]+]]
>>>>  ; CHECK: movaps        [[X4]], (%{{rdi|rcx}},%rax,4)
>>>> -; CHECK: addq  $4, %rax
>>>> -; CHECK: cmpl  %eax, (%{{rdx|r8}})
>>>> -; CHECK-NEXT: jg
>>>> +; CHECK: cmpl  %{{ecx|r9d}}, (%{{rdx|r8}})
>>>> +; CHECK: jg
>>>> 
>>>>  define void @vvfloorf(float* nocapture %y, float* nocapture %x, i32* nocapture %n) nounwind {
>>>>  entry:
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll Thu Nov 10 01:43:16 2011
>>>> @@ -3,10 +3,10 @@
>>>>  ; RUN: not grep movz %t
>>>>  ; RUN: not grep sar %t
>>>>  ; RUN: not grep shl %t
>>>> -; RUN: grep add %t | count 2
>>>> +; RUN: grep add %t | count 1
>>>>  ; RUN: grep inc %t | count 4
>>>>  ; RUN: grep dec %t | count 2
>>>> -; RUN: grep lea %t | count 2
>>>> +; RUN: grep lea %t | count 3
>>>> 
>>>>  ; Optimize away zext-inreg and sext-inreg on the loop induction
>>>>  ; variable using trip-count information.
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/multiple-loop-post-inc.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/multiple-loop-post-inc.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/multiple-loop-post-inc.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/multiple-loop-post-inc.ll Thu Nov 10 01:43:16 2011
>>>> @@ -1,6 +1,10 @@
>>>>  ; RUN: llc -asm-verbose=false -disable-branch-fold -disable-code-place -disable-tail-duplicate -march=x86-64 < %s | FileCheck %s
>>>>  ; rdar://7236213
>>>> 
>>>> +; Xfailed now that scheduler 2-address hack is disabled a lea is generated.
>>>> +; The code isn't any worse though.
>>>> +; XFAIL: *
>>>> +
>>>>  ; CodeGen shouldn't require any lea instructions inside the marked loop.
>>>>  ; It should properly set up post-increment uses and do coalescing for
>>>>  ; the induction variables.
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/sse2.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse2.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/sse2.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/sse2.ll Thu Nov 10 01:43:16 2011
>>>> @@ -178,8 +178,8 @@
>>>>          %tmp27 = shufflevector <4 x float> %tmp9, <4 x float> %tmp21, <4 x i32> < i32 0, i32 1, i32 4, i32 5 >                ; <<4 x float>> [#uses=1]
>>>>          ret <4 x float> %tmp27
>>>>  ; CHECK: test14:
>>>> -; CHECK: 	addps	[[X1:%xmm[0-9]+]], [[X0:%xmm[0-9]+]]
>>>> -; CHECK: 	subps	[[X1]], [[X2:%xmm[0-9]+]]
>>>> +; CHECK: 	subps	[[X1:%xmm[0-9]+]], [[X2:%xmm[0-9]+]]
>>>> +; CHECK: 	addps	[[X1]], [[X0:%xmm[0-9]+]]
>>>>  ; CHECK: 	movlhps	[[X2]], [[X0]]
>>>>  }
>>>> 
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/X86/sse3.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse3.ll?rev=144267&r1=144266&r2=144267&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/X86/sse3.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/X86/sse3.ll Thu Nov 10 01:43:16 2011
>>>> @@ -226,15 +226,16 @@
>>>>  }
>>>> 
>>>> 
>>>> -
>>>> +; FIXME: t15 is worse off from disabling of scheduler 2-address hack.
>>>>  define <8 x i16> @t15(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone {
>>>>  entry:
>>>>          %tmp8 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 undef, i32 undef, i32 7, i32 2, i32 8, i32 undef, i32 undef , i32 undef >
>>>>          ret <8 x i16> %tmp8
>>>>  ; X64: 	t15:
>>>> -; X64: 		pextrw	$7, %xmm0, %eax
>>>> +; X64:          movdqa %xmm0, %xmm2
>>>>  ; X64: 		punpcklqdq	%xmm1, %xmm0
>>>>  ; X64: 		pshuflw	$-128, %xmm0, %xmm0
>>>> +; X64: 		pextrw	$7, %xmm2, %eax
>>>>  ; X64: 		pinsrw	$2, %eax, %xmm0
>>>>  ; X64: 		ret
>>>>  }
>>>> @@ -247,12 +248,12 @@
>>>>          %tmp9 = shufflevector <16 x i8> %tmp8, <16 x i8> %T0, <16 x i32> < i32 0, i32 1, i32 2, i32 17,  i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef , i32 undef >
>>>>          ret <16 x i8> %tmp9
>>>>  ; X64: 	t16:
>>>> -; X64: 		movdqa	%xmm1, %xmm0
>>>> -; X64: 		pslldq	$2, %xmm0
>>>> -; X64: 		pextrw	$1, %xmm0, %eax
>>>> -; X64: 		movd	%xmm0, %ecx
>>>> -; X64: 		pinsrw	$0, %ecx, %xmm0
>>>> -; X64: 		pextrw	$8, %xmm1, %ecx
>>>> +; X64: 		movdqa	%xmm1, %xmm2
>>>> +; X64: 		pslldq	$2, %xmm2
>>>> +; X64: 		movd	%xmm2, %eax
>>>> +; X64: 		pinsrw	$0, %eax, %xmm0
>>>> +; X64: 		pextrw	$8, %xmm1, %eax
>>>> +; X64: 		pextrw	$1, %xmm2, %ecx
>>>>  ; X64: 		ret
>>>>  }
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>> 
> 