[llvm-commits] x86 branch sequence optimization in LLVM code gen: please review

Umansky, Victor victor.umansky at intel.com
Wed Dec 7 07:03:24 PST 2011


Hi Chad, Anton, Bruno,

Thank you for the suggestion.

Unfortunately, that won't work for the brcond.ll file.

I can indeed introduce different "check-prefix" values to separate the checks for the "core2" case from those for the "penryn" case.
However, every function in the file is still compiled unconditionally for both "RUN" lines. This inevitably leads to a test failure (in instruction selection) when a function that uses the "ptest" LLVM intrinsic is processed with the "-mcpu=core2" option.
That's why I was not able to add the test cases for the "ptest" intrinsic sequence to a file that is compiled for a pre-Penryn target.

A solution that does work is to run the existing brcond.ll LIT tests with "-mcpu=penryn".
I'm attaching the file.
Are you OK with this solution?

Best Regards,
    Victor

From: Chad Rosier [mailto:mcrosier at apple.com]
Sent: Tuesday, December 06, 2011 19:48
To: Umansky, Victor
Cc: Bruno Cardoso Lopes; llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] x86 branch sequence optimization in LLVM code gen: please review

Hi Victor,
You should be able to include the test in brcond.ll by specifying a new run line and using the -check-prefix option.

See: http://llvm.org/docs/TestingGuide.html#FileCheck

It would look something like this:

; RUN: llc < %s -mtriple=i386-apple-darwin10 -mcpu=penryn | FileCheck %s -check-prefix=FOO

declare i32 @llvm.x86.sse41.ptestz(<4 x float> %p1, <4 x float> %p2) nounwind

define <4 x float> @test1(<4 x float> %a, <4 x float> %b) nounwind {
entry:
; FOO: test1:
; FOO: ptest
; FOO-NEXT: je

etc..

 Chad

On Dec 6, 2011, at 12:52 AM, Umansky, Victor wrote:


Hi Bruno,

Thank you for the response.
I've changed the LIT test to follow the common style (attached).

Unfortunately, I cannot put it inside brcond.ll because the "ptest" instruction was introduced only with SSE4.1 (i.e. it requires "-mcpu=penryn"), while the current version of brcond.ll is processed with "-mcpu=core2".
Would replacing -mcpu in brcond.ll with "penryn" be backward compatible with regard to the LIT results?

Best Regards,
    Victor

From: Bruno Cardoso Lopes [mailto:bruno.cardoso at gmail.com]
Sent: Monday, December 05, 2011 19:13
To: Umansky, Victor
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] x86 branch sequence optimization in LLVM code gen: please review

Hi Victor,
On Mon, Dec 5, 2011 at 10:26 AM, Umansky, Victor <victor.umansky at intel.com> wrote:
Hi,

My name is Victor Umansky; I'm an engineer on the Intel OpenCL team.

The attached patch contains an optimization of ptest-conditioned branches.

That is, the following LLVM IR code

  %res = call i32 @llvm.x86.sse41.ptestz(<4 x float> %a, <4 x float> %a) nounwind
  %tmp = and i32 %res, 1
  %one = icmp eq i32 %tmp, 0
  br i1 %one, label %label1, label %label2


currently results in the following x86 machine code sequence:

    ptest   XMM3, XMM3
    sete    AL
    movzx   EAX, AL
    test    EAX, EAX
    jne     LBB18_26


which can be optimized to:

    ptest   XMM3, XMM3
    je      LBB18_26



The current machine code sequence stems from the need to reconcile the i32 return type of the ptestz intrinsic with the i1 condition type of the branch IR instruction.
Consequently, we can optimize it in the x86 codegen backend, where both the condition producer (ptest) and the consumer (jcc) use the same x86 EFLAGS register, so the intermediate conversions of the condition can be safely dropped.

The optimization is implemented as an x86 DAG combine (post-legalization stage) that recognizes the sequence and converts it to the minimal one.
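The equivalence of the two sequences can be checked with a small model of the flag logic. The sketch below is illustrative only and is not taken from the patch: it is written in Python, the function names are hypothetical, and XMM registers are modeled as 128-bit integers. It assumes the usual semantics that ptest sets ZF when the bitwise AND of its operands is zero.

```python
MASK128 = (1 << 128) - 1  # an XMM register holds 128 bits


def ptest_zf(dest: int, src: int) -> int:
    """ZF after `ptest dest, src`: 1 when (dest AND src) is all zero bits."""
    return 1 if (dest & src & MASK128) == 0 else 0


def long_sequence_branches(xmm: int) -> bool:
    """Model of ptest; sete; movzx; test; jne. True when the jump is taken."""
    zf = ptest_zf(xmm, xmm)      # ptest XMM3, XMM3
    al = zf                      # sete  AL        (AL = ZF)
    eax = al & 0xFF              # movzx EAX, AL   (zero-extend AL)
    zf = 1 if eax == 0 else 0    # test  EAX, EAX  (ZF = 1 iff EAX == 0)
    return zf == 0               # jne             (taken when ZF == 0)


def short_sequence_branches(xmm: int) -> bool:
    """Model of ptest; je. True when the jump is taken."""
    return ptest_zf(xmm, xmm) == 1  # je (taken when ZF == 1)


def sequences_agree(values) -> bool:
    """Check that both sequences take the branch for exactly the same inputs."""
    return all(
        long_sequence_branches(v) == short_sequence_branches(v) for v in values
    )


# Hypothetical sample register contents, including the all-zero case.
SAMPLES = [0, 1, 1 << 127, MASK128, 0xDEADBEEF]
assert sequences_agree(SAMPLES)
```

In this model the sete/movzx/test steps only copy ZF into a register and then invert it again for jne, which is why a single je on the flags produced by ptest gives the same branch decision.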

The attached patch file includes both the x86 backend instruction combining modification and a LIT regression test for it.


I'd like to commit the fix to the LLVM trunk, and your feedback would be much appreciated.



+; RUN: llc %s -march=x86-64 -mcpu=corei7 -o %t.asm
+; RUN: FileCheck %s --input-file=%t.asm

Please do as the other tests do and read the file with "< %s". Also, place it in test/CodeGen/X86/brcond.ll.

--
Bruno Cardoso Lopes
http://www.brunocardoso.cc
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
<ptest_sequence.ll>
_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
A non-text attachment was scrubbed...
Name: brcond.ll.patch
Type: application/octet-stream
Size: 1939 bytes
Desc: brcond.ll.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111207/69c65299/attachment.obj>
