[PATCH] D63692: [LSR] Improved code generation for Zero Compare loops

Joan LLuch via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jun 23 10:25:53 PDT 2019


joanlluch created this revision.
joanlluch added reviewers: t.p.northover, eli.friedman, sanjoy.
Herald added subscribers: llvm-commits, jsji, kristof.beyls, javed.absar.
Herald added a project: LLVM.

Improves loop code generation. All targets are affected but most benefits are obtained for X86. Creates shorter code in a number of cases by allowing the Strength Reduce algorithm to consider both the direct and swapped forms of zero compare instructions, which enhances the opportunities to obtain an overall better LSR solution. Given equal LSR solution cost, the patch  also honours the direction of the loop induction variable specified in the user source code, which in practice also tends to result in a better solution.

The patch broke a number of regression tests due to inherent test fragility, not because of intended test failures. I fixed the CodeGen tests for the ARM and X86 architectures.

An example of code improved by this patch:

  int func(void);
  void func2(void);
  
  void LSRTest(int count)
  {
    count += func();
    for ( ; count != 20; ++count ) {
      func2();
    }
  }

Before:

  	.section	__TEXT,__text,regular,pure_instructions
  	.macosx_version_min 10, 12
  	.globl	_LSRTest
  _LSRTest:
  	.cfi_startproc
  	pushl	%ebp
  	.cfi_def_cfa_offset 8
  	.cfi_offset %ebp, -8
  	movl	%esp, %ebp
  	.cfi_def_cfa_register %ebp
  	pushl	%esi
  	pushl	%eax
  	.cfi_offset %esi, -12
  	calll	_func
  	addl	8(%ebp), %eax
  	pushl	$20
  	popl	%esi
  	subl	%eax, %esi
  	jmp	LBB0_1
  LBB0_2:
  	calll	_func2
  	decl	%esi
  LBB0_1:
  	testl	%esi, %esi
  	jne	LBB0_2
  	addl	$4, %esp
  	popl	%esi
  	popl	%ebp
  	retl
  	.cfi_endproc

After:

  	.section	__TEXT,__text,regular,pure_instructions
  	.macosx_version_min 10, 12
  	.globl	_LSRTest
  _LSRTest:
  	.cfi_startproc
  	pushl	%ebp
  	.cfi_def_cfa_offset 8
  	.cfi_offset %ebp, -8
  	movl	%esp, %ebp
  	.cfi_def_cfa_register %ebp
  	pushl	%esi
  	pushl	%eax
  	.cfi_offset %esi, -12
  	movl	8(%ebp), %esi
  	calll	_func
  	leal	-20(%eax,%esi), %esi
  	jmp	LBB0_1
  LBB0_2:
  	calll	_func2
  	incl	%esi
  LBB0_1:
  	testl	%esi, %esi
  	jne	LBB0_2
  	addl	$4, %esp
  	popl	%esi
  	popl	%ebp
  	retl
  	.cfi_endproc


Repository:
  rL LLVM

https://reviews.llvm.org/D63692

Files:
  lib/Transforms/Scalar/LoopStrengthReduce.cpp
  test/CodeGen/ARM/arm-shrink-wrapping.ll
  test/CodeGen/X86/avx-vzeroupper.ll
  test/CodeGen/X86/lsr-wrap.ll
  test/CodeGen/X86/masked-iv-safe.ll
  test/CodeGen/X86/reverse_branches.ll
  test/CodeGen/X86/x86-shrink-wrapping.ll
  test/CodeGen/X86/x86-win64-shrink-wrapping.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D63692.206138.patch
Type: text/x-patch
Size: 47618 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190623/4db11ec4/attachment.bin>


More information about the llvm-commits mailing list