[llvm-dev] Optimization generate super long function definition

Jason -Zhong Sheng- Hu via llvm-dev llvm-dev at lists.llvm.org
Mon Apr 13 15:43:51 PDT 2020

Hey Neil,

thank you for replying and the information.

I think it is dumb... I have changed my implementation to passing a byval array pointer. In this case, this optimization doesn't trigger.

I think the expansion in this optimization is quite useless because byval array pointer achieves the same (I suppose) while the generated size remains small.  This reveals unnecessarily complex execution detail.

Jason Hu
From: Neil Henning <neil.henning at unity3d.com>
Sent: April 13, 2020 9:10 AM
To: Jason -Zhong Sheng- Hu <fdhzs2010 at hotmail.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] Optimization generate super long function definition

Hey Jason Hu,

So I think this is the SROA pass breaking up the load/store of the aggregate (the [500 x i32] array) into individual load/stores so that it can see if any are unused or can have their stores forwarded. This is a bit of LLVM that I personally find pretty dumb - but if you look at how Clang generates code for your above pattern it'll output memcpy intrinsics instead of doing load/stores, which gets around this issue.

Note: the compiler hasn't hung, it's just spinning and comparing a few thousand instructions generated from SROA against each other, you might need to wait until the end of the universe but it should complete eventually ;)


On Mon, Apr 13, 2020 at 12:37 AM Jason -Zhong Sheng- Hu via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:

Hi all,

sorry to have sent the same question around. I am quite desperately looking for a solution to this problem and I figured the mailing list is the best bet.

In my code, I generate the following function:

define i32 @gl.qi([500 x i32] %x, i32 %i) {
  %x. = alloca [500 x i32]
  %i. = alloca i32
  %0 = alloca [500 x i32]
  store [500 x i32] %x, [500 x i32]* %x.
  store i32 %i, i32* %i.
  %x.1 = load [500 x i32], [500 x i32]* %x.
  %i.2 = load i32, i32* %i.
  store [500 x i32] %x.1, [500 x i32]* %0
  %1 = icmp slt i32 %i.2, 500
  br i1 %1, label %in-bound, label %out-of-bound

out-of-bound:                                     ; preds = %entry
  call void @gen.panic(i8* getelementptr inbounds ([22 x i8], [22 x i8]* @pool.str.2, i32 0, i32 0))

in-bound:                                         ; preds = %entry
  %2 = getelementptr inbounds [500 x i32], [500 x i32]* %0, i32 0, i32 %i.2
  %idx = load i32, i32* %2
  ret i32 %idx

the high level functionality is to use %i to index %x, and if %i is out of bound, a panic function is called instead.

consider the store line:

  store [500 x i32] %x, [500 x i32]* %x.

once I pass this function to opt -O1 -S --verify --verify-each, it generates code like this:

define i32 @gl.qi([500 x i32] %x, i32 %i) local_unnamed_addr {
  %0 = alloca [500 x i32], align 4
  %x.fca.0.extract = extractvalue [500 x i32] %x, 0
  %x.fca.1.extract = extractvalue [500 x i32] %x, 1
  %x.fca.2.extract = extractvalue [500 x i32] %x, 2
  %x.fca.3.extract = extractvalue [500 x i32] %x, 3
  %x.fca.4.extract = extractvalue [500 x i32] %x, 4
  %x.fca.5.extract = extractvalue [500 x i32] %x, 5

until 500. I put the number to 50000 and it won’t stop.

This is puzzling. I am not sure why must a store command be expanded to a sequence of etractvalues then stores? Is there a way to turn off this particular optimization without turning off the whole optimization?

Or am I looking at the wrong way to do this simple task?

Jason Hu
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>

Neil Henning
Senior Software Engineer Compiler
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200413/0d308301/attachment.html>

More information about the llvm-dev mailing list