<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Any ScalarEvolution experts out there
have any thought about this patch?<br>
<br>
Thanks,<br>
Matthew Curtis.<br>
<br>
On 12/10/2012 4:13 PM, Matthew Curtis wrote:<br>
</div>
<blockquote cite="mid:50C65E78.6030608@codeaurora.org" type="cite">Hello
all,
<br>
<br>
I wanted to get some feedback on this patch for ScalarEvolution.
<br>
<br>
It addresses a performance problem I am seeing for simple
benchmark.
<br>
<br>
Starting with this C code:
<br>
<br>
01: signed char foo(void)
<br>
02: {
<br>
03: const int count = 8000;
<br>
04: signed char result = 0;
<br>
05: int j;
<br>
06:
<br>
07: for (j = 0; j < count; ++j) {
<br>
08: result += (result_t)(3);
<br>
09: }
<br>
10:
<br>
11: return result;
<br>
12: }
<br>
<br>
I end up with this IR feeding into Induction Variable
Simplification:
<br>
<br>
01: define signext i8 @foo() nounwind readnone {
<br>
02: entry:
<br>
03: br label %for.body
<br>
04:
<br>
05: for.body: ; preds =
%entry, %for.body
<br>
06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
<br>
07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ]
<br>
08: %conv2 = and i32 %result.03, 255
<br>
09: %add = add nsw i32 %conv2, 3
<br>
10: %inc = add nsw i32 %j.04, 1
<br>
11: %cmp = icmp slt i32 %inc, 8000
<br>
12: br i1 %cmp, label %for.body, label %for.end
<br>
13:
<br>
14: for.end: ; preds =
%for.body
<br>
15: %conv1 = trunc i32 %add to i8
<br>
16: ret i8 %conv1
<br>
17: }
<br>
<br>
Unfortunately, the 'and' on line 8 prevents Scalar Evolution from
<br>
being able to create an expression for '%add' that it knows how to
<br>
evaluate.
<br>
<br>
The patch detects this pattern in createNodeForPHI and creates an
<br>
equivalent expression that can be evaluated.
<br>
<br>
Note that SCEV translates the 'and' into
<br>
ZeroExtend(Truncate(%result.03, i8), i32)
<br>
<br>
So in terms of SCEV expressions, we essentially have
<br>
%add[n] = Add(ZeroExtend(Truncate(%add[n-1], i8), i32), 3)
<br>
<br>
(BTW, I'm no scholar here, just a layman, so my terminology is
<br>
probably all wrong).
<br>
<br>
The patch basically moves the ZeroExtend and Truncate outside of
the
<br>
recurrence, though it must take into account that for iteration n,
the
<br>
ZeroExtend and Truncate apply to the value at iteration n-1.
<br>
<br>
For a constant step this is accomplished by subtracting one step
from
<br>
the recurrence, performing the Truncate and ZeroExtend, and then
<br>
adding the step to the result. In other words:
<br>
<br>
%add[n] = Add(
<br>
ZeroExtend(
<br>
Truncate(
<br>
Minus(AddRec(Start=0,Step=3)[n], 3),
<br>
i8),
<br>
i32),
<br>
3)
<br>
<br>
It's a little more complicated when the Step is another
recurrence,
<br>
but essentially the same.
<br>
<br>
Thoughts?
<br>
<br>
Matthew C.
<br>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a class="moz-txt-link-freetext" href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a>
<a class="moz-txt-link-freetext" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a>
</pre>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation</pre>
</body>
</html>