[LLVMdev] GSoC:Loop Reversal Transformation

Matt Godbolt matt at godbolt.org
Wed Mar 18 10:44:15 PDT 2015


Hi Vishal,

I am still a little confused - can you share your new code?  Be aware that
you're relying on undefined behaviour: the "unisgned int" you are adding to
is overflowing, and clang will use this fact.

I was able to replicate your initial results but I found that if I ran the
counting code twice, the second time both types of loop took much the same
time:

$ make; and ./loop
make: Nothing to be done for `all'.
First run:
0:834
0:405
Second run:
0:381
0:384

(on a 3.5GHz Haswell).

I hacked together a very simplistic version using a memory barrier to
defeat the optimizer instead of adding (which means the loop actually gets
compiled: http://url.godbolt.org/hqwkx ). Additionally, I used the CPU
cycle counter on x86 (again, very simplistically, there's a huge art to
this: I can share other resources if you're interested). Counting to 2.4
billion takes an appreciable amount of time now :)

The results from that:

/t/lop $ make; and ./loop2
First run:
Counting up:   2528555165 cycles
Counting down: 2524361064 cycles
Second run:
Counting up:   2495025808 cycles
Counting down: 2511668857 cycles

Source and makefile attached.

If you look at the code clang is producing in both cases (see the URL) it
has already transformed the code into effectively the same: adding and/or
subtracting until hitting zero. At least in this case, where the loop
counter value itself isn't used.

Sorry I haven't added much to the discussion, but hopefully this email is
useful in gauging the relevance of the feature.

Best regards, Matt


On Wed, Mar 18, 2015 at 10:33 AM, vishal sarda <vishalksarda at gmail.com>
wrote:

> Hi Matt, All,
>
> Thanks for looking into this and your suggestions.
>
> I compiled the program with -O3 optimization level (clang -O3) on X86_64
> target.
> After your suggestion, i ran the program with iteration of 10000 runs and
> found that average runtimes are (attaching data collected as well)
>
> Forward loop traverse : 2.006 milli seconds
> Reverse loop traverse : 1.531 milli seconds
>
> Yes, i agree that this sample program may not be sufficient to say that
> loop reversal traversing will always be faster, however, difference in
> runtime is visible though. And that's where the profitability calculation
> comes into picture.
>
> I found mention of loop traverse in various papers (links mentioned in
> previous mail), and hence thought of implementing it.
>
> I am not sure though if its really helpful. Above papers mentioned that it
> might not be beneficial in itself, but opens up the opportunity for other
> optimizations.
>
> Suggestions on this are most welcomed. Waiting for others to pitch in too.
>
> Regards,
> Vishal Sarda,
> 3rd Year Undergraduate,
> Department of Computer Engineering,
> College of Engineering, Pune
>
> On Wed, Mar 18, 2015 at 1:20 AM, Matt Godbolt <matt at godbolt.org> wrote:
>
>> Hi,
>>
>> On Tue, Mar 17, 2015 at 2:00 PM, vishal sarda <vishalksarda at gmail.com>
>> wrote:
>>
>>> [snip]
>>>
>>> Loop counting up 0.15 ms
>>>
>>> Loop counting down 0.08 ms
>>>
>> I'm no llvm expert, but as an interest bystander: I suspect you compiled
>> the source without any optimizations applied - I tried to replicate this
>> behaviour and found the optimzer happily replaces both the inner loops you
>> had with a constant, and thus I got the same time on both loops. (e.g. see
>> http://goo.gl/aXFkVb )
>>
>> Benchmarks of this nature where the run time is so small are notoriously
>> prone to measurement errors, so I'd be a little careful drawing conclusions
>> from the sample you listed. Also; what architecture did you measure on, and
>> what spec machine?
>>
>> Not at all to dissuade you from investigating this optimization! I'm just
>> a little sceptical of the benchmark you posted!
>>
>> Best regards,
>>
>> Matt
>>
>>
>


-- 
Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150318/fb809bc6/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loop.tar.gz
Type: application/x-gzip
Size: 867 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150318/fb809bc6/attachment.bin>


More information about the llvm-dev mailing list