[llvm-dev] Comparing Clang and GCC: only clang stores updated value in each iteration.

Friedman, Eli via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 21 11:23:57 PDT 2018


On 9/21/2018 12:15 AM, Jonas Paulsson wrote:
>>
>> If so, then yes, this is probably a case where the aggressive LoopPRE 
>> mentioned in the other thread that Eli linked to would be useful.  
>> Once we'd done the PRE, then everything else should collapse.
> Thanks for the link, it's good to know this issue is recognized. If I 
> understand it correctly, the reason clang is storing in each iteration 
> is due to concurrency.

Yes, basically... IIRC LLVM did the wrong thing a long time ago, but we 
fixed it as part of implementing the C++11 atomics model.

> As a newbie I wonder how this works in practice, since even if the 
> value is stored in each iteration, two threads could still do this 
> simultaneously unless some sort of atomic operation is used, 
> right? What happens here is that the value of 'a' is loaded once 
> before the loop, then incremented and stored in each iteration. How 
> does that help with multiple threads compared to storing it after the 
> loop?

The interesting case is the one where the store is dynamically dead 
(not executed in any iteration of the loop); we have to make sure we 
don't introduce a race in that case.  As you note, if the store is 
executed in at least one iteration, and there isn't any synchronization 
inside the loop, we can ignore the possibility of a race.
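
To make the dynamically dead case concrete, here is a rough sketch. The 
global 'a' stands in for the one in your test; the flags array and all 
function names are made up for illustration, and the "promoted" versions 
are hand-written approximations of what a compiler might do, not actual 
clang or gcc output.

```cpp
// Hypothetical global, standing in for 'a' from the thread.
int a = 0;

// Source form: 'a' is written only in iterations where flags[i] is set.
// If no flag is set, this function never writes 'a'.
void source_form(const bool *flags, int n) {
    for (int i = 0; i < n; ++i)
        if (flags[i])
            a = a + 1;
}

// Naive promotion: keep 'a' in a temporary and store once after the loop.
// Single-threaded results are identical, but 'a' is now written even when
// no flag is set. If another thread writes 'a' in the meantime (legal,
// since the source never touches 'a' in that case), this store can
// clobber that write: the transform has introduced a race.
void promoted_form_unsafe(const bool *flags, int n) {
    int tmp = a;
    for (int i = 0; i < n; ++i)
        if (flags[i])
            tmp = tmp + 1;
    a = tmp;  // unconditional store the source never had
}

// One possible safe variant (an assumption for illustration): remember
// whether any store would have executed, and only write back if so.
void promoted_form_guarded(const bool *flags, int n) {
    int tmp = a;
    bool stored = false;
    for (int i = 0; i < n; ++i) {
        if (flags[i]) {
            tmp = tmp + 1;
            stored = true;
        }
    }
    if (stored)
        a = tmp;  // written only if the source would have written it
}
```

With all flags false, source_form and promoted_form_guarded leave 'a' 
untouched, while promoted_form_unsafe still writes it; that extra write 
is the one we have to avoid.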

> Is there an option to change this behavior in gcc or clang? It seems 
> that gcc is assuming a single thread, while clang is not. It would be 
> nice to have the same setting here when comparing them. Or am I 
> missing something?

There is no option to control it; theoretically, we could add one, I 
guess, but it's a minor optimization in most cases, and most non-trivial 
programs are concurrent anyway.

For your loop, the condition of the if statement is a comparison of a 
constant and an induction variable, so it's possible to prove the store 
is always executed.  I assume gcc is proving that (either directly, or 
by performing some other transform which makes the condition trivial).
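
As a sketch of what "provably executed" means here (your actual loop 
isn't quoted in this message, so the bounds and names below are 
assumptions):

```cpp
// Hypothetical global, standing in for 'a' from the thread.
int a = 0;

// The condition compares the induction variable against a constant.
// Iteration i == 0 always satisfies i < 50, so the store to 'a' is
// guaranteed to execute at least once whenever the function runs.
void loop_source() {
    for (int i = 0; i < 100; ++i)
        if (i < 50)
            a = a + 1;
}

// Because the store is known to execute, keeping 'a' in a temporary and
// storing once after the loop adds no write the source did not already
// perform. This is a hand-written approximation, not compiler output.
void loop_promoted() {
    int tmp = a;
    for (int i = 0; i < 100; ++i)
        if (i < 50)
            tmp = tmp + 1;
    a = tmp;
}
```

Both versions add 50 to 'a'; the only difference is how many stores 
are issued along the way.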

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


