[cfe-dev] Optimization creates infinite loop

Alex Rønne Petersen via cfe-dev cfe-dev at lists.llvm.org
Fri Nov 1 05:34:04 PDT 2019


Hello,

On most 64-bit targets, loading a double (a 64-bit quantity) from
memory is atomic already, and relaxed memory order means no memory
barriers, so in practice it's really just informing the compiler that
the variable will be accessed from multiple threads.

Of course, on 32-bit targets, a 64-bit atomic load is not the same
thing as a 64-bit non-atomic load. But if you don't do an atomic load
on such targets and instead use volatile, your program would be likely
to encounter word tearing when reading the variable (putting aside the
fact that the program would be illegal due to undefined behavior
resulting from the data race).

On Fri, Nov 1, 2019 at 1:11 PM Gaier, Bjoern <Bjoern.Gaier at horiba.com> wrote:
>
> Hey,
>
> Thanks for the input! Does __atomic_load_n not generated additional overhead? In terms of instruction time and so on
>
> -----Original Message-----
> From: Alex Rønne Petersen <alex at alexrp.com>
> Sent: 01 November 2019 11:47
> To: Gaier, Bjoern <Bjoern.Gaier at horiba.com>
> Cc: cfe-dev at lists.llvm.org
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> So in that case you'd want to do something like:
>
> extern double RT_Time;
> void Sample1sec()
> {
>        const double startTime = RT_Time;
>        while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) < (startTime + 5.0005)); }
>
> This is assuming you're unable to change the definition side of RT_Time. If you *are* able to change it, you can (and should) just make it an std::atomic<double> and use .load()/.store() with memory_order_relaxed to interact with it. volatile would be the wrong tool here since you're dealing with threads.
>
>
> On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <Bjoern.Gaier at horiba.com> wrote:
> >
> > Hey,
> >
> > You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
> >
> > I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <alex at alexrp.com>
> > Sent: 01 November 2019 11:26
> > To: Gaier, Bjoern <Bjoern.Gaier at horiba.com>
> > Cc: cfe-dev at lists.llvm.org
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
> >
> > What exactly *is* changing RT_Time? Since the thread that executes
> > Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
> >
> > On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <Bjoern.Gaier at horiba.com> wrote:
> > >
> > > Hey,
> > >
> > > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > > "the compiler has no reason to assume that any external actor can change the value of it"
> > > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> > >
> > > That sounds like any extern variable that will be changed should be marked 'volatile'.
> > >
> > > -----Original Message-----
> > > From: Alex Rønne Petersen <alex at alexrp.com>
> > > Sent: 01 November 2019 11:11
> > > To: Gaier, Bjoern <Bjoern.Gaier at horiba.com>
> > > Cc: cfe-dev at lists.llvm.org
> > > Subject: Re: [cfe-dev] Optimization creates infinite loop
> > >
> > > Hello,
> > >
> > > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> > >
> > > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> > >
> > >
> > > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> > > >
> > > > Hello Clang-Developer,
> > > >
> > > >
> > > >
> > > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > > >
> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }
> > > >
> > > >
> > > >
> > > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > > >
> > > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > > >
> > > > .Lfunc_begin0:
> > > >
> > > >                 .cv_func_id 0
> > > >
> > > > # %bb.0:
> > > >
> > > >                 .cv_file 1
> > > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviatha
> > > > n\ \C M_Driver1\\CM_Driver1.cpp"
> > > > "4E92768969662F6E513C11595FEA9D63" 1
> > > >
> > > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > > >
> > > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > > xmm0 = mem[0],zero
> > > >
> > > > .Ltmp0:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 movsd  xmm1, qword ptr [rip +
> > > > __real at 40140083126e978d] # xmm1 = mem[0],zero
> > > >
> > > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > > >
> > > >                 addsd   xmm1, xmm0
> > > >
> > > >                 ucomisd               xmm1, xmm0
> > > >
> > > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > > >
> > > >                 jbe         .LBB0_2
> > > >
> > > > .Ltmp1:
> > > >
> > > >                 .p2align                4, 0x90
> > > >
> > > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 jmp        .LBB0_1
> > > >
> > > > .Ltmp2:
> > > >
> > > > .LBB0_2:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > > >
> > > >                 ret
> > > >
> > > > .Ltmp3:
> > > >
> > > > .Lfunc_end0:
> > > >
> > > >
> > > >
> > > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > > >
> > > > Is this a bug or do I understand something wrong?
> > > >
> > > >
> > > >
> > > > Kind greetings and thank you for the help in advance,
> > > >
> > > > Björn
> > > >
> > > >
> > > >
> > > > P.S.
> > > >
> > > > My compile flags (when using -###) are:
> > > >
> > > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > > >
> > > >
> > > >
> > > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > > >
> > > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > > Fukushima. Junichi Tajika
> > > > _______________________________________________
> > > > cfe-dev mailing list
> > > > cfe-dev at lists.llvm.org
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > Fukushima. Junichi Tajika
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika



More information about the cfe-dev mailing list