[cfe-dev] Optimization creates infinite loop

Keane, Erich via cfe-dev cfe-dev at lists.llvm.org
Fri Nov 1 07:00:32 PDT 2019


Yep they are.  See the C paper here that has a great justification for it: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm

Everything but "while(1)" is UB for infinite loops.  

-----Original Message-----
From: Gaier, Bjoern <Bjoern.Gaier at horiba.com> 
Sent: Friday, November 1, 2019 6:58 AM
To: Keane, Erich <erich.keane at intel.com>; Alex Rønne Petersen <alex at alexrp.com>; Clang Dev <cfe-dev at lists.llvm.org>
Subject: RE: [cfe-dev] Optimization creates infinite loop

Thank you Alex! I understood that!

Infinite loops are undefined behaviour? I thought they are a "common" thing in hardware...

-----Original Message-----
From: Keane, Erich <erich.keane at intel.com>
Sent: 01 November 2019 14:01
To: Alex Rønne Petersen <alex at alexrp.com>; Gaier, Bjoern <Bjoern.Gaier at horiba.com>; Clang Dev <cfe-dev at lists.llvm.org>
Subject: RE: [cfe-dev] Optimization creates infinite loop

To answer the initial question:
> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }

The above is 1 of two things, an infinite loop (nothing else modifies the value), or a race condition (something else modifies the value in a different thread).  Race conditions are Undefined Behavior, so the compiler is allowed to assume that it is not the case.

Interestingly, infinite loops are ALSO undefined behavior, so the compiler would be welcome to actually remove the entire loop if it thought it was valuable.

-----Original Message-----
From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Alex Rønne Petersen via cfe-dev
Sent: Friday, November 1, 2019 5:34 AM
To: Gaier, Bjoern <Bjoern.Gaier at horiba.com>
Cc: cfe-dev at lists.llvm.org
Subject: Re: [cfe-dev] Optimization creates infinite loop

Hello,

On most 64-bit targets, loading a double (a 64-bit quantity) from memory is atomic already, and relaxed memory order means no memory barriers, so in practice it's really just informing the compiler that the variable will be accessed from multiple threads.

Of course, on 32-bit targets, a 64-bit atomic load is not the same thing as a 64-bit non-atomic load. But if you don't do an atomic load on such targets and instead use volatile, your program would be likely to encounter word tearing when reading the variable (putting aside the fact that the program would be illegal due to undefined behavior resulting from the data race).

On Fri, Nov 1, 2019 at 1:11 PM Gaier, Bjoern <Bjoern.Gaier at horiba.com> wrote:
>
> Hey,
>
> Thanks for the input! Does __atomic_load_n not generated additional 
> overhead? In terms of instruction time and so on
>
> -----Original Message-----
> From: Alex Rønne Petersen <alex at alexrp.com>
> Sent: 01 November 2019 11:47
> To: Gaier, Bjoern <Bjoern.Gaier at horiba.com>
> Cc: cfe-dev at lists.llvm.org
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> So in that case you'd want to do something like:
>
> extern double RT_Time;
> void Sample1sec()
> {
>        const double startTime = RT_Time;
>        while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) < 
> (startTime + 5.0005)); }
>
> This is assuming you're unable to change the definition side of RT_Time. If you *are* able to change it, you can (and should) just make it an std::atomic<double> and use .load()/.store() with memory_order_relaxed to interact with it. volatile would be the wrong tool here since you're dealing with threads.
>
>
> On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <Bjoern.Gaier at horiba.com> wrote:
> >
> > Hey,
> >
> > You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
> >
> > I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <alex at alexrp.com>
> > Sent: 01 November 2019 11:26
> > To: Gaier, Bjoern <Bjoern.Gaier at horiba.com>
> > Cc: cfe-dev at lists.llvm.org
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
> >
> > What exactly *is* changing RT_Time? Since the thread that executes
> > Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
> >
> > On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <Bjoern.Gaier at horiba.com> wrote:
> > >
> > > Hey,
> > >
> > > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > > "the compiler has no reason to assume that any external actor can change the value of it"
> > > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> > >
> > > That sounds like any extern variable that will be changed should be marked 'volatile'.
> > >
> > > -----Original Message-----
> > > From: Alex Rønne Petersen <alex at alexrp.com>
> > > Sent: 01 November 2019 11:11
> > > To: Gaier, Bjoern <Bjoern.Gaier at horiba.com>
> > > Cc: cfe-dev at lists.llvm.org
> > > Subject: Re: [cfe-dev] Optimization creates infinite loop
> > >
> > > Hello,
> > >
> > > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> > >
> > > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> > >
> > >
> > > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> > > >
> > > > Hello Clang-Developer,
> > > >
> > > >
> > > >
> > > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > > >
> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }
> > > >
> > > >
> > > >
> > > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > > >
> > > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > > >
> > > > .Lfunc_begin0:
> > > >
> > > >                 .cv_func_id 0
> > > >
> > > > # %bb.0:
> > > >
> > > >                 .cv_file 1
> > > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviat
> > > > ha n\ \C M_Driver1\\CM_Driver1.cpp"
> > > > "4E92768969662F6E513C11595FEA9D63" 1
> > > >
> > > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > > >
> > > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > > xmm0 = mem[0],zero
> > > >
> > > > .Ltmp0:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 movsd  xmm1, qword ptr [rip + 
> > > > __real at 40140083126e978d] # xmm1 = mem[0],zero
> > > >
> > > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > > >
> > > >                 addsd   xmm1, xmm0
> > > >
> > > >                 ucomisd               xmm1, xmm0
> > > >
> > > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > > >
> > > >                 jbe         .LBB0_2
> > > >
> > > > .Ltmp1:
> > > >
> > > >                 .p2align                4, 0x90
> > > >
> > > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 jmp        .LBB0_1
> > > >
> > > > .Ltmp2:
> > > >
> > > > .LBB0_2:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > > >
> > > >                 ret
> > > >
> > > > .Ltmp3:
> > > >
> > > > .Lfunc_end0:
> > > >
> > > >
> > > >
> > > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > > >
> > > > Is this a bug or do I understand something wrong?
> > > >
> > > >
> > > >
> > > > Kind greetings and thank you for the help in advance,
> > > >
> > > > Björn
> > > >
> > > >
> > > >
> > > > P.S.
> > > >
> > > > My compile flags (when using -###) are:
> > > >
> > > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > > >
> > > >
> > > >
> > > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > > >
> > > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 
> > > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, 
> > > > Takeshi Fukushima. Junichi Tajika 
> > > > _______________________________________________
> > > > cfe-dev mailing list
> > > > cfe-dev at lists.llvm.org
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 
> > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi 
> > > Fukushima. Junichi Tajika
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, 
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi 
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, 
> USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi 
> Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika


More information about the cfe-dev mailing list