[cfe-dev] [RFC] add Function Attribute to disable optimization

Mon Jun 17 17:23:48 PDT 2013

On Mon, Jun 17, 2013 at 7:19 PM, Joshua Cranmer 🐧 <Pidgeot18 at gmail.com> wrote:
> On 6/17/2013 5:32 PM, Jeffrey Walton wrote:
>>
>> On Mon, Jun 17, 2013 at 6:23 PM, Sean Silva <silvas at purdue.edu> wrote:
>>>
>>> On Mon, Jun 17, 2013 at 10:29 AM, Jeffrey Walton <noloader at gmail.com>
>>> wrote:
>>>>
>>>> First is to ensure dead-writes are not removed. For example, a
>>>> function that zeroizes or wipes memory is subject to removal during
>>>> optimization. I often have to look at the program's disassembly to
>>>> ensure the memset is not removed by the optimizer.
>>>
>>> Appropriate use of `volatile` is probably sufficient for this use case.
>>
>> That brings up a good point. As I understand it, volatile is
>> essentially implementation defined. What is Clang/LLVM's
>> interpretation?
>
> Volatile has an explicit definition in C99/C11/C++03/C++11, and it's roughly
> the same in all of them. Volatile objects "may be modified in ways unknown
> to the implementation or have other unknown side effects" (to quote C99),
> and the implementation must therefore preserve the accesses and their order
> even when optimizing.
OK, thanks. I must have been quoted something else on another list.
I'll try to locate the email.

>> Here's what I know. Microsoft's interpretation allows me to use
>> volatile for the situation under MSVC++ [1]. GCC's interpretation of
>> volatile is for memory mapped hardware, so it does not allow me to use
>> the qualifier to tame the optimizer [2].
>
> Microsoft decided that they wanted to add additional semantics to volatile
> to make it enforce load/store barriers for multithreaded code. Since C11 and
> C++11 added explicit support for multithreaded atomic operations and memory
> models, those extra semantics are unnecessary; with the introduction of a
> processor with a relaxed memory model, it's also undesirable.
Microsoft was not the problem. GCC was, since it treats memory mapped
hardware as the only legitimate use of volatile.

> For as vague as your description is, volatile appears to be sufficient for
> your use case (it's incorrect with respect to multithreaded memory
> visibility issues, but it's no less correct than not optimizing the code).
Here was the sample I asked about some time ago. It was from a slide
deck at [1].

void clean_memory(volatile void* dest, size_t len)
{
    /* Index through p: dest is a void pointer and cannot be dereferenced. */
    volatile unsigned char* p;
    for(p = (volatile unsigned char*)dest; len; p[--len] = 0)
      ;
}

Because the pointers above (`dest` and `p`) do not refer to memory mapped
addresses, the GCC folks consider it an abuse. The same is true of
this variant:

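#include <string.h>

/* Pointer to volatile data; the pointer object itself is not volatile. */
static volatile void* g_dummy;

static void clean_memory(volatile void* dest, size_t len) {
    /* memset takes a plain void*, so the qualifier must be cast away. */
    memset((void*)dest, 0x00, len);
    g_dummy = dest;
}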

Again, the GCC folks consider it an abuse since the memory is not
mapped from hardware.
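
For comparison, here is a minimal sketch of the first approach with every
store made through a volatile-qualified lvalue. The name wipe_memory is
hypothetical and does not come from the slide deck. The standard's wording
covers objects declared volatile, so whether stores through a
volatile-qualified pointer to an ordinary buffer must also be preserved is
an assumption here, and the disassembly is still worth checking:

```c
#include <stddef.h>

/* Hypothetical helper (not from the slide deck): zero len bytes at dest.
 * Each byte is written through a volatile-qualified lvalue, which is the
 * property the thread is relying on to keep the stores from being removed.
 * Whether that is guaranteed for non-volatile objects is an assumption. */
static void wipe_memory(void *dest, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)dest;

    while (len--) {
        *p++ = 0;
    }
}
```

A caller would invoke wipe_memory(buf, sizeof buf) just before buf goes out
of scope or is freed.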

Jeff

[1] http://www.slideshare.net/guanzhi/crypto-with-openssl