[LLVMdev] C as used/implemented in practice: analysis of responses

Wed Jul 8 05:00:09 PDT 2015

----- Original Message -----
> From: "Sean Silva" <chisophugis at gmail.com>
> To: "Chris Lattner" <clattner at apple.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Tuesday, July 7, 2015 9:18:41 PM
> Subject: Re: [LLVMdev] C as used/implemented in practice: analysis of responses
> 
> On Tue, Jul 7, 2015 at 10:26 AM, Chris Lattner < clattner at apple.com >
> wrote:
> 
> 
> On Jul 1, 2015, at 3:20 PM, Sean Silva < chisophugis at gmail.com >
> wrote:
> 
> On Wed, Jul 1, 2015 at 12:22 PM, Russell Wallace <
> russell.wallace at gmail.com > wrote:
> 
> 
> 
> I am arguing in favor of a point, and I understand you disagree with
> it, but I don't think I'm dismissing any use cases except a very
> small performance increment.
> 
> 
> I'm sure Google has numbers about how much electricity/server cost
> they save for X% performance improvement.
> I'm sure Apple has numbers about how much money they make with X%
> improved battery life.
> I'm not convinced that the cost of some of these bugs is actually
> larger than the benefit of faster programs. Nor am I convinced about
> the inverse. I'm just pointing out that pointing to a "bad bug"
> caused by a certain optimization without comparing the cost of the
> bug to the benefit of the optimization is basically meaningless.
> You'll need to quantify "very small performance improvement" and put
> it in context of the bugs you're talking about.
> 
> As with many things, it is more complicated than that. The
> performance effects of optimizations are often non-linear, and you
> can take a look at many of the worst forms of UB in C and easily
> show cases where they allow 2x speedups, not just 2%.
> 
> 
> For example, consider undefined behavior for integer overflow:
> 
> 
> for (int i = 0; i <= N; ++i) {
> …
> }
> 
> 
> When compiling for a 64-bit machine, you really want to promote the
> induction variable to 64-bits. Further, knowing the trip count of a
> loop is extremely important for many loop optimizations.
> Unfortunately, without being able to assume undefined integer
> wraparound, you get neither of these from C.
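
To make the induction-variable point concrete, here is a minimal sketch of the rewrite an optimizer would like to perform on a 64-bit target. The function names `sum_upto_int` and `sum_upto_wide` are hypothetical, and the widened form is written by hand only to show the idea; it is not compiler output.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* What the source says: a 32-bit signed induction variable.
 * For n == INT_MAX the condition i <= n never fails and ++i overflows,
 * which is undefined, so that input is excluded here. */
long sum_upto_int(const int *a, int n) {
    assert(n < INT_MAX);
    long s = 0;
    for (int i = 0; i <= n; ++i)
        s += a[i];
    return s;
}

/* Hand-widened form: because i cannot wrap in any defined execution,
 * the loop runs exactly n + 1 times (0 times for n < 0), so a 64-bit
 * counter with a precomputed trip count gives the same result. */
long sum_upto_wide(const int *a, int n) {
    int64_t trip = (n < 0) ? 0 : (int64_t)n + 1;
    long s = 0;
    for (int64_t i = 0; i < trip; ++i)
        s += a[i];
    return s;
}
```

With an unsigned counter the same rewrite is not valid for every input: when the bound equals UINT_MAX, `i <= n` is always true and the wrapping loop never terminates, so no finite trip count exists.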
> 
> 
> -fstrict-aliasing is another great example. In many cases, it makes
> no difference whatsoever. OTOH, on code like:
> 
> 
> void doLoopThing(float *array, int *N) {
>   for (int i = 0; i < *N; ++i) {
>     array[i] = array[i] + 1;
>   }
> }
> 
> 
> You can easily get a 2x or more speedup due to auto-vectorization if
> you can assume -fstrict-aliasing. Of course, you usually wouldn’t
> write this code by hand; you’d get it because doLoopThing is a
> template and N is passed in by reference.
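
A rough sketch of why the bound matters in the doLoopThing loop, assuming the caller never passes an array that overlaps the counter. The names `add_one_reload` and `add_one_hoisted` are hypothetical. The second function does by hand what type-based alias analysis lets the compiler conclude on its own: a store through a `float *` cannot change an `int`, so the bound is loop-invariant.

```c
#include <assert.h>
#include <stddef.h>

/* Original shape: without type-based alias analysis the compiler has to
 * assume that a store to array[i] might modify *n, so *n is reloaded on
 * every iteration and the trip count is unknown up front. */
void add_one_reload(float *array, const int *n) {
    assert(array != NULL && n != NULL);
    for (int i = 0; i < *n; ++i)
        array[i] = array[i] + 1.0f;
}

/* Hand-hoisted shape: the bound is read once, so the trip count is
 * known before the loop starts, whatever aliasing rules are in force.
 * Equivalent to the version above only if array does not overlap *n. */
void add_one_hoisted(float *array, const int *n) {
    assert(array != NULL && n != NULL);
    const int count = *n;
    for (int i = 0; i < count; ++i)
        array[i] = array[i] + 1.0f;
}
```

Both functions produce the same result for any call where the array and the counter are distinct objects; only the second gives the vectorizer a fixed trip count without relying on -fstrict-aliasing.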
> 
> 
> 
> 
> You're absolutely right that it's complicated, but I don't think this
> is the best example.
> 
> 
> 
> The 2x speedup optimizations you're talking about can be done even
> without strict aliasing or signed overflow. You just have to emit
> runtime checks.

Memory overlap / dependence checks are fine for simple loops, but for loops dealing with lots of arrays, you can easily surpass the compiler's threshold for the number of checks it is willing to insert. Raising the threshold is often bad because the checks can become expensive for loops that, on average, have low (dynamic) trip counts.

 -Hal
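
For illustration, a hand-written sketch of the kind of loop versioning under discussion: test for overlap once, then pick a fast or a conservative loop. The names `disjoint` and `add_arrays` are hypothetical, and comparing addresses as integers assumes a flat address space; this is not what any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the n-element ranges starting at a and b do not overlap.
 * Assumes a flat address space where pointers convert to ordered integers. */
int disjoint(const float *a, const float *b, size_t n) {
    assert(n <= SIZE_MAX / sizeof(float));
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa + bytes <= pb || pb + bytes <= pa;
}

void add_arrays(float *dst, const float *x, const float *y, size_t n) {
    /* One check per (written, other) pointer pair: two checks here. */
    if (disjoint(dst, x, n) && disjoint(dst, y, n)) {
        /* Fast path: stores to d cannot change x or y, so iterations
         * are independent and the loop is a vectorization candidate. */
        float *restrict d = dst;
        for (size_t i = 0; i < n; ++i)
            d[i] = x[i] + y[i];
    } else {
        /* Conservative path: strictly in-order scalar semantics. */
        for (size_t i = 0; i < n; ++i)
            dst[i] = x[i] + y[i];
    }
}
```

In this sketch every additional array adds more pairwise checks, and all of them run before the first iteration even when n is tiny, which is where the cost for low-trip-count loops comes from.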

> It's analogous to emitting the "remainder" loop when
> doing autovectorization. Imagine if the standard made it undefined
> for a loop over an array of more than 4 floats to not be suitable
> for vectorization: sure, that would be nice and make the
> vectorizer's life easier, but we can mostly get the "big bang"
> speedup without it, in the sense that not having the standard say it
> is UB probably results in closer to 2% "slowdown" in the aggregate
> across all these loops (factoring in icache, etc.) than 2x.
> 
> 
> -- Sean Silva
> 
> 
> Anyway, I could go on and on here, and I’ve spent a lot of time over
> the years thinking about how to improve the situation: can we make
> clang detect more of these, can we make the optimizer more
> conservative in certain cases etc? This is why (for example) our
> TBAA uses simple structural points-to analysis before using TBAA.
> With GCC’s implementation (circa GCC 4.0, I have no idea what they
> are doing now), GCC would “miscompile” code like:
> 
> 
> float bitcast(int x) {
> return *(float*)&x;
> }
> 
> 
> This code is a TBAA violation, but it is also “obvious” what the
> programmer meant. LLVM being “nicer” in this case is a feature. It
> is irritating that the union version of this is also technically UB
> or implementation defined behavior, so that isn’t portable either (a
> C programmer needs to magically know that memcpy is the safe way to
> do this).
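
A minimal sketch of the memcpy form mentioned above. The name `bitcast_memcpy` is hypothetical, and the size check encodes the assumption that `int` and `float` have the same width on the target.

```c
#include <assert.h>
#include <string.h>

/* The bit copy only makes sense if both types have the same size. */
static_assert(sizeof(float) == sizeof(int), "int and float must match in size");

/* Reinterprets the bits of x as a float without dereferencing a pointer
 * of the wrong type: memcpy copies the object representation byte by
 * byte, so no type-based aliasing rule is violated. */
float bitcast_memcpy(int x) {
    float f;
    memcpy(&f, &x, sizeof f);
    return f;
}
```

Assuming IEEE 754 single precision, `bitcast_memcpy(0x3f800000)` yields 1.0f. Compilers commonly reduce a fixed-size memcpy like this to a plain register move, so the safe form need not cost anything.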
> 
> 
> However, as I’ve continued to dig into this, my feeling is that there
> really is no satisfactory solution to these issues. The problems here
> are pervasive structural problems in the C language: in the first
> example above, it is that “int” is the default type people generally
> reach for, not “long”, and that array indexing is expressed with
> integers instead of iterators. This isn’t something that we’re going
> to “fix” in the C language, the C community, or the body of existing
> C code. Likewise, while C++ has made definite progress here by
> replacing some of these idioms (e.g. with iterators), it adds its
> own layers of UB on, and doesn’t actually *subtract* the worst
> things in C.
> 
> 
> My conclusion is that C, and derivatives like C++, is a very
> dangerous language to write safety/correctness critical software
> in, and my personal opinion is that it is almost impossible to write
> *security* critical software in it. This isn’t really C’s fault if
> you consider where it was born and evolved from (some joke that it
> started as a *very* nice high level assembler for the PDP11
> https://en.wikipedia.org/wiki/C_(programming_language)#Early_developments
> :-).
> 
> 
> There are many more modern and much safer languages that either
> eliminate the UB entirely through language design (e.g. using a
> garbage collector to eliminate an entire class of memory safety
> issues, completely disallowing pointer casts to enable TBAA safely,
> etc), or by intentionally spending a bit of performance to provide a
> safe and correct programming model (e.g. by guaranteeing that
> integers will trap if they overflow). My hope is that the industry
> will eventually move to better systems programming languages, but
> that will take a very very long time...
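
As a rough picture of what "integers will trap if they overflow" costs, here is a portable sketch of a checked addition written in C itself. The names `checked_add` and `trapping_add` are hypothetical; a language with this guarantee would do the equivalent implicitly on every arithmetic operation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Stores a + b in *out and returns 1 when the sum is representable as
 * an int; returns 0 and leaves *out untouched otherwise. The tests are
 * arranged so that no intermediate expression can itself overflow. */
int checked_add(int a, int b, int *out) {
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}

/* Trapping variant: aborts the program instead of wrapping or invoking
 * undefined behavior. */
int trapping_add(int a, int b) {
    int r;
    if (!checked_add(a, b, &r))
        abort();
    return r;
}
```

The extra compare-and-branch per operation is the "bit of performance" being spent in exchange for a defined result on every input.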
> 
> 
> -Chris
> 
> 
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory