[llvm-dev] GEP with a null pointer base

Mon Jul 31 09:48:13 PDT 2017

On Mon, Jul 31, 2017 at 7:40 AM Peter Lawrence <peterl95124 at sbcglobal.net>
wrote:

> Dave,
>           Dead code elimination is generally done in a pass called dead
> code elimination,
> Can you give concrete examples why the same would not be true for UB code
> elimination ?
>

I haven't actually looked at how optimizations on the basis of the code
being UB-free cause code to be eliminated - I'd hazard a guess that many
optimizations would replace certain code with undef or unreachable not only
a specific pass intended for that purpose. At least I'm pretty sure that's
how it is today (I know we don't have a "UB code elimination" pass - it
comes out as a result of many other passes making steps like mutation to
undef/unreachable)

> Yes, speculatively hoisting code requires it to be UB-free, but that has
> nothing to do with
> UBCE deleting entire blocks of code because of the existence of UB.
>

There's more similarity there than might first appear - if we assume the
code is UB-free, then when an unconditional store to a null pointer is
found we assume that must be dynamically unreachable (& the only reason it
came up is due to inlining, etc - not that the user wrote/intended to store
to null - maybe this particular call site got inlined and now what was a
dynamic property (the user always called that function with a parameter
that ensured the program didn't go down that code path and load from the
pointer - but other callers do, so that's fine and good and correct) is a
static property of this particular version/post-inlining state so the
compiler can trivially observe it and replace it with unreachable)

For example:

inline __attribute__((always_inline)) void f1(bool b, int *i) {
  if (b)
    *i = 3;
}

bool f2(int*);

void f3() {
  int *x = nullptr;
  f1(f2(x), x);
}

For the sake of argument, you could imagine that this code arose from
perfectly reasonable user code - the user knows that 'f2' always returns
false for a null pointer because it says so in the name (in the user code,
with real names, etc).

& it's actually Jump Threading that makes the code in the 'if' block (once
inlined into f3) unreachable:

*** IR Dump Before Jump Threading ***
; Function Attrs: uwtable
define void @_Z2f3v() local_unnamed_addr #1 {
entry:
  %call = call zeroext i1 @_Z2f2Pi(i32* null)
  br i1 %call, label %if.then.i, label %_Z2f1bPi.exit

if.then.i:                                        ; preds = %entry
  store i32 3, i32* null, align 4, !tbaa !2
  br label %_Z2f1bPi.exit

_Z2f1bPi.exit:                                    ; preds = %entry,
%if.then.i
  ret void
}
*** IR Dump After Jump Threading ***
; Function Attrs: uwtable
define void @_Z2f3v() local_unnamed_addr #1 {
entry:
  %call = call zeroext i1 @_Z2f2Pi(i32* null)
  br i1 %call, label %if.then.i, label %_Z2f1bPi.exit

if.then.i:                                        ; preds = %entry
  call void @llvm.trap()
  unreachable

_Z2f1bPi.exit:                                    ; preds = %entry
  ret void
}

Sure, it didn't remove the block, but it certainly removed the code. Nice
enough to replace it with a trap, though. (so, yeah, not a perfect example
- but you get the idea, all sorts of optimizations might decide to simplify
code on the basis that UB can't actually happen)

The former requires
> an analysis proving UB-absense, the later requires an analysis proving
> UB-presence. But it
> isn’t logical to say you must delete UB-presence to enable proving
> UB-absense, because
> in the real world after programs have passed static-analysis there won’t
> be any UB-presence
> to delete.
>

I don't understand this last bit, sorry. Without knowledge of the whole
program I can't prove that the example above will never execute UB - but
the C++ language lets me, the compiler, assume that that won't happen &
optimize on that basis. No local static analysis could prove it - but the
assumption helps improve the code. (& even if we had the whole program,
some invariants would be difficult/impossible to practically prove -
programmer indexes into an array based on the number of elementsn in a hash
table - pretty darn difficult to statically analyze the whole probing hash
table system, and the dynamic logic that populates it, to prove that only N
elements will ever end up in the hash table, and thus using its size as an
index into an array of N elements is safe - but we can assume, so there's
no need to synthesize bounds checks around the array access, etc)

>
>
> Peter Lawrence.
>
>
> On Jul 27, 2017, at 7:45 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Thu, Jul 27, 2017 at 7:30 PM Peter Lawrence <peterl95124 at sbcglobal.net>
> wrote:
>
>> Dave,
>>           The way I see it there should be just one pass that implements
>> deleting UB (maybe it would come to be called UBCE), and that one pass
>> should have a command line option simply for the reason than all passes
>> should have one.
>>
>
> I would hazard a guess that would be very difficult to implement the same
> functionality (as the compiler has today) efficiently with that approach -
> many optimizations would be inhibited from making progress because they
> couldn't prove the code didn't have UB, etc.
>
> Generally, even deleting UB code isn't approached directly, but may come
> about as a consequence of various other steps the compiler is taking.
>
> - Dave
>
>
>>
>>
>> Peter Lawrence.
>>
>>
>> On Jul 26, 2017, at 10:02 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Wed, Jul 26, 2017 at 9:23 PM Peter Lawrence <peterl95124 at sbcglobal.net>
>> wrote:
>>
>>> David,
>>>            -fsanitize=undefined sounds great, but is not quite what I
>>> want.
>>>
>>>
>>> I recently ran into a problem with "CodeGen/MachineSink.cpp” [*],  for
>>> a target
>>> that has to expand Select into control flow.
>>> The original IR had two select in a row that were based on the same
>>> condition,
>>> so the CMP that sets the FLAGS reg in the second select was MCSE’ed to
>>> the
>>> earlier CMP in the first select, so here we see the second Select
>>> without a CMP:
>>>
>>> BB#10: derived from LLVM BB %for.body.5
>>>     Predecessors according to CFG: BB#3 BB#9
>>>         %vreg49<def> = PHI %vreg47, <BB#9>, %vreg48, <BB#3>;
>>> DataRegs:%vreg49,%vreg47,%vreg48
>>>
>>>                         ////  <=== this SLLI clobbers FLAGS <============
>>>         %vreg46<def> = SLLI %vreg5, 1, %FLAGS<imp-def,dead>;
>>> DataRegs:%vreg46,%vreg5
>>>         BCC 2, <BB#12>, %FLAGS<imp-use>
>>>     Successors according to CFG: BB#11 BB#12
>>>
>>>
>>> The problem is that Machine Code Sinking put an “SLLI" instruction,
>>>  that
>>> modifies the FLAGS registers, in between the CMP and the BCC.
>>>
>>> The way I was able to work around this problem was to add
>>> a command line option to “MachineSink.cpp” that defaults to false,
>>> and add a check in its runOnMachineFunction() to omit this pass.
>>>
>>>
>>> But I should not have had to, every FunctionPass and MachineFunctionPass
>>> should have a name and a command line option to disable it by name.
>>>
>>> Other compilers I’ve worked on have had such options, and I use them to
>>> track down compiler bugs.  In this case I instead had to
>>> "—debug-after-all"
>>> and very tediously search through thousands of lines of output to locate
>>> this bug.
>>>
>>>
>>> So I hope you can see where I’m coming from, the pass that deletes UB
>>> should be no different, I should be able to disable it from the command
>>> line
>>> as a matter of course.
>>>
>>> Thoughts ?
>>>
>>
>> These things seem like distinct goals to me - debuggability of the
>> compiler and dealing with UB. I think -fsanitize=undefined is a pretty good
>> way to help users deal with, diagnose, and act on UB. It isolates the issue
>> - rather than having all optimizations have ways of switching themselves
>> off when part of their analysis relies on UB - instead the optimizations
>> rely on the IR definitions to make valid transformations, and it's a
>> separate pass in the frontend (clang) that can add extra checks to avoid UB
>> (& diagnose it as such, if that's the best thing to do).
>>
>> One could make an IR version of UBSan, that could take some IR and add
>> null checks around every load/store, all the other things UBSan does. But I
>> wouldn't expect that would be a priority for anyone to build (as it's more
>> a compiler developer tool at that point - smaller audience, etc).
>>
>> For your debugging situation - bugpoint can slice & dice the pass list to
>> try to reduce both the set of code being optimized, and the particular
>> optimization that seems to be causing the problem. That might be worth a
>> shot for you in cases like this?
>>
>> Otherwise I find print-after-all or the like to be pretty handy. Usually
>> I reduce the input code first (with something like creduce or delta) so the
>> IR isn't massive/hard to read. (bugpoint can help with that reduction as
>> well)
>>
>> - Dave
>>
>>
>>>
>>>
>>> Peter.
>>>
>>>
>>> [* I haven’t reported this as a bug yet because I’m on 3.7.1, and
>>> haven’t had time
>>>     to replicate it in 4.0.1,  but should be able to within a month.  My
>>> target resembles
>>>     MSP430, so I’ll try to replicate it for that target in 4.0.1 ]
>>>
>>>
>>>
>>> On Jul 24, 2017, at 9:08 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>
>>>
>>> On Mon, Jul 24, 2017 at 9:02 AM Peter Lawrence <
>>> peterl95124 at sbcglobal.net> wrote:
>>>
>>>> On Jul 21, 2017, at 10:55 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> 2017-07-21 22:44 GMT-07:00 Peter Lawrence <peterl95124 at sbcglobal.net>:
>>>>
>>>>> Mehdi,
>>>>>            Hal’s transformation only kicks in in the *presence* of UB
>>>>>
>>>>
>>>> No, sorry I entirely disagree with this assertion: I believe we
>>>> optimize program where there is no UB. We delete dead code, code that never
>>>> runs, so it is code that does not exercise UB.
>>>>
>>>>
>>>>
>>>> Mehdi,
>>>>       I had to read that sentence several times to figure out what the
>>>> problem
>>>> is, which is sloppy terminology on my part
>>>>
>>>> Strictly speaking the C standard uses “undefined behavior” to describe
>>>> what
>>>> happens at runtime when an “illegal” construct is executed.  I have
>>>> been using
>>>> “undefined behavior” and UB to describe the “illegal” construct whether
>>>> it is
>>>> executed or not.
>>>>
>>>> Hence I say “Hal’s transform is triggered by UB”, when I should be
>>>> saying
>>>> “Hal’s transformation is triggered by illegal IR”.
>>>>
>>>> All I can say is I’m not the only one being sloppy, what started this
>>>> entire
>>>> conversation is the paper titled “Taming Undefined Behavior in LLVM”,
>>>> while
>>>> the correct title would be “Taming Illegal IR in LLVM”.  (I think we
>>>> are all
>>>> pretty confident that LLVM itself is UB-free, or at least we all hope
>>>> so :-).
>>>> I believe you are being sloppy when you say "we optimize program
>>>> where there is no UB”, because I believe you mean "we optimize program
>>>> under the assumption that there is no UB”. In other words we recognize
>>>> “Illegal” constructs and then assume they are unreachable, and delete
>>>> them, even when we can’t prove by any other means that they are
>>>> unreachable. We don’t know that there is no (runtime) UB, we just
>>>> assume it.
>>>>
>>>>
>>>> The example Hal showed does not exhibit UB, it is perfectly valid
>>>> according to the standard.
>>>>
>>>>
>>>> Whether it exhibits UB at runtime or not is not the issue, the issue is
>>>> what
>>>> a static analyzer or compiler can tell before runtime, see below
>>>>
>>>>
>>>>
>>>>> , and
>>>>> it does not matter how that UB got there, whether by function inlining
>>>>> or without function inlining.
>>>>>
>>>>> The problem with Hal’s argument is that the compiler does not have
>>>>> a built in ouija board with which it can conjure up the spirit of the
>>>>> author of the source code and find out if the UB was intentional
>>>>> with the expectation of it being deleted, or is simply a bug.
>>>>> Function inlining does not magically turn a bug into not-a-bug, nor
>>>>> does post-inlining simplification magically turn a bug into not-a-bug.
>>>>>
>>>>> Let me say it again:  if the compiler can find this UB (after whatever
>>>>> optimizations it takes to get there) then the static analyzer must
>>>>> be able to do the same thing, forcing the programmer to fix it
>>>>> rather than have the compiler optimize it.
>>>>>
>>>>
>>>> This is again incorrect: there is no UB in the program, there is
>>>> nothing the static analyzer should report.
>>>>
>>>>
>>>>
>>>> Hal’s example starts with this template
>>>>
>>>> template <typename T>
>>>> int do_something(T mask, bool cond) {
>>>>   if (mask & 2)
>>>>     return 42;
>>>>
>>>>   if (cond) {
>>>>     T high_mask = mask >> 48;                // UB if sizeof(T) < 8,
>>>> and cond true
>>>>     if (high_mask > 5)
>>>>       do_something_1(high_mask);
>>>>     else
>>>>       do_something_2();
>>>>   }
>>>>
>>>>   return 0;
>>>> }
>>>>
>>>>
>>>> Which is then instantiated with T = char,
>>>> and where it is impossible for either a static analyzer or a
>>>> compiler to figure out and prove that ‘cond’ is always false.
>>>>
>>>> Hence a static analyzer issues a warning about the shift,
>>>> while llvm gives no warning and instead optimizes the entire
>>>> if-statement away on the assumption that it is unreachable.
>>>>
>>>> Yes a static analyzer does issue a warning in this case.
>>>>
>>>>
>>>> This is not the only optimization to be based on assumption
>>>> rather than fact, for example type-based-alias-analysis is
>>>> based on the assumption that the program is free of this sort
>>>> of aliasing. The difference is that a user can disable TBAA
>>>> and only TBAA if a program seems to be running incorrectly
>>>> when optimized and thereby possibly track down a bug, but
>>>> so far there is no command line option to disable UB-based-
>>>> analysis (or ‘illegal-IR-based” :-), but there really needs to be.
>>>>
>>>> Do we at least agree on that last paragraph ?
>>>>
>>>
>>> We likely agree it's good to have tools to help developers
>>> identify/diagnose UB in their programs. And we have that:
>>> -fsanitize=undefined (not only does it effectively disable many UB-based
>>> optimizations (because it makes them not undefined - by conditionalizing
>>> the code to check that UB isn't reached, as such) - it even provides pretty
>>> diagnostics (of course you can't actually continue running the program - if
>>> the line after the diagnostic will dereference a null pointer - there's no
>>> non-null pointer we can magic-up, so execution must stop))
>>>
>>>
>>>>
>>>>
>>>> Peter Lawrence.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> The compile is still able to delete some code, because of breaking the
>>>> abstraction through inlining or template instantiation for example (cf Hal
>>>> example).
>>>>
>>>> --
>>>> Mehdi
>>>>
>>>>
>>>>
>>>>>
>>>>> Or, to put it another way:  there is no difference between a compiler
>>>>> and a static analyzer [*]. So regardless of whether it is the compiler
>>>>> or
>>>>> the static analyzer that finds any UB, the only rational thing to do
>>>>> with
>>>>> it is report it as a bug.
>>>>>
>>>>>
>>>>> Peter Lawrence.
>>>>>
>>>>>
>>>>> [* in fact that’s one of the primary reasons Apple adopted llvm, to use
>>>>>   It as a base for static analysis]
>>>>>
>>>>>
>>>>>
>>>>> On Jul 21, 2017, at 10:03 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> 2017-07-21 21:27 GMT-07:00 Peter Lawrence <peterl95124 at sbcglobal.net>:
>>>>>
>>>>>> Sean,
>>>>>>      Let me re-phrase a couple words to make it perfectly clear
>>>>>>
>>>>>> On Jul 21, 2017, at 6:29 PM, Peter Lawrence <
>>>>>> peterl95124 at sbcglobal.net> wrote:
>>>>>>
>>>>>> Sean,
>>>>>>
>>>>>> Dan Gohman’s “transform” changes a loop induction variable, but does
>>>>>> not change the CFG,
>>>>>>
>>>>>> Hal’s “transform” deletes blocks out of the CFG, fundamentally
>>>>>> altering it.
>>>>>>
>>>>>> These are two totally different transforms.
>>>>>>
>>>>>>
>>>>>>
>>>>>> And even the analysis is different,
>>>>>>
>>>>>> The first is based on an *assumption* of non-UB (actually there is no
>>>>>> analysis to perform)
>>>>>>
>>>>>>                        the *absence* of UB
>>>>>>
>>>>>>
>>>>>> the second Is based on a *proof* of existence of UB (here typically
>>>>>> some non-trivial analysis is required)
>>>>>>
>>>>>>                         the *presence* of UB
>>>>>>
>>>>>> These have, practically speaking, nothing in common.
>>>>>>
>>>>>>
>>>>>>
>>>>>> In particular, the first is an optimization, while the second is a
>>>>>> transformation that
>>>>>> fails to be an optimization because the opportunity for it happening
>>>>>> in real world
>>>>>> code that is expected to pass compilation without warnings, static
>>>>>> analysis without
>>>>>> warnings, and dynamic sanitizers without warnings, is zero.
>>>>>>
>>>>>> Or to put it another way, if llvm manages to find some UB that no
>>>>>> analyzer or
>>>>>> sanitizer does, and then deletes the UB, then the author of that part
>>>>>> of llvm
>>>>>> is in the wrong group, and belongs over in the analyzer and/or
>>>>>> sanitizer group.
>>>>>>
>>>>>
>>>>> I don't understand your claim, it does not match at all my understand
>>>>> of what we managed to get on agreement on in the past.
>>>>>
>>>>> The second transformation (dead code elimination to simplify) is based
>>>>> on the assumption that there is no UB.
>>>>>
>>>>> I.e. after inlining for example, the extra context of the calling
>>>>> function allows us to deduce the value of some conditional branching in the
>>>>> inline body based on the impossibility of one of the path *in the context
>>>>> of this particular caller*.
>>>>>
>>>>> This does not mean that the program written by the programmer has any
>>>>> UB inside.
>>>>>
>>>>> This is exactly the example that Hal gave.
>>>>>
>>>>> This can't be used to expose any meaningful information to the
>>>>> programmer, because it would be full of false positive. Basically a program
>>>>> could be clean of any static analyzer error, of any UBSAN error, and
>>>>> totally UB-free, and still exhibit tons and tons of such issues.
>>>>>
>>>>> --
>>>>> Mehdi
>>>>>
>>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170731/ca951c14/attachment.html>