[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?

Fri May 26 20:15:48 PDT 2017

On 05/26/2017 04:27 PM, Matthias Braun wrote:
>
>> On May 26, 2017, at 2:02 PM, Hal Finkel <hfinkel at anl.gov 
>> <mailto:hfinkel at anl.gov>> wrote:
>>
>>
>> On 05/26/2017 03:02 PM, Matthias Braun wrote:
>>>
>>>> Regarding SDAG, and given that poison is already there, we would 
>>>> need to adopt a similar solution to the IR.  Maybe right now we can 
>>>> get away with it because nsw is not exploited significantly (as you 
>>>> say).  Just because there’s no explicit poison in SDAG, just having 
>>>> nsw is sufficient to cause miscompilations when combined with other 
>>>> transformations.
>>>> See, for example, this bug report for InstCombine regarding 
>>>> select+nsw:_https://bugs.llvm.org/show_bug.cgi?id=31633_
>>>
>>> Poison/Undef at the CodeGen level is a very interesting discussion! 
>>> I don't think there is any documentation about posion/undef at the 
>>> CodeGen level and I haven't discussed this much with others, so I'd 
>>> like to hear further feedback:
>>>
>>> - I think we should not introduce a notion of poison (which I would 
>>> call "delayed UB") at the SelectionDAG/CodeGen level[1]
>>> - Instructions either produce UB immediately, so while there are 
>>> nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can 
>>> only assume that they produce a target specific value on overflow, 
>>> but not arbitrary behavior. Instructions that can produce UB should 
>>> marked "hasSideEffect" and code motion around it be limited.
>>
>> You mean like every integer division? Having arbitrary side effects 
>> seems too strong.
> If we can model it more precise sure. But in principle if you divide 
> by zero many CPUs will trap immediately.

We don't want to limit our ability to move memory references around 
integer division (for scheduling, etc.).

>
>>
>>> - Typical optimization scenarios like establishing loop trip count 
>>> bounds for which poison/UB is helpful should not matter for CodeGen.
>>> - I don't have any evidence that optimizations in CodeGen require a 
>>> model of poison to work. If someone can given me a counter example 
>>> then I'd be hard pressed to disable the optimization in MI and push 
>>> it towards the IR level.
>>
>> I'm not sure it is always possible to push these optimizations to the 
>> IR level, at least not without adding more Value* ties to 
>> instructions (e.g. what we do with MMOs). I have some experience, for 
>> example, taking optimizations taking advantage of 
>> hardware-counter-based loops and moving into the IR level (so that we 
>> can take advantage of ScalarEvolution), and it works to some extent, 
>> but is also fairly hacky (the legality-checking part of 
>> lib/Target/PowerPC/PPCCTRLoops.cpp, for example, needs to understand 
>> what legalization will later do - it's the best we can do right now, 
>> but it's a mess). If we had better loop analysis at the MI level, 
>> this would be much better. We already have some interesting MI-level 
>> passes that do interesting things with loops 
>> (lib/CodeGen/MachinePipeliner.cpp, for example).
>>
>> I think that we should have the same semantics here on both the IR 
>> level and the MI level (whatever they are). Having different 
>> semantics in this regard is going to be confusing (and lead to subtle 
>> bugs because optimizations valid in one part of the pipeline will be 
>> invalid in other parts). These issues often come up around 
>> speculative execution, for example, and I definitely think we need to 
>> be able to deal with speculative execution at the MI level (because 
>> speculation opportunities often come from legalization/isel and 
>> pseudo-instruction expansion and because the modeling is better at 
>> the MI level).
>
> Does that mean a vreg (or even a physreg) can conceptually hold a 
> poison value? Given that failed to capture the semantics at the IR 
> level I have bad feelings about getting this right at the MI level 
> with all the additional machine specific semantics. Reasoning about 
> optimisations definitely gets easier without poison and I'd really 
> like to see some good example first to motivate the maintenance burden.
>
> A notion of unknown/unspecified value should be enough to enable 
> hoisting, we shouldn't need poison for that.

I think that we should get this right once, and when we do, we should 
use these self-consistent semantics everywhere. I don't see why 
machine-specific semantics make this harder or easier in general (there 
are certainly instructions that won't have UB even when their 
corresponding IR-level counterparts do, which might make things easier, 
and instructions can have more dependencies to track, which might make 
things harder).

  -Hal

>
> - Matthias

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170526/a8c5a1d3/attachment.html>