[PATCH] D91501: [VPlan] VPTransformState::get() can always return lane 0 for uniforms.

Thu Jan 28 13:18:18 PST 2021

a.elovikov added a comment.
Herald added a subscriber: tschuett.

Not blocking this review, but I think it's bug-prone to use the same storage for lane 0 of scalarized divergent values and for truly uniform values that can be kept in a single scalar. Possible examples:

  top-loop:
    if %iv % VF != 0:
      inner-loop:
        %iv = phi [ 0, inner.ph ], [ %iv.next, inner.latch ] ; Uniform, but lane 0 doesn't make much sense since it is masked out.
        ...
        divergent exit condition

  bb:
    %sel = select i1 %divergent, 42, %divergent.def ; divergent in general
    use %sel
    br i1 %divergent, label %uni.use.bb, label %bb2

  uni.use.bb:
    %uni.phi = phi [ %sel, %bb ] ; "Conditionally" uniform - all active lanes have the same uniform value
    ; Long compute chain based on %uni.phi that we'd like to keep on a single scalar

In the latter case the correct extract for the uniform value would be from the first *active* lane, not from lane 0. I believe it's very easy to make a mistake if the same data storage is used both for the scalarized parts of divergent values and for truly uniform values that should be kept in a single scalar def/register.
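To make the second example concrete, here is a minimal standalone sketch of the difference between the two extracts. It is not VPlan code: `Lanes`, `Mask`, `firstActiveLane`, `extractLane0` and `extractFirstActive` are hypothetical names made up for illustration, and VF = 4 is an arbitrary choice.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Assumed vectorization factor for this illustration only.
constexpr std::size_t VF = 4;

// Hypothetical model of a widened value: one scalar per lane, plus the
// mask of lanes that are active in the current block.
using Lanes = std::array<int, VF>;
using Mask = std::array<bool, VF>;

// Index of the first active lane, or nullopt if every lane is masked out.
std::optional<std::size_t> firstActiveLane(const Mask &M) {
  for (std::size_t L = 0; L < VF; ++L)
    if (M[L])
      return L;
  return std::nullopt;
}

// Unconditional lane 0 extract: only meaningful for truly uniform values.
int extractLane0(const Lanes &V) { return V[0]; }

// Extract for a "conditionally" uniform value: all active lanes agree,
// so read the first active one. Inactive lanes may hold anything.
std::optional<int> extractFirstActive(const Lanes &V, const Mask &M) {
  if (auto L = firstActiveLane(M))
    return V[*L];
  return std::nullopt;
}
```

With `%sel` modeled as `{7, 42, 42, 9}` and the `uni.use.bb` mask as `{false, true, true, false}`, `extractLane0` returns 7 (the `%divergent.def` value of an inactive lane), while `extractFirstActive` returns the intended 42.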

To summarize: I think it's possible to implement everything correctly by repurposing lane 0 storage for uniform values, but in the future, once we try to implement more complex optimizations, it might cause confusion and omissions that lead to silent miscompiles (e.g. extracting undef values from lane 0 instead of extracting the required uniform value from the first active lane).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91501/new/

https://reviews.llvm.org/D91501