[PATCH] D26348: Allow convergent attribute for function arguments

Nicolai Hähnle via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 7 02:30:01 PST 2016


nhaehnle created this revision.
nhaehnle added reviewers: arsenm, tstellarAMD, mehdi_amini, jlebar.
nhaehnle added a subscriber: llvm-commits.
Herald added a subscriber: wdng.

Convergent functions are functions for which the fact that they are called
must be uniform across multiple threads in an SIMT/SPMD-type execution
model. By analogy, convergent function arguments are arguments whose value
must be uniform across multiple threads.

The problem this is intended to address is the following. For AMDGPU, and
also under general GLSL semantics, the sequence

  %v1 = texelFetch(%sampler, %coord0)
  %v2 = texelFetch(%sampler, %coord1)
  %v = select i1 %cond, vType %v1, %v2

is logically equivalent to and could benefit from being transformed to:

  %coord = select i1 %cond, cType %coord0, %coord1
  %v = texelFetch(%sampler, %coord)

On the other hand,

  %v1 = texelFetch(%sampler0, %coord)
  %v2 = texelFetch(%sampler1, %coord)
  %v = select i1 %cond, vType %v1, %v2

must not be transformed to

  %s = select i1 %cond, sType %sampler0, %sampler1
  %v = texelFetch(%s, %coord)

because of uniformity restrictions on the first argument of texelFetch.
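
The difference between the two cases can be sketched with a toy lockstep
model. This is only an illustration, not code from the patch: the lane count,
the `texel_fetch` helper and the rule that the wave uses the first lane's
sampler are all assumptions made for the example.

```python
# Toy SIMT model: a wave of lanes executes every operation in lockstep.
# ASSUMPTION (not from the patch): the sampler operand is scalar, so one
# value, here the first lane's, is used for the whole wave.

def texel_fetch(samplers, coords):
    """Wave-wide fetch: per-lane coordinates, one sampler for all lanes."""
    wave_sampler = samplers[0]
    return [wave_sampler(c) for c in coords]

def select(cond, a, b):
    """Per-lane select, like `select i1 %cond, %a, %b`."""
    return [x if c else y for c, x, y in zip(cond, a, b)]

def uniform(value, lanes):
    """Broadcast one value to every lane."""
    return [value] * lanes

# Hypothetical textures: a fetch returns (texture name, coordinate).
sampler0 = lambda c: ("tex0", c)
sampler1 = lambda c: ("tex1", c)

LANES = 4
cond = [True, False, True, False]   # divergent condition
coord0 = [0, 1, 2, 3]
coord1 = [10, 11, 12, 13]

# Case 1: select between coordinates (a per-lane operand).
before = select(cond,
                texel_fetch(uniform(sampler0, LANES), coord0),
                texel_fetch(uniform(sampler0, LANES), coord1))
after = texel_fetch(uniform(sampler0, LANES), select(cond, coord0, coord1))
coord_ok = before == after

# Case 2: select between samplers (an operand that must be uniform).
before = select(cond,
                texel_fetch(uniform(sampler0, LANES), coord0),
                texel_fetch(uniform(sampler1, LANES), coord0))
after = texel_fetch(select(cond, uniform(sampler0, LANES),
                           uniform(sampler1, LANES)), coord0)
sampler_ok = before == after

print(coord_ok, sampler_ok)  # prints "True False"
```

In this model the coordinate rewrite preserves every lane's result, while the
sampler rewrite makes lanes 1 and 3 read from the wrong texture.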

While InstCombine does not actually perform these transforms today,
SimplifyCFG does tail sinking that amounts to the same thing, and there are
shaders in the wild that are mis-compiled because of it.
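
The control-flow variant of the same problem can be sketched as follows. It
is a hedged illustration only: the execution-mask model, the helper names and
the "first active lane supplies the sampler" rule are assumptions of this
example, and `sunk` merely mimics the shape of the code after tail sinking
rather than reproducing what SimplifyCFG emits.

```python
# Toy SIMT model with an execution mask.
# ASSUMPTION (not from the patch): the wave uses the sampler value held by
# its first active lane for every active lane.

def texel_fetch(samplers, coords, active):
    """Wave-wide fetch restricted to the active lanes."""
    if not any(active):
        return [None] * len(coords)
    wave_sampler = samplers[active.index(True)]
    return [wave_sampler(c) if a else None for c, a in zip(coords, active)]

def branchy(cond, s0, s1, coords):
    """Original shape: one fetch per branch, each with a uniform sampler."""
    n = len(cond)
    then_mask = list(cond)
    else_mask = [not c for c in cond]
    v_then = texel_fetch([s0] * n, coords, then_mask)
    v_else = texel_fetch([s1] * n, coords, else_mask)
    # Merge point: each lane keeps the value from the branch it took.
    return [t if c else e for c, t, e in zip(cond, v_then, v_else)]

def sunk(cond, s0, s1, coords):
    """Shape after tail sinking: one fetch fed by a per-lane sampler phi."""
    n = len(cond)
    s = [s0 if c else s1 for c in cond]
    return texel_fetch(s, coords, [True] * n)

# Hypothetical textures: a fetch returns (texture name, coordinate).
sampler0 = lambda c: ("tex0", c)
sampler1 = lambda c: ("tex1", c)
coords = [0, 1, 2, 3]

divergent = [True, False, True, False]
all_taken = [True, True, True, True]

same_when_uniform = (branchy(all_taken, sampler0, sampler1, coords)
                     == sunk(all_taken, sampler0, sampler1, coords))
same_when_divergent = (branchy(divergent, sampler0, sampler1, coords)
                       == sunk(divergent, sampler0, sampler1, coords))

print(same_when_uniform, same_when_divergent)  # prints "True False"
```

Under these assumptions the two shapes only disagree when the branch
condition diverges within the wave, which is why such a miscompile can go
unnoticed in simple tests.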

In other words, this patch is really a bug fix, but it tries to fix the bug
without an unnecessary performance regression while keeping the door open for
future optimization improvements.

This is very much an RFC, and feedback is appreciated, but I'd personally be
happy to push the patch as-is (minus the part of the select-call.ll test that
merely illustrates a potential future improvement, and minus the incomplete
formalization).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988


https://reviews.llvm.org/D26348

Files:
  docs/LangRef.rst
  include/llvm/IR/Intrinsics.td
  include/llvm/IR/IntrinsicsAMDGPU.td
  lib/AsmParser/LLParser.cpp
  lib/IR/Verifier.cpp
  lib/Transforms/Utils/SimplifyCFG.cpp
  test/Bitcode/attributes.ll
  test/Transforms/InstCombine/select-call.ll
  test/Transforms/SimplifyCFG/convergent.ll
  utils/TableGen/CodeGenIntrinsics.h
  utils/TableGen/CodeGenTarget.cpp
  utils/TableGen/IntrinsicEmitter.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D26348.77022.patch
Type: text/x-patch
Size: 14006 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161107/3e021b9a/attachment-0001.bin>

