[PATCH] D41651: AMDGPU: Add 32-bit constant address space

Thu Feb 15 04:04:03 PST 2018

alex-t added a comment.

In fact, v_readfirstlane is inserted by ISel to glue a vector input to a scalar instruction that does not expect it.
This means that a compiler user writing valid IR will get unexpected behavior.
Is this documented somewhere?

My objections WRT the implementation are:
Bypassing the normal way of processing value divergence is misleading. I was very surprised to see "amdgpu.uniform" metadata already set at the point (AMDGPUAnnotateUniformValues) where uniformity is expected to be queried from the DA.
Moreover, it was set for a value that the DA reports as divergent!

The correct place to do this is the **TargetTransformInfo::isAlwaysUniform** hook that I added specifically for handling target-specific features (like readfirstlane itself, BTW).
Using that hook lets the DA correctly process values that produce a uniform result regardless of their operands' divergence. The DA then computes the divergence for such exceptions in the normal way, and no metadata hackery is needed.
We just query the divergence in AMDGPUAnnotateUniformValues as we did before and set metadata accordingly.
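
To illustrate the flow I mean, here is a minimal sketch using toy types rather than the real LLVM classes. ToyValue, the free function isAlwaysUniform, isDivergent and shouldAnnotateUniform are all hypothetical stand-ins for this comment; the real DA propagates divergence forward from its sources and also handles control dependence, which this sketch ignores.

```cpp
#include <string>
#include <vector>

// Toy stand-in for an IR value (hypothetical, not an LLVM class).
struct ToyValue {
  std::string Opcode;
  std::vector<const ToyValue *> Operands;
  bool IsDivergenceSource; // e.g. a workitem id
};

// Toy counterpart of the target hook: values whose result is uniform
// no matter how divergent their operands are.
bool isAlwaysUniform(const ToyValue &V) {
  return V.Opcode == "readfirstlane";
}

// Toy divergence query. The hook is consulted first, so an always-uniform
// value cuts off divergence coming from its operands.
bool isDivergent(const ToyValue &V) {
  if (isAlwaysUniform(V))
    return false;
  if (V.IsDivergenceSource)
    return true;
  for (const ToyValue *Op : V.Operands)
    if (isDivergent(*Op))
      return true;
  return false;
}

// Toy counterpart of the annotation step: it only asks the analysis and
// never decides uniformity on its own.
bool shouldAnnotateUniform(const ToyValue &V) {
  return !isDivergent(V);
}
```

With a chain such as workitem id -> add -> readfirstlane -> load, the first two values come out divergent, while the readfirstlane and the load that uses it come out uniform, so the annotation step would mark only the latter two without any special casing.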

Repository:
  rL LLVM

https://reviews.llvm.org/D41651