[PATCH] D42079: AMDGPU: Add a function attribute that shrinks buggy s_buffer opcodes on GFX9

Mon Jan 15 11:23:52 PST 2018

mareko added inline comments.

================
Comment at: lib/Target/AMDGPU/SIShrinkInstructions.cpp:310-312
+      if (ST.hasBuggySBufferLoadStoreAtomicxN() &&
+          MFI->shrinkBuggySBufferLoadStoreAtomicxN() &&
+          TII->isSMRD(MI.getOpcode())) {
----------------
arsenm wrote:
> mareko wrote:
> > arsenm wrote:
> > > We shouldn't be putting bug workarounds in an optimization pass. This should probably be part of the initial selection
> > What initial section are you talking about? Note that the placement of this pass is perfect, because the new code needs to run after SILoadStoreOptimizer.
> One option would be to do this in the DAG and split the intrinsics before getting selected in the first place, or we could do this in AdjustInstrPostInstrSelection or in a new bug workaround pass. 
> 
> Why does it need to be after SILoadStoreOptimizer? Is it just because it will try to merge and form these? If it's buggy it should just not do that in the first place.
The intrinsic is translated into s_buffer_load_dword only. There are no intrinsics for the xN opcodes - this is actually the optimal situation because we have SILoadStoreOptimizer. SILoadStoreOptimizer merges s_buffer_load_dword into x2 and x4 - this is the only place that generates the xN opcodes. This new code needs to run after that to convert s_buffer_load into s_load for xN opcodes where N >= 2.

Repository:
  rL LLVM

https://reviews.llvm.org/D42079