[PATCH] D79272: [mlir][StandardToSPIRV] Emulate bitwidths not supported for store op.

Mon May 4 11:48:59 PDT 2020

mravishankar accepted this revision.
mravishankar added inline comments.
This revision is now accepted and ready to land.

================
Comment at: mlir/lib/Conversion/StandardToSPIRV/ConvertStandardToSPIRV.cpp:748-758
+  // Since there are multi threads in the processing, the emulation will be done
+  // with atomic operations. E.g., if the storing value is i8, rewrite the
+  // StoreOp to
+  // 1) load a 32-bit integer
+  // 2) clear 8 bits in the loading value
+  // 3) store 32-bit value back
+  // 4) load a 32-bit integer
----------------
mravishankar wrote:
> antiagainst wrote:
> > mravishankar wrote:
> > > I think these steps will not give the result that you want in all cases. There will be a race between the other threads that are modifying the same word, but different bits in the word. I dont see a way around it apart from using a atomic compare exchange.
> > > 
> > > 
> > > do {
> > >   %val = load %basePtr[%baseOffset] : !spv.ptr<i32, ..>
> > >   %clearbits = and %val, %mask : i32 // Clear the bits that need to be updated.
> > >   %updateval = or %clearbits, %update : i32 // Update the bits as required
> > > } while( %val != atomicCompareExchange(%basePtr, %updateVal, %val));
> > > 
> > > See semantics of compareExchange here : https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpAtomicCompareExchange. If the condition of the while failed, it implies the compareExchange succeeded, i.e. the value at %basePtr was same as the %val that was loaded. If the condition failed, then the value at %basePtr was different from what was loaded indicating a stale load (due to an other write to the same word).
> > > 
> > > 
> > > 
> > I might be missing something but the approach here should be fine given each thread is actually only touching strictly disjoint portion of the word? Could you elaborate a bit here regarding the issue?
> I dont think that is the case. Here is the proposed sequence of operations by this patch
> 
> load
> atomicAnd
> load
> atomicOr.
> 
> Lets just focus on the last two instructions
> 
> load
> atomicOr
> 
> between the load and the atomicOr some other thread might have written to the same word but at different bits. So the load value that is seen might be different than what exists in memory at the time of atomicOr. A thread can overwrite the entire word if it can guarantee that the entire word was same as it saw when it did the load. Otherwise its load value is stale. It has to abort and retry. Thats what the suggested snippet is doing. It ensures that when a thread updates the word, the value it saw at the load is not stale.
Sorry, got confused by the comment. There is no intermediate load. Then this works. Apologies for the noise.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79272/new/

https://reviews.llvm.org/D79272