[PATCH] D80119: [AMDGPU/MemOpsCluster] Code clean-up around mem ops clustering logic

Mon May 18 07:29:32 PDT 2020

foad added a comment.

> (1) NumLoads argument can initially be set to 2 instead of 1, this avoids incrementing NumLoads while calling shouldClusterMemOps()

Alternatively you could move `++ClusterLength` to just before the call to `shouldClusterMemOps`. But I really don't think this is very important.

> (2) Re-arrange the code within clusterNeighboringMemOps(), so that the code is more convenient to understand

I agree that using `continue` is more readable.

> (3) Improve the logic within shouldClusterMemOps() and remove all FIXMEs

You've changed it to accurately count bytes in a cluster, which is nice, but (a) it's a change in behvaiour so needs test updates and benchmarking, and (b) I don't particularly like the implementation. I think a cleaner way to do this would be to change all target `getMemOperandsWithOffset` functions to return a Width (in bytes) as well (some targets already have an internal `getMemOperandsWithOffsetWidth` function which does this) and change MachineScheduler to use this information to track the size of a cluster, and pass that information into `shouldClusterMemOps`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80119/new/

https://reviews.llvm.org/D80119