[PATCH] D73127: AMDGPU/GlobalISel: Widen non-power-of-2 load results

Wed Feb 5 05:59:52 PST 2020

arsenm marked an inline comment as done.
arsenm added a comment.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:720-722
+    unsigned Align = Query.MMODescrs[0].AlignInBits;
+    unsigned RoundedSize = NextPowerOf2(Size);
+    return (Align >= RoundedSize);
----------------
nhaehnle wrote:
> As long as the alignment and size are both at least 32, I believe we can always support it. E.g., a dwordx4 load from a pointer that's 4-byte aligned is okay.
> 
> Though... admittedly in that case you may end up loading an additional cache line which you otherwise wouldn't load, because you cross a cacheline boundary. So maybe the right decision is to keep the test as is?
> 
> Either way, the decision ought to be documented as a comment.
This isn't really changing the handling of alignment breakdown, or even the memory access. This is only for the result register type, and just needs to avoid doing something incorrect. The decisions for decomposing based on alignment is already present in the other legalization rules. The rule set here is pretty confusing, and soon due for a rewrite after a few more issues are fixed in the LegalizerHelper

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73127/new/

https://reviews.llvm.org/D73127