[PATCH] D144445: [ConstantFold][InstSimplify] folding load for constant global patterened arrays and structs

Kohei Asano via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 28 05:06:26 PST 2023


khei4 planned changes to this revision.
khei4 added a comment.

Thank you for the review!



================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:781
+  // elements
+  if ((!CSTy || !CSTy->isPacked()) && !CATy)
+    return nullptr;
----------------
nikic wrote:
> This looks like a leftover from a previous implementation? I don't think your current one has any limitations regarding the type of the initializer.
I might have cared padding bit, but it's no longer needed if I use `ConstantFoldLoadFromConst`


================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:786
+  // Bail for large initializers in excess of 64K to avoid allocating
+  // too much memory.
+  if (!GVSize || UINT16_MAX < GVSize)
----------------
nikic wrote:
> If we have a 64K global and are reading 4 byte elements (i32) from it, then we need to check that 16k elements are the same. This is too expensive, as the compile-time regression shows.
> 
> We should limit this more aggressively to achieve reasonable compile-times, e.g. to 1K globals.
Sounds reasonable. I should have estimated it.


================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:794
+  unsigned char *GVBytes = RawBytes.data();
+  if (!ReadDataFromGlobal(C, 0, GVBytes, GVSize, DL))
+    return nullptr;
----------------
nikic wrote:
> Rather than `ReadDataFromGlobal()`, we should use `ConstantFoldLoadFromConst()` here (or rather in the loop below). This has two advantages: It's going to handle all types (including pointer loads), and it allows an early bailout if not all elements are the same. Your current implementation will always read the entire global, even if it's clear after two elements that it's not uniform.
> 
> It will also take care of endianness itself.
> Rather than `ReadDataFromGlobal()`, we should use `ConstantFoldLoadFromConst()` here (or rather in the loop below). This has two advantages: It's going to handle all types (including pointer loads), and it allows an early bailout if not all elements are the same. Your current implementation will always read the entire global, even if it's clear after two elements that it's not uniform.
> It will also take care of endianness itself.

Thank you! I couldn't notice that function. That sounds much more efficient than the current whole global variable reading. 


================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:798
+  unsigned LoadSize = LoadTy->getScalarSizeInBits() / 8;
+  if (LoadAlign <= GVSize)
+    for (unsigned ByteOffset = LoadAlign, E = GVSize - LoadSize;
----------------
nikic wrote:
> If the load alignment is used, you also need to make sure that the global alignment is `>=` the load alignment. Otherwise it doesn't really tell you anything.
> 
> Though I'm not sure we really want to use alignment to determine access stride -- this is pretty unusual. Based this on the getelementptr stride would be more typical and handle cases that alignment can't (e.g. non-power-of-two stride).
> If the load alignment is used, you also need to make sure that the global alignment is >= the load alignment. Otherwise it doesn't really tell you anything.

Right. I'll handle that case. 

> Based this on the getelementptr stride would be more typical and handle cases that alignment can't (e.g. non-power-of-two stride)

Ok, I am a little confused. Overall I thought I don't need to care about the case that alive decides any value could be returned. Although I didn't,  `undef` value could be returned?

- if `Global Variable's alignment` > `load's alignment`, then whatever offsetted pointer is given to load, then any value could be returned.

https://alive2.llvm.org/ce/z/5_EDLB

- even if `Global Variable's alignment` <= `load's alignment`, the offset is not the multiple of load's alignment, then any value could be returned.

https://alive2.llvm.org/ce/z/Qnon4K

based on these assumptions, I used alignment to calculate possible valid accessible offsets. Are these assumptions and alive results correct?
I might have misunderstood or broadly interpret alive results.

But I'm feeling it's also good to compute stride from GEP and use a bigger one. A possible minimum stride can be gained by calculating the greatest common divisor of GEP-type sizes, so it's simpler than I expected.
(This might be out of scope, it might be better not to use GEP idx, I saw [[ https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699 | your proposal ]]  )


================
Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:6640
+  if (!GV || !GV->isConstant() ||
+      (!GV->hasDefinitiveInitializer() && !GV->hasUniqueInitializer()))
     return nullptr;
----------------
nikic wrote:
> hasUniqueInitializer() is only relevant if you want to modify an initializer, it should not be checked here.
Yeah, you're also right here. I removed it. Thanks!


================
Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:6651
+          GV->getInitializer(), LI->getType(), LI->getAlign().value(), Q.DL))
+    return C;
 
----------------
nikic wrote:
> It might make sense to move this to the `else` of the below branch. If we know a constant offset, then we don't need to go through this code.
Sounds reasonable. I'll fix.


================
Comment at: llvm/test/Transforms/InstSimplify/load-patterned-aggregates.ll:3
 ; RUN: opt < %s -passes=instsimplify -S | FileCheck %s
 target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"
 @constzeroarray = internal constant [4 x i32] zeroinitializer
----------------
nikic wrote:
> Rather than hardcoding the data layout, it's better to do something like this:
> ```
> ; RUN: opt < %s -passes=instsimplify -S -data-layout="e" | FileCheck %s --check-prefixes=CHECK,LE
> ; RUN: opt < %s -passes=instsimplify -S -data-layout="E" | FileCheck %s --check-prefixes=CHECK,BE
> ```
> This will check both little and big endian.
Cool! Thanks!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144445/new/

https://reviews.llvm.org/D144445



More information about the llvm-commits mailing list