[PATCH] D76096: [clang] allow const structs to be constant expressions in initializer lists

Eli Friedman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 24 17:45:55 PDT 2023


efriedma added a comment.

> It makes sense to me that const int foo[] = [ ...massive list...]; would take long to validate the entire initializer as all constant expressions

The expensive part we're currently avoiding by bailing out of the constant evaluator (the code D76169 <https://reviews.llvm.org/D76169> removes) is actually constructing an APValue; constructing APValues for large arrays and structs is disproportionately expensive compared to just walking the corresponding AST.



================
Comment at: clang/lib/CodeGen/CGExprConstant.cpp:1013
+      if (V->hasInit())
+        return Visit(V->getInit(), V->getType());
+    return nullptr;
----------------
nickdesaulniers wrote:
> efriedma wrote:
> > rsmith wrote:
> > > efriedma wrote:
> > > > rsmith wrote:
> > > > > nickdesaulniers wrote:
> > > > > > efriedma wrote:
> > > > > > > You need to be more careful here; we can call ConstExprEmitter for arbitrary expressions.
> > > > > > "Be more careful" how?
> > > > > Here are some specific cases in which you need to be more careful because the code as-is would do the wrong thing:
> > > > > 
> > > > >  * when emitting a global's initializer in C++, where the value of the object denoted by the DeclRefExpr could have changed between its initialization and the expression we're currently emitting 
> > > > >  * when emitting anything other than a global initializer in C, where the value of a global could have changed after its emission
> > > > >  * when emitting a reference to a non-global declaration in C (local variables might change between initialization and use)
> > > > > 
> > > > > We would need to restrict this to the cases where the variable's value cannot possibly have changed between initialization and use.
> > > > > 
> > > > > In C, that's (mostly) the case for a static storage variable referenced from the initializer of a static storage variable, for a thread storage variable referenced from the initializer of a static storage variable, or for a thread storage variable referenced from the initializer of a thread storage variable. Even then, this isn't strictly correct in the presence of DSOs, but I think it should be correct if the two variables are defined in the same translation unit.
> > > > > 
> > > > > In C++, that's (mostly) the case when the variable is `const` or `constexpr` and has no mutable subobjects. (There's still the case where the reference happens outside the lifetime of the object -- for the most part we can handwave that away by saying it must be UB, but that's not true in general in the period of construction and period of destruction.)
> > > > > 
> > > > > In both cases, the optimization is (arguably) still wrong if there are any volatile subobjects.
> > > > And this is why I don't want to duplicate the logic. :)
> > > > 
> > > > I'd rather not make different assumptions for C and C++; instead, I'd prefer to just use the intersection that's safe in both.  I'm concerned that we could come up with weird results for mixed C and C++ code, otherwise.  Also, it's easier to define the C++ rules because we can base them on the constexpr rules in the standard.
> > > I agree we probably want the same outcome as D76169 provides, if either the performance is acceptable or we can find a way to avoid the performance cost for the cases we already accept. Perhaps we could get that outcome by ensuring that we try the CGExprConstant fast-path (at least in C compilations, maybe in general) before we consider the complete-but-slow evaluation strategy in ExprConstant?
> > I like the idea of guarding constant evaluation with a "fast path" that doesn't actually compute the APValues in simple cases.  We can just implement the simple stuff, and fall back to the full logic for complicated stuff.
> > 
> > My one concern is that we could go to a bunch of effort to emit a variable on the fast path, then fall off the fast path later because "return a[0];" tries to constant-fold a big array "a".
> > the CGExprConstant fast-path
> 
> Sorry, what is "the CGExprConstant fast-path"?
The "fast-path" would be the existing code in ConstExprEmitter.   The idea is to make the constant evaluator the primary source of truth in both C and C++, along the lines of D76169, but keep some code around in CGExprConstant to go directly from the AST to LLVM IR in cases where we don't really need the full power of the constant evaluator.

So take the following steps:
- Change ConstantEmitter::tryEmitPrivate/ConstantEmitter::tryEmitPrivateForVarInit to try ConstExprEmitter first, instead of falling back to ConstExprEmitter.
- Fix any bugs that come out of that.
- Change the constant evaluator to handle structs in C (D76169)
- Run some tests to check that we don't accidentally invoke the evaluator for C structs/arrays though some other codepath.
- Fix the constant evaluator to handle CK_ToUnion and DesignatedInitUpdateExpr, so mixing them with constructs not supported by ConstExprEmitter doesn't error out.
- Add more simple cases to ConstExprEmitter to address any bottlenecks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76096/new/

https://reviews.llvm.org/D76096



More information about the cfe-commits mailing list