[PATCH] D98363: [Sema] Fold VLA types in compound literals to constant arrays.

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 16 05:27:11 PDT 2021


aaron.ballman accepted this revision.
aaron.ballman added a comment.
This revision is now accepted and ready to land.

LGTM, though I think we should consider changing the diagnostic text (for all of the VLA folding cases) as a follow-up.



================
Comment at: clang/test/Sema/vla.c:134
+  // expected-warning at +1{{variable length array folded to constant array as an extension}}
+  char (*a9)[] = (char[2][ksize]) {{1,2,3,4},{4,3,2,1}};
+
----------------
rsmith wrote:
> efriedma wrote:
> > aaron.ballman wrote:
> > > efriedma wrote:
> > > > aaron.ballman wrote:
> > > > > efriedma wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > Doesn't this violate the constraints in C17 6.5.2.5p1, "The type name shall specify a complete object type or an array of unknown size, but not a variable-length array type"?
> > > > > > Yes, this is a constraint violation.  This patch downgrades the error to a warning, for compatibility with older versions of clang.
> > > > > It was an error in older versions of Clang, so downgrading it to a warning can't make code *more* compatible with older versions. This also makes it harder to port C code to other compilers because Clang happily accepts code that the C standard says should be rejected.
> > > > > 
> > > > > I'm not strongly opposed to the change, but I also don't like ignoring constraint violations in the standard as that comes awfully close to what `-fpermissive` does (and at least with that awful flag you have to opt *into* the less strictly conforming behavior instead of getting it by default as with extensions).
> > > > > 
> > > > > I'm curious if @rsmith has thoughts here?
> > > > > It was an error in older versions of Clang
> > > > 
> > > > https://godbolt.org/z/rvbffY
> > > Oh, I hadn't realized this changed *that* recently! Is this breaking some significant amounts of code now that we err on it (or regressing performance by not folding the VLA)?
> > > 
> > > My point still stands about disliking when we ignore constraint violations. To expound a bit, in this particular case, I think the standard is being over-constraining because we reasonably *can* fold this to a more efficient form. Normally, I'd suggest filing a DR with WG14 over this and treating the constraint as UB we can extend. However, I think the "solution" that comes out of WG14 would (likely) be to make compound literal expressions of VLA types be undefined behavior rather than a constraint violation (because I don't see many other options in the standard wording for a more targeted fix to allow *just* this safe usage). Given the security concerns around misuse of VLAs already, I think that would be a worse world to live in even if it gets us what we want with this specific case in mind. This is where my caution is coming from. Knowing a bit more about the experience in the proprietary code bases would be helpful, if it's something you can share more details about.
> > The pattern I saw looked something like the following, at global scope, where it couldn't really be confused for a VLA:
> > 
> > ```
> > const int x = 2;
> > int *p = (int[x]){0};
> > ```
> > 
> > As for the frequency of it showing up, I think I got confused.  Looking again, there were multiple codebases with new -Wgnu-folding-constant warnings, but only one codebase with this specific construct.  So that makes the argument a bit weaker.
> > 
> > That said, we're doing this folding for variables already, so extending it to compound literals doesn't seem like a big deal.  And the warning is on by default.
> > Is this breaking some significant amounts of code now that we err on it (or regressing performance by not folding the VLA)?
> 
> We only ever constant-fold VLAs to constant-length arrays in cases where a VLA would be invalid, so there's no regressing-performance concern. (We used to do such folding, but it resulted in our miscompiling certain testcases, which is why Clang's behavior changed recently.) As a result, there's no risk of someone seeing a surprise VLA here, that should have been a constraint violation.
> 
> Given the above, I don't think that changing this from a constraint violation to UB in the C standard would make a difference. Either way, I think a strictly conforming program cannot contain such a construct, and a conforming implementation can, as an extension, define the meaning of programs containing such a construct.
> 
> It's common for us to have conforming extensions that permit us to accept more real-world code. Accepting with an on-by-default warning is often the best tradeoff between accepting real-world code and diagnosing non-portable constructs, especially in cases where past versions of GCC (<= 4.4) and Clang accepted the code as an extension. I'm inclined to think that's the best we can do here.
> 
> If we want to more strongly discourage use of this extension (and I don't have any particular preferences on whether we should do so), we could promote the warning to `DefaultError` (but leave the extension in, so that people can still build legacy codebases), but I think we should do that consistently across all the places we fold VLAs to constant arrays, not only for compound literals.
> Given the above, I don't think that changing this from a constraint violation to UB in the C standard would make a difference. Either way, I think a strictly conforming program cannot contain such a construct, and a conforming implementation can, as an extension, define the meaning of programs containing such a construct.

I think changing from a constraint violation to UB in the C standard would make a difference, but subtly. When something is a constraint violation, the implicit expectation is that implementations diagnose it because the committee felt it was a sufficiently bad construct to warrant forcing implementations to tell users not to use it. When something is UB, there is no such expectation -- some UB is explicitly for implementation extensions, others are because it would be impossible to diagnose, etc. So I agree that the outcome is the same, but the messaging to implementers (and other readers of the standard) is quite different.

> It's common for us to have conforming extensions that permit us to accept more real-world code. Accepting with an on-by-default warning is often the best tradeoff between accepting real-world code and diagnosing non-portable constructs, especially in cases where past versions of GCC (<= 4.4) and Clang accepted the code as an extension. I'm inclined to think that's the best we can do here.

Agreed, but "real-world code", to me, isn't "someone wrote it once" so much as "this is being relied upon and it's an important use case." For example, I am highly sympathetic to a standard library (or other popular system libraries) that relied on older compiler behavior (accidentally or purposefully), but less so for "regular user code" unless it's a popular idiom.

> If we want to more strongly discourage use of this extension (and I don't have any particular preferences on whether we should do so), we could promote the warning to DefaultError (but leave the extension in, so that people can still build legacy codebases), but I think we should do that consistently across all the places we fold VLAs to constant arrays, not only for compound literals.

100% agreed that anything we do here should be done consistently. I think I'd feel most comfortable if the diagnostic made it more obvious that this code is not valid C code, like `ISO C forbids use of a variable length array in this context; folding to a constant array as an extension`. I think it's reasonable to not promote it to `DefaultError` because I think each of these cases is not introducing some potentially unsafe usage so much as enabling questionable code to compile with reasonable results. (If I'm wrong about that, I'd be curious to know of it.)

As for this specific case of defining the behavior... if this was the first VLA constant folding extension, I'd argue it's not motivated well enough, but because we have plenty of other instances of doing the same thing, I'm fine adding the extension here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98363/new/

https://reviews.llvm.org/D98363



More information about the cfe-commits mailing list