[PATCH] D104975: Implement P1949

Aaron Ballman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 2 12:13:20 PDT 2022


aaron.ballman added a comment.

In D104975#3486313 <https://reviews.llvm.org/D104975#3486313>, @intractabilis wrote:

> Can you roll this back and don't support P1949?

No; P1949 was adopted for C++23, so this is effectively a nonstarter unless WG21 backs the paper out, and I'm not aware of any push within the committee to do that.

> For some inexplicable reason ∂, 𝜕 partial derivative symbols are now not supported. Neither as XID_Start nor as XID_Continue. Where is logic in that?

According to the Unicode Consortium, both of those are mathematical symbols, not identifier characters. WG21 and WG14 now both defer to the Unicode Consortium on what constitutes a valid identifier character and only deviate to allow `_` for historical reasons.
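As a rough way to check this without Clang, here is a minimal Python sketch. It assumes that Python's `str.isidentifier()`, which is based on XID_Start and XID_Continue (plus `_`), is a reasonable stand-in for the Unicode properties P1949 relies on; it is not how Clang implements the check, and `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(ch: str) -> tuple[str, bool, bool]:
    """Return (general category, ok as identifier start, ok as identifier continue).

    Assumption: str.isidentifier() is used as a proxy for XID_Start/XID_Continue.
    """
    starts = ch.isidentifier()            # single character used as a whole identifier
    continues = ("a" + ch).isidentifier()  # character used after a valid start
    return (unicodedata.category(ch), starts, continues)


# U+2202 and U+1D715 are the two partial derivative symbols from the report;
# U+03B4 (Greek small delta) and "_" are included for contrast.
for ch in ("\u2202", "\U0001D715", "\u03b4", "_"):
    category, starts, continues = describe(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"category={category}, start={starts}, continue={continues}")
```

Both partial derivative symbols should come back with general category `Sm` (math symbol) and be rejected in either position, while an ordinary letter such as `δ` is accepted.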

It's worth noting that Clang has never accepted `∂` as a valid identifier even though we did accept `𝜕`: https://godbolt.org/z/bTKoKjjdc

> It has no sense, it's not some crazy emoji symbols. Moreover, it breaks the old code.

The breakage is expected and intentional. That's unfortunate for folks in your situation, and I'm sorry to hear you're caught out by this.

Clang could support a more relaxed mode via a feature flag to opt into allowing non-conforming identifiers in C++23. However, I think we should be cautious about adding feature flags that deviate from the standard here until the standards bodies have had the opportunity to weigh in on the topic, unless there's a significant amount of code breaking in the wild. Thus far, I'm not aware of any system headers or other major third-party libraries that are impacted (do you know of widely deployed libraries that are caught by this?). We could also elect not to implement this paper in older language modes, but as with the feature flag, I think we want to be cautious about that. The issue with either approach is that previous support for Unicode characters led to mysterious behaviors that can be dangerous (see recent discussions about trojan source as an example).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104975/new/

https://reviews.llvm.org/D104975


