[PATCH] D100346: [Clang] String Literal and Wide String Literal Encoding from the Preprocessor

ThePhD via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Apr 13 08:57:52 PDT 2021


ThePhD marked 4 inline comments as done.
ThePhD added a comment.

In D100346#2685530 <https://reviews.llvm.org/D100346#2685530>, @aaron.ballman wrote:

> ...
>
> What about for folks using this from C where there isn't `constexpr` functionality to help them?

The unfortunate bit is that C can't guarantee compile-time culling of branches, even if all values present are compile-time constants. Implementations like Clang and GCC have optimizers powerful enough to recognize calls to `strcmp` and `memcmp`, turn them into builtins, and constant-fold the result, which can then enable dead code elimination. Otherwise, C can't **guarantee** elision of the code branches unless a macro provides an integer constant expression to test. This is more or less a consequence of how weak constant expressions are in C, which leaves most people relying on implementation-specific constant folding to do anything worthwhile with their toolset.
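
A minimal sketch of what that looks like from the C side, assuming the string macro from this patch is spelled `__clang_literal_encoding__` (with a hard-coded fallback so the snippet builds elsewhere). The `EXAMPLE_`/`example_` names are made up for illustration:

```c
#include <string.h>

/* Assumption: the patch exposes the narrow literal encoding as a string
   macro named __clang_literal_encoding__. The fallback value is only here
   so the sketch compiles with other compilers. */
#ifdef __clang_literal_encoding__
#define EXAMPLE_LITERAL_ENCODING __clang_literal_encoding__
#else
#define EXAMPLE_LITERAL_ENCODING "UTF-8"
#endif

/* Returns 1 when the encoding name is exactly "UTF-8", 0 otherwise.
   An optimizer may fold this call to a constant, but C does not require it,
   and the result cannot be used in #if or _Static_assert. */
static int example_literals_are_utf8(void) {
    return strcmp(EXAMPLE_LITERAL_ENCODING, "UTF-8") == 0;
}

/* Both branches must compile; whether the dead one is removed depends
   entirely on the optimizer. */
static const char *example_greeting(void) {
    if (example_literals_are_utf8()) {
        return "caf\xC3\xA9"; /* UTF-8 bytes for U+00E9 */
    }
    return "cafe";
}
```

Note that the comparison here is an exact `strcmp`, so a differently spelled name such as "utf8" would take the fallback branch.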

I think in the future, if we are really invested in this path, we should come up with a canonical mapping and a specific way of saying "if this is a recognized encoding name it has an Integer Constant Expression of value `X` as defined by table `Y` in the documentation". We could provide macros `__clang_literal_encoding_id__` and `__clang_wide_literal_encoding_id__` that expand to the `X` integer constant expression value. But I think that should be a follow-on patch that evaluates the totality of encodings, and also maybe contacts some of the IBM folks who did the `-fexec-charset` patches so they can contribute any additional encoding mappings they want.
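
To make the idea concrete, here is a rough sketch of how such id macros could be consumed from C. Nothing below exists in Clang today: the id values, the table, and every `EXAMPLE_` name are hypothetical stand-ins for whatever a follow-on patch would define.

```c
/* Hypothetical id table ("table Y"); the real values would be documented. */
#define EXAMPLE_ENCODING_ID_UNKNOWN 0
#define EXAMPLE_ENCODING_ID_UTF8    1
#define EXAMPLE_ENCODING_ID_UTF16   2
#define EXAMPLE_ENCODING_ID_UTF32   3

/* Stand-ins for the proposed compiler-provided id macros. */
#define EXAMPLE_LITERAL_ENCODING_ID      EXAMPLE_ENCODING_ID_UTF8
#define EXAMPLE_WIDE_LITERAL_ENCODING_ID EXAMPLE_ENCODING_ID_UTF32

/* Integer constants work in #if, so the unused branch is never compiled. */
#if EXAMPLE_LITERAL_ENCODING_ID == EXAMPLE_ENCODING_ID_UTF8
static const char *const example_sample = "caf\xC3\xA9";
#else
static const char *const example_sample = "cafe";
#endif

/* They are also usable in _Static_assert (C11)... */
_Static_assert(EXAMPLE_WIDE_LITERAL_ENCODING_ID != EXAMPLE_ENCODING_ID_UNKNOWN,
               "wide literal encoding must be a recognized one");

/* ...and as case labels. */
static const char *example_encoding_name(int id) {
    switch (id) {
    case EXAMPLE_ENCODING_ID_UTF8:  return "UTF-8";
    case EXAMPLE_ENCODING_ID_UTF16: return "UTF-16";
    case EXAMPLE_ENCODING_ID_UTF32: return "UTF-32";
    default:                        return "unknown";
    }
}
```

The point of the integer form is that every use above is a guaranteed compile-time decision, with no reliance on the optimizer.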

Because Clang is open source, anyone could add to that mapping, and that way people could have that kind of ability in C. iconv has a very full list of encodings, and you'd also need to define a tolerant name-equality function similar to what's implemented in soasis/text (https://github.com/soasis/text/blob/main/include/ztd/text/detail/encoding_name.hpp#L120) or in P1885 (http://wg21.link/p1885) so you can compare names in a consistent manner across platforms. Once a name matches, you yield its integer value and go from there. It doesn't have to be standard, just compiler-specific.
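
A simplified sketch of that compare-then-yield step follows. It assumes the comparison ignores ASCII case and any non-alphanumeric separators; it is not the soasis/text or P1885 algorithm verbatim (for instance, it does nothing special with leading zeros), and the table contents and `example_` names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Advance past characters that are not letters or digits ('-', '_', ' '). */
static const char *example_skip_separators(const char *s) {
    while (*s != '\0' && !isalnum((unsigned char)*s)) {
        ++s;
    }
    return s;
}

/* Returns 1 if the two names match, ignoring case and separators. */
static int example_encoding_names_equal(const char *a, const char *b) {
    for (;;) {
        a = example_skip_separators(a);
        b = example_skip_separators(b);
        if (*a == '\0' || *b == '\0') {
            return *a == *b; /* equal only if both names ended together */
        }
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        ++a;
        ++b;
    }
}

/* Hypothetical name-to-id table; 0 is reserved for "not recognized". */
struct example_encoding_entry {
    const char *name;
    int id;
};

static const struct example_encoding_entry example_table[] = {
    {"UTF-8", 1}, {"UTF-16", 2}, {"UTF-32", 3}, {"ISO-8859-1", 4},
};

/* Yields the integer id for a recognized name, or 0 otherwise. */
static int example_encoding_id(const char *name) {
    size_t i;
    for (i = 0; i < sizeof example_table / sizeof example_table[0]; ++i) {
        if (example_encoding_names_equal(name, example_table[i].name)) {
            return example_table[i].id;
        }
    }
    return 0;
}
```

With this sketch, "utf8", "UTF_8", and "Utf-8" all map to the same id. Inside the compiler, a lookup like this would run once while defining the id macro, so user code would only ever see the integer.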

I'm not sure all of that belongs in this patch, though, and I think I'd wait for the other patches about `iconv` literal converters to drop before having the fullness of that conversation.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100346/new/

https://reviews.llvm.org/D100346


