[PATCH] D106598: [llvm-rc] Allow dashes as part of resource name strings

Martin Storsjö via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 23 09:44:33 PDT 2021


mstorsjo added a comment.

In D106598#2900614 <https://reviews.llvm.org/D106598#2900614>, @amccarth wrote:

> Cool.  I'd never seen anyone use strings that don't look like a C or C++ identifier as a resource identifier, but it should work given that FindResource just uses the string as an opaque key.

Yep. This seems to be used by some project in the form `ice40/chipdb-384.bin RCDATA "path/to/ice40/chipdb-384.bin"` (see https://github.com/msys2/MINGW-packages/issues/9180). (Also some projects include filenames without quotes, e.g. `40 RCDATA unquoted-filename-with-dashes.ico` as one identifier.)

> If we maximally munch everywhere, we could consume an expression as a single identifier.  For example:
>
>   #define WIDTH 400
>   #define HEIGHT 300
>   #define MARGIN 8
>   
>   MyDlg DIALOGBOXEX 10, 10, WIDTH, HEIGHT
>   BEGIN
>     LTEXT "Text in dialog", 4, 4, WIDTH-MARGIN, HEIGHT-MARGIN
>   END
>
> In this context, `WIDTH-MARGIN` better be recognized as three tokens representing an expression rather than as a single (undefined) identifier.  I know there's code out there like that.

Yes, but in this case, `WIDTH-MARGIN` would be expanded into `400-8` by the preprocessor, so llvm-rc's own parser would never see that expression in that form, only in the numeric form.

So if it's only possible at the preprocessor level to have expressions with named constants, but not on the RC level itself, then I think this change should be safe.

> Also, we'd better not allow these types of identifiers to be defined, i.e., don't allow `#define FOO-BAR 42`.
>
> If you add those examples as test cases and they do the right thing, I'd be happy with this change.

Unfortunately, the main llvm-rc tests are on output from the preprocessor, as clang isn't available as a dependency for anything under the llvm repo. The converse is possible, we can have a test under clang that invokes llvm-rc including preprocessing. We have some very small tests of that, invoking llvm-rc just to see that the preprocessing works though. I'm a bit unsure whether we should add more llvm-rc specific testcases in clang, as the preprocessor output in this case simply is `400-8`.

> My guess is that RC.EXE is implemented as a recursive descent parser that doesn't really have a uniform concept of "tokens."  When it's expecting a "numeric or string identifier," it applies one set of munching rules, but when it's expecting an expression, it applies a different set.

Yup, I get that feeling too.

> I haven't looked closely enough to see whether llvm-rc has that kind of flexibility at tokenization.

Not sure; llvm-rc first tokenizes the whole input, and then reads the feed of tokens. There's some different functions that expect different kinds of tokens depending on the context though, but e.g. for this case, I much prefer a single Identifier token to include all of this, instead of having a higher level logic that might try to fuse multiple tokens under some specific circumstances.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106598/new/

https://reviews.llvm.org/D106598



More information about the llvm-commits mailing list