[llvm-dev] RFC: Add a preprocessor to yaml2obj (and other YAML tools)

George Rimar via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 4 01:20:07 PST 2020


The idea itself is indeed good.


Regarding escaping: I think we should have it.

Consider the following example (I took it from D73828).


--- !ELF
FileHeader:
  Class:   ELFCLASS[[BITS]]
  Data:    ELFDATA2LSB
  Type:    ET_EXEC
  Machine: EM_386


# RUN: yaml2obj %s --docnum=4 -D BITS=32 -o %t-32bit.o
# RUN: yaml2obj %s --docnum=4 -D BITS=64 -o %t-64bit.o

Without escaping it would be:
Class:   ELFCLASSBITS


Which does not look as clear as the version with escaping, IMO.


Best regards,
George | Developer | Access Softek, Inc
________________________________
From: James Henderson <jh7370.2008 at my.bristol.ac.uk>
Sent: February 4, 2020, 12:09
To: Fangrui Song
Cc: llvm-dev; George Rimar
Subject: Re: RFC: Add a preprocessor to yaml2obj (and other YAML tools)

As someone who suggested this kind of functionality in an earlier review, I think this will certainly be useful. The syntax makes sense to me, provided we allow unrecognised macros to just be treated as part of the input string. This means we don't have to worry about complexities like escaping "[[" etc. In the (unlikely) event that somebody's YAML needs to include the literal "[[FOO]]", they simply should not also use -DFOO; they can use a different name instead, e.g. -DBAR.
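
To make the "unrecognised macros are left alone" behaviour concrete, here is a minimal Python sketch. It is not the code from D73821: the helper name expand_macros, the regular expression, and the restriction of macro names to identifier characters are all assumptions made for illustration only.

```python
import re

# Hypothetical helper, not the D73821 implementation.
# Assumption: macro names consist of identifier characters only.
MACRO = re.compile(r"\[\[([A-Za-z_][A-Za-z0-9_]*)\]\]")

def expand_macros(text, defs):
    """Replace each [[NAME]] with defs[NAME]; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        # Unrecognised macro: return the original text, brackets included.
        return defs.get(name, match.group(0))
    return MACRO.sub(substitute, text)

doc = "Class: ELFCLASS[[BITS]]\nName: '[[FOO]]'\n"
print(expand_macros(doc, {"BITS": "64"}), end="")
```

With only BITS defined, [[BITS]] is substituted while the literal '[[FOO]]' passes through unchanged, so no escaping mechanism is needed as long as the test does not also define FOO.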

On Mon, 3 Feb 2020 at 22:04, Fangrui Song <maskray at google.com> wrote:
I am adding -D k=v to yaml2obj, similar to clang -D. This makes it easy
to generate {32-bit,64-bit} x {big-endian,little-endian} tests.

   --- !ELF
   FileHeader:
     Class:   ELFCLASS[[BITS]]
     Data:    ELFDATA2[[ENCODE]]
     Type:    ET_DYN
     Machine: EM_X86_64

# RUN: yaml2obj -D BITS=32 -D ENCODE=LSB %s -o %t.32le
# RUN: yaml2obj -D BITS=32 -D ENCODE=MSB %s -o %t.32be
# RUN: yaml2obj -D BITS=64 -D ENCODE=LSB %s -o %t.64le
# RUN: yaml2obj -D BITS=64 -D ENCODE=MSB %s -o %t.64be

See https://reviews.llvm.org/D73828 for examples of how -D simplifies tests.

Do people think it may be useful in other YAML tools? If yes, I'll move
the yaml2obj implementation (https://reviews.llvm.org/D73821 ) to
include/llvm/Support/YAMLTraits.h llvm::yaml::Input so that other YAML
tools can use the feature.

Do people prefer a different syntax? I think [[PATTERN]] is nice because
it is what FileCheck -DFILE=... uses:

   # CHECK: ... [[FILE]]

   FileCheck only preprocesses patterns in CHECK lines.
   D73821 preprocesses both comment lines (which include CHECK lines) and non-comment lines (which include YAML).
   It is not a problem that the YAML preprocessor also processes CHECK lines, because tokens on a comment line will be ignored.

If -D UNDEF= is not specified, should [[UNDEF]] in the source be considered an error?
I think it is fine not to treat it as an error because there can be
legitimate use cases of unterminated [[, for example, [[ in a string literal.
YAML parsing is complex. I don't expect the preprocessor to be smart
enough to recognize string literals. (llvm/lib/Support/YAMLParser.cpp does not seem to provide raw strings of
spaces and comments. Hooking a preprocessor into the scanner does not seem to be simple.)

Do people know other preprocessing features which may be useful?