[LLVMdev] [RFC] YAML I/O

Justin Holewinski justin.holewinski at gmail.com
Fri Jul 27 04:37:12 PDT 2012


I really like this!  +1 for inclusion in LLVMSupport instead of lld.

I have a project that could make use of this.  Right now, I 
am using YAMLParser directly; it's not difficult, but this would 
definitely make it easier.

On 07/25/2012 03:43 PM, Nick Kledzik wrote:
> I've been working on reading and writing yaml encoded documents for 
> the lld project.  Michael Spencer added the YAMLParser.h functionality 
> to llvm/Support to help in parsing yaml documents.  That parser 
> greatly helps at the syntax level, but you still need to hand write a 
> lot of semantic checking and then convert the various node types in to 
> something usable.
>
> I've developed a layer on top of YAMLParser.h I'm calling YAMLIO.h 
> (yaml I/O) which unifies parsing and writing yaml documents and 
> handles most semantic checking, and is very easy to use!  Basically, 
> you define your yaml document schema as a mix of C++ structs and 
> vectors, and YAMLIO does the rest.  Let's look at a quick example 
> first.  Suppose this is your yaml document:
>
> - name:          Tom
>   age:           20
> - name:          Richard
>   age:           27
>   speaks-french: true
> - name:          Harry
>   age:           23
>
> To read or write such yaml data you would define two C++ types: one for 
> the mapping (a struct) and one for the sequence of those mappings (a 
> typedef).  In the struct you add a yamlMapping() method which 
> associates mapping keys with field names and the field's type. (Note: 
> the yamlMapping() method was inspired by the boost serialize() method).
>
> using llvm::yaml::Sequence;
> using llvm::yaml::DocumentList;
> using llvm::yaml::IO;
> using llvm::yaml::Input;
> using llvm::yaml::Output;
> using llvm::yaml::YamlMap;
>
> struct Person : public YamlMap {
>   StringRef     name;
>   uint8_t       age;
>   bool          speaks_french;
>
>   void yamlMapping(IO &io) {
>     requiredKey(io, name,  "name");
>     requiredKey(io, age,   "age");
>     optionalKey(io, speaks_french, "speaks-french");
>   }
> };
>
> typedef Sequence<Person> PersonList;
> typedef DocumentList<PersonList>  PersonDocumentList;
>
> That's it.  The yamlMapping() method is processed by both Input 
> and Output to properly handle key-values in a yaml mapping.  The 
> Sequence and DocumentList templates are subclasses of std::vector<>.
>
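
To make the "one method, both directions" idea concrete, here is a minimal self-contained sketch.  It is not the YAMLIO implementation: ToyIO, ToyPerson, findKey() and the toy requiredKey()/optionalKey() helpers are hypothetical stand-ins that keep a single mapping as strings in a std::map, and treating an absent optional bool as false is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the IO class: one object that is either
// writing to, or reading from, a flat key/value mapping.
struct ToyIO {
  bool outputting;                            // true = writing, false = reading
  std::map<std::string, std::string> fields;  // stand-in for one yaml mapping
};

// Looks up a key while reading; throws if a required key is absent.
inline const std::string *findKey(const ToyIO &io, const char *key,
                                  bool required) {
  assert(key != nullptr);
  auto it = io.fields.find(key);
  if (it != io.fields.end())
    return &it->second;
  if (required)
    throw std::runtime_error(std::string("missing required key '") + key + "'");
  return nullptr;
}

inline void requiredKey(ToyIO &io, std::string &field, const char *key) {
  if (io.outputting)
    io.fields[key] = field;
  else
    field = *findKey(io, key, true);
}

inline void requiredKey(ToyIO &io, int &field, const char *key) {
  if (io.outputting)
    io.fields[key] = std::to_string(field);
  else
    field = std::stoi(*findKey(io, key, true));
}

// Assumption of this sketch: an absent optional bool reads back as false.
inline void optionalKey(ToyIO &io, bool &field, const char *key) {
  if (io.outputting) {
    io.fields[key] = field ? "true" : "false";
    return;
  }
  const std::string *text = findKey(io, key, false);
  field = (text != nullptr && *text == "true");
}

struct ToyPerson {
  std::string name;
  int age = 0;
  bool speaksFrench = false;

  // Written once, used for both reading and writing.
  void yamlMapping(ToyIO &io) {
    requiredKey(io, name, "name");
    requiredKey(io, age, "age");
    optionalKey(io, speaksFrench, "speaks-french");
  }
};
```

Calling yamlMapping() with outputting set to true fills the map from the struct; calling it on a fresh ToyPerson with outputting set to false fills the struct from the map, so the key list only has to be written once.
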
> The data structures are regular structs and vectors.  An example of 
> creating them:
>
> // build a person
> Person a;
> a.name = "Tom";
> a.age = 27;
> a.speaks_french = false;
> // build sequence of persons
> PersonList persons;
> persons.push_back(a);
>
> To write a yaml document your code looks like:
>
> void dump(PersonList &persons, raw_ostream &out) {
>   Output yout(out);
>   yout << persons;
> }
>
> To read a yaml document your code looks like:
>
> void readYaml(StringRef filePath) {
>   Input yin(filePath);
>   DocumentList<PersonList> docList;
>   yin >> docList;
>   // if there was an error parsing, message already printed out
>   if (yin.error())
>     return;
>   for (PersonList &pl : docList) {
>     for (Person &person : pl) {
>       // process each Person
>     }
>   }
> }
>
>
> YAMLIO also handles semantic error checking for you.  For instance, if 
> your document contained an illegal value for a key like:
>
> - name:          Richard
>   age:           27
>   speaks-french: oui
>
> You would get an error like:
>
> YAML:6:18: error: invalid boolean
>   speaks-french: oui
>                  ^~~~
>
> If the document has a key not in your schema like:
>
> - name:          Tom
>   pets:          true
>   age:           20
>
> You would get an error like:
>
> YAML:3:18: error: unknown key 'pets'
>   pets:          true
>   ^~~~
>
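
The two checks above can be sketched in a few lines.  This is only an illustration of the idea, not how YAMLIO does it: ToyMapping, parseStrictBool() and checkPersonMapping() are hypothetical names, the sketch assumes only the spellings "true" and "false" are legal booleans, and it reports bare messages without the line/column and caret information shown in the quoted diagnostics.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Stand-in for one parsed yaml mapping: key/value pairs in document order.
using ToyMapping = std::vector<std::pair<std::string, std::string>>;

// Strict boolean scalar: anything other than "true"/"false" is rejected.
inline bool parseStrictBool(const std::string &text, bool *out) {
  assert(out != nullptr);
  if (text == "true")  { *out = true;  return true; }
  if (text == "false") { *out = false; return true; }
  return false;
}

// Returns one message per problem found; an empty result means the
// mapping passed both checks.
inline std::vector<std::string> checkPersonMapping(const ToyMapping &mapping) {
  static const std::set<std::string> knownKeys = {"name", "age",
                                                  "speaks-french"};
  std::vector<std::string> errors;
  for (const auto &entry : mapping) {
    bool ignored = false;
    if (knownKeys.count(entry.first) == 0)
      errors.push_back("unknown key '" + entry.first + "'");
    else if (entry.first == "speaks-french" &&
             !parseStrictBool(entry.second, &ignored))
      errors.push_back("invalid boolean");
  }
  return errors;
}
```

The point of the design is that the schema (the set of known keys and each key's scalar type) is the only input the checker needs, which is why a library can do this work once for every client.
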
> As you see, the model of YAMLIO is that you define intermediate data 
> structures which define your yaml schema.  The job of YAMLIO is to 
> convert between those intermediate data structures and yaml documents.
> YAMLIO most likely won't be able to convert between your existing 
> native data structures and yaml.  You will probably need to define new 
> intermediate data structures (the schema) and then write code to 
> convert between your native data structures and the intermediate ones.
> But that glue code is super simple, mostly just copying fields and 
> iterating lists.  All the yaml-specific work (formatting and semantic 
> checking) is done by YAMLIO.
>
>
> In the example above the scalar types (strings, integers, booleans) 
> were all built-in types.  YAMLIO also has support for enumerations 
> and bit masks.  Here is an example of a simple enumeration (color) and 
> a bit mask set (flags).  Suppose your data structures already define 
> Colors and Flags:
>
> enum Colors {
>   cRed,
>   cBlue,
>   cGreen
> };
> #define FlagBig     1
> #define FlagLittle  2
> #define FlagRound   4
> #define FlagPointy  8
>
> And you want the yaml documents to use human-readable values for 
> colors and flags, rather than just the integer values used internally.
> To handle that, you define conversion tables and hand them to YAMLIO.
> For instance:
>
> using llvm::yaml::IO;
> using llvm::yaml::Input;
> using llvm::yaml::Output;
> using llvm::yaml::YamlMap;
> using llvm::yaml::UniqueValue;
> using llvm::yaml::BitValue;
>
> static const UniqueValue<Colors> colorConversions[] = {
>   {cRed,    "red"},
>   {cBlue,   "blue"},
>   {cGreen,  "green"},
>   {cRed,    NULL}  // default value for optional keys
> };
>
> static const BitValue<uint32_t> flagConversions[] = {
>   {FlagBig,     "big"},
>   {FlagLittle,  "little"},
>   {FlagRound,   "round"},
>   {FlagPointy,  "pointy"},
>   {0,           NULL}
> };
>
> struct Test : public YamlMap {
>   StringRef     name;
>   Colors        color;
>   uint32_t      flags;
>
>   void yamlMapping(IO &io) {
>     requiredKey(io, name,  "name");
>     optionalKey(io, color, "color", colorConversions);
>     requiredKey(io, flags, "flags", flagConversions);
>   }
> };
>
> The above defines a yaml mapping with three keys: name, color, and 
> flags.  When writing the color value out, the table colorConversions 
> is used to map the in-memory value to a string.  In this case, the 
> color field is marked as optional.  That means when reading the yaml 
> document, if there is no "color:" key, the struct's color field will 
> be filled in with the last value (the one with the NULL string 
> pointer) in the table, in this case the value red.
>
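
A rough sketch of how such a NULL-terminated table can serve reading, writing, and the optional-key default.  The names here (ToyColor, ToyUniqueValue, toyColorTable and the three helper functions) are hypothetical and only mirror the shape of the table in the quoted example; they are not the YAMLIO types.

```cpp
#include <cassert>
#include <cstring>

enum ToyColor { tRed, tBlue, tGreen };

// One row of the conversion table: an in-memory value and its yaml spelling.
struct ToyUniqueValue {
  ToyColor value;
  const char *name;
};

static const ToyUniqueValue toyColorTable[] = {
  {tRed,   "red"},
  {tBlue,  "blue"},
  {tGreen, "green"},
  {tRed,   nullptr}  // sentinel row; its value doubles as the default
};

// Reading: yaml spelling -> value.  Returns false for an unknown spelling.
inline bool colorFromName(const ToyUniqueValue *table, const char *name,
                          ToyColor &out) {
  assert(table != nullptr && name != nullptr);
  for (; table->name != nullptr; ++table) {
    if (std::strcmp(table->name, name) == 0) {
      out = table->value;
      return true;
    }
  }
  return false;
}

// Writing: value -> yaml spelling, or nullptr if the value is not listed.
inline const char *nameFromColor(const ToyUniqueValue *table, ToyColor value) {
  assert(table != nullptr);
  for (; table->name != nullptr; ++table)
    if (table->value == value)
      return table->name;
  return nullptr;
}

// Missing optional key: walk to the sentinel row and use its value.
inline ToyColor defaultColor(const ToyUniqueValue *table) {
  assert(table != nullptr);
  while (table->name != nullptr)
    ++table;
  return table->value;
}
```

Putting the default in the sentinel row keeps the table a plain static array with no separate length or default argument, at the cost of a linear scan per lookup.
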
> When writing the flags value out, the table flagConversions is used to 
> convert the bits in the flags field to a sequence of flag values.
>
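
The bit mask conversion can be sketched the same way.  Again the names (ToyBitValue, toyFlagTable, flagsToNames(), namesToFlags()) are hypothetical, and rejecting an unknown flag name on input is an assumption of the sketch rather than documented YAMLIO behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One row of the conversion table: a single bit and its yaml spelling.
struct ToyBitValue {
  std::uint32_t bit;
  const char *name;
};

static const ToyBitValue toyFlagTable[] = {
  {1, "big"},
  {2, "little"},
  {4, "round"},
  {8, "pointy"},
  {0, nullptr}  // sentinel row
};

// Writing: emit the name of every table bit that is set, in table order.
inline std::vector<std::string> flagsToNames(const ToyBitValue *table,
                                             std::uint32_t flags) {
  assert(table != nullptr);
  std::vector<std::string> names;
  for (; table->name != nullptr; ++table)
    if ((flags & table->bit) != 0)
      names.push_back(table->name);
  return names;
}

// Reading: OR together the bits for each name in the sequence.
// Returns false as soon as a name is not found in the table.
inline bool namesToFlags(const ToyBitValue *table,
                         const std::vector<std::string> &names,
                         std::uint32_t &flags) {
  assert(table != nullptr);
  flags = 0;
  for (const std::string &name : names) {
    const ToyBitValue *entry = table;
    while (entry->name != nullptr && name != entry->name)
      ++entry;
    if (entry->name == nullptr)
      return false;
    flags |= entry->bit;
  }
  return true;
}
```

With this table, a flags value of 10 (little | pointy) writes as the sequence [ little, pointy ], and reading [ little, round ] gives back 6.
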
> A valid yaml document for this schema is:
>
> - name:          Tom
>   color:         blue
>   flags:         [ big ]
> - name:          Richard
>   color:         red
>   flags:         [ little, pointy ]
> - name:          Harry
>   flags:         [ little, round ]
>
>
> My initial plan was to add YAMLIO to lld and let it mature there, but 
> I got a request to move this down into llvm for another llvm client to 
> use.  So, I thought I'd see what the llvm community thought of this support.
>
> To see a larger example, attached is a sample mach-o object file (for 
> hello world) encoded in yaml along with the YAMLIO based schema for 
> reading or writing those documents.
>
>
>
>
>
> -Nick

-- 
Thanks,

Justin Holewinski
