[llvm-commits] [PATCH] YAML I/O

Wed Aug 8 12:46:18 PDT 2012

> But EnumValue is not quite right because it can be used with #defines too.

Do we really want to encourage people to use #defines? Is there any
set of constants in the LLVM tree which are defined with #defines and
not in an enum?

> I'm not sure what you mean by traits-based in this context.

In a traits-based design, a class template carries the type-specific
information, and you supply that information by specializing the
template for a particular type. For example, see
include/llvm/ADT/GraphTraits.h, which uses GraphTraits<T> to specify
how to adapt T to a common interface that graph algorithms can use.
This is noninvasive (needing a friend declaration at most). Your
current approach, using inheritance and virtual functions, is invasive:
it forces the serializable class to inherit (causing multiple
inheritance if the class already has a base) and forces it to gain
virtual functions.
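
To make that concrete, here is a minimal sketch of the traits pattern
applied to serialization. Every name in it (SerializationTraits,
KeyValueWriter, Point, serialize) is hypothetical and chosen only for
illustration; none of it is the API in your patch, and the writer is a
toy stand-in rather than a real YAML emitter.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>

// All names below are hypothetical, for illustration only.

// Primary template: declared but never defined, so trying to serialize
// a type that has no specialization fails at compile time.
template <typename T> struct SerializationTraits;

// An existing class we do not want to touch: no base class, no virtuals.
struct Point {
  int x;
  int y;
};

// Toy output sink that writes "key: value" lines (not a real YAML writer).
class KeyValueWriter {
public:
  void field(const std::string &key, int value) {
    out_ << key << ": " << value << "\n";
  }
  std::string str() const { return out_.str(); }

private:
  std::ostringstream out_;
};

// Adapt Point from the outside by specializing the traits class.
template <> struct SerializationTraits<Point> {
  static void mapping(KeyValueWriter &w, const Point &p) {
    w.field("x", p.x);
    w.field("y", p.y);
  }
};

// Generic code talks only to the traits, never to a common base class.
template <typename T> std::string serialize(const T &value) {
  KeyValueWriter w;
  SerializationTraits<T>::mapping(w, value);
  return w.str();
}

// Point did not have to become polymorphic to be serializable.
static_assert(!std::is_polymorphic<Point>::value,
              "Point has no virtual functions");

// Small usage check.
inline void checkPointExample() {
  assert(serialize(Point{3, 4}) == "x: 3\ny: 4\n");
}
```

The point of the sketch is that Point is left untouched: the mapping
lives in the specialization, so a class that already has a base never
needs a second one.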

Overall, I think a traits-based design would be simpler and more
loosely coupled, and it seems to fit the use case more naturally.

--Sean Silva

On Tue, Aug 7, 2012 at 4:57 PM, Nick Kledzik <kledzik at apple.com> wrote:
> On Aug 7, 2012, at 2:07 PM, Sean Silva wrote:
>> Thanks for writing awesome docs!
>>
>> +Sometime sequences are known to be short and the one entry per line is too
>> +verbose, so YAML offers an alternate syntax for sequences called a "Flow
>> +Sequence" in which you put comma separated sequence elements into square
>> +brackets.  The above example could then be simplified to :
>>
>> It's probably worth mentioning here that the "Flow" syntax is
>> (exactly?) JSON. Also, the fact that JSON is in general a proper
>> subset of YAML is probably worth mentioning.
>>
>> +   .. code-block:: none
>>
>> pygments (and hence Sphinx) supports `yaml` highlighting
>> <http://pygments.org/docs/lexers/>
>>
>> +the following document:
>> +
>> +   .. code-block:: none
>>
>> The precedent for code listings is generally that the `..
>> code-block::` is at the same level of indentation as the paragraph
>> introducing it.
>>
>> +You can combine mappings and squences by indenting.  For example a sequence
>> +of mappings in which one of the mapping values is itself a sequence:
>>
>> s/squences/sequences/
>>
>> +of a new document is denoted with "---".  So in order for Input to handle
>> +multiple documents, it operators on an llvm::yaml::Document<>.
>>
>> s/operators/operates/
>>
>> +can set values in the context in the outer map's yamlMapping() method and
>> +retrive those values in the inner map's yamlMapping() method.
>>
>> s/retrive/retrieve/
>>
>> +of a new document is denoted with "---".  So in order for Input to handle
>>
>> For clarity, I would put the --- in monospace (e.g. "``---``"), here
>> and in other places.
> Thanks for the Sphinx tips.  I've incorporated them and ran a spell checker too ;-)
>
>
>>
>> +UniqueValue
>> +-----------
>>
>> I think that EnumValue would be more self-documenting than UniqueValue.
> I'm happy to give UniqueValue a better name.  But EnumValue is not quite right because it can be used with #defines too.  The real constraint is that there be a one-to-one mapping of strings to values.    I want it to contrast with BitValue which maps a set (sequence) of strings to a set of values OR'ed together.
>
>
>
>> At a design level, what are the pros/cons of this approach compared
>> with a traits-based approach? What made you choose this design versus
>> a traits-based approach?
>
> I'm not sure what you mean by traits-based in this context.  The back story is that for lld I've been writing code to read and write yaml documents.  Michael's YAMLParser.h certainly makes reading more robust, but there are still tons of (semantic level) error checks you have to hand code.  It seemed like most of my code was checking for errors.  Also it was a pain to keep the yaml reading code in sync with the yaml writing code.
>
> What we really needed was a way to describe the schema of the yaml documents and have some tool generate the code to read and write them.  There is a tool called Kwalify which defines a way to express a yaml schema and can check it.  But it has a number of limitations.
>
> Last month I wrote up a proposal for defining a yaml schema language and a tool that would use that schema to generate C++ code to read/validate and write yaml conforming to the schema.  The best feedback I got (from Daniel Dunbar) was that, rather than creating another language (a yaml schema language) and tools, I should try to see if the schema could be expressed in C++ directly, using meta-programming or whatever.  I looked at Boost serialization for inspiration and came up with this Yaml I/O library.
>
> -Nick
>
>
>>
>> On Mon, Aug 6, 2012 at 12:17 PM, Nick Kledzik <kledzik at apple.com> wrote:
>>> Attached is a patch for review which implements the Yaml I/O library I proposed on llvm-dev July 25th.
>>>
>>> The patch includes the implementation, test cases, and documentation.
>>>
>>> I've included a PDF of the documentation, so you don't have to install the patch and run sphinx to read it.
>>>
>>>
>>>
>>> There are probably more aspects of yaml we can support in YAML I/O, but the current patch is enough to support my needs for encoding mach-o as yaml for lld test cases.
>>>
>>> I was initially planning on just adding this code to lld, but I've had two requests to push it down into llvm.
>>>
>>> Again, here are examples of the mach-o schema and an example mach-o document:
>>>
>>>
>>>
>>>
>>>
>>>
>>> -Nick
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>