[llvm-commits] [PATCH] YAML parser.

Michael Spencer bigcheesegs at gmail.com
Sat Feb 4 21:58:50 PST 2012


On 2012/02/02 10:19:21, klimek wrote:
> First run over the header. Generally, the tooling code doesn't change
too much
> with the new interface, so it's definitely in an "I can live with it"
state.

> One main high-level design question: Do we really need all the
implemented
YAML
> features?

We need the core data model and main syntax, tags, and aliases, yes. The
rest of
it, probably not, but they do interact. We haven't fully decided how we
plan to
use YAML in the linker yet.

> Some general design things we probably just have different opinions
about,
which
> I ran into while changing the tooling code to use the YAMLParser. Feel
free to
> strongly object :)
> - I find it easier as a user to have the public section as first
section in a
> class definition; you seem to have a mixture, which I find a little
confusing

Agreed. I'll fix this.

> - const iterators are missing

I'll add them.

> - I find references to higher-level scope objects as members in a
class to be
> bug-prone (you cannot see that the reference is kept from the call
site)

I agree, but I'm not sure how to fix this, as the link is needed.

> - I'm a strong opponent of abbrs; I'm not sure whether as a non-native
speaker
> it's just harder to read them or I'm just obsessive, but I'm a big fan
of
> spending the 0.01 seconds it takes to spell out Current or Position.

I'll fix this where it does not conflict with type names.

I'm sending this now without a new patch so that we can discuss some of
the
questions I had while I'm fixing some of the larger issues.

Thanks for the review!

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h
File include/llvm/Support/YAMLParser.h (right):

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode18
include/llvm/Support/YAMLParser.h:18: //    * BOMs anywhere other than
the first Unicode scalar value in the file.
On 2012/02/02 10:19:22, klimek wrote:
> It would be cool to have a pointer to the first class I have to look
at as a
> user (Stream). In general, I'm wondering whether we can reduce the
size of the
> visible interface here...

Added some user documentation to the top.

I would like to reduce the size. I'm just not sure how best to do it.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode44
include/llvm/Support/YAMLParser.h:44: UEF_UTF32_LE,
On 2012/02/02 10:19:22, klimek wrote:
> Perhaps document LE vs. BE...

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode64
include/llvm/Support/YAMLParser.h:64: struct StreamStartInfo {
On 2012/02/02 10:19:22, klimek wrote:
> Is this only used for the scanning? If so, would it make sense to
split up the
> header? Why's this a struct instead of a typedef? (do you intend to
extend
> this?) Some documentation might help. Same for the next 2 structs.

I do plan to extend a few of these, along with adding a few more (as I
currently overload ScalarInfo for things).

I'll try splitting up the header.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode77
include/llvm/Support/YAMLParser.h:77: struct Token : ilist_node<Token> {
On 2012/02/02 10:19:22, klimek wrote:
> Are Tokens meant to be changed after being created? If not, I'd vote
for a
> non-modifiable design.

They cannot be modified after being created.

By non-modifiable do you mean private members with const getters?

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode103
include/llvm/Support/YAMLParser.h:103: /// A string of length 0 or more
who's begin() points to the logical location
On 2012/02/02 10:19:22, klimek wrote:
> s/who's/whose/

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode108
include/llvm/Support/YAMLParser.h:108: ///       can't be in a union in
C++03.
On 2012/02/02 10:19:22, klimek wrote:
> Why not use inheritance instead?

That would make this a polymorphic class. This should just be a simple
value type.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode155
include/llvm/Support/YAMLParser.h:155: /// Simple keys are handled by
creating an entry in SimpleKeys for each Token
On 2012/02/02 10:19:22, klimek wrote:
> I don't find the concept of a "simple key" in the YAML spec - I think
it would
> be helpful to reference the standard.

They are not mentioned directly in the standard. However, the need for
them comes from grammar rule [151].

I guess I can explain that :P

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode161
include/llvm/Support/YAMLParser.h:161: /// others) before the
SimpleKey's Tok.
On 2012/02/02 10:19:22, klimek wrote:
> In general, this comment doesn't really help me understand the
concept. It
> basically explains some coupling, but that looks pretty random when
I'm
> contextually at this part of the file.

Agreed, I'll try moving this with the rest of the scanner over to
another header.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode162
include/llvm/Support/YAMLParser.h:162: struct SimpleKey {
On 2012/02/02 10:19:22, klimek wrote:
> At a first glance this looks like an internal structure used for
parsing - can
> we hide this?

It is. I would like to. Not sure how.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode191
include/llvm/Support/YAMLParser.h:191: if (Pos >= End)
On 2012/02/02 10:19:22, klimek wrote:
> Why can that happen?

There are some places where it errors at the end of a file. For example
it expected something that isn't there. So instead of having clients
correct this before calling, I correct it here.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode215
include/llvm/Support/YAMLParser.h:215: bool isAtEnd(StringRef::iterator
i = 0) {
On 2012/02/02 10:19:22, klimek wrote:
> Unused?

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode233
include/llvm/Support/YAMLParser.h:233: /// @brief Skip a single
nb-char[27] starting at Pos.
On 2012/02/02 10:19:22, klimek wrote:
> What does 'nb' stand for here? Same for the functions below, I have no
idea what
> those initializations mean.

> Also, here and elsewhere - are those bracketed numbers referring to
> documentation somewhere?

> And lastly, style: shouldn't methods be camelCase?

These are all names taken directly from the yaml spec. Which is also why
the style is different, to closely match the spec.

I have added documentation that explains what the names represent.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode262
include/llvm/Support/YAMLParser.h:262: /// @brief Skip minimal
well-formed code unit subsequence's until Func
On 2012/02/02 10:19:22, klimek wrote:
> s/'//?

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode270
include/llvm/Support/YAMLParser.h:270: StringRef::iterator i =
(this->*Func)(Pos);
On 2012/02/02 10:19:22, klimek wrote:
> Style: capital letter for variables?

Not required in this case.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode304
include/llvm/Support/YAMLParser.h:304: /// @brief Remove simple keys
that can no longer be valid simple keys.
On 2012/02/02 10:19:22, klimek wrote:
> If they can no longer be valid simple keys, perhaps call this
> removeStaleSimpleKeyCandidates?

Done, and also renamed the other functions with SimpleKey in them.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode319
include/llvm/Support/YAMLParser.h:319: bool rollIndent( int Col
On 2012/02/02 10:19:22, klimek wrote:
> Is the indent here intentional?

Yes.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode413
include/llvm/Support/YAMLParser.h:413: TokenQueueT TokenQueue;
On 2012/02/02 10:19:22, klimek wrote:
> If this only holds a limited number of tokens (as simple keys are
limited to
> 1024 tokens if I understand it correctly), would using a SmallVector
lead to the
> same performance with less code?

The TokenQueue is actually an ilist. This is needed because we must be
able to append and insert Tokens into the queue without invalidating the
iterators that SimpleKeys use to track which tokens they apply to.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode422
include/llvm/Support/YAMLParser.h:422: class document_iterator;
On 2012/02/02 10:19:22, klimek wrote:
>  From your naming I assume that using lower_case for a class is correct
style for
> iterators?

For std:: style code, which would include iterators, yes.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode467
include/llvm/Support/YAMLParser.h:467: Node(unsigned int Type, Document
*D, StringRef A);
On 2012/02/02 10:19:22, klimek wrote:
> It's really hard to guess what 'A' stands for here without reading the
context -
> can we call it Anchor?

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode477
include/llvm/Support/YAMLParser.h:477: // These functions forward to
Document and Scanner.
On 2012/02/02 10:19:22, klimek wrote:
> This makes the Node interface pretty large... Is that really
necessary?

I don't know how to simplify this.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode493
include/llvm/Support/YAMLParser.h:493: static inline bool classof(const
NullNode *) { return true; }
On 2012/02/02 10:19:22, klimek wrote:
> Does the 'inline' keyword actually change anything here?

Not sure, this is how the function is defined elsewhere in llvm.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode512
include/llvm/Support/YAMLParser.h:512: // utf8).
On 2012/02/02 10:19:22, klimek wrote:
> Why not whatever the original encoding was?

Because then the parser would be required to understand these other
encodings. The idea is to convert everything to UTF-8 before parsing.
This also simplifies client code.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode536
include/llvm/Support/YAMLParser.h:536: static inline bool classof(const
KeyValueNode *) { return true; }
On 2012/02/02 10:19:22, klimek wrote:
> Is there an underlying strategy how we order the type information vs.
the normal
> methods?

After a quick search, it tends to go last in the class. I'll change
these.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode541
include/llvm/Support/YAMLParser.h:541: /// @brief Parse and return the
key.
On 2012/02/02 10:19:22, klimek wrote:
> Document that calling this multiple times is ok?

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode541
include/llvm/Support/YAMLParser.h:541: /// @brief Parse and return the
key.
On 2012/02/02 10:19:22, klimek wrote:
> I'd document that calling this multiple times is ok...

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode586
include/llvm/Support/YAMLParser.h:586: if ((Base && Other.Base) &&
Base->CurrentEntry != Other.Base->CurrentEntry)
On 2012/02/02 10:19:22, klimek wrote:
> Just return...

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode613
include/llvm/Support/YAMLParser.h:613: Type MType;
On 2012/02/02 10:19:22, klimek wrote:
> MappingType? Or call the enum MappingType and this Type?

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode745
include/llvm/Support/YAMLParser.h:745: Stream &S;
On 2012/02/02 10:19:22, klimek wrote:
> /me cries. But yea, I know, I'm not in the majority with my strong
aversion of
> Os.

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode756
include/llvm/Support/YAMLParser.h:756: return S.S.peekNext();
On 2012/02/02 10:19:22, klimek wrote:
> And this is why: I can deduce locally that the first S is the Stream,
but I had
> to go back and read the Stream implementation to see that the second S
is the
> Scanner.

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode771
include/llvm/Support/YAMLParser.h:771: void handleTagDirective(const
Token &t) {
On 2012/02/02 10:19:22, klimek wrote:
> I assume this is a todo?

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode777
include/llvm/Support/YAMLParser.h:777: bool dir = false;
On 2012/02/02 10:19:22, klimek wrote:
> Style: CapVariables. Also, 'dir' sounds like 'directory'... (yea yea,
I know,
> I'm strange).

Done.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode809
include/llvm/Support/YAMLParser.h:809: Token &t = peekNext();
On 2012/02/02 10:19:22, klimek wrote:
> Style: CapVariables

Local's don't apply.

http://codereview.appspot.com/5604054/diff/1/include/llvm/Support/YAMLParser.h#newcode856
include/llvm/Support/YAMLParser.h:856: // In place destruct and new are
used here to avoid an unneeded
On 2012/02/02 10:19:22, klimek wrote:
> This looks a little over the top - I don't expect this to be
performance
> relevant at all...

And it's also totally wrong! :P. Fixed.

http://codereview.appspot.com/5604054/

- Michael Spencer




More information about the llvm-commits mailing list