One idea might be to have the entry contain two StringRefs: `str` and `quoted_str`. This way you never get access to the underlying quote char, just the full arg, either quoted or unquoted (although this would still be better done orthogonally to this patch).<br><div class="gmail_quote"><div dir="ltr">On Sat, Nov 19, 2016 at 5:48 AM Zachary Turner <<a href="mailto:zturner@google.com">zturner@google.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Assuming we do that, what interface do you think would be simpler? We still need easy access to both a StringRef and a c_str(), since StringRef::data() is not guaranteed to be null-terminated, so the entry thing is still nice. <br class="gmail_msg"><div class="gmail_quote gmail_msg"><div dir="ltr" class="gmail_msg">On Sat, Nov 19, 2016 at 5:44 AM Zachary Turner <<a href="mailto:zturner@google.com" class="gmail_msg" target="_blank">zturner@google.com</a>> wrote:<br class="gmail_msg"></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The quote char is only exposed as a means to not break existing code which depends on it (most of which, not surprisingly, is in the Args class itself).<br class="gmail_msg"><br class="gmail_msg">We could try to come up with a way to kill it, but that seems like a separate refactor (and perhaps quite difficult since different platforms have different rules).<br class="gmail_msg"><div class="gmail_quote gmail_msg"><div dir="ltr" class="gmail_msg">On Sat, Nov 19, 2016 at 5:23 AM Pavel Labath <<a href="mailto:labath@google.com" class="gmail_msg" target="_blank">labath@google.com</a>> wrote:<br class="gmail_msg"></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">labath added a comment.<br class="gmail_msg">
<br class="gmail_msg">
I don't know how deep you want this refactor to be, but there is one issue I would like us to consider, if only to decide it is out of scope for this change. I am talking about the `quote_char` thingy. The main problem for me is that I don't think it's possible to sanely define the meaning of that field. According to POSIX quoting rules (which our command line more or less follows), a single argument can be quoted in a great many ways, using various combinations of quote characters. For example, these are all valid ways to represent the argument `asdf` in a POSIX shell:<br class="gmail_msg">
<br class="gmail_msg">
asdf<br class="gmail_msg">
"asdf"<br class="gmail_msg">
'asdf'<br class="gmail_msg">
a"sd"f<br class="gmail_msg">
"as"df<br class="gmail_msg">
"as""df"<br class="gmail_msg">
"as"'df'<br class="gmail_msg">
"a"s'd'"f"<br class="gmail_msg">
... (you get my point)<br class="gmail_msg">
<br class="gmail_msg">
I don't think there is a self-consistent way to define what the `quote_char` field will be for each of these options. Moreover, I don't see why one would ever need to use that field. It can only encourage someone to try to "quote" the argument by doing `quote_char+value+quote_char`, which is absolutely wrong if you ever want that result to be machine parsable (*). For proper quoting I think we should just have a free-standing `std::string quote_for_posix_shell(llvm::StringRef)` function (and maybe `quote_for_windows_cmd`, and whatever other quoting schemes we need), and then the user can decide which one to use based on who is going to be consuming the result. Then we can just kill the `quote` field. The only thing is... I have no idea how much work that will be (but I am ready to chip in to make it happen).<br class="gmail_msg">
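<br class="gmail_msg">
To make the suggestion concrete, here is a minimal sketch of what such a `quote_for_posix_shell` helper could look like. It assumes the common strategy of wrapping the whole argument in single quotes and rewriting each embedded single quote. The body is only an illustration, not code from this patch or from LLDB, and `std::string_view` stands in for `llvm::StringRef` so that the snippet is self-contained.<br class="gmail_msg">

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: the name comes from the proposal above, the body is
// an assumed implementation. std::string_view is used in place of
// llvm::StringRef purely to avoid an LLVM dependency in this sketch.
std::string quote_for_posix_shell(std::string_view arg) {
  std::string out;
  out.reserve(arg.size() + 2);
  out.push_back('\''); // open the single-quoted section
  for (char c : arg) {
    if (c == '\'')
      out += "'\\''"; // close quote, emit an escaped quote, reopen quote
    else
      out.push_back(c); // everything else is literal inside single quotes
  }
  out.push_back('\''); // close the single-quoted section
  return out;
}
```

Under this scheme `asdf` becomes `'asdf'` and a lone `'` becomes `''\'''`, both of which a POSIX shell parses back to the original argument.<br class="gmail_msg">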
<br class="gmail_msg">
So, yeah, if we decide not to do that, then I think the interface looks great. Otherwise, I think we can design a slightly simpler (and more consistent) one.<br class="gmail_msg">
<br class="gmail_msg">
(*) Bonus question: Try to start an executable under lldb so that it enters `main()` with `argc=2` and `argv[1]="'"`, i.e., as if it had been started this way via bash:<br class="gmail_msg">
<br class="gmail_msg">
$ /bin/cat \'<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
<a href="https://reviews.llvm.org/D26883" rel="noreferrer" class="gmail_msg" target="_blank">https://reviews.llvm.org/D26883</a><br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
</blockquote></div></blockquote></div></blockquote></div>