[llvm-dev] RFC: General purpose type-safe formatting library

Zachary Turner via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 7 09:05:01 PST 2016


Wanted to ping again on this now that the LLVM Developer Conference is over
and people are (presumably) back to normal.

chandlerc@: Are your concerns sufficiently addressed?
Anyone: Does anyone feel strongly that the proposed syntax is or is not the
way to go, based on the arguments outlined previously?

On Wed, Nov 2, 2016 at 3:54 PM Zachary Turner <zturner at google.com> wrote:

> * UDL Syntax is removed in the latest version of the patch
> <https://reviews.llvm.org/D25587>.
> * Name changed to `formatv` since `format_string` is too much to type.
> * Added conversion operators for `std::string` and `llvm::SmallString`.
>
> I had some feedback offline (not on this thread, unfortunately) that it
> might be worth using a printf style syntax instead of this Python-esque
> syntax.  FTR, I actually somewhat object to this, for a couple of reasons:
>
> 1) It makes back-reference syntax ugly. "{0} {1} {0}" is much clearer to
> me than "%1$s %2$s %1$s". The latter syntax is also not a very well known
> feature of printf and so unlikely to be used by people with a printf-style
> implementation, whereas it's un-missable with the Python-style syntax.
>
> 2) I don't see why we should need to specify the type of the argument with
> %d if the compiler knows it's an integer.  Even if we can add
> compile-time checking to make a mismatch an error, it seems unnecessary to
> even encounter this situation in the first place.  I believe the compiler
> should simply format what you give it.
>
> 3) One of the most useful aspects of the current approach is the ability
> to plug in custom formatters for application specific data types.  This is
> not straightforward with a printf-style syntax.
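
To make points 1) and 2) concrete, here is a minimal sketch in plain standard C++. It is not the code from D25587: `tiny_formatv` and `render` are hypothetical names, and the sketch only understands bare "{N}" replacements. It shows one argument being referenced twice and the argument types being picked up by the compiler rather than spelled out in the format string.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLVM): render one argument with whatever
// operator<< overload the compiler selects for its static type.
template <typename T> std::string render(const T &Value) {
  std::ostringstream OS;
  OS << Value;
  return OS.str();
}

// Hypothetical stand-in for a Python-style formatter. Only "{N}" with a
// decimal index N is recognized; all other text is copied through unchanged.
template <typename... Ts>
std::string tiny_formatv(const std::string &Fmt, const Ts &...Args) {
  // Render every argument once, so an index can be referenced repeatedly.
  const std::vector<std::string> Rendered = {render(Args)...};
  std::string Out;
  for (std::size_t I = 0; I < Fmt.size(); ++I) {
    if (Fmt[I] != '{') {
      Out += Fmt[I];
      continue;
    }
    const std::size_t Close = Fmt.find('}', I);
    if (Close == std::string::npos) {
      Out += Fmt.substr(I); // Unterminated brace: copy the rest verbatim.
      break;
    }
    // std::stoul throws std::invalid_argument for a non-numeric index, and
    // at() throws std::out_of_range for an index with no matching argument.
    const std::size_t Index = std::stoul(Fmt.substr(I + 1, Close - I - 1));
    Out += Rendered.at(Index);
    I = Close;
  }
  return Out;
}
```

With this sketch, `tiny_formatv("{0} {1} {0}", 1, "two")` produces "1 two 1" without any type specifier appearing in the format string.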
>
> You might be able to hook up a template-specialization-like mechanism to
> the processing of %s (similar to my current approach), but it's not obvious
> how you proceed from there to get custom format strings for individual
> types.  For example, a formatter which can print a TimeSpan in different
> units depending on style options you pass in.  This is especially useful
> when trying to print ranges where you often want to be able to specify a
> different separator, or control the formatting of the underlying type
> (e.g. it's not clear how you would elegantly format a range of integers in
> hex using this style of approach).
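
A rough sketch of what per-type style options could look like, again in plain standard C++. Every name here (`style_provider`, `format_one`, the `TimeSpan` struct, and the "ms" / "s" / "x" style strings) is an assumption made for illustration and is not taken from the patch. The idea is that a specializable template receives the style string, so each type decides what its own styles mean, and a range can forward the style to its elements.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical customization point: specialize it for a type to decide how a
// style string is interpreted. The default ignores the style.
template <typename T> struct style_provider {
  static std::string format(const T &Value, const std::string &) {
    std::ostringstream OS;
    OS << Value;
    return OS.str();
  }
};

// Illustrative type; the thread mentions the idea, not this definition.
struct TimeSpan {
  long long Micros;
};

// A TimeSpan picks its unit from the style string.
template <> struct style_provider<TimeSpan> {
  static std::string format(const TimeSpan &T, const std::string &Style) {
    if (Style == "ms")
      return std::to_string(T.Micros / 1000) + "ms";
    if (Style == "s")
      return std::to_string(T.Micros / 1000000) + "s";
    return std::to_string(T.Micros) + "us";
  }
};

// An int understands "x" as "print in hex with a 0x prefix".
template <> struct style_provider<int> {
  static std::string format(const int &V, const std::string &Style) {
    std::ostringstream OS;
    if (Style == "x")
      OS << "0x" << std::hex;
    OS << V;
    return OS.str();
  }
};

// A range forwards the style to each element and joins with ", ".
template <typename T> struct style_provider<std::vector<T>> {
  static std::string format(const std::vector<T> &R,
                            const std::string &Style) {
    std::string Out;
    for (std::size_t I = 0; I < R.size(); ++I) {
      if (I != 0)
        Out += ", ";
      Out += style_provider<T>::format(R[I], Style);
    }
    return Out;
  }
};

// Entry point a formatter could call once it has split off the style text.
template <typename T>
std::string format_one(const T &Value, const std::string &Style = "") {
  return style_provider<T>::format(Value, Style);
}
```

Here `format_one(std::vector<int>{10, 255}, "x")` gives "0xa, 0xff" and `format_one(TimeSpan{2500000}, "ms")` gives "2500ms". A configurable separator would need a richer style grammar than this sketch has.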
>
> I'm open to feedback here, so if you have an opinion one way or the other,
> please LMK.
>
>
> On Tue, Nov 1, 2016 at 5:39 AM Zachary Turner <zturner at google.com> wrote:
>
> The big problem I see is that to get compile time checking without the UDL
> we're going to have to do something like FORMAT_STRING("{0}") where this is
> a macro. It just seems really gross. It is true that it is harder to find
> the documentation, but that could be alleviated by putting all of this in
> its own namespace like llvm::formatv, then one could search the namespace.
>
> On Mon, Oct 31, 2016 at 11:41 PM Sean Silva <chisophugis at gmail.com> wrote:
>
> On Mon, Oct 31, 2016 at 5:21 PM, Chandler Carruth via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> On Mon, Oct 31, 2016 at 3:46 PM Zachary Turner via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi all,
>
> Tentatively final version is up here: https://reviews.llvm.org/D25587
>
> It has a verbal LGTM, but I plan to wait a bit longer just in case anyone
> has some additional thoughts.  It's a large patch, but if you're
> interested, one way you can help without doing a full-blown review is to
> look at the large comment blocks in FormatVariadic.h and
> FormatProviders.h.  Here I provide a formal description of the grammar of
> the replacement sequences and format syntax.  So you can look at this
> without looking at the code behind it and see if you have comments just on
> the format language.
>
> Here's a summary of (most) everything contained in this patch:
>
> 1) UDL Syntax for outputting to a stream or converting to a string.
>     outs() << "{0}"_fmt.stream(1);
>     std::string S = "{0}"_fmt.string(1);
>
>
> I continue to have a strong objection to using UDLs for this (or anything
> else in LLVM).
>
> I think this feature is poorly known by many programmers. I think it will
> produce error messages that are confusing and hard to debug. I think it
> will have a significant negative impact on compile time. I also think that
> it will exercise substantially less well tested parts of every host
> compiler for LLVM and subject us to an increased rate of mysterious host
> compiler bugs.
>
> I also think it forces programmers to be aware of a "magical" construct
> that doesn't really fit with the rest of the language.
>
> It isn't that any of these issues in isolation cannot be overcome, it is
> that I think the value provided by the UDL specifically is substantially
> smaller than the cost.
>
> I would *very strongly* prefer that this is accomplished with "normal" C++
> syntax, and that compile time checking is done with constexpr when
> available. I think that will give the overwhelming majority of the benefit
> with dramatically lower cost.
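
As an illustration of constexpr-based checking with ordinary syntax, here is a small sketch that assumes a C++14 (or later) host compiler. `requiredArgs` is a hypothetical helper, not something from the patch, and it only recognizes bare "{N}" replacements. It computes how many arguments a format literal needs, so that a mismatch can be rejected with `static_assert`.

```cpp
#include <cstddef>

// Hypothetical helper: returns one more than the largest "{N}" index found
// in the literal, i.e. the number of arguments the format string requires.
template <std::size_t Len>
constexpr std::size_t requiredArgs(const char (&Fmt)[Len]) {
  std::size_t Needed = 0;
  for (std::size_t I = 0; I + 1 < Len; ++I) {
    if (Fmt[I] != '{')
      continue;
    // Parse the decimal digits that follow the opening brace.
    std::size_t Index = 0;
    std::size_t J = I + 1;
    bool SawDigit = false;
    while (J < Len && Fmt[J] >= '0' && Fmt[J] <= '9') {
      Index = Index * 10 + static_cast<std::size_t>(Fmt[J] - '0');
      SawDigit = true;
      ++J;
    }
    // Count it only if the digits are closed by '}'.
    if (SawDigit && J < Len && Fmt[J] == '}' && Index + 1 > Needed)
      Needed = Index + 1;
  }
  return Needed;
}

// Evaluated entirely at compile time.
static_assert(requiredArgs("{0} {1} {0}") == 2, "expects two arguments");
static_assert(requiredArgs("no replacements") == 0, "expects no arguments");
```

One caveat: a function parameter is not a constant expression, so a formatting function cannot `static_assert` on a string it receives as an argument. The literal has to reach the check in a constant-expression context, which is the problem the FORMAT_STRING macro idea was trying to work around.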
>
>
> +1, the UDL seems a bit too automagical.
> `format_string("{0}", 1)` is not that much longer than
> `"{0}"_fmt.string(1)`, but significantly less magical.
>
> Simple example: what should I type into a search engine to find the LLVM
> doxygen for the UDL? I know to search "llvm format_string" for the format
> string, but just from looking at a use of the UDL syntax it might not be
> clear that format_string is even called.
>
> -- Sean Silva
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>