[llvm-dev] Should we split llvm Support and ADT?

Wed Jul 5 23:29:08 PDT 2017

The dependencies for StringRef don't seem particularly clean. StringRef.cpp
includes APInt.h and APFloat.h. APInt.cpp includes FoldingSet.h and
SmallString.h.

~Craig

On Wed, Jul 5, 2017 at 11:04 PM, Chandler Carruth via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On Wed, Jul 5, 2017 at 8:53 PM Zachary Turner <zturner at google.com> wrote:
>
>> What I mean is, *why* would a non-LLVM project want or need everything in
>> Support?  There is a ton of compiler-specific stuff in there.  There's also
>> file i/o stuff, string formatting, threading support, floating point
>> support, even target specific stuff.  Who needs this other than LLVM?  And
>> if the answer is "nobody" (which I suspect it is), then why should non-LLVM
>> projects import it?  "Because it's the status quo" doesn't seem like a good
>> reason.
>>
>
> What "non-LLVM" project should be using this library at all though? I
> don't know about others, but this is a non-goal for me. If we want to use a
> generic and re-usable library, I would start with one of the many existing
> ones rather than inventing our own, and that has its own host of problems.
>
> But also, "file i/o stuff, string formatting, threading support, floating
> point support" all are also part of the standard library and folks seem OK
> with that. So I'm trying to understand the specific split you want to make
> and what motivates it. I don't think the fact that we *can* make a split is
> actually sufficient justification for making a split.
>
> I continue to think the most effective thing to do is to not think about
> this in terms of "splitting" and instead look for specific, well-defined
> components that could be extracted and layered above support.
>
> I was pretty happy with extracting a BinaryFormat library for example.
> Here, because we gained a clear role and scope for the library, improved
> organization was a compelling motivation. But I don't see any arbitrary
> split as particularly better or less well organized, and based on the
> *previous* arbitrary split (ADT vs. Support) I suspect we will continually
> put new things into the wrong location or want to move things across the
> boundary.
>
> So I kind of come back again to my original question: what problem are you
> trying to solve? I can guess at a few things, but probably will be wrong...
> my guesses:
>
> 1) Better organization of code & libraries
> 2) A technical incompatibility like removing global constructors
> 3) Reducing the size of tools using the library
>
> For #1: I continue to think finding clearly defined components to extract
> is the best approach here. I suspect we'll be left with essentially a
> replacement for facilities that one might expect to find in a standard
> library, and I suspect that's about the best we can do.
>
> For #2: This would be awesome. I would probably approach it by trying to
> extract libraries that actually need global constructors into separate
> libraries that document this requirement. My suspicion is that these are
> very few and far between.
>
> For #3: I don't understand why this matters -- dynamic linking shouldn't
> care, and static linking should drop the unused code. But if this isn't
> working for some reason, I'm totally down with solving it, but the first
> step is probably to understand the specific issue being hit.
>
> Anyways, mostly guessing at this to make sure we make progress. Don't want
> to just be obstructionist here, I'm just genuinely trying to find the best
> path forward.
>
> -Chandler
>
>
>>
>> On Wed, Jul 5, 2017 at 8:39 PM Chandler Carruth <chandlerc at gmail.com>
>> wrote:
>>
>>> On Wed, Jul 5, 2017 at 8:14 PM Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> Is there actually a valid use case for using the entire Support library
>>>> though?
>>>>
>>>> One thing that splitting solves is that I can have StringRef and
>>>> ArrayRef split up and committed by tomorrow.  The same can't be said for
>>>> the entire Support library :)
>>>>
>>>
>>> Huh?
>>>
>>> I'm asking what is the (remaining) use case for *any* split. unsplit is
>>> the status-quo, so I don't see how the easy of doing a split is a use
>>> case... But maybe I'm misunderstanding something...
>>>
>>>
>>>>
>>>> On Wed, Jul 5, 2017 at 7:18 PM Chandler Carruth <chandlerc at gmail.com>
>>>> wrote:
>>>>
>>>>> On Wed, Jul 5, 2017 at 7:14 PM Sean Silva <chisophugis at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Wed, Jul 5, 2017 at 6:46 PM, Chandler Carruth via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> Having watched a similar library go through this exact evolution, I
>>>>>>> really doubt we want to make any split around "things known to be in C++ in
>>>>>>> the future"... It turns out that this is nearly impossible to predict and
>>>>>>> precludes a tremendous amount of useful utilities.
>>>>>>>
>>>>>>> For example, there is no indication that the range helpers LLVM
>>>>>>> provides will ever end up in C++'s standard library, but they certainly
>>>>>>> seem useful for the demangler (the concrete use case cited).
>>>>>>>
>>>>>>> What is the concrete problem with just linking the support library,
>>>>>>> in all its glory, into the demangler? Why *shouldn't* we do that? I feel
>>>>>>> like that has gotten lost (for me)....
>>>>>>>
>>>>>>
>>>>>> I think it's for reuse in low-level libraries (e.g. libcxxabi).
>>>>>>
>>>>>
>>>>> Ok, but the challenges there seem substantially larger:
>>>>>
>>>>> 1) We have to fix the licensing thing (we're working on that, but it's
>>>>> not going to be instantaneous).
>>>>>
>>>>> 2) We would have to somehow "sandbox" every symbol linked into this
>>>>> library so that when libcxxabi itself is linked with some slightly
>>>>> different version of LLVM than it was built with things don't explode.
>>>>>
>>>>> If we solve #1 and #2 at all, running the appropriate build steps to
>>>>> strip *any* unused part of the big Support library out to minimize the
>>>>> runtime library cost seems pretty easy. And then we can use the entire
>>>>> Support library, no need to split.
>>>>>
>>>>> So I'm wondering what the *splitting* is solving I guess?
>>>>>
>>>>>
>>>>>>
>>>>>> -- Sean Silva
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Wed, Jul 5, 2017 at 6:27 PM Zachary Turner via llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> You mentioned that a good line to draw is one where we're adding
>>>>>>>> things that are *known* to be added to c++ future. How strictly do we want
>>>>>>>> to enforce this? There are lots of things have equally broad utility, but
>>>>>>>> aren't necessarily known to be added to c++ in the future.
>>>>>>>>
>>>>>>>> For example, all of MathExtras and StringExtras, many member
>>>>>>>> functions of StringRef that are not in string_view, etc. can we still have
>>>>>>>> these in the top level compatibility library?
>>>>>>>>
>>>>>>>> We could still aim for interfaces that 1-to-1 match STL, but it
>>>>>>>> would nice if we could have some equally low level extras to enhance these
>>>>>>>> classes
>>>>>>>> On Wed, Jul 5, 2017 at 5:44 PM Chris Lattner <clattner at nondot.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Sure, I guess that splitting the arrayref/stringref headers out is
>>>>>>>>> a fine first step.
>>>>>>>>>
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>> On Jul 5, 2017, at 5:07 PM, Zachary Turner <zturner at google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Re-writing StringRef / ArrayRef etc to use the exact same API is a
>>>>>>>>> good idea long term, but there's a lot of ugly messy details that need to
>>>>>>>>> be dealt with.  There's thousands of uses of take_front / drop_front, etc
>>>>>>>>> that have to be converted.  Then there's some methods that aren't in
>>>>>>>>> string_view at all, like consume_integer(), consume_front(), etc that would
>>>>>>>>> have to be raised up to global functions in StringExtras.  All of this can
>>>>>>>>> certainly be done, but it's going to be a *ton* of churn and hours spent to
>>>>>>>>> get it all STL-ified.
>>>>>>>>>
>>>>>>>>> Do you consider this a blocker for doing such a split?  Would it
>>>>>>>>> make sense to do it incrementally where we first just move StringRef et all
>>>>>>>>> wholesale, and then incrementally work to STL-ify the interface?
>>>>>>>>>
>>>>>>>>> On Wed, Jul 5, 2017 at 5:01 PM Chris Lattner <clattner at nondot.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, that proposal makes sense to me: the split would be between
>>>>>>>>>> things that *are* known to be subsumed into later versions of C++, and
>>>>>>>>>> therefore are a compatibility library.
>>>>>>>>>>
>>>>>>>>>> What do you think about this as an implementation approach:
>>>>>>>>>>
>>>>>>>>>>  - Rewrite StringRef (et al) to use the exact same APIs as
>>>>>>>>>> std::string_view.  Keep the StringRef name for now.
>>>>>>>>>>  - When cmake detects that C++’17 mode is supported, the build
>>>>>>>>>> would set a -D flag.
>>>>>>>>>>  - StringRef.h would just include the C++’17 header and typedef
>>>>>>>>>> StringRef to that type.
>>>>>>>>>>  - When we start requiring C++’17, someone can
>>>>>>>>>> “StringRef”->RAUW(“std::string_view”) and nuke the header.
>>>>>>>>>>
>>>>>>>>>> This allows us to have a clean path out of these custom types,
>>>>>>>>>> and makes it very clear that these headers are compatibility shims that go
>>>>>>>>>> away in the future.  It also makes it clear what the division is.
>>>>>>>>>>
>>>>>>>>>> -Chris
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jul 5, 2017, at 10:38 AM, Zachary Turner <zturner at google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> So, here is an example of where I think a split would be really
>>>>>>>>>> helpful.
>>>>>>>>>>
>>>>>>>>>> https://reviews.llvm.org/D34667
>>>>>>>>>>
>>>>>>>>>> This code would benefit vastly even from just being able to use
>>>>>>>>>> StringRef and ArrayRef.  We have other cases as well where we export some
>>>>>>>>>> code that cannot depend on the rest of LLVM.
>>>>>>>>>>
>>>>>>>>>> Thinking about it some, StringRef, ArrayRef, and various other
>>>>>>>>>> things like STLExtras and iterator.h basically can be summarized as "things
>>>>>>>>>> that are either already already planned for, or wouldn't be entirely out of
>>>>>>>>>> place in the STL itself".  For example, StringRef is std::string_view.
>>>>>>>>>> ArrayRef is std::array_view.  iterator_facade_base is a better version of
>>>>>>>>>> std::iterator.
>>>>>>>>>>
>>>>>>>>>> So I would drop my suggestion to split the libraries in such a
>>>>>>>>>> way that it might benefit TableGen, and instead re-word my suggestion to
>>>>>>>>>> include only classes such as StringRef, ArrayRef, and other STL-like
>>>>>>>>>> utilities that can benefit utilities like our demangler etc that cannot
>>>>>>>>>> depend on the rest of LLVM.  If and when we ever require C++17 for building
>>>>>>>>>> LLVM (a long ways away, obviously, but we might as well be forward
>>>>>>>>>> thinking), we would certainly be able to use std::string_view and
>>>>>>>>>> std::array_view in the demangler.  So splitting things in a way such as
>>>>>>>>>> this makes long term sense IMO.
>>>>>>>>>>
>>>>>>>>>> On Sun, Jun 4, 2017 at 10:50 AM Zachary Turner <
>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Fair enough, i sort of regret mentioning that specific method of
>>>>>>>>>>> splitting originally.
>>>>>>>>>>>
>>>>>>>>>>> For the record, i think any splitting should make sense on its
>>>>>>>>>>> own merit without considering tablegen, and hopefully the end result of
>>>>>>>>>>> "tablegen eventually depends on less stuff" would happen naturally
>>>>>>>>>>> On Sun, Jun 4, 2017 at 10:37 AM Chris Lattner <
>>>>>>>>>>> clattner at nondot.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> > On May 26, 2017, at 5:47 PM, Zachary Turner via llvm-dev <
>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > Changing a header file somewhere and having to spend 10
>>>>>>>>>>>> minutes waiting for a build leads to a lot of wasted developer time.
>>>>>>>>>>>> >
>>>>>>>>>>>> > The real culprit here is tablegen.  Can we split support and
>>>>>>>>>>>> ADT into two - the parts that tablegen depends on and the parts that it
>>>>>>>>>>>> doesn’t?
>>>>>>>>>>>>
>>>>>>>>>>>> In all the comments downthread, I think there is one thing that
>>>>>>>>>>>> hasn't been mentioned: doing a split like this makes tblgen evolution more
>>>>>>>>>>>> difficult.  If libsupport was split into “used by tblgen” and “not used by
>>>>>>>>>>>> tblgen” sections, and then a new tblgen feature needs to use other parts of
>>>>>>>>>>>> libsupport, they’d have to be moved into the “used by tblgen” directory.
>>>>>>>>>>>>
>>>>>>>>>>>> Splitting libsupport as a whole out into its own llvm
>>>>>>>>>>>> subproject has come up many times though, and does make a lot of sense.
>>>>>>>>>>>>
>>>>>>>>>>>> -Chris
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170705/6adfaac7/attachment.html>