[llvm-dev] Should we split llvm Support and ADT?

Wed Jul 5 23:04:07 PDT 2017

On Wed, Jul 5, 2017 at 8:53 PM Zachary Turner <zturner at google.com> wrote:

> What I mean is, *why* would a non-LLVM project want or need everything in
> Support?  There is a ton of compiler-specific stuff in there.  There's also
> file i/o stuff, string formatting, threading support, floating point
> support, even target specific stuff.  Who needs this other than LLVM?  And
> if the answer is "nobody" (which I suspect it is), then why should non-LLVM
> projects import it?  "Because it's the status quo" doesn't seem like a good
> reason.
>

What "non-LLVM" project should be using this library at all though? I don't
know about others, but this is a non-goal for me. If we want to use a
generic and re-usable library, I would start with one of the many existing
ones rather than inventing our own, and that has its own host of problems.

But also, "file i/o stuff, string formatting, threading support, floating
point support" all are also part of the standard library and folks seem OK
with that. So I'm trying to understand the specific split you want to make
and what motivates it. I don't think the fact that we *can* make a split is
actually sufficient justification for making a split.

I continue to think the most effective thing to do is to not think about
this in terms of "splitting" and instead look for specific, well-defined
components that could be extracted and layered above support.

I was pretty happy with extracting a BinaryFormat library for example.
Here, because we gained a clear role and scope for the library, improved
organization was a compelling motivation. But I don't see any arbitrary
split as particularly better or less well organized, and based on the
*previous* arbitrary split (ADT vs. Support) I suspect we will continually
put new things into the wrong location or want to move things across the
boundary.

So I kind of come back again to my original question: what problem are you
trying to solve? I can guess at a few things, but probably will be wrong...
my guesses:

1) Better organization of code & libraries
2) A technical incompatibility like removing global constructors
3) Reducing the size of tools using the library

For #1: I continue to think finding clearly defined components to extract
is the best approach here. I suspect we'll be left with essentially a
replacement for facilities that one might expect to find in a standard
library, and I suspect that's about the best we can do.

For #2: This would be awesome. I would probably approach it by trying to
extract libraries that actually need global constructors into separate
libraries that document this requirement. My suspicion is that these are
very few and far between.

For #3: I don't understand why this matters -- dynamic linking shouldn't
care, and static linking should drop the unused code. But if this isn't
working for some reason, I'm totally down with solving it, but the first
step is probably to understand the specific issue being hit.

Anyways, mostly guessing at this to make sure we make progress. Don't want
to just be obstructionist here, I'm just genuinely trying to find the best
path forward.

-Chandler

>
> On Wed, Jul 5, 2017 at 8:39 PM Chandler Carruth <chandlerc at gmail.com>
> wrote:
>
>> On Wed, Jul 5, 2017 at 8:14 PM Zachary Turner <zturner at google.com> wrote:
>>
>>> Is there actually a valid use case for using the entire Support library
>>> though?
>>>
>>> One thing that splitting solves is that I can have StringRef and
>>> ArrayRef split up and committed by tomorrow.  The same can't be said for
>>> the entire Support library :)
>>>
>>
>> Huh?
>>
>> I'm asking what is the (remaining) use case for *any* split. unsplit is
>> the status-quo, so I don't see how the easy of doing a split is a use
>> case... But maybe I'm misunderstanding something...
>>
>>
>>>
>>> On Wed, Jul 5, 2017 at 7:18 PM Chandler Carruth <chandlerc at gmail.com>
>>> wrote:
>>>
>>>> On Wed, Jul 5, 2017 at 7:14 PM Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>> On Wed, Jul 5, 2017 at 6:46 PM, Chandler Carruth via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Having watched a similar library go through this exact evolution, I
>>>>>> really doubt we want to make any split around "things known to be in C++ in
>>>>>> the future"... It turns out that this is nearly impossible to predict and
>>>>>> precludes a tremendous amount of useful utilities.
>>>>>>
>>>>>> For example, there is no indication that the range helpers LLVM
>>>>>> provides will ever end up in C++'s standard library, but they certainly
>>>>>> seem useful for the demangler (the concrete use case cited).
>>>>>>
>>>>>> What is the concrete problem with just linking the support library,
>>>>>> in all its glory, into the demangler? Why *shouldn't* we do that? I feel
>>>>>> like that has gotten lost (for me)....
>>>>>>
>>>>>
>>>>> I think it's for reuse in low-level libraries (e.g. libcxxabi).
>>>>>
>>>>
>>>> Ok, but the challenges there seem substantially larger:
>>>>
>>>> 1) We have to fix the licensing thing (we're working on that, but it's
>>>> not going to be instantaneous).
>>>>
>>>> 2) We would have to somehow "sandbox" every symbol linked into this
>>>> library so that when libcxxabi itself is linked with some slightly
>>>> different version of LLVM than it was built with things don't explode.
>>>>
>>>> If we solve #1 and #2 at all, running the appropriate build steps to
>>>> strip *any* unused part of the big Support library out to minimize the
>>>> runtime library cost seems pretty easy. And then we can use the entire
>>>> Support library, no need to split.
>>>>
>>>> So I'm wondering what the *splitting* is solving I guess?
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>> On Wed, Jul 5, 2017 at 6:27 PM Zachary Turner via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> You mentioned that a good line to draw is one where we're adding
>>>>>>> things that are *known* to be added to c++ future. How strictly do we want
>>>>>>> to enforce this? There are lots of things have equally broad utility, but
>>>>>>> aren't necessarily known to be added to c++ in the future.
>>>>>>>
>>>>>>> For example, all of MathExtras and StringExtras, many member
>>>>>>> functions of StringRef that are not in string_view, etc. can we still have
>>>>>>> these in the top level compatibility library?
>>>>>>>
>>>>>>> We could still aim for interfaces that 1-to-1 match STL, but it
>>>>>>> would nice if we could have some equally low level extras to enhance these
>>>>>>> classes
>>>>>>> On Wed, Jul 5, 2017 at 5:44 PM Chris Lattner <clattner at nondot.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Sure, I guess that splitting the arrayref/stringref headers out is
>>>>>>>> a fine first step.
>>>>>>>>
>>>>>>>> -Chris
>>>>>>>>
>>>>>>>> On Jul 5, 2017, at 5:07 PM, Zachary Turner <zturner at google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Re-writing StringRef / ArrayRef etc to use the exact same API is a
>>>>>>>> good idea long term, but there's a lot of ugly messy details that need to
>>>>>>>> be dealt with.  There's thousands of uses of take_front / drop_front, etc
>>>>>>>> that have to be converted.  Then there's some methods that aren't in
>>>>>>>> string_view at all, like consume_integer(), consume_front(), etc that would
>>>>>>>> have to be raised up to global functions in StringExtras.  All of this can
>>>>>>>> certainly be done, but it's going to be a *ton* of churn and hours spent to
>>>>>>>> get it all STL-ified.
>>>>>>>>
>>>>>>>> Do you consider this a blocker for doing such a split?  Would it
>>>>>>>> make sense to do it incrementally where we first just move StringRef et all
>>>>>>>> wholesale, and then incrementally work to STL-ify the interface?
>>>>>>>>
>>>>>>>> On Wed, Jul 5, 2017 at 5:01 PM Chris Lattner <clattner at nondot.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yes, that proposal makes sense to me: the split would be between
>>>>>>>>> things that *are* known to be subsumed into later versions of C++, and
>>>>>>>>> therefore are a compatibility library.
>>>>>>>>>
>>>>>>>>> What do you think about this as an implementation approach:
>>>>>>>>>
>>>>>>>>>  - Rewrite StringRef (et al) to use the exact same APIs as
>>>>>>>>> std::string_view.  Keep the StringRef name for now.
>>>>>>>>>  - When cmake detects that C++’17 mode is supported, the build
>>>>>>>>> would set a -D flag.
>>>>>>>>>  - StringRef.h would just include the C++’17 header and typedef
>>>>>>>>> StringRef to that type.
>>>>>>>>>  - When we start requiring C++’17, someone can
>>>>>>>>> “StringRef”->RAUW(“std::string_view”) and nuke the header.
>>>>>>>>>
>>>>>>>>> This allows us to have a clean path out of these custom types, and
>>>>>>>>> makes it very clear that these headers are compatibility shims that go away
>>>>>>>>> in the future.  It also makes it clear what the division is.
>>>>>>>>>
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Jul 5, 2017, at 10:38 AM, Zachary Turner <zturner at google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> So, here is an example of where I think a split would be really
>>>>>>>>> helpful.
>>>>>>>>>
>>>>>>>>> https://reviews.llvm.org/D34667
>>>>>>>>>
>>>>>>>>> This code would benefit vastly even from just being able to use
>>>>>>>>> StringRef and ArrayRef.  We have other cases as well where we export some
>>>>>>>>> code that cannot depend on the rest of LLVM.
>>>>>>>>>
>>>>>>>>> Thinking about it some, StringRef, ArrayRef, and various other
>>>>>>>>> things like STLExtras and iterator.h basically can be summarized as "things
>>>>>>>>> that are either already already planned for, or wouldn't be entirely out of
>>>>>>>>> place in the STL itself".  For example, StringRef is std::string_view.
>>>>>>>>> ArrayRef is std::array_view.  iterator_facade_base is a better version of
>>>>>>>>> std::iterator.
>>>>>>>>>
>>>>>>>>> So I would drop my suggestion to split the libraries in such a way
>>>>>>>>> that it might benefit TableGen, and instead re-word my suggestion to
>>>>>>>>> include only classes such as StringRef, ArrayRef, and other STL-like
>>>>>>>>> utilities that can benefit utilities like our demangler etc that cannot
>>>>>>>>> depend on the rest of LLVM.  If and when we ever require C++17 for building
>>>>>>>>> LLVM (a long ways away, obviously, but we might as well be forward
>>>>>>>>> thinking), we would certainly be able to use std::string_view and
>>>>>>>>> std::array_view in the demangler.  So splitting things in a way such as
>>>>>>>>> this makes long term sense IMO.
>>>>>>>>>
>>>>>>>>> On Sun, Jun 4, 2017 at 10:50 AM Zachary Turner <zturner at google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Fair enough, i sort of regret mentioning that specific method of
>>>>>>>>>> splitting originally.
>>>>>>>>>>
>>>>>>>>>> For the record, i think any splitting should make sense on its
>>>>>>>>>> own merit without considering tablegen, and hopefully the end result of
>>>>>>>>>> "tablegen eventually depends on less stuff" would happen naturally
>>>>>>>>>> On Sun, Jun 4, 2017 at 10:37 AM Chris Lattner <
>>>>>>>>>> clattner at nondot.org> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> > On May 26, 2017, at 5:47 PM, Zachary Turner via llvm-dev <
>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > Changing a header file somewhere and having to spend 10
>>>>>>>>>>> minutes waiting for a build leads to a lot of wasted developer time.
>>>>>>>>>>> >
>>>>>>>>>>> > The real culprit here is tablegen.  Can we split support and
>>>>>>>>>>> ADT into two - the parts that tablegen depends on and the parts that it
>>>>>>>>>>> doesn’t?
>>>>>>>>>>>
>>>>>>>>>>> In all the comments downthread, I think there is one thing that
>>>>>>>>>>> hasn't been mentioned: doing a split like this makes tblgen evolution more
>>>>>>>>>>> difficult.  If libsupport was split into “used by tblgen” and “not used by
>>>>>>>>>>> tblgen” sections, and then a new tblgen feature needs to use other parts of
>>>>>>>>>>> libsupport, they’d have to be moved into the “used by tblgen” directory.
>>>>>>>>>>>
>>>>>>>>>>> Splitting libsupport as a whole out into its own llvm subproject
>>>>>>>>>>> has come up many times though, and does make a lot of sense.
>>>>>>>>>>>
>>>>>>>>>>> -Chris
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170706/46db6510/attachment-0001.html>