[llvm-dev] Should we split llvm Support and ADT?

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Mon May 29 10:22:51 PDT 2017


2017-05-29 9:25 GMT-07:00 Zachary Turner <zturner at google.com>:

> On Sun, May 28, 2017 at 8:54 PM Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> 2017-05-26 17:47 GMT-07:00 Zachary Turner via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> Changing a header file somewhere and having to spend 10 minutes waiting
>>> for a build leads to a lot of wasted developer time.
>>>
>>> The real culprit here is tablegen.  Can we split support and ADT into
>>> two - the parts that tablegen depends on and the parts that it doesn't?
>>>
>>
>> Splitting ADT just based on tablegen usage seems dubious to me. If we
>> need to go this route, I'd replace as many uses of ADT data structures with
>> STL ones as possible to begin with, to reduce the surface.
>>
>
> Tablegen would not need to determine WHERE to split, it would just
> motivate the why.
>

Well, I'd question even the why :)
(Note that I was talking about ADT, not Support, above.)



>   It's obvious just from looking at Support's include directory though
> that a lot of stuff in there doesn't belong together.  A quick look over
> the include directory already suggests a split into "broadly useful stuff"
> and "narrowly useful stuff"
>

I agree, Support is a mess IMO (we have target-specific stuff in there just for
the sake of sharing code with clang, AFAIK), and I'm not sure anyone would
oppose splitting it. Ideally, I would split it in such a way that it
could (at some point) be useful outside of LLVM (just like ADT), so one
main criterion could be "could this component of Support be useful outside
of LLVM (and its subprojects)?".



> Broadly useful stuff:
> AlignOf
> Allocator
> ArrayRecycler
> Atomic
> AtomicOrdering
> Capacity
> Casting
> Chrono
> circular_raw_ostream
> COM.h
> CommandLine.h
> Compiler.h
> ConvertUTF.h
> CrashRecoveryContext.h
> DataExtractor.h
> Debug.h
> Endian.h
> EndianStream.h
> Errc.h
> Errno.h
> Error.h
> ErrorHandling.h
> ErrorOr.h
> FileOutputBuffer.h
> FileSystem.h
> FileUtilities.h
> Format*.h
> GlobPattern.h
> Host.h
> JamCRC.h
> KnownBits.h
> LineIterator.h
> Locale.h
> ManagedStatic.h
> MathExtras.h
> MD5.h
> Memory.h
> MemoryBuffer.h
> Mutex.h
> MutexGuard.h
> NativeFormatting.h
> Options.h
> Parallel.h
> Path.h
> PointerLikeTypeTraits.h
> PrettyStackTrace.h
> Printable.h
> Process.h
> Program.h
> RandomNumberGenerator.h
> raw_os_ostream.h
> raw_ostream.h
> raw_sha1_ostream.h
> Recycler.h
> RecyclingAllocator.h
> Regex.h
> RWMutex.h
> SaveAndRestore.h
> ScaledNumber.h
> SHA1.h
> Signals.h
> StringPool.h
> StringSaver.h
> SwapByteOrder.h
> SystemUtils.h
> thread.h
> Threading.h
> ThreadLocal.h
> ThreadPool.h
> Timer.h
> TrailingObjects.h
> Unicode.h
> UnicodeCharRanges.h
> UniqueLock.h
> Watchdog.h
> Win64EH.h
> WindowsError.h
> xxhash.h
>
>
> Narrowly useful stuff:
> AArch64TargetParser.def
> ARMAttributeParser.h
> ARMBuildAttributes.h
> ARMEHABI.h
> ARMTargetParser.def
> ARMWinEH.h
> Binary*Stream*.h
> BlockFrequency.h
> BranchProbability.h
> CachePruning.h
> CBindingWrapping.h
> CodeGen.h
> CodeGenCWrappers.h
> COFF.h
> Compression.h
> DebugCounter.h
> DotGraphTraits.h
> Dwarf.def
> Dwarf.h
> DynamicLibrary.h
> ELF.h
> GCOV.h
> GenericDomTree.h
> GenericDomTreeConstruction.h
> GraphWriter.h
> LEB128.h
>

LEB128.h seems quite generic.
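
For context, LEB128 is a variable-length integer encoding (used by DWARF
and WebAssembly, among others) that has no inherent dependency on LLVM. The
following is a minimal sketch of the unsigned variant in Python, not the code
in LEB128.h; the function names are made up for illustration.

```python
from typing import Tuple


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)         # last byte: high bit stays clear
            return bytes(out)


def decode_uleb128(data: bytes) -> Tuple[int, int]:
    """Decode one ULEB128 value; return (value, number of bytes consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear marks the final byte
            return result, index + 1
        shift += 7
    raise ValueError("truncated ULEB128 value")
```

For example, encode_uleb128(624485) yields the three bytes E5 8E 26, and
decoding those bytes gives back (624485, 3).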



> LockFileManager.h
> LowLevelTypeImpl.h
> MachO.def
> MachO.h
> MipsABIFlags.h
> OnDiskHashTable.h
> PluginLoader.h
> Registry.h
> ScopedPrinter.h
> SMLoc.h
> Solaris.h
> SourceMgr.h
> SpecialCaseList.h
> TargetParser.h
> TargetRegistry.h
> TargetSelect.h
> TarWriter.h
> ToolOutputFile.h
> TrigramIndex.h
> TypeName.h
> Valgrind.h
> Wasm.h
> YAMLParser.h
> YAMLTraits.h
>

YAMLParser.h seems quite generic as well.

-- 
Mehdi




>
>
> So, as a very crude first attempt, you call the first group of stuff
> "Base", the second group "Support", and add a dependency from Support to
> Base.  This has nothing to do with tablegen, btw, and tablegen would still
> probably depend on Support even after this separation, but it makes sense
> even from a high level layering perspective (IMO)
>
>
>>
>> 2017-05-28 8:25 GMT-07:00 Krzysztof Parzyszek via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> On 5/26/2017 7:59 PM, Zachary Turner via llvm-dev wrote:
>>>
>>>> It's that TableGen depends on Support, so if you change one file in
>>>> support, support gets recompiled into a new static archive, which
>>>> triggers a rerun of tablegen on all the tablegen inputs, which is
>>>> extremely slow.
>>>>
>>>
>>> What exactly is extremely slow? In my experience TableGen itself takes a
>>> negligible amount of time compared to the rest of the build. This is
>>> particularly true in cases when something in Support or ADT is modified, as
>>> this usually triggers recompilation of large parts of LLVM.
>>>
>>
>> Tablegen built in debug mode is really slow though; I remember an out-of-tree
>> backend where running llvm-tblgen was taking up to 5 min per file!
>> The CMake option LLVM_OPTIMIZED_TABLEGEN helps a lot with this. Otherwise
>> LLVM_TABLEGEN is even more efficient, but it's a double-edged sword.
>>
>> But we could also use the diff-and-copy approach not on the tablegen
>> output but on the llvm-tblgen binary itself, that way we wouldn't re-run it
>> when it does not change itself (I'm not sure why CMake does not use this
>> strategy by default for any file including .o and .a?).
>>
>> --
>> Mehdi
>>
>
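
For reference, the diff-and-copy idea quoted above amounts to the following:
build the tool to a scratch location, then copy it over the path that
downstream rules depend on only when its contents actually changed, so the
timestamp of that path stays put otherwise. Below is a minimal sketch in
Python, assuming a build system that decides what to rerun based on file
timestamps. The function name is hypothetical and this is not how the LLVM
build is wired today; CMake exposes a similar primitive as
`cmake -E copy_if_different`.

```python
import filecmp
import os
import shutil


def copy_if_different(src: str, dst: str) -> bool:
    """Copy src over dst only if the contents differ.

    Returns True when a copy happened. When the files are byte-identical,
    dst is left untouched, so its modification time does not change and
    timestamp-based rules that depend on dst are not retriggered.
    """
    # shallow=False forces a content comparison rather than trusting
    # file size and modification time alone.
    if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=False):
        return False
    shutil.copyfile(src, dst)
    return True
```

With this, the tablegen rules would depend on the stable copy of the binary
rather than on the freshly linked one, so relinking an identical llvm-tblgen
would not force every .td file to be reprocessed.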