[llvm-dev] RFC: Adding a code size analysis tool

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Mon Oct 1 15:25:41 PDT 2018


On Mon, Oct 1, 2018 at 3:24 PM JF Bastien <jfbastien at apple.com> wrote:

> On Oct 1, 2018, at 3:16 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> (my vote, somewhat biased, is that I'd love to see more investment in
> Bloaty (to keep all these sorts of size analysis tools and tricks in one
> place), but I sort of accept that folks are probably going to keep
> building more infrastructure for this sort of thing in LLVM directly)
>
>
> I get where that comes from, but it seems a bit like a Valgrind versus
> sanitizer argument: integrating with the toolchain gives you things you
> can’t really get otherwise. Valgrind is still great as a self-standing
> thing.
>

Not sure that's quite the same though - with the sanitizers, integrating
with the optimizers is the key.

With bloaty - it could, at worst, use LLVM's libDebugInfo as a library to
implement the more advanced debug-using features without being less
functional than an in-LLVM implementation.

- Dave


>
>
> On Wed, Sep 26, 2018 at 12:03 PM Vedant Kumar <vsk at apple.com> wrote:
>
>> Hello,
>>
>> I worked on a code size analysis tool for a 'week of code' project and
>> think that it might be useful enough to upstream.
>>
>> The tool is inspired by bloaty (https://github.com/google/bloaty), but
>> tries to do more to attribute code size in actionable ways.
>>
>> For example, it can calculate how many bytes inlined instances of a
>> function added to a binary. In its diff mode, it can show how much more
>> aggressively a function was inlined compared to a baseline. This can be
>> useful when you're, say, trying to figure out why firmware compiled by a
>> new compiler is just a few bytes over the size limit imposed by your
>> embedded device :). In this case, extra information about inlining can
>> help inform a decision to either tweak the inliner's cost model or to
>> judiciously add a few `noinline` attributes. (Note that if you're willing
>> to recompile & write a few SQL queries, optimization remarks can give you
>> similar information, albeit at the IR level.)
>>
>> As another example, this code size tool can attribute code size to
>> semantically interesting groups of code, like C++/Swift classes, or
>> files. In the diff mode, you can see how the code size of a class/file
>> grew compared to a baseline. The tool understands inheritance, so you can
>> also see interesting high-level trends. E.g., `clang::Sema` grew more
>> than `llvm::Pass` between clang-6 and clang-7.
>>
>> Unlike bloaty, this tool focuses exclusively on the text segment. Also
>> unlike bloaty, it uses LLVM's DWARF parser instead of rolling its own.
>> The tool is currently implemented as a sub-tool of llvm-dwarfdump.
>>
>> To get size information about a program, you do:
>>
>>   llvm-dwarfdump size-info -baseline <object> -stats-dir <dir>
>>
>> This emits four *.stats files into <dir>, each containing a distinct
>> 'view' into the code groups in <object>. There's a file view, a function
>> view, a class view, and an inlining view. Each view is sorted by code
>> size, so you can see the largest functions/classes/etc immediately.
>>
>> The *.stats files are just human-readable text files. As it happens, they
>> use the flamegraph format (http://brendangregg.com/flamegraphs.html).
>> This makes it easy to visualize any view as a flamegraph. (If you haven't
>> seen one before, it's a hierarchical visualization where the width of
>> each entry corresponds to its frequency (or in this case size).)
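
As a rough illustration of the format described above, here is a minimal Python sketch that reads flamegraph-style ("folded stack") lines like the ones quoted later in this mail. It assumes one entry per line, frames separated by `;`, and the size as the last whitespace-separated token; the function names `parse_stats_line` and `parse_stats` are made up for this example and are not part of the tool. Elision lines such as `...` and names that themselves contain `;` are not handled.

```python
from typing import List, Tuple


def parse_stats_line(line: str) -> Tuple[List[str], int]:
    """Split one folded-stack line into (frames, size_in_bytes)."""
    # The size is the last whitespace-separated token; the rest is the stack.
    stack, size = line.rstrip().rsplit(None, 1)
    # Frames are separated by ';', outermost group first.
    frames = [frame.strip() for frame in stack.split(";")]
    return frames, int(size)


def parse_stats(text: str) -> List[Tuple[List[str], int]]:
    """Parse every non-empty line of a *.stats / *.diffstats style text."""
    return [parse_stats_line(line) for line in text.splitlines() if line.strip()]


# Two entries copied from the file view example in this thread.
sample = (
    "lib/Target/X86/X86ISelLowering.cpp [file];combineMul([snip]) [function] 14907\n"
    "tools/clang/lib/Sema/SemaExpr.cpp [file];CheckIdentityFieldAssignment([snip]) [function] 1280\n"
)
entries = parse_stats(sample)
```

Each entry comes back as a list of frames plus an integer byte count, which is enough to sort or aggregate a view without rendering a graph.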
>>
>> To look at code growth between two programs, you'd do:
>>
>>   llvm-dwarfdump size-info -baseline <object> -target <object> -stats-dir <dir>
>>
>> Similarly, this emits four 'view' files into <dir>, but with a *.diffstats
>> suffix. The format is the same.
>>
>> Pending Work
>> ------------
>>
>> I think the main piece of work the tool needs is better testing. Currently
>> there's just a single end-to-end test in clang. It might be better to
>> check in a few binaries so we can check that the tool reports sizes
>> correctly.
>>
>> Also, it may turn out that folks are interested in different ways of
>> visualizing size data. While the textual format of flamegraphs is really
>> convenient for humans to read, the graphs themselves do make more sense
>> when the underlying data have a frequentist interpretation. If there's
>> enough interest I can explore using an alternative format for
>> visualization, e.g.:
>>
>>   http://neugierig.org/software/chromium/bloat/
>>   https://github.com/evmar/webtreemap
>>
>> (Thanks JF for pointing these out!)
>>
>> Here's a link to the source code:
>>
>>   https://github.com/vedantk/llvm-project/tree/sizeinfo
>>
>> Selected Examples
>> -----------------
>>
>> Here are a few interesting snippets from a comparison of clang-6 vs.
>> clang-7.
>>
>> First, let's take a look at the function view diffstat. Here are the 10
>> functions which grew in size the most. On the left hand side, you'll see
>> the demangled function name. The *change* in code size in bytes is
>> reported on the right hand side (only positive changes are reported).
>>
>>   clang::Sema::CheckHexagonBuiltinCpu([snip]) [function] 170316
>>   ProcessDeclAttribute([snip]) [function] 125893
>>   llvm::AArch64InstPrinter::printAliasInstr([snip]) [function] 105133
>>   llvm::AArch64AppleInstPrinter::printAliasInstr([snip]) [function] 105133
>>   ParseCodeGenArgs([snip]) [function] 64692
>>   unswitchNontrivialInvariants([snip]) [function] 40180
>>   getAttrKind([snip]) [function] 35811
>>   clang::DumpCompilerOptionsAction::ExecuteAction() [function] 32417
>>   llvm::UpgradeIntrinsicCall([snip]) [function] 30239
>>   bool llvm::InstructionSelector::executeMatchTable<(anonymous namespace)::ARMInstructionSelector const, [snip]) const [function] 29352
>>
>>
>> Next, let's look at the file view diffstat. This can be useful because it
>> goes beyond simply identifying the files which grew the most. It actually
>> describes which *functions* grew the most in those files, creating more
>> opportunities to do something about the code growth.
>>
>>   lib/Target/X86/X86ISelLowering.cpp [file];combineX86ShuffleChain([snip]) [function] 24864
>>   lib/Target/X86/X86ISelLowering.cpp [file];combineMul([snip]) [function] 14907
>>   lib/Target/X86/X86ISelLowering.cpp [file];combineStore([snip]) [function] 12220
>>   ...
>>   tools/clang/lib/Sema/SemaExpr.cpp [file];clang::Sema::CheckCompareOperands([snip]) [function] 16024
>>   tools/clang/lib/Sema/SemaExpr.cpp [file];diagnoseTautologicalComparison([snip]) [function] 1740
>>   tools/clang/lib/Sema/SemaExpr.cpp [file];clang::Sema::ActOnNumericConstant([snip]) [function] 1436
>>   tools/clang/lib/Sema/SemaExpr.cpp [file];checkThreeWayNarrowingConversion([snip]) [function] 1356
>>   tools/clang/lib/Sema/SemaExpr.cpp [file];CheckIdentityFieldAssignment([snip]) [function] 1280
>>
>>
>> The class view diffstat is a bit different because it has more levels of
>> nesting than the other views, due to inheritance. This might help give a
>> sense for the high-level changes in a program, but may also be less
>> actionable.
>>
>>   clang::Sema [class];clang::Sema::CheckHexagonBuiltinCpu([snip]) [function] 170316
>>   clang::Sema [class];clang::Sema::CheckHexagonBuiltinArgument([snip]) [function] 24156
>>   clang::Sema [class];clang::Sema::ActOnTag([snip]) [function] 22373
>>   ...
>>   llvm::AArch64InstPrinter [class];llvm::AArch64AppleInstPrinter [class];llvm::AArch64AppleInstPrinter::printAliasInstr([snip]) [function] 105133
>>   llvm::AArch64InstPrinter [class];llvm::AArch64AppleInstPrinter [class];llvm::AArch64AppleInstPrinter::printInstruction([snip]) [function] 5824
>>   ...
>>   llvm::Pass [class];llvm::FunctionPass [class];llvm::MachineFunctionPass [class];(anon)::X86SpeculativeLoadHardeningPass [class];(anonymous namespace)::X86SpeculativeLoadHardeningPass::checkAllLoads(llvm::MachineFunction&) [function] 19287
>>   ...
>>   llvm::Pass [class];llvm::FunctionPass [class];llvm::MachineFunctionPass [class];(anon)::MachineLICMBase [class];(anonymous namespace)::MachineLICMBase::runOnMachineFunction(llvm::MachineFunction&) [function] 20343
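
To see how nested class-view entries like the ones above can roll up into per-class totals, here is a small Python sketch. It assumes each entry has already been split into its frames and a byte count, and it credits the leaf function's size to every enclosing `[class]` frame, which is the same aggregation a flamegraph uses to size its boxes. The `rollup` helper is hypothetical, and the totals cover only the three sample entries copied here, not the full diffstat.

```python
from collections import defaultdict


def rollup(entries):
    """Sum each leaf's size into every enclosing group frame."""
    totals = defaultdict(int)
    for frames, size in entries:
        # All frames except the last (the function itself) are enclosing groups.
        for frame in frames[:-1]:
            totals[frame] += size
    return dict(totals)


# Three entries taken from the class view example in this thread.
entries = [
    (["clang::Sema [class]",
      "clang::Sema::CheckHexagonBuiltinCpu([snip]) [function]"], 170316),
    (["clang::Sema [class]",
      "clang::Sema::ActOnTag([snip]) [function]"], 22373),
    (["llvm::Pass [class]",
      "llvm::FunctionPass [class]",
      "llvm::MachineFunctionPass [class]",
      "(anon)::MachineLICMBase [class]",
      "(anonymous namespace)::MachineLICMBase::runOnMachineFunction(llvm::MachineFunction&) [function]"],
     20343),
]
totals = rollup(entries)
```

With this kind of rollup, growth in a derived class such as `MachineLICMBase` also shows up under `llvm::MachineFunctionPass`, `llvm::FunctionPass`, and `llvm::Pass`.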
>>
>> Here's a link to a flamegraph of the class view diffstat (warning: it's
>> big):
>>
>>   http://net.vedantk.com/static/llvm/swift-clang-4.2-vs-5.0.class-view.diffstats.svg
>>
>> Finally, here are a few interesting entries from the inlining view
>> diffstat. As with all of the other views, the right hand side still shows
>> code growth in bytes. For a given inlining target, this size is computed
>> by diffing the sum of PC range lengths from all
>> DW_TAG_inlined_subroutines referring to that target. This allows the size
>> tool to attribute code size to an inlining target even when the inlined
>> code is not contiguous in the caller.
>>
>>   llvm::raw_ostream::operator<<(char const*) [inlining-target] 66720
>>   llvm::MCRegisterClass::contains(unsigned int) const [inlining-target] 64161
>>   llvm::StringRef::StringRef(char const*) [inlining-target] 39262
>>   llvm::MCInst::getOperand(unsigned int) const [inlining-target] 33268
>>   clang::CodeCompletionResult::~CodeCompletionResult() [inlining-target] 25763
>>   llvm::operator+(llvm::Twine const&, llvm::Twine const&) [inlining-target] 25525
>>   clang::ASTImporter::Import(clang::SourceLocation) [inlining-target] 21096
>>   clang::Sema::Diag(clang::SourceLocation, unsigned int) [inlining-target] 20898
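
The computation described above (sum the PC range lengths of every inlined instance, then diff against the baseline) might look roughly like the following Python sketch. It is not the tool's implementation: it assumes the ranges have already been pulled out of the debug info as half-open `(low_pc, high_pc)` pairs, and the target names `f` and `g`, the addresses, and the helper names are all invented for illustration.

```python
from collections import defaultdict


def inlined_bytes(instances):
    """Total bytes per inlining target.

    `instances` holds one (target_name, ranges) pair per inlined instance,
    where `ranges` is a list of half-open (low_pc, high_pc) pairs. An
    instance may have several ranges if its code is not contiguous.
    """
    sizes = defaultdict(int)
    for target, ranges in instances:
        sizes[target] += sum(high - low for low, high in ranges)
    return dict(sizes)


def growth(baseline, target):
    """Per-target size change, keeping only positive changes."""
    diff = {name: size - baseline.get(name, 0) for name, size in target.items()}
    return {name: delta for name, delta in diff.items() if delta > 0}


# Made-up data: `f` is inlined once in the baseline and twice in the target.
baseline = inlined_bytes([("f", [(0x1000, 0x1010)])])
target = inlined_bytes([
    ("f", [(0x1000, 0x1010), (0x1040, 0x1048)]),  # one instance, two ranges
    ("f", [(0x2000, 0x2020)]),
    ("g", [(0x3000, 0x3004)]),
])
result = growth(baseline, target)
```

Summing per range rather than taking `max(high) - min(low)` is what keeps non-contiguous inlined code from being over-counted.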
>>
>> Feedback & questions welcome!
>>
>> thanks,
>> vedant
>>
>