[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary

Teresa Johnson via llvm-dev llvm-dev at lists.llvm.org
Thu May 3 15:23:12 PDT 2018


On Thu, May 3, 2018 at 3:16 PM, Steven Wu <stevenwu at apple.com> wrote:

>
>
> On May 3, 2018, at 2:58 PM, Teresa Johnson <tejohnson at google.com> wrote:
>
>
>
> On Thu, May 3, 2018 at 2:16 PM, Steven Wu <stevenwu at apple.com> wrote:
>
>>
>>
>> On May 1, 2018, at 4:50 PM, Teresa Johnson via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Hi Mehdi, thanks for the comments, responses and a tweaked proposal
>> below. Teresa
>>
>> On Tue, May 1, 2018 at 11:37 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> My main concern is this one:
>>>
>>> > Currently, I am emitting the summary entries at the end, after the
>>> metadata nodes. Note that the ModuleSummaryIndex is not currently
>>> referenced from the Module, and isn’t currently created when parsing the
>>> Module IR bitcode (there is a separate derived class for reading the
>>> ModuleSummaryIndex from bitcode). This is because they are not currently
>>> used at the same time. However, in the future there is no reason why we
>>> couldn’t tag the global values in the Module’s LLVM assembly with the
>>> corresponding summary entry if the ModuleSummaryIndex is available when
>>> printing the Module in the assembly writer. I.e. we could do the following
>>> for “main” from the above example when printing the IR definition (note the
>>> “^3” at the end):
>>>
>>>
>>> I believe the reason that the ModuleSummaryIndex is not attached to the
>>> Module is that it is fundamentally not a piece of IR, but it is
>>> conceptually really an Analysis result.
>>> Usually other analyses don't serialize their results; we happen to
>>> serialize this one for an optimization purpose (reloading it to make the
>>> thin link faster).
>>>
>>
>> True. My understanding is that the push for having it serialized via
>> assembly is due to the fact that it is emitted into the bitcode. I know
>> there is disagreement on this reasoning, I am hoping to have a proposal
>> that is acceptable to everyone. =)
>>
>>
>> I totally agree with Mehdi’s concern. It is probably much easier to treat
>> the module summary as an analysis result, rather than as part of the module.
>> Making it part of the IR might be overkill just to fix the debuggability and
>> readability of the summary.
>>
>>
>>
>>
>>> The fundamental problem is that an analysis result has to be able to be
>>> invalidated with IR changes, attaching this directly to the module wouldn't
>>> achieve this. The risk is that when the IR and the summary get out-of-sync
>>> (`clang -O2 my_module_with_summaries.ll -emit-llvm -o my_optimized
>>> module_with_summaries.ll`) the summaries would be badly wrong.
>>>
>>> Have you looked into what it'd take to make it a "real" analysis in the
>>> pass manager?
>>>
>>
>> Thanks for raising this issue specifically; I hadn't addressed it in my
>> proposal and it is a big one. I am not proposing that we attempt to
>> maintain the summary through optimization passes, and definitely don't
>> think we should do that. IMO deserializing it should be for testing the
>> thin link and the combined summaries in the backends only. To that end, I
>> have an idea (below, but some background first).
>>
>> Note that in some cases the module summary analysis is an analysis pass.
>> I.e. when invoked by "opt -module-summary=". However, some time ago when
>> Peter added the support for splitting the bitcode (for CFI purposes) and
>> therefore needed to generate a summary in each partition (Thin and
>> Regular), he added the ThinLTOBitcodeWriterPass, which invokes the module
>> summary builder directly (twice). This writer is what gets invoked now when
>> building via "clang -flto=thin", and with "opt -thinlto-bc". So there it is
>> not invoked/maintained as an analysis pass/result. It would be tricky to
>> figure out how to split, rather than recompute, the module summary index
>> in that case. Even in the case where we are still invoking it as an analysis
>> pass (opt -module-summary), we would need to figure out how to read in the
>> module summary to use as the analysis result when available (so that it
>> could be invalidated and recomputed when stale).
>>
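>> (For concreteness, a rough sketch of those two producer paths, using the
>> commands mentioned above; file names are just illustrative:)
>>
>>   # ThinLTOBitcodeWriterPass path (invokes the summary builder directly):
>>   clang -c -flto=thin foo.c -o foo.o
>>   opt -thinlto-bc foo.ll -o foo.bc
>>
>>   # module summary analysis pass path:
>>   opt -module-summary foo.ll -o foo.bc
>>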
>> Rather than add this plumbing, and just have it discarded if opt does any
>> optimization, I think we should focus at least for the time being on
>> supporting reading the summary from assembly exactly where we currently
>> read in the summary from bitcode:
>> 1) For the thin link (e.g. tools such as llvm-lto2 or llvm-lto, which
>> currently have to be preceded by "opt -module-summary/-thinlto-bc" to
>> generate an index, but could just build it from assembly instead).
>> 2) For the LTO backends (e.g. tools such as llvm-lto which can consume a
>> combined index and invoke the backends, or "clang -fthinlto-index=" for
>> distributed ThinLTO backend testing), where we could build the combined
>> summary index from assembly instead.
>>
>> This greatly simplifies the reading side, as there are no optimizations
>> performed on the IR after the index is read in these cases that would
>> require invalidation. It also simplifies adding the parsing support, since
>> it gets invoked exactly where we expect to build an index currently (we
>> don't currently build or store the ModuleSummaryIndex when parsing the
>> Module from bitcode). It doesn't preclude someone from figuring out how
>> to compute the module summary analysis result from the assembly, and
>> invalidating it after optimization, when reading the Module IR via 'opt' in
>> the future.
>>
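>> (As a rough sketch of the testing flow this would enable -- exact
>> llvm-lto2/llvm-lto options elided, since they wouldn't change:)
>>
>>   # today: the assembly must first be round-tripped through bitcode
>>   opt -module-summary foo.ll -o foo.bc
>>   llvm-lto2 run foo.bc ...
>>
>>   # with summary parsing from assembly, cases 1 and 2 above could
>>   # consume the summary entries in foo.ll directly, e.g.:
>>   llvm-lto2 run foo.ll ...
>>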
>> Does this seem like a reasonable proposal to everyone?
>>
>>
>> I would +1 making this an analysis pass. How about a
>> ModuleSummaryAnalysis that knows how to serialize its result into bitcode
>> and YAML formats? We can also add a module flag to indicate whether the
>> module should contain a module summary or not.
>>
>> * For ThinLTO compilation, the bitcode writer runs the analysis pass if the
>> module flag is set, and emits the module summary into the bitcode.
>> * When reading bitcode with a module summary, the summary is parsed into
>> the ModuleSummaryAnalysis result.
>>
>
> Note the reason we don't do this now is that we don't ever need to use
> the summary at the same time as the IR. We only read the
> summary for the thin link (no IR involved), or for distributed ThinLTO
> build backends (but only the combined index is read; the summary attached
> to the module is not read as it is not needed at all). That is why I
> suggested the simplification in my proposal above of only reading the
> summary from assembly where we currently read the bitcode summary (which is
> not when we normally read the IR).
>
> Also, note as I mentioned above that the default path may split the IR
> into regular LTO and ThinLTO partitions, which is why we stopped invoking
> it as an analysis pass on the default path, and instead invoke the index
> builder directly for each partition. Figuring out how to split a single
> module summary analysis result into two parts is not something I think we
> should block this on.
>
>
>> * If any transform invalidates the analysis, the module summary will
>> be recomputed automatically when writing bitcode; otherwise, it will
>> serialize out the same result as the input (probably with some auto-upgrade).
>> * For debugging, ModuleSummaryAnalysis can be serialized out to YAML
>> * For testing, ModuleSummaryAnalysis result can be created from YAML
>>
>> I suggested YAML but it can be any other format, including LLVM assembly.
>> For readability, I would prefer YAML.
>>
>
> That is similar to the original proposal from last year, but the other
> upstream maintainers did not agree to dumping in a format other than
> assembly that could also be serialized back in.
>
> That's why I'm proposing the version above - serialize in as assembly
> exactly where we consume the bitcode version of the module summary index
> today. The advantage is that it doesn't require analysis pass work and
> figuring out how to make the analysis pass work with split LTO partitions.
> At the same time, it doesn't preclude doing that in the future either. I
> think my proposal enables testing of the ThinLTO pipeline via textual
> assembly, which is one of the main goals of being able to serialize it back
> in.
>
>
> Sounds good. I am totally fine with using LLVM assembly as the textual format,
> and I have already stated my preference on a few design choices in the previous
> email. I don’t have anything against the design if you treat the summary as
> something separate from the IR even though they are in the same assembly file.
>

Ok great, thanks!
Teresa

>
> Steven
>
>
> Teresa
>
>
>> Thanks
>>
>> Steven
>>
>>
>> Thanks,
>> Teresa
>>
>>
>>
>>> Cheers,
>>>
>>> --
>>> Mehdi
>>>
>>>
>>>
>>>
>>> On Tue, May 1, 2018 at 11:10 AM, Peter Collingbourne <peter at pcc.me.uk>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, May 1, 2018 at 9:00 AM, Teresa Johnson <tejohnson at google.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 30, 2018 at 10:21 AM Peter Collingbourne <peter at pcc.me.uk>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Apr 30, 2018 at 8:32 AM, Teresa Johnson <tejohnson at google.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Hi Peter,
>>>>>>> Thanks for your comments, replies below.
>>>>>>> Teresa
>>>>>>>
>>>>>>> On Wed, Apr 25, 2018 at 2:08 PM Peter Collingbourne <peter at pcc.me.uk>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Teresa,
>>>>>>>>
>>>>>>>> Thanks for sending this proposal out.
>>>>>>>>
>>>>>>>> I would again like to register my disagreement with the whole idea
>>>>>>>> of writing summaries in LLVM assembly format. In my view it is clear that
>>>>>>>> this is not the right direction, as it only invites additional complexity
>>>>>>>> and more ways for things to go wrong for no real benefit. However, I don't
>>>>>>>> have the energy to argue that point any further, so I won't stand in the
>>>>>>>> way here.
>>>>>>>>
>>>>>>>
>>>>>>> I assume you are most concerned with the re-assembly/deserialization
>>>>>>> of the summary. My main goal is to get this dumped into a text format, and
>>>>>>> I went this route since the last dumper RFC was blocked, with LLVM
>>>>>>> assembly pushed as the preferred direction.
>>>>>>>
>>>>>>
>>>>>> Yes, my main concern is with the deserialization. My view is that it
>>>>>> should only be allowed for combined summaries -- allowing it for per-module
>>>>>> summaries is unnecessary as it creates the possibility of things getting out of
>>>>>> sync. Given that, we don't actually need an assembly representation and we
>>>>>> can use whichever format is most convenient. But given the opposition to
>>>>>> this viewpoint I am willing to concede getting the format (IMHO) right in
>>>>>> favour of something that is acceptable to others, just so that we can test
>>>>>> things in a more reasonable way.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Tue, Apr 24, 2018 at 7:43 AM, Teresa Johnson <
>>>>>>>> tejohnson at google.com> wrote:
>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> I started working on a long-standing request to have the summary
>>>>>>>>> dumped in a readable format to text, and specifically to emit to LLVM
>>>>>>>>> assembly. Proposal below, please let me know your thoughts.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Teresa
>>>>>>>>>
>>>>>>>>> RFC: LLVM Assembly format for ThinLTO Summary
>>>>>>>>> ========================================
>>>>>>>>>
>>>>>>>>> Background
>>>>>>>>> -----------------
>>>>>>>>> ThinLTO operates on small summaries computed during the compile step
>>>>>>>>> (i.e. with "-c -flto=thin"), which are then analyzed and updated during
>>>>>>>>> the Thin Link stage, and utilized to perform IR updates during the
>>>>>>>>> post-link ThinLTO backends. The summaries are emitted as LLVM bitcode;
>>>>>>>>> however, they are not currently emitted in LLVM assembly.
>>>>>>>>>
>>>>>>>>> There are two ways to generate a bitcode file containing summary
>>>>>>>>> records for a module:
>>>>>>>>> 1. Compile with "clang -c -flto=thin"
>>>>>>>>> 2. Build from LLVM assembly using "opt -module-summary"
>>>>>>>>> Either of these will result in the ModuleSummaryIndex analysis pass
>>>>>>>>> (which builds the summary index in memory for a module) being added to
>>>>>>>>> the pipeline just before bitcode emission.
>>>>>>>>>
>>>>>>>>> Additionally, a combined index is created by merging all the per-module
>>>>>>>>> indexes during the Thin Link, which is optionally emitted as a bitcode
>>>>>>>>> file.
>>>>>>>>>
>>>>>>>>> Currently, the only way to view these records is via
>>>>>>>>> "llvm-bcanalyzer -dump", then manually decoding the raw bitcode dumps.
>>>>>>>>>
>>>>>>>>> Relatedly, there is YAML reader/writer support for CFI related summary
>>>>>>>>> fields (-wholeprogramdevirt-read-summary and
>>>>>>>>> -wholeprogramdevirt-write-summary). Last summer, GSOC student Charles
>>>>>>>>> Saternos implemented support to dump the summary in YAML from llvm-lto2
>>>>>>>>> (D34080), including the rest of the summary fields (D34063); however,
>>>>>>>>> there was pushback on the related RFC for dumping via YAML or another
>>>>>>>>> format rather than emitting as LLVM assembly.
>>>>>>>>>
>>>>>>>>> Goals:
>>>>>>>>> 1. Define LLVM assembly format for the summary index
>>>>>>>>> 2. Define the interaction between parsing of the summary from LLVM
>>>>>>>>> assembly and synthesis of a new summary index from IR.
>>>>>>>>> 3. Implement printing and parsing of the summary index LLVM assembly
>>>>>>>>>
>>>>>>>>> Proposed LLVM Assembly Format
>>>>>>>>> ----------------------------------------------
>>>>>>>>> There are several top level data structures within the
>>>>>>>>> ModuleSummaryIndex:
>>>>>>>>> 1. ModulePathStringTable: Holds the paths to the modules summarized in
>>>>>>>>> the index (only one entry for per-module indexes and multiple in the
>>>>>>>>> combined index), along with their hashes (for incremental builds and
>>>>>>>>> global promotion).
>>>>>>>>> 2. GlobalValueMap: A map from global value GUIDs to the corresponding
>>>>>>>>> function/variable/alias summary (or summaries for the combined index
>>>>>>>>> and weak linkage).
>>>>>>>>> 3. CFI-related data structures (TypeIdMap, CfiFunctionDefs, and
>>>>>>>>> CfiFunctionDecls)
>>>>>>>>>
>>>>>>>>> I have a WIP patch to AsmWriter.cpp to print the ModuleSummaryIndex
>>>>>>>>> that I was using to play with the format. It currently prints 1 and 2
>>>>>>>>> above. I've left the CFI related summary data structures as a TODO for
>>>>>>>>> now, until the format is at least conceptually agreed, but from looking
>>>>>>>>> at those I don't see an issue with using the same format (with a
>>>>>>>>> note/question for Peter on CFI type test representation below).
>>>>>>>>>
>>>>>>>>> I modeled the proposed format on metadata, with a few key differences
>>>>>>>>> noted below. Like metadata, I propose enumerating the entries with the
>>>>>>>>> SlotTracker, and prefixing them with a special character. Avoiding
>>>>>>>>> characters already used in some fashion (i.e. "!" for metadata and "#"
>>>>>>>>> for attributes), I initially have chosen "^". Open to suggestions
>>>>>>>>> though.
>>>>>>>>>
>>>>>>>>> Consider the following example:
>>>>>>>>>
>>>>>>>>> extern void foo();
>>>>>>>>> int X;
>>>>>>>>> int bar() {
>>>>>>>>>   foo();
>>>>>>>>>   return X;
>>>>>>>>> }
>>>>>>>>> void barAlias() __attribute__ ((alias ("bar")));
>>>>>>>>> int main() {
>>>>>>>>>   barAlias();
>>>>>>>>>   return bar();
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> The proposed format has one entry per ModulePathStringTable entry and
>>>>>>>>> one per GlobalValueMap/GUID, and looks like:
>>>>>>>>>
>>>>>>>>> ^0 = module: {path: testA.o, hash: 5487197307045666224}
>>>>>>>>> ^1 = gv: {guid: 1881667236089500162, name: X, summaries: {variable:
>>>>>>>>> {module: ^0, flags: {linkage: common, notEligibleToImport: 0, live: 0,
>>>>>>>>> dsoLocal: 1}}}}
>>>>>>>>> ^2 = gv: {guid: 6699318081062747564, name: foo}
>>>>>>>>> ^3 = gv: {guid: 15822663052811949562, name: main, summaries: {function:
>>>>>>>>> {module: ^0, flags: {linkage: extern, notEligibleToImport: 1, live: 0,
>>>>>>>>> dsoLocal: 1}, insts: 5, funcFlags: {readNone: 0, readOnly: 0,
>>>>>>>>> noRecurse: 0, returnDoesNotAlias: 0}, calls: {{callee: ^5, hotness:
>>>>>>>>> unknown}, {callee: ^4, hotness: unknown}}}}}
>>>>>>>>> ^4 = gv: {guid: 16434608426314478903, name: bar, summaries: {function:
>>>>>>>>> {module: ^0, flags: {linkage: extern, notEligibleToImport: 1, live: 0,
>>>>>>>>> dsoLocal: 1}, insts: 3, funcFlags: {readNone: 0, readOnly: 0,
>>>>>>>>> noRecurse: 0, returnDoesNotAlias: 0}, calls: {{callee: ^2, hotness:
>>>>>>>>> unknown}}, refs: {^1}}}}
>>>>>>>>> ^5 = gv: {guid: 18040127437030252312, name: barAlias, summaries:
>>>>>>>>> {alias: {module: ^0, flags: {linkage: extern, notEligibleToImport: 0,
>>>>>>>>> live: 0, dsoLocal: 1}, aliasee: ^4}}}
>>>>>>>>>
>>>>>>>>> Like metadata, the fields are tagged (currently using lower camel case,
>>>>>>>>> maybe upper camel case would be preferable).
>>>>>>>>>
>>>>>>>>> The proposed format has a structure that reflects the data structures
>>>>>>>>> in the summary index. For example, consider the entry "^4". This
>>>>>>>>> corresponds to the function "bar". The entry for that GUID in the
>>>>>>>>> GlobalValueMap contains a list of summaries. For per-module summaries
>>>>>>>>> such as this, there will be at most one summary (with no summary list
>>>>>>>>> for an external function like "foo"). In the combined summary there may
>>>>>>>>> be multiple, e.g. in the case of linkonce_odr functions which have
>>>>>>>>> definitions in multiple modules. The summary list for bar ("^4")
>>>>>>>>> contains a FunctionSummary, so the summary is tagged "function:". The
>>>>>>>>> FunctionSummary contains both a flags structure (inherited from the
>>>>>>>>> base GlobalValueSummary class), and a funcFlags structure (specific to
>>>>>>>>> FunctionSummary). It therefore contains a brace-enclosed list of flag
>>>>>>>>> tags/values for each.
>>>>>>>>>
>>>>>>>>> Where a global value summary references another global value summary
>>>>>>>>> (e.g. via a call list, reference list, or aliasee), the entry is
>>>>>>>>> referenced by its slot. E.g. the alias "barAlias" ("^5") references its
>>>>>>>>> aliasee "bar" as "^4".
>>>>>>>>>
>>>>>>>>> Note that in comparison metadata assembly entries tend to be much more
>>>>>>>>> decomposed, since many metadata fields are themselves metadata (so the
>>>>>>>>> entries tend to be shorter, with references to other metadata nodes).
>>>>>>>>>
>>>>>>>>> Currently, I am emitting the summary entries at the end, after the
>>>>>>>>> metadata nodes. Note that the ModuleSummaryIndex is not currently
>>>>>>>>> referenced from the Module, and isn't currently created when parsing
>>>>>>>>> the Module IR bitcode (there is a separate derived class for reading
>>>>>>>>> the ModuleSummaryIndex from bitcode). This is because they are not
>>>>>>>>> currently used at the same time. However, in the future there is no
>>>>>>>>> reason why we couldn't tag the global values in the Module's LLVM
>>>>>>>>> assembly with the corresponding summary entry if the ModuleSummaryIndex
>>>>>>>>> is available when printing the Module in the assembly writer. I.e. we
>>>>>>>>> could do the following for "main" from the above example when printing
>>>>>>>>> the IR definition (note the "^3" at the end):
>>>>>>>>>
>>>>>>>>> define dso_local i32 @main() #0 !dbg !17 ^3 {
>>>>>>>>>
>>>>>>>>> For CFI data structures, the format would be similar. It appears that
>>>>>>>>> TypeIds are referred to by string name in the top level TypeIdMap
>>>>>>>>> (std::map indexed by std::string type identifier), whereas they are
>>>>>>>>> referenced by GUID within the FunctionSummary class (i.e. the TypeTests
>>>>>>>>> vector and the VFuncId structure). For the LLVM assembly I think there
>>>>>>>>> should be a top level entry for each TypeIdMap entry, which lists both
>>>>>>>>> the type identifier string and its GUID (followed by its associated
>>>>>>>>> information stored in the map), and the TypeTests/VFuncId references on
>>>>>>>>> the FunctionSummary entries can reference it by summary slot number.
>>>>>>>>> I.e. something like:
>>>>>>>>>
>>>>>>>>> ^1 = typeid: {guid: 12345, identifier: name_of_type, …
>>>>>>>>> ^2 = gv: {... {function: {.... typeTests: {^1, …
>>>>>>>>>
>>>>>>>>> Peter - is that correct and does that sound ok?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I don't think that would work because the purpose of the top-level
>>>>>>>> TypeIdMap is to contain resolutions for each type identifier, and
>>>>>>>> per-module summaries do not contain resolutions (only the combined summary
>>>>>>>> does). What that means in practice is that we would not be able to recover
>>>>>>>> and write out a type identifier name for per-module summaries as part of ^1
>>>>>>>> in your example (well, we could in principle, because the name is stored
>>>>>>>> somewhere in the function's IR, but that could get complicated).
>>>>>>>>
>>>>>>>
>>>>>>> Ah ok. I guess the top-level map then is generated by the regular
>>>>>>> LTO portion of the link (since it presumably requires IR during the Thin
>>>>>>> Link to get into the combined summary)?
>>>>>>>
>>>>>>
>>>>>> Yes, we fill in the map during the LowerTypeTests and
>>>>>> WholeProgramDevirt passes in the regular LTO part of the link, e.g. here:
>>>>>>  http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#823
>>>>>>
>>>>>>
>>>>>>> Probably the easiest thing to do is to keep the type identifiers as
>>>>>>>> GUIDs in the function summaries and write out the mapping of type
>>>>>>>> identifiers as a top-level entity.
>>>>>>>>
>>>>>>>
>>>>>>> To confirm, you mean during the compile step create a top-level
>>>>>>> entity that maps GUID -> identifier?
>>>>>>>
>>>>>>
>>>>>> I mean that you could represent this with something like:
>>>>>>
>>>>>> ^typeids = {^1, ^2, ^3}
>>>>>> ^1 = typeid: {identifier: typeid1, ...}
>>>>>> ^2 = typeid: {identifier: typeid2, ...}
>>>>>> ^3 = typeid: {identifier: typeid3, ...}
>>>>>>
>>>>>> There's no need to store the GUIDs here because they can be computed
>>>>>> from the type identifiers. The GUIDs would only be stored in the typeTests
>>>>>> (etc.) fields in each function summary.
>>>>>>
>>>>>
>>>>> I suppose we don't need to store the GUIDs at the top level in the
>>>>> in-memory summary. But I think it would be good to emit the GUIDs in the
>>>>> typeid assembly entries because it makes the association in the assembly
>>>>> much more obvious. I.e. going back to my original example:
>>>>>
>>>>> ^1 = typeid: {guid: 12345, identifier: name_of_type, …
>>>>> ^2 = gv: {... {function: {.... typeTests: {^1, …
>>>>>
>>>>> If we didn't include the GUID in the typeid entry, but rather just the
>>>>> identifier, and put the GUID in the typeTest list in the GV's entry, it
>>>>> wouldn't be obvious at all from the assembly listing which typeid goes with
>>>>> which typeTest. It's also less compact to include the GUID in each
>>>>> typeTests list.
>>>>>
>>>>
>>>> I get that, but my point was that in a per-module summary the TypeIdMap
>>>> is empty, so there will be no names, only GUIDs.
>>>>
>>>> For "making the association more obvious" we might just want to have
>>>> the assembly writer emit the GUID of a name as a comment.
>>>>
>>>>
>>>>> Or perhaps we are saying the same thing - I can't tell from your above
>>>>> example if the GUID is also emitted in the "typeid:" entries.
>>>>>
>>>>
>>>> No, it wouldn't be.
>>>>
>>>>> I'm not sure there is a need for the:
>>>>> ^typeids = {^1, ^2, ^3}
>>>>> We can just build the typeids list on the fly as " = typeid: " entries
>>>>> are read in.
>>>>>
>>>>
>>>> That's true. Given that nothing actually needs to refer to them, we can
>>>> just represent the typeids as something like
>>>> typeid: {identifier: typeid1, ...} ; guid = 123
>>>> typeid: {identifier: typeid2, ...} ; guid = 456
>>>> typeid: {identifier: typeid3, ...} ; guid = 789
>>>> without an associated number.
>>>>
>>>> Peter
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>> Teresa
>>>>>
>>>>>
>>>>>> Peter
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Teresa
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Peter
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Issues when Parsing of Summaries from Assembly
>>>>>>>>> --------------------------------------------------------------------
>>>>>>>>> When reading an LLVM assembly file containing module summary entries, a
>>>>>>>>> ModuleSummaryIndex will be created from the entries.
>>>>>>>>>
>>>>>>>>> Things to consider are the behavior when:
>>>>>>>>>
>>>>>>>>> - Invoked with "opt -module-summary" (which currently builds a new
>>>>>>>>> summary index from the IR). Options:
>>>>>>>>>   a) recompute the summary and throw away the summary in the assembly
>>>>>>>>>      file
>>>>>>>>>   b) ignore -module-summary and build the summary from the LLVM assembly
>>>>>>>>>   c) give an error
>>>>>>>>>   d) compare the two summaries (one created from the assembly and the
>>>>>>>>>      new one created by the analysis phase from the IR), and error if
>>>>>>>>>      they are different.
>>>>>>>>> My opinion is to do a), so that the behavior using -module-summary
>>>>>>>>> doesn't change. We also need a way to force building of a fresh module
>>>>>>>>> summary for cases where the user has modified the LLVM assembly of the
>>>>>>>>> IR (see below).
>>>>>>>>>
>>>>>>>>> - How to handle older LLVM assembly files that don't contain new
>>>>>>>>> summary fields. Options:
>>>>>>>>>   a) Force the LLVM assembly file to be recreated with a new summary,
>>>>>>>>>      i.e. "opt -module-summary -o - | llvm-dis".
>>>>>>>>>   b) Auto-upgrade, by silently creating conservative values for the new
>>>>>>>>>      summary entries.
>>>>>>>>> I lean towards b) (when possible) for user-friendliness and to reduce
>>>>>>>>> required churn on test inputs.
>>>>>>>>>
>>>>>>>>> - How to handle partial or incorrect LLVM assembly summary entries.
>>>>>>>>> How to handle partial summaries depends in part on how we answer the
>>>>>>>>> prior question about auto-upgrading. I think the best option there is
>>>>>>>>> to handle it automatically when possible. However, I do think we should
>>>>>>>>> error on glaring errors like obviously missing information. For
>>>>>>>>> example, when there is summary data in the LLVM assembly, but summary
>>>>>>>>> entries are missing for some global values. E.g. if the user modified
>>>>>>>>> the assembly to add a function but forgot to add a corresponding
>>>>>>>>> summary entry. We could still have subtle issues (e.g. the user adds a
>>>>>>>>> new call but forgets to update the caller's summary call list), but it
>>>>>>>>> will be harder to detect those.
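>>>>>>>>>
>>>>>>>>> For illustration only (the function name and values here are
>>>>>>>>> hypothetical), a glaring error of the kind described above would be a
>>>>>>>>> global value with IR but no summary entry:
>>>>>>>>>
>>>>>>>>> define i32 @newFunc() {   ; added by hand to the assembly
>>>>>>>>>   ...
>>>>>>>>> }
>>>>>>>>> ^0 = module: {path: testA.o, hash: 5487197307045666224}
>>>>>>>>> ^1 = gv: {guid: 16434608426314478903, name: bar, ...}
>>>>>>>>> ; no "gv:" entry for @newFunc, so the parser could reject the file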
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>>>>>>>>  408-460-2413
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> --
>>>>>>>> Peter
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>>>>>>  408-460-2413
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> --
>>>>>> Peter
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>>>>  408-460-2413
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> Peter
>>>>
>>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413
>
>
>


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413