[LLVMdev] bit code file incompatibility due to debug info changes

David Blaikie dblaikie at gmail.com
Fri Nov 22 09:08:48 PST 2013


On Thu, Nov 21, 2013 at 4:17 PM, Manman Ren <manman.ren at gmail.com> wrote:

>
>
>
> On Thu, Nov 21, 2013 at 12:01 PM, David Blaikie <dblaikie at gmail.com>wrote:
>
>>
>>
>>
>> On Thu, Nov 21, 2013 at 11:45 AM, Manman Ren <manman.ren at gmail.com>wrote:
>>
>>>
>>>
>>>
>>> On Thu, Nov 21, 2013 at 11:26 AM, David Blaikie <dblaikie at gmail.com>wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Thu, Nov 21, 2013 at 11:06 AM, Manman Ren <manman.ren at gmail.com>wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 21, 2013 at 10:57 AM, David Blaikie <dblaikie at gmail.com>wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 21, 2013 at 10:38 AM, Manman Ren <manman.ren at gmail.com>wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 19, 2013 at 8:47 PM, Manman Ren <manman.ren at gmail.com>wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 19, 2013 at 5:26 PM, Manman Ren <manman.ren at gmail.com>wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 18, 2013 at 11:00 AM, Chandler Carruth <
>>>>>>>>> chandlerc at google.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 18, 2013 at 10:55 AM, David Blaikie <
>>>>>>>>>> dblaikie at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> It depends a bit, also, on what kind of guarantees we need to
>>>>>>>>>>> offer. If the guarantee when reading IR from disk is "will not crash" then
>>>>>>>>>>> there's nothing for it but to run full debug info verification.
>>>>>>>>>>>
>>>>>>>>>>> On the other hand, if we can assume that some specific metadata
>>>>>>>>>>> implies the correctness of some other metadata, then all we need to do is
>>>>>>>>>>> check a known debug info version number. If it's the current number, do
>>>>>>>>>>> nothing (assume the rest of the debug info is up to date and well formed),
>>>>>>>>>>> otherwise do all the work to find the debug info and dump it (no need to
>>>>>>>>>>> verify it - we just have to do the work to find it, so we don't go
>>>>>>>>>>> following bad links later on - that should be as easy as dropping the
>>>>>>>>>>> llvm.dbg.cu named node and removing all debug intrinsics and
>>>>>>>>>>> the instruction metadata line references). But this latter scheme isn't
>>>>>>>>>>> robust against arbitrary metadata (that could, coincidentally, have the
>>>>>>>>>>> right version number and arbitrary metadata that breaks all our debug info
>>>>>>>>>>> metadata assumptions)
>>>>>>>>>>>
>>>>>>>>>>> If the latter is sufficient for everyone's needs/principles,
>>>>>>>>>>> great.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This makes sense to me, but I see Eric's fundamental concern with
>>>>>>>>>> upgrading test cases.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> One other possible idea: we could artificially force coarseness
>>>>>>>>>> on metadata while things are still iterating rapidly by just having a
>>>>>>>>>> version number that rotates every "release". (Even non-open-source-releases
>>>>>>>>>> could be handled by letting anyone, any time they need to rev it, update
>>>>>>>>>> what the current version is and update every test case to reference that
>>>>>>>>>> version.) If the version is nicely factored, it should at least be an
>>>>>>>>>> obvious diff if still huge.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Adding a debug info version number and ignoring the debug info
>>>>>>>>> MDNodes when version does not match makes sense. And changing the version
>>>>>>>>> number with every release sounds reasonable.
>>>>>>>>>
>>>>>>>>> One implementation is
>>>>>>>>> 1> completely get rid of version number in the tag
>>>>>>>>>   LLVMDebugVersion is only used in two places and we should be
>>>>>>>>> able to remove it.
>>>>>>>>>     include/llvm/DebugInfo.h:    return getUnsignedField(0) &
>>>>>>>>> ~LLVMDebugVersionMask;
>>>>>>>>>     lib/IR/DIBuilder.cpp:  assert((Tag & LLVMDebugVersionMask) ==
>>>>>>>>> 0 &&
>>>>>>>>>     lib/IR/DIBuilder.cpp:  return
>>>>>>>>> ConstantInt::get(Type::getInt32Ty(VMContext), Tag | LLVMDebugVersion);
>>>>>>>>>   I remember there was something that blocks David from completely
>>>>>>>>> getting rid of version number, but it seems the blocker is gone now.
>>>>>>>>>   This can be done separately as well.
>>>>>>>>>
>>>>>>>>> 2> introduce a debug info version in module flags similar to
>>>>>>>>> "dwarf version" module flag
>>>>>>>>>      LTO can correctly handle the module flag during linking and
>>>>>>>>> we only need to change one line in the debug info testing cases when we
>>>>>>>>> update debug info version number.
>>>>>>>>>      One issue is that we don't have a mode that emits a warning
>>>>>>>>> and outputs the largest value (we have a mode that emits a warning and
>>>>>>>>> outputs the value from the first module, but we can definitely add another
>>>>>>>>> mode if necessary)
>>>>>>>>>
>>>>>>>>> 3> when to drop the debug info MDNodes?
>>>>>>>>>      We could do this in the bit code loader but it seems a
>>>>>>>>> violation of layers. One possibility is to run a module pass right after
>>>>>>>>> loading the bit code, to remove debug intrinsics and debug tags on
>>>>>>>>> instructions.
>>>>>>>>>
>>>>>>>>
>>>>>>>> We have an existing pass StripSymbols that can strip debug info in
>>>>>>>> the module.
>>>>>>>> So it is just a matter of adding that pass after bit code loader if
>>>>>>>> the debug info version number does not match.
>>>>>>>>
>>>>>>>
>>>>>>> Can I assume no objection to the approach? :)
>>>>>>> Bill, are you okay with adding another mode to module flags? i.e.
>>>>>>> emits a warning and outputs the largest value.
>>>>>>>
>>>>>>
>>>>>> How will this work as a module flag, though? If the flag gets merged
>>>>>> and maximized, how will we know which compile units have out of date debug
>>>>>> info metadata and drop them?
>>>>>>
>>>>>
>>>>> The dropping part comes from item 3 above: adding StripSymbols pass
>>>>> right after bit code loader if the debug info version number does not match.
>>>>>
>>>>
>>>> (what Eric said, but also)
>>>>
>>>
>>>
>>>>
>>>> I'm just not really understanding how you choose which things to strip
>>>> in (3). If you've already merged IR modules together and the debug info
>>>> version is bumped to the maximum - if we had an old and a new module, the
>>>> resulting version would be 'new' and we wouldn't know we still had some old
>>>> data in there we needed to remove, would we? (and even if we did, we
>>>> wouldn't know which of the compile units we needed to drop and which
>>>> intrinsics we needed to strip, etc - since some was new and good and some
>>>> was old and busted)
>>>>
>>>
>>> To David,
>>>
>>> I was saying that running StripSymbols pass right after loading a bit
>>> code file (a source module), then the merging part will come afterwards
>>> when linking the source module in.
>>>
>>
>> OK - but then we'd need another special case for non-LTO, when someone's
>> just passed a bitcode file to Clang, for example - yes?
>>
>> By building it into the loader or as close to that as possible (so that
>> any LLVM code loading a module never sees out of date debug info metadata)
>> we ensure we only have to change one place, not every client of bitcode
>> loading.
>>
>
> We can drop the debug info in the auto upgrading path similar to how we
> upgrade TBAA tags.
> I am going to move StripDebugInfo from StripSymbols.cpp to DebugInfo.cpp
> so the implementation can be shared between StripSymbols and AutoUpgrade.
> After that, a sample patch looks like:
> Index: include/llvm/AutoUpgrade.h
> ===================================================================
> --- include/llvm/AutoUpgrade.h  (revision 195264)
> +++ include/llvm/AutoUpgrade.h  (working copy)
> @@ -57,6 +57,10 @@
>    /// with different address spaces: the instruction is replaced by a pair
>    /// ptrtoint+inttoptr.
>    Value *UpgradeBitCastExpr(unsigned Opc, Constant *C, Type *DestTy);
> +
> +  /// Check the debug info version number, if it is out-dated, drop the
> debug
> +  /// info.
> +  bool UpgradeDebugInfo(Module &M);
>  } // End llvm namespace
>
>  #endif
> Index: lib/AsmParser/LLParser.cpp
> ===================================================================
> --- lib/AsmParser/LLParser.cpp  (revision 195264)
> +++ lib/AsmParser/LLParser.cpp  (working copy)
> @@ -182,6 +182,8 @@
>    for (Module::iterator FI = M->begin(), FE = M->end(); FI != FE; )
>      UpgradeCallsToIntrinsic(FI++); // must be post-increment, as we remove
>
> +  UpgradeDebugInfo(*M);
> +
>    return false;
>  }
>
> Index: lib/Bitcode/Reader/BitcodeReader.cpp
> ===================================================================
> --- lib/Bitcode/Reader/BitcodeReader.cpp        (revision 195264)
> +++ lib/Bitcode/Reader/BitcodeReader.cpp        (working copy)
> @@ -3152,6 +3152,7 @@
>    for (unsigned I = 0, E = InstsWithTBAATag.size(); I < E; I++)
>      UpgradeInstWithTBAATag(InstsWithTBAATag[I]);
>
> +  UpgradeDebugInfo(*M);
>    return error_code::success();
>  }
>
> Index: lib/IR/AutoUpgrade.cpp
> ===================================================================
> --- lib/IR/AutoUpgrade.cpp      (revision 195264)
> +++ lib/IR/AutoUpgrade.cpp      (working copy)
> @@ -12,6 +12,7 @@
>
>  //===----------------------------------------------------------------------===//
>
>  #include "llvm/AutoUpgrade.h"
> +#include "llvm/DebugInfo.h"
>  #include "llvm/IR/Constants.h"
>  #include "llvm/IR/Function.h"
>  #include "llvm/IR/IRBuilder.h"
> @@ -489,3 +490,11 @@
>
>    return 0;
>  }
> +
> +bool llvm::UpgradeDebugInfo(Module &M) {
> +  if (getDebugInfoVersionFromModule(M) == LLVMDebugVersion)
> +    return false;
> +
> +  StripDebugInfo(M);
> +  return true;
> +}
>
> Of course, testing cases will be added when I actually submit the patch.
>
>
>>
>>>
>>> To Eric,
>>> < You'll want to do it at loading time and not worry about merging at
>>> < all. I think this will need to go just after reading the module and
>>> < check the version against the hard coded version in the bit code
>>> < reader/somewhere. I think we should do it there rather than as a later
>>> < pass as we really don't want to do anything to the metadata other than
>>> < strip it.
>>>
>>> I think I was saying the same thing: right after bit code loader :)
>>> Eric, are you suggesting to strip the debug info inside the loader?
>>>
>>> The merging part is mostly about where to store the debug info version
>>> number in the bit code file.
>>> If we store it as a module flag, we need to specify a mode for the
>>> module flag so the linker knows how to merge the module flag.
>>>
>>
>> Except we'd never actually need to merge the flag - since we would drop
>> it from any module that didn't have the latest, right? I'm not sure this
>> would need to be a module flag, then, since there would be no need to
>> define merging behavior. Is there an advantage to using a module flag over
>> just putting it on the compile unit metadata node somewhere?
>>
>
> Agreed. So we have two choices here, use a module flag and remove the
> module flag when stripping the debug info, so we will never merge modules
> with different debug info versions. (no new mode is required)
>
> The other choice is putting it on the compile unit MDNode as a field.
> There is no reliable way of checking if an old bc file has a version
> number, because it requires parsing the CU MDNodes in the old bc files.
> Also if we are not going to support multiple debug info versions in a
> single compiler, there is no point of putting a version number in each CU
> MDNode.
>

Fair point. Any useful tradeoffs to consider between a single named MDNode
and a module flag?


>
> If we decide to go with the module flag, I can start adding the module
> flag and update all existing testing cases. Let me know your vote :)
>
> Manman
>
>
>>
>>> If we decide to save the debug info version number as a module flag,
>>> then we may need another mode for it.
>>>
>>
>> I would imagine we could strip the flag whenever we strip the debug info,
>> then we would never merge modules that had differing debug info versions.
>> So we could use any merging (or none at all) definition, since we'd only
>> ever merge matching ones.
>>
>>
>>>
>>> Thanks,
>>> Manman
>>>
>>>
>>>>
>>>>> Item 3 is shared between non-LTO and LTO. For LTO, we want the debug
>>>>> info version number gets merged and maximized.
>>>>>
>>>>> Thanks
>>>>> Manman
>>>>>
>>>>>
>>>>>>
>>>>>> - David
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Manman
>>>>>>>
>>>>>>>
>>>>>>>> 4> how to report warning?
>>>>>>>>     Is errs() << "WARNING: " good enough? BTW this is how we emit
>>>>>>>> warning when linking module flags.
>>>>>>>>
>>>>>>>> Any objection to the above? Any other suggestion?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Manman
>>>>>>>>
>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131122/45ee6322/attachment.html>


More information about the llvm-dev mailing list