[LLVMdev] bit code file incompatibility due to debug info changes

Eric Christopher echristo at gmail.com
Fri Nov 22 10:17:33 PST 2013


On Fri, Nov 22, 2013 at 10:14 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Fri, Nov 22, 2013 at 9:54 AM, Eric Christopher <echristo at gmail.com>
> wrote:
>>
>> On Fri, Nov 22, 2013 at 9:08 AM, David Blaikie <dblaikie at gmail.com> wrote:
>> >
>> >
>> >
>> > On Thu, Nov 21, 2013 at 4:17 PM, Manman Ren <manman.ren at gmail.com>
>> > wrote:
>> >>
>> >>
>> >>
>> >>
>> >> On Thu, Nov 21, 2013 at 12:01 PM, David Blaikie <dblaikie at gmail.com>
>> >> wrote:
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Nov 21, 2013 at 11:45 AM, Manman Ren <manman.ren at gmail.com>
>> >>> wrote:
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Thu, Nov 21, 2013 at 11:26 AM, David Blaikie <dblaikie at gmail.com>
>> >>>> wrote:
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Thu, Nov 21, 2013 at 11:06 AM, Manman Ren <manman.ren at gmail.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Thu, Nov 21, 2013 at 10:57 AM, David Blaikie
>> >>>>>> <dblaikie at gmail.com>
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Thu, Nov 21, 2013 at 10:38 AM, Manman Ren
>> >>>>>>> <manman.ren at gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Tue, Nov 19, 2013 at 8:47 PM, Manman Ren
>> >>>>>>>> <manman.ren at gmail.com>
>> >>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Tue, Nov 19, 2013 at 5:26 PM, Manman Ren
>> >>>>>>>>> <manman.ren at gmail.com>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Nov 18, 2013 at 11:00 AM, Chandler Carruth
>> >>>>>>>>>> <chandlerc at google.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Nov 18, 2013 at 10:55 AM, David Blaikie
>> >>>>>>>>>>> <dblaikie at gmail.com> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> It depends a bit, also, on what kind of guarantees we need to
>> >>>>>>>>>>>> offer. If the guarantee when reading IR from disk is "will
>> >>>>>>>>>>>> not crash" then
>> >>>>>>>>>>>> there's nothing for it but to run full debug info
>> >>>>>>>>>>>> verification.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On the other hand, if we can assume that some specific
>> >>>>>>>>>>>> metadata
>> >>>>>>>>>>>> implies the correctness of some other metadata, then all we
>> >>>>>>>>>>>> need to do is
>> >>>>>>>>>>>> check a known debug info version number. If it's the current
>> >>>>>>>>>>>> number, do
>> >>>>>>>>>>>> nothing (assume the rest of the debug info is up to date and
>> >>>>>>>>>>>> well formed),
>> >>>>>>>>>>>> otherwise do all the work to find the debug info and dump it
>> >>>>>>>>>>>> (no need to
>> >>>>>>>>>>>> verify it - we just have to do the work to find it, so we
>> >>>>>>>>>>>> don't go following
>> >>>>>>>>>>>> bad links later on - that should be as easy as dropping the
>> >>>>>>>>>>>> llvm.dbg.cu
>> >>>>>>>>>>>> named node and removing all debug intrinsics and the
>> >>>>>>>>>>>> instruction metadata
>> >>>>>>>>>>>> line references). But this latter scheme isn't robust against
>> >>>>>>>>>>>> arbitrary
>> >>>>>>>>>>>> metadata (that could, coincidentally, have the right version
>> >>>>>>>>>>>> number and
>> >>>>>>>>>>>> arbitrary metadata that breaks all our debug info metadata
>> >>>>>>>>>>>> assumptions)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> If the latter is sufficient for everyone's needs/principles,
>> >>>>>>>>>>>> great.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> This makes sense to me, but I see Eric's fundamental concern
>> >>>>>>>>>>> with
>> >>>>>>>>>>> upgrading test cases.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> One other possible idea: we could artificially force
>> >>>>>>>>>>> coarseness
>> >>>>>>>>>>> on metadata while things are still iterating rapidly by just
>> >>>>>>>>>>> having a
>> >>>>>>>>>>> version number that rotates every "release". (Even
>> >>>>>>>>>>> non-open-source-releases
>> >>>>>>>>>>> could be handled by letting anyone, any time they need to rev
>> >>>>>>>>>>> it, update
>> >>>>>>>>>>> what the current version is and update every test case to
>> >>>>>>>>>>> reference that
>> >>>>>>>>>>> version.) If the version is nicely factored, it should at
>> >>>>>>>>>>> least be an
>> >>>>>>>>>>> obvious diff if still huge.
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Adding a debug info version number and ignoring the debug info
>> >>>>>>>>>> MDNodes when version does not match makes sense. And changing
>> >>>>>>>>>> the version
>> >>>>>>>>>> number with every release sounds reasonable.
>> >>>>>>>>>>
>> >>>>>>>>>> One implementation is
>> >>>>>>>>>> 1> completely get rid of version number in the tag
>> >>>>>>>>>>   LLVMDebugVersion is only used in two places and we should be
>> >>>>>>>>>> able to remove it.
>> >>>>>>>>>>     include/llvm/DebugInfo.h:    return getUnsignedField(0) &
>> >>>>>>>>>> ~LLVMDebugVersionMask;
>> >>>>>>>>>>     lib/IR/DIBuilder.cpp:  assert((Tag & LLVMDebugVersionMask)
>> >>>>>>>>>> ==
>> >>>>>>>>>> 0 &&
>> >>>>>>>>>>     lib/IR/DIBuilder.cpp:  return
>> >>>>>>>>>> ConstantInt::get(Type::getInt32Ty(VMContext), Tag |
>> >>>>>>>>>> LLVMDebugVersion);
>> >>>>>>>>>>   I remember there was something that blocks David from
>> >>>>>>>>>> completely
>> >>>>>>>>>> getting rid of version number, but it seems the blocker is gone
>> >>>>>>>>>> now.
>> >>>>>>>>>>   This can be done separately as well.
>> >>>>>>>>>>
>> >>>>>>>>>> 2> introduce a debug info version in module flags similar to
>> >>>>>>>>>> "dwarf version" module flag
>> >>>>>>>>>>      LTO can correctly handle the module flag during linking
>> >>>>>>>>>> and
>> >>>>>>>>>> we only need to change one line in the debug info testing cases
>> >>>>>>>>>> when we
>> >>>>>>>>>> update debug info version number.
>> >>>>>>>>>>      One issue is that we don't have a mode that emits a
>> >>>>>>>>>> warning
>> >>>>>>>>>> and outputs the largest value (we have a mode that emits a
>> >>>>>>>>>> warning and
>> >>>>>>>>>> outputs the value from the first module, but we can definitely
>> >>>>>>>>>> add another
>> >>>>>>>>>> mode if necessary)
>> >>>>>>>>>>
>> >>>>>>>>>> 3> when to drop the debug info MDNodes?
>> >>>>>>>>>>      We could do this in the bit code loader but it seems a
>> >>>>>>>>>> violation of layers. One possibility is to run a module pass
>> >>>>>>>>>> right after
>> >>>>>>>>>> loading the bit code, to remove debug intrinsics and debug tags
>> >>>>>>>>>> on
>> >>>>>>>>>> instructions.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> We have an existing pass StripSymbols that can strip debug info
>> >>>>>>>>> in
>> >>>>>>>>> the module.
>> >>>>>>>>> So it is just a matter of adding that pass after bit code loader
>> >>>>>>>>> if
>> >>>>>>>>> the debug info version number does not match.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Can I assume no objection to the approach? :)
>> >>>>>>>> Bill, are you okay with adding another mode to module flags? i.e.
>> >>>>>>>> emits a warning and outputs the largest value.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> How will this work as a module flag, though? If the flag gets
>> >>>>>>> merged
>> >>>>>>> and maximized, how will we know which compile units have out of
>> >>>>>>> date debug
>> >>>>>>> info metadata and drop them?
>> >>>>>>
>> >>>>>>
>> >>>>>> The dropping part comes from item 3 above: adding StripSymbols pass
>> >>>>>> right after bit code loader if the debug info version number does
>> >>>>>> not match.
>> >>>>>
>> >>>>>
>> >>>>> (what Eric said, but also)
>> >>>>
>> >>>>
>> >>>>>
>> >>>>>
>> >>>>> I'm just not really understanding how you choose which things to
>> >>>>> strip
>> >>>>> in (3). If you've already merged IR modules together and the debug
>> >>>>> info
>> >>>>> version is bumped to the maximum - if we had an old and a new
>> >>>>> module, the
>> >>>>> resulting version would be 'new' and we wouldn't know we still had
>> >>>>> some old
>> >>>>> data in there we needed to remove, would we? (and even if we did, we
>> >>>>> wouldn't know which of the compile units we needed to drop and which
>> >>>>> intrinsics we needed to strip, etc - since some was new and good and
>> >>>>> some
>> >>>>> was old and busted)
>> >>>>
>> >>>>
>> >>>> To David,
>> >>>>
>> >>>> I was saying that running StripSymbols pass right after loading a bit
>> >>>> code file (a source module), then the merging part will come
>> >>>> afterwards when
>> >>>> linking the source module in.
>> >>>
>> >>>
>> >>> OK - but then we'd need another special case for non-LTO, when
>> >>> someone's
>> >>> just passed a bitcode file to Clang, for example - yes?
>> >>>
>> >>> By building it into the loader or as close to that as possible (so
>> >>> that
>> >>> any LLVM code loading a module never sees out of date debug info
>> >>> metadata)
>> >>> we ensure we only have to change one place, not every client of
>> >>> bitcode
>> >>> loading.
>> >>
>> >>
>> >> We can drop the debug info in the auto upgrading path similar to how we
>> >> upgrade TBAA tags.
>> >> I am going to move StripDebugInfo from StripSymbols.cpp to
>> >> DebugInfo.cpp
>> >> so the implementation can be shared between StripSymbols and
>> >> AutoUpgrade.
>> >> After that, a sample patch looks like:
>> >> Index: include/llvm/AutoUpgrade.h
>> >> ===================================================================
>> >> --- include/llvm/AutoUpgrade.h  (revision 195264)
>> >> +++ include/llvm/AutoUpgrade.h  (working copy)
>> >> @@ -57,6 +57,10 @@
>> >>    /// with different address spaces: the instruction is replaced by a
>> >> pair
>> >>    /// ptrtoint+inttoptr.
>> >>    Value *UpgradeBitCastExpr(unsigned Opc, Constant *C, Type *DestTy);
>> >> +
>> >> +  /// Check the debug info version number, if it is out-dated, drop
>> >> the
>> >> debug
>> >> +  /// info.
>> >> +  bool UpgradeDebugInfo(Module &M);
>> >>  } // End llvm namespace
>> >>
>> >>  #endif
>> >> Index: lib/AsmParser/LLParser.cpp
>> >> ===================================================================
>> >> --- lib/AsmParser/LLParser.cpp  (revision 195264)
>> >> +++ lib/AsmParser/LLParser.cpp  (working copy)
>> >> @@ -182,6 +182,8 @@
>> >>    for (Module::iterator FI = M->begin(), FE = M->end(); FI != FE; )
>> >>      UpgradeCallsToIntrinsic(FI++); // must be post-increment, as we
>> >> remove
>> >>
>> >> +  UpgradeDebugInfo(*M);
>> >> +
>> >>    return false;
>> >>  }
>> >>
>> >> Index: lib/Bitcode/Reader/BitcodeReader.cpp
>> >> ===================================================================
>> >> --- lib/Bitcode/Reader/BitcodeReader.cpp        (revision 195264)
>> >> +++ lib/Bitcode/Reader/BitcodeReader.cpp        (working copy)
>> >> @@ -3152,6 +3152,7 @@
>> >>    for (unsigned I = 0, E = InstsWithTBAATag.size(); I < E; I++)
>> >>      UpgradeInstWithTBAATag(InstsWithTBAATag[I]);
>> >>
>> >> +  UpgradeDebugInfo(*M);
>> >>    return error_code::success();
>> >>  }
>> >>
>> >> Index: lib/IR/AutoUpgrade.cpp
>> >> ===================================================================
>> >> --- lib/IR/AutoUpgrade.cpp      (revision 195264)
>> >> +++ lib/IR/AutoUpgrade.cpp      (working copy)
>> >> @@ -12,6 +12,7 @@
>> >>
>> >>
>> >> //===----------------------------------------------------------------------===//
>> >>
>> >>  #include "llvm/AutoUpgrade.h"
>> >> +#include "llvm/DebugInfo.h"
>> >>  #include "llvm/IR/Constants.h"
>> >>  #include "llvm/IR/Function.h"
>> >>  #include "llvm/IR/IRBuilder.h"
>> >> @@ -489,3 +490,11 @@
>> >>
>> >>    return 0;
>> >>  }
>> >> +
>> >> +bool llvm::UpgradeDebugInfo(Module &M) {
>> >> +  if (getDebugInfoVersionFromModule(M) == LLVMDebugVersion)
>> >> +    return false;
>> >> +
>> >> +  StripDebugInfo(M);
>> >> +  return true;
>> >> +}
>> >>
>> >> Of course, testing cases will be added when I actually submit the
>> >> patch.
>> >>
>> >>>
>> >>>>
>> >>>>
>> >>>> To Eric,
>> >>>> < You'll want to do it at loading time and not worry about merging at
>> >>>> < all. I think this will need to go just after reading the module and
>> >>>> < check the version against the hard coded version in the bit code
>> >>>> < reader/somewhere. I think we should do it there rather than as a
>> >>>> later
>> >>>> < pass as we really don't want to do anything to the metadata other
>> >>>> than
>> >>>> < strip it.
>> >>>>
>> >>>> I think I was saying the same thing: right after bit code loader :)
>> >>>> Eric, are you suggesting to strip the debug info inside the loader?
>> >>>>
>> >>>> The merging part is mostly about where to store the debug info
>> >>>> version
>> >>>> number in the bit code file.
>> >>>> If we store it as a module flag, we need to specify a mode for the
>> >>>> module flag so the linker knows how to merge the module flag.
>> >>>
>> >>>
>> >>> Except we'd never actually need to merge the flag - since we would
>> >>> drop
>> >>> it from any module that didn't have the latest, right? I'm not sure
>> >>> this
>> >>> would need to be a module flag, then, since there would be no need to
>> >>> define
>> >>> merging behavior. Is there an advantage to using a module flag over
>> >>> just
>> >>> putting it on the compile unit metadata node somewhere?
>> >>
>> >>
>> >> Agreed. So we have two choices here, use a module flag and remove the
>> >> module flag when stripping the debug info, so we will never merge
>> >> modules
>> >> with different debug info versions. (no new mode is required)
>> >>
>> >> The other choice is putting it on the compile unit MDNode as a field.
>> >> There is no reliable way of checking if an old bc file has a version
>> >> number, because it requires parsing the CU MDNodes in the old bc files.
>> >> Also if we are not going to support multiple debug info versions in a
>> >> single compiler, there is no point of putting a version number in each
>> >> CU
>> >> MDNode.
>> >
>> >
>> > Fair point. Any useful tradeoffs to consider between a single named
>> > MDNode
>> > and a module flag?
>> >
>>
>> Probably simplicity at the moment. A module flag is pretty easy to
>> read/parse/etc at load time versus having to put it in the metadata
>> itself.
>
>
> If the module flag is easier - I'm all for it. I wasn't sure if they were
> schematized more strictly than metadata or anything.
>
>>
>> In the metadata itself though we'd have everything self
>> contained (i.e. just stick it as a field on the compile unit or
>> something).
>
>
> I'm with Manman on the "if we never have mismatched versions, paying a
> per-CU overhead is silly" position, so I'd put it in its own named metadata
> instead (since necessarily anything other than the loading code would never
> see CUs with anything other than the current debug version, so it would be
> entirely redundant) - but sounds like module is simpler, so that's great.
>

Lots of agreement here. :)

-eric

> - David
>
>>
>>
>> -eric
>>
>> >>
>> >>
>> >> If we decide to go with the module flag, I can start adding the module
>> >> flag and update all existing testing cases. Let me know your vote :)
>> >>
>> >> Manman
>> >>
>> >>>
>> >>>>
>> >>>> If we decide to save the debug info version number as a module flag,
>> >>>> then we may need another mode for it.
>> >>>
>> >>>
>> >>> I would imagine we could strip the flag whenever we strip the debug
>> >>> info,
>> >>> then we would never merge modules that had differing debug info
>> >>> versions. So
>> >>> we could use any merging (or none at all) definition, since we'd only
>> >>> ever
>> >>> merge matching ones.
>> >>>
>> >>>>
>> >>>>
>> >>>> Thanks,
>> >>>> Manman
>> >>>>
>> >>>>>
>> >>>>>>
>> >>>>>> Item 3 is shared between non-LTO and LTO. For LTO, we want the
>> >>>>>> debug
>> >>>>>> info version number gets merged and maximized.
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> Manman
>> >>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> - David
>> >>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Thanks,
>> >>>>>>>> Manman
>> >>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> 4> how to report warning?
>> >>>>>>>>>     Is errs() << "WARNING: " good enough? BTW this is how we
>> >>>>>>>>> emit
>> >>>>>>>>> warning when linking module flags.
>> >>>>>>>>>
>> >>>>>>>>> Any objection to the above? Any other suggestion?
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Manman
>> >>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> _______________________________________________
>> >>>>>>>>>> LLVM Developers mailing list
>> >>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>
>



More information about the llvm-dev mailing list