[LLVMdev] bit code file incompatibility due to debug info changes

David Blaikie dblaikie at gmail.com
Fri Nov 22 10:14:18 PST 2013


On Fri, Nov 22, 2013 at 9:54 AM, Eric Christopher <echristo at gmail.com>wrote:

> On Fri, Nov 22, 2013 at 9:08 AM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Thu, Nov 21, 2013 at 4:17 PM, Manman Ren <manman.ren at gmail.com>
> wrote:
> >>
> >>
> >>
> >>
> >> On Thu, Nov 21, 2013 at 12:01 PM, David Blaikie <dblaikie at gmail.com>
> >> wrote:
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Nov 21, 2013 at 11:45 AM, Manman Ren <manman.ren at gmail.com>
> >>> wrote:
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Nov 21, 2013 at 11:26 AM, David Blaikie <dblaikie at gmail.com>
> >>>> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Nov 21, 2013 at 11:06 AM, Manman Ren <manman.ren at gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Nov 21, 2013 at 10:57 AM, David Blaikie <dblaikie at gmail.com
> >
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Nov 21, 2013 at 10:38 AM, Manman Ren <manman.ren at gmail.com
> >
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Nov 19, 2013 at 8:47 PM, Manman Ren <manman.ren at gmail.com
> >
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Nov 19, 2013 at 5:26 PM, Manman Ren <
> manman.ren at gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Nov 18, 2013 at 11:00 AM, Chandler Carruth
> >>>>>>>>>> <chandlerc at google.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Nov 18, 2013 at 10:55 AM, David Blaikie
> >>>>>>>>>>> <dblaikie at gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> It depends a bit, also, on what kind of guarantees we need to
> >>>>>>>>>>>> offer. If the guarantee when reading IR from disk is "will
> not crash" then
> >>>>>>>>>>>> there's nothing for it but to run full debug info
> verification.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On the other hand, if we can assume that some specific
> metadata
> >>>>>>>>>>>> implies the correctness of some other metadata, then all we
> need to do is
> >>>>>>>>>>>> check a known debug info version number. If it's the current
> number, do
> >>>>>>>>>>>> nothing (assume the rest of the debug info is up to date and
> well formed),
> >>>>>>>>>>>> otherwise do all the work to find the debug info and dump it
> (no need to
> >>>>>>>>>>>> verify it - we just have to do the work to find it, so we
> don't go following
> >>>>>>>>>>>> bad links later on - that should be as easy as dropping the
> llvm.dbg.cu
> >>>>>>>>>>>> named node and removing all debug intrinsics and the
> instruction metadata
> >>>>>>>>>>>> line references). But this latter scheme isn't robust against
> arbitrary
> >>>>>>>>>>>> metadata (that could, coincidentally, have the right version
> number and
> >>>>>>>>>>>> arbitrary metadata that breaks all our debug info metadata
> assumptions)
> >>>>>>>>>>>>
> >>>>>>>>>>>> If the latter is sufficient for everyone's needs/principles,
> >>>>>>>>>>>> great.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> This makes sense to me, but I see Eric's fundamental concern
> with
> >>>>>>>>>>> upgrading test cases.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> One other possible idea: we could artificially force coarseness
> >>>>>>>>>>> on metadata while things are still iterating rapidly by just
> having a
> >>>>>>>>>>> version number that rotates every "release". (Even
> non-open-source-releases
> >>>>>>>>>>> could be handled by letting anyone, any time they need to rev
> it, update
> >>>>>>>>>>> what the current version is and update every test case to
> reference that
> >>>>>>>>>>> version.) If the version is nicely factored, it should at
> least be an
> >>>>>>>>>>> obvious diff if still huge.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Adding a debug info version number and ignoring the debug info
> >>>>>>>>>> MDNodes when version does not match makes sense. And changing
> the version
> >>>>>>>>>> number with every release sounds reasonable.
> >>>>>>>>>>
> >>>>>>>>>> One implementation is
> >>>>>>>>>> 1> completely get rid of version number in the tag
> >>>>>>>>>>   LLVMDebugVersion is only used in two places and we should be
> >>>>>>>>>> able to remove it.
> >>>>>>>>>>     include/llvm/DebugInfo.h:    return getUnsignedField(0) &
> >>>>>>>>>> ~LLVMDebugVersionMask;
> >>>>>>>>>>     lib/IR/DIBuilder.cpp:  assert((Tag & LLVMDebugVersionMask)
> ==
> >>>>>>>>>> 0 &&
> >>>>>>>>>>     lib/IR/DIBuilder.cpp:  return
> >>>>>>>>>> ConstantInt::get(Type::getInt32Ty(VMContext), Tag |
> LLVMDebugVersion);
> >>>>>>>>>>   I remember there was something that blocks David from
> completely
> >>>>>>>>>> getting rid of version number, but it seems the blocker is gone
> now.
> >>>>>>>>>>   This can be done separately as well.
> >>>>>>>>>>
> >>>>>>>>>> 2> introduce a debug info version in module flags similar to
> >>>>>>>>>> "dwarf version" module flag
> >>>>>>>>>>      LTO can correctly handle the module flag during linking and
> >>>>>>>>>> we only need to change one line in the debug info testing cases
> when we
> >>>>>>>>>> update debug info version number.
> >>>>>>>>>>      One issue is that we don't have a mode that emits a warning
> >>>>>>>>>> and outputs the largest value (we have a mode that emits a
> warning and
> >>>>>>>>>> outputs the value from the first module, but we can definitely
> add another
> >>>>>>>>>> mode if necessary)
> >>>>>>>>>>
> >>>>>>>>>> 3> when to drop the debug info MDNodes?
> >>>>>>>>>>      We could do this in the bit code loader but it seems a
> >>>>>>>>>> violation of layers. One possibility is to run a module pass
> right after
> >>>>>>>>>> loading the bit code, to remove debug intrinsics and debug tags
> on
> >>>>>>>>>> instructions.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> We have an existing pass StripSymbols that can strip debug info
> in
> >>>>>>>>> the module.
> >>>>>>>>> So it is just a matter of adding that pass after bit code loader
> if
> >>>>>>>>> the debug info version number does not match.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Can I assume no objection to the approach? :)
> >>>>>>>> Bill, are you okay with adding another mode to module flags? i.e.
> >>>>>>>> emits a warning and outputs the largest value.
> >>>>>>>
> >>>>>>>
> >>>>>>> How will this work as a module flag, though? If the flag gets
> merged
> >>>>>>> and maximized, how will we know which compile units have out of
> date debug
> >>>>>>> info metadata and drop them?
> >>>>>>
> >>>>>>
> >>>>>> The dropping part comes from item 3 above: adding StripSymbols pass
> >>>>>> right after bit code loader if the debug info version number does
> not match.
> >>>>>
> >>>>>
> >>>>> (what Eric said, but also)
> >>>>
> >>>>
> >>>>>
> >>>>>
> >>>>> I'm just not really understanding how you choose which things to
> strip
> >>>>> in (3). If you've already merged IR modules together and the debug
> info
> >>>>> version is bumped to the maximum - if we had an old and a new
> module, the
> >>>>> resulting version would be 'new' and we wouldn't know we still had
> some old
> >>>>> data in there we needed to remove, would we? (and even if we did, we
> >>>>> wouldn't know which of the compile units we needed to drop and which
> >>>>> intrinsics we needed to strip, etc - since some was new and good and
> some
> >>>>> was old and busted)
> >>>>
> >>>>
> >>>> To David,
> >>>>
> >>>> I was saying that running StripSymbols pass right after loading a bit
> >>>> code file (a source module), then the merging part will come
> afterwards when
> >>>> linking the source module in.
> >>>
> >>>
> >>> OK - but then we'd need another special case for non-LTO, when
> someone's
> >>> just passed a bitcode file to Clang, for example - yes?
> >>>
> >>> By building it into the loader or as close to that as possible (so that
> >>> any LLVM code loading a module never sees out of date debug info
> metadata)
> >>> we ensure we only have to change one place, not every client of bitcode
> >>> loading.
> >>
> >>
> >> We can drop the debug info in the auto upgrading path similar to how we
> >> upgrade TBAA tags.
> >> I am going to move StripDebugInfo from StripSymbols.cpp to DebugInfo.cpp
> >> so the implementation can be shared between StripSymbols and
> AutoUpgrade.
> >> After that, a sample patch looks like:
> >> Index: include/llvm/AutoUpgrade.h
> >> ===================================================================
> >> --- include/llvm/AutoUpgrade.h  (revision 195264)
> >> +++ include/llvm/AutoUpgrade.h  (working copy)
> >> @@ -57,6 +57,10 @@
> >>    /// with different address spaces: the instruction is replaced by a
> >> pair
> >>    /// ptrtoint+inttoptr.
> >>    Value *UpgradeBitCastExpr(unsigned Opc, Constant *C, Type *DestTy);
> >> +
> >> +  /// Check the debug info version number, if it is out-dated, drop the
> >> debug
> >> +  /// info.
> >> +  bool UpgradeDebugInfo(Module &M);
> >>  } // End llvm namespace
> >>
> >>  #endif
> >> Index: lib/AsmParser/LLParser.cpp
> >> ===================================================================
> >> --- lib/AsmParser/LLParser.cpp  (revision 195264)
> >> +++ lib/AsmParser/LLParser.cpp  (working copy)
> >> @@ -182,6 +182,8 @@
> >>    for (Module::iterator FI = M->begin(), FE = M->end(); FI != FE; )
> >>      UpgradeCallsToIntrinsic(FI++); // must be post-increment, as we
> >> remove
> >>
> >> +  UpgradeDebugInfo(*M);
> >> +
> >>    return false;
> >>  }
> >>
> >> Index: lib/Bitcode/Reader/BitcodeReader.cpp
> >> ===================================================================
> >> --- lib/Bitcode/Reader/BitcodeReader.cpp        (revision 195264)
> >> +++ lib/Bitcode/Reader/BitcodeReader.cpp        (working copy)
> >> @@ -3152,6 +3152,7 @@
> >>    for (unsigned I = 0, E = InstsWithTBAATag.size(); I < E; I++)
> >>      UpgradeInstWithTBAATag(InstsWithTBAATag[I]);
> >>
> >> +  UpgradeDebugInfo(*M);
> >>    return error_code::success();
> >>  }
> >>
> >> Index: lib/IR/AutoUpgrade.cpp
> >> ===================================================================
> >> --- lib/IR/AutoUpgrade.cpp      (revision 195264)
> >> +++ lib/IR/AutoUpgrade.cpp      (working copy)
> >> @@ -12,6 +12,7 @@
> >>
> >>
> //===----------------------------------------------------------------------===//
> >>
> >>  #include "llvm/AutoUpgrade.h"
> >> +#include "llvm/DebugInfo.h"
> >>  #include "llvm/IR/Constants.h"
> >>  #include "llvm/IR/Function.h"
> >>  #include "llvm/IR/IRBuilder.h"
> >> @@ -489,3 +490,11 @@
> >>
> >>    return 0;
> >>  }
> >> +
> >> +bool llvm::UpgradeDebugInfo(Module &M) {
> >> +  if (getDebugInfoVersionFromModule(M) == LLVMDebugVersion)
> >> +    return false;
> >> +
> >> +  StripDebugInfo(M);
> >> +  return true;
> >> +}
> >>
> >> Of course, testing cases will be added when I actually submit the patch.
> >>
> >>>
> >>>>
> >>>>
> >>>> To Eric,
> >>>> < You'll want to do it at loading time and not worry about merging at
> >>>> < all. I think this will need to go just after reading the module and
> >>>> < check the version against the hard coded version in the bit code
> >>>> < reader/somewhere. I think we should do it there rather than as a
> later
> >>>> < pass as we really don't want to do anything to the metadata other
> than
> >>>> < strip it.
> >>>>
> >>>> I think I was saying the same thing: right after bit code loader :)
> >>>> Eric, are you suggesting to strip the debug info inside the loader?
> >>>>
> >>>> The merging part is mostly about where to store the debug info version
> >>>> number in the bit code file.
> >>>> If we store it as a module flag, we need to specify a mode for the
> >>>> module flag so the linker knows how to merge the module flag.
> >>>
> >>>
> >>> Except we'd never actually need to merge the flag - since we would drop
> >>> it from any module that didn't have the latest, right? I'm not sure
> this
> >>> would need to be a module flag, then, since there would be no need to
> define
> >>> merging behavior. Is there an advantage to using a module flag over
> just
> >>> putting it on the compile unit metadata node somewhere?
> >>
> >>
> >> Agreed. So we have two choices here, use a module flag and remove the
> >> module flag when stripping the debug info, so we will never merge
> modules
> >> with different debug info versions. (no new mode is required)
> >>
> >> The other choice is putting it on the compile unit MDNode as a field.
> >> There is no reliable way of checking if an old bc file has a version
> >> number, because it requires parsing the CU MDNodes in the old bc files.
> >> Also if we are not going to support multiple debug info versions in a
> >> single compiler, there is no point of putting a version number in each
> CU
> >> MDNode.
> >
> >
> > Fair point. Any useful tradeoffs to consider between a single named
> MDNode
> > and a module flag?
> >
>
> Probably simplicity at the moment. A module flag is pretty easy to
> read/parse/etc at load time versus having to put it in the metadata
> itself.


If the module flag is easier - I'm all for it. I wasn't sure if they were
schematized more strictly than metadata or anything.


> In the metadata itself though we'd have everything self
> contained (i.e. just stick it as a field on the compile unit or
> something).
>

I'm with Manman on the "if we never have mismatched versions, paying a
per-CU overhead is silly" position, so I'd put it in its own named metadata
instead (since necessarily anything other than the loading code would never
see CUs with anything other than the current debug version, so it would be
entirely redundant) - but sounds like module is simpler, so that's great.

- David


>
> -eric
>
> >>
> >>
> >> If we decide to go with the module flag, I can start adding the module
> >> flag and update all existing testing cases. Let me know your vote :)
> >>
> >> Manman
> >>
> >>>
> >>>>
> >>>> If we decide to save the debug info version number as a module flag,
> >>>> then we may need another mode for it.
> >>>
> >>>
> >>> I would imagine we could strip the flag whenever we strip the debug
> info,
> >>> then we would never merge modules that had differing debug info
> versions. So
> >>> we could use any merging (or none at all) definition, since we'd only
> ever
> >>> merge matching ones.
> >>>
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Manman
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Item 3 is shared between non-LTO and LTO. For LTO, we want the debug
> >>>>>> info version number gets merged and maximized.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Manman
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> - David
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Manman
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> 4> how to report warning?
> >>>>>>>>>     Is errs() << "WARNING: " good enough? BTW this is how we emit
> >>>>>>>>> warning when linking module flags.
> >>>>>>>>>
> >>>>>>>>> Any objection to the above? Any other suggestion?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Manman
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> LLVM Developers mailing list
> >>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131122/dea1781a/attachment.html>


More information about the llvm-dev mailing list