[LLVMdev] bit code file incompatibility due to debug info	changes
    Manman Ren 
    manman.ren at gmail.com
       
    Thu Nov 21 16:17:51 PST 2013
    
    
  
On Thu, Nov 21, 2013 at 12:01 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Thu, Nov 21, 2013 at 11:45 AM, Manman Ren <manman.ren at gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Nov 21, 2013 at 11:26 AM, David Blaikie <dblaikie at gmail.com>wrote:
>>
>>>
>>>
>>>
>>> On Thu, Nov 21, 2013 at 11:06 AM, Manman Ren <manman.ren at gmail.com>wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Thu, Nov 21, 2013 at 10:57 AM, David Blaikie <dblaikie at gmail.com>wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 21, 2013 at 10:38 AM, Manman Ren <manman.ren at gmail.com>wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 19, 2013 at 8:47 PM, Manman Ren <manman.ren at gmail.com>wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 19, 2013 at 5:26 PM, Manman Ren <manman.ren at gmail.com>wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 18, 2013 at 11:00 AM, Chandler Carruth <
>>>>>>>> chandlerc at google.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 18, 2013 at 10:55 AM, David Blaikie <
>>>>>>>>> dblaikie at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> It depends a bit, also, on what kind of guarantees we need to
>>>>>>>>>> offer. If the guarantee when reading IR from disk is "will not crash" then
>>>>>>>>>> there's nothing for it but to run full debug info verification.
>>>>>>>>>>
>>>>>>>>>> On the other hand, if we can assume that some specific metadata
>>>>>>>>>> implies the correctness of some other metadata, then all we need to do is
>>>>>>>>>> check a known debug info version number. If it's the current number, do
>>>>>>>>>> nothing (assume the rest of the debug info is up to date and well formed),
>>>>>>>>>> otherwise do all the work to find the debug info and dump it (no need to
>>>>>>>>>> verify it - we just have to do the work to find it, so we don't go
>>>>>>>>>> following bad links later on - that should be as easy as dropping the
>>>>>>>>>> llvm.dbg.cu named node and removing all debug intrinsics and the
>>>>>>>>>> instruction metadata line references). But this latter scheme isn't robust
>>>>>>>>>> against arbitrary metadata (that could, coincidentally, have the right
>>>>>>>>>> version number and arbitrary metadata that breaks all our debug info
>>>>>>>>>> metadata assumptions)
>>>>>>>>>>
>>>>>>>>>> If the latter is sufficient for everyone's needs/principles,
>>>>>>>>>> great.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This makes sense to me, but I see Eric's fundamental concern with
>>>>>>>>> upgrading test cases.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> One other possible idea: we could artificially force coarseness on
>>>>>>>>> metadata while things are still iterating rapidly by just having a version
>>>>>>>>> number that rotates every "release". (Even non-open-source-releases could
>>>>>>>>> be handled by letting anyone, any time they need to rev it, update what the
>>>>>>>>> current version is and update every test case to reference that version.)
>>>>>>>>> If the version is nicely factored, it should at least be an obvious diff if
>>>>>>>>> still huge.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Adding a debug info version number and ignoring the debug info
>>>>>>>> MDNodes when version does not match makes sense. And changing the version
>>>>>>>> number with every release sounds reasonable.
>>>>>>>>
>>>>>>>> One implementation is
>>>>>>>> 1> completely get rid of version number in the tag
>>>>>>>>   LLVMDebugVersion is only used in two places and we should be able
>>>>>>>> to remove it.
>>>>>>>>     include/llvm/DebugInfo.h:    return getUnsignedField(0) &
>>>>>>>> ~LLVMDebugVersionMask;
>>>>>>>>     lib/IR/DIBuilder.cpp:  assert((Tag & LLVMDebugVersionMask) == 0
>>>>>>>> &&
>>>>>>>>     lib/IR/DIBuilder.cpp:  return
>>>>>>>> ConstantInt::get(Type::getInt32Ty(VMContext), Tag | LLVMDebugVersion);
>>>>>>>>   I remember there was something that blocks David from completely
>>>>>>>> getting rid of version number, but it seems the blocker is gone now.
>>>>>>>>   This can be done separately as well.
>>>>>>>>
>>>>>>>> 2> introduce a debug info version in module flags similar to "dwarf
>>>>>>>> version" module flag
>>>>>>>>      LTO can correctly handle the module flag during linking and we
>>>>>>>> only need to change one line in the debug info testing cases when we update
>>>>>>>> debug info version number.
>>>>>>>>      One issue is that we don't have a mode that emits a warning
>>>>>>>> and outputs the largest value (we have a mode that emits a warning and
>>>>>>>> outputs the value from the first module, but we can definitely add another
>>>>>>>> mode if necessary)
>>>>>>>>
>>>>>>>> 3> when to drop the debug info MDNodes?
>>>>>>>>      We could do this in the bit code loader but it seems a
>>>>>>>> violation of layers. One possibility is to run a module pass right after
>>>>>>>> loading the bit code, to remove debug intrinsics and debug tags on
>>>>>>>> instructions.
>>>>>>>>
>>>>>>>
>>>>>>> We have an existing pass StripSymbols that can strip debug info in
>>>>>>> the module.
>>>>>>> So it is just a matter of adding that pass after bit code loader if
>>>>>>> the debug info version number does not match.
>>>>>>>
>>>>>>
>>>>>> Can I assume no objection to the approach? :)
>>>>>> Bill, are you okay with adding another mode to module flags? i.e.
>>>>>> emits a warning and outputs the largest value.
>>>>>>
>>>>>
>>>>> How will this work as a module flag, though? If the flag gets merged
>>>>> and maximized, how will we know which compile units have out of date debug
>>>>> info metadata and drop them?
>>>>>
>>>>
>>>> The dropping part comes from item 3 above: adding StripSymbols pass
>>>> right after bit code loader if the debug info version number does not match.
>>>>
>>>
>>> (what Eric said, but also)
>>>
>>
>>
>>>
>>> I'm just not really understanding how you choose which things to strip
>>> in (3). If you've already merged IR modules together and the debug info
>>> version is bumped to the maximum - if we had an old and a new module, the
>>> resulting version would be 'new' and we wouldn't know we still had some old
>>> data in there we needed to remove, would we? (and even if we did, we
>>> wouldn't know which of the compile units we needed to drop and which
>>> intrinsics we needed to strip, etc - since some was new and good and some
>>> was old and busted)
>>>
>>
>> To David,
>>
>> I was saying that running StripSymbols pass right after loading a bit
>> code file (a source module), then the merging part will come afterwards
>> when linking the source module in.
>>
>
> OK - but then we'd need another special case for non-LTO, when someone's
> just passed a bitcode file to Clang, for example - yes?
>
> By building it into the loader or as close to that as possible (so that
> any LLVM code loading a module never sees out of date debug info metadata)
> we ensure we only have to change one place, not every client of bitcode
> loading.
>
We can drop the debug info in the auto upgrading path similar to how we
upgrade TBAA tags.
I am going to move StripDebugInfo from StripSymbols.cpp to DebugInfo.cpp so
the implementation can be shared between StripSymbols and AutoUpgrade.
After that, a sample patch looks like:
Index: include/llvm/AutoUpgrade.h
===================================================================
--- include/llvm/AutoUpgrade.h  (revision 195264)
+++ include/llvm/AutoUpgrade.h  (working copy)
@@ -57,6 +57,10 @@
   /// with different address spaces: the instruction is replaced by a pair
   /// ptrtoint+inttoptr.
   Value *UpgradeBitCastExpr(unsigned Opc, Constant *C, Type *DestTy);
+
+  /// Check the debug info version number, if it is out-dated, drop the
debug
+  /// info.
+  bool UpgradeDebugInfo(Module &M);
 } // End llvm namespace
 #endif
Index: lib/AsmParser/LLParser.cpp
===================================================================
--- lib/AsmParser/LLParser.cpp  (revision 195264)
+++ lib/AsmParser/LLParser.cpp  (working copy)
@@ -182,6 +182,8 @@
   for (Module::iterator FI = M->begin(), FE = M->end(); FI != FE; )
     UpgradeCallsToIntrinsic(FI++); // must be post-increment, as we remove
+  UpgradeDebugInfo(*M);
+
   return false;
 }
Index: lib/Bitcode/Reader/BitcodeReader.cpp
===================================================================
--- lib/Bitcode/Reader/BitcodeReader.cpp        (revision 195264)
+++ lib/Bitcode/Reader/BitcodeReader.cpp        (working copy)
@@ -3152,6 +3152,7 @@
   for (unsigned I = 0, E = InstsWithTBAATag.size(); I < E; I++)
     UpgradeInstWithTBAATag(InstsWithTBAATag[I]);
+  UpgradeDebugInfo(*M);
   return error_code::success();
 }
Index: lib/IR/AutoUpgrade.cpp
===================================================================
--- lib/IR/AutoUpgrade.cpp      (revision 195264)
+++ lib/IR/AutoUpgrade.cpp      (working copy)
@@ -12,6 +12,7 @@
 //===----------------------------------------------------------------------===//
 #include "llvm/AutoUpgrade.h"
+#include "llvm/DebugInfo.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/IRBuilder.h"
@@ -489,3 +490,11 @@
   return 0;
 }
+
+bool llvm::UpgradeDebugInfo(Module &M) {
+  if (getDebugInfoVersionFromModule(M) == LLVMDebugVersion)
+    return false;
+
+  StripDebugInfo(M);
+  return true;
+}
Of course, testing cases will be added when I actually submit the patch.
>
>>
>> To Eric,
>> < You'll want to do it at loading time and not worry about merging at
>> < all. I think this will need to go just after reading the module and
>> < check the version against the hard coded version in the bit code
>> < reader/somewhere. I think we should do it there rather than as a later
>> < pass as we really don't want to do anything to the metadata other than
>> < strip it.
>>
>> I think I was saying the same thing: right after bit code loader :)
>> Eric, are you suggesting to strip the debug info inside the loader?
>>
>> The merging part is mostly about where to store the debug info version
>> number in the bit code file.
>> If we store it as a module flag, we need to specify a mode for the module
>> flag so the linker knows how to merge the module flag.
>>
>
> Except we'd never actually need to merge the flag - since we would drop it
> from any module that didn't have the latest, right? I'm not sure this would
> need to be a module flag, then, since there would be no need to define
> merging behavior. Is there an advantage to using a module flag over just
> putting it on the compile unit metadata node somewhere?
>
Agreed. So we have two choices here, use a module flag and remove the
module flag when stripping the debug info, so we will never merge modules
with different debug info versions. (no new mode is required)
The other choice is putting it on the compile unit MDNode as a field.
There is no reliable way of checking if an old bc file has a version
number, because it requires parsing the CU MDNodes in the old bc files.
Also if we are not going to support multiple debug info versions in a
single compiler, there is no point of putting a version number in each CU
MDNode.
If we decide to go with the module flag, I can start adding the module flag
and update all existing testing cases. Let me know your vote :)
Manman
>
>> If we decide to save the debug info version number as a module flag, then
>> we may need another mode for it.
>>
>
> I would imagine we could strip the flag whenever we strip the debug info,
> then we would never merge modules that had differing debug info versions.
> So we could use any merging (or none at all) definition, since we'd only
> ever merge matching ones.
>
>
>>
>> Thanks,
>> Manman
>>
>>
>>>
>>>> Item 3 is shared between non-LTO and LTO. For LTO, we want the debug
>>>> info version number gets merged and maximized.
>>>>
>>>> Thanks
>>>> Manman
>>>>
>>>>
>>>>>
>>>>> - David
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Manman
>>>>>>
>>>>>>
>>>>>>> 4> how to report warning?
>>>>>>>     Is errs() << "WARNING: " good enough? BTW this is how we emit
>>>>>>> warning when linking module flags.
>>>>>>>
>>>>>>> Any objection to the above? Any other suggestion?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Manman
>>>>>>>
>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131121/c45514c0/attachment.html>
    
    
More information about the llvm-dev
mailing list