[PATCH] Add llvm-pdbdump to tools

Fri Jan 23 15:13:55 PST 2015

On Fri, Jan 23, 2015 at 5:59 PM, Zachary Turner <zturner at google.com> wrote:
> Thanks for the detailed review!  Comments inline, and a new patch incoming
> soon.
>
> On Fri Jan 23 2015 at 2:39:49 PM Aaron Ballman <aaron.ballman at gmail.com>
> wrote:
>>
>> Additional thoughts and comments to what Chandler posted.
>>
>> > Index: tools/CMakeLists.txt
>> > ===================================================================
>> > --- tools/CMakeLists.txt
>> > +++ tools/CMakeLists.txt
>> > @@ -61,6 +61,10 @@
>> >
>> >  add_llvm_tool_subdirectory(llvm-go)
>> >
>> > +if(MSVC)
>> > +  add_llvm_tool_subdirectory(llvm-pdbdump)
>> > +endif()
>> > +
>> >  if(NOT CYGWIN AND LLVM_ENABLE_PIC)
>> >    add_llvm_tool_subdirectory(lto)
>> >    add_llvm_tool_subdirectory(llvm-lto)
>> > Index: tools/LLVMBuild.txt
>> > ===================================================================
>> > --- tools/LLVMBuild.txt
>> > +++ tools/LLVMBuild.txt
>> > @@ -16,7 +16,7 @@
>> >
>> > ;===------------------------------------------------------------------------===;
>> >
>> >  [common]
>> > -subdirectories = bugpoint llc lli llvm-ar llvm-as llvm-bcanalyzer
>> > llvm-cov llvm-diff llvm-dis llvm-dwarfdump llvm-extract llvm-jitlistener
>> > llvm-link llvm-lto llvm-mc llvm-nm llvm-objdump llvm-profdata llvm-rtdyld
>> > llvm-size macho-dump opt llvm-mcmarkup verify-uselistorder dsymutil
>> > +subdirectories = bugpoint llc lli llvm-ar llvm-as llvm-bcanalyzer
>> > llvm-cov llvm-diff llvm-dis llvm-dwarfdump llvm-extract llvm-jitlistener
>> > llvm-link llvm-lto llvm-mc llvm-nm llvm-objdump llvm-pdbdump llvm-profdata
>> > llvm-rtdyld llvm-size macho-dump opt llvm-mcmarkup verify-uselistorder
>> > dsymutil
>> >
>> >  [component_0]
>> >  type = Group
>> > Index: tools/llvm-pdbdump/CMakeLists.txt
>> > ===================================================================
>> > --- /dev/null
>> > +++ tools/llvm-pdbdump/CMakeLists.txt
>> > @@ -0,0 +1,16 @@
>> > +set(LLVM_LINK_COMPONENTS
>> > +  Support
>> > +  )
>> > +
>> > +set(MSVC_DIA_SDK_DIR "$ENV{VSINSTALLDIR}DIA SDK")
>> > +include_directories(${MSVC_DIA_SDK_DIR}/include)
>> > +if (CMAKE_SIZEOF_VOID_P EQUAL 8)
>> > +  link_directories(${MSVC_DIA_SDK_DIR}/lib/amd64)
>> > +else()
>> > +  link_directories(${MSVC_DIA_SDK_DIR}/lib)
>> > +endif()
>> > +
>> > +add_llvm_tool(llvm-pdbdump
>> > +  llvm-pdbdump.cpp
>> > +  )
>> > +target_link_libraries(llvm-pdbdump diaguids)
>> > \ No newline at end of file
>> > Index: tools/llvm-pdbdump/COMExtras.h
>> > ===================================================================
>> > --- /dev/null
>> > +++ tools/llvm-pdbdump/COMExtras.h
>> > @@ -0,0 +1,264 @@
>> > +//===- COMExtras.h - Helper files for COM operations -------------*-
>> > C++-*-===//
>> > +//
>> > +//                     The LLVM Compiler Infrastructure
>> > +//
>> > +// This file is distributed under the University of Illinois Open
>> > Source
>> > +// License. See LICENSE.TXT for details.
>> > +//
>> >
>> > +//===----------------------------------------------------------------------===//
>> > +
>> > +#ifndef LLVM_TOOLS_LLVM_PDBDUMP_COMEXTRAS_H
>> > +#define LLVM_TOOLS_LLVM_PDBDUMP_COMEXTRAS_H
>> > +
>> > +#include "llvm/ADT/ArrayRef.h"
>> > +#include "llvm/ADT/SmallVector.h"
>> > +
>> > +template <class F> struct function_traits;
>>
>> This nit applies to the entire patch: prefer typename over class in
>> template delcarations. Also, this should all be namespaced.
>>
>> > +
>> > +template <class R, class... Args>
>> > +struct function_traits<R (*)(Args...)> : public
>> > function_traits<R(Args...)> {};
>>
>> Variadic templates will be a problem until we drop MSVC 2012 support.
>
> Since this is intended only to support my use case of COM, and ultimately
> doesn't need to be completely generic, I'm going to add a codepath for the
> case of no variadic template support that only provides a 3 and 5 argument
> version.

With a giant FIXME that mentions the correct way to do it? ;-)

>
>
>>
>>
>> > +};
>> > +
>> > +template <class C, class R, class... Args>
>> > +struct function_traits<R (__stdcall C::*)(Args...)>
>> > +    : public function_traits<R(Args...)> {};
>> > +
>> > +template <class FuncTraits, std::size_t arg> struct function_arg {
>> > +  // Writing function_arg as a separate class that accesses the tuple
>> > from
>> > +  // function_traits is necessary due to what appears to be a bug in
>> > MSVC.
>> > +  // If you write a nested class inside function_traits like this:
>> > +  // template<std::size_t ArgIndex>
>> > +  // struct Argument
>> > +  // {
>> > +  //   typedef typename
>> > +  //     std::tuple_element<ArgIndex, std::tuple<Args...>>::type type;
>> > +  // };
>> > +  // MSVC encounters a parsing error.
>>
>> What parsing error does MSVC hit?
>
> This is the smallest repro I could come up with:
>
> #include <tuple>
>
> class FooClass
> {
> public:
>   int __stdcall FooFunc(int a, double b, float c)
>   {
>     return 0;
>   }
> };
>
> template <class F> struct function_traits;
>
> template <class R, class... Args>
> struct function_traits<R (*)(Args...)> : public function_traits<R(Args...)>
> {};
>
> template <class R, class... Args> struct function_traits<R(Args...)> {
>   using return_type = R;
>
>   using args_tuple = std::tuple<Args...>;
>
>   template<std::size_t N>
>   struct Argument
>   {
>     typedef typename std::tuple_element<N, std::tuple<Args...>>::type type;
>   };
> };
>
> template <class C, class R, class... Args>
> struct function_traits<R (__stdcall C::*)(Args...)>
>   : public function_traits < R(Args...) > {};
>
> template <typename ClassType>
> struct TestClass
> {
>   typedef typename
> function_traits<decltype(&ClassType::FooFunc)>::Argument<2>::type mytype;
> };
>
> int _tmain(int argc, _TCHAR* argv[])
> {
>   TestClass<FooClass> test;
>
> return 0;
> }
>
>
> 1>c:\users\zachary\documents\visual studio
> 2013\projects\debuginfotest\debuginfotest.cpp(40): error C2059: syntax error
> : '<'
> 1>          c:\users\zachary\documents\visual studio
> 2013\projects\debuginfotest\debuginfotest.cpp(41) : see reference to class
> template instantiation 'TestClass<ClassType>' being compiled
> 1>c:\users\zachary\documents\visual studio
> 2013\projects\debuginfotest\debuginfotest.cpp(40): error C2238: unexpected
> token(s) preceding ';'
>
> I don't consider myself a C++ expert, but note the same line works if I pull
> it out of TestClass and replace ClassType with FooClass.

Huh, that's neat. Thank you for sharing!

>
>>
>> >
>> > +//=============================================================================
>> > +template <class EnumeratorType> class ComDataRecordIterator {
>> > +public:
>> > +  explicit ComDataRecordIterator(CComPtr<EnumeratorType> enumerator,
>> > +                                 uint32_t currentRecord)
>> > +      : Enumerator(enumerator), CurrentRecord(currentRecord) {}
>> > +  ComDataRecordIterator() {}
>> > +
>> > +  ComDataRecordIterator &operator++() {
>> > +    // Release the current item so that Enumerator->Next() is happy.
>> > +    ++CurrentRecord;
>> > +    ReadNextRecord();
>>
>> This weirds me out a bit. Since ReadNextRecord can fail, but you are
>> incrementing CurrentRecord regardless, doesn't this mean you can get
>> to end() but not compare equal to something returned by end()?
>
>
> Presumably, but isn't incrementing end() undefined behavior anyway?

It is, but I am talking about:

auto i = blah.begin(); // Succeeds
i++; // Fails for some reason
if (i == end()) // ??
  ;

>
>>
>> > +
>> > +private:
>> > +  void ReadNextRecord() {
>> > +    RecordData.clear();
>> > +    ULONG Count = 0;
>> > +    DWORD RequiredBufferSize;
>> > +    HRESULT hr = Enumerator->Next(1, 0, &RequiredBufferSize, nullptr,
>> > &Count);
>> > +    if (hr == S_OK) {
>>
>> Please use the SUCCEEDED macro.
>
> Specifically here, that won't work.
> https://msdn.microsoft.com/en-us/library/64c9d90x.aspx  It returns S_OK if
> it succeeds, and S_FALSE if it fails.  Note that SUCCEEDED(S_FALSE) is true.

Ughhhhhh! I forgot about that horribleness. :-/

>
>>
>>
>> > +      RecordData.resize(RequiredBufferSize);
>> > +      DWORD BytesRead = 0;
>> > +      hr = Enumerator->Next(1, RequiredBufferSize, &BytesRead,
>> > +                            RecordData.data(), &Count);
>> > +    }
>> > +    if (hr == S_FALSE) {
>>
>> Please use the FAILED macro. This code will fail to end the
>> enumeration if Next() fails for any reason other than S_FALSE.
>
> Same issue as before, in that FAILED(S_FALSE) is false.  But your point is
> still valid, that other error cases aren't being handled here.  So I think I
> need to check hr == S_FALSE || FAILED(hr)

Yes, and same facedesking from me. ;-)

>
>>
>>
>> > +      // This is the end of the enumeration.
>> > +      RecordData.clear();
>> > +    }
>> > +  }
>> > +
>> > +  CComPtr<EnumeratorType> Enumerator;
>> > +  uint32_t CurrentRecord;
>> > +  llvm::SmallVector<uint8_t, 128> RecordData;
>>
>> Why 128? It seems rather large, and like something the caller might
>> have a much better idea over.
>
> You're right it's probably large.  32 or 64 might be better.  Some
> experiments show that most records are 16, 32, or 40 bytes.  I didn't know
> that until I wrote this and ran my first dump though.  On the other hand,
> letting the caller specify it clutters up the template argument list, which
> is kind of unfortunate.  Perhaps best to just err on the smaller side with
> something like 32?

For generality, I think it would make sense to allow the consumer to
pick it, but since this seems to be only for use with llvm-pdbdump, I
think it's fine to set to 32 for right now.

>
>>
>> > +
>> > +namespace llvm {
>> > +namespace sys {
>> > +namespace windows {
>> > +extern std::error_code UTF8ToUTF16(StringRef utf8,
>> > +                                   SmallVectorImpl<wchar_t> &utf16);
>> > +extern std::error_code UTF16ToUTF8(const wchar_t *utf16, size_t
>> > utf16_len,
>> > +                                   SmallVectorImpl<char> &utf8);
>>
>> Ugh, this really needs to find a reasonable home. I'm starting to
>> suspect that now is the time.
>
> Noo!!!!!!!!!!!!!!!  Don't make me do it :)  Or at least, lemme get this
> change in first.  I agree that it's unfortunate.  Months ago I pushed for
> allowing windows-specific stuff in Support to be exposed through headers,
> but it seemed like there was considerable resistance to it.  This is exactly
> the kind of thing I was worried about.

I would be fine doing it as a secondary step to clean this code up,
but this code (and other places where it's used) certainly provide a
compelling case for the refactoring.

~Aaron