[PATCH] Add llvm-pdbdump to tools
Aaron Ballman
aaron.ballman at gmail.com
Fri Jan 23 14:39:48 PST 2015
Additional thoughts and comments to what Chandler posted.
> Index: tools/CMakeLists.txt
> ===================================================================
> --- tools/CMakeLists.txt
> +++ tools/CMakeLists.txt
> @@ -61,6 +61,10 @@
>
> add_llvm_tool_subdirectory(llvm-go)
>
> +if(MSVC)
> + add_llvm_tool_subdirectory(llvm-pdbdump)
> +endif()
> +
> if(NOT CYGWIN AND LLVM_ENABLE_PIC)
> add_llvm_tool_subdirectory(lto)
> add_llvm_tool_subdirectory(llvm-lto)
> Index: tools/LLVMBuild.txt
> ===================================================================
> --- tools/LLVMBuild.txt
> +++ tools/LLVMBuild.txt
> @@ -16,7 +16,7 @@
> ;===------------------------------------------------------------------------===;
>
> [common]
> -subdirectories = bugpoint llc lli llvm-ar llvm-as llvm-bcanalyzer llvm-cov llvm-diff llvm-dis llvm-dwarfdump llvm-extract llvm-jitlistener llvm-link llvm-lto llvm-mc llvm-nm llvm-objdump llvm-profdata llvm-rtdyld llvm-size macho-dump opt llvm-mcmarkup verify-uselistorder dsymutil
> +subdirectories = bugpoint llc lli llvm-ar llvm-as llvm-bcanalyzer llvm-cov llvm-diff llvm-dis llvm-dwarfdump llvm-extract llvm-jitlistener llvm-link llvm-lto llvm-mc llvm-nm llvm-objdump llvm-pdbdump llvm-profdata llvm-rtdyld llvm-size macho-dump opt llvm-mcmarkup verify-uselistorder dsymutil
>
> [component_0]
> type = Group
> Index: tools/llvm-pdbdump/CMakeLists.txt
> ===================================================================
> --- /dev/null
> +++ tools/llvm-pdbdump/CMakeLists.txt
> @@ -0,0 +1,16 @@
> +set(LLVM_LINK_COMPONENTS
> + Support
> + )
> +
> +set(MSVC_DIA_SDK_DIR "$ENV{VSINSTALLDIR}DIA SDK")
> +include_directories(${MSVC_DIA_SDK_DIR}/include)
> +if (CMAKE_SIZEOF_VOID_P EQUAL 8)
> + link_directories(${MSVC_DIA_SDK_DIR}/lib/amd64)
> +else()
> + link_directories(${MSVC_DIA_SDK_DIR}/lib)
> +endif()
> +
> +add_llvm_tool(llvm-pdbdump
> + llvm-pdbdump.cpp
> + )
> +target_link_libraries(llvm-pdbdump diaguids)
> \ No newline at end of file
> Index: tools/llvm-pdbdump/COMExtras.h
> ===================================================================
> --- /dev/null
> +++ tools/llvm-pdbdump/COMExtras.h
> @@ -0,0 +1,264 @@
> +//===- COMExtras.h - Helper files for COM operations -------------*- C++-*-===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef LLVM_TOOLS_LLVM_PDBDUMP_COMEXTRAS_H
> +#define LLVM_TOOLS_LLVM_PDBDUMP_COMEXTRAS_H
> +
> +#include "llvm/ADT/ArrayRef.h"
> +#include "llvm/ADT/SmallVector.h"
> +
> +template <class F> struct function_traits;
This nit applies to the entire patch: prefer typename over class in
template delcarations. Also, this should all be namespaced.
> +
> +template <class R, class... Args>
> +struct function_traits<R (*)(Args...)> : public function_traits<R(Args...)> {};
Variadic templates will be a problem until we drop MSVC 2012 support.
> +
> +template <class R, class... Args> struct function_traits<R(Args...)> {
> + using return_type = R;
> +
> + using args_tuple = std::tuple<Args...>;
You should probably be including <tuple> instead of relying on it
being pulled in by other ADT headers.
> +};
> +
> +template <class C, class R, class... Args>
> +struct function_traits<R (__stdcall C::*)(Args...)>
> + : public function_traits<R(Args...)> {};
> +
> +template <class FuncTraits, std::size_t arg> struct function_arg {
> + // Writing function_arg as a separate class that accesses the tuple from
> + // function_traits is necessary due to what appears to be a bug in MSVC.
> + // If you write a nested class inside function_traits like this:
> + // template<std::size_t ArgIndex>
> + // struct Argument
> + // {
> + // typedef typename
> + // std::tuple_element<ArgIndex, std::tuple<Args...>>::type type;
> + // };
> + // MSVC encounters a parsing error.
What parsing error does MSVC hit?
> + typedef
> + typename std::tuple_element<arg, typename FuncTraits::args_tuple>::type
> + type;
> +};
> +
> +template <class T> struct remove_double_pointer {};
> +template <class T> struct remove_double_pointer<T **> { typedef T type; };
> +
> +//=============================================================================
> +// class ComIterator<>
> +//
> +// A common idiom in the COM world is to have an enumerator interface, say
> +// IMyEnumerator. It's responsible for enumerating over some child data type,
> +// say IChildType. You do the enumeration by calling IMyEnumerator::Next()
> +// to get a pointer to a pointer to the child type. Eventually it fails,
> +// and that means you're at the end.
> +//
> +// ComIterator represents a single point-in-time of this iteration. It is
> +// used by ComEnumerator to support iterating in this fashion via range-based
> +// for loops and other common C++ paradigms.
> +//=============================================================================
> +template <class EnumeratorType, std::size_t ArgIndex> class ComIterator {
> +private:
No need for the private access specifier.
> + using FunctionTraits = function_traits<decltype(&EnumeratorType::Next)>;
> + typedef typename function_arg<FunctionTraits, ArgIndex>::type FuncArgType;
> + // FuncArgType is now something like ISomeCOMInterface **. Remove both
> + // pointers, so we can make a CComPtr<T> out of it.
> + typedef typename remove_double_pointer<FuncArgType>::type EnumDataType;
> +
> +public:
> + explicit ComIterator(CComPtr<EnumeratorType> Enumerator,
> + CComPtr<EnumDataType> Current)
> + : EnumeratorObject(Enumerator), CurrentItem(Current) {}
> + ComIterator() {}
> +
> + ComIterator &operator++() {
> + // EnumeratorObject->Next() expects CurrentItem to be NULL.
> + CurrentItem.Release();
> + ULONG Count = 0;
> + HRESULT hr = EnumeratorObject->Next(1, &CurrentItem, &Count);
> + if (FAILED(hr) || Count == 0)
> + *this = ComIterator();
> +
> + return *this;
> + }
> +
> + CComPtr<EnumDataType> operator*() { return CurrentItem; }
> +
> + bool operator==(const ComIterator &other) const {
> + return (EnumeratorObject == other.EnumeratorObject) &&
> + (CurrentItem == other.CurrentItem);
> + }
> +
> + bool operator!=(const ComIterator &other) const { return !(*this == other); }
> +
> + ComIterator &operator=(const ComIterator &other) {
> + EnumeratorObject = other.EnumeratorObject;
> + CurrentItem = other.CurrentItem;
> + return *this;
> + }
Since this is an iterator, it should expose other iterator
functionality (like operator++(int), and the iterator typedefs
value_type, difference_type, pointer, reference, and
iterator_category).
> +
> +private:
> + CComPtr<EnumeratorType> EnumeratorObject;
> + CComPtr<EnumDataType> CurrentItem;
Might as well stick these up with the other private items.
> +};
> +
> +//=============================================================================
> +// class ComEnumerator<>
> +//
> +// ComEnumerator is the top-level "range class" used to support iteration via
> +// range-based for loops. It simply provides a begin() and end() method which
> +// return appropriately constructed ComIterator<> classes.
> +//=============================================================================
> +template <class EnumeratorType, std::size_t ArgIndex> class ComEnumerator {
> +private:
Private access specifier not required.
> + typedef function_traits<decltype(&EnumeratorType::Next)> FunctionTraits;
> + typedef typename function_arg<FunctionTraits, ArgIndex>::type FuncArgType;
> + typedef typename remove_double_pointer<FuncArgType>::type EnumDataType;
> +
> +public:
> + ComEnumerator(CComPtr<EnumeratorType> Enumerator)
> + : EnumeratorObject(Enumerator) {}
> +
> + ComIterator<EnumeratorType, ArgIndex> begin() {
> + if (!EnumeratorObject)
> + return end();
> +
> + EnumeratorObject->Reset();
> + ULONG Count = 0;
> + CComPtr<EnumDataType> FirstItem;
> + HRESULT hr = EnumeratorObject->Next(1, &FirstItem, &Count);
> + return (FAILED(hr) || Count == 0)
> + ? end()
> + : ComIterator<EnumeratorType, ArgIndex>(EnumeratorObject, FirstItem);
> + }
> +
> + ComIterator<EnumeratorType, ArgIndex> end() {
> + return ComIterator<EnumeratorType, ArgIndex>();
> + }
> +
> +private:
> + CComPtr<EnumeratorType> EnumeratorObject;
Might as well move this up as well.
> +};
> +
> +//=============================================================================
> +// class ComDataRecordIterator<>
> +//
> +// Similar to ComIterator<>, but uses a
I'm dying to know what this uses. ;-)
> +//=============================================================================
> +template <class EnumeratorType> class ComDataRecordIterator {
> +public:
> + explicit ComDataRecordIterator(CComPtr<EnumeratorType> enumerator,
> + uint32_t currentRecord)
> + : Enumerator(enumerator), CurrentRecord(currentRecord) {}
> + ComDataRecordIterator() {}
> +
> + ComDataRecordIterator &operator++() {
> + // Release the current item so that Enumerator->Next() is happy.
> + ++CurrentRecord;
> + ReadNextRecord();
This weirds me out a bit. Since ReadNextRecord can fail, but you are
incrementing CurrentRecord regardless, doesn't this mean you can get
to end() but not compare equal to something returned by end()?
> + return *this;
> + }
> +
> + llvm::ArrayRef<uint8_t> operator*() {
> + if (CurrentRecord == 0)
> + ReadNextRecord();
> +
> + return llvm::ArrayRef<uint8_t>(RecordData.begin(), RecordData.end());
> + }
> +
> + bool operator==(const ComDataRecordIterator &other) const {
> + return (Enumerator == other.Enumerator) &&
> + (CurrentRecord == other.CurrentRecord);
> + }
> +
> + bool operator!=(const ComDataRecordIterator &other) const {
> + return !(*this == other);
> + }
Same comments here regarding the iterator interface as above.
> +
> +private:
> + void ReadNextRecord() {
> + RecordData.clear();
> + ULONG Count = 0;
> + DWORD RequiredBufferSize;
> + HRESULT hr = Enumerator->Next(1, 0, &RequiredBufferSize, nullptr, &Count);
> + if (hr == S_OK) {
Please use the SUCCEEDED macro.
> + RecordData.resize(RequiredBufferSize);
> + DWORD BytesRead = 0;
> + hr = Enumerator->Next(1, RequiredBufferSize, &BytesRead,
> + RecordData.data(), &Count);
> + }
> + if (hr == S_FALSE) {
Please use the FAILED macro. This code will fail to end the
enumeration if Next() fails for any reason other than S_FALSE.
> + // This is the end of the enumeration.
> + RecordData.clear();
> + }
> + }
> +
> + CComPtr<EnumeratorType> Enumerator;
> + uint32_t CurrentRecord;
> + llvm::SmallVector<uint8_t, 128> RecordData;
Why 128? It seems rather large, and like something the caller might
have a much better idea over.
> +};
> +
> +//=============================================================================
> +// class ComEnumerator<>
> +//
> +// ComEnumerator is the top-level "range class" used to support iteration via
> +// range-based for loops. It simply provides a begin() and end() method which
> +// return appropriately constructed ComIterator<> classes.
This comment is incorrect copypasta.
> +//=============================================================================
> +template <class EnumeratorType> class ComDataRecordEnumerator {
> +public:
> + ComDataRecordEnumerator(CComPtr<EnumeratorType> enumerator)
> + : Enumerator(enumerator) {}
> +
> + ComDataRecordIterator<EnumeratorType> begin() {
> + if (Enumerator)
> + Enumerator->Reset();
> + return ComDataRecordIterator<EnumeratorType>(Enumerator, 0);
> + }
> +
> + ComDataRecordIterator<EnumeratorType> end() {
> + LONG NumElts = 0;
> + HRESULT hr = Enumerator->get_Count(&NumElts);
> + return (FAILED(hr))
> + ? ComDataRecordIterator<EnumeratorType>(Enumerator, 0)
> + : ComDataRecordIterator<EnumeratorType>(Enumerator, NumElts);
> + }
> +
> +private:
> + CComPtr<EnumeratorType> Enumerator;
> +};
> +
> +//=============================================================================
> +// function com_enumerator<>
> +//
> +// com_enumerator puts together all the magic C++ incantations to deduce all
> +// necessary types (enumerator, child type) automatically. You need only write
> +// for (auto item : com_enumerator(MyEnumerator))
> +// {
> +// }
> +//=============================================================================
> +template <class EnumeratorType>
> +ComEnumerator<EnumeratorType, 1>
> +com_enumerator(CComPtr<EnumeratorType> Enumerator) {
> + return ComEnumerator<EnumeratorType, 1>(Enumerator);
> +}
make_com_enumerator to be more STL-like?
> +
> +//=============================================================================
> +// function com_data_record_enumerator<>
> +//
> +// com_data_record_enumerator returns a ComDataRecordEnumerator appropriately
> +// parameterized so that it can be used in a range-based for loop. You need
> +// only write
> +// for (auto record : com_data_record_enumerator(MyEnumerator))
> +// {
> +// }
> +//=============================================================================
> +template <class EnumeratorType>
> +ComDataRecordEnumerator<EnumeratorType>
> +com_data_record_enumerator(CComPtr<EnumeratorType> Enumerator) {
> + return ComDataRecordEnumerator<EnumeratorType>(Enumerator);
> +}
make_com_data_record_enumerator is a mouthful, but yay consistency in
suggestions?
> +#endif
> Index: tools/llvm-pdbdump/LLVMBuild.txt
The rest of the patch seems like it's a separate commit from COM
functionality, but I'm not strongly tied to that idea.
> ===================================================================
> --- /dev/null
> +++ tools/llvm-pdbdump/LLVMBuild.txt
> @@ -0,0 +1,23 @@
> +;===- ./tools/llvm-pdbdump/LLVMBuild.txt -----------------------*- Conf -*--===;
> +;
> +; The LLVM Compiler Infrastructure
> +;
> +; This file is distributed under the University of Illinois Open Source
> +; License. See LICENSE.TXT for details.
> +;
> +;===------------------------------------------------------------------------===;
> +;
> +; This is an LLVMBuild description file for the components in this subdirectory.
> +;
> +; For more information on the LLVMBuild system, please see:
> +;
> +; http://llvm.org/docs/LLVMBuild.html
> +;
> +;===------------------------------------------------------------------------===;
> +
> +[component_0]
> +type = Tool
> +name = llvm-pdbdump
> +parent = Tools
> +required_libraries =
> +
> Index: tools/llvm-pdbdump/llvm-pdbdump.cpp
> ===================================================================
> --- /dev/null
> +++ tools/llvm-pdbdump/llvm-pdbdump.cpp
> @@ -0,0 +1,152 @@
> +//===- llvm-pdbdump.cpp - Dump debug info from a PDB file -------*- C++ -*-===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Dumps debug information present in PDB files. This utility makes use of
> +// the Microsoft Windows SDK, so will not compile or run on non-Windows
> +// platforms.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#define NTDDI_VERSION NTDDI_VISTA
> +#define _WIN32_WINNT _WIN32_WINNT_VISTA
> +#define WINVER _WIN32_WINNT_VISTA
> +#ifndef NOMINMAX
> +#define NOMINMAX
> +#endif
> +
> +#include <atlbase.h>
> +#include <windows.h>
> +#include <dia2.h>
> +
> +#include "llvm/ADT/ArrayRef.h"
> +#include "llvm/Support/CommandLine.h"
> +#include "llvm/Support/ConvertUTF.h"
> +#include "llvm/Support/Format.h"
> +#include "llvm/Support/ManagedStatic.h"
> +#include "llvm/Support/raw_ostream.h"
> +#include "llvm/Support/Process.h"
> +#include "llvm/Support/PrettyStackTrace.h"
> +#include "llvm/Support/Signals.h"
> +
> +#include "COMExtras.h"
> +
> +using namespace llvm;
> +
> +namespace llvm {
> +namespace sys {
> +namespace windows {
> +extern std::error_code UTF8ToUTF16(StringRef utf8,
> + SmallVectorImpl<wchar_t> &utf16);
> +extern std::error_code UTF16ToUTF8(const wchar_t *utf16, size_t utf16_len,
> + SmallVectorImpl<char> &utf8);
Ugh, this really needs to find a reasonable home. I'm starting to
suspect that now is the time.
> +}
> +}
> +}
> +
> +namespace opts {
> +cl::list<std::string> InputFilenames(cl::Positional,
> + cl::desc("<input PDB files>"),
> + cl::OneOrMore);
> +
> +// -streams, -s
> +cl::opt<bool> Streams("streams", cl::desc("Display data stream information"));
> +cl::alias StreamsShort("s", cl::desc("Alias for --streams"),
> + cl::aliasopt(Streams));
> +
> +// stream-data, -S
> +cl::opt<bool> StreamData("stream-data",
> + cl::desc("Dumps stream record data as bytes"));
> +cl::alias StreamDataShort("S", cl::desc("Alias for --stream-data"),
> + cl::aliasopt(StreamData));
> +}
> +
> +static void dumpDataStreams(IDiaSession *session) {
> + CComPtr<IDiaEnumDebugStreams> DebugStreams = nullptr;
> + if (session->getEnumDebugStreams(&DebugStreams) == S_OK) {
Should use the SUCCEEDED macro.
> + LONG Count = 0;
> + if (FAILED(DebugStreams->get_Count(&Count)))
> + return;
> + outs() << "Data Streams [count=" << Count << "]\n";
> +
> + llvm::SmallString<32> Name8;
> +
> + for (auto Stream : com_enumerator(DebugStreams)) {
> + BSTR Name16;
> + if (Stream->get_name(&Name16) != S_OK)
Should use the FAILED macro.
> + continue;
> + if (!llvm::sys::windows::UTF16ToUTF8(Name16, SysStringLen(Name16), Name8))
> + outs() << " " << Name8;
> + ::SysFreeString(Name16);
> + if (SUCCEEDED(Stream->get_Count(&Count))) {
> + outs() << " [" << Count << " records]\n";
> + if (opts::StreamData) {
> + int RecordIndex = 0;
> + for (auto StreamRecord : com_data_record_enumerator(Stream)) {
> + outs() << " Record " << RecordIndex << " ["
> + << StreamRecord.size() << " bytes]";
> + for (size_t i = 0; i < StreamRecord.size(); ++i) {
Why not a range-based for loop here as well?
> + outs() << " "
> + << llvm::format_hex(StreamRecord[i], 2, true, false);
> + }
> + outs() << "\n";
> + ++RecordIndex;
> + }
> + }
> + } else
> + outs() << "\n";
> + }
> + }
> + outs().flush();
> +}
> +
> +static void dumpInput(StringRef Path) {
> + SmallVector<wchar_t, 128> path_utf16;
> + std::error_code EC = llvm::sys::windows::UTF8ToUTF16(Path, path_utf16);
> + CComPtr<IDiaDataSource> source;
> + HRESULT hr =
> + ::CoCreateInstance(CLSID_DiaSource, nullptr, CLSCTX_INPROC_SERVER,
> + __uuidof(IDiaDataSource), (void **)&source);
> + if (FAILED(hr))
> + return;
> + if (FAILED(source->loadDataFromPdb(path_utf16.data())))
> + return;
> + CComPtr<IDiaSession> session;
> + if (FAILED(source->openSession(&session)))
> + return;
> + if (opts::Streams || opts::StreamData) {
> + dumpDataStreams(session);
> + }
> +}
> +
> +int main(int argc_, const char *argv_[]) {
> + // Print a stack trace if we signal out.
> + sys::PrintStackTraceOnErrorSignal();
> + PrettyStackTraceProgram X(argc_, argv_);
> +
> + SmallVector<const char *, 256> argv;
> + llvm::SpecificBumpPtrAllocator<char> ArgAllocator;
> + std::error_code EC = llvm::sys::Process::GetArgumentVector(
> + argv, llvm::makeArrayRef(argv_, argc_), ArgAllocator);
> + if (EC) {
> + llvm::errs() << "error: couldn't get arguments: " << EC.message() << '\n';
> + return 1;
> + }
> +
> + llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.
> +
> + cl::ParseCommandLineOptions(argv.size(), argv.data(), "LLVM PDB Dumper\n");
> +
> + CoInitializeEx(nullptr, COINIT_MULTITHREADED);
> +
> + std::for_each(opts::InputFilenames.begin(), opts::InputFilenames.end(),
> + dumpInput);
> +
> + CoUninitialize();
> + return 0;
> +}
>
~Aaron
On Fri, Jan 23, 2015 at 4:48 PM, Zachary Turner <zturner at google.com> wrote:
> ================
> Comment at: tools/CMakeLists.txt:64
> @@ -63,1 +63,3 @@
>
> +if(MSVC)
> + add_llvm_tool_subdirectory(llvm-pdbdump)
> ----------------
> chandlerc wrote:
>> Shouldn't this be checking if the host OS is Windows rather than the compiler?
> COM uses a bunch of Microsoft extensions and stuff. I'm pretty sure this won't compile with MinGW, for example. Is there a better check for "any MSVC compatible compiler, including clang"?
>
> ================
> Comment at: tools/llvm-pdbdump/COMExtras.h:18-19
> @@ +17,4 @@
> +
> +template <class R, class... Args>
> +struct function_traits<R (*)(Args...)> : public function_traits<R(Args...)> {};
> +
> ----------------
> chandlerc wrote:
>> MSVC 2012 doesn't support variadiac templates I thought?
> Umm.. Crap? Can we drop support for VS 2012 yet? lol. I'm not really sure how to implement this without it.
>
> ================
> Comment at: tools/llvm-pdbdump/llvm-pdbdump.cpp:44-47
> @@ +43,6 @@
> +namespace windows {
> +extern std::error_code UTF8ToUTF16(StringRef utf8,
> + SmallVectorImpl<wchar_t> &utf16);
> +extern std::error_code UTF16ToUTF8(const wchar_t *utf16, size_t utf16_len,
> + SmallVectorImpl<char> &utf8);
> +}
> ----------------
> chandlerc wrote:
>> If these aren't provided by Support, they should be. If they are, just include the header rather than declaring them yourself.
> They're part of that stuff that I used to frequently complain about which is in Support, but not exposed through a header file since it's Windows-specific.
>
> ================
> Comment at: tools/llvm-pdbdump/llvm-pdbdump.cpp:127
> @@ +126,3 @@
> +
> +int main(int argc_, const char *argv_[]) {
> + // Print a stack trace if we signal out.
> ----------------
> chandlerc wrote:
>> Why the _s?
> Because I use llvm::Sys::Process::GetArgumentVector() so there ends up being a second argv. I used _ to make it clear that they're not going to be used, and that the argv vector should be used instead.
>
> http://reviews.llvm.org/D7153
>
> EMAIL PREFERENCES
> http://reviews.llvm.org/settings/panel/emailpreferences/
>
>
More information about the llvm-commits
mailing list