[PATCH] D88988: [llvm-symbolizer] Add inline stack traces for Windows.

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 26 00:37:02 PST 2020


jhenderson added a subscriber: Higuoxing.
jhenderson added inline comments.


================
Comment at: lld/test/COFF/symbolizer-inline.s:3-5
+# RUN: lld-link -entry:main -nodefaultlib %t.obj -out:%t.exe -pdb:%t.pdb -debug
+# RUN: llvm-symbolizer --obj=%t.exe --use-native-pdb-reader --relative-address \
+# RUN:   0x1014 0x1018 0x101c 0x1023 0x1024 | FileCheck %s
----------------
rnk wrote:
> dblaikie wrote:
> > akhuang wrote:
> > > dblaikie wrote:
> > > > rnk wrote:
> > > > > dblaikie wrote:
> > > > > > After updated the dependencies (adding a dependency on llvm-symbolizer from lld's tests so this test could run 27e73816d6f9a7e627db73c445c4329db2ecfeaf ) that got  me looking at this/thinking: If this is the first use of llvm-symbolizer in lld, maybe it's out of place here?
> > > > > > 
> > > > > > Indeed this patch made no changes to lld, so it seems unsuitable that tests be added to lld - changes to llvm should be tested from within llvm. For ELF, at least, we're leaning towards writing hand-crafted assembly and the assembling that (with llvm-mc) and running llvm-symbolizer on the assembled file. If that model would work for COFF that'd be great - but otherwise it is acceptable to include source and repro steps in a file, and checkin a binary file for running llvm-symbolizer over.
> > > > > IMO it is important for test readability that we start with assembly, not a checked in binary file. It allows us to come up with creative .cv_loc transitions from one instruction to the next, and validate that we get the right source location at each instruction boundary.
> > > > > 
> > > > > There is prior art for using llvm-mc to produce an object file in llvm and then using llvm-symbolizer on that object file, but it's impossible to do the same for COFF. llvm-symbolizer expects to operate on a PDB file. The only tool capable of making a PDB from an object right now is LLD.
> > > > > 
> > > > > While it's unfortunate that the test lives in the wrong repo, the great increase in testability makes it worth it to me. Debug info is historically undertested or only tested via interactive debugger integration tests. I think there is a huge amount of value to this level (medium size integration?) of testing.
> > > > > IMO it is important for test readability that we start with assembly, not a checked in binary file.
> > > > 
> > > > I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.
> > > > 
> > > > > It allows us to come up with creative .cv_loc transitions from one instruction to the next, and validate that we get the right source location at each instruction boundary.
> > > > 
> > > > Not sure I follow here - a checked in binary, built from source including creative .cv_loc transitions, etc, could have the same test coverage, right?
> > > > 
> > > > > There is prior art for using llvm-mc to produce an object file in llvm and then using llvm-symbolizer on that object file,
> > > > 
> > > > Yeah, I still have sort of mixed feelings about that - but generally accepted the tradeoff (of increasing the size of the code underneath the test - including all of llvm-mc, rather than only the dumping/symbolizing code itself) in test maintainability is worthwhile, given the fairly limited complexity of functionality in llvm-mc that's required for most of these tests. (mostly don't care about eh_frame, so the line table generation is about the worst part of that - the rest of the DWARF is generated using basic assembly directives so the probability of bugs in llvm-mc messing with the tests is relatively low)
> > > > 
> > > > >  but it's impossible to do the same for COFF. llvm-symbolizer expects to operate on a PDB file. The only tool capable of making a PDB from an object right now is LLD.
> > > > 
> > > > Yeah, I thought that might be the case hence the caveat that maybe a checked in binary would be needed.
> > > > 
> > > > > While it's unfortunate that the test lives in the wrong repo, the great increase in testability makes it worth it to me.
> > > > 
> > > > I'm not sure it's any less testable, though - generating and checking in a binary doesn't seem like a huge hurdle to writing tests here. It was done in the past & while moving to assembly based tests has some value add, I don't think the checked in binaries testing libDebugInfo have been a major burden on LLVM development/maintenance.
> > > > 
> > > > > Debug info is historically undertested
> > > > 
> > > > I don't think this test is adding significantly more/different testing than I think would be reasonably requested otherwise. An LLVM test with a checked in binary and an lld test using assembly and a debug info dump of some kind (if that isn't already covered - I'd have assumed it was already covered when the relevant functionality was added to lld in other/previous patches?). 
> > > > 
> > > > > or only tested via interactive debugger integration tests. I think there is a huge amount of value to this level (medium size integration?) of testing.
> > > > 
> > > > I'm not sure I see this particular test providing a lot more value over testing the components separately. And even then, I think it's important to have the isolated testing in addition to any end-to-end testing.
> > > > 
> > > > There's the debuginfo-tests directory for more end-to-end testing, if that's desired (not sure if/how well that fits/is usable for COFF/PDB testing, though - could be worth expanding to support if it's not already testable there). Could have a full end-to-end test there, source code down to PDBs and symbolized. (might even be able to be portable between DWARF and PDB?)
> > > > I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.
> > > 
> > > We could also add a binary for testing in the llvm-symbolizer tests, where the other tests are. I didn't consider the fact that the code is untested in LLVM.
> > > 
> > > I agree that the assembly test doesn't provide any extra coverage that a checked in binary can't; I think the main upside of including the assembly test in lld is so that the test is easier to understand/modify. But maybe it's not too difficult to just include the assembly file in the llvm test inputs directory and re-build the binary whenever the test needs to be changed.
> > > > I agree that it's important - but to me, not at the cost of leaving the code untested in LLVM. That code is tested in the repository where it's committed seems like a fairly core value of the LLVM project in my experience.
> > > We could also add a binary for testing in the llvm-symbolizer tests, where the other tests are. I didn't consider the fact that the code is untested in LLVM.
> > > 
> > > I agree that the assembly test doesn't provide any extra coverage that a checked in binary can't; I think the main upside of including the assembly test in lld is so that the test is easier to understand/modify. But maybe it's not too difficult to just include the assembly file in the llvm test inputs directory and re-build the binary whenever the test needs to be changed.
> > 
> > Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.
> > 
> > (I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)
> > 
> > Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?
> > Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.
> 
> We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.
>  
> > (I'd argue, though not as firmly as I am about there being some testing in llvm itself, that once such testing is added, the lld test should probably be removed - as we tend not to intentionally implement end-to-end tests like this in LLVM - maybe something really end-to-end in the debuginfo-tests repository would be suitable)
> 
> I think debuginfo-tests aren't really the right place for this. They mostly focus on testing integration with interactive debuggers. To my knowledge, we aren't running debuginfo-tests continuously on Windows yet. The setup and configuration was sufficiently difficult that it fell off the top of the stack.
> 
> > Broader/side question: Is it possible to implement pdbs in assembly? Are they "just" a COFF file with particular bytes in them? Are those bytes reasonable for a human to write, or do they, for instance, contain hash tables or other things that would be hard to construct by hand?
> 
> Nope, they are not COFF files. PDBs are a kind of "multistream file" (MSF). We do have the means to make PDB files from yaml, though, so we could test things that way. However, the inline line tables, the thing under test in this case, are not represented symbolically, I believe they are just strings of hex bytes. We could make more testing tools to make PDBs, but at a certain point you're going to apply COFF relocations, and now you have yourself a second linker.
> 
> I think if we want to make llvm-symbolizer more testable from LLVM, the way to go is to:
> - Generalize the debug info parsing over COFF files as well as PDBs (the contents are the same, just unrelocated). This is similar to how llvm-symbolizer works on relocatable ELF object files.
> - Keep coverage of the PDB codepaths with yaml-ified PDB files. They are not good for fine grained inline line table testing, but they'd give us coverage.
>> Agreed that being able to write the assembly is beneficial (though even the DWARF tests from assembly aren't all that maintainable - maybe the yaml2obj will eventually make DWARF more human writable) - but hopefully not too bad to write a test like the many other llvm-symbolizer tests that are using checked in binaries.
> We did host an intern whose project goal was to make CodeView assembly tests easier to write and maintain. The same could be done for DWARF. We made several directives to support a lot of the hard-to-write-by-hand parts into MC, the assembler. For example, .cv_inline_line_table is a directive that generates these wacky state machine opcodes. I suppose for DWARF we are more constrained by what gas can do.

Just chiming in that @Higuoxing spent this year's GSOC period significantly improving DWARF support in yaml2obj and to a lesser extent obj2yaml. I don't think it's covered every case yet, but it's certainly possible to write many more tests using YAML than it was before (see llvm\test\tools\yaml2obj\ELF\DWARF for a number of tests that demonstrate the behaviour that has been added), and with greater ease, as some of the information at least is now automatically derived by yaml2obj without needing to be explicitly specified in the YAML itself. Unfortunately, that doesn't help with COFF/PDB of course.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88988/new/

https://reviews.llvm.org/D88988



More information about the llvm-commits mailing list