[llvm-commits] [PATCH] Teach llvm-objdump to dump Win64 exception tables

Sun Nov 25 13:45:22 PST 2012

Just a few tiny nits and the first patch looks ready to be committed
(although I'd like Michael to check off on it).

> The type casts are necessary, otherwise the output is garbage. I think the
> ulittle8_t value is interpreted as a char, not a number.

You might want to note that in a comment.

+  case UOP_SaveNonVolBig:
+  case UOP_SaveXMM128Big:
+    UsedSlots = 3;
+    break;
+  case UOP_AllocLarge:
+    UsedSlots =  (UnwindCode.getOpInfo() == 0) ? 2 : 3;
+    break;
+  }
+  return UsedSlots;
+}

You can get rid of UsedSlots here and just return the size directly.

+static void printUnwindCode(ArrayRef<UnwindCode> UCs) {

+static void printCOFFUnwindCode(ArrayRef<UnwindCode> UCs) {

I think these names need to be adjusted. The calling sequence
printCOFFUnwindCode -> printUnwindCode doesn't make sense, since it
seems like a more specific function is calling a more generic
function. I'm not super familiar with the terminology here, but
shouldn't these be plural, since they are printing multiple unwind
codes `ArrayRef<UnwindCode> UCs`? The naming should also reflect that
"printCOFFUnwindCode" prints multiple of whatever "printUnwindCode"
prints. From the comment, it seems like "printUnwindCode" prints
"unwind code entry"'s, so how about renaming them to
printCOFFUnwindCodeEntries and printCOFFUnwindCodeEntry?

Also, depending on how useful this would be in other places, (in a
future patch set) you could write an "unwind code entry iterator"
which is a forward iterator whose value_type is a class
UnwindCodeEntry. I'm not sure how useful that would be, but that seems
like a really clean way to work with these if you need to do something
more complicated with them besides just printing them.

+// Prints one unwind code entry. Because an entry can occupy up to 3 slots in
+// the codes array, this function requires that the correct number slots is
+// provided.

Even though the check has already been done in the caller, it probably
wouldn't hurt to `assert(UCs.size() >= getNumUsedSlots(UCs.front()))`
at the start of this function for a bit of bulletproofing.

+  uint8_t getVersion() const {
+  return VersionAndFlags & 0x07;
+  }
+  uint8_t getFlags() const {
+  return (VersionAndFlags >> 3) & 0x1f;
+  }
+  uint8_t getFrameRegister() const {
+  return FrameRegisterAndOffset & 0x0f;
+  }
+  uint8_t getFrameOffset() const {
+  return (FrameRegisterAndOffset >> 4) & 0x0f;
+  }

Indent the bodies here.

Regarding 02-win64eh.diff, it looks like there are a number of
different changes smashed into a single patch. Your original patch
only contains changes from uint64_t to uint32_t. What was the problem
that was fixed by that? I think it should be a separate patch so it is
clear what is being changed and why. The change from bitfields to
functions should be broken off into its own patch too. Breaking
changes up like this is fundamental to LLVM's incremental development
style; the goal of each patch is to communicate a single focused,
commitable change to the reviewer.

   void *getLanguageSpecificData() {
-    return reinterpret_cast<void *>(&unwindCodes[(numCodes+1) & ~1]);
+    return reinterpret_cast<void *>(&UnwindCodes[(NumCodes+1) & ~1]);
   }

You could clean up this function to make this be templated on the
return type and have it directly return a pointer to the desired type,
which should greatly reduce the number of reinterpret_cast<>'s in the
surrounding code. You can do this in a separate patch (now or in the
future).

So in summary, (as far as I understand what is trying to be
accomplished in this patch set) I think that you should break these
changes into three patches:

1. fix the uint64_t to uint32_t size issues (include a good commit
message explaining why this is needed)
2. cleanup the bitfields to functions
3. The changes in 01-llvm-objdump.diff (with the small fixes as requested above)
[4. (optional) make getLanguageSpecificData be templated on the return
type and clean up the reinterpret casts (you can put this anywhere in
the patch set)]

Also, I see that you are using git. I recommend that you use
git-format-patch to make the patches, since that will preserve your
commit message (otherwise the committer might make one up that you
don't like). Don't be afraid to write nice long commit messages
explaining the change and why it is needed ;)

-- Sean Silva

On Sun, Nov 25, 2012 at 11:01 AM, Kai <kai at redstar.de> wrote:
> Hi Sean!
>
> The hint to use ArrayRef's turned out to be really good. I have the feeling
> that a lot of the complexity is gone.
>
> I also implemented your other suggestions.
>
> The type casts are necessary, otherwise the output is garbage. I think the
> ulittle8_t value is interpreted as a char, not a number. The casts make sure
> that the value is formatted as a number. However I changed the old-style
> cast to a static_cast<>.
>
> Regards
> Kai
>
>
> On 24.11.2012 21:47, Sean Silva wrote:
>>
>> This is a lot better!
>>
>> +unsigned printUnwindCode(const UnwindCode *UnwindCodes, unsigned
>> SlotsAvail) {
>>
>> Use ArrayRef's instead of pointer-and-length. Here and in any callers.
>>
>> +void printCOFFUnwindCode(const UnwindCode *UnwindCodes, unsigned
>> NumCodes) {
>> +  for (unsigned i = 0; i < NumCodes; ) {
>> +    i += printUnwindCode(&UnwindCodes[i], NumCodes - i);
>> +  }
>> +}
>>
>> The `NumCodes-i` here is not immediately obvious; better to use an
>> ArrayRef. How about
>>
>> void printCOFFUnwindCode(ArrayRef<UnwindCode> UCs) {
>>    for (const UnwindCode *I = UCs.begin(), *E = UCs.end(); I < E;)
>>     I += printUnwindCode(ArrayRef<UnwindCode>(I, E));
>> }
>>
>> ?
>>
>> +  switch (UnwindCodes[0].getUnwindOp()) {
>> +  default: llvm_unreachable("Invalid unwind code");
>> +  case UOP_PushNonVol:
>> +  case UOP_AllocSmall:
>> +  case UOP_SetFPReg:
>> +  case UOP_PushMachFrame:
>> +    UsedSlots = 1;
>> +    break;
>> +  case UOP_SaveNonVol:
>> +  case UOP_SaveXMM128:
>> +    UsedSlots = 2;
>> +    break;
>> +  case UOP_SaveNonVolBig:
>> +  case UOP_SaveXMM128Big:
>> +    UsedSlots = 3;
>> +    break;
>> +  case UOP_AllocLarge:
>> +    UsedSlots =  (UnwindCodes[0].getOpInfo() == 0) ? 2 : 3;
>> +    break;
>> +  }
>>
>> How about breaking this out into a static function getNumUsedSlots()?
>> Same for the other switch in this function. You can probably then
>> inline what remains of this function (which should be very little)
>> into the loop in printCOFFUnwindCode().
>>
>>
>> +namespace {
>>
>> For functions, use static. See
>> <http://llvm.org/docs/CodingStandards.html#anonymous-namespaces>.
>>
>> +  if (UsedSlots > SlotsAvail) {
>> +    outs() << "Unwind data corrupted\n";
>> +    return SlotsAvail;
>> +  }
>>
>> Can you diagnose the situation better? Maybe say something like
>> "encountered unwind op <...> of size <N> slots, but only <X> remaining
>> in buffer".
>>
>> +        outs() << "  Size of prolog: " << (int) UI->PrologSize << "\n";
>> +        outs() << "  Number of Codes: " << (int) UI->NumCodes << "\n";
>>
>> Why the casts here?
>>
>> +    if (Name == ".pdata") {
>> +      Pdata = Obj->getCOFFSection(SI);
>>
>> I feel like you could use an early exit/continue to reduce the nesting
>> here.
>>
>> +      unsigned i = 0;
>> +      while ((Contents.size() - i) >= sizeof(RuntimeFunction)) {
>> +        const RuntimeFunction *RF =
>> +            reinterpret_cast<const RuntimeFunction *>(Contents.data() +
>> i);
>>
>> This loop is wacky. Can you rewrite it as a regular for loop over
>> RuntimeFunction*'s? It's really hard to tell what is actually being
>> iterated over.
>>
>> +  resolveSymbolName(Rels, Offset, Sym);
>>
>> not checking the error code?
>>
>> -- Sean Silva
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>