[Lldb-commits] [PATCH] D138618: [LLDB] Enable 64 bit debug/type offset

Fri Dec 16 15:17:49 PST 2022

ayermolo added a comment.

In D138618#3999154 <https://reviews.llvm.org/D138618#3999154>, @labath wrote:

> Thanks for splitting this up. We still need to figure out what to do with the first patch, but these two are looking very good now.
>
> At least, in the sense that one can clearly see what is happening -- I'd still like to have to discussion about the DIERef->user_id conversion. Essentially, I'd like to see if we can preserve the property that the conversion always produces a valid user_id. With this patch that is not true anymore, because the OSO DIERefs need to be passed through the SymbolFileDWARF::GetUID function, even though it is _very_ tempting to just call `get_id()` on them. Previously, there was no get_id function, so one had no choice but to call GetUID. If we can make sure that the OSO symbol files always construct DIERefs with the oso field populated correctly, then I think this approach would be fine (and we should be able to delete the GetUID function). Otherwise, I'd like to explore some options of keeping the DIERef->user_id conversion tightly controlled. Personally, I am still not convinced that doing the conversion in the GetUID function is a bad thing. IIUC, the main problem is the part where we do a bitwise or of the DIERef (the die offset part) and the OSO ID (`GetID() | ref.die_offset()`), but I don't see why we couldn't do something about that. Basically we could just create an inverse function of `GetOSOIndexFromUserID` (which extracts the OSO symbol file from the user id) -- and make sure the functions are close to each other and their implementation matches.

I don't think creation of ID was that tightly controlled. 
For example
oso_symfile->SetID(((uint64_t)m_cu_idx + 1ull) << 32ull);
in SymbolFileDWARFDebugMap.cpp
But on more general note all the assumptions about bit layout scattered through few places:
GetOSOIndexFromUserID
GetUID
SymbolFileDWARFDwo constructor
GetDwoNum
Encode/Decode UID
and even something like 
const dw_offset_t function_die_offset = func.GetID();
return DecodedUID{

  *dwarf, {std::nullopt, DIERef::Section::DebugInfo, dw_offset_t(uid)}};

So probably moving to something uniform to construct uid for dwo and oso cases, and extract relevant fields is a step in a right direction.

I do see your point that it's currently not quite there.

This

  if (GetDebugMapSymfile()) {
      DIERef die_ref(GetID());
      die_ref.set_die_offset(ref.die_offset());
      return die_ref.get_id();
    }

Is kind of not ideal.
In other places it does work.

  static uint32_t GetOSOIndexFromUserID(lldb::user_id_t uid) {
      llvm::Optional<uint32_t> OsoNum = DIERef(uid).oso_num();
      lldbassert(OsoNum && "Invalid OSO Index");
      return *OsoNum;
    }

  oso_symfile->SetID(DIERef(DIERef::IndexType::OSONum, m_cu_idx,
                                        DIERef::Section(0), 0)
                                     .get_id());

  llvm::Optional<uint32_t> GetDwoNum() override {
      return DIERef(GetID()).dwo_num();
    }

I guess main issue with GetUID is that we rely on an internal state of SymbolFileDWARF to

1. figure out if we are dealing with dwo or oso with check for GetDebugMapSymfile
2. get extra data GetDwoNum(), and GetID()

We can either push that logic on the caller side of things (not I deal I would think) and remove GetUID, or extend the constructor to be a bit more explicit. This way all the bit settings are still consolidated, but the logic of when to create what is still hidden in GetDebugMapSymfile.

What do you think?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138618/new/

https://reviews.llvm.org/D138618