[Lldb-commits] [PATCH] D72133: Data formatters: Look through array element typedefs

Tue Jan 7 02:27:29 PST 2020

teemperor added a comment.

In D72133#1806822 <https://reviews.llvm.org/D72133#1806822>, @clayborg wrote:

> So as long as the following are true from this patch I am ok:
>
> - if I ask for the array element type of "str" in the test that was added, it should return "MCHAR". We shouldn't be removing any typedefs from the type. The user can call SBType::GetCanonicalType() if they need to (or equivalent with internal APIs)
> - If there are no formatters for "MCHAR[]" we can fall back to "char[]".
> - If there are no formatters for "MMCHAR[]" from my example we fall back to the _first_ array formatter that supports the array of typedefs, or back to the the array of canonical types. Only the first array based formatter should be returned as the desired formatter
>
>   This patch seems to redefine what getting the element type of an array is which seems wrong to me. We just need to make the code that uses these APIs smarter.

This patch is moving the API closer to what you describe. The **current** behavior is that we strip the typedefs. The behavior we **expect** to get is that we don't strip the typedefs (which is what you describe). This patch moves the internal API closer to the **expected behavior**. The public SBAPI will be changed in a follow-up patch to be closer to the expected behavior. So no part of this patch is about adding any implicit stripping of typedefs which I think we all agree is unexpected and bad behavior. The only reason why we add the canonical type conversion to the SBAPI is to keep the bad SBAPI behavior for now (but this doesn't add any new typedef stripping, it just keeps the old one around for the SBAPI).

So from what I can tell this patch is improving the behavior by making it closer to what Greg describes (and what is correct), so it should land IMHO. The remaining existing typedef stripping can be fixed later (unless Greg feels strongly about fixing all typedef stripping it in this patch which is fine by me).

================
Comment at: lldb/source/API/SBType.cpp:215-218
+  CompilerType canonical_type =
+      m_opaque_sp->GetCompilerType(true).GetCanonicalType();
+  return LLDB_RECORD_RESULT(
+      SBType(TypeImplSP(new TypeImpl(canonical_type.GetArrayElementType()))));
----------------
jarin wrote:
> clayborg wrote:
> > Why does getting the canonical type of an array type change the array element type? I would think it wouldn't matter but it obviously must?
> Removal of typedef for element type is the existing behavior that I have been trying preserve. Raphael already suggested we should not preserve it.
> 
> For example, I get the following output for my program (both without and with my patch):
> 
> ```
> Type:    MCHAR [10]
> El type: char
> ```
> 
> Here is the script
> 
> ```
> import lldb, os
> 
> debugger = lldb.SBDebugger.Create()
> debugger.SetAsync(False)
> target = debugger.CreateTargetWithFileAndTargetTriple("df.out", \
>   "x86_64-pc-linux")
> main_bp = target.BreakpointCreateByLocation("df.cc", 11)
> process = target.LaunchSimple(None, None, os.getcwd())
> frame = process.selected_thread.GetSelectedFrame()
> var_a = process.selected_thread.GetSelectedFrame().GetValueForVariablePath("a")
> print("Type:    " + var_a.GetType().GetName())
> print("El type: " + var_a.GetType().GetArrayElementType().GetName())
> ```
> 
> If we remove the canonicalization here, we will indeed get the behavior that you want:
> 
> ```
> Type:    MCHAR [10]
> El type: MCHAR
> ```
> 
> 
> Why does getting the canonical type of an array type change the array element type? I would think it wouldn't matter but it obviously must?

It does matter because that's how canonical types work, see the Clang documentation in ASTContext.h:

```lang=c++
  /// The non-canonical version of a type may have many "decorated" versions of
  /// types.  Decorators can include typedefs, 'typeof' operators, etc. The
  /// returned type is guaranteed to be free of any of these, allowing two
  /// canonical types to be compared for exact equality with a simple pointer
  /// comparison.
```

================
Comment at: lldb/source/DataFormatters/FormatManager.cpp:245-261
+  uint64_t array_size;
+  if (compiler_type.IsArrayType(nullptr, &array_size, nullptr)) {
+    CompilerType element_type = compiler_type.GetArrayElementType();
+    if (element_type.IsTypedefType()) {
+      // Get the stripped element type and compute the stripped array type
+      // from it.
+      CompilerType deffed_array_type =
----------------
jarin wrote:
> clayborg wrote:
> > Wouldn't we want to add one for each typedef that has a formatter? Lets say we had:
> > 
> > ```
> > typedef char MCHAR;
> > typedef MCHAR MMCHAR;
> > 
> > int main() {
> >   MMCHAR str[5] = "abcd";
> >   return 0;
> > }
> > ```
> > 
> > if we had a special formatter for MCHAR that would maybe display MCHAR somehow special, would we be ok with always getting the standard C string formatter in this case? Seems like we should add one for each typedef that has a formatter and possibly also the base type?
> This does not work as expected with my patch, let me investigate why.
I tested this case when I reviewed this patch and IIRC this was caused by yet another call in GetArrayType (not sure though). But I don't think that's really blocking this patch as the old behaviour had the same flaw from what I can remember.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72133/new/

https://reviews.llvm.org/D72133