[Lldb-commits] [lldb] Make ValueObject::Cast work for casts from smaller to larger structs in the cases where this currently can work. (PR #84588)

Fri Mar 8 16:39:06 PST 2024

https://github.com/jimingham created https://github.com/llvm/llvm-project/pull/84588

The ValueObjectConstResult classes that back expression result variables play a complicated game with where the data for their values is stored.  They try to make it appear as though they are still tied to the memory in the target into which their value was written when the expression was run, but they also keep a copy in the host, which they use after the value is made.  (Expression results are "history values", so that's how we make sure they have "the value at the time of the expression".)

However, that means that if you ask them to cast themselves to a value bigger than their original size, they don't have a way to get more memory for that purpose.  The same is true of ValueObjects backed by DataExtractors: in general, a data extractor doesn't know how to get more data than it was made with.
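
As a rough illustration of why a captured copy cannot grow, here is a minimal Python sketch.  It does not use lldb at all: `target_memory`, `host_copy`, `read_from_host_copy` and `read_from_target` are hypothetical names that model the inferior's memory and a host-side snapshot taken at the smaller size.

```python
import struct

# Hypothetical stand-in for inferior memory: three ints laid out back to back.
target_memory = bytearray(struct.pack("<3i", 33, 44, 55))

SMALL_SIZE = struct.calcsize("<2i")  # size of the value as originally captured
BIG_SIZE = struct.calcsize("<3i")    # size of the type we would like to cast to

# Host-side snapshot taken when the value was made; it holds SMALL_SIZE bytes
# and has no link back to where those bytes came from.
host_copy = bytes(target_memory[:SMALL_SIZE])


def read_from_host_copy(size):
    """Read from the snapshot; there is nothing to read past its end."""
    if size > len(host_copy):
        raise ValueError(
            "host copy holds only %d bytes, %d requested" % (len(host_copy), size)
        )
    return host_copy[:size]


def read_from_target(address, size):
    """Read from the modeled live memory, which can supply the extra bytes."""
    return bytes(target_memory[address:address + size])
```

In this model only the read that goes back to `target_memory` can supply the third int; the snapshot can only refuse, which is the situation the host-backed value objects are in.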

The only place where we actually ask ValueObjects to sample outside their captured bounds is when you do ValueObject::Cast from one structure type to a bigger structure type.  In https://reviews.llvm.org/D153657 I handled this by just disallowing casts from one structure value to a larger one.  My reasoning at the time was that the use case for this was to support discriminator-based C inheritance schemes, and you can't directly cast values in C, only pointers, so this was not a natural way to handle those types.  It seemed logical that since you would have had to start with pointers in the implementation, that's how you would write your lldb introspection code as well.
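
For readers unfamiliar with the pattern, the following sketch uses Python's `ctypes` to mimic the pointer-based cast a C program would do.  The struct layouts mirror the `Container`, `MyStruct` and `MyBiggerStruct` types in this patch's test program, assuming `int a, b` and `int a, b, c` members; the variable names `small` and `as_big` are illustrative only.

```python
import ctypes


class MyStruct(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_int)]


class MyBiggerStruct(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_int), ("c", ctypes.c_int)]


class Data(ctypes.Union):
    _fields_ = [("small", MyStruct), ("big", MyBiggerStruct)]


class Container(ctypes.Structure):
    _fields_ = [("discriminator", ctypes.c_int), ("data", Data)]


my_container = Container()
# Store the bigger variant in the union, as the test's main.c does.
my_container.data.big = MyBiggerStruct(33, 44, 55)

# View the storage through the smaller type first ...
small = my_container.data.small

# ... then reinterpret a pointer to it as the bigger type.  The value itself
# is never converted; only the pointer is cast, as it would be in C.
as_big = ctypes.cast(
    ctypes.pointer(small), ctypes.POINTER(MyBiggerStruct)
).contents
```

The third member is only reachable because the pointer still refers to the container's own storage, which is the property the FindVariable and CreateValueFromAddress cases in the test rely on.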

Famous last words...

Turns out there are some heavy users of the SB APIs who were relying on this working, and this is a behavior change, so this patch makes this work in the cases where it used to work before, while still disallowing the cases we don't know how to support.

Note that if you had done this Cast operation before with either expression results or value objects from data extractors, lldb would not have returned the correct results, so the cases this patch outlaws are ones that actually produce invalid results.  So nobody should be using Cast in these cases, or if they were, this patch will point out the bug they hadn't yet noticed.
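
The rule the patch ends up with can be modeled in a few lines of Python.  This is only a sketch of the condition in the diff below, not lldb code: the `ValueType` enum here is a stand-in for `Value::ValueType`, and `cast_is_allowed` is a hypothetical helper name.

```python
from enum import Enum, auto


class ValueType(Enum):
    """Stand-in for lldb's Value::ValueType; only the names matter here."""
    SCALAR = auto()
    FILE_ADDRESS = auto()
    LOAD_ADDRESS = auto()
    HOST_ADDRESS = auto()


def cast_is_allowed(target_size, source_size, value_type):
    """Model of the check: same-or-smaller casts always pass, and a larger
    cast passes only when the value is backed by a load address."""
    return target_size <= source_size or value_type is ValueType.LOAD_ADDRESS
```

With 8-byte and 12-byte structs, `cast_is_allowed(12, 8, ValueType.LOAD_ADDRESS)` is true while `cast_is_allowed(12, 8, ValueType.HOST_ADDRESS)` is false, matching the FindVariable and expression-result cases exercised by the test.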

>From e85c0108f3508537b4d2d9fe9120dd07ebd00361 Mon Sep 17 00:00:00 2001
From: Jim Ingham <jingham at apple.com>
Date: Fri, 8 Mar 2024 15:39:50 -0800
Subject: [PATCH] Make ValueObject::Cast work for casts from smaller to larger
 structs in the cases where this currently can work.

---
 lldb/source/Core/ValueObject.cpp              | 20 ++++--
 .../test/API/python_api/value/TestValueAPI.py | 68 ++++++++++++++++---
 lldb/test/API/python_api/value/main.c         | 15 +++-
 3 files changed, 89 insertions(+), 14 deletions(-)

diff --git a/lldb/source/Core/ValueObject.cpp b/lldb/source/Core/ValueObject.cpp
index d813044d02ff5f..f39bd07a255366 100644
--- a/lldb/source/Core/ValueObject.cpp
+++ b/lldb/source/Core/ValueObject.cpp
@@ -2744,8 +2744,19 @@ ValueObjectSP ValueObject::DoCast(const CompilerType &compiler_type) {
 
 ValueObjectSP ValueObject::Cast(const CompilerType &compiler_type) {
   // Only allow casts if the original type is equal or larger than the cast
-  // type.  We don't know how to fetch more data for all the ConstResult types,
-  // so we can't guarantee this will work:
+  // type, unless we know this is a load address.  Getting the size wrong for
+  // a host side storage could leak lldb memory, so we absolutely want to 
+  // prevent that.  We may not always get the right value: for instance, an
+  // expression result value that's copied into a storage location in the
+  // target may not have copied enough memory.  I'm not trying to fix that
+  // here, I'm just making Cast from a smaller to a larger possible in all the
+  // cases where that doesn't risk making a Value out of random lldb memory.
+  // You have to check the ValueObject's Value for the address types, since
+  // ValueObjects that use live addresses will tell you they fetch data from the
+  // live address, but once they are made, they actually don't.
+  // FIXME: Can we make ValueObject's with a live address fetch "more data" from
+  // the live address if it is still valid?
+
   Status error;
   CompilerType my_type = GetCompilerType();
 
@@ -2753,9 +2764,10 @@ ValueObjectSP ValueObject::Cast(const CompilerType &compiler_type) {
       = ExecutionContext(GetExecutionContextRef())
           .GetBestExecutionContextScope();
   if (compiler_type.GetByteSize(exe_scope)
-      <= GetCompilerType().GetByteSize(exe_scope)) {
+      <= GetCompilerType().GetByteSize(exe_scope) 
+      || m_value.GetValueType() == Value::ValueType::LoadAddress)
         return DoCast(compiler_type);
-  }
+
   error.SetErrorString("Can only cast to a type that is equal to or smaller "
                        "than the orignal type.");
 
diff --git a/lldb/test/API/python_api/value/TestValueAPI.py b/lldb/test/API/python_api/value/TestValueAPI.py
index 18376f76e3c850..512100912d6fe7 100644
--- a/lldb/test/API/python_api/value/TestValueAPI.py
+++ b/lldb/test/API/python_api/value/TestValueAPI.py
@@ -148,14 +148,66 @@ def test(self):
 
         # Test some other cases of the Cast API.  We allow casts from one struct type
         # to another, which is a little weird, but we don't support casting from a
-        # smaller type to a larger as we often wouldn't know how to get the extra data:
-        val_f = target.EvaluateExpression("f")
-        bad_cast = val_s.Cast(val_f.GetType())
-        self.assertFailure(
-            bad_cast.GetError(),
-            "Can only cast to a type that is equal to or smaller than the orignal type.",
-        )
-        weird_cast = val_f.Cast(val_s.GetType())
+        # smaller type to a larger when the underlying data is not in the inferior,
+        # since then we have no way to fetch the out-of-bounds values.
+        # For an expression that references a variable, or a FindVariable result,
+        # or an SBValue made from an address and a type, we can get back to the target,
+        # so those will work.  Make sure they do and get the right extra values as well.
+
+        # We're casting everything to the type of "f", so get that first:
+        f_var = frame0.FindVariable("f")
+        self.assertSuccess(f_var.error, "Got f")
+        bigger_type = f_var.GetType()
+
+        # First try a value that we got from FindVariable
+        container = frame0.FindVariable("my_container")
+        self.assertSuccess(container.error, "Found my_container")
+        fv_small = container.GetValueForExpressionPath(".data.small")
+        self.assertSuccess(fv_small.error, "Found small in my_container")
+        fv_cast = fv_small.Cast(bigger_type)
+        self.assertSuccess(fv_cast.error, "Can cast up from FindVariable")
+        child_checks = [
+            ValueCheck(name="a", value="33", type="int"),
+            ValueCheck(name="b", value="44", type="int"),
+            ValueCheck(name="c", value="55", type="int"),
+        ]
+        cast_check = ValueCheck(type=bigger_type.name, children=child_checks)
+
+        # Now try one we made with expr.  This one should fail, because expr
+        # stores the "canonical value" in host memory, and doesn't know how
+        # to augment that from the live address.
+        expr_cont = frame0.EvaluateExpression("my_container")
+        self.assertSuccess(expr_cont.error, "Got my_container by expr")
+        expr_small = expr_cont.GetValueForExpressionPath(".data.small")
+        self.assertSuccess(expr_small.error, "Got small by expr")
+        expr_cast = expr_small.Cast(bigger_type)
+        self.assertFailure(expr_cast.error, msg="Cannot cast expr result")
+
+        # Now try one we made with CreateValueFromAddress.  That will succeed
+        # because this directly tracks the inferior memory.
+        small_addr = fv_small.addr
+        self.assertTrue(small_addr.IsValid())
+        small_type = fv_small.GetType()
+        vfa_small = target.CreateValueFromAddress(
+            "small_from_addr", small_addr, small_type
+        )
+        self.assertSuccess(vfa_small.error, "Made small from address")
+        vfa_cast = vfa_small.Cast(bigger_type)
+        self.assertSuccess(vfa_cast.error, "Made a cast from vfa_small")
+        cast_check.check_value(self, vfa_cast, "Cast of ValueFromAddress succeeds")
+
+        # Next try ValueObject created from data.  They should fail as there's no
+        # way to grow the data:
+        data_small = target.CreateValueFromData(
+            "small_from_data", fv_small.data, fv_small.type
+        )
+        self.assertSuccess(data_small.error, "Made a valid object from data")
+        data_cast = data_small.Cast(bigger_type)
+        self.assertFailure(data_cast.error, msg="Cannot cast data backed SBValue")
+
+        # Now check casting from a larger type to a smaller, we can always do this,
+        # so just test one case:
+        weird_cast = f_var.Cast(val_s.GetType())
         self.assertSuccess(weird_cast.GetError(), "Can cast from a larger to a smaller")
         self.assertEqual(
             weird_cast.GetChildMemberWithName("a").GetValueAsSigned(0),
diff --git a/lldb/test/API/python_api/value/main.c b/lldb/test/API/python_api/value/main.c
index 672b0df376dc5a..cdb2aa2f6147bf 100644
--- a/lldb/test/API/python_api/value/main.c
+++ b/lldb/test/API/python_api/value/main.c
@@ -22,7 +22,7 @@ const char *weekdays[5] = { "Monday",
 const char **g_table[2] = { days_of_week, weekdays };
 
 typedef int MyInt;
-
+  
 struct MyStruct
 {
   int a;
@@ -36,6 +36,15 @@ struct MyBiggerStruct
   int c;
 };
 
+struct Container
+{
+  int discriminator;
+  union Data {
+    struct MyStruct small;
+    struct MyBiggerStruct big;
+  } data;
+};
+  
 int main (int argc, char const *argv[])
 {
     uint32_t uinthex = 0xE0A35F10;
@@ -43,8 +52,10 @@ int main (int argc, char const *argv[])
 
     int i;
     MyInt a = 12345;
-    struct MyStruct s = { 11, 22 };
+    struct MyStruct s = {11, 22};
     struct MyBiggerStruct f = { 33, 44, 55 };
+    struct Container my_container;
+    my_container.data.big = f;
     int *my_int_ptr = &g_my_int;
     printf("my_int_ptr points to location %p\n", my_int_ptr);
     int *fixed_int_ptr = (int*)(void*)0xAA;