[clang] [llvm] Adapted MemRegion::getDescriptiveName to handle ElementRegions (PR #85104)

via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 14 06:05:25 PDT 2024


================
@@ -720,14 +720,20 @@ std::string MemRegion::getDescriptiveName(bool UseQuotes) const {
       CI->getValue().toString(Idx);
       ArrayIndices = (llvm::Twine("[") + Idx.str() + "]" + ArrayIndices).str();
     }
-    // If not a ConcreteInt, try to obtain the variable
-    // name by calling 'getDescriptiveName' recursively.
-    else {
-      std::string Idx = ER->getDescriptiveName(false);
-      if (!Idx.empty()) {
-        ArrayIndices = (llvm::Twine("[") + Idx + "]" + ArrayIndices).str();
+    // Index is a SymbolVal.
+    else if (auto SI = ER->getIndex().getAs<nonloc::SymbolVal>()) {
+      if (SymbolRef SR = SI->getAsSymbol()) {
+        if (const MemRegion *OR = SR->getOriginRegion()) {
+          std::string Idx = OR->getDescriptiveName(false);
+          ArrayIndices = (llvm::Twine("[") + Idx + "]" + ArrayIndices).str();
+        }
       }
     }
+    // Index is neither a ConcreteInt nor SymbolVal, give up and return.
+    else {
+      assert(false && "we should have a descriptive name");
+      return "";
+    }
----------------
NagyDonat wrote:

```suggestion
    // Index is symbolic, but may have a descriptive name
    else {
      auto SI = ER->getIndex().getAs<nonloc::SymbolVal>();
      if (!SI)
        return "";
        
      const MemRegion *OR = SI->getAsSymbol()->getOriginRegion();
      if (!OR)
        return "";

      std::string Idx = OR->getDescriptiveName(false);
      if (Idx.empty())
        return "";
      
      ArrayIndices = (llvm::Twine("[") + Idx + "]" + ArrayIndices).str();
    }
```

I realize it just now, but there are several failure cases that are not handled/handled incorrectly within this code:
(1) `ER->getIndex().getAs<nonloc::SymbolVal>()` may return `nullptr` in the unusual but perfectly legal case when the index is a symbolic value that is e.g. a `nonloc::LocAsInteger` (a pointer converted to an integer type).

(2) `SymbolVal::getAsSymbol()` returns nonnull, but `SI->getAsSymbol()->getOriginRegion();` may return `nullptr` when we have a symbolic value that doesn't originate as the value of a memory region (e.g. it's a conjured symbol, the integer returned by an opaque function).

(3) Finally `OR->getDescriptiveName(false)` may return an empty string when the origin region doesn't have a descriptive name (e.g. it's a symbolic region behind an opaque pointer, a nameless heap area returned by `malloc` etc.).

In each of these three situations the right thing to do is returning an empty string, because `getDescriptiveName` already signals failure by returning `""` when it cannot find a natural descriptive name. (And code that uses this function checks for an empty return value.)

Disclaimer: I did not test the code that I'm suggesting here; there may be typos in it. Also, it would be good to have testcases that cover the situations (1)-(3) and verify that this method doesn't produce incorrect output under them.

https://github.com/llvm/llvm-project/pull/85104


More information about the cfe-commits mailing list