[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
Kristóf Umann via cfe-commits
cfe-commits at lists.llvm.org
Tue Jul 16 05:49:05 PDT 2024
https://github.com/Szelethus updated https://github.com/llvm/llvm-project/pull/97407
>From 9fed2b7dc5395f487cb91c10eb076bb87e05e9b6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Krist=C3=B3f=20Umann?= <dkszelethus at gmail.com>
Date: Tue, 2 Jul 2024 12:58:19 +0200
Subject: [PATCH 1/4] [analyzer][NFC] Add some docs for LazyCompoundValue
Yes, I basically copy-pasted some posts from discord and Artem's book,
but these make for a rather decent docs.
---
.../StaticAnalyzer/Core/PathSensitive/SVals.h | 39 +++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index 3a4b087257149..e44cda50ef21d 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -326,6 +326,12 @@ class LocAsInteger : public NonLoc {
static bool classof(SVal V) { return V.getKind() == LocAsIntegerKind; }
};
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// Source: https://github.com/haoNoQ/clang-analyzer-guide, v0.1, section 5.3.2.
class CompoundVal : public NonLoc {
friend class ento::SValBuilder;
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
};
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+/// * a reference to the TypedValueRegion being snapshotted (yes, it is always
+/// typed), and also
+/// * a copy of the whole Store object, obtained from the ProgramState in
+/// which it was created.
+///
+/// Essentially, nonloc::LazyCompoundVal is a performance optimization for the
+/// analyzer. Because Store is immutable, creating a nonloc::LazyCompoundVal is
+/// a very cheap operation. Note that the Store contains all region bindings in
+/// the program state, not only related to the region. Later, if necessary, such
+/// value can be unpacked -- eg. when it is assigned to another variable.
+///
+/// Source: https://github.com/haoNoQ/clang-analyzer-guide, v0.1, section 5.3.2.
+///
+/// If you ever need to see if a LazyCompoundVal is fully or partially
+/// (un)initialized, you can iterBindings(). This is non-typechecked lookup
+/// (i.e., you cannot figure out which specific sub-region is initialized by the
+/// value you look at, you only get a byte offset). You can also improve
+/// iterBindings() to make it possible to restrict the iteration to a single
+/// cluster, because within the LazyCompoundVal’s Store only the cluster that
+/// corresponds to the LazyCompoundVal’s parent region is relevant.
+///
+/// Source: https://discourse.llvm.org/t/analyzer-for-the-undefined-value-of-array-element-the-tracking-information-is-incomplete/49372/2
class LazyCompoundVal : public NonLoc {
friend class ento::SValBuilder;
>From 1980cea32848f889d63cc61444d63a00506b52f5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Krist=C3=B3f=20Umann?= <dkszelethus at gmail.com>
Date: Tue, 16 Jul 2024 14:39:12 +0200
Subject: [PATCH 2/4] Apply suggestions from code review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Co-authored-by: Artem Dergachev <noqnoqneo at gmail.com>
Co-authored-by: Donát Nagy <donat.nagy at ericsson.com>
Co-authored-by: Balazs Benics <benicsbalazs at gmail.com>
---
.../StaticAnalyzer/Core/PathSensitive/SVals.h | 44 +++++++++++--------
1 file changed, 25 insertions(+), 19 deletions(-)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index e44cda50ef21d..0ddc272de2891 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -352,21 +352,21 @@ class CompoundVal : public NonLoc {
static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
};
-/// The simplest example of a concrete compound value is nonloc::CompoundVal,
-/// which represents a concrete r-value of an initializer-list or a string.
-/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
-/// literal.
-///
-/// However, there is another compound value used in the analyzer, which appears
-/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// While nonloc::CompoundVal covers a few simple use cases, the nonloc::LazyCompoundVal
+/// is a more performant and flexible way to represent an rvalue of record type,
+/// so it shows up much more frequently during analysis. This
/// value is an r-value that represents a snapshot of any structure "as a whole"
/// at a given moment during the analysis. Such value is already quite far from
-/// being re- ferred to as "concrete", as many fields inside it would be unknown
+/// being referred to as "concrete", as many fields inside it would be unknown
/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
/// * a reference to the TypedValueRegion being snapshotted (yes, it is always
/// typed), and also
-/// * a copy of the whole Store object, obtained from the ProgramState in
-/// which it was created.
+/// * a reference to the whole Store object, obtained from the ProgramState in
+/// which the nonloc::LazyCompoundVal was created.
+///
+/// Note that the old ProgramState and its Store is kept alive during the
+/// analysis because these are immutable functional data structures and each new
+/// Store value is represented as "earlier Store" + "additional binding".
///
/// Essentially, nonloc::LazyCompoundVal is a performance optimization for the
/// analyzer. Because Store is immutable, creating a nonloc::LazyCompoundVal is
@@ -374,15 +374,21 @@ class CompoundVal : public NonLoc {
/// the program state, not only related to the region. Later, if necessary, such
/// value can be unpacked -- eg. when it is assigned to another variable.
///
-/// Source: https://github.com/haoNoQ/clang-analyzer-guide, v0.1, section 5.3.2.
-///
-/// If you ever need to see if a LazyCompoundVal is fully or partially
-/// (un)initialized, you can iterBindings(). This is non-typechecked lookup
-/// (i.e., you cannot figure out which specific sub-region is initialized by the
-/// value you look at, you only get a byte offset). You can also improve
-/// iterBindings() to make it possible to restrict the iteration to a single
-/// cluster, because within the LazyCompoundVal’s Store only the cluster that
-/// corresponds to the LazyCompoundVal’s parent region is relevant.
+/// If you ever need inspect the contents of the LazyCompoundVal, you can use
+/// StoreManager::iterBindings(). It'll iterate through all values in the Store,
+/// but you're only interested in the ones that belong to LazyCompoundVal::getRegion();
+/// other bindings are immaterial.
+///
+/// NOTE: LazyCompoundVal::getRegion() itself is also immaterial. It is only an
+/// implementation detail. LazyCompoundVal represents only the rvalue, the data (known or unknown)
+/// that *was* stored in that region *at some point in the past*. The region should not be used for
+/// any purpose other than figuring out what part of the frozen Store you're interested in.
+/// The value does not represent the *current* value of that region. Sometimes it may, but
+/// this should not be relied upon. Instead, if you want to figure out what region it represents,
+/// you typically need to see where you got it from in the first place.
+/// The region is absolutely not analogous to the C++ "this" pointer.
+/// It is also not a valid way to "materialize" the prvalue into a glvalue in C++, because the region
+/// represents the *old* storage (sometimes very old), not the *future* storage.
///
/// Source: https://discourse.llvm.org/t/analyzer-for-the-undefined-value-of-array-element-the-tracking-information-is-incomplete/49372/2
class LazyCompoundVal : public NonLoc {
>From 23b79f32ae6e98d8f45702f7d7c712a4e6c418f9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Krist=C3=B3f=20Umann?= <dkszelethus at gmail.com>
Date: Tue, 16 Jul 2024 14:45:20 +0200
Subject: [PATCH 3/4] Remove sources, format comments
---
.../StaticAnalyzer/Core/PathSensitive/SVals.h | 45 +++++++++----------
1 file changed, 22 insertions(+), 23 deletions(-)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index 0ddc272de2891..172f1c01c2cf6 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -330,8 +330,6 @@ class LocAsInteger : public NonLoc {
/// which represents a concrete r-value of an initializer-list or a string.
/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
/// literal.
-///
-/// Source: https://github.com/haoNoQ/clang-analyzer-guide, v0.1, section 5.3.2.
class CompoundVal : public NonLoc {
friend class ento::SValBuilder;
@@ -352,13 +350,14 @@ class CompoundVal : public NonLoc {
static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
};
-/// While nonloc::CompoundVal covers a few simple use cases, the nonloc::LazyCompoundVal
-/// is a more performant and flexible way to represent an rvalue of record type,
-/// so it shows up much more frequently during analysis. This
-/// value is an r-value that represents a snapshot of any structure "as a whole"
-/// at a given moment during the analysis. Such value is already quite far from
-/// being referred to as "concrete", as many fields inside it would be unknown
-/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+/// While nonloc::CompoundVal covers a few simple use cases,
+/// nonloc::LazyCompoundVal is a more performant and flexible way to represent
+/// an rvalue of record type, so it shows up much more frequently during
+/// analysis. This value is an r-value that represents a snapshot of any
+/// structure "as a whole" at a given moment during the analysis. Such value is
+/// already quite far from being referred to as "concrete", as many fields
+/// inside it would be unknown or symbolic. nonloc::LazyCompoundVal operates by
+/// storing two things:
/// * a reference to the TypedValueRegion being snapshotted (yes, it is always
/// typed), and also
/// * a reference to the whole Store object, obtained from the ProgramState in
@@ -376,21 +375,21 @@ class CompoundVal : public NonLoc {
///
/// If you ever need inspect the contents of the LazyCompoundVal, you can use
/// StoreManager::iterBindings(). It'll iterate through all values in the Store,
-/// but you're only interested in the ones that belong to LazyCompoundVal::getRegion();
-/// other bindings are immaterial.
-///
-/// NOTE: LazyCompoundVal::getRegion() itself is also immaterial. It is only an
-/// implementation detail. LazyCompoundVal represents only the rvalue, the data (known or unknown)
-/// that *was* stored in that region *at some point in the past*. The region should not be used for
-/// any purpose other than figuring out what part of the frozen Store you're interested in.
-/// The value does not represent the *current* value of that region. Sometimes it may, but
-/// this should not be relied upon. Instead, if you want to figure out what region it represents,
-/// you typically need to see where you got it from in the first place.
-/// The region is absolutely not analogous to the C++ "this" pointer.
-/// It is also not a valid way to "materialize" the prvalue into a glvalue in C++, because the region
-/// represents the *old* storage (sometimes very old), not the *future* storage.
+/// but you're only interested in the ones that belong to
+/// LazyCompoundVal::getRegion(); other bindings are immaterial.
///
-/// Source: https://discourse.llvm.org/t/analyzer-for-the-undefined-value-of-array-element-the-tracking-information-is-incomplete/49372/2
+/// NOTE: LazyCompoundVal::getRegion() itself is also immaterial. It is only an
+/// implementation detail. LazyCompoundVal represents only the rvalue, the data
+/// (known or unknown) that *was* stored in that region *at some point in the
+/// past*. The region should not be used for any purpose other than figuring out
+/// what part of the frozen Store you're interested in. The value does not
+/// represent the *current* value of that region. Sometimes it may, but this
+/// should not be relied upon. Instead, if you want to figure out what region it
+/// represents, you typically need to see where you got it from in the first
+/// place. The region is absolutely not analogous to the C++ "this" pointer. It
+/// is also not a valid way to "materialize" the prvalue into a glvalue in C++,
+/// because the region represents the *old* storage (sometimes very old), not
+/// the *future* storage.
class LazyCompoundVal : public NonLoc {
friend class ento::SValBuilder;
>From ac2cf97fc63a4f6db2cfba5b38e8daeeb3bd5c43 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Krist=C3=B3f=20Umann?= <dkszelethus at gmail.com>
Date: Tue, 16 Jul 2024 14:48:51 +0200
Subject: [PATCH 4/4] Move LazyCompoundVal::getRegion() note to the actual
function
---
.../StaticAnalyzer/Core/PathSensitive/SVals.h | 26 ++++++++++---------
1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index 172f1c01c2cf6..0aeb4a2e4ca81 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -378,18 +378,8 @@ class CompoundVal : public NonLoc {
/// but you're only interested in the ones that belong to
/// LazyCompoundVal::getRegion(); other bindings are immaterial.
///
-/// NOTE: LazyCompoundVal::getRegion() itself is also immaterial. It is only an
-/// implementation detail. LazyCompoundVal represents only the rvalue, the data
-/// (known or unknown) that *was* stored in that region *at some point in the
-/// past*. The region should not be used for any purpose other than figuring out
-/// what part of the frozen Store you're interested in. The value does not
-/// represent the *current* value of that region. Sometimes it may, but this
-/// should not be relied upon. Instead, if you want to figure out what region it
-/// represents, you typically need to see where you got it from in the first
-/// place. The region is absolutely not analogous to the C++ "this" pointer. It
-/// is also not a valid way to "materialize" the prvalue into a glvalue in C++,
-/// because the region represents the *old* storage (sometimes very old), not
-/// the *future* storage.
+/// NOTE: LazyCompoundVal::getRegion() itself is also immaterial (see the actual
+/// method docs for details).
class LazyCompoundVal : public NonLoc {
friend class ento::SValBuilder;
@@ -399,6 +389,18 @@ class LazyCompoundVal : public NonLoc {
}
public:
+ /// This function itself is immaterial. It is only an implementation detail.
+ /// LazyCompoundVal represents only the rvalue, the data (known or unknown)
+ /// that *was* stored in that region *at some point in the past*. The region
+ /// should not be used for any purpose other than figuring out what part of
+ /// the frozen Store you're interested in. The value does not represent the
+ /// *current* value of that region. Sometimes it may, but this should not be
+ /// relied upon. Instead, if you want to figure out what region it represents,
+ /// you typically need to see where you got it from in the first place. The
+ /// region is absolutely not analogous to the C++ "this" pointer. It is also
+ /// not a valid way to "materialize" the prvalue into a glvalue in C++,
+ /// because the region represents the *old* storage (sometimes very old), not
+ /// the *future* storage.
LLVM_ATTRIBUTE_RETURNS_NONNULL
const LazyCompoundValData *getCVData() const {
return castDataAs<LazyCompoundValData>();
More information about the cfe-commits
mailing list