[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)
Balazs Benics via cfe-commits
cfe-commits at lists.llvm.org
Mon Jun 3 04:52:13 PDT 2024
================
@@ -717,18 +717,71 @@ const ExplodedNode *StreamChecker::getAcquisitionSite(const ExplodedNode *N,
return nullptr;
}
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext &C,
+ const CallEvent &Call, const MemRegion *Buffer,
+ QualType ElemType, SVal StartIndex, SVal Count) {
+ if (!llvm::isa_and_nonnull<SubRegion>(Buffer))
+ return State;
+
+ auto UnboxAsInt = [&C, &State](SVal V) -> std::optional<int64_t> {
+ auto &SVB = C.getSValBuilder();
+ if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
+ return Int->tryExtValue();
+ return std::nullopt;
+ };
+
+ auto StartIndexVal = UnboxAsInt(StartIndex);
+ auto CountVal = UnboxAsInt(Count);
+
+ // FIXME: Maybe we could make this more generic, and expose this by the
+ // 'invalidateRegions' API. After doing so, it might make sense to make this
+ // limit configurable.
+ constexpr int MaxInvalidatedElementsLimit = 64;
+ if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
+ return State->invalidateRegions({loc::MemRegionVal{Buffer}},
+ Call.getOriginExpr(), C.blockCount(),
+ C.getLocationContext(),
+ /*CausesPointerEscape=*/false);
+ }
+
+ constexpr auto DoNotInvalidateSuperRegion =
+ RegionAndSymbolInvalidationTraits::InvalidationKinds::
+ TK_DoNotInvalidateSuperRegion;
+
+ auto &RegionManager = Buffer->getMemRegionManager();
+ SmallVector<SVal> EscapingVals;
+ EscapingVals.reserve(*CountVal);
+
+ RegionAndSymbolInvalidationTraits ITraits;
+ for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {
----------------
steakhal wrote:
Yes, this is a problem.
To properly invalidate element-wise, I must bind the correctly typed conjured symbols so that subsequent lookups in the store would hit the entry.
Consequently, I can't consider the buffer as a char array and invalidate each chars in the range.
I must use `ElementRegions`, thus I need properly aligned item-wise iteration and invalidation.
So, I could try to convert the element size and item count into a start byte offset and length in bytes. After this I could check if the start bytes offset would land on an item boundary and if the length would cover a whole element object.
This way we could cover the case like:
```
int buffer[100]; // assuming 4 byte ints.
fread(buffer + 1, 2, 10, file); // Reads 20 bytes of data, which is dividable by 4, thus we can think of the 'buffer[1:6]' elements as individually invalidated.
```
WDYT?
https://github.com/llvm/llvm-project/pull/93408
More information about the cfe-commits
mailing list