[PATCH] D124658: [analyzer] Canonicalize SymIntExpr so the RHS is positive when possible

Tomasz KamiƄski via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 4 07:15:18 PDT 2022


tomasz-kaminski-sonarsource added inline comments.


================
Comment at: clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:204
+    // subtraction/addition of the negated value.
+    if (!RHS.isNegative()) {
+      ConvertedRHS = &BasicVals.Convert(resultTy, RHS);
----------------
steakhal wrote:
> tomasz-kaminski-sonarsource wrote:
> > steakhal wrote:
> > > I would rather swap these branches though, to leave the default case (aka. this) to the end.
> > I folded the `RHS.isNegative()` into the if for the `BinaryOperator::isAssociative(op)`, as same conversion is performed in final else branch.
> I think what confused me is that a different API is used for doing the conversion.
>  - `resultIntTy.convert(RHS)`
>  - `&BasicVals.Convert(resultTy, RHS)`
> 
> Anyway, leave it as-is.
As a note, the use of different APIs was intentional. The `BasicVals` one is persisting the value, so it is safe to use ptr to it, as a consequence it is more costfull. So, I am delaying its use until I know I will need to persist the value.


================
Comment at: clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:212-219
+      llvm::APSInt ConvertedRHSValue = resultIntTy.convert(RHS);
+      // Check if the negation of the RHS is representable,
+      // i.e., the resultTy is signed, and it is not the lowest
+      // representable negative value.
+      if (ConvertedRHSValue > resultIntTy.getMinValue()) {
+        ConvertedRHS = &BasicVals.getValue(-ConvertedRHSValue);
+        op = (op == BO_Add) ? BO_Sub : BO_Add;
----------------
steakhal wrote:
> tomasz-kaminski-sonarsource wrote:
> > tomasz-kaminski-sonarsource wrote:
> > > steakhal wrote:
> > > > Somehow I miss a check for signedness here.
> > > > Why do you think it would be only triggered for signed types?
> > > > 
> > > > I have a guess, that since we already handled `x +-0`, SymIntExprs like `x - (-0)` cannot exist here, thus cannot trigger this condition spuriously. I cannot think of any ther example that could cause this misbehaving. So in that sense `ConvertedRHSValue > resultIntTy.getMinValue()` implies *at this place* that `ConvertedRHSValue.isSigned()`.
> > > > I would rather see this redundant check here to make the correctness reasoning local though.
> > > The integer representation does not have negative zeros (the standard and clang assume two's complement). However, this condition does need to check for the signedness of the types. What I mean is that if the `RHS` is negative, but `ConvertedRHSValue` the branch will trigger and we will change `x - INT_MIN` to `x + (INT_MAX + 1)U` which is ok, as a negation of `INT_MIN` is representable as an unsigned type of same or lager bit with.
> > > 
> > However, I was not able to reach this point with `RHS` being signed, and `resultTy` being unsigned. Any hints how this could be done?
> I'm not saying that I can follow this thought process. But the `clang/test/Analysis/PR49642.c` would trigger an assertion like this:
> 
> ```lang=diff
> diff --git a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
> index 088c33c8e612..7e59309228e1 100644
> --- a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
> +++ b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
> @@ -207,6 +207,16 @@ SVal SimpleSValBuilder::MakeSymIntVal(const SymExpr *LHS,
>             "number of bits as its operands.");
>  
>      llvm::APSInt ConvertedRHSValue = resultIntTy.convert(RHS);
> +    if (RHS.isSigned() && resultTy->isUnsignedIntegerOrEnumerationType()) {
> +      llvm::errs() << "LHS sym:\n";
> +      LHS->dump();
> +      llvm::errs() << "RHS integral:\n";
> +      RHS.dump();
> +      llvm::errs() << "OP: " << BinaryOperator::getOpcodeStr(op) << "\n";
> +      llvm::errs() << "result type:\n";
> +      resultTy->dump();
> +      llvm_unreachable("how is it possible??");
> +    }
>      // Check if the negation of the RHS is representable,
>      // i.e., the resultTy is signed, and it is not the lowest
>      // representable negative value.
> ```
> 
> Which can be reduced into this one:
> 
> ```lang=c
> // RUN: %clang_analyze_cc1 -Wno-implicit-function-declaration -w -verify %s \
> // RUN:   -analyzer-checker=core \
> // RUN:   -analyzer-checker=apiModeling.StdCLibraryFunctions
> 
> // expected-no-diagnostics
> 
> typedef int ssize_t;
> int write(int, const void *, unsigned long);
> unsigned c;
> void a() {
>   int b = write(0, 0, c);
>   b != 0;
>   c -= b;
>   b < 1;
>   ++c; // crash simplifySValOnce: derived_$4{conj_$1{int, LC1, S700, #1},c}  op(-)  APInt(32b, 4294967295u -1s)  :: unsigned int
> }
> ```
What I mean, is that performing normalization (op and sign switch) is always correct for the unsigned `resultInTy`, even if `RHS` is the lowest representable negative number. The code is already behaving correctly in that case (I have verified your example), as the `ConvertedRHSValue > resultIntTy.getMinValue()` is always passing in a situation when `resultIntTy.isUnsigned()` is true (zero was eliminated before), so I left simple check.

But, now I see that this is confusing, so I have updated the check to be more explicit and updated the comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124658/new/

https://reviews.llvm.org/D124658



More information about the cfe-commits mailing list