[PATCH] D124658: [analyzer] Canonicalize SymIntExpr so the RHS is positive when possible
Tomasz Kamiński via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed May 4 07:15:18 PDT 2022
tomasz-kaminski-sonarsource added inline comments.
================
Comment at: clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:204
+ // subtraction/addition of the negated value.
+ if (!RHS.isNegative()) {
+ ConvertedRHS = &BasicVals.Convert(resultTy, RHS);
----------------
steakhal wrote:
> tomasz-kaminski-sonarsource wrote:
> > steakhal wrote:
> > > I would rather swap these branches though, to leave the default case (aka. this) to the end.
> > I folded the `RHS.isNegative()` check into the `if` for `BinaryOperator::isAssociative(op)`, as the same conversion is performed in the final else branch.
> I think what confused me is that a different API is used for doing the conversion.
> - `resultIntTy.convert(RHS)`
> - `&BasicVals.Convert(resultTy, RHS)`
>
> Anyway, leave it as-is.
As a note, the use of different APIs was intentional. The `BasicVals` one persists the value, so it is safe to keep a pointer to it; as a consequence, it is more costly. So I am delaying its use until I know I will need to persist the value.
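To illustrate the trade-off, here is a minimal sketch that assumes a hypothetical interning pool. `ValuePool` and `rhsToPersist` are made up for this example and are not the real `BasicValueFactory` API. The conversion is done on a cheap local temporary first, and the pool is only touched once it is clear that a stable pointer is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a factory that persists (interns) constants.
class ValuePool {
  // std::set is node-based, so element addresses stay valid after inserts.
  std::set<std::int64_t> Storage;

public:
  // Persists V (if not already present) and returns a stable reference.
  const std::int64_t &getValue(std::int64_t V) {
    return *Storage.insert(V).first;
  }
  std::size_t size() const { return Storage.size(); }
};

// Assumed rule for this sketch only: `x op 0` folds away, so no constant
// has to be persisted in that case.
inline const std::int64_t *rhsToPersist(ValuePool &Pool, std::int32_t RHS) {
  std::int64_t Temp = RHS;     // cheap conversion into a local temporary
  if (Temp == 0)
    return nullptr;            // nothing will be stored, skip the pool
  return &Pool.getValue(Temp); // persist only now that a pointer is needed
}
```

Calling `rhsToPersist` twice with the same non-zero value returns the same pointer, while a zero operand leaves the pool untouched.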
================
Comment at: clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:212-219
+ llvm::APSInt ConvertedRHSValue = resultIntTy.convert(RHS);
+ // Check if the negation of the RHS is representable,
+ // i.e., the resultTy is signed, and it is not the lowest
+ // representable negative value.
+ if (ConvertedRHSValue > resultIntTy.getMinValue()) {
+ ConvertedRHS = &BasicVals.getValue(-ConvertedRHSValue);
+ op = (op == BO_Add) ? BO_Sub : BO_Add;
----------------
steakhal wrote:
> tomasz-kaminski-sonarsource wrote:
> > tomasz-kaminski-sonarsource wrote:
> > > steakhal wrote:
> > > > Somehow I miss a check for signedness here.
> > > > Why do you think it would be only triggered for signed types?
> > > >
> > > > I have a guess, that since we already handled `x +-0`, SymIntExprs like `x - (-0)` cannot exist here, thus cannot trigger this condition spuriously. I cannot think of any other example that could cause this to misbehave. So in that sense `ConvertedRHSValue > resultIntTy.getMinValue()` implies *at this place* that `ConvertedRHSValue.isSigned()`.
> > > > I would rather see this redundant check here to make the correctness reasoning local though.
> > > The integer representation does not have negative zeros (the standard and clang assume two's complement). However, this condition does not need to check for the signedness of the types. What I mean is that if the `RHS` is negative but `ConvertedRHSValue` is unsigned, the branch will trigger and we will change `x - INT_MIN` to `x + (INT_MAX + 1)U`, which is OK, as the negation of `INT_MIN` is representable in an unsigned type of the same or larger bit width.
> > >
> > However, I was not able to reach this point with `RHS` being signed, and `resultTy` being unsigned. Any hints how this could be done?
> I'm not saying that I can follow this thought process. But the `clang/test/Analysis/PR49642.c` would trigger an assertion like this:
>
> ```lang=diff
> diff --git a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
> index 088c33c8e612..7e59309228e1 100644
> --- a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
> +++ b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
> @@ -207,6 +207,16 @@ SVal SimpleSValBuilder::MakeSymIntVal(const SymExpr *LHS,
> "number of bits as its operands.");
>
> llvm::APSInt ConvertedRHSValue = resultIntTy.convert(RHS);
> + if (RHS.isSigned() && resultTy->isUnsignedIntegerOrEnumerationType()) {
> + llvm::errs() << "LHS sym:\n";
> + LHS->dump();
> + llvm::errs() << "RHS integral:\n";
> + RHS.dump();
> + llvm::errs() << "OP: " << BinaryOperator::getOpcodeStr(op) << "\n";
> + llvm::errs() << "result type:\n";
> + resultTy->dump();
> + llvm_unreachable("how is it possible??");
> + }
> // Check if the negation of the RHS is representable,
> // i.e., the resultTy is signed, and it is not the lowest
> // representable negative value.
> ```
>
> Which can be reduced into this one:
>
> ```lang=c
> // RUN: %clang_analyze_cc1 -Wno-implicit-function-declaration -w -verify %s \
> // RUN: -analyzer-checker=core \
> // RUN: -analyzer-checker=apiModeling.StdCLibraryFunctions
>
> // expected-no-diagnostics
>
> typedef int ssize_t;
> int write(int, const void *, unsigned long);
> unsigned c;
> void a() {
> int b = write(0, 0, c);
> b != 0;
> c -= b;
> b < 1;
> ++c; // crash simplifySValOnce: derived_$4{conj_$1{int, LC1, S700, #1},c} op(-) APInt(32b, 4294967295u -1s) :: unsigned int
> }
> ```
What I mean is that performing the normalization (op and sign switch) is always correct for an unsigned `resultIntTy`, even if `RHS` is the lowest representable negative number. The code already behaves correctly in that case (I have verified your example), as the `ConvertedRHSValue > resultIntTy.getMinValue()` check always passes when `resultIntTy.isUnsigned()` is true (zero was eliminated before), so I left the simple check.
But now I see that this is confusing, so I have updated the check to be more explicit and updated the comment.
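The reasoning above can be sketched on plain 32-bit integers. This is an illustration under assumptions, not the patch itself: `Op`, `SymInt`, `SymUInt`, `canonicalizeSigned`, `canonicalizeUnsigned`, and `eval` are hypothetical names, and fixed-width integers stand in for `llvm::APSInt`. For a signed result type the flip is skipped when the RHS is the minimum value, whereas for an unsigned result type the modular negation is always well-defined.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

enum class Op { Add, Sub };

inline Op flip(Op O) { return O == Op::Add ? Op::Sub : Op::Add; }

// Hypothetical stand-ins for `sym op constant` with a 32-bit result type.
struct SymInt  { Op op; std::int32_t rhs; };
struct SymUInt { Op op; std::uint32_t rhs; };

// Signed result type: flip only when -rhs is representable.
inline SymInt canonicalizeSigned(SymInt E) {
  if (E.rhs < 0 && E.rhs > std::numeric_limits<std::int32_t>::min()) {
    E.rhs = -E.rhs;
    E.op = flip(E.op);
  }
  return E; // `x - INT32_MIN` is left unchanged
}

// Unsigned result type: a negative RHS is converted modulo 2^32 and then
// negated modulo 2^32, which is always well-defined.
inline SymUInt canonicalizeUnsigned(Op O, std::int32_t NegativeRHS) {
  assert(NegativeRHS < 0 && "sketch only handles the negative-RHS branch");
  std::uint32_t Converted = static_cast<std::uint32_t>(NegativeRHS);
  std::uint32_t Negated = static_cast<std::uint32_t>(0u - Converted);
  return SymUInt{flip(O), Negated};
}

// Evaluates the expression for a concrete x using wrapping arithmetic.
inline std::uint32_t eval(std::uint32_t X, SymUInt E) {
  return E.op == Op::Add ? static_cast<std::uint32_t>(X + E.rhs)
                         : static_cast<std::uint32_t>(X - E.rhs);
}
```

In this sketch `x - (-1)` with an unsigned result type becomes `x + 1`, and `x - INT32_MIN` becomes `x + 2147483648u`, which evaluates to the same value as the original expression for every `x`.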
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124658/new/
https://reviews.llvm.org/D124658