Extend IdenticalExprChecker to find more potential problems

Daniel Fahlgren daniel at fahlgren.se
Sun Feb 2 23:16:03 PST 2014


Hi, Jordan.

On Thu, 2014-01-30 at 09:47 -0800, Jordan Rose wrote:
> Hi, Daniel. Can you split these two checks up into separate patches? I
> know they're pretty similar and have the same basic idea, but we like
> to have commits be limited to one new feature at a time. (Combining
> bitwise and logical operators into one is fine, though.) Sorry it's a
> bit more work for you to tease them apart.

No problem. I have fixed your comments with the two attached patches.
if_stmt adds support to test if both branches are identical and should
be applied first. bin_op adds support to find identical expressions when
using binary operators.

> +  // Split operators as long as we still have operators to split on. We will
> +  // get called for every binary operator in an expression so there is no need
> +  // to check every one against each other here, just the right most one with
> +  // the others.
> 
> Clever! Too bad it comes out to an N^2 algorithm, but N is likely
> pretty small most of the time, and the nature of isIdenticalStmt

Thanks. I thought about a better way to fix this when I implemented it,
but couldn't find a good one. First, somehow, sorting the expressions
and check only adjacent ones will probably be slower in practice due to
the overhead of sorting.

> Some missing test cases:
> 
> Should not warn: x == 4 && y == 5 || x == 4 && y == 6
> 
> Should warn: x ? "abc" : "abc"
> 
> Test cases with while, do, for, if, and return within the branches of
> an if. If you add the code, we need to test them! We could also not
> worry about this for now and just say the checker isn't powerful enough
> to handle them.

Test cases added.

> Why would that be useful? Because I'm a bit concerned about performance
> for the if-branch-check. Can you try running this over a large-ish
> project and let me know the before and after times of analyzing with
> this checker on? If it's a significant difference we should put the
> if-check under a separate checker name.

I ran some tests with php 5.5.8 and openssl 1.0.1f. The tests were done
on my laptop with a ssd and 8GB of ram, so the numbers should be cpu
bound. Between each run I completely removed the directory and
reconfigured to make sure conditions were as similar as possible. The
checker was modified t only check the branches of an if statement, not
the binary operators.

Adding the check when running scan-build on php increased the time by
0.2%. On openssl it actually decreased by 0.15% instead. I think the
background tasks of my otherwise idle desktop has more influence on the
numbers than the test.

php 5.5.8 base run:

real  53m44.366s
user  51m31.245s
sys    1m11.288s

php 5.5.8 with checking if stmts:

real  53m51.311s
user  51m35.985s
sys    1m12.949s

openssl 1.0.1f base run:

real  11m40.819s
user  10m39.124s
sys    0m49.091s

openssl 1.0.1f with checking if stmts:

real  11m39.783s
user  10m38.536s
sys    0m48.879s

Best regards,
Daniel Fahlgren
-------------- next part --------------
A non-text attachment was scrubbed...
Name: if_stmt.patch
Type: text/x-patch
Size: 17571 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140203/47f00b32/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bin_op.patch
Type: text/x-patch
Size: 7177 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140203/47f00b32/attachment-0001.bin>


More information about the cfe-commits mailing list