[cfe-dev] C++ algorithm analysis tool

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Wed Apr 5 03:05:22 PDT 2017


On second thought, maybe it's quite neat.

I mean, like, a person who doesn't know anything about AST and about 
matchers and whatever, could probably do something like that

$ cat < pattern.c
void bubblesort(int *array, int length) {
       for (int i = 0; i < length - 1; ++i)
          for (int j = 0; j < length - i - 1; ++j)
              if (array[j] > array[j + 1]) {
                  int tmp = array[j];
                  array[j] = array[j + 1];
                  array[j + 1] = tmp;
              }
}
^D

and then

$ clang-finder --pattern pattern.c my_project.c

and the tool would find all clones of the pattern in the project, by 
constructing the AST for the pattern and hashing it and looking for 
clones. The pattern would be in a different AST context, but i guess for 
the clone checker it shouldn't be a problem, so we wouldn't need to deal 
with ASTImporter for that.

On 4/4/17 12:25 PM, Vassil Vassilev wrote:
> On 04/04/17 10:59, Raphael Isemann wrote:
>> I feel the implementation of an algorithm in the STL and the
>> equivalent implementation written by the user are usually so different
>> in terms of syntax that the current clone detection won't work that
>> well. If we had a constraint for estimating semantic equality, then it
>> would be a different situation, but I don't expect an implementation
>> of this anytime soon :).
>>
>> However, if we are satisfied with syntactic equality, then this can be
>> done in a few minutes with the clone detector infrastructure with
>> something like this pseudocode:
>>
>>     detector.findClones(RecursiveTypeIIConstraint(),
>> MinFunctionBodyCount(1), MinComplexity(20));
>>
>> This would detect code like this:
>>
>> void bubblesort(int *array, int length) {
>>       for (int i = 0; i < length - 1; ++i)
>>          for (int j = 0; j < length - i - 1; ++j)
>>              if (array[j] > array[j + 1]) {
>>                  int tmp = array[j];
>>                  array[j] = array[j + 1];
>>                  array[j + 1] = tmp;
>>              }
>> }
>>
>> int main() {
>>       for (int i = 0; i < length - 1; ++i) // expect-warning{You could
>> call the function 'bubblesort' instead
>>          for (int j = 0; j < length - i - 1; ++j)
>>              if (array[j] > array[j + 1]) {
>>                  int tmp = array[j];
>>                  array[j] = array[j + 1];
>>                  array[j + 1] = tmp;
>>              }
>> }
>>
>> - Raphael
>
>   That being said, patches are welcome. The current infrastructure is 
> friendly for implementing semantic clone detection constraints (clones 
> type III). We'd be interested to help if necessary.
>
> -- Vassil
>>
>> 2017-04-04 10:35 GMT+02:00 Artem Dergachev <noqnoqneo at gmail.com>:
>>> +CC CloneDetector guys.
>>>
>>> Hmm, the idea of making body-farms and then using CloneChecker to find
>>> clones of synthesized bodies in the actual code looks curious and 
>>> funny,
>>> though i'm not immediately seeing how is it superior to ASTMatchers.
>>>
>>>
>>>
>>> On 4/1/17 1:49 PM, Kirill Bobyrev via cfe-dev wrote:
>>>>
>>>> Hi Chris,
>>>>
>>>> To my knowledge, there isn't.
>>>>
>>>> I don't recall where I got the idea, but I gave it a try last summer
>>>> trying to implement a clang-tidy check doing what you proposed. I 
>>>> didn't
>>>> have enough time to complete it, though, and I only managed to 
>>>> detect one or
>>>> two very simple patterns.
>>>>
>>>> After thinking about this idea for some time I found that 
>>>> clang-tidy might
>>>> be a perfect place for that, not sure whether a separate tool would be
>>>> beneficial. The task of detecting a specific pattern is very 
>>>> similar to what
>>>> clang-tidy checks do in a wide range of tasks. Also, there'd be a 
>>>> separate
>>>> heuristic set for each standard algorithm, which makes the 
>>>> partitioning into
>>>> different checks (for each popular standard library algorithm) 
>>>> natural.
>>>>
>>>> In my opinion, such checks would be useful, I'd be interested in 
>>>> seeing a
>>>> proof-of-concept of some sort.
>>>>
>>>> One more idea I have in mind: it might be interesting to try using
>>>> CloneChecker (a check of Clang Static Analyzer) to detect similar 
>>>> patterns
>>>> in a generic way, but I'm not sure how beneficial that would be in 
>>>> practice.
>>>> Still, might worth a try.
>>>>
>>>> +CC Alex, he might have some thoughts about this.
>>>>
>>>> Kind regards,
>>>> Kirill
>>>>
>>>>
>>>> On 01/04/17 02:42, Christopher Di Bella via cfe-dev wrote:
>>>>> Hey everyone,
>>>>>
>>>>> Just wondering if there's a clang tool that can analyse an 
>>>>> algorithm to
>>>>> suggest where standard algorithms can be replace handwritten algos?
>>>>> E.g.
>>>>>
>>>>> int i = 0;
>>>>> for (; i < v.size(); ++i)
>>>>>     if (v[i] == expected)
>>>>>        break;
>>>>> if (i != v.size())
>>>>>     some_function(v[i]);
>>>>>
>>>>> Can be rewritten to
>>>>>
>>>>> auto i = find(v.begin(), v.end(), expected);
>>>>> if (i != v.end())
>>>>>     some_function(*i);
>>>>>
>>>>> or in C++17:
>>>>>
>>>>> if (auto i = find(v.begin(), v.end(), expected); i != v.end())
>>>>>     some_function(*i);
>>>>>
>>>>> If not, how difficult a task is it to write such a tool? Is there
>>>>> anything that one should take into special consideration while 
>>>>> writing this
>>>>> tool? Do you think it would have a lot of use-cases? (I do, based 
>>>>> on my
>>>>> company's code base, and code I have seen while marking assignments).
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>




More information about the cfe-dev mailing list