[cfe-dev] Getting out body of a while Statement

mats petersson via cfe-dev cfe-dev at lists.llvm.org
Thu Nov 17 10:40:18 PST 2016


Please: unless there is a specific reason not to (e.g. discussing
personal things), always reply to the mailing list as well as to everyone
taking part in the thread. That lets other people chime in if they have
better/different suggestions, and lets anyone who finds the thread later
see what the outcome was.

On 17 November 2016 at 17:49, John Tan <NewSelleron at hotmail.com> wrote:

> What I want is really simple.
>
> I just want to replace the original content of the while loop body with
> a goto statement that points to a label outside the statement. The
> reason is that my project does control flow flattening, so the main
> purpose is to make reverse engineering harder.
>
>
>
> while (a > 3) {
>
> cout << "hello" << endl;
>
> }
>
> Will become
>
> while (a>3){
>
> goto label1;
>
> }
>
> label1:
>
> cout << "hello" << endl;
>
>
Surely you mean:
while (a>3){

goto label1;

label2:
}
goto label3;
label1:

cout << "hello" << endl;
goto label2;
label3:

And since LLVM is pretty decent at figuring out "useless jumps", I'm not at
all sure that this will actually achieve anything useful - the "useless"
goto will just be removed. For any sufficiently complex project, with a bit
of inlined code and a generally large code base, the compiler is pretty good
at obfuscating the code anyway.

> I am not sure how to use the ASTMatcher to get to the body of the while
> stmt .
>
The ASTMatcher will give you the AST node for the while-loop, a
WhileStmt, which has a "getBody" method that gives you the statement
forming the body (a compound statement when braces are used) - you can get
the source location of its start and end. Or you can perhaps use the
Clang Rewriter functionality.
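
To make the "start and end location" idea concrete, here is a minimal
sketch of the text splice a rewriter would end up doing once the offsets
of the body's braces are known. It is plain string handling, not Clang's
API: flatten_while_body, its parameters and the generated label names
(bodyN, resumeN, doneN) are all made up for this illustration, and the
brace offsets stand in for what a real tool would derive from the body
statement's source range.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Clang): lbrace and rbrace are the
// offsets of the '{' and '}' that delimit the loop body inside src.
// id is only used to make the generated label names unique.
std::string flatten_while_body(const std::string &src, std::size_t lbrace,
                               std::size_t rbrace, int id) {
    const std::string n = std::to_string(id);
    // The text strictly between the braces is the original body.
    const std::string body = src.substr(lbrace + 1, rbrace - lbrace - 1);

    std::string out = src.substr(0, lbrace);           // "while (...) "
    out += "{ goto body" + n + "; resume" + n + ":; }"; // new loop body
    out += " goto done" + n + ";";                      // skip moved body on exit
    out += " body" + n + ": {" + body + "} goto resume" + n + ";";
    out += " done" + n + ":;";
    out += src.substr(rbrace + 1);                      // rest of the source
    return out;
}

// Small self-check showing the shape of the output.
void check_splice() {
    const std::string src = "while (a < 3) { f(a); a++; } g();";
    const std::string out =
        flatten_while_body(src, src.find('{'), src.rfind('}'), 0);
    assert(out ==
           "while (a < 3) { goto body0; resume0:; } goto done0; "
           "body0: { f(a); a++; } goto resume0; done0:; g();");
}
```

Note that this only works when the body really is the text between those
two offsets and contains no break or continue: once moved outside the
loop, those statements would no longer refer to it.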

However, before you do that, do check that LLVM doesn't just remove your
gotos and turn the code back into what you had before adding them - I'd
be very surprised if it doesn't optimise them away at some stage. At least
my small examples:

a.c:

#include <stdio.h>

int main()
{
    int a = 0;
    while (a < 3)
    {
        printf("a=%d\n", a);
        a++;
    }
}

b.c:

#include <stdio.h>

int main()
{
    int a = 0;
    while (a < 3)
    {
        goto L1;
    L2:;
    }
    goto L3;

L1:
    printf("a=%d\n", a);
    a++;
    goto L2;

L3:

    return 0;
}

clang -S -O2 a.c
clang -S -O2 b.c

compile to identical assembly code in clang 3.8 (aside from the .file line,
which obviously shows 'a.c' and 'b.c' respectively)

[And this is of course ignoring interesting effects of scoping in C++,
which will have to be dealt with in your translator if you don't want the
converted code to behave differently]
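
As one example of such a scoping effect, here is a small sketch, assuming
the loop body declares an object with a constructor and destructor. The
Tracer type and the run_flattened/check_flattened functions are invented
for this illustration. If the moved body is not wrapped in its own braces,
the "goto L3" would jump past the initialization of t, which C++ rejects;
with the braces, t is still constructed and destroyed once per iteration,
as it was inside the loop.

```cpp
#include <cassert>

// Hypothetical helper type, introduced only for this illustration:
// counts how many instances exist and how many were ever constructed.
struct Tracer {
    static int live;
    static int constructed;
    Tracer() { ++live; ++constructed; }
    ~Tracer() { --live; }
};
int Tracer::live = 0;
int Tracer::constructed = 0;

// Goto form of:  while (a < limit) { Tracer t; ++a; }
// Returns the largest number of Tracer objects alive at the same time.
int run_flattened(int limit) {
    int a = 0;
    int max_live = 0;  // declared before any label, so no goto skips it
    while (a < limit) {
        goto L1;
    L2:;
    }
    goto L3;

L1:
    {   // these braces keep t's lifetime limited to one iteration
        Tracer t;
        if (Tracer::live > max_live) max_live = Tracer::live;
        ++a;
    }
    goto L2;

L3:
    return max_live;
}

// Self-check: one Tracer alive at a time, one constructed per iteration.
void check_flattened() {
    Tracer::live = 0;
    Tracer::constructed = 0;
    assert(run_flattened(3) == 1);
    assert(Tracer::constructed == 3);
    assert(Tracer::live == 0);
}
```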

Modern compilers aren't very easy to trick into generating different code
just by adding goto's.

--
Mats

> I can get the source location of both the start and end of the body of the
> while stmt, but there is no method for me to extract information by
> source location. I hope you can help me with the method and the ASTMatcher
> needed.
> ------------------------------
> *From:* mats.o.petersson at googlemail.com <mats.o.petersson at googlemail.com>
> on behalf of mats petersson <mats at planetcatfish.com>
> *Sent:* Friday, November 18, 2016 1:42:13 AM
> *To:* John Tan
> *Cc:* cfe-dev at lists.llvm.org
> *Subject:* Re: [cfe-dev] Getting out body of a while Statement
>
> Really depends on what you want to achieve [in the big picture, not "I
> want a variable holding the content inside the while", but what you are
> actually planning to do beyond getting it into a variable - do you want to
> edit the source file to add or remove something, check that the body
> does/doesn't do something]
>
> Something involving the ASTMatcher would be a starting point:
> http://clang.llvm.org/docs/LibASTMatchersReference.html?
>
> If you want the actual source-code, then you'll also need to get out the
> source location, and use sourcemanager to get the "section of source code
> within the body into a string", but consider that you can have really
> "interesting" code:
>
>     while( a > 3 )
>     {
>     #include "mycode.h"
>     }
>
> or:
>     while( a > 3 )
>      #include "mycode.h"
>
> [where the content in mycode.h contains not just the loop body, but also
> further code that continues AFTER the loop.]
>
> or:
>
>     #define SOME_MACRO(x) while(a < (x))
>
>     SOME_MACRO(3)
>     {
>       ...
>     }
>
> or:
>     while( a > 3 )
>     {
>         SOME_MACRO(foo);
>     }
>
> where SOME_MACRO expands to some rather large chunk of code - and knowing
> the "source" is not really helpful in any of these cases. And of course,
> like the #include sample, you can have the loop body end part way through
> the macro, so you probably don't really want to rely on the "string
> contains the body of the loop" if you want to do something with the content
> of the loop that is of any importance. These are of course simple examples
> of "unusual programming", but I guarantee that if you look at enough code,
> you'll find SOMETHING like that.
>
>
> So, depending on what you actually want to achieve, you may want to NOT try
> to deal with this as files/text strings, but as AST-code.
>
> --
> Mats
>
> On 17 November 2016 at 16:04, John Tan via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>
>> I need help to get out the body of a while statement.
>>
>>
>> while (a > 3) {
>>
>>
>> cout << "hello" << endl;   << --  I want to copy out this line and store
>> it in a variable.
>>
>> }
>>
>>
>> This is an example. I want to take out what's inside the while
>> statement and, if possible, store it in a variable so I can print the
>> result out somewhere.
>>
>>
>> Much appreciated
>>
>> John Tan.
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>>
>