[cfe-dev] Advanced Rewriting

Rafael·Stahl via cfe-dev cfe-dev at lists.llvm.org
Thu Oct 25 00:49:07 PDT 2018


Hi Jonas

AST replacements are translated back into code by expanding macros where 
required. This of course might lead to code duplication and nasty 
inconsistency bugs when the programmer then changes the macro. This is 
not an issue for my use-case though: The transformations are applied 
_after_ programming, it's more of a code generation step. Why not apply 
those on IR level? For compatibility with other source-level tooling.

A great improvement for _during_ programming transformations would be to 
limit macro expansion to the absolutely necessary cases. Specifically: 
Don't expand when a change in the macro argument would be equivalent and 
don't expand when a change in the macro body would be equivalent. You 
could also do some clever stuff like parameterizing the original macro 
with the rewritten AST node, but that seems to intrusive to me.

@Manuel Yes, I will do that. Also to see if there is any interest.

-Rafael

On 24.10.18 10:30, Manuel Klimek wrote:
> If you want this in the core libs, a great idea would be to write up a 
> doc or email explaining the design and the use cases.
> Note that if this doesn't solve a generic enough use case, it might 
> still be a good idea for it to live in a different place - API surface 
> is a cost to users.
>
> On Tue, Oct 23, 2018 at 5:56 PM Jonas Toth via cfe-dev 
> <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>
>     Hi Rafael,
>
>     nice to see your progress!
>
>     In my opinion these advancements should go in the Tooling and
>     Rewrite directly in clang but integrate with the existing
>     facilities. It would be inconvenient two have two separate ways of
>     doing rewritings.
>
>     What I am curious about is how to resolve the macro-situation. If
>     the refactoring will work directly on the AST and ignore macros,
>     how is the refactoring translated into real code again?
>
>     Having an functional way of doing the refactorings with just
>     transforming the AST, e.g. replacing one node with a new subtree
>     would be great and elegant on the AST side, but how would we move
>     back from AST to code?
>
>     Best, Jonas
>
>     Am 23.10.18 um 17:32 schrieb Rafael·Stahl:
>>
>>     Hey everyone
>>
>>     the two major limitations are resolved now:
>>
>>     - The macro argument issue
>>     - Support for replacing Decls using
>>     clang::ast_type_traits::DynTypedNode
>>
>>     The macro issue vanished by itself once I figured a little
>>     unintuitive detail out: The SourceLocations from the original AST
>>     and the ones from the new Preprocessor Lexer did not compare
>>     equal even though they refer to the exact same location.
>>
>>     I believe this is because expansion SrcLocs have a kind of
>>     pointer identity representation and equality just compares raw
>>     representations, such that even when they point to the same
>>     locations, they don't compare equal. What worked for me was a
>>     kind of "deep" comparison:
>>
>>     bool clutil::DeepSrcLocEqual(clang::SourceLocation lhs,
>>     clang::SourceLocation rhs, const clang::SourceManager &SM)
>>     {
>>         if (lhs == rhs)
>>             return true;
>>
>>         if (SM.getExpansionLoc(lhs) != SM.getExpansionLoc(rhs))
>>             return false;
>>         if (SM.getSpellingLoc(lhs) != SM.getSpellingLoc(rhs))
>>             return false;
>>
>>         clang::SourceLocation lhsMacro, rhsMacro;
>>         if (SM.isMacroArgExpansion(lhs, &lhsMacro))
>>         {
>>             if (!SM.isMacroArgExpansion(rhs, &rhsMacro))
>>                 return false;
>>             if (!DeepSrcLocEqual(lhsMacro, rhsMacro, SM))
>>                 return false;
>>         }
>>
>>         return true;
>>     }
>>
>>     Attached is the cleaned up rewriter that I ended up with now.
>>     Still, if there is any interest, I'd be happy to contribute this
>>     back to clangs libraries, but I would require some feedback how
>>     this fits into existing facilities.
>>
>>     Best regards
>>     Rafael
>>
>>     On 17.07.18 16:19, Jonas Toth wrote:
>>>
>>>     Hi Rafael,
>>>
>>>     I did read into clang-refactor a while ago but unfortunatly
>>>     could not follow that up. If I recall correctly its about
>>>     source-to-source transformation (as you said) and aims at
>>>     implementing the primitive refactorings that exist (e.g.
>>>     extract-method, extract-variable, ....).
>>>
>>>     Rewriting itself should happen with the normal tooling framework.
>>>
>>>     (https://clang.llvm.org/docs/RefactoringEngine.html)
>>>
>>>     Maybe the implementers of the existing code can give better
>>>     comments on you proposal (and might have considered a similar
>>>     solution to yours already).
>>>
>>>     +Alex Lorenz
>>>
>>>     All the best, Jonas
>>>
>>>
>>>     Am 17.07.2018 um 14:46 schrieb Rafael·Stahl:
>>>>
>>>>     Hi Jonas
>>>>
>>>>     Thanks for introducing me to this, I have seen the
>>>>     "Replacement" before, but not clang-refactor.
>>>>
>>>>     However it seems to only provide management facilities around
>>>>     rewrite operations and not aid with the rewriting itself. Am I
>>>>     missing something here?
>>>>
>>>>     The two core problems for me:
>>>>
>>>>     - nesting replacements: When implementing replacements with
>>>>     clang-refactor, I still have to provide replacements that are
>>>>     closed in themselves. I cannot make them depend on others, right?
>>>>     - macros: clang-refactor only seems to work with spelling
>>>>     locations.
>>>>
>>>>     Maybe an even simpler example: Replace all additions with
>>>>     "add(lhs, rhs)". This in itself is very difficult with clang as
>>>>     soon as the Stmts are nested or macros are involved.
>>>>
>>>>     Best regards
>>>>     Rafael
>>>>
>>>>
>>>>     On 16.07.2018 19:06, Jonas Toth via cfe-dev wrote:
>>>>>
>>>>>     Hi Rafael,
>>>>>
>>>>>     wouldn't your usecase be a task for clang-refactor?
>>>>>
>>>>>     Best,  Jonas
>>>>>
>>>>>
>>>>>     Am 16.07.2018 um 17:08 schrieb Rafael·Stahl via cfe-dev:
>>>>>>     Hey everyone
>>>>>>
>>>>>>     The rewriting API of Clang operates on the source code in
>>>>>>     textual form. The user can use AST nodes to figure out what
>>>>>>     to replace, but in the end he has to remove and insert
>>>>>>     snippets in a linear piece of text.
>>>>>>
>>>>>>     This is very inconvenient when it is required to restructure
>>>>>>     and nest replacements. The involvement of macros makes a
>>>>>>     manual process even more difficult. See some recent threads
>>>>>>     expressing difficulty with the API [1][2].
>>>>>>
>>>>>>     What do I mean by "nested replacements"? For example in the
>>>>>>     following:
>>>>>>
>>>>>>         int i = x + s->a;
>>>>>>
>>>>>>     I would want to replace the BinaryOperator with a function
>>>>>>     call and the MemberExpr with some constant:
>>>>>>
>>>>>>         int i = Addition(x, 7);
>>>>>>
>>>>>>     When keeping the two replacement rules independent of each
>>>>>>     other, achieving this with the current API is extremely
>>>>>>     difficult. More so when macros are involved.
>>>>>>
>>>>>>     I am proposing some kind of helper that aims to solve these
>>>>>>     issues by providing an interface that offers to directly
>>>>>>     replace AST nodes and a mechanism to nest AST node
>>>>>>     replacements - without having to worry about macros.
>>>>>>
>>>>>>     Potential usage:
>>>>>>
>>>>>>     - Develop a class that derives from StmtToRewrite to define
>>>>>>     how replacements should happen:
>>>>>>
>>>>>>         class RewriteAdds : public cu::StmtToRewrite
>>>>>>         {
>>>>>>         public:
>>>>>>             std::string makeReplaceStr() const override
>>>>>>             {
>>>>>>                 auto binOp = dyn_cast<BinaryOperator>(replaceS);
>>>>>>                 return "Addition(" +
>>>>>>     getMgr()->getReplaced(binOp->getLHS()).strToInsert + ", " +
>>>>>>     getMgr()->getReplaced(binOp->getRHS()).strToInsert + ")";
>>>>>>             }
>>>>>>         };
>>>>>>
>>>>>>         class RewriteMembs : public cu::StmtToRewrite
>>>>>>         {
>>>>>>         public:
>>>>>>             std::string makeReplaceStr() const override
>>>>>>             {
>>>>>>                 return "7";
>>>>>>             }
>>>>>>         };
>>>>>>
>>>>>>     - Construct a RewriteManager:
>>>>>>
>>>>>>         cu::RewriteManager mgr(ACtx, PP);
>>>>>>
>>>>>>     - Add rewriting operations to the manager:
>>>>>>
>>>>>>         // std::vector<const Stmt *> AddStmts = /* matched from
>>>>>>     binaryOperator() with plus */
>>>>>>         // std::vector<const Stmt *> MembStmts = /* matched from
>>>>>>     memberExpr() */
>>>>>>         for (const auto &S : AddStmts)
>>>>>>     mgr.registerStmt<RewriteAdds>(S);
>>>>>>         for (const auto &S : MembStmts)
>>>>>>     mgr.registerStmt<RewriteMembs>(S);
>>>>>>
>>>>>>     - Retrieve and apply the results:
>>>>>>
>>>>>>         clang::Rewriter rewriter(SM, LangOpts);
>>>>>>         for (const auto &r : mgr.getReplacements()) {
>>>>>>             rewriter.RemoveText(r.rangeToRemove);
>>>>>>     rewriter.InsertText(r.rangeToRemove.getBegin(), r.strToInsert);
>>>>>>         }
>>>>>>
>>>>>>
>>>>>>     At the end of this mail is my low quality code that kind-of
>>>>>>     implements this. TLDR:
>>>>>>
>>>>>>     - Build a hierarchy of stmts to replace and keep track of
>>>>>>     which replacements must be combined
>>>>>>     - Move further up in the AST if these replacements are inside
>>>>>>     a macro
>>>>>>     - Recursively lex the file and look for replacements
>>>>>>     outside-in by spelling locations. Expand any macros that are
>>>>>>     encountered during this. The re-lexing idea is based on the
>>>>>>     hint in [3].
>>>>>>
>>>>>>     The code has the following shortcomings:
>>>>>>
>>>>>>     - I do not know how to distinguish macro argument expansions
>>>>>>     within macros. For example in "#define FOO(a) a + a" the two
>>>>>>     "a"s expand to different AST nodes that could be replaced
>>>>>>     with different rules. This is an important issue, because it
>>>>>>     can lead to completely broken code with nesting.
>>>>>>     - Limited to Stmts, when Decls should be supported too.
>>>>>>     - Very un-optimized with lexing the entire source file many
>>>>>>     times. Easy to solve, but didn't want to raise the complexity
>>>>>>     further for now.
>>>>>>     - Could keep written code more clean by only expanding macros
>>>>>>     if required. For example not required if just a macro arg is
>>>>>>     replaced and all expansions would be the same.
>>>>>>
>>>>>>
>>>>>>     I am very interested in your general thoughts. I'm not very
>>>>>>     experienced with clang, but this was my vision how I would
>>>>>>     want to do replacements. Are you interested in getting this
>>>>>>     into clang? I would need help with ironing out the remaining
>>>>>>     issues.
>>>>>>
>>>>>>     -Rafael
>>>>>>
>>>>>>
>>>>>>     [1]
>>>>>>     http://lists.llvm.org/pipermail/cfe-dev/2018-July/058430.html
>>>>>>     [2]
>>>>>>     http://lists.llvm.org/pipermail/cfe-dev/2018-June/058213.html
>>>>>>     [3]
>>>>>>     http://lists.llvm.org/pipermail/cfe-dev/2017-August/055079.html
>>>>>>
>>>>>>
>>>>>>
>>>>>>     // snip
>>>>>>
>>>>>>
>>>>>>
>>>>>>     _______________________________________________
>>>>>>     cfe-dev mailing list
>>>>>>     cfe-dev at lists.llvm.org  <mailto:cfe-dev at lists.llvm.org>
>>>>>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>
>>>>>
>>>>>
>>>>>     _______________________________________________
>>>>>     cfe-dev mailing list
>>>>>     cfe-dev at lists.llvm.org  <mailto:cfe-dev at lists.llvm.org>
>>>>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>
>>>
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181025/e9ce7e53/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5449 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181025/e9ce7e53/attachment.bin>


More information about the cfe-dev mailing list