[cfe-dev] Advanced Rewriting

Jonas Toth via cfe-dev cfe-dev at lists.llvm.org
Tue Jul 17 07:19:07 PDT 2018


Hi Rafael,

I read into clang-refactor a while ago but unfortunately could not
follow up on it. If I recall correctly, it is about source-to-source
transformation (as you said) and aims at implementing the primitive
refactorings that exist (e.g. extract-method, extract-variable, ...).

Rewriting itself should happen with the normal tooling framework.

(https://clang.llvm.org/docs/RefactoringEngine.html)
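
To illustrate what text-based rewriting boils down to, here is a minimal,
self-contained sketch. It does not use the Clang API: TextEdit and
applyEdits are hypothetical names that only mimic the offset/length/text
idea behind a replacement. It shows why edits are usually applied back to
front.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a text replacement: replace `length` characters
// starting at `offset` with `text`. This is not clang::tooling::Replacement.
struct TextEdit {
    std::size_t offset;
    std::size_t length;
    std::string text;
};

// Applies non-overlapping edits to `source`. Edits are applied from the
// highest offset to the lowest, so the offsets of the remaining edits
// stay valid while the text changes length.
std::string applyEdits(std::string source, std::vector<TextEdit> edits) {
    std::sort(edits.begin(), edits.end(),
              [](const TextEdit &a, const TextEdit &b) {
                  return a.offset > b.offset;
              });
    for (const TextEdit &e : edits) {
        assert(e.offset + e.length <= source.size());
        source.replace(e.offset, e.length, e.text);
    }
    return source;
}
```

For example, applyEdits("int i = x + s->a;", {{12, 4, "7"}}) yields
"int i = x + 7;". The edits must not overlap, which is exactly where
nested replacements become awkward.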

Maybe the implementers of the existing code can give better comments on
your proposal (and might have already considered a solution similar to
yours).

+Alex Lorenz

All the best, Jonas


On 17.07.2018 14:46, Rafael Stahl wrote:
>
> Hi Jonas
>
> Thanks for introducing me to this, I have seen the "Replacement"
> before, but not clang-refactor.
>
> However, it seems to provide only management facilities around rewrite
> operations and does not help with the rewriting itself. Am I missing
> something here?
>
> The two core problems for me:
>
> - nesting replacements: When implementing replacements with
> clang-refactor, I still have to provide replacements that are
> self-contained. I cannot make them depend on others, right?
> - macros: clang-refactor only seems to work with spelling locations.
>
> Maybe an even simpler example: Replace all additions with "add(lhs,
> rhs)". This in itself is very difficult with clang as soon as the
> Stmts are nested or macros are involved.
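
A rough sketch of the add(lhs, rhs) case on a toy expression tree, not on
the Clang AST. Expr, makeName, makeAdd and rewrite are made-up names for
illustration. The point is only that building the output bottom-up handles
nesting naturally, which independent text replacements do not; macros are
not modelled at all.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree; a hypothetical stand-in for Clang's Stmt hierarchy.
struct Expr {
    enum Kind { Name, Add } kind;
    std::string name;                // used when kind == Name
    std::unique_ptr<Expr> lhs, rhs;  // used when kind == Add
};

std::unique_ptr<Expr> makeName(std::string n) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Name;
    e->name = std::move(n);
    return e;
}

std::unique_ptr<Expr> makeAdd(std::unique_ptr<Expr> l,
                              std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Add;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Rewrites every addition to "add(lhs, rhs)". The operands are rewritten
// first, so an addition nested inside another addition is handled too.
std::string rewrite(const Expr &e) {
    if (e.kind == Expr::Name)
        return e.name;
    return "add(" + rewrite(*e.lhs) + ", " + rewrite(*e.rhs) + ")";
}
```

For a + (b + c), built as makeAdd(makeName("a"), makeAdd(makeName("b"),
makeName("c"))), rewrite returns "add(a, add(b, c))".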
>
> Best regards
> Rafael
>
>
> On 16.07.2018 19:06, Jonas Toth via cfe-dev wrote:
>>
>> Hi Rafael,
>>
>> wouldn't your use case be a task for clang-refactor?
>>
>> Best,  Jonas
>>
>>
>> On 16.07.2018 17:08, Rafael Stahl via cfe-dev wrote:
>>> Hey everyone
>>>
>>> The rewriting API of Clang operates on the source code in textual
>>> form. The user can use AST nodes to figure out what to replace, but
>>> in the end they have to remove and insert snippets in a linear piece
>>> of text.
>>>
>>> This is very inconvenient when it is required to restructure and
>>> nest replacements. The involvement of macros makes a manual process
>>> even more difficult. See some recent threads expressing difficulty
>>> with the API [1][2].
>>>
>>> What do I mean by "nested replacements"? For example in the following:
>>>
>>>     int i = x + s->a;
>>>
>>> I would want to replace the BinaryOperator with a function call and
>>> the MemberExpr with some constant:
>>>
>>>     int i = Addition(x, 7);
>>>
>>> When keeping the two replacement rules independent of each other,
>>> achieving this with the current API is extremely difficult. More so
>>> when macros are involved.
>>>
>>> I am proposing some kind of helper that aims to solve these issues
>>> by providing an interface that offers to directly replace AST nodes
>>> and a mechanism to nest AST node replacements - without having to
>>> worry about macros.
>>>
>>> Potential usage:
>>>
>>> - Develop a class that derives from StmtToRewrite to define how
>>> replacements should happen:
>>>
>>>     class RewriteAdds : public cu::StmtToRewrite
>>>     {
>>>     public:
>>>         std::string makeReplaceStr() const override
>>>         {
>>>             auto binOp = dyn_cast<BinaryOperator>(replaceS);
>>>             return "Addition(" +
>>>                    getMgr()->getReplaced(binOp->getLHS()).strToInsert + ", " +
>>>                    getMgr()->getReplaced(binOp->getRHS()).strToInsert + ")";
>>>         }
>>>     };
>>>
>>>     class RewriteMembs : public cu::StmtToRewrite
>>>     {
>>>     public:
>>>         std::string makeReplaceStr() const override
>>>         {
>>>             return "7";
>>>         }
>>>     };
>>>
>>> - Construct a RewriteManager:
>>>
>>>     cu::RewriteManager mgr(ACtx, PP);
>>>
>>> - Add rewriting operations to the manager:
>>>
>>>     // std::vector<const Stmt *> AddStmts =
>>>     //     /* matched from binaryOperator() with plus */
>>>     // std::vector<const Stmt *> MembStmts =
>>>     //     /* matched from memberExpr() */
>>>     for (const auto &S : AddStmts) mgr.registerStmt<RewriteAdds>(S);
>>>     for (const auto &S : MembStmts) mgr.registerStmt<RewriteMembs>(S);
>>>
>>> - Retrieve and apply the results:
>>>
>>>     clang::Rewriter rewriter(SM, LangOpts);
>>>     for (const auto &r : mgr.getReplacements()) {
>>>         rewriter.RemoveText(r.rangeToRemove);
>>>         rewriter.InsertText(r.rangeToRemove.getBegin(), r.strToInsert);
>>>     }
>>>
>>>
>>> At the end of this mail is my low quality code that kind-of
>>> implements this. TLDR:
>>>
>>> - Build a hierarchy of stmts to replace and keep track of which
>>> replacements must be combined
>>> - Move further up in the AST if these replacements are inside a macro
>>> - Recursively lex the file and look for replacements outside-in by
>>> spelling locations. Expand any macros that are encountered during
>>> this. The re-lexing idea is based on the hint in [3].
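
The first step of the TLDR (building a hierarchy of ranges) could be
sketched as follows. This is a simplified model, not the code from the
mail: Range and groupByContainment are hypothetical, use plain character
offsets instead of source locations, and assume that ranges are either
nested or disjoint.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open character range [begin, end); a hypothetical stand-in for a
// source range.
struct Range {
    std::size_t begin;
    std::size_t end;
};

bool contains(const Range &outer, const Range &inner) {
    return outer.begin <= inner.begin && inner.end <= outer.end;
}

// Groups ranges so that each group starts with an outermost range, followed
// by the ranges nested inside it. Within a group, an enclosing range always
// comes before the ranges it contains.
std::vector<std::vector<Range>> groupByContainment(std::vector<Range> ranges) {
    // Sort by start; for equal starts put the longer (enclosing) range first.
    std::sort(ranges.begin(), ranges.end(),
              [](const Range &a, const Range &b) {
                  return a.begin != b.begin ? a.begin < b.begin
                                            : a.end > b.end;
              });
    std::vector<std::vector<Range>> groups;
    for (const Range &r : ranges) {
        if (!groups.empty() && contains(groups.back().front(), r))
            groups.back().push_back(r);
        else
            groups.push_back(std::vector<Range>{r});
    }
    return groups;
}
```

With "int i = x + s->a;" the addition covers [8, 16) and the member access
[12, 16), so both end up in one group with the addition first. Only the
first element of each group then needs to be emitted as a text
replacement.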
>>>
>>> The code has the following shortcomings:
>>>
>>> - I do not know how to distinguish macro argument expansions within
>>> macros. For example in "#define FOO(a) a + a" the two "a"s expand to
>>> different AST nodes that could be replaced with different rules.
>>> This is an important issue, because it can lead to completely broken
>>> code with nesting.
>>> - Limited to Stmts, when Decls should be supported too.
>>> - Very un-optimized with lexing the entire source file many times.
>>> Easy to solve, but didn't want to raise the complexity further for now.
>>> - Could keep written code more clean by only expanding macros if
>>> required. For example not required if just a macro arg is replaced
>>> and all expansions would be the same.
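
The FOO(a) problem from the first point can be seen without any tooling.
In this small sketch (nextValue and sumTwice are names made up for the
example) the single argument written at the call site ends up as two
separate call expressions after expansion, so a rewriter would have to
tell them apart even though they were written only once.

```cpp
#include <cassert>

#define FOO(a) a + a

// Counts how often the macro argument is evaluated.
static int calls = 0;

int nextValue() {
    return ++calls;
}

// FOO(nextValue()) expands to `nextValue() + nextValue()`: two distinct
// call expressions that both originate from the one argument below.
int sumTwice() {
    calls = 0;
    return FOO(nextValue());
}
```

Calling sumTwice() evaluates the argument twice (calls ends up as 2), and
each of the two calls could in principle be matched by a different
replacement rule.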
>>>
>>>
>>> I am very interested in your general thoughts. I'm not very
>>> experienced with clang, but this was my vision how I would want to
>>> do replacements. Are you interested in getting this into clang? I
>>> would need help with ironing out the remaining issues.
>>>
>>> -Rafael
>>>
>>>
>>> [1] http://lists.llvm.org/pipermail/cfe-dev/2018-July/058430.html
>>> [2] http://lists.llvm.org/pipermail/cfe-dev/2018-June/058213.html
>>> [3] http://lists.llvm.org/pipermail/cfe-dev/2017-August/055079.html
>>>
>>>
>>>
>>> ----------------------------------------
>>>
>>> RewriteManager.h
>>>
>>> ----------------------------------------
>>>
>>> #ifndef CLANGUTIL_REWRITEMANAGER_H
>>> #define CLANGUTIL_REWRITEMANAGER_H
>>>
>>> #include "ClangUtil/SourceRangeLess.h"
>>> #include "make_unique.h"
>>> #include "clang/AST/AST.h"
>>> #include <map>
>>> #include <memory>
>>> #include <string>
>>> #include <vector>
>>>
>>>
>>> // TODO extend to decls
>>>
>>>
>>> namespace cu
>>> {
>>> // Represents a statement in the original AST that should be rewritten.
>>> // To implement recursive replacements, call getMgr()->getReplaced() on
>>> // any AST node within the makeReplaceStr callback.
>>> class StmtToRewrite
>>> {
>>>     friend class RewriteManager;
>>>
>>> public:
>>>     // Returns the enclosing RewriteManager.
>>>     class RewriteManager *getMgr() const;
>>>     // Override this to build a replacement string. Implement recursive
>>>     // replacements with RewriteManager::getReplaced.
>>>     virtual std::string makeReplaceStr() const = 0;
>>>
>>>     // The statement to replace.
>>>     const clang::Stmt *replaceS = nullptr;
>>>
>>> private:
>>>     RewriteManager *m_mgr;
>>> };
>>>
>>> struct RewriteOperation
>>> {
>>>     clang::SourceRange rangeToRemove;
>>>     std::string strToInsert;
>>> };
>>>
>>> // A class for managing replacements of AST nodes. It allows targeting
>>> // AST nodes directly instead of raw source locations, which makes
>>> // replacements involving macros and nested replacements easier.
>>> // For extended documentation see: doc/rewriting.md
>>> class RewriteManager
>>> {
>>> public:
>>>     RewriteManager(clang::ASTContext &ACtx, clang::Preprocessor &PP);
>>>
>>>     clang::ASTContext &getACtx() const { return ACtx; }
>>>
>>>     // Registers a StmtToRewrite for use with getReplacements. Call this
>>>     // on all statements that should be rewritten before calling any
>>>     // rewriting functions.
>>>     void registerStmt(std::unique_ptr<StmtToRewrite> S);
>>>
>>>     // Helper for constructing the custom type from a Stmt.
>>>     template <typename T, typename... Args>
>>>     void registerStmt(const clang::Stmt *S, Args... args)
>>>     {
>>>         auto p = std::make_unique<T>(std::forward<Args>(args)...);
>>>         p->replaceS = S;
>>>         registerStmt(std::move(p));
>>>     }
>>>
>>>     // Get the full replacement of an AST node. Note that this function
>>>     // removes any replaced statements from the work list, so calling it
>>>     // twice will only replace the first time.
>>>     RewriteOperation getReplaced(const clang::Stmt *S);
>>>     // Get all replacements. These may be fewer than the requested ones
>>>     // because of nesting.
>>>     std::vector<RewriteOperation> getReplacements();
>>>
>>> private:
>>>     std::string getExpandedCode(const clang::Stmt *toReplaceS);
>>>
>>> private:
>>>     clang::ASTContext &ACtx;
>>>     const clang::LangOptions &LangOpts;
>>>     clang::SourceManager &SM;
>>>     clang::Preprocessor &PP;
>>>
>>>     // Manages the pending replacements.
>>>     class WorkList
>>>     {
>>>     public:
>>>         typedef std::map<clang::SourceRange, std::vector<const
>>> StmtToRewrite *>> RangeToRepMap;
>>>
>>>         WorkList(clang::ASTContext &ACtx, clang::SourceManager &SM);
>>>
>>>         bool isStmtPending(const clang::Stmt *S) const;
>>>         void addStmt(std::unique_ptr<StmtToRewrite> S);
>>>         const RangeToRepMap &getRangeToReplacementsMap() const;
>>>         std::vector<const StmtToRewrite *> getSortedReplacements()
>>> const;
>>>         void markDone(const StmtToRewrite *S);
>>>         void cleanup();
>>>
>>>     private:
>>>         clang::ASTContext &ACtx;
>>>         clang::SourceManager &SM;
>>>         std::vector<std::unique_ptr<StmtToRewrite>> m_pending;
>>>         std::vector<std::unique_ptr<StmtToRewrite>> m_done;
>>>         RangeToRepMap m_rangeToReplacements;
>>>     };
>>>
>>>     WorkList m_workList;
>>> };
>>>
>>> } // namespace cu
>>>
>>> #endif
>>>
>>>
>>>
>>> ----------------------------------------
>>>
>>> RewriteManager.cpp
>>>
>>> ----------------------------------------
>>>
>>> #include "ClangUtil/RewriteManager.h"
>>> #include "ClangUtil/ASTUtil.h"
>>> #include "clang/Lex/Lexer.h"
>>> #include "clang/Lex/Preprocessor.h"
>>> #include "clang/Lex/PreprocessorOptions.h"
>>> #include "clang/Lex/TokenConcatenation.h"
>>> #include "clang/Lex/MacroArgs.h"
>>> #include <algorithm>
>>> #include <cassert>
>>> #include <cstdio>
>>> #include <stdexcept>
>>>
>>>
>>> using namespace cu;
>>>
>>>
>>> // Returns a Stmt that is the first parent of startS whose expansion
>>> // range is within the given range.
>>> static const clang::Stmt *GetFullMacroStmt(clang::SourceRange range,
>>>                                            const clang::Stmt *startS,
>>>                                            clang::ASTContext &ACtx)
>>> {
>>>     auto &SM = ACtx.getSourceManager();
>>>
>>>     // Walk the tree upwards until ST no longer expands to within
>>>     // range.
>>>     const clang::Stmt *ST = startS;
>>>     while (true)
>>>     {
>>>         const auto &parents = ACtx.getParents(*ST);
>>>         if (parents.empty())
>>>         {
>>>             break;
>>>         }
>>>         auto childS = ST;
>>>         ST = parents[0].get<clang::Stmt>();
>>>         if (!ST)
>>>         {
>>>             if (auto D = parents[0].get<clang::Decl>())
>>>             {
>>>                 const auto &parentsD = ACtx.getParents(*D);
>>>                 if (parentsD.empty())
>>>                 {
>>>                     break;
>>>                 }
>>>                 ST = parentsD[0].get<clang::Stmt>();
>>>                 if (!ST)
>>>                 {
>>>                     break;
>>>                 }
>>>             }
>>>             else
>>>             {
>>>                 break;
>>>             }
>>>         }
>>>
>>>         auto exLocS = SM.getExpansionLoc(ST->getLocStart());
>>>         auto exLocE = SM.getExpansionLoc(ST->getLocEnd());
>>>         if (SM.isBeforeInTranslationUnit(exLocS, range.getBegin()) ||
>>>             SM.isBeforeInTranslationUnit(range.getEnd(), exLocE))
>>>         {
>>>             return childS;
>>>         }
>>>     }
>>>
>>>     return nullptr;
>>> }
>>>
>>>
>>> RewriteManager *StmtToRewrite::getMgr() const
>>> {
>>>     return m_mgr;
>>> }
>>>
>>>
>>> RewriteManager::WorkList::WorkList(clang::ASTContext &ACtx,
>>> clang::SourceManager &SM) : ACtx(ACtx), SM(SM) {}
>>> bool RewriteManager::WorkList::isStmtPending(const clang::Stmt *S)
>>> const
>>> {
>>>     for (const auto &r : m_pending)
>>>     {
>>>         if (r->replaceS == S)
>>>         {
>>>             return true;
>>>         }
>>>     }
>>>     return false;
>>> }
>>> void
>>> RewriteManager::WorkList::addStmt(std::unique_ptr<StmtToRewrite> S)
>>> {
>>>     // Use the expansion range for maximal replacement flexibility in
>>>     // macros.
>>>     auto replaceRange =
>>> SM.getExpansionRange(S->replaceS->getSourceRange());
>>>
>>>     // TODO not quite correct.
>>>     /*auto sortRanges = [&](std::vector<const StmtToRewrite *> &vec) {
>>>         std::sort(vec.begin(), vec.end(), [&](const StmtToRewrite
>>> *lhs, const StmtToRewrite *rhs) {
>>>             auto lhsRange =
>>> SM.getExpansionRange(lhs->replaceS->getSourceRange());
>>>             auto rhsRange =
>>> SM.getExpansionRange(rhs->replaceS->getSourceRange());
>>>             return IsContained(rhsRange, lhsRange, SM);
>>>         });
>>>     };*/
>>>
>>>     // Establish hierarchical relation between all ranges.
>>>     bool found = false;
>>>     // First, check if this range is within one we already have.
>>>     for (auto &r : m_rangeToReplacements)
>>>     {
>>>         if (IsContained(replaceRange, r.first, SM))
>>>         {
>>>             // Insert in a sorted order.
>>>             for (auto it = r.second.begin(); it != r.second.end();
>>> ++it)
>>>             {
>>>                 // auto testRange =
>>>                 //     SM.getExpansionRange((*it)->replaceS->getSourceRange());
>>>                 // if (IsContained(testRange, replaceRange, SM))
>>>                 if (IsParent(S->replaceS, (*it)->replaceS, ACtx))
>>>                 {
>>>                     r.second.insert(it, S.get());
>>>                     found = true;
>>>                     break;
>>>                 }
>>>             }
>>>             if (!found)
>>>             {
>>>                 r.second.push_back(S.get());
>>>                 found = true;
>>>             }
>>>             break;
>>>         }
>>>     }
>>>     // Not within existing range, add as new top-level range.
>>>     if (!found)
>>>     {
>>>         // Check if any existing ranges are contained within the new one.
>>>         std::vector<const StmtToRewrite *> moveThese;
>>>         auto it = m_rangeToReplacements.begin();
>>>         while (it != m_rangeToReplacements.end())
>>>         {
>>>             if (IsContained(it->first, replaceRange, SM))
>>>             {
>>>                 moveThese.insert(moveThese.end(),
>>> it->second.begin(), it->second.end());
>>>                 it = m_rangeToReplacements.erase(it);
>>>             }
>>>             else
>>>             {
>>>                 ++it;
>>>             }
>>>         }
>>>         auto &accesses = m_rangeToReplacements[replaceRange];
>>>         // The order is important here. We want the first element to be
>>>         // the one that spans the full range.
>>>         accesses.push_back(S.get());
>>>         // TODO sort "moveThese".
>>>         accesses.insert(accesses.end(), moveThese.begin(),
>>> moveThese.end());
>>>     }
>>>
>>>     int count = 0;
>>>     for (const auto &r : m_rangeToReplacements)
>>>     {
>>>         printf("range %i\n", count++);
>>>         for (const auto &a : r.second)
>>>         {
>>>             printf("replacement:\n");
>>>             a->replaceS->dump();
>>>         }
>>>     }
>>>
>>>     m_pending.push_back(std::move(S));
>>> }
>>> const RewriteManager::WorkList::RangeToRepMap
>>> &RewriteManager::WorkList::getRangeToReplacementsMap() const
>>> {
>>>     return m_rangeToReplacements;
>>> }
>>> std::vector<const StmtToRewrite *>
>>> RewriteManager::WorkList::getSortedReplacements() const
>>> {
>>>     std::vector<const StmtToRewrite *> result;
>>>     for (auto &r : m_rangeToReplacements)
>>>     {
>>>         result.insert(result.end(), r.second.begin(), r.second.end());
>>>     }
>>>     return result;
>>> }
>>> void RewriteManager::WorkList::markDone(const StmtToRewrite *S)
>>> {
>>>     // Remove from hierarchy.
>>>     for (auto &r : m_rangeToReplacements)
>>>     {
>>>         r.second.erase(std::remove(r.second.begin(), r.second.end(),
>>> S), r.second.end());
>>>     }
>>>
>>>     // Move from pending to done list.
>>>     auto it = std::find_if(m_pending.begin(), m_pending.end(),
>>>                            [&](const std::unique_ptr<StmtToRewrite>
>>> &rep) { return rep.get() == S; });
>>>     if (it == m_pending.end())
>>>     {
>>>         throw std::runtime_error(
>>>             "Did not find replacement to mark as done");
>>>     }
>>>     m_done.push_back(std::move(*it));
>>>     m_pending.erase(it);
>>> }
>>> void RewriteManager::WorkList::cleanup()
>>> {
>>>     m_done.clear();
>>> }
>>>
>>>
>>> RewriteManager::RewriteManager(clang::ASTContext &ACtx,
>>> clang::Preprocessor &PP)
>>>     : ACtx(ACtx), LangOpts(ACtx.getLangOpts()),
>>> SM(ACtx.getSourceManager()), PP(PP), m_workList(ACtx, SM)
>>> {
>>> }
>>>
>>> void RewriteManager::registerStmt(std::unique_ptr<StmtToRewrite> S)
>>> {
>>>     if (!S->replaceS)
>>>     {
>>>         throw std::runtime_error("Must set replaceS");
>>>     }
>>>
>>>     if (m_workList.isStmtPending(S->replaceS))
>>>     {
>>>         throw std::runtime_error("This Stmt will already be replaced");
>>>     }
>>>
>>>     S->m_mgr = this;
>>>     m_workList.addStmt(std::move(S));
>>> }
>>>
>>> RewriteOperation RewriteManager::getReplaced(const clang::Stmt *S)
>>> {
>>>     auto range = SM.getExpansionRange(S->getSourceRange());
>>>     return { range, getExpandedCode(S) };
>>> }
>>>
>>> std::vector<RewriteOperation> RewriteManager::getReplacements()
>>> {
>>>     std::vector<RewriteOperation> results;
>>>
>>>     for (auto &rangeAndAccesses :
>>> m_workList.getRangeToReplacementsMap())
>>>     {
>>>         auto &range = rangeAndAccesses.first;
>>>         auto &accesses = rangeAndAccesses.second;
>>>
>>>         // Cannot replace something inside a macro because it would replace
>>>         // all expansions instead of just the selected AST node. So in a
>>>         // first step, get an enclosing statement that is no longer inside a
>>>         // macro.
>>>         // TODO we could keep the original code more clean by not expanding
>>>         // macro args if the whole expansion does not contain the macro arg
>>>         // more than once.
>>>         auto macroS = GetFullMacroStmt(range, accesses[0]->replaceS,
>>> ACtx);
>>>
>>>         results.push_back(getReplaced(macroS));
>>>
>>>         // TODO we could run clang-format on the replacements. This would
>>>         // especially benefit long macro expansions.
>>>     }
>>>
>>>     m_workList.cleanup();
>>>
>>>     return results;
>>> }
>>>
>>> std::string RewriteManager::getExpandedCode(const clang::Stmt
>>> *toReplaceS)
>>> {
>>>     // TODO performance optimization. This is parsing way more than
>>>     // required.
>>>
>>>     using namespace clang;
>>>
>>>     printf("getExpandedCode:\n");
>>>     toReplaceS->dump();
>>>
>>>     std::string out;
>>>
>>>     auto toReplaceExpStart =
>>> SM.getExpansionLoc(toReplaceS->getLocStart());
>>>     auto toReplaceExpEnd = SM.getExpansionLoc(toReplaceS->getLocEnd());
>>>     auto toReplaceSpellStart =
>>> SM.getSpellingLoc(toReplaceS->getLocStart());
>>>     auto toReplaceSpellEnd =
>>> SM.getSpellingLoc(toReplaceS->getLocEnd());
>>>
>>>     auto FID =
>>> SM.getFileID(SM.getExpansionLoc(toReplaceS->getLocStart()));
>>>
>>>     // The following is inspired by:
>>>     // clang/Rewrite/HTMLRewrite.cpp:HighlightMacros
>>>
>>>     // Re-lex the raw token stream into a token buffer.
>>>     std::vector<Token> TokenStream;
>>>
>>>     const llvm::MemoryBuffer *FromFile = SM.getBuffer(FID);
>>>     Lexer L(FID, FromFile, SM, PP.getLangOpts());
>>>
>>>     // Lex all the tokens in raw mode, to avoid entering #includes or
>>>     // expanding macros.
>>>     while (1)
>>>     {
>>>         Token Tok;
>>>         L.LexFromRawLexer(Tok);
>>>
>>>         // If this is a # at the start of a line, discard it from the token
>>>         // stream. We don't want the re-preprocess step to see #defines,
>>>         // #includes or other preprocessor directives.
>>>         if (Tok.is(tok::hash) && Tok.isAtStartOfLine())
>>>             continue;
>>>
>>>         // If this is a ## token, change its kind to unknown so that
>>>         // repreprocessing it will not produce an error.
>>>         if (Tok.is(tok::hashhash))
>>>             Tok.setKind(tok::unknown);
>>>
>>>         // If this raw token is an identifier, the raw lexer won't have
>>>         // looked up the corresponding identifier info for it.  Do this now
>>>         // so that it will be macro expanded when we re-preprocess it.
>>>         if (Tok.is(tok::raw_identifier))
>>>             PP.LookUpIdentifierInfo(Tok);
>>>
>>>         TokenStream.push_back(Tok);
>>>
>>>         for (auto &rep : m_workList.getSortedReplacements())
>>>         {
>>>             auto repS = rep->replaceS;
>>>             auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>>>             if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>>>             {
>>>                 //
>>>             }
>>>         }
>>>
>>>         if (Tok.is(tok::eof))
>>>             break;
>>>     }
>>>
>>>     // Temporarily change the diagnostics object so that we ignore any
>>>     // generated diagnostics from this pass.
>>>     DiagnosticsEngine
>>> TmpDiags(PP.getDiagnostics().getDiagnosticIDs(),
>>> &PP.getDiagnostics().getDiagnosticOptions(),
>>>                                new IgnoringDiagConsumer);
>>>
>>>     // Copy the preprocessor and all of its state.
>>>     auto PPOpts =
>>> std::make_shared<PreprocessorOptions>(PP.getPreprocessorOpts());
>>>     LangOptions LO = PP.getLangOpts();
>>>     Preprocessor TmpPP(PPOpts, TmpDiags, LO, SM, PP.getPCMCache(),
>>> PP.getHeaderSearchInfo(), PP.getModuleLoader(),
>>> PP.getIdentifierTable().getExternalIdentifierLookup());
>>>     TmpPP.Initialize(PP.getTargetInfo(), PP.getAuxTargetInfo());
>>>     TmpPP.setExternalSource(PP.getExternalSource());
>>>     TmpPP.setPreprocessedOutput(true);
>>>
>>>     std::map<const clang::IdentifierInfo *, bool>
>>> MacroPreviouslyEnabled;
>>>     for (const auto &m : PP.macros())
>>>     {
>>>         // printf("PREDEF MACRO: %s\n", m.first->getName().str().c_str());
>>>         TmpPP.getMacroDefinition(m.first);
>>>
>>>         for (const auto &tmpm : TmpPP.macros())
>>>         {
>>>             if (tmpm.first == m.first)
>>>             {
>>>                 auto MD = m.second.getLatest();
>>>                 auto MI = MD->getMacroInfo();
>>>                 // If this is a recursive call we might be in a macro
>>>                 // expansion and the macro might be disabled. We need to
>>>                 // enable it for now so that all expansions work. Restore
>>>                 // it later.
>>>                 MacroPreviouslyEnabled[tmpm.first] = MI->isEnabled();
>>>                 if (!MI->isEnabled())
>>>                 {
>>>                     MD->getMacroInfo()->EnableMacro();
>>>                 }
>>>
>>>                 // This should not change anything since we just copy data
>>>                 // over.
>>>                 auto &mutableState =
>>> const_cast<std::remove_const<decltype(tmpm.second)>::type
>>> &>(tmpm.second);
>>>                 mutableState.setLatest(MD);
>>>                 break;
>>>             }
>>>         }
>>>     }
>>>
>>>     class MacroArgCollector : public clang::PPCallbacks
>>>     {
>>>     public:
>>>         MacroArgCollector(Preprocessor &TmpPP) : TmpPP(TmpPP) {}
>>>
>>>         void MacroExpands(const Token &Tok, const MacroDefinition
>>> &MD, SourceRange Range, const MacroArgs *Args) override
>>>         {
>>>             if (!Args)
>>>             {
>>>                 return;
>>>             }
>>>             printf("GOT MACRO ARGS EXPANSION CALLBACK\n");
>>>             for (int i = 0; i < (int)Args->getNumMacroArguments(); i++)
>>>             {
>>>                 auto TokUnex = Args->getUnexpArgument(i);
>>>                 // That's just non-const for a cache, so should be fine.
>>>                 auto TokPreExp = const_cast<MacroArgs
>>> *>(Args)->getPreExpArgument(i, TmpPP);
>>>                 printf("unexp: %s\n",
>>> TmpPP.getSpelling(*TokUnex).c_str());
>>>                 for (const auto &T : TokPreExp)
>>>                 {
>>>                     printf("preexp: %s\n",
>>> TmpPP.getSpelling(T).c_str());
>>>                 }
>>>             }
>>>         }
>>>
>>>         Preprocessor &TmpPP;
>>>     };
>>>     TmpPP.addPPCallbacks(std::make_unique<MacroArgCollector>(TmpPP));
>>>     // Instead: collect the macro arg info in the raw lexing step above,
>>>     // or do another pass that uses the PP but without expansions.
>>>
>>>     /*printf("DUMP MACRO INFO\n");
>>>     for (const auto &m : PP.macros())
>>>         PP.dumpMacroInfo(m.first);
>>>     printf("---\n");
>>>     for (const auto &m : TmpPP.macros())
>>>         TmpPP.dumpMacroInfo(m.first);
>>>     printf("DUMP MACRO INFO END\n");*/
>>>
>>>     DiagnosticsEngine *OldDiags = &TmpPP.getDiagnostics();
>>>
>>>     // Inform the preprocessor that we don't want comments.
>>>     TmpPP.SetCommentRetentionState(false, false);
>>>
>>>     // We don't want pragmas either. Although we filtered out #pragma,
>>>     // removing _Pragma and __pragma is much harder.
>>>     bool PragmasPreviouslyEnabled = TmpPP.getPragmasEnabled();
>>>     TmpPP.setPragmasEnabled(false);
>>>
>>>     // Enter the tokens we just lexed.  This will cause them to be macro
>>>     // expanded but won't enter sub-files (because we removed #'s).
>>>     TmpPP.EnterTokenStream(TokenStream, false);
>>>
>>>     TokenConcatenation ConcatInfo(TmpPP);
>>>
>>>     // Lex all the tokens.
>>>     Token Tok;
>>>     TmpPP.Lex(Tok);
>>>
>>>     std::map<SourceLocation, int> slocIdx;
>>>
>>>     auto checkReplacement = [&]() {
>>>         for (auto &rep : m_workList.getSortedReplacements())
>>>         {
>>>             // auto rep = r.second.get();
>>>             auto repS = rep->replaceS;
>>>             auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>>>             // TODO we need to check here if the repS spans the full range
>>>             // (or largest?)
>>>             if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>>>             {
>>>                 if (slocIdx[spellLoc] == 7)
>>>                 {
>>>                     // replace
>>>                 }
>>>                 slocIdx[spellLoc]++;
>>>
>>>                 // Done replacing that one, but have to keep it alive until
>>>                 // we're done with it.
>>>                 m_workList.markDone(rep);
>>>
>>>                 printf("[[[\n");
>>>                 auto repStr = rep->makeReplaceStr();
>>>                 printf("REPLACED: %s ]]]\n", repStr.c_str());
>>>                 out += repStr;
>>>
>>>                 // Skip ahead until after the whole replacement.
>>>                 auto repEnd = SM.getSpellingLoc(repS->getLocEnd());
>>>                 while (repEnd != SM.getSpellingLoc(Tok.getLocation()))
>>>                 {
>>>                     TmpPP.Lex(Tok);
>>>                     assert(!Tok.is(tok::eof) && "End not found");
>>>                 }
>>>
>>>                 // Eat one more since we stopped at the end token and we
>>>                 // want to continue after it.
>>>                 TmpPP.Lex(Tok);
>>>
>>>                 return true;
>>>             }
>>>         }
>>>         return false;
>>>     };
>>>
>>>     while (Tok.isNot(tok::eof))
>>>     {
>>>         printf("TOKEN: %s\n", TmpPP.getSpelling(Tok).c_str());
>>>
>>>         auto TokLoc = Tok.getLocation();
>>>         auto TokExp = SM.getExpansionLoc(TokLoc);
>>>         if (SM.isBeforeInTranslationUnit(toReplaceExpEnd, TokExp))
>>>         {
>>>             // Anything after the Stmt we want to replace is not
>>>             // interesting.
>>>             break;
>>>         }
>>>
>>>         // Skip ahead until we are at the expansion start of the Stmt we
>>>         // want to replace.
>>>         if (!SM.isBeforeInTranslationUnit(TokLoc, toReplaceExpStart))
>>>         {
>>>             if (TokLoc.isMacroID())
>>>             {
>>>                 // This is the first token of a macro expansion.
>>>                 auto LLoc = SM.getExpansionRange(TokLoc);
>>>
>>>                 // Ignore tokens whose instantiation location was not the
>>>                 // main file.
>>>                 if (SM.getFileID(LLoc.first) != FID)
>>>                 {
>>>                     TmpPP.Lex(Tok);
>>>                     continue;
>>>                 }
>>>
>>>                 assert(SM.getFileID(LLoc.second) == FID &&
>>>                        "Start and end of expansion must be in the same "
>>>                        "ultimate file!");
>>>
>>>                 bool stopOutputOnNextToken = false;
>>>                 bool toReplaceStartsInMacro = toReplaceExpStart ==
>>> TokExp;
>>>                 bool toReplaceEndsInMacro = toReplaceExpEnd == TokExp;
>>>                 bool startedOutput = false;
>>>
>>>                 Token PrevPrevTok;
>>>                 Token PrevTok = Tok;
>>>
>>>                 while (!Tok.is(tok::eof) &&
>>> SM.getExpansionLoc(Tok.getLocation()) == LLoc.first)
>>>                 {
>>>                     printf("TOKEN (in macro): %s\n",
>>> TmpPP.getSpelling(Tok).c_str());
>>>
>>>                     auto TokSpell =
>>> SM.getSpellingLoc(Tok.getLocation());
>>>                     if (stopOutputOnNextToken)
>>>                     {
>>>                         break;
>>>                     }
>>>                     if (toReplaceEndsInMacro && TokSpell ==
>>> toReplaceSpellEnd)
>>>                     {
>>>                         stopOutputOnNextToken = true;
>>>                     }
>>>
>>>                     if (toReplaceStartsInMacro && !startedOutput)
>>>                     {
>>>                         if (TokSpell == toReplaceSpellStart)
>>>                         {
>>>                             startedOutput = true;
>>>                         }
>>>                         else
>>>                         {
>>>                             TmpPP.Lex(Tok);
>>>                             continue;
>>>                         }
>>>                     }
>>>
>>>                     // If the tokens were already space separated, or if
>>>                     // they must be to avoid them being implicitly pasted,
>>>                     // add a space between them.
>>>                     if (Tok.hasLeadingSpace() ||
>>> ConcatInfo.AvoidConcat(PrevPrevTok, PrevTok, Tok))
>>>                         out += ' ';
>>>
>>>                     if (checkReplacement())
>>>                     {
>>>                         continue;
>>>                     }
>>>
>>>                     out += TmpPP.getSpelling(Tok);
>>>                     TmpPP.Lex(Tok);
>>>                 }
>>>                 if (stopOutputOnNextToken)
>>>                 {
>>>                     break;
>>>                 }
>>>             }
>>>             else
>>>             {
>>>                 if (checkReplacement())
>>>                 {
>>>                     continue;
>>>                 }
>>>
>>>                 // Output original code because we are outside of a
>>>                 // replacement.
>>>                 out += TmpPP.getSpelling(Tok);
>>>                 TmpPP.Lex(Tok);
>>>             }
>>>         }
>>>         else
>>>         {
>>>             TmpPP.Lex(Tok);
>>>         }
>>>     }
>>>
>>>     // Restore the preprocessor's old state.
>>>     TmpPP.setDiagnostics(*OldDiags);
>>>     TmpPP.setPragmasEnabled(PragmasPreviouslyEnabled);
>>>
>>>     for (const auto &tmpm : TmpPP.macros())
>>>     {
>>>         auto it = MacroPreviouslyEnabled.find(tmpm.first);
>>>         if (it != MacroPreviouslyEnabled.end())
>>>         {
>>>             auto MD = tmpm.second.getLatest();
>>>             auto MI = MD->getMacroInfo();
>>>             if (MI->isEnabled() && !it->second)
>>>             {
>>>                 MI->DisableMacro();
>>>             }
>>>             else if (!MI->isEnabled() && it->second)
>>>             {
>>>                 MI->EnableMacro();
>>>             }
>>>         }
>>>     }
>>>
>>>     return out;
>>> }
>>>
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>>
>>
>
