[cfe-dev] Advanced Rewriting

Manuel Klimek via cfe-dev cfe-dev at lists.llvm.org
Wed Oct 24 01:30:24 PDT 2018


If you want this in the core libs, a great idea would be to write up a doc
or email explaining the design and the use cases.
Note that if this doesn't solve a generic enough use case, it might still
be a good idea for it to live in a different place - API surface is a cost
to users.

On Tue, Oct 23, 2018 at 5:56 PM Jonas Toth via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> Hi Rafael,
>
> nice to see your progress!
>
> In my opinion these advancements should go in the Tooling and Rewrite
> directly in clang but integrate with the existing facilities. It would be
> inconvenient two have two separate ways of doing rewritings.
>
> What I am curious about is how to resolve the macro-situation. If the
> refactoring will work directly on the AST and ignore macros, how is the
> refactoring translated into real code again?
>
> Having an functional way of doing the refactorings with just transforming
> the AST, e.g. replacing one node with a new subtree would be great and
> elegant on the AST side, but how would we move back from AST to code?
>
> Best, Jonas
> Am 23.10.18 um 17:32 schrieb Rafael·Stahl:
>
> Hey everyone
>
> the two major limitations are resolved now:
>
> - The macro argument issue
> - Support for replacing Decls using clang::ast_type_traits::DynTypedNode
>
> The macro issue vanished by itself once I figured a little unintuitive
> detail out: The SourceLocations from the original AST and the ones from the
> new Preprocessor Lexer did not compare equal even though they refer to the
> exact same location.
>
> I believe this is because expansion SrcLocs have a kind of pointer
> identity representation and equality just compares raw representations,
> such that even when they point to the same locations, they don't compare
> equal. What worked for me was a kind of "deep" comparison:
>
> bool clutil::DeepSrcLocEqual(clang::SourceLocation lhs,
> clang::SourceLocation rhs, const clang::SourceManager &SM)
> {
>     if (lhs == rhs)
>         return true;
>
>     if (SM.getExpansionLoc(lhs) != SM.getExpansionLoc(rhs))
>         return false;
>     if (SM.getSpellingLoc(lhs) != SM.getSpellingLoc(rhs))
>         return false;
>
>     clang::SourceLocation lhsMacro, rhsMacro;
>     if (SM.isMacroArgExpansion(lhs, &lhsMacro))
>     {
>         if (!SM.isMacroArgExpansion(rhs, &rhsMacro))
>             return false;
>         if (!DeepSrcLocEqual(lhsMacro, rhsMacro, SM))
>             return false;
>     }
>
>     return true;
> }
>
> Attached is the cleaned up rewriter that I ended up with now. Still, if
> there is any interest, I'd be happy to contribute this back to clangs
> libraries, but I would require some feedback how this fits into existing
> facilities.
>
> Best regards
> Rafael
> On 17.07.18 16:19, Jonas Toth wrote:
>
> Hi Rafael,
>
> I did read into clang-refactor a while ago but unfortunatly could not
> follow that up. If I recall correctly its about source-to-source
> transformation (as you said) and aims at implementing the primitive
> refactorings that exist (e.g. extract-method, extract-variable, ....).
>
> Rewriting itself should happen with the normal tooling framework.
> (https://clang.llvm.org/docs/RefactoringEngine.html)
>
> Maybe the implementers of the existing code can give better comments on
> you proposal (and might have considered a similar solution to yours
> already).
>
> +Alex Lorenz
>
> All the best, Jonas
>
>
> Am 17.07.2018 um 14:46 schrieb Rafael·Stahl:
>
> Hi Jonas
>
> Thanks for introducing me to this, I have seen the "Replacement" before,
> but not clang-refactor.
>
> However it seems to only provide management facilities around rewrite
> operations and not aid with the rewriting itself. Am I missing something
> here?
>
> The two core problems for me:
>
> - nesting replacements: When implementing replacements with
> clang-refactor, I still have to provide replacements that are closed in
> themselves. I cannot make them depend on others, right?
> - macros: clang-refactor only seems to work with spelling locations.
>
> Maybe an even simpler example: Replace all additions with "add(lhs, rhs)".
> This in itself is very difficult with clang as soon as the Stmts are nested
> or macros are involved.
>
> Best regards
> Rafael
>
> On 16.07.2018 19:06, Jonas Toth via cfe-dev wrote:
>
> Hi Rafael,
>
> wouldn't your usecase be a task for clang-refactor?
>
> Best,  Jonas
>
> Am 16.07.2018 um 17:08 schrieb Rafael·Stahl via cfe-dev:
>
> Hey everyone
>
> The rewriting API of Clang operates on the source code in textual form.
> The user can use AST nodes to figure out what to replace, but in the end he
> has to remove and insert snippets in a linear piece of text.
>
> This is very inconvenient when it is required to restructure and nest
> replacements. The involvement of macros makes a manual process even more
> difficult. See some recent threads expressing difficulty with the API
> [1][2].
>
> What do I mean by "nested replacements"? For example in the following:
>
>     int i = x + s->a;
>
> I would want to replace the BinaryOperator with a function call and the
> MemberExpr with some constant:
>
>     int i = Addition(x, 7);
>
> When keeping the two replacement rules independent of each other,
> achieving this with the current API is extremely difficult. More so when
> macros are involved.
>
> I am proposing some kind of helper that aims to solve these issues by
> providing an interface that offers to directly replace AST nodes and a
> mechanism to nest AST node replacements - without having to worry about
> macros.
>
> Potential usage:
>
> - Develop a class that derives from StmtToRewrite to define how
> replacements should happen:
>
>     class RewriteAdds : public cu::StmtToRewrite
>     {
>     public:
>         std::string makeReplaceStr() const override
>         {
>             auto binOp = dyn_cast<BinaryOperator>(replaceS);
>             return "Addition(" +
> getMgr()->getReplaced(binOp->getLHS()).strToInsert + ", " +
> getMgr()->getReplaced(binOp->getRHS()).strToInsert + ")";
>         }
>     };
>
>     class RewriteMembs : public cu::StmtToRewrite
>     {
>     public:
>         std::string makeReplaceStr() const override
>         {
>             return "7";
>         }
>     };
>
> - Construct a RewriteManager:
>
>     cu::RewriteManager mgr(ACtx, PP);
>
> - Add rewriting operations to the manager:
>
>     // std::vector<const Stmt *> AddStmts = /* matched from
> binaryOperator() with plus */
>     // std::vector<const Stmt *> MembStmts = /* matched from memberExpr()
> */
>     for (const auto &S : AddStmts) mgr.registerStmt<RewriteAdds>(S);
>     for (const auto &S : MembStmts) mgr.registerStmt<RewriteMembs>(S);
>
> - Retrieve and apply the results:
>
>     clang::Rewriter rewriter(SM, LangOpts);
>     for (const auto &r : mgr.getReplacements()) {
>         rewriter.RemoveText(r.rangeToRemove);
>         rewriter.InsertText(r.rangeToRemove.getBegin(), r.strToInsert);
>     }
>
>
> At the end of this mail is my low quality code that kind-of implements
> this. TLDR:
>
> - Build a hierarchy of stmts to replace and keep track of which
> replacements must be combined
> - Move further up in the AST if these replacements are inside a macro
> - Recursively lex the file and look for replacements outside-in by
> spelling locations. Expand any macros that are encountered during this. The
> re-lexing idea is based on the hint in [3].
>
> The code has the following shortcomings:
>
> - I do not know how to distinguish macro argument expansions within
> macros. For example in "#define FOO(a) a + a" the two "a"s expand to
> different AST nodes that could be replaced with different rules. This is an
> important issue, because it can lead to completely broken code with
> nesting.
> - Limited to Stmts, when Decls should be supported too.
> - Very un-optimized with lexing the entire source file many times. Easy to
> solve, but didn't want to raise the complexity further for now.
> - Could keep written code more clean by only expanding macros if required.
> For example not required if just a macro arg is replaced and all expansions
> would be the same.
>
>
> I am very interested in your general thoughts. I'm not very experienced
> with clang, but this was my vision how I would want to do replacements. Are
> you interested in getting this into clang? I would need help with ironing
> out the remaining issues.
>
> -Rafael
>
>
> [1] http://lists.llvm.org/pipermail/cfe-dev/2018-July/058430.html
> [2] http://lists.llvm.org/pipermail/cfe-dev/2018-June/058213.html
> [3] http://lists.llvm.org/pipermail/cfe-dev/2017-August/055079.html
>
>
>
> ----------------------------------------
>
> RewriteManager.h
>
> ----------------------------------------
>
> #ifndef CLANGUTIL_REWRITEMANAGER_H
> #define CLANGUTIL_REWRITEMANAGER_H
>
> #include "ClangUtil/SourceRangeLess.h"
> #include "make_unique.h"
> #include "clang/AST/AST.h"
> #include <vector>
> #include <map>
>
>
> // TODO extend to decls
>
>
> namespace cu
> {
> // Represents a statement in the original AST that should be rewritten. To
> implement recursive replacements, call
> // getMgr()->getReplaced() on any AST node within the makeReplaceStr
> callback.
> class StmtToRewrite
> {
>     friend class RewriteManager;
>
> public:
>     // Returns the enclosing RewriteManager.
>     class RewriteManager *getMgr() const;
>     // Override this to build a replacement string. Implement recursive
> replacements with RewriteManager::getReplaced.
>     virtual std::string makeReplaceStr() const = 0;
>
>     // The statement to replace.
>     const clang::Stmt *replaceS = nullptr;
>
> private:
>     RewriteManager *m_mgr;
> };
>
> struct RewriteOperation
> {
>     clang::SourceRange rangeToRemove;
>     std::string strToInsert;
> };
>
> // A class for managing replacements of AST nodes. It allows to
> specifically target AST nodes instead of raw source
> // locations to enable easy replacements involving macros and nested
> replacements.
> // For extended documentation see: doc/rewriting.md
> class RewriteManager
> {
> public:
>     RewriteManager(clang::ASTContext &ACtx, clang::Preprocessor &PP);
>
>     clang::ASTContext &getACtx() const { return ACtx; }
>
>     // Registers a StmtToRewrite for use with getReplacements. Call this
> on all
>     // statements that should be rewritten before calling any rewriting
> functions.
>     void registerStmt(std::unique_ptr<StmtToRewrite> S);
>
>     // Helper for constructing the custom type from a Stmt.
>     template <typename T, typename... Args>
>     void registerStmt(const clang::Stmt *S, Args... args)
>     {
>         auto p = std::make_unique<T>(std::forward<Args>(args)...);
>         p->replaceS = S;
>         registerStmt(std::move(p));
>     }
>
>     // Get the full replacement of an AST node. Note that this function
> removes any replaced statements from the work
>     // list, so calling it twice will only replace the first time.
>     RewriteOperation getReplaced(const clang::Stmt *S);
>     // Get all replacements. These may be fewer than the requested ones
> because of nesting.
>     std::vector<RewriteOperation> getReplacements();
>
> private:
>     std::string getExpandedCode(const clang::Stmt *toReplaceS);
>
> private:
>     clang::ASTContext &ACtx;
>     const clang::LangOptions &LangOpts;
>     clang::SourceManager &SM;
>     clang::Preprocessor &PP;
>
>     // Manages the pending replacements.
>     class WorkList
>     {
>     public:
>         typedef std::map<clang::SourceRange, std::vector<const
> StmtToRewrite *>> RangeToRepMap;
>
>         WorkList(clang::ASTContext &ACtx, clang::SourceManager &SM);
>
>         bool isStmtPending(const clang::Stmt *S) const;
>         void addStmt(std::unique_ptr<StmtToRewrite> S);
>         const RangeToRepMap &getRangeToReplacementsMap() const;
>         std::vector<const StmtToRewrite *> getSortedReplacements() const;
>         void markDone(const StmtToRewrite *S);
>         void cleanup();
>
>     private:
>         clang::ASTContext &ACtx;
>         clang::SourceManager &SM;
>         std::vector<std::unique_ptr<StmtToRewrite>> m_pending;
>         std::vector<std::unique_ptr<StmtToRewrite>> m_done;
>         RangeToRepMap m_rangeToReplacements;
>     };
>
>     WorkList m_workList;
> };
>
> } // namespace cu
>
> #endif
>
>
>
> ----------------------------------------
>
> RewriteManager.cpp
>
> ----------------------------------------
>
> #include "ClangUtil/RewriteManager.h"
> #include "ClangUtil/ASTUtil.h"
> #include "clang/Lex/Lexer.h"
> #include "clang/Lex/Preprocessor.h"
> #include "clang/Lex/PreprocessorOptions.h"
> #include "clang/Lex/TokenConcatenation.h"
> #include "clang/Lex/MacroArgs.h"
>
>
> using namespace cu;
>
>
> // Returns a Stmt that is the first parent of startS whose expansion range
> is within the given range.
> static const clang::Stmt *GetFullMacroStmt(clang::SourceRange range, const
> clang::Stmt *startS, clang::ASTContext &ACtx)
> {
>     auto &SM = ACtx.getSourceManager();
>
>     // Walk the tree upwards until ST does no longer expand to within
> range.
>     const clang::Stmt *ST = startS;
>     while (true)
>     {
>         const auto &parents = ACtx.getParents(*ST);
>         if (parents.empty())
>         {
>             break;
>         }
>         auto childS = ST;
>         ST = parents[0].get<clang::Stmt>();
>         if (!ST)
>         {
>             if (auto D = parents[0].get<clang::Decl>())
>             {
>                 const auto &parentsD = ACtx.getParents(*D);
>                 if (parentsD.empty())
>                 {
>                     break;
>                 }
>                 ST = parentsD[0].get<clang::Stmt>();
>                 if (!ST)
>                 {
>                     break;
>                 }
>             }
>             else
>             {
>                 break;
>             }
>         }
>
>         auto exLocS = SM.getExpansionLoc(ST->getLocStart());
>         auto exLocE = SM.getExpansionLoc(ST->getLocEnd());
>         if (SM.isBeforeInTranslationUnit(exLocS, range.getBegin()) ||
>             SM.isBeforeInTranslationUnit(range.getEnd(), exLocE))
>         {
>             return childS;
>         }
>     }
>
>     return nullptr;
> }
>
>
> RewriteManager *StmtToRewrite::getMgr() const
> {
>     return m_mgr;
> }
>
>
> RewriteManager::WorkList::WorkList(clang::ASTContext &ACtx,
> clang::SourceManager &SM) : ACtx(ACtx), SM(SM) {}
> bool RewriteManager::WorkList::isStmtPending(const clang::Stmt *S) const
> {
>     for (const auto &r : m_pending)
>     {
>         if (r->replaceS == S)
>         {
>             return true;
>         }
>     }
>     return false;
> }
> void RewriteManager::WorkList::addStmt(std::unique_ptr<StmtToRewrite> S)
> {
>     // Use the expansion range for maximal replacement flexibility in
> macros.
>     auto replaceRange =
> SM.getExpansionRange(S->replaceS->getSourceRange());
>
>     // TODO not quite correct.
>     /*auto sortRanges = [&](std::vector<const StmtToRewrite *> &vec) {
>         std::sort(vec.begin(), vec.end(), [&](const StmtToRewrite *lhs,
> const StmtToRewrite *rhs) {
>             auto lhsRange =
> SM.getExpansionRange(lhs->replaceS->getSourceRange());
>             auto rhsRange =
> SM.getExpansionRange(rhs->replaceS->getSourceRange());
>             return IsContained(rhsRange, lhsRange, SM);
>         });
>     };*/
>
>     // Establish hierarchical relation between all ranges.
>     bool found = false;
>     // First, check if this range is within one we already have.
>     for (auto &r : m_rangeToReplacements)
>     {
>         if (IsContained(replaceRange, r.first, SM))
>         {
>             // Insert in a sorted order.
>             for (auto it = r.second.begin(); it != r.second.end(); ++it)
>             {
>                 //auto testRange =
> SM.getExpansionRange((*it)->replaceS->getSourceRange());
>                 // if (IsContained(testRange, replaceRange, SM))
>                 if (IsParent(S->replaceS, (*it)->replaceS, ACtx))
>                 {
>                     r.second.insert(it, S.get());
>                     found = true;
>                     break;
>                 }
>             }
>             if (!found)
>             {
>                 r.second.push_back(S.get());
>                 found = true;
>             }
>             break;
>         }
>     }
>     // Not within existing range, add as new top-level range.
>     if (!found)
>     {
>         // Check if any existing ranges are contained within the new one.
>         std::vector<const StmtToRewrite *> moveThese;
>         auto it = m_rangeToReplacements.begin();
>         while (it != m_rangeToReplacements.end())
>         {
>             if (IsContained(it->first, replaceRange, SM))
>             {
>                 moveThese.insert(moveThese.end(), it->second.begin(),
> it->second.end());
>                 it = m_rangeToReplacements.erase(it);
>             }
>             else
>             {
>                 ++it;
>             }
>         }
>         auto &accesses = m_rangeToReplacements[replaceRange];
>         // The order is important here. We want the first element to be
> the one that spans the full range.
>         accesses.push_back(S.get());
>         // TODO sort "moveThese".
>         accesses.insert(accesses.end(), moveThese.begin(),
> moveThese.end());
>     }
>
>     int count = 0;
>     for (const auto &r : m_rangeToReplacements)
>     {
>         printf("range %i\n", count++);
>         for (const auto &a : r.second)
>         {
>             printf("replacement:\n");
>             a->replaceS->dump();
>         }
>     }
>
>     m_pending.push_back(std::move(S));
> }
> const RewriteManager::WorkList::RangeToRepMap
> &RewriteManager::WorkList::getRangeToReplacementsMap() const
> {
>     return m_rangeToReplacements;
> }
> std::vector<const StmtToRewrite *>
> RewriteManager::WorkList::getSortedReplacements() const
> {
>     std::vector<const StmtToRewrite *> result;
>     for (auto &r : m_rangeToReplacements)
>     {
>         result.insert(result.end(), r.second.begin(), r.second.end());
>     }
>     return result;
> }
> void RewriteManager::WorkList::markDone(const StmtToRewrite *S)
> {
>     // Remove from hierarchy.
>     for (auto &r : m_rangeToReplacements)
>     {
>         r.second.erase(std::remove(r.second.begin(), r.second.end(), S),
> r.second.end());
>     }
>
>     // Move from pending to done list.
>     auto it = std::find_if(m_pending.begin(), m_pending.end(),
>                            [&](const std::unique_ptr<StmtToRewrite> &rep)
> { return rep.get() == S; });
>     if (it == m_pending.end())
>     {
>         throw std::runtime_error("Did not find replacement to mark as
> done");
>     }
>     m_done.push_back(std::move(*it));
>     m_pending.erase(it);
> }
> void RewriteManager::WorkList::cleanup()
> {
>     m_done.clear();
> }
>
>
> RewriteManager::RewriteManager(clang::ASTContext &ACtx,
> clang::Preprocessor &PP)
>     : ACtx(ACtx), LangOpts(ACtx.getLangOpts()),
> SM(ACtx.getSourceManager()), PP(PP), m_workList(ACtx, SM)
> {
> }
>
> void RewriteManager::registerStmt(std::unique_ptr<StmtToRewrite> S)
> {
>     if (!S->replaceS)
>     {
>         throw std::runtime_error("Must set replaceS");
>     }
>
>     if (m_workList.isStmtPending(S->replaceS))
>     {
>         throw std::runtime_error("This Stmt will already be replaced");
>     }
>
>     S->m_mgr = this;
>     m_workList.addStmt(std::move(S));
> }
>
> RewriteOperation RewriteManager::getReplaced(const clang::Stmt *S)
> {
>     auto range = SM.getExpansionRange(S->getSourceRange());
>     return { range, getExpandedCode(S) };
> }
>
> std::vector<RewriteOperation> RewriteManager::getReplacements()
> {
>     std::vector<RewriteOperation> results;
>
>     for (auto &rangeAndAccesses : m_workList.getRangeToReplacementsMap())
>     {
>         auto &range = rangeAndAccesses.first;
>         auto &accesses = rangeAndAccesses.second;
>
>         // Cannot replace something inside a macro because it would
> replace all expansions instead of just the selected
>         // AST node. So in a first step, get an enclosing statement that
> is no longer inside a macro.
>         // TODO we could keep the original code more clean by not
> expanding macro args if the whole expansion does not
>         // contain the macro arg more than once.
>         auto macroS = GetFullMacroStmt(range, accesses[0]->replaceS,
> ACtx);
>
>         results.push_back(getReplaced(macroS));
>
>         // TODO we could run clang-format on the replacements. this would
> especially benefit long macro expansions.
>     }
>
>     m_workList.cleanup();
>
>     return results;
> }
>
> std::string RewriteManager::getExpandedCode(const clang::Stmt *toReplaceS)
> {
>     // TODO performance optimization. this is parsing way more than
> required.
>
>     using namespace clang;
>
>     printf("getExpandedCode:\n");
>     toReplaceS->dump();
>
>     std::string out;
>
>     auto toReplaceExpStart =
> SM.getExpansionLoc(toReplaceS->getLocStart());
>     auto toReplaceExpEnd = SM.getExpansionLoc(toReplaceS->getLocEnd());
>     auto toReplaceSpellStart =
> SM.getSpellingLoc(toReplaceS->getLocStart());
>     auto toReplaceSpellEnd = SM.getSpellingLoc(toReplaceS->getLocEnd());
>
>     auto FID =
> SM.getFileID(SM.getExpansionLoc(toReplaceS->getLocStart()));
>
>     // The following is inspired by:
> clang/Rewrite/HTMLRewrite.cpp:HighlightMacros
>
>     // Re-lex the raw token stream into a token buffer.
>     std::vector<Token> TokenStream;
>
>     const llvm::MemoryBuffer *FromFile = SM.getBuffer(FID);
>     Lexer L(FID, FromFile, SM, PP.getLangOpts());
>
>     // Lex all the tokens in raw mode, to avoid entering #includes or
> expanding
>     // macros.
>     while (1)
>     {
>         Token Tok;
>         L.LexFromRawLexer(Tok);
>
>         // If this is a # at the start of a line, discard it from the
> token stream.
>         // We don't want the re-preprocess step to see #defines, #includes
> or other
>         // preprocessor directives.
>         if (Tok.is(tok::hash) && Tok.isAtStartOfLine())
>             continue;
>
>         // If this is a ## token, change its kind to unknown so that
> repreprocessing
>         // it will not produce an error.
>         if (Tok.is(tok::hashhash))
>             Tok.setKind(tok::unknown);
>
>         // If this raw token is an identifier, the raw lexer won't have
> looked up
>         // the corresponding identifier info for it.  Do this now so that
> it will be
>         // macro expanded when we re-preprocess it.
>         if (Tok.is(tok::raw_identifier))
>             PP.LookUpIdentifierInfo(Tok);
>
>         TokenStream.push_back(Tok);
>
>         for (auto &rep : m_workList.getSortedReplacements())
>         {
>             auto repS = rep->replaceS;
>             auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>             if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>             {
>                 //
>             }
>         }
>
>         if (Tok.is(tok::eof))
>             break;
>     }
>
>     // Temporarily change the diagnostics object so that we ignore any
> generated
>     // diagnostics from this pass.
>     DiagnosticsEngine TmpDiags(PP.getDiagnostics().getDiagnosticIDs(),
> &PP.getDiagnostics().getDiagnosticOptions(),
>                                new IgnoringDiagConsumer);
>
>     // Copy the preprocessor and all of its state.
>     auto PPOpts =
> std::make_shared<PreprocessorOptions>(PP.getPreprocessorOpts());
>     LangOptions LO = PP.getLangOpts();
>     Preprocessor TmpPP(PPOpts, TmpDiags, LO, SM, PP.getPCMCache(),
> PP.getHeaderSearchInfo(), PP.getModuleLoader(),
> PP.getIdentifierTable().getExternalIdentifierLookup());
>     TmpPP.Initialize(PP.getTargetInfo(), PP.getAuxTargetInfo());
>     TmpPP.setExternalSource(PP.getExternalSource());
>     TmpPP.setPreprocessedOutput(true);
>
>     std::map<const clang::IdentifierInfo *, bool> MacroPreviouslyEnabled;
>     for (const auto &m : PP.macros())
>     {
>         // printf("PREDEF MACRO: %s\n", m.first->getName().str().c_str());
>         TmpPP.getMacroDefinition(m.first);
>
>         for (const auto &tmpm : TmpPP.macros())
>         {
>             if (tmpm.first == m.first)
>             {
>                 auto MD = m.second.getLatest();
>                 auto MI = MD->getMacroInfo();
>                 // If this is a recursive call we might be in a macro
> expansion and the macro might be disabled. We need
>                 // to enable it for now so that all expansions work.
> Restore it later.
>                 MacroPreviouslyEnabled[tmpm.first] = MI->isEnabled();
>                 if (!MI->isEnabled())
>                 {
>                     MD->getMacroInfo()->EnableMacro();
>                 }
>
>                 // This should not change anything since we just copy data
> over.
>                 auto &mutableState =
> const_cast<std::remove_const<decltype(tmpm.second)>::type &>(tmpm.second);
>                 mutableState.setLatest(MD);
>                 break;
>             }
>         }
>     }
>
>     class MacroArgCollector : public clang::PPCallbacks
>     {
>     public:
>         MacroArgCollector(Preprocessor &TmpPP) : TmpPP(TmpPP) {}
>
>         void MacroExpands(const Token &Tok, const MacroDefinition &MD,
> SourceRange Range, const MacroArgs *Args) override
>         {
>             if (!Args)
>             {
>                 return;
>             }
>             printf("GOT MACRO ARGS EXPANSION CALLBACK\n");
>             for (int i = 0; i < (int)Args->getNumMacroArguments(); i++)
>             {
>                 auto TokUnex = Args->getUnexpArgument(i);
>                 // Thats just non-const for a cache, so should be fine.
>                 auto TokPreExp = const_cast<MacroArgs
> *>(Args)->getPreExpArgument(i, TmpPP);
>                 printf("unexp: %s\n",
> TmpPP.getSpelling(*TokUnex).c_str());
>                 for (const auto &T : TokPreExp)
>                 {
>                     printf("preexp: %s\n", TmpPP.getSpelling(T).c_str());
>                 }
>             }
>         }
>
>         Preprocessor &TmpPP;
>     };
> TmpPP.addPPCallbacks(std::make_unique<MacroArgCollector>(TmpPP));
>     // Instead: collect the macro arg info in the law lexing step above.
> or do another pass that uses the PP but without expansions.
>
>     /*printf("DUMP MACRO INFO\n");
>     for (const auto &m : PP.macros())
>         PP.dumpMacroInfo(m.first);
>     printf("---\n");
>     for (const auto &m : TmpPP.macros())
>         TmpPP.dumpMacroInfo(m.first);
>     printf("DUMP MACRO INFO END\n");*/
>
>     DiagnosticsEngine *OldDiags = &TmpPP.getDiagnostics();
>
>     // Inform the preprocessor that we don't want comments.
>     TmpPP.SetCommentRetentionState(false, false);
>
>     // We don't want pragmas either. Although we filtered out #pragma,
> removing
>     // _Pragma and __pragma is much harder.
>     bool PragmasPreviouslyEnabled = TmpPP.getPragmasEnabled();
>     TmpPP.setPragmasEnabled(false);
>
>     // Enter the tokens we just lexed.  This will cause them to be macro
> expanded
>     // but won't enter sub-files (because we removed #'s).
>     TmpPP.EnterTokenStream(TokenStream, false);
>
>     TokenConcatenation ConcatInfo(TmpPP);
>
>     // Lex all the tokens.
>     Token Tok;
>     TmpPP.Lex(Tok);
>
>     std::map<SourceLocation, int> slocIdx;
>
>     auto checkReplacement = [&]() {
>         for (auto &rep : m_workList.getSortedReplacements())
>         {
>             // auto rep = r.second.get();
>             auto repS = rep->replaceS;
>             auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>             // TODO we need to check here if the repS spans the full range
> (or largest?)
>             if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>             {
>                 if (slocIdx[spellLoc] == 7)
>                 {
>                     // replace
>                 }
>                 slocIdx[spellLoc]++;
>
>                 // Done replacing that one, but have to keep it alive
> until we're done with it.
>                 m_workList.markDone(rep);
>
>                 printf("[[[\n");
>                 auto repStr = rep->makeReplaceStr();
>                 printf("REPLACED: %s ]]]\n", repStr.c_str());
>                 out += repStr;
>
>                 // Skip ahead until after the whole replacement.
>                 auto repEnd = SM.getSpellingLoc(repS->getLocEnd());
>                 while (repEnd != SM.getSpellingLoc(Tok.getLocation()))
>                 {
>                     TmpPP.Lex(Tok);
>                     assert(!Tok.is(tok::eof) && "End not found");
>                 }
>
>                 // Eat one more since we stopped at the end token and we
> want to continue after it.
>                 TmpPP.Lex(Tok);
>
>                 return true;
>             }
>         }
>         return false;
>     };
>
>     while (Tok.isNot(tok::eof))
>     {
>         printf("TOKEN: %s\n", TmpPP.getSpelling(Tok).c_str());
>
>         auto TokLoc = Tok.getLocation();
>         auto TokExp = SM.getExpansionLoc(TokLoc);
>         if (SM.isBeforeInTranslationUnit(toReplaceExpEnd, TokExp))
>         {
>             // Anything after the Stmt we want to replace is not
> interesting.
>             break;
>         }
>
>         // Skip ahead until we are at the expansion start of the Stmt we
> want to replace.
>         if (!SM.isBeforeInTranslationUnit(TokLoc, toReplaceExpStart))
>         {
>             if (TokLoc.isMacroID())
>             {
>                 // This is the first token of a macro expansion.
>                 auto LLoc = SM.getExpansionRange(TokLoc);
>
>                 // Ignore tokens whose instantiation location was not the
> main file.
>                 if (SM.getFileID(LLoc.first) != FID)
>                 {
>                     TmpPP.Lex(Tok);
>                     continue;
>                 }
>
>                 assert(SM.getFileID(LLoc.second) == FID &&
>                        "Start and end of expansion must be in the same
> ultimate file!");
>
>                 bool stopOutputOnNextToken = false;
>                 bool toReplaceStartsInMacro = toReplaceExpStart == TokExp;
>                 bool toReplaceEndsInMacro = toReplaceExpEnd == TokExp;
>                 bool startedOutput = false;
>
>                 Token PrevPrevTok;
>                 Token PrevTok = Tok;
>
>                 while (!Tok.is(tok::eof) &&
> SM.getExpansionLoc(Tok.getLocation()) == LLoc.first)
>                 {
>                     printf("TOKEN (in macro): %s\n",
> TmpPP.getSpelling(Tok).c_str());
>
>                     auto TokSpell = SM.getSpellingLoc(Tok.getLocation());
>                     if (stopOutputOnNextToken)
>                     {
>                         break;
>                     }
>                     if (toReplaceEndsInMacro && TokSpell ==
> toReplaceSpellEnd)
>                     {
>                         stopOutputOnNextToken = true;
>                     }
>
>                     if (toReplaceStartsInMacro && !startedOutput)
>                     {
>                         if (TokSpell == toReplaceSpellStart)
>                         {
>                             startedOutput = true;
>                         }
>                         else
>                         {
>                             TmpPP.Lex(Tok);
>                             continue;
>                         }
>                     }
>
>                     // If the tokens were already space separated, or if
> they must be to avoid
>                     // them being implicitly pasted, add a space between
> them.
>                     if (Tok.hasLeadingSpace() ||
> ConcatInfo.AvoidConcat(PrevPrevTok, PrevTok, Tok))
>                         out += ' ';
>
>                     if (checkReplacement())
>                     {
>                         continue;
>                     }
>
>                     out += TmpPP.getSpelling(Tok);
>                     TmpPP.Lex(Tok);
>                 }
>                 if (stopOutputOnNextToken)
>                 {
>                     break;
>                 }
>             }
>             else
>             {
>                 if (checkReplacement())
>                 {
>                     continue;
>                 }
>
>                 // Output original code because we are outside of a
> replacement.
>                 out += TmpPP.getSpelling(Tok);
>                 TmpPP.Lex(Tok);
>             }
>         }
>         else
>         {
>             TmpPP.Lex(Tok);
>         }
>     }
>
>     // Restore the preprocessor's old state.
>     TmpPP.setDiagnostics(*OldDiags);
>     TmpPP.setPragmasEnabled(PragmasPreviouslyEnabled);
>
>     for (const auto &tmpm : TmpPP.macros())
>     {
>         auto it = MacroPreviouslyEnabled.find(tmpm.first);
>         if (it != MacroPreviouslyEnabled.end())
>         {
>             auto MD = tmpm.second.getLatest();
>             auto MI = MD->getMacroInfo();
>             if (MI->isEnabled() && !it->second)
>             {
>                 MI->DisableMacro();
>             }
>             else if (!MI->isEnabled() && it->second)
>             {
>                 MI->EnableMacro();
>             }
>         }
>     }
>
>     return out;
> }
>
>
>
> _______________________________________________
> cfe-dev mailing listcfe-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
>
>
> _______________________________________________
> cfe-dev mailing listcfe-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181024/7f55ce3b/attachment-0001.html>


More information about the cfe-dev mailing list