[cfe-dev] Advanced Rewriting
Jonas Toth via cfe-dev
cfe-dev at lists.llvm.org
Tue Jul 17 07:19:07 PDT 2018
Hi Rafael,
I did read into clang-refactor a while ago but unfortunatly could not
follow that up. If I recall correctly its about source-to-source
transformation (as you said) and aims at implementing the primitive
refactorings that exist (e.g. extract-method, extract-variable, ....).
Rewriting itself should happen with the normal tooling framework.
(https://clang.llvm.org/docs/RefactoringEngine.html)
Maybe the implementers of the existing code can give better comments on
you proposal (and might have considered a similar solution to yours
already).
+Alex Lorenz
All the best, Jonas
Am 17.07.2018 um 14:46 schrieb Rafael·Stahl:
>
> Hi Jonas
>
> Thanks for introducing me to this, I have seen the "Replacement"
> before, but not clang-refactor.
>
> However it seems to only provide management facilities around rewrite
> operations and not aid with the rewriting itself. Am I missing
> something here?
>
> The two core problems for me:
>
> - nesting replacements: When implementing replacements with
> clang-refactor, I still have to provide replacements that are closed
> in themselves. I cannot make them depend on others, right?
> - macros: clang-refactor only seems to work with spelling locations.
>
> Maybe an even simpler example: Replace all additions with "add(lhs,
> rhs)". This in itself is very difficult with clang as soon as the
> Stmts are nested or macros are involved.
>
> Best regards
> Rafael
>
>
> On 16.07.2018 19:06, Jonas Toth via cfe-dev wrote:
>>
>> Hi Rafael,
>>
>> wouldn't your usecase be a task for clang-refactor?
>>
>> Best, Jonas
>>
>>
>> Am 16.07.2018 um 17:08 schrieb Rafael·Stahl via cfe-dev:
>>> Hey everyone
>>>
>>> The rewriting API of Clang operates on the source code in textual
>>> form. The user can use AST nodes to figure out what to replace, but
>>> in the end he has to remove and insert snippets in a linear piece of
>>> text.
>>>
>>> This is very inconvenient when it is required to restructure and
>>> nest replacements. The involvement of macros makes a manual process
>>> even more difficult. See some recent threads expressing difficulty
>>> with the API [1][2].
>>>
>>> What do I mean by "nested replacements"? For example in the following:
>>>
>>> int i = x + s->a;
>>>
>>> I would want to replace the BinaryOperator with a function call and
>>> the MemberExpr with some constant:
>>>
>>> int i = Addition(x, 7);
>>>
>>> When keeping the two replacement rules independent of each other,
>>> achieving this with the current API is extremely difficult. More so
>>> when macros are involved.
>>>
>>> I am proposing some kind of helper that aims to solve these issues
>>> by providing an interface that offers to directly replace AST nodes
>>> and a mechanism to nest AST node replacements - without having to
>>> worry about macros.
>>>
>>> Potential usage:
>>>
>>> - Develop a class that derives from StmtToRewrite to define how
>>> replacements should happen:
>>>
>>> class RewriteAdds : public cu::StmtToRewrite
>>> {
>>> public:
>>> std::string makeReplaceStr() const override
>>> {
>>> auto binOp = dyn_cast<BinaryOperator>(replaceS);
>>> return "Addition(" +
>>> getMgr()->getReplaced(binOp->getLHS()).strToInsert + ", " +
>>> getMgr()->getReplaced(binOp->getRHS()).strToInsert + ")";
>>> }
>>> };
>>>
>>> class RewriteMembs : public cu::StmtToRewrite
>>> {
>>> public:
>>> std::string makeReplaceStr() const override
>>> {
>>> return "7";
>>> }
>>> };
>>>
>>> - Construct a RewriteManager:
>>>
>>> cu::RewriteManager mgr(ACtx, PP);
>>>
>>> - Add rewriting operations to the manager:
>>>
>>> // std::vector<const Stmt *> AddStmts = /* matched from
>>> binaryOperator() with plus */
>>> // std::vector<const Stmt *> MembStmts = /* matched from
>>> memberExpr() */
>>> for (const auto &S : AddStmts) mgr.registerStmt<RewriteAdds>(S);
>>> for (const auto &S : MembStmts) mgr.registerStmt<RewriteMembs>(S);
>>>
>>> - Retrieve and apply the results:
>>>
>>> clang::Rewriter rewriter(SM, LangOpts);
>>> for (const auto &r : mgr.getReplacements()) {
>>> rewriter.RemoveText(r.rangeToRemove);
>>> rewriter.InsertText(r.rangeToRemove.getBegin(), r.strToInsert);
>>> }
>>>
>>>
>>> At the end of this mail is my low quality code that kind-of
>>> implements this. TLDR:
>>>
>>> - Build a hierarchy of stmts to replace and keep track of which
>>> replacements must be combined
>>> - Move further up in the AST if these replacements are inside a macro
>>> - Recursively lex the file and look for replacements outside-in by
>>> spelling locations. Expand any macros that are encountered during
>>> this. The re-lexing idea is based on the hint in [3].
>>>
>>> The code has the following shortcomings:
>>>
>>> - I do not know how to distinguish macro argument expansions within
>>> macros. For example in "#define FOO(a) a + a" the two "a"s expand to
>>> different AST nodes that could be replaced with different rules.
>>> This is an important issue, because it can lead to completely broken
>>> code with nesting.
>>> - Limited to Stmts, when Decls should be supported too.
>>> - Very un-optimized with lexing the entire source file many times.
>>> Easy to solve, but didn't want to raise the complexity further for now.
>>> - Could keep written code more clean by only expanding macros if
>>> required. For example not required if just a macro arg is replaced
>>> and all expansions would be the same.
>>>
>>>
>>> I am very interested in your general thoughts. I'm not very
>>> experienced with clang, but this was my vision how I would want to
>>> do replacements. Are you interested in getting this into clang? I
>>> would need help with ironing out the remaining issues.
>>>
>>> -Rafael
>>>
>>>
>>> [1] http://lists.llvm.org/pipermail/cfe-dev/2018-July/058430.html
>>> [2] http://lists.llvm.org/pipermail/cfe-dev/2018-June/058213.html
>>> [3] http://lists.llvm.org/pipermail/cfe-dev/2017-August/055079.html
>>>
>>>
>>>
>>> ----------------------------------------
>>>
>>> RewriteManager.h
>>>
>>> ----------------------------------------
>>>
>>> #ifndef CLANGUTIL_REWRITEMANAGER_H
>>> #define CLANGUTIL_REWRITEMANAGER_H
>>>
>>> #include "ClangUtil/SourceRangeLess.h"
>>> #include "make_unique.h"
>>> #include "clang/AST/AST.h"
>>> #include <vector>
>>> #include <map>
>>>
>>>
>>> // TODO extend to decls
>>>
>>>
>>> namespace cu
>>> {
>>> // Represents a statement in the original AST that should be
>>> rewritten. To implement recursive replacements, call
>>> // getMgr()->getReplaced() on any AST node within the makeReplaceStr
>>> callback.
>>> class StmtToRewrite
>>> {
>>> friend class RewriteManager;
>>>
>>> public:
>>> // Returns the enclosing RewriteManager.
>>> class RewriteManager *getMgr() const;
>>> // Override this to build a replacement string. Implement
>>> recursive replacements with RewriteManager::getReplaced.
>>> virtual std::string makeReplaceStr() const = 0;
>>>
>>> // The statement to replace.
>>> const clang::Stmt *replaceS = nullptr;
>>>
>>> private:
>>> RewriteManager *m_mgr;
>>> };
>>>
>>> struct RewriteOperation
>>> {
>>> clang::SourceRange rangeToRemove;
>>> std::string strToInsert;
>>> };
>>>
>>> // A class for managing replacements of AST nodes. It allows to
>>> specifically target AST nodes instead of raw source
>>> // locations to enable easy replacements involving macros and nested
>>> replacements.
>>> // For extended documentation see: doc/rewriting.md
>>> class RewriteManager
>>> {
>>> public:
>>> RewriteManager(clang::ASTContext &ACtx, clang::Preprocessor &PP);
>>>
>>> clang::ASTContext &getACtx() const { return ACtx; }
>>>
>>> // Registers a StmtToRewrite for use with getReplacements. Call
>>> this on all
>>> // statements that should be rewritten before calling any
>>> rewriting functions.
>>> void registerStmt(std::unique_ptr<StmtToRewrite> S);
>>>
>>> // Helper for constructing the custom type from a Stmt.
>>> template <typename T, typename... Args>
>>> void registerStmt(const clang::Stmt *S, Args... args)
>>> {
>>> auto p = std::make_unique<T>(std::forward<Args>(args)...);
>>> p->replaceS = S;
>>> registerStmt(std::move(p));
>>> }
>>>
>>> // Get the full replacement of an AST node. Note that this
>>> function removes any replaced statements from the work
>>> // list, so calling it twice will only replace the first time.
>>> RewriteOperation getReplaced(const clang::Stmt *S);
>>> // Get all replacements. These may be fewer than the requested
>>> ones because of nesting.
>>> std::vector<RewriteOperation> getReplacements();
>>>
>>> private:
>>> std::string getExpandedCode(const clang::Stmt *toReplaceS);
>>>
>>> private:
>>> clang::ASTContext &ACtx;
>>> const clang::LangOptions &LangOpts;
>>> clang::SourceManager &SM;
>>> clang::Preprocessor &PP;
>>>
>>> // Manages the pending replacements.
>>> class WorkList
>>> {
>>> public:
>>> typedef std::map<clang::SourceRange, std::vector<const
>>> StmtToRewrite *>> RangeToRepMap;
>>>
>>> WorkList(clang::ASTContext &ACtx, clang::SourceManager &SM);
>>>
>>> bool isStmtPending(const clang::Stmt *S) const;
>>> void addStmt(std::unique_ptr<StmtToRewrite> S);
>>> const RangeToRepMap &getRangeToReplacementsMap() const;
>>> std::vector<const StmtToRewrite *> getSortedReplacements()
>>> const;
>>> void markDone(const StmtToRewrite *S);
>>> void cleanup();
>>>
>>> private:
>>> clang::ASTContext &ACtx;
>>> clang::SourceManager &SM;
>>> std::vector<std::unique_ptr<StmtToRewrite>> m_pending;
>>> std::vector<std::unique_ptr<StmtToRewrite>> m_done;
>>> RangeToRepMap m_rangeToReplacements;
>>> };
>>>
>>> WorkList m_workList;
>>> };
>>>
>>> } // namespace cu
>>>
>>> #endif
>>>
>>>
>>>
>>> ----------------------------------------
>>>
>>> RewriteManager.cpp
>>>
>>> ----------------------------------------
>>>
>>> #include "ClangUtil/RewriteManager.h"
>>> #include "ClangUtil/ASTUtil.h"
>>> #include "clang/Lex/Lexer.h"
>>> #include "clang/Lex/Preprocessor.h"
>>> #include "clang/Lex/PreprocessorOptions.h"
>>> #include "clang/Lex/TokenConcatenation.h"
>>> #include "clang/Lex/MacroArgs.h"
>>>
>>>
>>> using namespace cu;
>>>
>>>
>>> // Returns a Stmt that is the first parent of startS whose expansion
>>> range is within the given range.
>>> static const clang::Stmt *GetFullMacroStmt(clang::SourceRange range,
>>> const clang::Stmt *startS, clang::ASTContext &ACtx)
>>> {
>>> auto &SM = ACtx.getSourceManager();
>>>
>>> // Walk the tree upwards until ST does no longer expand to
>>> within range.
>>> const clang::Stmt *ST = startS;
>>> while (true)
>>> {
>>> const auto &parents = ACtx.getParents(*ST);
>>> if (parents.empty())
>>> {
>>> break;
>>> }
>>> auto childS = ST;
>>> ST = parents[0].get<clang::Stmt>();
>>> if (!ST)
>>> {
>>> if (auto D = parents[0].get<clang::Decl>())
>>> {
>>> const auto &parentsD = ACtx.getParents(*D);
>>> if (parentsD.empty())
>>> {
>>> break;
>>> }
>>> ST = parentsD[0].get<clang::Stmt>();
>>> if (!ST)
>>> {
>>> break;
>>> }
>>> }
>>> else
>>> {
>>> break;
>>> }
>>> }
>>>
>>> auto exLocS = SM.getExpansionLoc(ST->getLocStart());
>>> auto exLocE = SM.getExpansionLoc(ST->getLocEnd());
>>> if (SM.isBeforeInTranslationUnit(exLocS, range.getBegin()) ||
>>> SM.isBeforeInTranslationUnit(range.getEnd(), exLocE))
>>> {
>>> return childS;
>>> }
>>> }
>>>
>>> return nullptr;
>>> }
>>>
>>>
>>> RewriteManager *StmtToRewrite::getMgr() const
>>> {
>>> return m_mgr;
>>> }
>>>
>>>
>>> RewriteManager::WorkList::WorkList(clang::ASTContext &ACtx,
>>> clang::SourceManager &SM) : ACtx(ACtx), SM(SM) {}
>>> bool RewriteManager::WorkList::isStmtPending(const clang::Stmt *S)
>>> const
>>> {
>>> for (const auto &r : m_pending)
>>> {
>>> if (r->replaceS == S)
>>> {
>>> return true;
>>> }
>>> }
>>> return false;
>>> }
>>> void
>>> RewriteManager::WorkList::addStmt(std::unique_ptr<StmtToRewrite> S)
>>> {
>>> // Use the expansion range for maximal replacement flexibility
>>> in macros.
>>> auto replaceRange =
>>> SM.getExpansionRange(S->replaceS->getSourceRange());
>>>
>>> // TODO not quite correct.
>>> /*auto sortRanges = [&](std::vector<const StmtToRewrite *> &vec) {
>>> std::sort(vec.begin(), vec.end(), [&](const StmtToRewrite
>>> *lhs, const StmtToRewrite *rhs) {
>>> auto lhsRange =
>>> SM.getExpansionRange(lhs->replaceS->getSourceRange());
>>> auto rhsRange =
>>> SM.getExpansionRange(rhs->replaceS->getSourceRange());
>>> return IsContained(rhsRange, lhsRange, SM);
>>> });
>>> };*/
>>>
>>> // Establish hierarchical relation between all ranges.
>>> bool found = false;
>>> // First, check if this range is within one we already have.
>>> for (auto &r : m_rangeToReplacements)
>>> {
>>> if (IsContained(replaceRange, r.first, SM))
>>> {
>>> // Insert in a sorted order.
>>> for (auto it = r.second.begin(); it != r.second.end();
>>> ++it)
>>> {
>>> //auto testRange =
>>> SM.getExpansionRange((*it)->replaceS->getSourceRange());
>>> // if (IsContained(testRange, replaceRange, SM))
>>> if (IsParent(S->replaceS, (*it)->replaceS, ACtx))
>>> {
>>> r.second.insert(it, S.get());
>>> found = true;
>>> break;
>>> }
>>> }
>>> if (!found)
>>> {
>>> r.second.push_back(S.get());
>>> found = true;
>>> }
>>> break;
>>> }
>>> }
>>> // Not within existing range, add as new top-level range.
>>> if (!found)
>>> {
>>> // Check if any existing ranges are contained within the new
>>> one.
>>> std::vector<const StmtToRewrite *> moveThese;
>>> auto it = m_rangeToReplacements.begin();
>>> while (it != m_rangeToReplacements.end())
>>> {
>>> if (IsContained(it->first, replaceRange, SM))
>>> {
>>> moveThese.insert(moveThese.end(),
>>> it->second.begin(), it->second.end());
>>> it = m_rangeToReplacements.erase(it);
>>> }
>>> else
>>> {
>>> ++it;
>>> }
>>> }
>>> auto &accesses = m_rangeToReplacements[replaceRange];
>>> // The order is important here. We want the first element to
>>> be the one that spans the full range.
>>> accesses.push_back(S.get());
>>> // TODO sort "moveThese".
>>> accesses.insert(accesses.end(), moveThese.begin(),
>>> moveThese.end());
>>> }
>>>
>>> int count = 0;
>>> for (const auto &r : m_rangeToReplacements)
>>> {
>>> printf("range %i\n", count++);
>>> for (const auto &a : r.second)
>>> {
>>> printf("replacement:\n");
>>> a->replaceS->dump();
>>> }
>>> }
>>>
>>> m_pending.push_back(std::move(S));
>>> }
>>> const RewriteManager::WorkList::RangeToRepMap
>>> &RewriteManager::WorkList::getRangeToReplacementsMap() const
>>> {
>>> return m_rangeToReplacements;
>>> }
>>> std::vector<const StmtToRewrite *>
>>> RewriteManager::WorkList::getSortedReplacements() const
>>> {
>>> std::vector<const StmtToRewrite *> result;
>>> for (auto &r : m_rangeToReplacements)
>>> {
>>> result.insert(result.end(), r.second.begin(), r.second.end());
>>> }
>>> return result;
>>> }
>>> void RewriteManager::WorkList::markDone(const StmtToRewrite *S)
>>> {
>>> // Remove from hierarchy.
>>> for (auto &r : m_rangeToReplacements)
>>> {
>>> r.second.erase(std::remove(r.second.begin(), r.second.end(),
>>> S), r.second.end());
>>> }
>>>
>>> // Move from pending to done list.
>>> auto it = std::find_if(m_pending.begin(), m_pending.end(),
>>> [&](const std::unique_ptr<StmtToRewrite>
>>> &rep) { return rep.get() == S; });
>>> if (it == m_pending.end())
>>> {
>>> throw std::runtime_error("Did not find replacement to mark
>>> as done");
>>> }
>>> m_done.push_back(std::move(*it));
>>> m_pending.erase(it);
>>> }
>>> void RewriteManager::WorkList::cleanup()
>>> {
>>> m_done.clear();
>>> }
>>>
>>>
>>> RewriteManager::RewriteManager(clang::ASTContext &ACtx,
>>> clang::Preprocessor &PP)
>>> : ACtx(ACtx), LangOpts(ACtx.getLangOpts()),
>>> SM(ACtx.getSourceManager()), PP(PP), m_workList(ACtx, SM)
>>> {
>>> }
>>>
>>> void RewriteManager::registerStmt(std::unique_ptr<StmtToRewrite> S)
>>> {
>>> if (!S->replaceS)
>>> {
>>> throw std::runtime_error("Must set replaceS");
>>> }
>>>
>>> if (m_workList.isStmtPending(S->replaceS))
>>> {
>>> throw std::runtime_error("This Stmt will already be replaced");
>>> }
>>>
>>> S->m_mgr = this;
>>> m_workList.addStmt(std::move(S));
>>> }
>>>
>>> RewriteOperation RewriteManager::getReplaced(const clang::Stmt *S)
>>> {
>>> auto range = SM.getExpansionRange(S->getSourceRange());
>>> return { range, getExpandedCode(S) };
>>> }
>>>
>>> std::vector<RewriteOperation> RewriteManager::getReplacements()
>>> {
>>> std::vector<RewriteOperation> results;
>>>
>>> for (auto &rangeAndAccesses :
>>> m_workList.getRangeToReplacementsMap())
>>> {
>>> auto &range = rangeAndAccesses.first;
>>> auto &accesses = rangeAndAccesses.second;
>>>
>>> // Cannot replace something inside a macro because it would
>>> replace all expansions instead of just the selected
>>> // AST node. So in a first step, get an enclosing statement
>>> that is no longer inside a macro.
>>> // TODO we could keep the original code more clean by not
>>> expanding macro args if the whole expansion does not
>>> // contain the macro arg more than once.
>>> auto macroS = GetFullMacroStmt(range, accesses[0]->replaceS,
>>> ACtx);
>>>
>>> results.push_back(getReplaced(macroS));
>>>
>>> // TODO we could run clang-format on the replacements. this
>>> would especially benefit long macro expansions.
>>> }
>>>
>>> m_workList.cleanup();
>>>
>>> return results;
>>> }
>>>
>>> std::string RewriteManager::getExpandedCode(const clang::Stmt
>>> *toReplaceS)
>>> {
>>> // TODO performance optimization. this is parsing way more than
>>> required.
>>>
>>> using namespace clang;
>>>
>>> printf("getExpandedCode:\n");
>>> toReplaceS->dump();
>>>
>>> std::string out;
>>>
>>> auto toReplaceExpStart =
>>> SM.getExpansionLoc(toReplaceS->getLocStart());
>>> auto toReplaceExpEnd = SM.getExpansionLoc(toReplaceS->getLocEnd());
>>> auto toReplaceSpellStart =
>>> SM.getSpellingLoc(toReplaceS->getLocStart());
>>> auto toReplaceSpellEnd =
>>> SM.getSpellingLoc(toReplaceS->getLocEnd());
>>>
>>> auto FID =
>>> SM.getFileID(SM.getExpansionLoc(toReplaceS->getLocStart()));
>>>
>>> // The following is inspired by:
>>> clang/Rewrite/HTMLRewrite.cpp:HighlightMacros
>>>
>>> // Re-lex the raw token stream into a token buffer.
>>> std::vector<Token> TokenStream;
>>>
>>> const llvm::MemoryBuffer *FromFile = SM.getBuffer(FID);
>>> Lexer L(FID, FromFile, SM, PP.getLangOpts());
>>>
>>> // Lex all the tokens in raw mode, to avoid entering #includes
>>> or expanding
>>> // macros.
>>> while (1)
>>> {
>>> Token Tok;
>>> L.LexFromRawLexer(Tok);
>>>
>>> // If this is a # at the start of a line, discard it from
>>> the token stream.
>>> // We don't want the re-preprocess step to see #defines,
>>> #includes or other
>>> // preprocessor directives.
>>> if (Tok.is(tok::hash) && Tok.isAtStartOfLine())
>>> continue;
>>>
>>> // If this is a ## token, change its kind to unknown so that
>>> repreprocessing
>>> // it will not produce an error.
>>> if (Tok.is(tok::hashhash))
>>> Tok.setKind(tok::unknown);
>>>
>>> // If this raw token is an identifier, the raw lexer won't
>>> have looked up
>>> // the corresponding identifier info for it. Do this now so
>>> that it will be
>>> // macro expanded when we re-preprocess it.
>>> if (Tok.is(tok::raw_identifier))
>>> PP.LookUpIdentifierInfo(Tok);
>>>
>>> TokenStream.push_back(Tok);
>>>
>>> for (auto &rep : m_workList.getSortedReplacements())
>>> {
>>> auto repS = rep->replaceS;
>>> auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>>> if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>>> {
>>> //
>>> }
>>> }
>>>
>>> if (Tok.is(tok::eof))
>>> break;
>>> }
>>>
>>> // Temporarily change the diagnostics object so that we ignore
>>> any generated
>>> // diagnostics from this pass.
>>> DiagnosticsEngine
>>> TmpDiags(PP.getDiagnostics().getDiagnosticIDs(),
>>> &PP.getDiagnostics().getDiagnosticOptions(),
>>> new IgnoringDiagConsumer);
>>>
>>> // Copy the preprocessor and all of its state.
>>> auto PPOpts =
>>> std::make_shared<PreprocessorOptions>(PP.getPreprocessorOpts());
>>> LangOptions LO = PP.getLangOpts();
>>> Preprocessor TmpPP(PPOpts, TmpDiags, LO, SM, PP.getPCMCache(),
>>> PP.getHeaderSearchInfo(), PP.getModuleLoader(),
>>> PP.getIdentifierTable().getExternalIdentifierLookup());
>>> TmpPP.Initialize(PP.getTargetInfo(), PP.getAuxTargetInfo());
>>> TmpPP.setExternalSource(PP.getExternalSource());
>>> TmpPP.setPreprocessedOutput(true);
>>>
>>> std::map<const clang::IdentifierInfo *, bool>
>>> MacroPreviouslyEnabled;
>>> for (const auto &m : PP.macros())
>>> {
>>> // printf("PREDEF MACRO: %s\n",
>>> m.first->getName().str().c_str());
>>> TmpPP.getMacroDefinition(m.first);
>>>
>>> for (const auto &tmpm : TmpPP.macros())
>>> {
>>> if (tmpm.first == m.first)
>>> {
>>> auto MD = m.second.getLatest();
>>> auto MI = MD->getMacroInfo();
>>> // If this is a recursive call we might be in a
>>> macro expansion and the macro might be disabled. We need
>>> // to enable it for now so that all expansions work.
>>> Restore it later.
>>> MacroPreviouslyEnabled[tmpm.first] = MI->isEnabled();
>>> if (!MI->isEnabled())
>>> {
>>> MD->getMacroInfo()->EnableMacro();
>>> }
>>>
>>> // This should not change anything since we just
>>> copy data over.
>>> auto &mutableState =
>>> const_cast<std::remove_const<decltype(tmpm.second)>::type
>>> &>(tmpm.second);
>>> mutableState.setLatest(MD);
>>> break;
>>> }
>>> }
>>> }
>>>
>>> class MacroArgCollector : public clang::PPCallbacks
>>> {
>>> public:
>>> MacroArgCollector(Preprocessor &TmpPP) : TmpPP(TmpPP) {}
>>>
>>> void MacroExpands(const Token &Tok, const MacroDefinition
>>> &MD, SourceRange Range, const MacroArgs *Args) override
>>> {
>>> if (!Args)
>>> {
>>> return;
>>> }
>>> printf("GOT MACRO ARGS EXPANSION CALLBACK\n");
>>> for (int i = 0; i < (int)Args->getNumMacroArguments(); i++)
>>> {
>>> auto TokUnex = Args->getUnexpArgument(i);
>>> // Thats just non-const for a cache, so should be fine.
>>> auto TokPreExp = const_cast<MacroArgs
>>> *>(Args)->getPreExpArgument(i, TmpPP);
>>> printf("unexp: %s\n",
>>> TmpPP.getSpelling(*TokUnex).c_str());
>>> for (const auto &T : TokPreExp)
>>> {
>>> printf("preexp: %s\n",
>>> TmpPP.getSpelling(T).c_str());
>>> }
>>> }
>>> }
>>>
>>> Preprocessor &TmpPP;
>>> };
>>> TmpPP.addPPCallbacks(std::make_unique<MacroArgCollector>(TmpPP));
>>> // Instead: collect the macro arg info in the law lexing step
>>> above. or do another pass that uses the PP but without expansions.
>>>
>>> /*printf("DUMP MACRO INFO\n");
>>> for (const auto &m : PP.macros())
>>> PP.dumpMacroInfo(m.first);
>>> printf("---\n");
>>> for (const auto &m : TmpPP.macros())
>>> TmpPP.dumpMacroInfo(m.first);
>>> printf("DUMP MACRO INFO END\n");*/
>>>
>>> DiagnosticsEngine *OldDiags = &TmpPP.getDiagnostics();
>>>
>>> // Inform the preprocessor that we don't want comments.
>>> TmpPP.SetCommentRetentionState(false, false);
>>>
>>> // We don't want pragmas either. Although we filtered out
>>> #pragma, removing
>>> // _Pragma and __pragma is much harder.
>>> bool PragmasPreviouslyEnabled = TmpPP.getPragmasEnabled();
>>> TmpPP.setPragmasEnabled(false);
>>>
>>> // Enter the tokens we just lexed. This will cause them to be
>>> macro expanded
>>> // but won't enter sub-files (because we removed #'s).
>>> TmpPP.EnterTokenStream(TokenStream, false);
>>>
>>> TokenConcatenation ConcatInfo(TmpPP);
>>>
>>> // Lex all the tokens.
>>> Token Tok;
>>> TmpPP.Lex(Tok);
>>>
>>> std::map<SourceLocation, int> slocIdx;
>>>
>>> auto checkReplacement = [&]() {
>>> for (auto &rep : m_workList.getSortedReplacements())
>>> {
>>> // auto rep = r.second.get();
>>> auto repS = rep->replaceS;
>>> auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>>> // TODO we need to check here if the repS spans the full
>>> range (or largest?)
>>> if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>>> {
>>> if (slocIdx[spellLoc] == 7)
>>> {
>>> // replace
>>> }
>>> slocIdx[spellLoc]++;
>>>
>>> // Done replacing that one, but have to keep it
>>> alive until we're done with it.
>>> m_workList.markDone(rep);
>>>
>>> printf("[[[\n");
>>> auto repStr = rep->makeReplaceStr();
>>> printf("REPLACED: %s ]]]\n", repStr.c_str());
>>> out += repStr;
>>>
>>> // Skip ahead until after the whole replacement.
>>> auto repEnd = SM.getSpellingLoc(repS->getLocEnd());
>>> while (repEnd != SM.getSpellingLoc(Tok.getLocation()))
>>> {
>>> TmpPP.Lex(Tok);
>>> assert(!Tok.is(tok::eof) && "End not found");
>>> }
>>>
>>> // Eat one more since we stopped at the end token
>>> and we want to continue after it.
>>> TmpPP.Lex(Tok);
>>>
>>> return true;
>>> }
>>> }
>>> return false;
>>> };
>>>
>>> while (Tok.isNot(tok::eof))
>>> {
>>> printf("TOKEN: %s\n", TmpPP.getSpelling(Tok).c_str());
>>>
>>> auto TokLoc = Tok.getLocation();
>>> auto TokExp = SM.getExpansionLoc(TokLoc);
>>> if (SM.isBeforeInTranslationUnit(toReplaceExpEnd, TokExp))
>>> {
>>> // Anything after the Stmt we want to replace is not
>>> interesting.
>>> break;
>>> }
>>>
>>> // Skip ahead until we are at the expansion start of the
>>> Stmt we want to replace.
>>> if (!SM.isBeforeInTranslationUnit(TokLoc, toReplaceExpStart))
>>> {
>>> if (TokLoc.isMacroID())
>>> {
>>> // This is the first token of a macro expansion.
>>> auto LLoc = SM.getExpansionRange(TokLoc);
>>>
>>> // Ignore tokens whose instantiation location was
>>> not the main file.
>>> if (SM.getFileID(LLoc.first) != FID)
>>> {
>>> TmpPP.Lex(Tok);
>>> continue;
>>> }
>>>
>>> assert(SM.getFileID(LLoc.second) == FID &&
>>> "Start and end of expansion must be in the
>>> same ultimate file!");
>>>
>>> bool stopOutputOnNextToken = false;
>>> bool toReplaceStartsInMacro = toReplaceExpStart ==
>>> TokExp;
>>> bool toReplaceEndsInMacro = toReplaceExpEnd == TokExp;
>>> bool startedOutput = false;
>>>
>>> Token PrevPrevTok;
>>> Token PrevTok = Tok;
>>>
>>> while (!Tok.is(tok::eof) &&
>>> SM.getExpansionLoc(Tok.getLocation()) == LLoc.first)
>>> {
>>> printf("TOKEN (in macro): %s\n",
>>> TmpPP.getSpelling(Tok).c_str());
>>>
>>> auto TokSpell =
>>> SM.getSpellingLoc(Tok.getLocation());
>>> if (stopOutputOnNextToken)
>>> {
>>> break;
>>> }
>>> if (toReplaceEndsInMacro && TokSpell ==
>>> toReplaceSpellEnd)
>>> {
>>> stopOutputOnNextToken = true;
>>> }
>>>
>>> if (toReplaceStartsInMacro && !startedOutput)
>>> {
>>> if (TokSpell == toReplaceSpellStart)
>>> {
>>> startedOutput = true;
>>> }
>>> else
>>> {
>>> TmpPP.Lex(Tok);
>>> continue;
>>> }
>>> }
>>>
>>> // If the tokens were already space separated,
>>> or if they must be to avoid
>>> // them being implicitly pasted, add a space
>>> between them.
>>> if (Tok.hasLeadingSpace() ||
>>> ConcatInfo.AvoidConcat(PrevPrevTok, PrevTok, Tok))
>>> out += ' ';
>>>
>>> if (checkReplacement())
>>> {
>>> continue;
>>> }
>>>
>>> out += TmpPP.getSpelling(Tok);
>>> TmpPP.Lex(Tok);
>>> }
>>> if (stopOutputOnNextToken)
>>> {
>>> break;
>>> }
>>> }
>>> else
>>> {
>>> if (checkReplacement())
>>> {
>>> continue;
>>> }
>>>
>>> // Output original code because we are outside of a
>>> replacement.
>>> out += TmpPP.getSpelling(Tok);
>>> TmpPP.Lex(Tok);
>>> }
>>> }
>>> else
>>> {
>>> TmpPP.Lex(Tok);
>>> }
>>> }
>>>
>>> // Restore the preprocessor's old state.
>>> TmpPP.setDiagnostics(*OldDiags);
>>> TmpPP.setPragmasEnabled(PragmasPreviouslyEnabled);
>>>
>>> for (const auto &tmpm : TmpPP.macros())
>>> {
>>> auto it = MacroPreviouslyEnabled.find(tmpm.first);
>>> if (it != MacroPreviouslyEnabled.end())
>>> {
>>> auto MD = tmpm.second.getLatest();
>>> auto MI = MD->getMacroInfo();
>>> if (MI->isEnabled() && !it->second)
>>> {
>>> MI->DisableMacro();
>>> }
>>> else if (!MI->isEnabled() && it->second)
>>> {
>>> MI->EnableMacro();
>>> }
>>> }
>>> }
>>>
>>> return out;
>>> }
>>>
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180717/79ea3ea7/attachment.html>
More information about the cfe-dev
mailing list