[cfe-dev] Advanced Rewriting
Rafael·Stahl via cfe-dev
cfe-dev at lists.llvm.org
Tue Oct 23 08:32:13 PDT 2018
Hey everyone,
The two major limitations are resolved now:
- The macro argument issue
- Support for replacing Decls using clang::ast_type_traits::DynTypedNode
The macro issue went away once I figured out a slightly unintuitive
detail: the SourceLocations from the original AST and the ones from
the new Preprocessor Lexer did not compare equal even though they refer
to the exact same location.
I believe this is because expansion SrcLocs have a kind of
pointer-identity representation and operator== just compares the raw
representations, so two locations that point to the same place can still
compare unequal. What worked for me was a kind of "deep" comparison:
bool clutil::DeepSrcLocEqual(clang::SourceLocation lhs,
                             clang::SourceLocation rhs,
                             const clang::SourceManager &SM)
{
    if (lhs == rhs)
        return true;
    if (SM.getExpansionLoc(lhs) != SM.getExpansionLoc(rhs))
        return false;
    if (SM.getSpellingLoc(lhs) != SM.getSpellingLoc(rhs))
        return false;
    clang::SourceLocation lhsMacro, rhsMacro;
    if (SM.isMacroArgExpansion(lhs, &lhsMacro))
    {
        if (!SM.isMacroArgExpansion(rhs, &rhsMacro))
            return false;
        if (!DeepSrcLocEqual(lhsMacro, rhsMacro, SM))
            return false;
    }
    return true;
}
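For readers without a Clang build at hand, here is a minimal, self-contained sketch of the effect described above. It assumes a toy location model, not Clang's real SourceManager: Loc, Expansion, ToySourceManager, deepLocEqual and rawDiffersButDeepMatches are hypothetical names, and the offsets 100 and 104 are arbitrary. The only point it illustrates is that two expansion entries created by separate preprocessing passes can have different raw values while resolving to the same spelling and expansion locations.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for clang::SourceLocation (an assumption, not Clang's layout):
// a plain file offset, or, when the top bit is set, an index into a table of
// macro expansion entries.
struct Loc {
    std::uint32_t raw = 0;
    Loc() = default;
    explicit Loc(std::uint32_t r) : raw(r) {}
    bool isMacro() const { return (raw & 0x80000000u) != 0; }
    bool operator==(const Loc &o) const { return raw == o.raw; }
    bool operator!=(const Loc &o) const { return raw != o.raw; }
};

// One expansion entry: where the token text was written, where it was
// expanded, and whether it came from a macro argument.
struct Expansion {
    Loc spelling;
    Loc expansion;
    bool isMacroArg;
};

class ToySourceManager {
public:
    Loc fileLoc(std::uint32_t offset) const { return Loc(offset); }

    // Every call creates a new entry, so the returned raw value is unique
    // even when the entry's contents match an earlier one.
    Loc addExpansion(Loc spelling, Loc expansion, bool isMacroArg) {
        entries.push_back(Expansion{spelling, expansion, isMacroArg});
        return Loc(0x80000000u | static_cast<std::uint32_t>(entries.size() - 1));
    }
    Loc getSpellingLoc(Loc l) const {
        while (l.isMacro()) l = entry(l).spelling;
        return l;
    }
    Loc getExpansionLoc(Loc l) const {
        while (l.isMacro()) l = entry(l).expansion;
        return l;
    }
    bool isMacroArgExpansion(Loc l, Loc *start) const {
        if (!l.isMacro() || !entry(l).isMacroArg) return false;
        if (start) *start = entry(l).expansion;
        return true;
    }

private:
    const Expansion &entry(Loc l) const {
        std::uint32_t idx = l.raw & 0x7fffffffu;
        assert(idx < entries.size() && "unknown expansion entry");
        return entries[idx];
    }
    std::vector<Expansion> entries;
};

// Same structure as DeepSrcLocEqual above, written against the toy model.
bool deepLocEqual(Loc lhs, Loc rhs, const ToySourceManager &sm) {
    if (lhs == rhs) return true;
    if (sm.getExpansionLoc(lhs) != sm.getExpansionLoc(rhs)) return false;
    if (sm.getSpellingLoc(lhs) != sm.getSpellingLoc(rhs)) return false;
    Loc lhsMacro, rhsMacro;
    if (sm.isMacroArgExpansion(lhs, &lhsMacro)) {
        if (!sm.isMacroArgExpansion(rhs, &rhsMacro)) return false;
        return deepLocEqual(lhsMacro, rhsMacro, sm);
    }
    return true;
}

// Scenario: two preprocessing passes each record the same macro argument
// expansion, so the two entries differ only in their table index.
bool rawDiffersButDeepMatches() {
    ToySourceManager sm;
    Loc call = sm.fileLoc(100);  // where the macro is invoked
    Loc arg = sm.fileLoc(104);   // where the argument text is written
    Loc fromAst = sm.addExpansion(arg, call, true);
    Loc fromRelex = sm.addExpansion(arg, call, true);
    return fromAst != fromRelex && deepLocEqual(fromAst, fromRelex, sm);
}
```

In this model the raw comparison fails only because each pass appends its own table entry, which is one plausible reading of the behaviour described above, not a statement about Clang internals.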
Attached is the cleaned-up rewriter that I ended up with. If there is
still any interest, I'd be happy to contribute this back to Clang's
libraries, but I would need some feedback on how this fits into the
existing facilities.
Best regards
Rafael
On 17.07.18 16:19, Jonas Toth wrote:
>
> Hi Rafael,
>
> I did read into clang-refactor a while ago but unfortunately could not
> follow up on that. If I recall correctly, it's about source-to-source
> transformation (as you said) and aims at implementing the primitive
> refactorings that exist (e.g. extract-method, extract-variable, ...).
>
> Rewriting itself should happen with the normal tooling framework.
>
> (https://clang.llvm.org/docs/RefactoringEngine.html)
>
> Maybe the implementers of the existing code can give better comments
> on your proposal (and might have considered a similar solution to yours
> already).
>
> +Alex Lorenz
>
> All the best, Jonas
>
>
> On 17.07.2018 at 14:46, Rafael·Stahl wrote:
>>
>> Hi Jonas
>>
>> Thanks for introducing me to this, I have seen the "Replacement"
>> before, but not clang-refactor.
>>
>> However it seems to only provide management facilities around rewrite
>> operations and not aid with the rewriting itself. Am I missing
>> something here?
>>
>> The two core problems for me:
>>
>> - nesting replacements: When implementing replacements with
>> clang-refactor, I still have to provide replacements that are closed
>> in themselves. I cannot make them depend on others, right?
>> - macros: clang-refactor only seems to work with spelling locations.
>>
>> Maybe an even simpler example: Replace all additions with "add(lhs,
>> rhs)". This in itself is very difficult with clang as soon as the
>> Stmts are nested or macros are involved.
>>
>> Best regards
>> Rafael
>>
>>
>> On 16.07.2018 19:06, Jonas Toth via cfe-dev wrote:
>>>
>>> Hi Rafael,
>>>
>>> wouldn't your use case be a task for clang-refactor?
>>>
>>> Best, Jonas
>>>
>>>
>>> On 16.07.2018 at 17:08, Rafael·Stahl via cfe-dev wrote:
>>>> Hey everyone
>>>>
>>>> The rewriting API of Clang operates on the source code in textual
>>>> form. The user can use AST nodes to figure out what to replace, but
>>>> in the end they have to remove and insert snippets in a linear piece
>>>> of text.
>>>>
>>>> This is very inconvenient when it is required to restructure and
>>>> nest replacements. The involvement of macros makes a manual process
>>>> even more difficult. See some recent threads expressing difficulty
>>>> with the API [1][2].
>>>>
>>>> What do I mean by "nested replacements"? For example in the following:
>>>>
>>>> int i = x + s->a;
>>>>
>>>> I would want to replace the BinaryOperator with a function call and
>>>> the MemberExpr with some constant:
>>>>
>>>> int i = Addition(x, 7);
>>>>
>>>> When keeping the two replacement rules independent of each other,
>>>> achieving this with the current API is extremely difficult. More so
>>>> when macros are involved.
>>>>
>>>> I am proposing some kind of helper that aims to solve these issues
>>>> by providing an interface that offers to directly replace AST nodes
>>>> and a mechanism to nest AST node replacements - without having to
>>>> worry about macros.
>>>>
>>>> Potential usage:
>>>>
>>>> - Develop a class that derives from StmtToRewrite to define how
>>>> replacements should happen:
>>>>
>>>> class RewriteAdds : public cu::StmtToRewrite
>>>> {
>>>> public:
>>>>     std::string makeReplaceStr() const override
>>>>     {
>>>>         auto binOp = dyn_cast<BinaryOperator>(replaceS);
>>>>         return "Addition(" +
>>>>                getMgr()->getReplaced(binOp->getLHS()).strToInsert + ", " +
>>>>                getMgr()->getReplaced(binOp->getRHS()).strToInsert + ")";
>>>>     }
>>>> };
>>>>
>>>> class RewriteMembs : public cu::StmtToRewrite
>>>> {
>>>> public:
>>>>     std::string makeReplaceStr() const override
>>>>     {
>>>>         return "7";
>>>>     }
>>>> };
>>>>
>>>> - Construct a RewriteManager:
>>>>
>>>> cu::RewriteManager mgr(ACtx, PP);
>>>>
>>>> - Add rewriting operations to the manager:
>>>>
>>>> // std::vector<const Stmt *> AddStmts = /* matched from binaryOperator() with plus */
>>>> // std::vector<const Stmt *> MembStmts = /* matched from memberExpr() */
>>>> for (const auto &S : AddStmts) mgr.registerStmt<RewriteAdds>(S);
>>>> for (const auto &S : MembStmts) mgr.registerStmt<RewriteMembs>(S);
>>>>
>>>> - Retrieve and apply the results:
>>>>
>>>> clang::Rewriter rewriter(SM, LangOpts);
>>>> for (const auto &r : mgr.getReplacements()) {
>>>>     rewriter.RemoveText(r.rangeToRemove);
>>>>     rewriter.InsertText(r.rangeToRemove.getBegin(), r.strToInsert);
>>>> }
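The usage steps above depend on the classes from the attached code. As a rough, self-contained illustration of the same flow, the sketch below uses a toy AST and no Clang at all. Node, makeNode, ToyRewriteManager, registerNode and rewriteExample are hypothetical names; only the idea of a getReplaced call that recurses into children is taken from the mail. Each rule is registered independently, and the rule for the addition obtains the already-rewritten text of its operands by asking the manager.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy AST, standing in for clang::Stmt.
struct Node {
    enum Kind { Name, Member, Add };
    Kind kind = Name;
    std::string text;                         // identifier or field name
    std::vector<std::unique_ptr<Node>> kids;  // operands, left to right
};

std::unique_ptr<Node> makeNode(Node::Kind kind, std::string text) {
    auto n = std::make_unique<Node>();
    n->kind = kind;
    n->text = std::move(text);
    return n;
}

class ToyRewriteManager {
public:
    using Rule = std::function<std::string(const Node &, ToyRewriteManager &)>;

    void registerNode(const Node *n, Rule rule) { rules[n] = std::move(rule); }

    // Returns the registered replacement for n if there is one. Otherwise
    // prints n as written, but still recurses so nested rules apply.
    std::string getReplaced(const Node &n) {
        auto it = rules.find(&n);
        if (it != rules.end()) return it->second(n, *this);
        switch (n.kind) {
        case Node::Name:
            return n.text;
        case Node::Member:
            assert(n.kids.size() == 1);
            return getReplaced(*n.kids[0]) + "->" + n.text;
        case Node::Add:
            assert(n.kids.size() == 2);
            return getReplaced(*n.kids[0]) + " + " + getReplaced(*n.kids[1]);
        }
        return std::string();
    }

private:
    std::map<const Node *, Rule> rules;
};

// Builds the tree for "x + s->a" and applies the two independent rules.
std::string rewriteExample(bool replaceMembers) {
    auto member = makeNode(Node::Member, "a");
    member->kids.push_back(makeNode(Node::Name, "s"));
    auto sum = makeNode(Node::Add, "+");
    sum->kids.push_back(makeNode(Node::Name, "x"));
    sum->kids.push_back(std::move(member));

    ToyRewriteManager mgr;
    // Rule 1: additions become a call; operands come from the manager.
    mgr.registerNode(sum.get(), [](const Node &n, ToyRewriteManager &m) {
        return "Addition(" + m.getReplaced(*n.kids[0]) + ", " +
               m.getReplaced(*n.kids[1]) + ")";
    });
    // Rule 2: member expressions become a constant.
    if (replaceMembers) {
        mgr.registerNode(sum->kids[1].get(),
                         [](const Node &, ToyRewriteManager &) {
                             return std::string("7");
                         });
    }
    return mgr.getReplaced(*sum);
}
```

With both rules registered, rewriteExample(true) yields "Addition(x, 7)", and with only the addition rule it yields "Addition(x, s->a)". Neither rule knows about the other, which is the property the proposal is after. Macros, which are the hard part, are not modelled here.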
>>>>
>>>>
>>>> At the end of this mail is my low quality code that kind-of
>>>> implements this. TLDR:
>>>>
>>>> - Build a hierarchy of stmts to replace and keep track of which
>>>> replacements must be combined
>>>> - Move further up in the AST if these replacements are inside a macro
>>>> - Recursively lex the file and look for replacements outside-in by
>>>> spelling locations. Expand any macros that are encountered during
>>>> this. The re-lexing idea is based on the hint in [3].
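The first step in the list above, building a hierarchy of ranges so that nested replacements are combined under one top-level range, can be sketched without Clang as follows. This is a simplified illustration under stated assumptions: Range, contains and groupByTopLevel are hypothetical names, ranges are plain integer offsets, and the ranges are assumed to either nest or be disjoint. It does not reproduce the incremental insertion logic of the attached WorkList.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical source range as inclusive character offsets.
struct Range {
    int begin;
    int end;
};

bool contains(const Range &outer, const Range &inner) {
    return outer.begin <= inner.begin && inner.end <= outer.end;
}

// Groups ranges under their top-level range. In each group the first
// element spans the whole group, and every range appears after all ranges
// that enclose it, so replacements can be processed outside-in.
std::vector<std::vector<Range>> groupByTopLevel(std::vector<Range> ranges) {
    for (const Range &r : ranges) assert(r.begin <= r.end);

    // Sort by start offset; for equal starts, put the longer range first so
    // that an enclosing range always precedes the ranges inside it.
    std::sort(ranges.begin(), ranges.end(),
              [](const Range &a, const Range &b) {
                  if (a.begin != b.begin) return a.begin < b.begin;
                  return a.end > b.end;
              });

    std::vector<std::vector<Range>> groups;
    for (const Range &r : ranges) {
        if (!groups.empty() && contains(groups.back().front(), r))
            groups.back().push_back(r);  // nested in the current top level
        else
            groups.push_back(std::vector<Range>{r});  // new top-level range
    }
    return groups;
}
```

One result per group then corresponds to the note in the header that getReplacements may return fewer replacements than were requested.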
>>>>
>>>> The code has the following shortcomings:
>>>>
>>>> - I do not know how to distinguish macro argument expansions within
>>>> macros. For example in "#define FOO(a) a + a" the two "a"s expand
>>>> to different AST nodes that could be replaced with different rules.
>>>> This is an important issue, because it can lead to completely
>>>> broken code with nesting.
>>>> - Limited to Stmts, when Decls should be supported too.
>>>> - Very un-optimized with lexing the entire source file many times.
>>>> Easy to solve, but didn't want to raise the complexity further for
>>>> now.
>>>> - Could keep written code more clean by only expanding macros if
>>>> required. For example not required if just a macro arg is replaced
>>>> and all expansions would be the same.
>>>>
>>>>
>>>> I am very interested in your general thoughts. I'm not very
>>>> experienced with clang, but this was my vision how I would want to
>>>> do replacements. Are you interested in getting this into clang? I
>>>> would need help with ironing out the remaining issues.
>>>>
>>>> -Rafael
>>>>
>>>>
>>>> [1] http://lists.llvm.org/pipermail/cfe-dev/2018-July/058430.html
>>>> [2] http://lists.llvm.org/pipermail/cfe-dev/2018-June/058213.html
>>>> [3] http://lists.llvm.org/pipermail/cfe-dev/2017-August/055079.html
>>>>
>>>>
>>>>
>>>> ----------------------------------------
>>>>
>>>> RewriteManager.h
>>>>
>>>> ----------------------------------------
>>>>
>>>> #ifndef CLANGUTIL_REWRITEMANAGER_H
>>>> #define CLANGUTIL_REWRITEMANAGER_H
>>>>
>>>> #include "ClangUtil/SourceRangeLess.h"
>>>> #include "make_unique.h"
>>>> #include "clang/AST/AST.h"
>>>> #include <map>
>>>> #include <memory>
>>>> #include <string>
>>>> #include <utility>
>>>> #include <vector>
>>>>
>>>>
>>>> // TODO extend to decls
>>>>
>>>>
>>>> namespace cu
>>>> {
>>>> // Represents a statement in the original AST that should be rewritten.
>>>> // To implement recursive replacements, call getMgr()->getReplaced() on
>>>> // any AST node within the makeReplaceStr callback.
>>>> class StmtToRewrite
>>>> {
>>>>     friend class RewriteManager;
>>>>
>>>> public:
>>>>     // Instances are owned and destroyed through base-class pointers.
>>>>     virtual ~StmtToRewrite() = default;
>>>>
>>>>     // Returns the enclosing RewriteManager.
>>>>     class RewriteManager *getMgr() const;
>>>>     // Override this to build a replacement string. Implement recursive
>>>>     // replacements with RewriteManager::getReplaced.
>>>>     virtual std::string makeReplaceStr() const = 0;
>>>>
>>>>     // The statement to replace.
>>>>     const clang::Stmt *replaceS = nullptr;
>>>>
>>>> private:
>>>>     RewriteManager *m_mgr = nullptr;
>>>> };
>>>>
>>>> struct RewriteOperation
>>>> {
>>>>     clang::SourceRange rangeToRemove;
>>>>     std::string strToInsert;
>>>> };
>>>>
>>>> // A class for managing replacements of AST nodes. It targets AST nodes
>>>> // instead of raw source locations to enable easy replacements involving
>>>> // macros and nested replacements.
>>>> // For extended documentation see: doc/rewriting.md
>>>> class RewriteManager
>>>> {
>>>> public:
>>>>     RewriteManager(clang::ASTContext &ACtx, clang::Preprocessor &PP);
>>>>
>>>>     clang::ASTContext &getACtx() const { return ACtx; }
>>>>
>>>>     // Registers a StmtToRewrite for use with getReplacements. Call this
>>>>     // on all statements that should be rewritten before calling any
>>>>     // rewriting functions.
>>>>     void registerStmt(std::unique_ptr<StmtToRewrite> S);
>>>>
>>>>     // Helper for constructing the custom type from a Stmt.
>>>>     template <typename T, typename... Args>
>>>>     void registerStmt(const clang::Stmt *S, Args... args)
>>>>     {
>>>>         auto p = std::make_unique<T>(std::forward<Args>(args)...);
>>>>         p->replaceS = S;
>>>>         registerStmt(std::move(p));
>>>>     }
>>>>
>>>>     // Get the full replacement of an AST node. Note that this function
>>>>     // removes any replaced statements from the work list, so calling it
>>>>     // twice will only replace the first time.
>>>>     RewriteOperation getReplaced(const clang::Stmt *S);
>>>>     // Get all replacements. These may be fewer than the requested ones
>>>>     // because of nesting.
>>>>     std::vector<RewriteOperation> getReplacements();
>>>>
>>>> private:
>>>>     std::string getExpandedCode(const clang::Stmt *toReplaceS);
>>>>
>>>> private:
>>>>     clang::ASTContext &ACtx;
>>>>     const clang::LangOptions &LangOpts;
>>>>     clang::SourceManager &SM;
>>>>     clang::Preprocessor &PP;
>>>>
>>>>     // Manages the pending replacements.
>>>>     class WorkList
>>>>     {
>>>>     public:
>>>>         typedef std::map<clang::SourceRange,
>>>>                          std::vector<const StmtToRewrite *>>
>>>>             RangeToRepMap;
>>>>
>>>>         WorkList(clang::ASTContext &ACtx, clang::SourceManager &SM);
>>>>
>>>>         bool isStmtPending(const clang::Stmt *S) const;
>>>>         void addStmt(std::unique_ptr<StmtToRewrite> S);
>>>>         const RangeToRepMap &getRangeToReplacementsMap() const;
>>>>         std::vector<const StmtToRewrite *> getSortedReplacements() const;
>>>>         void markDone(const StmtToRewrite *S);
>>>>         void cleanup();
>>>>
>>>>     private:
>>>>         clang::ASTContext &ACtx;
>>>>         clang::SourceManager &SM;
>>>>         std::vector<std::unique_ptr<StmtToRewrite>> m_pending;
>>>>         std::vector<std::unique_ptr<StmtToRewrite>> m_done;
>>>>         RangeToRepMap m_rangeToReplacements;
>>>>     };
>>>>
>>>>     WorkList m_workList;
>>>> };
>>>>
>>>> } // namespace cu
>>>>
>>>> #endif
>>>>
>>>>
>>>>
>>>> ----------------------------------------
>>>>
>>>> RewriteManager.cpp
>>>>
>>>> ----------------------------------------
>>>>
>>>> #include "ClangUtil/RewriteManager.h"
>>>> #include "ClangUtil/ASTUtil.h"
>>>> #include "clang/Lex/Lexer.h"
>>>> #include "clang/Lex/Preprocessor.h"
>>>> #include "clang/Lex/PreprocessorOptions.h"
>>>> #include "clang/Lex/TokenConcatenation.h"
>>>> #include "clang/Lex/MacroArgs.h"
>>>> #include <algorithm>
>>>> #include <cassert>
>>>> #include <cstdio>
>>>> #include <stdexcept>
>>>> #include <type_traits>
>>>>
>>>>
>>>> using namespace cu;
>>>>
>>>>
>>>> // Returns the outermost ancestor of startS (or startS itself) whose
>>>> // parent's expansion range is no longer within the given range.
>>>> // Returns nullptr if no such parent is found.
>>>> static const clang::Stmt *GetFullMacroStmt(clang::SourceRange
>>>> range, const clang::Stmt *startS, clang::ASTContext &ACtx)
>>>> {
>>>> auto &SM = ACtx.getSourceManager();
>>>>
>>>> // Walk the tree upwards until ST no longer expands to within range.
>>>> const clang::Stmt *ST = startS;
>>>> while (true)
>>>> {
>>>> const auto &parents = ACtx.getParents(*ST);
>>>> if (parents.empty())
>>>> {
>>>> break;
>>>> }
>>>> auto childS = ST;
>>>> ST = parents[0].get<clang::Stmt>();
>>>> if (!ST)
>>>> {
>>>> if (auto D = parents[0].get<clang::Decl>())
>>>> {
>>>> const auto &parentsD = ACtx.getParents(*D);
>>>> if (parentsD.empty())
>>>> {
>>>> break;
>>>> }
>>>> ST = parentsD[0].get<clang::Stmt>();
>>>> if (!ST)
>>>> {
>>>> break;
>>>> }
>>>> }
>>>> else
>>>> {
>>>> break;
>>>> }
>>>> }
>>>>
>>>> auto exLocS = SM.getExpansionLoc(ST->getLocStart());
>>>> auto exLocE = SM.getExpansionLoc(ST->getLocEnd());
>>>> if (SM.isBeforeInTranslationUnit(exLocS, range.getBegin()) ||
>>>> SM.isBeforeInTranslationUnit(range.getEnd(), exLocE))
>>>> {
>>>> return childS;
>>>> }
>>>> }
>>>>
>>>> return nullptr;
>>>> }
>>>>
>>>>
>>>> RewriteManager *StmtToRewrite::getMgr() const
>>>> {
>>>> return m_mgr;
>>>> }
>>>>
>>>>
>>>> RewriteManager::WorkList::WorkList(clang::ASTContext &ACtx,
>>>> clang::SourceManager &SM) : ACtx(ACtx), SM(SM) {}
>>>> bool RewriteManager::WorkList::isStmtPending(const clang::Stmt *S)
>>>> const
>>>> {
>>>> for (const auto &r : m_pending)
>>>> {
>>>> if (r->replaceS == S)
>>>> {
>>>> return true;
>>>> }
>>>> }
>>>> return false;
>>>> }
>>>> void
>>>> RewriteManager::WorkList::addStmt(std::unique_ptr<StmtToRewrite> S)
>>>> {
>>>> // Use the expansion range for maximal replacement flexibility
>>>> // in macros.
>>>> auto replaceRange =
>>>> SM.getExpansionRange(S->replaceS->getSourceRange());
>>>>
>>>> // TODO not quite correct.
>>>> /*auto sortRanges = [&](std::vector<const StmtToRewrite *> &vec) {
>>>> std::sort(vec.begin(), vec.end(), [&](const StmtToRewrite
>>>> *lhs, const StmtToRewrite *rhs) {
>>>> auto lhsRange =
>>>> SM.getExpansionRange(lhs->replaceS->getSourceRange());
>>>> auto rhsRange =
>>>> SM.getExpansionRange(rhs->replaceS->getSourceRange());
>>>> return IsContained(rhsRange, lhsRange, SM);
>>>> });
>>>> };*/
>>>>
>>>> // Establish hierarchical relation between all ranges.
>>>> bool found = false;
>>>> // First, check if this range is within one we already have.
>>>> for (auto &r : m_rangeToReplacements)
>>>> {
>>>> if (IsContained(replaceRange, r.first, SM))
>>>> {
>>>> // Insert in a sorted order.
>>>> for (auto it = r.second.begin(); it != r.second.end();
>>>> ++it)
>>>> {
>>>> // auto testRange =
>>>> //     SM.getExpansionRange((*it)->replaceS->getSourceRange());
>>>> // if (IsContained(testRange, replaceRange, SM))
>>>> if (IsParent(S->replaceS, (*it)->replaceS, ACtx))
>>>> {
>>>> r.second.insert(it, S.get());
>>>> found = true;
>>>> break;
>>>> }
>>>> }
>>>> if (!found)
>>>> {
>>>> r.second.push_back(S.get());
>>>> found = true;
>>>> }
>>>> break;
>>>> }
>>>> }
>>>> // Not within existing range, add as new top-level range.
>>>> if (!found)
>>>> {
>>>> // Check if any existing ranges are contained within the
>>>> // new one.
>>>> std::vector<const StmtToRewrite *> moveThese;
>>>> auto it = m_rangeToReplacements.begin();
>>>> while (it != m_rangeToReplacements.end())
>>>> {
>>>> if (IsContained(it->first, replaceRange, SM))
>>>> {
>>>> moveThese.insert(moveThese.end(),
>>>> it->second.begin(), it->second.end());
>>>> it = m_rangeToReplacements.erase(it);
>>>> }
>>>> else
>>>> {
>>>> ++it;
>>>> }
>>>> }
>>>> auto &accesses = m_rangeToReplacements[replaceRange];
>>>> // The order is important here. We want the first element
>>>> // to be the one that spans the full range.
>>>> accesses.push_back(S.get());
>>>> // TODO sort "moveThese".
>>>> accesses.insert(accesses.end(), moveThese.begin(),
>>>> moveThese.end());
>>>> }
>>>>
>>>> int count = 0;
>>>> for (const auto &r : m_rangeToReplacements)
>>>> {
>>>> printf("range %i\n", count++);
>>>> for (const auto &a : r.second)
>>>> {
>>>> printf("replacement:\n");
>>>> a->replaceS->dump();
>>>> }
>>>> }
>>>>
>>>> m_pending.push_back(std::move(S));
>>>> }
>>>> const RewriteManager::WorkList::RangeToRepMap
>>>> &RewriteManager::WorkList::getRangeToReplacementsMap() const
>>>> {
>>>> return m_rangeToReplacements;
>>>> }
>>>> std::vector<const StmtToRewrite *>
>>>> RewriteManager::WorkList::getSortedReplacements() const
>>>> {
>>>> std::vector<const StmtToRewrite *> result;
>>>> for (auto &r : m_rangeToReplacements)
>>>> {
>>>> result.insert(result.end(), r.second.begin(), r.second.end());
>>>> }
>>>> return result;
>>>> }
>>>> void RewriteManager::WorkList::markDone(const StmtToRewrite *S)
>>>> {
>>>> // Remove from hierarchy.
>>>> for (auto &r : m_rangeToReplacements)
>>>> {
>>>> r.second.erase(std::remove(r.second.begin(),
>>>> r.second.end(), S), r.second.end());
>>>> }
>>>>
>>>> // Move from pending to done list.
>>>> auto it = std::find_if(m_pending.begin(), m_pending.end(),
>>>> [&](const std::unique_ptr<StmtToRewrite>
>>>> &rep) { return rep.get() == S; });
>>>> if (it == m_pending.end())
>>>> {
>>>> throw std::runtime_error(
>>>> "Did not find replacement to mark as done");
>>>> }
>>>> m_done.push_back(std::move(*it));
>>>> m_pending.erase(it);
>>>> }
>>>> void RewriteManager::WorkList::cleanup()
>>>> {
>>>> m_done.clear();
>>>> }
>>>>
>>>>
>>>> RewriteManager::RewriteManager(clang::ASTContext &ACtx,
>>>> clang::Preprocessor &PP)
>>>> : ACtx(ACtx), LangOpts(ACtx.getLangOpts()),
>>>> SM(ACtx.getSourceManager()), PP(PP), m_workList(ACtx, SM)
>>>> {
>>>> }
>>>>
>>>> void RewriteManager::registerStmt(std::unique_ptr<StmtToRewrite> S)
>>>> {
>>>> if (!S->replaceS)
>>>> {
>>>> throw std::runtime_error("Must set replaceS");
>>>> }
>>>>
>>>> if (m_workList.isStmtPending(S->replaceS))
>>>> {
>>>> throw std::runtime_error("This Stmt will already be replaced");
>>>> }
>>>>
>>>> S->m_mgr = this;
>>>> m_workList.addStmt(std::move(S));
>>>> }
>>>>
>>>> RewriteOperation RewriteManager::getReplaced(const clang::Stmt *S)
>>>> {
>>>> auto range = SM.getExpansionRange(S->getSourceRange());
>>>> return { range, getExpandedCode(S) };
>>>> }
>>>>
>>>> std::vector<RewriteOperation> RewriteManager::getReplacements()
>>>> {
>>>> std::vector<RewriteOperation> results;
>>>>
>>>> for (auto &rangeAndAccesses :
>>>> m_workList.getRangeToReplacementsMap())
>>>> {
>>>> auto &range = rangeAndAccesses.first;
>>>> auto &accesses = rangeAndAccesses.second;
>>>>
>>>> // Cannot replace something inside a macro because it would
>>>> // replace all expansions instead of just the selected
>>>> // AST node. So in a first step, get an enclosing statement
>>>> // that is no longer inside a macro.
>>>> // TODO we could keep the original code cleaner by not
>>>> // expanding macro args if the whole expansion does not
>>>> // contain the macro arg more than once.
>>>> auto macroS = GetFullMacroStmt(range,
>>>> accesses[0]->replaceS, ACtx);
>>>>
>>>> results.push_back(getReplaced(macroS));
>>>>
>>>> // TODO we could run clang-format on the replacements. This
>>>> // would especially benefit long macro expansions.
>>>> }
>>>>
>>>> m_workList.cleanup();
>>>>
>>>> return results;
>>>> }
>>>>
>>>> std::string RewriteManager::getExpandedCode(const clang::Stmt
>>>> *toReplaceS)
>>>> {
>>>> // TODO performance optimization. This is parsing way more than
>>>> // required.
>>>>
>>>> using namespace clang;
>>>>
>>>> printf("getExpandedCode:\n");
>>>> toReplaceS->dump();
>>>>
>>>> std::string out;
>>>>
>>>> auto toReplaceExpStart =
>>>> SM.getExpansionLoc(toReplaceS->getLocStart());
>>>> auto toReplaceExpEnd =
>>>> SM.getExpansionLoc(toReplaceS->getLocEnd());
>>>> auto toReplaceSpellStart =
>>>> SM.getSpellingLoc(toReplaceS->getLocStart());
>>>> auto toReplaceSpellEnd =
>>>> SM.getSpellingLoc(toReplaceS->getLocEnd());
>>>>
>>>> auto FID =
>>>> SM.getFileID(SM.getExpansionLoc(toReplaceS->getLocStart()));
>>>>
>>>> // The following is inspired by:
>>>> // clang/Rewrite/HTMLRewrite.cpp:HighlightMacros
>>>>
>>>> // Re-lex the raw token stream into a token buffer.
>>>> std::vector<Token> TokenStream;
>>>>
>>>> const llvm::MemoryBuffer *FromFile = SM.getBuffer(FID);
>>>> Lexer L(FID, FromFile, SM, PP.getLangOpts());
>>>>
>>>> // Lex all the tokens in raw mode, to avoid entering #includes
>>>> // or expanding macros.
>>>> while (1)
>>>> {
>>>> Token Tok;
>>>> L.LexFromRawLexer(Tok);
>>>>
>>>> // If this is a # at the start of a line, discard it from
>>>> // the token stream. We don't want the re-preprocess step to
>>>> // see #defines, #includes or other preprocessor directives.
>>>> if (Tok.is(tok::hash) && Tok.isAtStartOfLine())
>>>> continue;
>>>>
>>>> // If this is a ## token, change its kind to unknown so
>>>> // that repreprocessing it will not produce an error.
>>>> if (Tok.is(tok::hashhash))
>>>> Tok.setKind(tok::unknown);
>>>>
>>>> // If this raw token is an identifier, the raw lexer won't
>>>> // have looked up the corresponding identifier info for it.
>>>> // Do this now so that it will be macro expanded when we
>>>> // re-preprocess it.
>>>> if (Tok.is(tok::raw_identifier))
>>>> PP.LookUpIdentifierInfo(Tok);
>>>>
>>>> TokenStream.push_back(Tok);
>>>>
>>>> for (auto &rep : m_workList.getSortedReplacements())
>>>> {
>>>> auto repS = rep->replaceS;
>>>> auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>>>> if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>>>> {
>>>> //
>>>> }
>>>> }
>>>>
>>>> if (Tok.is(tok::eof))
>>>> break;
>>>> }
>>>>
>>>> // Temporarily change the diagnostics object so that we ignore
>>>> // any generated diagnostics from this pass.
>>>> DiagnosticsEngine
>>>> TmpDiags(PP.getDiagnostics().getDiagnosticIDs(),
>>>> &PP.getDiagnostics().getDiagnosticOptions(),
>>>> new IgnoringDiagConsumer);
>>>>
>>>> // Copy the preprocessor and all of its state.
>>>> auto PPOpts =
>>>> std::make_shared<PreprocessorOptions>(PP.getPreprocessorOpts());
>>>> LangOptions LO = PP.getLangOpts();
>>>> Preprocessor TmpPP(PPOpts, TmpDiags, LO, SM, PP.getPCMCache(),
>>>> PP.getHeaderSearchInfo(), PP.getModuleLoader(),
>>>> PP.getIdentifierTable().getExternalIdentifierLookup());
>>>> TmpPP.Initialize(PP.getTargetInfo(), PP.getAuxTargetInfo());
>>>> TmpPP.setExternalSource(PP.getExternalSource());
>>>> TmpPP.setPreprocessedOutput(true);
>>>>
>>>> std::map<const clang::IdentifierInfo *, bool>
>>>> MacroPreviouslyEnabled;
>>>> for (const auto &m : PP.macros())
>>>> {
>>>> // printf("PREDEF MACRO: %s\n", m.first->getName().str().c_str());
>>>> TmpPP.getMacroDefinition(m.first);
>>>>
>>>> for (const auto &tmpm : TmpPP.macros())
>>>> {
>>>> if (tmpm.first == m.first)
>>>> {
>>>> auto MD = m.second.getLatest();
>>>> auto MI = MD->getMacroInfo();
>>>> // If this is a recursive call we might be in a
>>>> // macro expansion and the macro might be disabled. We need
>>>> // to enable it for now so that all expansions
>>>> // work. Restore it later.
>>>> MacroPreviouslyEnabled[tmpm.first] = MI->isEnabled();
>>>> if (!MI->isEnabled())
>>>> {
>>>> MD->getMacroInfo()->EnableMacro();
>>>> }
>>>>
>>>> // This should not change anything since we just
>>>> // copy data over.
>>>> auto &mutableState =
>>>> const_cast<std::remove_const<decltype(tmpm.second)>::type
>>>> &>(tmpm.second);
>>>> mutableState.setLatest(MD);
>>>> break;
>>>> }
>>>> }
>>>> }
>>>>
>>>> class MacroArgCollector : public clang::PPCallbacks
>>>> {
>>>> public:
>>>> MacroArgCollector(Preprocessor &TmpPP) : TmpPP(TmpPP) {}
>>>>
>>>> void MacroExpands(const Token &Tok, const MacroDefinition
>>>> &MD, SourceRange Range, const MacroArgs *Args) override
>>>> {
>>>> if (!Args)
>>>> {
>>>> return;
>>>> }
>>>> printf("GOT MACRO ARGS EXPANSION CALLBACK\n");
>>>> for (int i = 0; i < (int)Args->getNumMacroArguments();
>>>> i++)
>>>> {
>>>> auto TokUnex = Args->getUnexpArgument(i);
>>>> // That is only non-const because of a cache, so this should be fine.
>>>> auto TokPreExp = const_cast<MacroArgs
>>>> *>(Args)->getPreExpArgument(i, TmpPP);
>>>> printf("unexp: %s\n",
>>>> TmpPP.getSpelling(*TokUnex).c_str());
>>>> for (const auto &T : TokPreExp)
>>>> {
>>>> printf("preexp: %s\n",
>>>> TmpPP.getSpelling(T).c_str());
>>>> }
>>>> }
>>>> }
>>>>
>>>> Preprocessor &TmpPP;
>>>> };
>>>> TmpPP.addPPCallbacks(std::make_unique<MacroArgCollector>(TmpPP));
>>>> // Instead: collect the macro arg info in the raw lexing step above,
>>>> // or do another pass that uses the PP but without expansions.
>>>>
>>>> /*printf("DUMP MACRO INFO\n");
>>>> for (const auto &m : PP.macros())
>>>> PP.dumpMacroInfo(m.first);
>>>> printf("---\n");
>>>> for (const auto &m : TmpPP.macros())
>>>> TmpPP.dumpMacroInfo(m.first);
>>>> printf("DUMP MACRO INFO END\n");*/
>>>>
>>>> DiagnosticsEngine *OldDiags = &TmpPP.getDiagnostics();
>>>>
>>>> // Inform the preprocessor that we don't want comments.
>>>> TmpPP.SetCommentRetentionState(false, false);
>>>>
>>>> // We don't want pragmas either. Although we filtered out
>>>> // #pragma, removing _Pragma and __pragma is much harder.
>>>> bool PragmasPreviouslyEnabled = TmpPP.getPragmasEnabled();
>>>> TmpPP.setPragmasEnabled(false);
>>>>
>>>> // Enter the tokens we just lexed. This will cause them to be
>>>> // macro expanded but won't enter sub-files (because we removed #'s).
>>>> TmpPP.EnterTokenStream(TokenStream, false);
>>>>
>>>> TokenConcatenation ConcatInfo(TmpPP);
>>>>
>>>> // Lex all the tokens.
>>>> Token Tok;
>>>> TmpPP.Lex(Tok);
>>>>
>>>> std::map<SourceLocation, int> slocIdx;
>>>>
>>>> auto checkReplacement = [&]() {
>>>> for (auto &rep : m_workList.getSortedReplacements())
>>>> {
>>>> // auto rep = r.second.get();
>>>> auto repS = rep->replaceS;
>>>> auto spellLoc = SM.getSpellingLoc(repS->getLocStart());
>>>> // TODO we need to check here if the repS spans the
>>>> // full range (or largest?)
>>>> if (SM.getSpellingLoc(Tok.getLocation()) == spellLoc)
>>>> {
>>>> if (slocIdx[spellLoc] == 7)
>>>> {
>>>> // replace
>>>> }
>>>> slocIdx[spellLoc]++;
>>>>
>>>> // Done replacing that one, but have to keep it
>>>> // alive until we're done with it.
>>>> m_workList.markDone(rep);
>>>>
>>>> printf("[[[\n");
>>>> auto repStr = rep->makeReplaceStr();
>>>> printf("REPLACED: %s ]]]\n", repStr.c_str());
>>>> out += repStr;
>>>>
>>>> // Skip ahead until after the whole replacement.
>>>> auto repEnd = SM.getSpellingLoc(repS->getLocEnd());
>>>> while (repEnd != SM.getSpellingLoc(Tok.getLocation()))
>>>> {
>>>> TmpPP.Lex(Tok);
>>>> assert(!Tok.is(tok::eof) && "End not found");
>>>> }
>>>>
>>>> // Eat one more since we stopped at the end token
>>>> // and we want to continue after it.
>>>> TmpPP.Lex(Tok);
>>>>
>>>> return true;
>>>> }
>>>> }
>>>> return false;
>>>> };
>>>>
>>>> while (Tok.isNot(tok::eof))
>>>> {
>>>> printf("TOKEN: %s\n", TmpPP.getSpelling(Tok).c_str());
>>>>
>>>> auto TokLoc = Tok.getLocation();
>>>> auto TokExp = SM.getExpansionLoc(TokLoc);
>>>> if (SM.isBeforeInTranslationUnit(toReplaceExpEnd, TokExp))
>>>> {
>>>> // Anything after the Stmt we want to replace is not
>>>> // interesting.
>>>> break;
>>>> }
>>>>
>>>> // Skip ahead until we are at the expansion start of the
>>>> // Stmt we want to replace.
>>>> if (!SM.isBeforeInTranslationUnit(TokLoc, toReplaceExpStart))
>>>> {
>>>> if (TokLoc.isMacroID())
>>>> {
>>>> // This is the first token of a macro expansion.
>>>> auto LLoc = SM.getExpansionRange(TokLoc);
>>>>
>>>> // Ignore tokens whose instantiation location was
>>>> // not the main file.
>>>> if (SM.getFileID(LLoc.first) != FID)
>>>> {
>>>> TmpPP.Lex(Tok);
>>>> continue;
>>>> }
>>>>
>>>> assert(SM.getFileID(LLoc.second) == FID &&
>>>> "Start and end of expansion must be in the same ultimate file!");
>>>>
>>>> bool stopOutputOnNextToken = false;
>>>> bool toReplaceStartsInMacro = toReplaceExpStart ==
>>>> TokExp;
>>>> bool toReplaceEndsInMacro = toReplaceExpEnd == TokExp;
>>>> bool startedOutput = false;
>>>>
>>>> Token PrevPrevTok;
>>>> Token PrevTok = Tok;
>>>>
>>>> while (!Tok.is(tok::eof) &&
>>>> SM.getExpansionLoc(Tok.getLocation()) == LLoc.first)
>>>> {
>>>> printf("TOKEN (in macro): %s\n",
>>>> TmpPP.getSpelling(Tok).c_str());
>>>>
>>>> auto TokSpell =
>>>> SM.getSpellingLoc(Tok.getLocation());
>>>> if (stopOutputOnNextToken)
>>>> {
>>>> break;
>>>> }
>>>> if (toReplaceEndsInMacro && TokSpell ==
>>>> toReplaceSpellEnd)
>>>> {
>>>> stopOutputOnNextToken = true;
>>>> }
>>>>
>>>> if (toReplaceStartsInMacro && !startedOutput)
>>>> {
>>>> if (TokSpell == toReplaceSpellStart)
>>>> {
>>>> startedOutput = true;
>>>> }
>>>> else
>>>> {
>>>> TmpPP.Lex(Tok);
>>>> continue;
>>>> }
>>>> }
>>>>
>>>> // If the tokens were already space separated,
>>>> // or if they must be to avoid them being
>>>> // implicitly pasted, add a space between them.
>>>> if (Tok.hasLeadingSpace() ||
>>>> ConcatInfo.AvoidConcat(PrevPrevTok, PrevTok, Tok))
>>>> out += ' ';
>>>>
>>>> if (checkReplacement())
>>>> {
>>>> continue;
>>>> }
>>>>
>>>> out += TmpPP.getSpelling(Tok);
>>>> TmpPP.Lex(Tok);
>>>> }
>>>> if (stopOutputOnNextToken)
>>>> {
>>>> break;
>>>> }
>>>> }
>>>> else
>>>> {
>>>> if (checkReplacement())
>>>> {
>>>> continue;
>>>> }
>>>>
>>>> // Output original code because we are outside of a
>>>> // replacement.
>>>> out += TmpPP.getSpelling(Tok);
>>>> TmpPP.Lex(Tok);
>>>> }
>>>> }
>>>> else
>>>> {
>>>> TmpPP.Lex(Tok);
>>>> }
>>>> }
>>>>
>>>> // Restore the preprocessor's old state.
>>>> TmpPP.setDiagnostics(*OldDiags);
>>>> TmpPP.setPragmasEnabled(PragmasPreviouslyEnabled);
>>>>
>>>> for (const auto &tmpm : TmpPP.macros())
>>>> {
>>>> auto it = MacroPreviouslyEnabled.find(tmpm.first);
>>>> if (it != MacroPreviouslyEnabled.end())
>>>> {
>>>> auto MD = tmpm.second.getLatest();
>>>> auto MI = MD->getMacroInfo();
>>>> if (MI->isEnabled() && !it->second)
>>>> {
>>>> MI->DisableMacro();
>>>> }
>>>> else if (!MI->isEnabled() && it->second)
>>>> {
>>>> MI->EnableMacro();
>>>> }
>>>> }
>>>> }
>>>>
>>>> return out;
>>>> }
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RewriteManager.h
Type: text/x-chdr
Size: 3423 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181023/40acbed1/attachment.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RewriteManager.cpp
Type: text/x-c++src
Size: 13393 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181023/40acbed1/attachment.cpp>