<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">Hi Alex,<br>

      <br>

      Thanks again for all your help.  Actually I did manage to make

      this eventually work with a huge amount of Lexer/Token magic. 

      Unfortunately I could not actually follow it past 1 level, so it

      looks like an unworkable solution for what I have in mind. I had

      thought the macro definitions were expanded from other macros. 

      After digging through the SLocEntries, this is clearly not the case

      and they are expanded at their final uses.  This means I'm going

      back to the PPCallbacks and digging in there.  I think I can get

      the whole tree without annotating every macro I meet.  Apparently

      MacroExpands gets called repeatedly at the point the macro gets

      used.<br>
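      <br>
      As a rough illustration of the dependency-tree idea, here is a
      minimal sketch that does not use Clang at all.  It assumes the
      macro definitions are already available as plain strings and
      rescans each replacement text for the names of other known macros,
      which mirrors the observation that nested macros are only resolved
      at the point of use.  MacroTable, identifiersIn, collectTree and
      macroTree are hypothetical names made up for this example, and
      string literals and comments in a body are not treated specially.<br>
<pre>
```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model, not a Clang API: macro name -> unexpanded replacement text.
using MacroTable = std::map<std::string, std::string>;

static bool isWordChar(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
}

// Split a replacement text into identifier-like words.
// Words that start with a digit (numbers such as 0x1F) are skipped.
std::vector<std::string> identifiersIn(const std::string &body) {
    std::vector<std::string> ids;
    std::size_t i = 0;
    while (i < body.size()) {
        if (isWordChar(body[i])) {
            std::size_t start = i;
            while (i < body.size() && isWordChar(body[i]))
                ++i;
            if (!std::isdigit(static_cast<unsigned char>(body[start])))
                ids.push_back(body.substr(start, i - start));
        } else {
            ++i;
        }
    }
    return ids;
}

// Append `name` and every macro reachable from it, indented by depth.
// `active` holds the macros currently being walked, so a macro that
// mentions itself is listed once more but not walked again.
void collectTree(const MacroTable &macros, const std::string &name,
                 std::set<std::string> &active, int depth,
                 std::vector<std::string> &out) {
    out.push_back(std::string(2 * depth, ' ') + name);
    auto it = macros.find(name);
    if (it == macros.end() || !active.insert(name).second)
        return;
    for (const std::string &id : identifiersIn(it->second))
        if (macros.count(id))
            collectTree(macros, id, active, depth + 1, out);
    active.erase(name);
}

// Convenience wrapper: the whole tree below `root` as indented lines.
std::vector<std::string> macroTree(const MacroTable &macros,
                                   const std::string &root) {
    std::set<std::string> active;
    std::vector<std::string> out;
    collectTree(macros, root, active, 0, out);
    return out;
}
```
</pre>
      With a table in which ASSERT maps to "ASSERT_IFNOT(cond,
      _ASSERT_PANIC(AssertAssert))" (the tokens from the dump further
      down) and the other two macros have made-up bodies that mention no
      further macros, macroTree(table, "ASSERT") lists ASSERT with
      ASSERT_IFNOT and _ASSERT_PANIC indented beneath it.<br>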

      <br>

      Kind Regards,<br>

         -Eric<br>

      <br>

      On 9/26/2016 10:41 PM, Eric Bayer wrote:<br>

    </div>

    <blockquote

      cite="mid:26ac2529-2e33-1961-bd5a-e511b2ba7f85@vmware.com"

      type="cite">

      <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

      <div class="moz-cite-prefix">Alex,<br>

        <br>

        First off thanks so much for your help (and probably patience at

        this point.)  Okay, that all works with a few tweaks.  I spent

        most of the day trying to figure out how I get the definition. 

        I have been looking at the getSpellingLoc() which seems to get

        me one end of it, but I can't seem to figure out how I find the

        end of the definition.  If this were just a string I'd look

        until I found a line break that wasn't preceded with a \.  So

        far I tried constructing a lexer and using ReadToEndOfLine() and

        LexFromRawLexer() based on some things I found online.  Neither

        seemed to work.  My eventual goal is to get another SourceRange

        and check it for macros as well, i.e. I want to check for any

        macro dependency trees.  Right now the return is a StringRef just

        for debugging.  I've attached the code below of what I

        tried.  ReadToEndOfLine() seems to never advance anything, and

        LexFromRawLexer() seems to never come across a tok::eod.  :/ 

        Some output below the function clip.  Maybe there's an entirely

        easier approach?<br>

        <br>

           -Eric<br>

        <br>

        <div style="font-family: 'Courier New'; font-size:

          12.0pt; color: #000000;background-color: #e7e7e7; font-style:

          normal; font-weight: normal; text-decoration: none;">

          <pre><span>StringRef</span><span> </span><span style="font-weight: bolder;">getTokensThroughEndOfDefine</span><span>(</span><span>SourceLocation</span><span> </span><span>BeginLoc</span><span>, </span><span>SourceManager</span><span> </span><span>&</span><span>SM</span><span>) </span><span style="color:#800000;">{</span>

<span>    </span><span style="color:#800080;">const</span><span> </span><span>LangOptions</span><span> </span><span>&</span><span>LangOpts</span><span> </span><span style="color:#004080;">=</span><span> </span><span style="font-weight: bolder;">getDefaultLangOpts</span><span>()</span><span>;</span>

<span>    </span><span>SourceLocation</span><span> </span><span>CurLoc</span><span> </span><span style="color:#004080;">=</span><span> </span><span>BeginLoc</span><span>;</span>

<span>    </span><span>SourceLocation</span><span> </span><span>NextLoc</span><span>;</span>

<span>    </span><span style="color:#800080;">int</span><span> </span><span>iter</span><span> </span><span style="color:#004080;">=</span><span> </span><span style="color:#000080;">0</span><span>;</span>


<span>    </span><span style="color:#c04000;">std</span><span>::</span><span style="color:#c04000;">pair</span><span><</span><span>FileID</span><span>, </span><span style="color:#800080;">unsigned</span><span>></span><span> </span><span>cur_info</span><span> </span><span style="color:#004080;">=</span><span> </span><span>SM</span><span>.</span><span style="font-weight: bolder;">getDecomposedLoc</span><span>(</span><span>BeginLoc</span><span>)</span><span>;</span>

<span>    </span><span style="color:#800080;">bool</span><span> </span><span>invalid</span><span> </span><span style="color:#004080;">=</span><span> </span><span style="color:#800080;">false</span><span>;</span>

<span>    </span><span>StringRef</span><span> </span><span>buf</span><span> </span><span style="color:#004080;">=</span><span> </span><span>SM</span><span>.</span><span style="font-weight: bolder;">getBufferData</span><span>(</span><span>cur_info</span><span>.</span><span>first</span><span>, </span><span>&</span><span>invalid</span><span>)</span><span>;</span>


<span>    </span><span style="color:#800080;">if</span><span> (</span><span>invalid</span><span>) </span><span style="color:#800000;">{</span>

<span>        </span><span style="color:#800080;">return</span><span> </span><span>StringRef</span><span>()</span><span>;</span>

<span>    </span><span style="color:#800000;">}</span>


<span>    </span><span style="color:#008000;font-style: italic;">// Get the point in the buffer</span>

<span>    </span><span style="color:#800080;">const</span><span> </span><span style="color:#800080;">char</span><span>*</span><span> </span><span>point</span><span> </span><span style="color:#004080;">=</span><span> </span><span>buf</span><span>.</span><span style="font-weight: bolder;">data</span><span>() </span><span>+</span><span> </span><span>cur_info</span><span>.</span><span>second</span><span>;</span>


<span>    </span><span style="color:#008000;font-style: italic;">// Make a lexer and point it at our buffer and offset</span>

<span>    </span><span>Lexer</span><span> </span><span style="font-weight: bolder;">lexer</span><span>(</span><span>SM</span><span>.</span><span style="font-weight: bolder;">getLocForStartOfFile</span><span>(</span><span>cur_info</span><span>.</span><span>first</span><span>), </span><span>LangOpts</span><span>,</span>

<span>                </span><span>buf</span><span>.</span><span style="font-weight: bolder;">begin</span><span>(), </span><span>point</span><span>, </span><span>buf</span><span>.</span><span style="font-weight: bolder;">end</span><span>())</span><span>;</span>


<span>    </span><span style="color:#800080;">while</span><span> (</span><span style="color:#000080;">1</span><span>) </span><span style="color:#800000;">{</span>

<span>        </span><span style="color:#008000;font-style: italic;">// read through the end of line</span>

<span>        </span><span>SmallString</span><span><</span><span style="color:#000080;">128</span><span>></span><span> </span><span>text</span><span>;</span>

<span>        </span><span>lexer</span><span>.</span><span style="font-weight: bolder;">ReadToEndOfLine</span><span>(</span><span>&</span><span>text</span><span>)</span><span>;</span>


<span>        </span><span style="color:#800080;">if</span><span> (</span><span>text</span><span>.</span><span style="font-weight: bolder;">empty</span><span>() </span><span>||</span><span> </span><span>text</span><span>.</span><span style="font-weight: bolder;">back</span><span>() </span><span>!=</span><span> </span><span style="color:#006060;">'\\'</span><span>) </span><span style="color:#800000;">{</span>

<span>            </span><span style="color:#800080;">break</span><span>;</span>

<span>        </span><span style="color:#800000;">}</span>


<span>        </span><span>llvm</span><span>::</span><span style="font-weight: bolder;">errs</span><span>() </span><span><<</span><span> </span><span style="color:#008080;">"Incomplete line, so far: "</span><span> </span><span><<</span>

<span>        </span><span style="font-weight: bolder;">getCodeString</span><span>(</span><span>SM</span><span>, </span><span>BeginLoc</span><span>, </span><span>lexer</span><span>.</span><span style="font-weight: bolder;">getFileLoc</span><span>(), </span><span style="color:#008080;">"Token"</span><span>) </span><span><<</span><span> </span><span style="color:#008080;">"\n"</span><span>;</span>

<span>    </span><span style="color:#800000;">}</span>


<span>    </span><span style="color:#800080;">return</span><span> </span><span style="font-weight: bolder;">getCodeString</span><span>(</span><span>SM</span><span>, </span><span>BeginLoc</span><span>, </span><span>lexer</span><span>.</span><span style="font-weight: bolder;">getFileLoc</span><span>(), </span><span style="color:#008080;">"Definition"</span><span>)</span><span>;</span>

<span style="color:#808000;">#if</span><span> </span><span style="color:#000080;">0</span>

<span style="color:#808080;">    Token tok;</span>

<span style="color:#808080;">    </span><span style="color:#808080;font-weight: bolder;">while</span><span style="color:#808080;"> (1) {</span>

<span style="color:#808080;">        lexer.LexFromRawLexer(tok);</span>


<span style="color:#808080;">        </span><span style="color:#808080;font-weight: bolder;">if</span><span style="color:#808080;"> (tok.is(tok::eof) || tok.is(tok::eod)) {</span>

<span style="color:#808080;">            </span><span style="color:#808080;font-weight: bolder;">break</span><span style="color:#808080;">;</span>

<span style="color:#808080;">        }</span>


<span style="color:#808080;">        llvm::errs() << "Token[" << tok.getName() << "]: \"" <<</span>

<span style="color:#808080;">            getCodeString(SM, tok.getLocation(), tok.getEndLoc(), "Token") <<</span>

<span style="color:#808080;">            "\"\n";</span>

<span style="color:#808080;">    }</span>


<span style="color:#808080;">    </span><span style="color:#808080;font-weight: bolder;">return</span><span style="color:#808080;"> getCodeString(SM, BeginLoc, tok.getEndLoc(), "Definition");</span>

<span style="color:#808000;">#endif</span>

<span style="color:#800000;">}</span>

</pre>

        </div>

        <br>
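        For comparison with the Lexer-based attempt above, the
        plain-string approach mentioned earlier (scan until a line break
        that is not preceded by a backslash) can be sketched roughly as
        follows.  This is only an illustration over a raw buffer, not a
        Clang API: findEndOfDefine is a hypothetical helper, and it
        assumes the backslash sits immediately before the newline (an
        optional '\r' is tolerated), ignoring comments, trigraphs and
        trailing whitespace after the backslash.<br>
<pre>
```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given the buffer text and the offset where a macro
// definition starts, return the offset of the line break that ends the
// definition (or buf.size() if the buffer ends first). A backslash directly
// before the line break continues the definition onto the next line.
std::size_t findEndOfDefine(const std::string &buf, std::size_t begin) {
    std::size_t pos = begin;
    while (pos < buf.size()) {
        std::size_t nl = buf.find('\n', pos);
        if (nl == std::string::npos)
            return buf.size();  // definition runs to the end of the buffer

        // Step back over an optional '\r' so "\\\r\n" also counts.
        std::size_t last = nl;
        if (last > pos && buf[last - 1] == '\r')
            --last;

        bool continued = last > pos && buf[last - 1] == '\\';
        if (!continued)
            return last;        // unescaped line break: definition ends here

        pos = nl + 1;           // continuation: keep scanning the next line
    }
    return buf.size();
}
```
</pre>
        The returned offset could then be added to the start of the
        buffer to form the end of a range, in the same way the code
        above computes "point" from cur_info.second.<br>
        <br>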

        Example failure on tokens:  (and ignore the fact that we're

        sorta printing out two tokens on every line as getEndLoc() seems

        to really be the next token and getCodeString() seems to print

        on token boundaries.)<br>

        <br>

        Macro name: ASSERT<br>

        Macro string: ASSERT((getFirstMatchingOnly &&

        firstMatching != nullptr) ||<br>

                  (!getFirstMatchingOnly && (allMatchingMo !=

        nullptr ||<br>

                                             allMatchingMoRef !=

        nullptr)))<br>

        Token[raw_identifier]: "ASSERT_IFNOT("<br>

        Token[l_paren]: "(cond"<br>

        Token[raw_identifier]: "cond,"<br>

        Token[comma]: ","<br>

        Token[raw_identifier]: "_ASSERT_PANIC("<br>

        Token[l_paren]: "(AssertAssert"<br>

        Token[raw_identifier]: "AssertAssert)"<br>

        Token[r_paren]: "))"<br>

        Token[r_paren]: ")"                                   &lt;----

        I'd expect an eod token here.  Guessing though.<br>

        Token[hash]: "#define"<br>

        Token[raw_identifier]: "define"<br>

        ...<br>
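        <br>
        The doubled tokens in this dump look consistent with an end
        location that is one past the token being handed to something
        that expects a token range, where the end is the start of the
        last token.  The sketch below is a toy model of that difference
        over plain offsets, not Clang code; Tok, charRangeText and
        tokenRangeText are hypothetical names, and the real
        getCodeString() helper is not shown in this thread, so its
        behaviour is an assumption.<br>
<pre>
```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model, not Clang: a token is a half-open byte span [begin, end).
struct Tok {
    std::size_t begin;
    std::size_t end;
};

// Character range: `end` is one past the last character to include.
std::string charRangeText(const std::string &src, std::size_t begin,
                          std::size_t end) {
    return src.substr(begin, end - begin);
}

// Token range: `lastTokBegin` is the START of the last token, so the text
// extends to the end of whichever token starts at that offset.
std::string tokenRangeText(const std::string &src,
                           const std::vector<Tok> &toks, std::size_t begin,
                           std::size_t lastTokBegin) {
    for (const Tok &t : toks)
        if (t.begin == lastTokBegin)
            return src.substr(begin, t.end - begin);
    return std::string();  // no token starts at that offset
}
```
</pre>
        For the text "ASSERT_IFNOT(cond" with tokens at [0,12), [12,13)
        and [13,17), reading offsets 0 and 12 as a character range gives
        just the identifier, while reading the same pair as a token
        range also pulls in the following "(".<br>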

        <br>

        On 9/26/2016 3:12 PM, Alex L wrote:<br>

      </div>

      <blockquote

cite="mid:CAKS3GBtSan6JZgeZGeVxt8zEAxOhxn889728m11uOzntSwCnWQ@mail.gmail.com"

        type="cite">

        <meta http-equiv="Content-Type" content="text/html;

          charset=utf-8">

        <div dir="ltr"><br>

          <div class="gmail_extra"><br>

            <div class="gmail_quote">On 26 September 2016 at 14:55, Eric

              Bayer <span dir="ltr">&lt;<a moz-do-not-send="true"

                  href="mailto:ebayer@vmware.com" target="_blank">ebayer@vmware.com</a>&gt;</span>

              wrote:<br>

              <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                <div bgcolor="#FFFFFF">

                  <div>Thanks Alex,<br>

                    <br>

                    That gets me mostly there.  Pardon if that is a dumb

                    question, but I'm not sure how I go from a

                    SourceLocation to a Token.  I have not worked at all

                    in the preprocessor levels before.</div>

                </div>

              </blockquote>

              <div><br>

              </div>

              <div>Something like this should work:</div>

              <div><br>

              </div>

              <div>

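                <div>    StringRef getToken(SourceLocation BeginLoc,

                  SourceManager &amp;SM, LangOptions &amp;LangOpts) {</div>

                <div>      // getLocForEndOfToken() returns the location just past the token,</div>

                <div>      // so the pair below is a character range, not a token range.</div>

                <div>      const SourceLocation EndLoc =

                  Lexer::getLocForEndOfToken(BeginLoc, 0, SM, LangOpts);</div>

                <div>      return

                  Lexer::getSourceText(CharSourceRange::getCharRange(BeginLoc,

                  EndLoc), SM, LangOpts);</div>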

                <div>    }</div>

              </div>

              <div><br>

              </div>

            </div>

          </div>

        </div>

      </blockquote>

      <p><br>

      </p>

    </blockquote>

    <p><br>

    </p>

  </body>

</html>