[cfe-commits] [PATCH] X-TU Refactoring support

Wed May 23 09:31:56 PDT 2012

Addressed all of Dougs comments, with the exception of the getRangeSize,
which I moved to be an implementation detail and added a FIXME for now.

Submitted as r157331.

Thanks,
/Manuel

On Thu, May 10, 2012 at 4:39 PM, Manuel Klimek <klimek at google.com> wrote:

> More thoughts:
> When writing tools, what I often really want are *contiguous* ranges -
> I guess that's the idea of makeFileCharRange, too, but it doesn't
> really come out in the name or the types (or I'm just confused :)
>
> So, I want to go from an AST node to a ContiguousRange in one step,
> and then either get back that it's not possible to find one, or have
> one on which I can call things like getText(), getPositionAsString(),
> getFileName(), etc
>
> I'm not sure exactly where that would fit in - currently we internally
> still have a few functions that do stuff like that, but not
> orthogonally enough that I'm happy with it.
>
> Cheers,
> /Manuel
>
> On Thu, May 10, 2012 at 3:55 PM, Manuel Klimek <klimek at google.com> wrote:
> > After pondering about this some more, my concrete proposal would
> > probably be to have 2 methods on the Lexer:
> > makeLargestFullMatchingFileCharRange(...)
> > makeSmallestFullMatchingFileCharRange(...)
> >
> > (names are prolly crap, suggestiongs welcome :)
> >
> > both taking a range; the first trying to find the outermost range that
> > exactly covers the given range, the second trying to find the
> > innermost range that exactly covers the given range.h
> >
> > The first one is basically makeFileCharRange, plus being able to get
> > the spelling location,
> > the second one is basically getting the spelling location, plus some
> > magic when the AST node is created by from multiple different macro
> > expansions.
> >
> > Thoughts?
> > /Manuel
> >
> > On Thu, May 10, 2012 at 3:35 PM, Manuel Klimek <klimek at google.com>
> wrote:
> >> On Fri, May 4, 2012 at 7:56 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com>
> wrote:
> >>> On May 3, 2012, at 11:10 AM, Manuel Klimek wrote:
> >>>
> >>>> On Fri, Apr 27, 2012 at 7:49 PM, Douglas Gregor <dgregor at apple.com>
> wrote:
> >>>>> On Apr 19, 2012, at 4:30 AM, Manuel Klimek <klimek at google.com>
> wrote:
> >>>>>
> >>>>>> The attached patch adds a layer on top of the tooling support for
> >>>>>> cross-TU refactorings.
> >>>>>
> >>>>>
> >>>>> +static const char * const InvalidLocation = "invalid-location";
> >>>>> +
> >>>>> +Replacement::Replacement()
> >>>>> +  : FilePath(InvalidLocation), Offset(0), Length(0) {}
> >>>>> +
> >>>>>
> >>>>> If we have to pick a sentinel, can it at least be an empty FilePath?
> >>>>>
> >>>>> +bool saveRewrittenFiles(Rewriter &Rewrite) {
> >>>>> +  for (Rewriter::buffer_iterator I = Rewrite.buffer_begin(),
> >>>>> +                                 E = Rewrite.buffer_end();
> >>>>> +       I != E; ++I) {
> >>>>> +    // FIXME: This code is copied from the FixItRewriter.cpp - I
> think it should
> >>>>> +    // go into directly into Rewriter (there we also have the
> Diagnostics to
> >>>>> +    // handle the error cases better).
> >>>>> +    const FileEntry *Entry =
> >>>>> +        Rewrite.getSourceMgr().getFileEntryForID(I->first);
> >>>>> +    std::string ErrorInfo;
> >>>>> +    llvm::raw_fd_ostream FileStream(
> >>>>> +        Entry->getName(), ErrorInfo,
> llvm::raw_fd_ostream::F_Binary);
> >>>>> +    if (!ErrorInfo.empty())
> >>>>> +      return false;
> >>>>> +    I->second.write(FileStream);
> >>>>> +    FileStream.flush();
> >>>>> +  }
> >>>>> +  return true;
> >>>>> +}
> >>>>>
> >>>>> I agree that this should go into the Rewriter. Go for it :)
> >>>>>
> >>>>> +int getRangeSize(SourceManager &Sources, const CharSourceRange
> &Range) {
> >>>>> +  SourceLocation SpellingBegin =
> Sources.getSpellingLoc(Range.getBegin());
> >>>>> +  SourceLocation SpellingEnd =
> Sources.getSpellingLoc(Range.getEnd());
> >>>>> +  std::pair<FileID, unsigned> Start =
> Sources.getDecomposedLoc(SpellingBegin);
> >>>>> +  std::pair<FileID, unsigned> End =
> Sources.getDecomposedLoc(SpellingEnd);
> >>>>> +  if (Start.first != End.first) return -1;
> >>>>> +  if (Range.isTokenRange())
> >>>>> +    // FIXME: Bring in the correct LangOptions.
> >>>>> +    End.second += Lexer::MeasureTokenLength(SpellingEnd, Sources,
> >>>>> +                                            LangOptions());
> >>>>> +  return End.second - Start.second;
> >>>>> +}
> >>>>>
> >>>>> This should probably return an 'unsigned'. It belongs in the Lexer,
> IMO.
> >>>>
> >>>> So return bool and use and unsigned reference to return the value?
> >>>>
> >>>> My main concern with moving that right now is that we recently
> >>>> discovered that this might not be the right solution yet to create a
> >>>> replacement from an AST node - we'll probably want to drill through
> >>>> macros and find the right point inside the expansion closest to the
> >>>> spelling where the full node is expressed.
> >>>> I've looked at makeFileCharRange in the Lexer, which seemed related at
> >>>> a first glance, but I have no idea what it really does :)
> >>>
> >>> Ok, I consider that a failure in documenting makeFileCharRange.
> >>>
> >>> That function is trying to deal with macros and return a range based
> on file locations, which I assume is what you are trying to do as well.
> >>> Dealing with macros in general is of course an open problem, note that
> the function won't give you a range inside the macro definition.
> >>> The cases where it can successfully handle macros are:
> >>>
> >>> -begin or end range lies at the start or end of a macro expansion, in
> which case the location will be set to the expansion point, e.g:
> >>>
> >>> #define M 1 2
> >>> a M
> >>>
> >>> If you have a range [a, 2] (where 2 came from the macro), the function
> will return a range for "a M"
> >>> if you have range [a, 1], the function will fail because the range
> overlaps with only a part of the macro
> >>>
> >>> -The macro is a function macro and the range can be mapped to the
> macro arguments, e.g:
> >>>
> >>> #define M 1 2
> >>> #define FM(x) x
> >>>
> >>> FM(a b M)
> >>>
> >>> if you have range [b, 2], the function will return the file range "b
> M" inside the macro arguments
> >>> if you have range [a, 2], the function will return the file range
> "FM(a b M)" since the range includes all of the macro expansion.
> >>>
> >>> Interested in thoughts on improving this. I guess we could allow
> conditionally mapping inside the macro definition if the range is part of
> the macro definition, and leave it to the caller to make sure changing the
> definition will be valid for all uses of it.
> >>
> >> So, for the case where I want to generate Replacements based on an AST
> >> nodes, I think there are 2 cases where makeFileCharRange does not what
> >> I need.
> >>
> >> One case is mapping inside the macro definition - in this case I agree
> >> with your assessment; it is definitely cool when you can rename a
> >> function and all calls inside macros are automatically updated.
> >>
> >> The second case is that I mostly want the innermost location, where
> >> the full node is in the text. Consider:
> >> #define A(X) X
> >> A(class Y)
> >> In this case when I want to replace the CXXRecordDecl, I usually want
> >> to replace the text inside the macro argument, *unless* I want to
> >> completely delete the code.
> >>
> >> I realize that those might be pretty special cases. Not sure we can
> >> find a common abstraction to pull out, but if we can, that would
> >> definitely help.
> >>
> >> Also, I think there's no good way to represent source manager
> >> independent character accurate ranges - am I missing something? If
> >> not, would we want to introduce a class for that?
> >>
> >> Thanks,
> >> /Manuel
> >>
> >>
> >>
> >>
> >>>> My problem is currently that I could need some ideas on what to do
> >>>> here, as the general case seems really hard to implement and I'm a
> >>>> little lost.
> >>>>
> >>>> On the other hand, I think using the spelling location is a good first
> >>>> approximation (as it actually works most of the time), so I think this
> >>>> is a good first step, and I don't think keeping the implementation of
> >>>> how to figure out what range is covered by a node a detail of the
> >>>> refactoring lib is a big problem right now.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>>> I'm not sure whether you've looked at the Edit library or not, but
> its functionality fits well with what you're doing here, because it can
> order rewrites such that a given set of rewrite operations is more likely
> to succeed. For example, if one rewrite edits part of a function body and
> another rewrite removes that function entirely, the edit library will deal
> with it whereas the rewriter by itself could fail. There's no specific
> action item here, other than a suggestion that there's some potential
> goodness in combining these features.
> >>>>
> >>>> I've looked and found that it didn't completely fit what we're trying
> >>>> to do with the refactoring library (mainly creating intermediate
> >>>> representations that are not depending on a sourcemanager to then
> >>>> apply them later in bulk over different TUs). I'll take a deeper look
> >>>> to figure out how we might get some of the benefits you mentioned out
> >>>> of it.
> >>>
> >>> That library is mainly useful for composing multiple rewrites in a
> single translation-unit, making sure the rewrites won't mess up the source:
> >>>
> >>> -Keeps track of what is getting removed so that if rewrites
> inadvertently remove something more than once, the rewrites compose
> correctly
> >>> -Makes use of the aforementioned makeFileCharRange to deal with macros
> >>> -Makes sure a rewrite won't mess up the source because of #if/#endif
> conditional blocks (the preprocessor must be enabled to keep track of those)
> >>> -Allows for "transaction" rewrites, in that either all of the rewrites
> in a "transaction" can be applied correctly (no messing up the source) or
> none of the rewrites will be applied.
> >>>
> >>> At the end it can give a list of changes with file ranges for that
> translation unit, but composing multiple rewrites across different TUs
> still needs to be handled one level higher than that.
> >>>
> >>>>
> >>>> Thanks for the review!
> >>>> /Manuel
> >>>>
> >>>> _______________________________________________
> >>>> cfe-commits mailing list
> >>>> cfe-commits at cs.uiuc.edu
> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
> >>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120523/e53c776a/attachment.html>