[PATCH] D24383: Add addOrMerge interface to tooling::Replacements.

Eric Liu via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 12 07:36:36 PDT 2016


ioeric added a comment.

@djasper Thanks for the insight into the problem.

For the cases you listed, I have added some test cases. Case 1-3 can be handled as expected. And you are right, the behavior for case 4 is interesting.

But I still agree that `addOrMerge` is confusing. As you said, the intricate behavior here is adding benign conflicting replacement in case 1-3. This is actually like `add` with assumption that benign conflicts will be resolved by combining/merging with the adjacent replacements. Maybe we can add an optional parameter to `add` interface (e.g. `mergeAdjacent`) to resolve conflicts for case 1-3 and still return error for case 4. And this might also answer @alexshap's question 2 and 3.

Copying discussion from email thread:

> @ioeric:

> 

>   1. regarding B: i apologize in advance, i am not sure if it's directly related to your patch, but i'm kinda afraid that the current approach will shadow the issue (if it's considered to be an issue). For simplicity let's take a look at the following example:

> 

>     Point.h: struct Point {};

> 

>     a.cpp: include "Point.h"

> 

>     b.cpp: include "Point.h"

> 

>     clang-rename rename-all -i -old-name Point -new-name Point2 a.cpp b.cpp Renaming failed in /Users/Alexshap/PlayRename/srcs/./include/Point.h! New replacement: /Users/Alexshap/PlayRename/srcs/./include/Point.h: 7:+5:"Point2" conflicts with existing replacement: /Users/Alexshap/PlayRename/srcs/./include/Point.h: 7:+5:"Point2"

> 

>     the thing is that in the past Replacements was a typedef on std::set<...> (if i am not mistaken) and insert() ignored duplicates. Now Replacements.add(...) will return an error (the same replacement has already been added).

> 2. (Separate observation, I'd like to take advantage of this discussion and ask about it)

> 

>   Replacements Replacements::merge(const Replacements &ReplacesToMerge)

> 

>   The question is the following: with the current implementation we copy all the replacements even if the merge is "trivial" or if there are only a few conflicts. This looks suboptimal, please, correct me if I'm wrong.

> 3. Having said that - just curious - is the fallback on merge (in addOrMergeReplacement ) really necessary in those examples ? (I mean in change-namespace and include-fixer)

> 

> Thanks,

> Alex

>
> 

> On Sat, Sep 10, 2016 at 9:43 PM, Daniel Jasper <djasper at google.com> wrote:

>  There are several things I find particularly confusing about this. Fundamentally, Replacements refer to a specific version of the code. Say you have a Replacements object R and a Replacement r that you want to integrate into it. Further assume, that C0 is the initial version of the code whereas C1 is the version after applying R. If you *add* r to R, that means that r (specifically the offset and length in it) refer to C0. If you *merge* it, that means that r refers to C1. Correct replacements can always be *merged*, while *adding* might cause conflicts if they overlap with existing ones. Thus, a function "add or merge" fundamentally doesn't make sense to me. These are fundamentally different concepts.

> 

> What this function (and the different implementations we already have) is fundamentally trying to do is to have a version of *add* that resolves (some) conflicts. It does that by assuming that r is currently referring to C0, so it first shifts it and then calls *merge*. The fact that you can shortcut if there isn't actually a conflict (i.e. the "addOr" part) is just an optimization. I am fine with implementing that optimization, but it shouldn't be part of the name.

> 

> Now, I don't (yet) have a good name as this has very intricate behavior. And I am not sure it is really a good idea to have this magically "resolve" all conflicts, because I think there are different classes. I think what we want here is to have a way to combine replacements that require a defined order, but that don't actually conflict. Lets look at a few cases. Assume R contains a single Replacement A and you trying to "addOrMerge" another Replacement B.

> 

> 1. A and B are both inserts at the same code location. This seems relatively benign and we just want a defined order

> 2. A inserts at offset X and B replaces (X, X+N). Still quite benign, but we don't want B to replace anything that A inserted, it should replace the original text

> 3. B inserts at offset X and A replaces (X, X+N). Same thing, though interesting as we now actually have to switch the order

> 4. A and B actually replace overlapping code ranges. This is really bad and I think we should continue to report a conflict instead of doing something weird

> 

>   I'd think that your existing implementation gets #1 right but possibly only one of #2/#3. It will also do something very interesting for #4 and I really think what we want is to report an error.

> 

>   On Sat, Sep 10, 2016 at 5:53 PM, Eric Liu <ioeric at google.com> wrote: This function has been implemented in several places, so I figure it might be useful to have this function in the library. Although I'm not sure whether this should be a class interface or just a standalone helper function.

> 

>   @alexshap: thanks for the explanation. For your point B,  I'm not sure about the example, could you elaborate a bit? E.g. which use case to be specific? What are the headers?

> 

>   Generally, the way users of `Replacements` class adding new replacements also matters. For example, if you have multiple insertions at the same offset, you could concatenate them and insert as one single replacement, if you care about performance.

> 

>   For most use cases I've seen when converting from old std::set implementation to the current implementation, performance can be improved by changing the way replacements are added. And the most natural way is often the most efficient one.



https://reviews.llvm.org/D24383





More information about the cfe-commits mailing list