[cfe-dev] GSoC project ideas

Guillaume Papin guillaume.papin at epitech.eu
Mon Apr 29 14:25:10 PDT 2013


I added some comments and wrote a summary of the new plan at the end of the
mail.

"Vane, Edwin" <edwin.vane at intel.com> writes:

> Comments below.
>
>> -----Original Message-----
>> From: Guillaume Papin [mailto:guillaume.papin at epitech.eu]
>> Sent: Sunday, April 28, 2013 9:13 PM
>> To: Vane, Edwin
>> Cc: cfe-dev at cs.uiuc.edu
>> Subject: Re: [cfe-dev] GSoC project ideas
>> 
>> I'm working on the proposal and would like to have your feedback about the
>> following plan regarding cpp11-migrate.
>> 
>> 
>>                                 Tasks
>>                                 =====
>> 
>> Table of Contents
>> =================
>> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> 2 Transform for delegating constructors
>> 3 Transform for non-static data member initializers
>> 4 Add support for interactive actions
>> 5 Default transformation profile
>> 6 Integrating LibFormat
>> 7 Transform to make use of existing move constructors
>> 8 Generate a diff of the changes
>> 9 Other incomplete ideas
>> 
>> 
>> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> ==================================================
>> 
>>    Seems like a good transform to start with.
>> 
> I agree. It's not completely trivial due to semantic differences
> between auto_ptr and unique_ptr (e.g. no destructive copy in
> unique_ptr) but should be a good first big project.
>

I had this in mind (non-triviality) as you mentioned it in an earlier mail.
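
To make the semantic difference concrete, here is a minimal sketch of what the
transform would have to deal with. Widget, consume and migrated_usage are names
made up for the illustration. With auto_ptr a plain copy silently transfers
ownership; with unique_ptr the same line does not compile, so the transform
would need to insert std::move at those places.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type, only here for illustration.
struct Widget {
  int value;
  explicit Widget(int v) : value(v) {}
};

// Takes ownership of the widget and returns its value.
int consume(std::unique_ptr<Widget> w) { return w->value; }

// Before the transform (C++03, shown as a comment because std::auto_ptr is
// deprecated in C++11 and was removed in C++17):
//
//   std::auto_ptr<Widget> a(new Widget(42));
//   std::auto_ptr<Widget> b = a;   // destructive copy: 'a' is now null
//
// After the transform the ownership transfer has to be spelled out:
int migrated_usage() {
  std::unique_ptr<Widget> a(new Widget(42));
  // std::unique_ptr<Widget> b = a;          // would not compile
  std::unique_ptr<Widget> b = std::move(a);  // explicit transfer
  assert(a == nullptr);                      // a moved-from unique_ptr is null
  return consume(std::move(b));
}
```

The hard part is presumably finding every place where the old code relied on
the implicit transfer, not the rename itself.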

>> 2 Transform for delegating constructors
>> ========================================
>> 
>>    A transform that can convert code such as:
>> 
>>   struct A
>>   {
>>     int x;
>> 
>>     A() : x(0) { }
>>     A(int _x) : x(_x) { }
>>   };
>> 
>> 
>>    Into:
>> 
>>   struct A
>>   {
>>     int x;
>> 
>>     A() : A(0) { }                // now use delegation
>>     A(int _x) : x(_x) { }
>>   };
>> 
>> 
>>    This is a really trivial case here but I expect this transform to
>>    be non-trivial to implement.
>> 
>
> A test for determining if the functionality of one constructor is
> completely subsumed by another would be really difficult to do. I'm
> not sure the benefit of a few less lines of code and some improved
> maintainability is really worth it. There is the common workaround of
> having constructors call init() functions that might be easier to
> handle but still, I think there are more useful things to focus on
> first.
>

Okay, I will remove this from the list then.

I was considering handling only constructors with empty bodies (at least for
the delegating one) and only simple expressions in the initializers (such as
parameters, literals, ...). But it was mostly for aesthetic reasons and some
other transforms might be more beneficial (tr1?).

>> 3 Transform for non-static data member initializers
>> ====================================================
>> 
>>    When one or more constructors initialize a member variable with
>>    a value independent of the constructor arguments, the
>>    initialization can be placed in-class.
>> 
>>    This might be beneficial when multiple constructors are duplicating
>>    member initialization.
>> 
>>    Note that this transform might easily lead to conflicts with the
>>    previous transform (delegating constructors).
>> 
>
> Also questionable implementation/benefit ratio. You'd have to ensure
> every member variable is initialized the same way by every
> constructor. If you detect such a case, that would mean removing all
> the existing initializations and adding the in-class initialization.
> All that's left is to hope the user didn't mind making some vars
> initialized by constructors and some by the in-class initializers.
>

I totally agree. I will remove it from the list.
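
For the record, a minimal sketch of the rewrite I had in mind. Before and After
are made-up types; the point is only that 'count' gets the same
argument-independent value in every constructor, which is the condition the
transform would have to prove.

```cpp
#include <cassert>
#include <string>

// Before: every constructor repeats the same initialization of 'count'.
struct Before {
  int count;
  std::string name;

  Before() : count(0), name("default") {}
  explicit Before(const std::string &n) : count(0), name(n) {}
};

// After: 'count' is initialized in-class (C++11 non-static data member
// initializer) and removed from every constructor's initializer list.
// 'name' stays in the constructors because its value differs between them.
struct After {
  int count = 0;
  std::string name;

  After() : name("default") {}
  explicit After(const std::string &n) : name(n) {}
};
```

This also shows the mixed result you mention: one member ends up initialized
in-class and the other in the constructors.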

>> 4 Add support for interactive actions
>> ======================================
>> 
>>    Some actions might need user interaction.
>> 
>>    Example (maybe not the best one):
>>    If some replacement code needs to introduce a new variable and
>>    the default identifier is already taken, then we might want to
>>    prompt the user for an alternative name.
>> 
>>    Or simply to ask confirmation before a risky replacement.
>> 
>
> Definitely something we'd like to add to the migrator but requires
> some design first. User interactivity should be implemented in such a
> way that the actual user interface doesn't matter. That way one could
> write a plugin for an editor/IDE or just have a simple command-line
> interface. This implies some sort of library interface for
> cpp11-migrate and cpp11-migrate itself then turns into a library.
> LibFormat and clang-format have the same relationship. I'm not sure if
> this much design work is suitable to a GSoC project.
>

I see. Actually this idea was very vague. I will remove it from the list as I
don't think I'm well suited (yet?) to start designing such a library.

>> 5 Default transformation profile
>> =================================
>> 
>>    Apply a list of transformations by default and allow different
>>    profiles. By profile I'm talking about an option such as:
>> 
>>      cpp11-migrate -target-profile=[clang-3.2|gcc-4.7|...] ...
>> 
>> 
>>    This option will apply to the input files all the known safe
>>    (low-risk/zero-risk) transformations that are supported by the
>>    given target.
>> 
>>    This could allow incremental migration toward C++11. Let's say the
>>    project has to support Clang 3.1 at first and later on the
>>    minimum version switches to 3.2; they can then re-run the tool
>>    with the new profile.
>> 
>
> This is kinda cool. It's certainly not much work right now since there
> are only a handful of transforms. It'd be a slightly nicer way than
> just saying --all-transforms (if such an option existed) especially
> for people out there migrating code that's tied to a particular
> compiler version.
>
> Remembering the discussion about C++11 on llvm-dev a while back, maybe you
> could even specify a list of compilers to this flag and the common subset of
> supported features is applied :)
>

I actually had this in mind as well (but maybe an unconscious memory from the
discussion on llvm-dev?).
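
A rough sketch of how the "common subset" could be computed, assuming each
profile is simply a set of transform names. The profile and transform names
below are placeholders I made up (not real compiler support data), and
common_transforms is a hypothetical helper, not an existing cpp11-migrate
function.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::set<std::string> TransformSet;
typedef std::map<std::string, TransformSet> ProfileTable;

// Placeholder table: which transforms are considered safe for which target.
ProfileTable make_profiles() {
  ProfileTable p;
  p["compiler-a"].insert("transform-x");
  p["compiler-a"].insert("transform-y");
  p["compiler-a"].insert("transform-z");
  p["compiler-b"].insert("transform-y");
  p["compiler-b"].insert("transform-z");
  return p;
}

// Returns the transforms supported by every requested target.
// An unknown target yields an empty set, i.e. nothing is enabled.
TransformSet common_transforms(const ProfileTable &profiles,
                               const std::vector<std::string> &targets) {
  TransformSet result;
  bool first = true;
  for (const std::string &target : targets) {
    auto it = profiles.find(target);
    if (it == profiles.end())
      return TransformSet();
    if (first) {
      result = it->second;
      first = false;
      continue;
    }
    TransformSet common;
    std::set_intersection(result.begin(), result.end(),
                          it->second.begin(), it->second.end(),
                          std::inserter(common, common.begin()));
    result.swap(common);
  }
  return result;
}
```

With this table, asking for both "compiler-a" and "compiler-b" would enable
only transform-y and transform-z.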

>> 6 Integrating LibFormat
>> ========================
>> 
>>    In order to format correctly inserted code.
>> 
>
> Would definitely be nice. The transforms don't do too much to mangle
> code right now but any that use the TypePrinter to print out types
> will cause the 'const' to go on the wrong side of the type specifier
> according to most styles. (i.e. const MyType *A => MyType const *A). I
> don't think LibFormat handles const locations though yet, probably for
> the same reason the transforms are limited in dealing with const
> qualifiers currently: clang doesn't provide enough TypeLoc info.
>

Good.

>> 7 Transform to make use of existing move constructors
>> ======================================================
>> 
>>    With move semantics added to the language and the standard library
>>    being updated accordingly (move constructors added to many types),
>>    it is now interesting to take an argument by value and then move
>>    it (as opposed to taking it by 'const &' and then copying it).
>> 
>
> Could be useful. Also in this category would be use of
> stl_container::emplace() functions. You'll have to be very, very careful
> about semantics though.
>

Well, I guess this idea will be a good fit for the second half of the GSoC.
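
A minimal sketch of the kind of rewrite I mean, using made-up class names. The
constructor parameter changes from 'const &' to by-value and the member is
initialized with std::move, so a caller passing a temporary pays for a move
instead of a copy.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Before: the argument is taken by const reference and always copied.
class PersonBefore {
  std::string name_;

public:
  explicit PersonBefore(const std::string &name) : name_(name) {}
  const std::string &name() const { return name_; }
};

// After: the argument is taken by value and moved into the member.
// An lvalue argument costs one copy plus one move, an rvalue two moves.
class PersonAfter {
  std::string name_;

public:
  explicit PersonAfter(std::string name) : name_(std::move(name)) {}
  const std::string &name() const { return name_; }
};
```

The transform would only be worthwhile when the parameter type has a cheap
move constructor, which is one of the things it would have to check.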

>> 8 Generate a diff of the changes
>> =================================
>> 
>>    Add an option to print a diff of the modifications against the
>>    original source file.
>> 
>
> Could be useful as a kind of 'dry-run' mode where changes are not
> actually made but one could find out how many and what sort of changes
> would be made.
>

I will remove this one from the list. It has been pointed out that SCM tools
already provide such functionality quite well. I think most projects using
cpp11-migrate will already be under source control.

I was thinking about users that are curious about the tool (or C++11) who might
want to try cpp11-migrate on a file non-destructively. But an option for the
output file or directory would be easier to implement and as useful. But then,
what if an included file is modified? Is it necessary to reproduce the source
tree structure?

>> 9 Other incomplete ideas
>> =========================
>> 
>>    If the workload of the previous ideas is not sufficient for the
>>    GSoC I'm confident there is more work to do.
>> 
>>    - initializer_list and uniform initialization transforms (use
>>      cases not identified yet)
>
> Someone once suggested to me looking for:
>
> std::vector<int> A;
> A.push_back(a);
> A.push_back(b);
> ...
> A.push_back(z);
>
> And replacing with 
>
> std::vector<int> A = {a,b,...,z};
>
> I'm not entirely sure this is worth the effort. That is, how often is a
> vector initialization done this way? I'm not aware of other use cases right
> now.
>

I was thinking about easier cases (more commonly used?) such as:

  struct A
  {
    A(int a, const char *b);

    int         a;
    const char *b;
  };

  A bar()
  {
    return A(1, "toto");          // -> return { 1, "toto" };
  }


  // code such as:
  A ary[] = { A(1, "foo"), A(2, "bar"), A(3, "foobar") };
  // becomes:
  A ary[] = { {1, "foo"}, {2, "bar"}, {3, "foobar"} };

  // returning object by calling the constructor
  std::vector<int> foo(bool arg)
  {
    if (!arg)
      return std::vector<int>();  // -> return { };
  
    std::vector<int> results;
    // <fill-in results...>
    return results;
  }


But I think they have a limited usefulness and I don't want to add this to my
proposal.

And I agree, I don't think it's that common to initialize a vector in such a
way. Maybe to initialize some static containers using a factory function
(see: http://stackoverflow.com/questions/3701903/initialisation-of-static-vector).
In this situation it would be good to get rid of the factory function and
initialize the vector directly, but such a transform seems to add a lot of
complexity.

>>    - tr1 replacements. Doing everything might not be possible but at
>>      least some would be useful such as: unordered_map, smart
>>      pointers, function<> & bind(), tuple.
>
> This one in particular is high priority. I think pretty much everything in
> TR1 except the extra math functions is in C++11.
>

One thing that worries me about this task is that, to be useful, it requires
implementing all the changes from tr1. If we change an include by dropping
'tr1/', we have to support the transformation of everything that header
provides. Maybe it's not risky at all to drop 'tr1/' from the include
directives and the 'tr1::' namespace qualifier, but I don't know yet. If I
understand correctly, C++11 differs from tr1 only by additions, mostly to take
advantage of the new language features.

Also, I think someone already talked about this: it would be interesting to
find some open source projects using tr1 to apply the transformation to. I took
a quick look and it doesn't seem impossible to find some.

>>    - fixing existing bugs (I think it's a good way to find my way
>>      around the project before starting the GSoC and to get
>>      acquainted with the code)
>
> I agree.
>
>>    - and (much) more...
>> 
>
> Another option could be looking at additions to STL for C++11 and
> making changes based on those additions. I mentioned emplace earlier.

I hadn't thought of looking at this but it's a good idea. Functions such as
emplace, as you pointed out, are a perfect example of a transform people might
want to get from cpp11-migrate.
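
As a minimal sketch of such a transform (Entry, with_push_back and
with_emplace_back are made-up names), a push_back of a freshly constructed
temporary could become an emplace_back that forwards the constructor arguments:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<int, std::string> Entry;

// Before: a temporary Entry is constructed, then copied/moved into the vector.
std::vector<Entry> with_push_back() {
  std::vector<Entry> v;
  v.push_back(Entry(1, "one"));
  return v;
}

// After: the arguments are forwarded and the Entry is built in place.
// Caveat: emplace_back uses direct initialization, so it can also call
// explicit constructors; the rewrite is only safe in this direction.
std::vector<Entry> with_emplace_back() {
  std::vector<Entry> v;
  v.emplace_back(1, "one");
  return v;
}
```

Both functions produce the same container; only the way the element gets
constructed differs.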

> Another option could be looking for nested calls to std::max or
> std::min to do an N-wise horizontal max/min op:
> std::max(std::max(a,b), std::max(c,d)) => std::max({a,b,c,d}); Again,
> not sure how useful this particular case is. Another suggestion was
> replacing use of C arrays with std::array. I haven't looked into the
> implications of this myself though. Yet another option is something
> done by the remove-cstr tool in clang-tools-extra. C++11 allows you to
> create std::fstreams with a std::string directly now instead of
> calling std::string::c_str().
>

For std::array I'm not sure, I think its usefulness is limited to a small
number of situations.

I like the idea of removing std::string::c_str() calls for std::fstream.
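
A minimal sketch of that rewrite, assuming the only change is the constructor
argument (can_open_cxx03 and can_open_cxx11 are made-up helper names). C++11
added constructor overloads taking a std::string to the file stream classes, so
the c_str() call becomes redundant:

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Before: C++03 file streams only accept a 'const char *'.
bool can_open_cxx03(const std::string &path) {
  std::ifstream in(path.c_str());
  return in.is_open();
}

// After: the C++11 overload takes the std::string directly.
bool can_open_cxx11(const std::string &path) {
  std::ifstream in(path);
  return in.is_open();
}
```

The same applies to the open() member functions, which gained std::string
overloads as well.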

Also:
- access to the vector's data can be changed from '&vec[0]'/'&vec.front()' to
  'vec.data()'. I haven't looked at whether something more has to be taken
  care of here.
- already mentioned in the tooling doc: replace member functions
  begin()/end() by their free function equivalent.
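
A minimal sketch of both rewrites. sum_raw stands in for some C-style API and
all the function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical C-style API taking a pointer and a length.
int sum_raw(const int *values, std::size_t count) {
  int total = 0;
  for (std::size_t i = 0; i < count; ++i)
    total += values[i];
  return total;
}

// Before: '&vec[0]' is undefined behavior on an empty vector, so careful
// code has to guard against it.
int sum_before(const std::vector<int> &vec) {
  return vec.empty() ? 0 : sum_raw(&vec[0], vec.size());
}

// After: data() may be called on an empty vector as long as the returned
// pointer is not dereferenced.
int sum_after(const std::vector<int> &vec) {
  return sum_raw(vec.data(), vec.size());
}

// Free-function begin()/end() instead of the member functions.
int sum_free_functions(const std::vector<int> &vec) {
  return std::accumulate(std::begin(vec), std::end(vec), 0);
}
```

One thing to look at for the data() transform is whether an existing empty()
guard like the one in sum_before should be kept or can be dropped.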


To summarize the list of apparently interesting ideas:
- Transform to replace 'auto_ptr' by 'unique_ptr'
- Transform to use free-function std::begin()/std::end()
- Integrating LibFormat
- Default transformations, profiles
- Transform to remove calls to std::string::c_str() when using std::fstream
- Transform to make use of existing move constructors
- Transform to make use of new emplace functions for STL containers
- [maybe] tr1 replacement (need to know more about the implications)
- [maybe] Command line option for output file / output directory
- [maybe] Make use of new std::vector::data() / std::string::data()?



