[cfe-dev] GSoC project ideas

Guillaume Papin guillaume.papin at epitech.eu
Tue Apr 30 10:22:23 PDT 2013


"Vane, Edwin" <edwin.vane at intel.com> writes:

>> -----Original Message-----
>> From: Guillaume Papin [mailto:guillaume.papin at epitech.eu]
>> Sent: Monday, April 29, 2013 5:25 PM
>> To: Vane, Edwin
>> Cc: cfe-dev at cs.uiuc.edu
>> Subject: Re: [cfe-dev] GSoC project ideas
>> 
>> I added some comments and wrote a summary of the new plan at the end of the
>> mail.
>> 
>> "Vane, Edwin" <edwin.vane at intel.com> writes:
>> 
>> > Comments below.
>> >
>> >> -----Original Message-----
>> >> From: Guillaume Papin [mailto:guillaume.papin at epitech.eu]
>> >> Sent: Sunday, April 28, 2013 9:13 PM
>> >> To: Vane, Edwin
>> >> Cc: cfe-dev at cs.uiuc.edu
>> >> Subject: Re: [cfe-dev] GSoC project ideas
>> >>
>> >> I'm working on the proposal and would like to have your feedback
>> >> about the following plan regarding cpp11-migrate.
>> >>
>> >>
>> >>                                 Tasks
>> >>                                 =====
>> >>
>> >> Table of Contents
>> >> =================
>> >> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> >> 2 Transform for delegating constructors
>> >> 3 Transform for non-static data member initializers
>> >> 4 Add support for interactive actions
>> >> 5 Default transformation profile
>> >> 6 Integrating LibFormat
>> >> 7 Transform to make use of existing move constructors
>> >> 8 Generate a diff of the changes
>> >> 9 Other incomplete ideas
>> >>
>> >>
>> >> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
>> >> ==================================================
>> >>
>> >>    Seems like a good transform to start with.
>> >>
>> > I agree. It's not completely trivial due to semantic differences
>> > between auto_ptr and unique_ptr (e.g. no destructive copy in
>> > unique_ptr) but should be a good first big project.
>> >
>> 
>> I had this in mind (non-triviality) as you mentioned it in an earlier mail.
>> 
>> >> 2 Transform for delegating constructors
>> >> ========================================
>> >>
>> >>    A transform that can convert code such as:
>> >>
>> >>   struct A
>> >>   {
>> >>     int x;
>> >>
>> >>     A() : x(0) { }
>> >>     A(int _x) : x(_x) { }
>> >>   };
>> >>
>> >>
>> >>    Into:
>> >>
>> >>   struct A
>> >>   {
>> >>     int x;
>> >>
>> >>     A() : A(0) { }                // now use delegation
>> >>     A(int _x) : x(_x) { }
>> >>   };
>> >>
>> >>
>> >>    This is a really trivial case here but I expect this transform to
>> >>    be non-trivial to implement.
>> >>
>> >
>> > A test for determining if the functionality of one constructor is
>> > completely subsumed by another would be really difficult to do. I'm
>> > not sure the benefit of a few less lines of code and some improved
>> > maintainability is really worth it. There is the common workaround of
>> > having constructors call init() functions that might be easier to
>> > handle but still, I think there are more useful things to focus on
>> > first.
>> >
>> 
>> Okay, I will remove this from the list then.
>> 
>> I was considering handling only constructors with empty bodies (at least for the
>> 'delegated' one) and only simple expressions in the initializers (such as
>> parameters, literals, ...). But it was mostly for aesthetic reasons and some
>> other transforms might be more beneficial (tr1?).
>> 
>> >> 3 Transform for non-static data member initializers
>> >> ====================================================
>> >>
>> >>    When one or more constructors initialize a member variable with
>> >>    a value independent of the constructor arguments, the
>> >>    initialization can be placed in-class.
>> >>
>> >>    This might be beneficial when multiple constructors are duplicating
>> >>    member initialization.
>> >>
>> >>    Note that this transform might easily lead to conflicts with the
>> >>    previous transform (delegating constructors).
>> >>
>> >
>> > Also questionable implementation/benefit ratio. You'd have to ensure
>> > every member variable is initialized the same way by every
>> > constructor. If you detect such a case, that would mean removing all
>> > the existing initializations and adding the in-class initialization.
>> > All that's left is to hope the user didn't mind making some vars
>> > initialized by constructors and some by the in-class initializers.
>> >
>> 
>> I totally agree. I will remove it from the list.
>> 
>> >> 4 Add support for interactive actions
>> >> ======================================
>> >>
>> >>    Some actions might need user interaction.
>> >>
>> >>    Example (maybe not the best one):
>> >>    If some replacement code needs to introduce a new variable and
>> >>    the default identifier is already taken, then we might want to
>> >>    prompt the user for an alternative name.
>> >>
>> >>    Or simply to ask for confirmation before a risky replacement.
>> >>
>> >
>> > Definitely something we'd like to add to the migrator but requires
>> > some design first. User interactivity should be implemented in such a
>> > way that the actual user interface doesn't matter. That way one could
>> > write a plugin for an editor/IDE or just have a simple command-line
>> > interface. This implies some sort of library interface for
>> > cpp11-migrate and cpp11-migrate itself then turns into a library.
>> > LibFormat and clang-format have the same relationship. I'm not sure if
>> > this much design work is suitable to a GSoC project.
>> >
>> 
>> I see. Actually this idea was very vague. I will remove it from the list as I don't
>> think I'm well suited (yet?) to start designing such a library.
>> 
>> >> 5 Default transformation profile
>> >> =================================
>> >>
>> >>    Apply a list of transformations by default and allow different
>> >>    profiles. By profile I'm talking about an option such as:
>> >>
>> >>      cpp11-migrate -target-profile=[clang-3.2|gcc-4.7|...] ...
>> >>
>> >>
>> >>    This option will apply to the input files all known safe
>> >>    (low-risk/zero-risk) transformations that are supported by the
>> >>    given target.
>> >>
>> >>    This could allow incremental migration toward C++11. Let's say the
>> >>    project has to support Clang 3.1 at first and later on the
>> >>    minimum version switches to 3.2; they can then re-run the tool
>> >>    with the new profile.
>> >>
>> >
>> > This is kinda cool. It's certainly not much work right now since there
>> > are only a handful of transforms. It'd be a slightly nicer way than
>> > just saying --all-transforms (if such an option existed) especially
>> > for people out there migrating code that's tied to a particular
>> > compiler version.
>> >
>> > Remembering the discussion about C++11 on llvm-dev a while back, maybe
>> > you could even specify a list of compilers to this flag and the common
>> > subset of supported features is applied :)
>> >
>> 
>> I actually had this in mind as well (but maybe an unconscious memory from the
>> discussion on llvm-dev?).
>> 
>> >> 6 Integrating LibFormat
>> >> ========================
>> >>
>> >>    In order to correctly format inserted code.
>> >>
>> >
>> > Would definitely be nice. The transforms don't do too much to mangle
>> > code right now but any that use the TypePrinter to print out types
>> > will cause the 'const' to go on the wrong side of the type specifier
>> > according to most styles. (i.e. const MyType *A => MyType const *A). I
>> > don't think LibFormat handles const locations though yet, probably for
>> > the same reason the transforms are limited in dealing with const
>> > qualifiers currently: clang doesn't provide enough TypeLoc info.
>> >
>> 
>> Good.
>> 
>> >> 7 Transform to make use of existing move constructors
>> >> ======================================================
>> >>
>> >>    With move semantics added to the language and the standard library
>> >>    being updated accordingly (move constructors added to many types),
>> >>    it is now interesting to take an argument by value and then move
>> >>    it (as opposed to taking it by 'const &' and then copying it).
>> >>
>> >
>> > Could be useful. Also in this category would be use of
>> > stl_container::emplace() functions. You'll have to be very, very
>> > careful about semantics though.
>> >
>> 
>> Well, I guess this idea will be a good fit for the second half of the GSoC.
>> 
>> >> 8 Generate a diff of the changes
>> >> =================================
>> >>
>> >>    Add an option to print a diff of the modifications against the
>> >>    original source file.
>> >>
>> >
>> > Could be useful as a kind of 'dry-run' mode where changes are not
>> > actually made but one could find out how many and what sort of
>> > changes would be made.
>> >
>> 
>> I will remove this one from the list. It has been pointed out that SCM tools
>> already provide such functionality quite well. I think for most projects using
>> cpp11-migrate they will already be under source control management.
>> 
>> I was thinking about users that are curious about the tool (or C++11) who might
>> want to try cpp11-migrate on a file non-destructively. But an option for the
>> output file or directory would be easier to implement and as useful. But then,
>> what if an included file is modified? Is it necessary to reproduce the source tree
>> structure?
>> 
>
> These questions you ask indicate why it's just more complex for
> cpp11-migrate to handle this sort of thing. The easiest option would
> be to run the migrator on your source, use SCM to see the diff and
> then use SCM to undo the changes: easy non-destructive investigation.
> Without SCM you could just copy your code-base and do a directory
> diff. Also easy.
>
>> >> 9 Other incomplete ideas
>> >> =========================
>> >>
>> >>    If the workload of the previous ideas is not sufficient for the
>> >>    GSoC I'm confident there is more work to do.
>> >>
>> >>    - initializer_list and uniform initialization transforms (use
>> >>      cases not identified yet)
>> >
>> > Someone once suggested to me looking for:
>> >
>> > std::vector<int> A;
>> > A.push_back(a);
>> > A.push_back(b);
>> > ...
>> > A.push_back(z);
>> >
>> > And replacing with
>> >
>> > std::vector<int> A = {a,b,...,z};
>> >
>> > I'm not entirely sure this is worth the effort. That is, how often is
>> > a vector initialization done this way? I'm not aware of other use
>> > cases right now.
>> >
>> 
>> I was thinking about easier cases (more commonly used?) such as:
>> 
>>   struct A
>>   {
>>     A(int a, int b);
>> 
>>     int         a;
>>     const char *b;
>>   };
>> 
>>   A bar()
>>   {
>>     return A(1, "toto");          // -> return { 1, "toto" };
>>   }
>> 
>
> I actually kinda like this use of uniform initialization. Using braced
> init lists in return statements is really helpful. Granted, it's more
> helpful in new code that you're writing. I wouldn't be against adding
> this as a smallish project to add to your proposal if you liked.
>

I will add this (restricted?) case to the list. It can be a basis for future
work on using uniform initialization.
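
A minimal sketch of that restricted case, assuming the constructor is not
declared 'explicit' (a braced-init-list in a return statement performs
copy-list-initialization, so an explicit constructor would not compile). The
names empty_results() and check_braced_returns() are only illustrative and are
not part of cpp11-migrate.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct A {
  // Non-explicit constructor: required for 'return { ... };' to compile.
  A(int number, const char *text) : a(number), b(text) {}

  int a;
  const char *b;
};

// Before the transform: return A(1, "toto");
A bar() {
  return {1, "toto"};
}

// Before the transform: return std::vector<int>();
std::vector<int> empty_results() {
  return {};
}

// Small usage check: the braced form builds the same objects as before.
void check_braced_returns() {
  A value = bar();
  assert(value.a == 1);
  assert(std::string(value.b) == "toto");
  assert(empty_results().empty());
}
```

A conservative transform would probably skip any return whose constructor is
explicit, or whose braces could select an initializer_list constructor instead
of the one originally called.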

>> 
>>   // code such as:
>>   A ary[] = { A(1, "foo"), A(2, "bar"), A(3, "foobar") };
>>   // becomes:
>>   A ary[] = { {1, "foo"}, {2, "bar"}, {3, "foobar"} };
>> 
>>   // returning object by calling the constructor
>>   std::vector<int> foo(bool arg)
>>   {
>>     if (!arg)
>>       return std::vector<int>();  // -> return { };
>> 
>>     std::vector<int> results;
>>     // <fill-in results...>
>>     return results;
>>   }
>> 
>> 
>> But I think they have a limited usefulness and I don't want to add this to my
>> proposal.
>> 
>> And I agree, I don't think it's that common to initialize a vector in such a way.
>> Maybe to initialize some static containers using factory functions
>> (see:
>> http://stackoverflow.com/questions/3701903/initialisation-of-static-vector).
>> In this situation it would be good to get rid of the factory function and initialize
>> the vector directly, which seems to add a lot of complexity.
>> 
>> >>    - tr1 replacements. Doing everything might not be possible but at
>> >>      least some would be useful such as: unordered_map, smart
>> >>      pointers, function<> & bind(), tuple.
>> >
>> > This one in particular is high priority. I think pretty much
>> > everything in TR1 except the extra math functions is in C++11.
>> >
>> 
>> One thing I'm afraid of with this task is that, to be useful, it requires
>> implementing all the changes from tr1. If we change the include by dropping
>> 'tr1/', it means we should support the transformation of everything the
>> #include provides. Maybe it's not risky at all to drop 'tr1/' from the include
>> directives and the references to the namespace 'tr1::', but I don't know yet.
>> If I understand correctly, C++11 has some differences from tr1 but only
>> additions, mostly to benefit from the new language features.
>> 
>> Also, I think someone already talked about this: it will be interesting to find
>> some open source projects using tr1 to apply the transformation to. I took a
>> quick look, and it doesn't seem impossible to find some.
>> 
>
> Since Marshall suggested the transform I bet he has some TR1 code he'd
> like to transform. Perhaps he can point us at some open-source code to
> test on.
>
> The first part of implementing this transform would be to do an
> inventory of TR1 and research what made it into C++11 and what didn't
> and what changes, if any, were made to things that did make it in. I'd
> split this inventory into three lists: Stuff that appears exactly in
> C++11 as it does in TR1, stuff that didn't make it at all, and stuff
> that made it but with changes. The first list is the easiest to
> address: just drop 'tr1::' and modify the #includes to use the right
> STD header. The stuff that didn't make it is also pretty easy: don't
> change anything. The third list will just need to be a bunch of
> special cases hard-coded into the transform.
>
> I think this transform has high value and could be straightforward to
> get something useful working. It only requires a bit of research to
> start.
>

Okay, it really sounds like a valuable addition to the project. I will add it
to the list.
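
The three-list split described above could be sketched as a lookup table like
the one below. This is only a sketch under stated assumptions: Tr1Status,
tr1Table() and rewriteTr1Name() are hypothetical names, and the handful of
entries are examples that the real inventory would have to confirm (in
particular, whether each "Unchanged" entry really is source-compatible).

```cpp
#include <cassert>
#include <map>
#include <string>

// Which of the three inventory lists a TR1 name belongs to.
enum class Tr1Status { Unchanged, Changed, NotInCxx11 };

// Illustrative entries only; the real table must come from the inventory.
const std::map<std::string, Tr1Status> &tr1Table() {
  static const std::map<std::string, Tr1Status> Table = {
      {"shared_ptr", Tr1Status::Unchanged},
      {"unordered_map", Tr1Status::Unchanged},
      {"tuple", Tr1Status::Unchanged},
      // Renamed in C++11 (std::uniform_int_distribution): needs a special case.
      {"uniform_int", Tr1Status::Changed},
      // One of the extra math functions, which are not in C++11.
      {"cyl_bessel_j", Tr1Status::NotInCxx11},
  };
  return Table;
}

// Returns the new spelling for 'std::tr1::<name>'. Only names on the
// "Unchanged" list are rewritten; everything else (including names missing
// from the table) is left alone, which is the conservative choice.
std::string rewriteTr1Name(const std::string &name) {
  const std::string original = "std::tr1::" + name;
  auto it = tr1Table().find(name);
  if (it != tr1Table().end() && it->second == Tr1Status::Unchanged)
    return "std::" + name;
  return original;
}

// Small usage check covering one name from each list plus an unknown name.
void check_tr1_rewrite() {
  assert(rewriteTr1Name("shared_ptr") == "std::shared_ptr");
  assert(rewriteTr1Name("uniform_int") == "std::tr1::uniform_int");
  assert(rewriteTr1Name("cyl_bessel_j") == "std::tr1::cyl_bessel_j");
  assert(rewriteTr1Name("not_in_table") == "std::tr1::not_in_table");
}
```

Names on the "Changed" list are left untouched here; each of them would need
its own hard-coded handling in the transform.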

>> >>    - fixing existing bugs (I think it's a good way to get around the
>> >>      project before starting the GSoC to get acquainted with the
>> >>      code)
>> >
>> > I agree.
>> >
>> >>    - and (much) more...
>> >>
>> >
>> > Another option could be looking at additions to STL for C++11 and
>> > making changes based on those additions. I mentioned emplace earlier.
>> 
>> I hadn't thought of looking at this but it's a good idea. Functions such as
>> emplace, as you pointed out, are a perfect example of a transform people might
>> want to benefit from by using cpp11-migrate.
>> 
>> > Another option could be looking for nested calls to std::max or
>> > std::min to do an N-wise horizontal max/min op:
>> > std::max(std::max(a,b), std::max(c,d)) => std::max({a,b,c,d}); Again,
>> > not sure how useful this particular case is. Another suggestion was
>> > replacing use of C arrays with std::array. I haven't looked into the
>> > implications of this myself though. Yet another option is something
>> > done by the remove-cstr tool in clang-tools-extra. C++11 allows you to
>> > create std::fstreams with a std::string directly now instead of
>> > calling std::string::c_str().
>> >
>> 
>> For std::array I'm not sure, I think its usefulness is limited to a small number
>> of situations.
>> 
>> I like the idea of removing std::string::c_str() calls for std::fstream.
>> 
>> Also:
>> - access to vector data can be changed from '&vec[0]'/'&vec.front()' to
>>   'vec.data()'. I haven't looked at whether something more has to be taken
>>   care of here.
>> - already mentioned in the tooling doc: replace member functions
>>   begin()/end() by their free function equivalents.
>> 
>> 
>> To summarize the list of apparently interesting ideas:
>> - Transform to replace 'auto_ptr' by 'unique_ptr'
>> - Transform to use free-function std::begin()/std::end()
>> - Integrating LibFormat
>> - Default transformations, profiles
>> - Transform to remove calls to std::string::c_str() when using std::fstream
>> - Transform to make use of existing move constructors
>> - Transform to make use of new emplace functions for STL containers
>> - [maybe] tr1 replacement (need to know more about the implications)
>> - [maybe] Command line option for output file / output directory
>> - [maybe] Make use of new std::vector::data() / std::string::data()?
>
> Do you want our feedback on prioritizing this list?
>

Here is my list, in the order in which I would like to implement things (not
in order of importance). I haven't thought carefully yet about the time it
would take.

- Transform to replace 'auto_ptr' by 'unique_ptr' [*]
- Transform to use free-function std::begin()/std::end()
- Transform to use uniform-initialization on return by calling a constructor
- Transform to remove calls to std::string::c_str() when using std::fstream
- Integrating LibFormat [*]
- Default transformations, profiles [*]
- Transform to replace uses of tr1 [*]
- Transform to make use of existing move constructors [*]
- Transform to make use of new emplace functions for STL containers
- [maybe] Make use of new std::vector::data() / std::string::data()?

I marked with [*] the ones I consider the most important to have.
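
For reference, a rough sketch of what the output of several of these
transforms could look like, with the pre-C++11 form kept in comments. The
function and type names (take, sum, Named, firstOf, check_migrated_forms) are
made up for illustration; this is hand-written target code, not output from
cpp11-migrate.

```cpp
#include <cassert>
#include <iterator>
#include <memory>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// auto_ptr -> unique_ptr: the destructive copy must become an explicit move.
//   before: std::auto_ptr<int> take(std::auto_ptr<int> p) { return p; }
std::unique_ptr<int> take(std::unique_ptr<int> p) { return p; }

// Member begin()/end() -> free functions.
//   before: std::accumulate(v.begin(), v.end(), 0)
int sum(const std::vector<int> &v) {
  return std::accumulate(std::begin(v), std::end(v), 0);
}

// Take by 'const &' and copy -> take by value and move.
struct Named {
  // before: explicit Named(const std::string &n) : name(n) {}
  explicit Named(std::string n) : name(std::move(n)) {}
  std::string name;
};

// push_back(T(args)) -> emplace_back(args), and &v[0] -> v.data().
const Named *firstOf(std::vector<Named> &v) {
  v.emplace_back("first"); // before: v.push_back(Named("first"));
  return v.data();         // before: return &v[0];
}

// Small usage check for each rewritten form.
void check_migrated_forms() {
  std::unique_ptr<int> p(new int(42));
  std::unique_ptr<int> q = take(std::move(p)); // before: q = take(p);
  assert(!p);       // ownership has left 'p'
  assert(*q == 42);

  std::vector<int> numbers = {1, 2, 3};
  assert(sum(numbers) == 6);

  std::vector<Named> names;
  assert(firstOf(names)->name == "first");
}
```

The unique_ptr case shows why that transform is not purely textual: every
place where an auto_ptr was copied has to gain a std::move, otherwise the
result does not compile.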

Yes, any feedback is most welcome!

Thank you.
-- 
Guillaume Papin



