[cfe-dev] GSoC project ideas

Vane, Edwin edwin.vane at intel.com
Wed May 1 06:10:46 PDT 2013


Your list has a good mix of different sized projects. Some of the points are clearly more useful to the community than others, so I would recommend organizing your list that way. That said, starting with projects that are interesting to you is a good idea for keeping your motivation high :)

The only item I'm not sure of is using vector::data() and string::data(). I haven't really seen any compelling reason for converting existing &operator[i] calls. I'm fine to be convinced otherwise.
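
For reference, the difference between the two spellings can be sketched as follows. This is only an illustration: sum_ints and the two wrappers are hypothetical names, not part of any existing transform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C-style API taking a pointer and a length.
int sum_ints(const int *p, std::size_t n) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += p[i];
  return total;
}

// Pre-C++11 spelling: &v[0] is only valid for a non-empty vector,
// so the empty case has to be guarded explicitly.
int sum_old_style(const std::vector<int> &v) {
  return v.empty() ? 0 : sum_ints(&v[0], v.size());
}

// C++11 spelling: data() may be called on an empty vector; the returned
// pointer is simply never dereferenced when size() is 0.
int sum_new_style(const std::vector<int> &v) {
  return sum_ints(v.data(), v.size());
}

// Small self-check showing both spellings agree on a non-empty vector.
inline void check_data_spellings() {
  std::vector<int> v = {1, 2, 3};
  assert(sum_old_style(v) == sum_new_style(v));
}
```

Whether the empty-container case alone justifies a dedicated transform is still an open question.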

> -----Original Message-----
> From: Guillaume Papin [mailto:guillaume.papin at epitech.eu]
> Sent: Tuesday, April 30, 2013 1:22 PM
> To: Vane, Edwin
> Cc: cfe-dev at cs.uiuc.edu
> Subject: Re: [cfe-dev] GSoC project ideas
> 
> "Vane, Edwin" <edwin.vane at intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Guillaume Papin [mailto:guillaume.papin at epitech.eu]
> >> Sent: Monday, April 29, 2013 5:25 PM
> >> To: Vane, Edwin
> >> Cc: cfe-dev at cs.uiuc.edu
> >> Subject: Re: [cfe-dev] GSoC project ideas
> >>
> >> I added some comments and wrote a summary of the new plan at the end
> >> of the mail.
> >>
> >> "Vane, Edwin" <edwin.vane at intel.com> writes:
> >>
> >> > Comments below.
> >> >
> >> >> -----Original Message-----
> >> >> From: Guillaume Papin [mailto:guillaume.papin at epitech.eu]
> >> >> Sent: Sunday, April 28, 2013 9:13 PM
> >> >> To: Vane, Edwin
> >> >> Cc: cfe-dev at cs.uiuc.edu
> >> >> Subject: Re: [cfe-dev] GSoC project ideas
> >> >>
> >> >> I'm working on the proposal and would like to have your feedback
> >> >> about the following plan regarding cpp11-migrate.
> >> >>
> >> >>
> >> >>                                 Tasks
> >> >>                                 =====
> >> >>
> >> >> Table of Contents
> >> >> =================
> >> >> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
> >> >> 2 Transform for delegating constructors
> >> >> 3 Transform for non-static data member initializers
> >> >> 4 Add support for interactive actions
> >> >> 5 Default transformation profile
> >> >> 6 Integrating LibFormat
> >> >> 7 Transform to make use of existing move constructors
> >> >> 8 Generate a diff of the changes
> >> >> 9 Other incomplete ideas
> >> >>
> >> >>
> >> >> 1 Transform to replace 'auto_ptr' by 'unique_ptr'
> >> >> ==================================================
> >> >>
> >> >>    Seems like a good transform to start.
> >> >>
> >> > I agree. It's not completely trivial due to semantic differences
> >> > between auto_ptr and unique_ptr (e.g. no destructive copy in
> >> > unique_ptr) but should be a good first big project.
> >> >
> >>
> >> I had this in mind (non-triviality) as you mentioned it in an earlier mail.
> >>
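
To make the semantic difference mentioned above concrete, here is a minimal sketch (not taken from cpp11-migrate; transfer_ownership is an illustrative name). With auto_ptr a plain copy silently transferred ownership; with unique_ptr the same line does not compile, so a transform would presumably have to insert std::move at every such site.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Illustrative only: shows the explicit transfer that unique_ptr requires.
int transfer_ownership() {
  std::unique_ptr<int> a(new int(42));
  // std::unique_ptr<int> b = a;          // would not compile: copying is deleted
  std::unique_ptr<int> b = std::move(a);  // ownership transfer is spelled out
  assert(!a);                             // a no longer owns anything
  return *b;
}
```

The hard part for a transform is finding every place where the old code relied on the implicit transfer, for example when passing or returning an auto_ptr by value.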
> >> >> 2 Transform for delegating constructors
> >> >> ========================================
> >> >>
> >> >>    A transform that can convert code such as:
> >> >>
> >> >>   struct A
> >> >>   {
> >> >>     int x;
> >> >>
> >> >>     A() : x(0) { }
> >> >>     A(int _x) : x(_x) { }
> >> >>   };
> >> >>
> >> >>
> >> >>    Into:
> >> >>
> >> >>   struct A
> >> >>   {
> >> >>     int x;
> >> >>
> >> >>     A() : A(0) { }                // now use delegation
> >> >>     A(int _x) : x(_x) { }
> >> >>   };
> >> >>
> >> >>
> >> >>    This is a really trivial case here but I expect this transform to
> >> >>    be non-trivial to implement.
> >> >>
> >> >
> >> > A test for determining if the functionality of one constructor is
> >> > completely subsumed by another would be really difficult to do. I'm
> >> > not sure the benefit of a few less lines of code and some improved
> >> > maintainability is really worth it. There is the common workaround
> >> > of having constructors call init() functions that might be easier
> >> > to handle but still, I think there are more useful things to focus
> >> > on first.
> >> >
> >>
> >> Okay, I will remove this from the list then.
> >>
> >> I was considering handling only constructors with empty bodies (at
> >> least for the 'delegated' one) and only simple expressions in the
> >> initializers (such as parameters, literals, ...). But it was mostly
> >> for aesthetic reasons, and some other transforms might be more
> >> beneficial (tr1?).
> >>
> >> >> 3 Transform for non-static data member initializers
> >> >> ====================================================
> >> >>
> >> >>    When one or more constructors initialize a member variable with
> >> >>    a value independent of the constructor arguments, the
> >> >>    initialization can be placed in-class.
> >> >>
> >> >>    This might be beneficial when multiple constructors are duplicating
> >> >>    member initialization.
> >> >>
> >> >>    Note that this transform might easily lead to conflicts with the
> >> >>    previous transform (delegating constructors).
> >> >>
> >> >
> >> > Also questionable implementation/benefit ratio. You'd have to
> >> > ensure every member variable is initialized the same way by every
> >> > constructor. If you detect such a case, that would mean removing
> >> > all the existing initializations and adding the in-class initialization.
> >> > All that's left is to hope the user didn't mind making some vars
> >> > initialized by constructors and some by the in-class initializers.
> >> >
> >>
> >> I totally agree. I will remove it from the list.
> >>
> >> >> 4 Add support for interactive actions
> >> >> ======================================
> >> >>
> >> >>    Some actions might need user interaction.
> >> >>
> >> >>    Example (maybe not the best one):
> >> >>    If some replacement code needs to introduce a new variable and
> >> >>    the default identifier is already taken, then we might want to
> >> >>    prompt the user for an alternative name.
> >> >>
> >> >>    Or simply to ask for confirmation before a risky replacement.
> >> >>
> >> >
> >> > Definitely something we'd like to add to the migrator but requires
> >> > some design first. User interactivity should be implemented in such
> >> > a way that the actual user interface doesn't matter. That way one
> >> > could write a plugin for an editor/IDE or just have a simple
> >> > command-line interface. This implies some sort of library interface
> >> > for cpp11-migrate and cpp11-migrate itself then turns into a library.
> >> > LibFormat and clang-format have the same relationship. I'm not sure
> >> > if this much design work is suitable to a GSoC project.
> >> >
> >>
> >> I see. Actually this idea was very vague. I will remove it from the
> >> list as I don't think I'm well suited (yet?) to start designing such a library.
> >>
> >> >> 5 Default transformation profile
> >> >> =================================
> >> >>
> >> >>    Apply a list of transformations by default and allow different
> >> >>    profiles. By profile I'm talking about an option such as:
> >> >>
> >> >>      cpp11-migrate -target-profile=[clang-3.2|gcc-4.7|...] ...
> >> >>
> >> >>
> >> >>    This option will apply to the input files all known safe
> >> >>    (low-risk/zero-risk) transformations that are supported by the
> >> >>    given target.
> >> >>
> >> >>    This could allow incremental migration toward C++11. Let's say the
> >> >>    project has to support Clang 3.1 at first and later on the
> >> >>    minimum version switches to 3.2; they can then re-run the tool
> >> >>    with the new profile.
> >> >>
> >> >
> >> > This is kinda cool. It's certainly not much work right now since
> >> > there are only a handful of transforms. It'd be a slightly nicer
> >> > way than just saying --all-transforms (if such an option existed)
> >> > especially for people out there migrating code that's tied to a
> >> > particular compiler version.
> >> >
> >> > Remembering the discussion about C++11 on llvm-dev a while back,
> >> > maybe you could even specify a list of compilers to this flag and
> >> > the common subset of supported features is applied :)
> >> >
> >>
> >> I actually had this in mind as well (but maybe an unconscious memory
> >> from the discussion on llvm-dev?).
> >>
> >> >> 6 Integrating LibFormat
> >> >> ========================
> >> >>
> >> >>    In order to format correctly inserted code.
> >> >>
> >> >
> >> > Would definitely be nice. The transforms don't do too much to
> >> > mangle code right now but any that use the TypePrinter to print out
> >> > types will cause the 'const' to go on the wrong side of the type
> >> > specifier according to most styles (i.e. const MyType *A => MyType
> >> > const *A). I don't think LibFormat handles const locations yet
> >> > though, probably for the same reason the transforms are limited in
> >> > dealing with const qualifiers currently: clang doesn't provide enough
> >> > TypeLoc info.
> >> >
> >>
> >> Good.
> >>
> >> >> 7 Transform to make use of existing move constructors
> >> >> ======================================================
> >> >>
> >> >>    With move semantics added to the language and the standard library
> >> >>    being updated accordingly (move constructors added to many types),
> >> >>    it is now interesting to take an argument by value and then move
> >> >>    it (as opposed to taking it by 'const &' and then copying).
> >> >>
> >> >
> >> > Could be useful. Also in this category would be use of
> >> > stl_container::emplace() functions. You'll have to be very, very
> >> > careful about semantics though.
> >> >
> >>
> >> Well, I guess this idea will be a good fit for the second half of the GSoC.
> >>
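
The by-value-then-move idea above can be sketched like this. The struct names are made up for illustration, and the sketch assumes the parameter type has a cheap move constructor (std::string here).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Before: the argument is always copied into the member.
struct NamedOld {
  explicit NamedOld(const std::string &n) : name(n) {}
  std::string name;
};

// After: take by value, then move into the member. An lvalue argument
// costs one copy plus one move; an rvalue argument costs only moves.
struct NamedNew {
  explicit NamedNew(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Small self-check: the caller's lvalue is left untouched in both forms.
inline void check_by_value_move() {
  std::string s = "hello";
  NamedOld a(s);
  NamedNew b(s);
  assert(s == "hello");
  assert(a.name == b.name);
}
```

This is only a win when the type is cheap to move; for types without a move constructor the by-value form adds an extra copy, which is one of the semantic details a transform would need to check.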
> >> >> 8 Generate a diff of the changes
> >> >> =================================
> >> >>
> >> >>    Add an option to print a diff of the modifications against the
> >> >>    original source file.
> >> >>
> >> >
> >> > Could be useful as a kind of 'dry-run' mode where changes are not
> >> > actually made but one could find out how many and what sort of
> >> > changes would be made.
> >> >
> >>
> >> I will remove this one from the list. It has been pointed out that SCM
> >> tools already provide such functionality quite well. I think most
> >> projects using cpp11-migrate will already be under source control
> >> management.
> >>
> >> I was thinking about users that are curious about the tool (or C++11)
> >> who might want to try cpp11-migrate on a file non-destructively. But
> >> an option for the output file or directory would be easier to
> >> implement and as useful. But then, what if an included file is
> >> modified? Is it necessary to reproduce the source tree structure?
> >>
> >
> > These questions you ask indicate why it's just more complex for
> > cpp11-migrate to handle this sort of thing. The easiest option would
> > be to run the migrator on your source, use SCM to see the diff and
> > then use SCM to undo the changes: easy non-destructive investigation.
> > Without SCM you could just copy your code-base and do a directory
> > diff. Also easy.
> >
> >> >> 9 Other incomplete ideas
> >> >> =========================
> >> >>
> >> >>    If the workload of the previous ideas is not sufficient for the
> >> >>    GSoC I'm confident there is more work to do.
> >> >>
> >> >>    - initializer_list and uniform initialization transforms (use
> >> >>      cases not identified yet)
> >> >
> >> > Someone once suggested to me looking for:
> >> >
> >> > std::vector<int> A;
> >> > A.push_back(a);
> >> > A.push_back(b);
> >> > ...
> >> > A.push_back(z);
> >> >
> >> > And replacing with
> >> >
> >> > std::vector<int> A = {a,b,...,z};
> >> >
> >> > I'm not entirely sure this is worth the effort. That is, how often
> >> > is a vector initialization done this way? I'm not aware of other
> >> > use cases right now.
> >> >
> >>
> >> I was thinking about easier cases (more commonly used?) such as:
> >>
> >>   struct A
> >>   {
> >>     A(int a, int b);
> >>
> >>     int         a;
> >>     const char *b;
> >>   };
> >>
> >>   A bar()
> >>   {
> >>     return A(1, "toto");          // -> return { 1, "toto" };
> >>   }
> >>
> >
> > I actually kinda like this use of uniform initialization. Using braced
> > init lists in return statements is really helpful. Granted, it's more
> > helpful in new code that you're writing. I wouldn't be against adding
> > this as a smallish project to add to your proposal if you liked.
> >
> 
> I will add this (restricted?) case to the list. It can be a base for future
> work on using uniform initialization.
> 
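
As a self-contained illustration of the return-statement case discussed above (Pair and the two factory functions are hypothetical names chosen for the sketch):

```cpp
#include <cassert>
#include <string>

struct Pair {
  Pair(int a, const char *b) : a(a), b(b) {}
  int a;
  const char *b;
};

// Before: the return statement names the type again.
Pair make_explicit() { return Pair(1, "toto"); }

// After: the return value is copy-list-initialized from the braced list.
// This only compiles because the constructor is not marked `explicit`.
Pair make_braced() { return {1, "toto"}; }

// Small self-check: both forms build the same object.
inline void check_braced_return() {
  assert(make_explicit().a == make_braced().a);
  assert(std::string(make_explicit().b) == std::string(make_braced().b));
}
```

The `explicit` restriction suggests that a transform would need to inspect the selected constructor before rewriting the return statement.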
> >>
> >>   // code such as:
> >>   A ary[] = { A(1, "foo"), A(2, "bar"), A(3, "foobar") };
> >>   // becomes:
> >>   A ary[] = { {1, "foo"}, {2, "bar"}, {3, "foobar"} };
> >>
> >>   // returning object by calling the constructor
> >>   std::vector<int> foo(bool arg)
> >>   {
> >>     if (!arg)
> >>       return std::vector<int>();  // -> return { };
> >>
> >>     std::vector<int> results;
> >>     // <fill-in results...>
> >>     return results;
> >>   }
> >>
> >>
> >> But I think they have a limited usefulness and I don't want to add
> >> this to my proposal.
> >>
> >> And I agree, I don't think it's that common to initialize a vector in such a way.
> >> Maybe to initialize some static containers using a factory function
> >> (see:
> >> http://stackoverflow.com/questions/3701903/initialisation-of-static-vector).
> >> In this situation it would be good to get rid of the factory function
> >> and initialize the vector directly, which seems to add a lot of complexity.
> >>
> >> >>    - tr1 replacements. Doing everything might not be possible but at
> >> >>      least some would be useful such as: unordered_map, smart
> >> >>      pointers, function<> & bind(), tuple.
> >> >
> >> > This one in particular is high priority. I think pretty much
> >> > everything in TR1 except the extra math functions is in C++11.
> >> >
> >>
> >> One thing I'm afraid of with this task is that, to be useful, it requires
> >> implementing all the changes from tr1. If we change the include by
> >> dropping 'tr1/', it means we should support the transformation of
> >> everything the #include has. Maybe it's not risky at all to drop 'tr1/'
> >> in the include directives and the references to the namespace 'tr1::',
> >> but I don't know yet. If I understand correctly, C++11 has some
> >> differences with tr1 but only additions, mostly to benefit from the
> >> new language features.
> >>
> >> Also, I think someone already talked about this: it will be
> >> interesting to find some open source projects using tr1 to apply the
> >> transformation to. I took a quick look; it doesn't seem impossible to find some.
> >>
> >
> > Since Marshall suggested the transform I bet he has some TR1 code he'd
> > like to transform. Perhaps he can point us at some open-source code to
> > test on.
> >
> > The first part of implementing this transform would be to do an
> > inventory of TR1 and research what made it into C++11 and what didn't
> > and what changes, if any, were made to things that did make it in. I'd
> > split this inventory into three lists: Stuff that appears exactly in
> > C++11 as it does in TR1, stuff that didn't make it at all, and stuff
> > that made it but with changes. The first list is the easiest to
> > address: just drop 'tr1::' and modify the #includes to use the right
> > STD header. The stuff that didn't make it is also pretty easy: don't
> > change anything. The third list will just need to be a bunch of
> > special cases hard-coded into the transform.
> >
> > I think this transform has high value and could be straightforward to
> > get something useful working. It only requires a bit of research to start.
> >
> 
> Okay, it really sounds like a valuable addition to the project. I will add it
> to the list.
> 
> >> >>    - fixing existing bugs (I think it's a good way to find my way
> >> >>      around the project before starting the GSoC and to get
> >> >>      acquainted with the code)
> >> >
> >> > I agree.
> >> >
> >> >>    - and (much) more...
> >> >>
> >> >
> >> > Another option could be looking at additions to STL for C++11 and
> >> > making changes based on those additions. I mentioned emplace earlier.
> >>
> >> I hadn't thought of looking at this but it's a good idea. Functions
> >> such as emplace, as you pointed out, are a perfect example of a transform
> >> people might want to benefit from by using cpp11-migrate.
> >>
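
A rough sketch of what an emplace transform might rewrite (build_pairs is an illustrative name, and the container and element types are arbitrary choices for the example):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

std::vector<std::pair<int, std::string>> build_pairs() {
  std::vector<std::pair<int, std::string>> v;
  // Before: a temporary pair is built and then moved into the vector.
  v.push_back(std::make_pair(1, std::string("one")));
  // After: the pair is constructed in place from the forwarded arguments.
  v.emplace_back(2, "two");
  return v;
}

// Small self-check: both calls produce equivalent elements.
inline void check_emplace() {
  std::vector<std::pair<int, std::string>> v = build_pairs();
  assert(v.size() == 2);
  assert(v[1].second == "two");
}
```

One semantic trap is that emplace_back performs direct initialization and can therefore call `explicit` constructors that push_back would reject, so the two forms are not always interchangeable.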
> >> > Another option could be looking for nested calls to std::max or
> >> > std::min to do an N-wise horizontal max/min op:
> >> > std::max(std::max(a,b), std::max(c,d)) => std::max({a,b,c,d});
> >> > Again, not sure how useful this particular case is. Another
> >> > suggestion was replacing use of C arrays with std::array. I haven't
> >> > looked into the implications of this myself though. Yet another
> >> > option is something done by the remove-cstr tool in
> >> > clang-tools-extra. C++11 allows you to create std::fstreams with a
> >> > std::string directly now instead of calling std::string::c_str().
> >> >
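
The nested max case above, written out as a compilable sketch (the function names are made up for illustration):

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Before: pairwise nesting of the two-argument overload.
int max_nested(int a, int b, int c, int d) {
  return std::max(std::max(a, b), std::max(c, d));
}

// After: the C++11 initializer_list overload takes all values at once.
int max_flat(int a, int b, int c, int d) {
  return std::max({a, b, c, d});
}

// Small self-check: both forms agree.
inline void check_max_forms() {
  assert(max_nested(3, 9, 4, 1) == max_flat(3, 9, 4, 1));
}
```

Note that the initializer_list overload requires all arguments to have the same type and copies them into the list, so a transform would probably want to restrict itself to cheaply copyable types.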
> >>
> >> For std::array I'm not sure, I think its usefulness is limited to
> >> a small number of situations.
> >>
> >> I like the idea of removing std::string::c_str() calls for std::fstream.
> >>
> >> Also:
> >> - access to vector data can be changed from '&vec[0]'/'&vec.front()' to
> >>   'vec.data()'. I haven't checked whether anything more has to be taken
> >>   care of here.
> >> - already mentioned in the tooling doc: replace member functions
> >>   begin()/end() by their free function equivalent.
> >>
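
For the begin()/end() item, a small sketch of why the free functions can be attractive (sum_range is a hypothetical helper): the same generic code then works for both containers and built-in arrays.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Works for any range supported by std::begin/std::end, including
// built-in arrays, which have no begin()/end() member functions.
template <typename Range>
int sum_range(const Range &r) {
  int total = 0;
  for (auto it = std::begin(r); it != std::end(r); ++it)
    total += *it;
  return total;
}

// Small self-check: a vector and an array with the same contents agree.
inline void check_free_begin_end() {
  std::vector<int> v = {1, 2, 3};
  int arr[] = {1, 2, 3};
  assert(sum_range(v) == sum_range(arr));
}
```

In non-generic code the member and free forms behave the same, so the benefit there is mostly stylistic.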
> >>
> >> To summarize the list of apparently interesting ideas:
> >> - Transform to replace 'auto_ptr' by 'unique_ptr'
> >> - Transform to use free-function std::begin()/std::end()
> >> - Integrating LibFormat
> >> - Default transformations, profiles
> >> - Transform to remove call to std::string::c_str() when using
> >>   std::fstream
> >> - Transform to make use of existing move constructors
> >> - Transform to make use of new emplace functions for STL containers
> >> - [maybe] tr1 replacement (need to know more about the implications)
> >> - [maybe] Command line option for output file / output directory
> >> - [maybe] Make use of new std::vector::data() / std::string::data()?
> >
> > Do you want our feedback on prioritizing this list?
> >
> 
> Here is my list in the order I would like to implement things (not in order
> of importance). I haven't thought carefully yet about the time it would take.
> 
> - Transform to replace 'auto_ptr' by 'unique_ptr' [*]
> - Transform to use free-function std::begin()/std::end()
> - Transform to use uniform-initialization on return by calling a constructor
> - Transform to remove call to std::string::c_str() when using std::fstream
> - Integrating LibFormat [*]
> - Default transformations, profiles [*]
> - Transform to replace uses of tr1 [*]
> - Transform to make use of existing move constructors [*]
> - Transform to make use of new emplace functions for STL containers
> - [maybe] Make use of new std::vector::data() / std::string::data()?
> 
> I marked with [*] the ones I consider the most important to have.
> 
> Yes, any feedback is most welcome!
> 
> Thank you.
> --
> Guillaume Papin
