[Lldb-commits] [PATCH] D18530: Move some functions from ClangASTContext to ClangUtil

Wed Mar 30 12:58:34 PDT 2016

zturner added a comment.

In http://reviews.llvm.org/D18530#387328, @tberghammer wrote:

> I don't maintain any downstream branch to worry about merges but my personal opinion is moving large amount of code around can cause some issue in the future even for upstream. The 2 main issue I can think about:
>
> - "git log" and "git blame" will not display the original change for the code what is problematic when we try to track down why a piece of code have been added (especially for special cases)
> - As an active LLDB developer I have a pretty good idea about where a given piece of code lives. Moving the code around invalidates this knowledge for every developer in the project
>
>   So in general if we get some actual advantage from moving the code (e.g. cleaner API, better testing options, etc...) then I have no issue around it but if the only gain is the file size reduction and the removal of the cyclic dependencies then I think this causes more problem then the benefit of the change (same reason why we don't run clang-format over the full code base).
>
>   In this concrete case I don't see the benefit neither in the API nor for testing as all functions were accessible before in the same way and the tests can call the functions from the ClangASTContext instead of the ClangUtils with the same functionality.

In this case I only moved a few functions just to test the waters, but if you look through ClangASTContext.h there are probably 40-50 functions that could theory be moved under the same idea.  It's a couple thousand lines of code of jsut generic helper functions, which is not insignificant.

The corollary to your statement is that we can just keep adding everything clang related to this one file, which if so maybe it jumps from 10,000 lines to 20,000 lines.  File size is not normally a major concern, but at the same time I don't think we should just allow files and classes to grow unbounded in lines of code and number of member functions without being willing to think about if there should be a split.

But relevant code should be grouped together as much as possible.  Small interfaces are easier to understand than big interfaces.  Answering the question "is there a way to do X" is easier when you only have to look an interface containing 30 functions than when you have to look through an interface containing 200 functions.

As for "git log", "git blame", and knowing where a piece of code lives, there will always be times where you have to do some code archaeology to trace a line of code back to its origin.  It's usually only a few extra commands.  For example, you git blame `ClangUtil.cpp`, the revision of a line is the first commit of that file.  The diff (or the commit message) shows it was moved from `ClangASTContext.cpp`.  So you git blame the corresponding lines of `ClangASTContext.cpp` starting with x~1 where x is the revision shown in the first diff.  It doesn't really much extra effort at all, certainly not so much that the extra effort outweighs the benefits of maintaining healthy code.

Just my 2c, if nobody likes the idea then so be it, but I think it's easy to get stuck viewing this from the lens of someone who's already spent a great deal of effort understanding the code.  So I would encourage you to think about it from the viewpoint of a newcomer to the codebase trying to understand the code.

http://reviews.llvm.org/D18530