[cfe-dev] Can you give me advice on the proper strategy for a Clang-using tool I'm writing?

Manuel Klimek klimek at google.com
Thu Apr 2 07:45:40 PDT 2015


On Thu, Apr 2, 2015 at 12:56 AM David Bakin <davidbak at gmail.com> wrote:

> Hi!
>
> I started working with Clang a couple of weeks ago with an aim to developing
> a small project that would be useful to me (perhaps to others as well) and
> to get involved with the clang project and community.  After a couple of
> weeks of experimentation I've reached the point where it would be helpful to
> ask for advice.
> (By way of background I have had commercial development experience with
> compiler development, which is somewhat dated but still relevant, but I have
> no experience with using or modifying either LLVM or Clang.)
>
> I'd like to briefly describe two versions of my project and then ask for
> your advice on the best way to proceed.
>
> I want to "break encapsulation" on C++ classes to better enable unit testing
> of legacy code (or just plain poorly written code) that can't be refactored,
> for which developing better tests is a higher priority (given the
> constraints I'm subject to) than changing the design.
>
> I have two approaches in mind:
>
> The first (which I've tested by hand) is to build a C++ source-to-source
> processor that given a source file and some class names will parse the
> translation unit and then emit two files: first, a C++ header that will
> contain the named classes (but renamed) with all methods and fields in the
> same order as the original but where all access is public (and it will also
> have the appropriate #includes), and second, an assembly language file of
> thunks that will implement the methods of the new proxy class by jumping to
> the methods of the original class (and this file needs the mangled names of
> original and proxy class methods).  So that to write a unit test you create
> this proxy header and use it (by a cast) on your objects-under-test and by
> linking with the assembly thunks you can transparently access all
> public/protected/private methods and fields of your instance in hand.  (As
> I've said, this works when I've done it by hand.)
>

This sounds very close to #define private public before including the
headers. Any reason this does not work for you? (for protected things, you
can probably derive from the classes and provide accessors that way)
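
A minimal sketch of that idea, assuming a hypothetical legacy class
LegacyWidget whose definition stands in for the real header (the names
LegacyWidget, LegacyWidgetProbe and peekValue are made up for illustration).
Two caveats worth stating: redefining a keyword is formally not allowed in a
translation unit that includes standard library headers, so keep the macro
scoped tightly around the legacy include; and as far as I know the Microsoft
ABI encodes access levels in mangled member names, so on Windows this trick
can produce link errors for out-of-line members.

```cpp
// Include any standard or third-party headers BEFORE the macro is defined,
// so that only the legacy header is affected by it.

// --- begin: stand-in for `#include "legacy_widget.h"` (hypothetical) ---
#define private public  // test builds only; never in production code
class LegacyWidget {
public:
  explicit LegacyWidget(int v) : value_(v) {}

protected:
  int scaled(int factor) const { return value_ * factor; }

private:
  int value_;  // becomes publicly accessible because of the macro above
};
#undef private  // restore the keyword as soon as the header has been read
// --- end of stand-in header ---

// For protected members no macro is needed: derive from the class and
// re-export the members with using-declarations.
struct LegacyWidgetProbe : LegacyWidget {
  using LegacyWidget::LegacyWidget;  // inherit the constructor
  using LegacyWidget::scaled;        // protected in the base, public here
};

// Test helper that reads the formerly private field directly.
inline int peekValue(const LegacyWidget& w) { return w.value_; }
```

In a unit test you would construct a LegacyWidget (or a LegacyWidgetProbe)
as usual and then read value_ or call scaled() directly.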


> The second approach is to use Clang/LLVM as a C++ compiler (not just the
> front end of a tool) by inventing a new statement which is like a reverse
> friend declaration, with its own special keyword.  Placed in a method it
> will name a class or method that you want to "break encapsulation" of.  And
> it will act as if the class or method named has a friend declaration
> pointing back to the method with the break declaration statement.  With its
> own keyword that can easily be grepped for you can make sure this statement
> is used only in unit tests and not production code.  (In fact, it could be
> enabled as any other language extension only if a compiler switch is
> present, and so you could easily ensure that only unit test projects have
> that compiler switch.)
>
> Here are my questions about these approaches:
>
> 1.  What packaged Clang functionality do I need?
>
>     a. Can approach 1 be done with strictly libclang (using the AST and the
>        lexer to guide modification of the source and to identify methods
>        that need assembly thunks)?  Or do I need to step up to LibTooling +
>        LibASTMatchers (or LibAST)?  Or does it need a plugin?  What is your
>        recommendation?
>

I'd probably go with libtooling. You usually want libclang if you need to
ship a tool to customers and therefore need a stable interface for them to
work against. Libtooling gives you more power at the cost of keeping up with
upstream clang changes (the core interfaces don't change that wildly though;
mostly they change when we allow clang to use more parts of a new C++
standard). Plugins are more for when you want to do some extra checking as
part of every build.


>     b. Can approach 2 be done with LibTooling + LibASTMatchers + minimal
>        changes to clang so it accepts the new grammar with new AST nodes to
>        match?  My idea there is, having parsed and traversed my new "break
>        encapsulation" declarations, to go right to the definitions of the
>        targeted class and modify the AST in-place to have an actual friend
>        declaration pointing back, and then to finish the compilation of the
>        modified AST.
>

I think both approaches are overengineering the problem.
Often there are much simpler approaches; I recommend Feathers' "Working
Effectively with Legacy Code", which has many great ideas on how to
minimally change legacy systems to get access points for unit tests.
http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052
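
To give a flavor of what such a minimal change can look like, here is a
sketch of one technique from that book, "Subclass and Override Method". The
class ReportGenerator and its methods are hypothetical, invented only for
this illustration; the single change to the "legacy" class is that one
dependency call has been made protected and virtual.

```cpp
#include <string>

// Hypothetical legacy class. The only edit made for testability is that
// fetchUser() is now protected and virtual instead of private.
class ReportGenerator {
public:
  virtual ~ReportGenerator() = default;

  // Logic under test; it depends on fetchUser().
  std::string build() { return "Report for " + fetchUser(); }

protected:
  // Seam: production code still reaches the real dependency through here.
  virtual std::string fetchUser() { return queryDatabase(); }

private:
  // Stand-in for something expensive or unavailable in a unit test.
  std::string queryDatabase() { return "user-from-db"; }
};

// Test-only subclass that overrides the seam with a canned value.
class TestableReportGenerator : public ReportGenerator {
protected:
  std::string fetchUser() override { return "test-user"; }
};
```

A unit test then instantiates TestableReportGenerator and exercises build()
without touching the real dependency and without any access tricks.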


> 2.  How much of this work can be done on the Windows platform with Clang?
>     And can it be done with Visual Studio or do I need to use an alternate
>     native compiler for Windows?
>
>     I've become aware of restrictions of developing on the Windows platform.
>     Leave aside restrictions on what you can do with Clang/LLVM as a
>     compiler on Windows (e.g., at this time no exceptions or anything else
>     that requires compiler-rt) which would only affect my second approach.
>     Even so I've found things that make Windows/Visual Studio less than a
>     perfect development environment for Clang.
>
>     For example, I can't confirm my compiled Clang/LLVM is correct (using
>     the VS 12 Win64 platform) because even though it compiles without error
>     and the unit tests all pass I can't successfully run the "command line
>     tests" (http://clang.llvm.org/hacking.html#testingCommands) as I
>     reported here earlier
>     (http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-March/042170.html) - I
>     got some help from the community there but the thread petered out and
>     I've been unable to continue with them.  (Just for reference I've
>     attached to this email my last log of running the tests, with 122
>     unexpected failures which are all some kind of lock error I don't
>     understand.)
>
>     Anyway, if I continue with the Windows platform can I be successful (or
>     should I switch to Linux)?
>
> This email has been quite long, and I apologize, but I'd really appreciate
> your help.  I'd like to start my Clang/LLVM development with some chance of
> success without getting greatly frustrated by not knowing some basic things
> that everyone who is working in the code "just knows" from experience.  So
> thanks in advance!
> And I hope to contribute back to the Clang/LLVM community in the future ...
>
> -- David Bakin