[cfe-dev] Can you give me advice on the proper strategy for a Clang-using tool I'm writing?

Thu Apr 2 00:51:10 PDT 2015

Hi!

I started working with Clang a couple of weeks ago with the aim of
developing a small project that would be useful to me (and perhaps to
others as well) and of getting involved with the Clang project and
community.  After a couple of weeks of experimentation I've reached the
point where it would be helpful to ask for advice.  (By way of background,
I have commercial experience with compiler development, which is somewhat
dated but still relevant, but I have no experience with using or modifying
either LLVM or Clang.)

I'd like to briefly describe two versions of my project and then ask for
your advice on the best way to proceed.

I want to "break encapsulation" on C++ classes to better enable unit
testing of legacy code (or just plain poorly written code) that can't be
refactored, and for which developing better tests is a higher priority
(given the constraints I'm subject to) than changing the design.

I have two approaches in mind:

The first (which I've tested by hand) is to build a C++ source-to-source
processor that, given a source file and some class names, will parse the
translation unit and then emit two files.  The first is a C++ header that
contains the named classes (but renamed) with all methods and fields in the
same order as the original, but with all access public (it will also have
the appropriate #includes).  The second is an assembly language file of
thunks that implement the methods of the new proxy class by jumping to the
methods of the original class (this file needs the mangled names of both
the original and the proxy class methods).  To write a unit test, you
create this proxy header and use it (by a cast) on your objects-under-test;
by linking with the assembly thunks you can transparently access all
public/protected/private methods and fields of the instance in hand.  (As
I've said, this works when I've done it by hand.)

The second approach is to use Clang/LLVM as a C++ compiler (not just as the
front end of a tool) by inventing a new statement that is like a reverse
friend declaration, with its own special keyword.  Placed in a method, it
will name a class or method that you want to "break encapsulation" of, and
it will act as if the named class or method had a friend declaration
pointing back to the method containing the break declaration statement.
Because it has its own keyword that can easily be grepped for, you can make
sure this statement is used only in unit tests and not in production code.
(In fact, it could be enabled, like any other language extension, only if a
compiler switch is present, so you could easily ensure that only unit test
projects have that compiler switch.)

Here are my questions about these approaches:

1.  What packaged Clang functionality do I need?

    a. Can approach 1 be done strictly with libclang (using the AST and the
       lexer to guide modification of the source and to identify methods
       that need assembly thunks)?  Or do I need to step up to LibTooling +
       LibASTMatchers (or LibAST)?  Or does it need a plugin?  What is your
       recommendation?

    b. Can approach 2 be done with LibTooling + LibASTMatchers + minimal
       changes to clang so that it accepts the new grammar, with new AST
       nodes to match?  My idea there is, having parsed and traversed my
       new "break encapsulation" declarations, to go right to the
       definitions of the targeted class and modify the AST in-place to
       have an actual friend declaration pointing back, and then to finish
       the compilation of the modified AST.

2.  How much of this work can be done on the Windows platform with Clang?
    And can it be done with Visual Studio, or do I need to use an alternate
    native compiler for Windows?

    I've become aware of restrictions on developing on the Windows platform.
    Leave aside restrictions on what you can do with Clang/LLVM as a
    compiler on Windows (e.g., at this time no exceptions or anything else
    that requires compiler-rt), which would only affect my second approach.
    Even so, I've found things that make Windows/Visual Studio less than a
    perfect development environment for Clang.

    For example, I can't confirm that my compiled Clang/LLVM is correct
    (using the VS 12 Win64 platform): even though it compiles without error
    and the unit tests all pass, I can't successfully run the "command line
    tests" (http://clang.llvm.org/hacking.html#testingCommands), as I
    reported here earlier
    (http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-March/042170.html).  I
    got some help from the community there, but the thread petered out and
    I've been unable to continue with them.  (Just for reference, I've
    attached to this email my last log of running the tests, with 122
    unexpected failures, which are all some kind of lock error I don't
    understand.)

    Anyway, if I continue with the Windows platform can I be successful (or
    should I switch to Linux)?

This email has been quite long, and I apologize, but I'd really appreciate
your help.  I'd like to start my Clang/LLVM development with some chance of
success, without getting greatly frustrated by not knowing some basic
things that everyone who is working in the code "just knows" from
experience.  So thanks in advance!  And I hope to contribute back to the
Clang/LLVM community in the future ...

-- David Bakin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: command-line-tests.zip
Type: application/zip
Size: 26514 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150402/4b312673/attachment.zip>