[cfe-dev] [RFC] Add custom built-in functions using a plugin

Rudy Pons via cfe-dev cfe-dev at lists.llvm.org
Tue Jul 12 09:54:18 PDT 2016


Hello,

I want to add some custom built-in functions to Clang (functions for which
I need some compiler support, for example "get the number of member
variables of a class"), and want to do it using a plugin (mainly for
iteration time and distribution reasons).
Clang's plugin system currently doesn't allow this: the only way to add a
built-in is to define a new token in TokenKinds.def and plug custom code into
several places in Clang.
So, I would like to add support for it.

I started working on it locally, and currently have a basic working flow.
The main steps in it are:
 - Adding custom token support
    - Add a new interface class, CustomTokenHandler. It will have a bunch
of virtual functions, which will be entry points from various points where
we handle tokens, and functions to expose the handled keyword. For now,
from my tests I think we need entry points in ParseTopLevelDecl and
ParseCastExpression. Others can be added as needed.
    - Add a new registry to register these handlers.
    - Have Preprocessor list all the plugins for custom tokens, and make an
association between handlers and an internal CustomTokenId, and add an
entry in the IdentifierTable to insert the new token.
    - Add a CustomTokenId field in IdentifierInfo (we have 29 bits left;
using 13 bits here would allow several thousand custom tokens and leave
16 bits free. It would be the same size as ObjCOrBuiltinID). This id
will be stored when the Preprocessor calls the identifier table, and be
used to retrieve the correct handler.
    - Add a new token type custom_token. Every custom token will have this
type.
    - At places where we handle tokens, we add a new case in the switch
statement for custom_token (this means there is no overhead when not using
a custom token). In this case, we retrieve the handler from the
Preprocessor, and call the relevant virtual function on it, instead of
doing the static treatment.
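
To make the token flow above concrete, here is a minimal standalone sketch.
It does not use real Clang APIs: MiniPreprocessor, Token, TokenKind, the
parseCastExpression free function and the "__member_count" keyword are all
hypothetical stand-ins, and only the names CustomTokenHandler, CustomTokenId
and custom_token come from the proposal. It only illustrates registration,
the identifier-table entry, and dispatch through the new switch case.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are the real Clang types.
enum class TokenKind { identifier, custom_token };

struct Token {
  TokenKind Kind = TokenKind::identifier;
  std::string Spelling;
  std::uint32_t CustomTokenId = 0; // only meaningful for custom_token
};

// Interface a plugin would implement (name from the proposal).
class CustomTokenHandler {
public:
  virtual ~CustomTokenHandler() = default;
  // Exposes the keyword this handler wants to claim.
  virtual std::string getKeyword() const = 0;
  // Entry point called from the parser; returns a string in this sketch.
  virtual std::string parseCastExpression(const Token &Tok) = 0;
};

// 13 bits of id space, as suggested for IdentifierInfo.
constexpr unsigned CustomTokenIdBits = 13;
constexpr std::uint32_t MaxCustomTokens = 1u << CustomTokenIdBits; // 8192

class MiniPreprocessor {
  std::vector<std::unique_ptr<CustomTokenHandler>> Handlers;
  std::unordered_map<std::string, std::uint32_t> IdentifierTable;

public:
  // Associates a handler with an internal id and records its keyword.
  std::uint32_t registerHandler(std::unique_ptr<CustomTokenHandler> H) {
    assert(Handlers.size() < MaxCustomTokens && "id must fit in 13 bits");
    std::uint32_t Id = static_cast<std::uint32_t>(Handlers.size());
    IdentifierTable[H->getKeyword()] = Id;
    Handlers.push_back(std::move(H));
    return Id;
  }

  // Every registered keyword lexes to the single custom_token kind.
  Token lex(const std::string &Spelling) const {
    Token Tok;
    Tok.Spelling = Spelling;
    auto It = IdentifierTable.find(Spelling);
    if (It != IdentifierTable.end()) {
      Tok.Kind = TokenKind::custom_token;
      Tok.CustomTokenId = It->second;
    }
    return Tok;
  }

  CustomTokenHandler &getHandler(std::uint32_t Id) const {
    return *Handlers[Id];
  }
};

// The parser gains one extra case; other token kinds are unaffected.
std::string parseCastExpression(const MiniPreprocessor &PP, const Token &Tok) {
  switch (Tok.Kind) {
  case TokenKind::custom_token:
    return PP.getHandler(Tok.CustomTokenId).parseCastExpression(Tok);
  case TokenKind::identifier:
    return "identifier:" + Tok.Spelling;
  }
  return {};
}

// Example plugin-side handler for a made-up keyword.
class MemberCountHandler : public CustomTokenHandler {
public:
  std::string getKeyword() const override { return "__member_count"; }
  std::string parseCastExpression(const Token &Tok) override {
    return "custom:" + Tok.Spelling;
  }
};
```

A caller would register a MemberCountHandler once, after which lexing
"__member_count" yields a custom_token that is routed to the handler, while
any other identifier goes through the existing path.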

With this, we can output a simple ExprResult as a result of our token
being parsed. However, this will not be enough in many situations, mostly
because of dependent types in templates. Expressions are re-evaluated by
TreeTransform once template types are resolved, so we need our own custom
expression class and a handler for this.

 - Adding custom stmt/expr support
    - We add CustomStmt/CustomExpr virtual classes. They're basically
Stmt/Expr classes, with an additional CustomId field, and meant to be
extended by the plugin developer for them to store additional information.
    - We add CustomStmtHandler/CustomExprHandler interfaces, and a new
registry for them. Similarly to CustomTokenHandler, they will have virtual
functions. For now, we only need a Transform function, for which the static
version is called from a templated TreeTransform function, so...
    - We introduce a new TransformPluginEntryPoint interface. This
interface is passed to the Transform function of the
CustomStmt/ExprHandler, and exposes several functions of the TreeTransform
which will probably be needed by the plugin. Mainly, these are:
TransformType, TransformExpr, TransformStmt. The implementation is only a
simple wrapper.
    - Have Sema list all the plugins for custom statements, and associate
an internal CustomStmtId with each of them.
    - CustomTokenHandlers may return a CustomStmt (or subclasses of them)
with the correct CustomStmtId, to have them handled by the corresponding
CustomStmtHandler.
    - The TreeTransform, when transforming CustomStmt/CustomExpr, will
retrieve the handler from Sema, create a TransformPluginEntryPoint, and
call the Transform function.
    - We add entry points for CustomStmt/CustomExpr where necessary. For
now the relevant places seem to be ASTReader/WriterStmt, StmtPrinter and
StmtProfile.
    - Still not 100% sure about this, but the internal CustomStmtId may be
a StringRef chosen by the plugin instead of a runtime numeric id. This would
give the CustomTokenHandler a straightforward way to know the id of the
CustomStmt it wants to create, and would make it possible to
serialize/deserialize the statements in ASTReader/WriterStmt, at the cost of
some overhead to look up the handler by name.
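
The statement/expression side can be sketched the same way. Again this is
a standalone illustration under my own assumptions, not real Clang code:
Expr here is a toy struct whose type is a plain string, and MiniSema,
MiniTreeTransform, EntryPointImpl and the "member_count" id are hypothetical.
Only CustomExpr, CustomExprHandler, TransformPluginEntryPoint and
TransformType are names from the proposal. The sketch uses the string id
variant discussed in the last bullet.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for an expression node; Type is "T" while still dependent.
struct Expr {
  virtual ~Expr() = default;
  std::string Type;
};

// Expr plus the id used to find the plugin handler.
struct CustomExpr : Expr {
  std::string CustomId;
};

// Non-template view of the transform that is handed to plugins.
class TransformPluginEntryPoint {
public:
  virtual ~TransformPluginEntryPoint() = default;
  virtual std::string TransformType(const std::string &Type) = 0;
};

// Interface a plugin implements to rebuild its node after substitution.
class CustomExprHandler {
public:
  virtual ~CustomExprHandler() = default;
  virtual std::unique_ptr<Expr> Transform(const CustomExpr &E,
                                          TransformPluginEntryPoint &EP) = 0;
};

// Owns the handlers, keyed by the plugin-chosen string id.
class MiniSema {
  std::map<std::string, std::unique_ptr<CustomExprHandler>> Handlers;

public:
  void registerHandler(std::string Id, std::unique_ptr<CustomExprHandler> H) {
    Handlers[std::move(Id)] = std::move(H);
  }
  CustomExprHandler *findHandler(const std::string &Id) const {
    auto It = Handlers.find(Id);
    return It == Handlers.end() ? nullptr : It->second.get();
  }
};

// Stand-in for TreeTransform: substitutes one template parameter.
class MiniTreeTransform {
  const MiniSema &S;
  std::string Param, Arg;

  // Simple wrapper exposing the transform through the plugin interface.
  class EntryPointImpl : public TransformPluginEntryPoint {
    MiniTreeTransform &TT;

  public:
    explicit EntryPointImpl(MiniTreeTransform &TT) : TT(TT) {}
    std::string TransformType(const std::string &Type) override {
      return TT.TransformType(Type);
    }
  };

public:
  MiniTreeTransform(const MiniSema &S, std::string Param, std::string Arg)
      : S(S), Param(std::move(Param)), Arg(std::move(Arg)) {}

  std::string TransformType(const std::string &Type) {
    return Type == Param ? Arg : Type;
  }

  // Retrieve the handler, build the entry point, delegate to the plugin.
  std::unique_ptr<Expr> TransformCustomExpr(const CustomExpr &E) {
    CustomExprHandler *H = S.findHandler(E.CustomId);
    assert(H && "no handler registered for this CustomId");
    EntryPointImpl EP(*this);
    return H->Transform(E, EP);
  }
};

// Example plugin-side handler: rebuilds the node with the resolved type.
class MemberCountExprHandler : public CustomExprHandler {
public:
  std::unique_ptr<Expr> Transform(const CustomExpr &E,
                                  TransformPluginEntryPoint &EP) override {
    auto R = std::make_unique<CustomExpr>();
    R->CustomId = E.CustomId;
    R->Type = EP.TransformType(E.Type);
    return R;
  }
};
```

With a handler registered under "member_count", transforming a CustomExpr
whose type is the parameter "T" produces a new node whose type is the
substituted argument; the map lookup by string is the extra cost mentioned
above.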

 All this should add no overhead when not using custom tokens. When using
them, it will add virtual calls for the custom tokens and statements only.

Do you:
 - Think adding support for custom built-in functions in plugins is a
reasonable objective?
 - Think my approach is viable?
 - Have any comment/advice?
 - Have a workaround to do this without my modifications?

If that's OK, I can start sending some small patches for review with the
details of the implementation.