[LLVMdev] GSOC - Use more StringRef in clang.

Kevin Cox kevincox at kevincox.ca
Mon Mar 17 10:22:28 PDT 2014


Thanks for the feedback David.

I have created a quick draft of my proposal and would appreciate any
feedback.


GSOC Proposal -- StringRef'ize APIs
===================================

Background
----------

LLVM provides a StringRef class that simply references a string (an
arbitrary byte buffer) without owning it.  Using llvm::StringRef, copies
can be avoided when a string passed into a function is only going to be
read.  While this class is used in Clang, there are still places where
std::string is used.  Replacing these with llvm::StringRef avoids
copies, improving performance.  Furthermore, there are places where a
const char* is used; in these situations llvm::StringRef can improve
safety and convenience through its already-written member functions and
its bounds checks when assertions are enabled.
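
To make the copy-avoidance argument concrete, here is a minimal sketch.
It uses C++17's std::string_view as a stand-in for llvm::StringRef so
that it compiles without the LLVM headers; the two are similar in being
non-owning (pointer, length) views, but this is not Clang code and the
function names (countSlashes, extension, charAt) are made up for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Takes a non-owning view: no allocation or copy happens, whether the
// caller holds a std::string, a string literal, or a slice of a buffer.
// With a `const std::string &` parameter, the last two cases would each
// construct a temporary std::string first.
std::size_t countSlashes(std::string_view Path) {
  std::size_t N = 0;
  for (char C : Path)
    if (C == '/')
      ++N;
  return N;
}

// Returns the text after the last '.', or an empty view if there is none.
// The result points into the caller's buffer; nothing is allocated.
std::string_view extension(std::string_view Path) {
  std::size_t Dot = Path.rfind('.');
  if (Dot == std::string_view::npos)
    return std::string_view();
  return Path.substr(Dot + 1);
}

// Imitates the kind of bounds check that is active when assertions are
// enabled. A raw const char* carries no length, so no such check is
// possible there.
char charAt(std::string_view S, std::size_t I) {
  assert(I < S.size() && "index out of range");
  return S[I];
}
```

Calling extension on a std::string yields a view whose data pointer lies
inside the original string's buffer, which is the property that makes the
substring free.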

Project Goals
-------------

The goal of this project is to replace these uses of const std::string
and const char* inside Clang with llvm::StringRef.  Furthermore,
std::string and char* parameters that are not marked const will also be
examined; where they are never actually modified, a StringRef will be
used as well.

This project entails changing both the headers and the implementation to
use llvm::StringRef, as well as updating the documentation.  Other goals
of the project include no API breakage (unless discussed specifically)
and maintaining good performance.  As this project is largely
performance focused, other performance improvements that become apparent
while changing the code will likely be explored as well.

Another important part of the project is allowing llvm::StringRefs to be
used in more places, rather than being converted back to std::string or
const char* at every boundary.  This is where the real benefits of
StringRefs are: they can be passed far down the call stack without a
copy.
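
A small sketch of what passing a reference down the call stack looks
like.  As a stand-in for llvm::StringRef it uses C++17's
std::string_view, and every function here (checkFirstToken,
checkIdentifier, isReservedName) is hypothetical, with a deliberately
simplified rule for reserved names.  The point is only that the outer
layer slices a larger buffer once and each inner layer receives the same
view; if any inner layer took a `const std::string &` instead, a
temporary string would have to be built at that boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Innermost layer: only reads the name.
// Simplified, hypothetical rule: names starting with "__" are reserved.
bool isReservedName(std::string_view Name) {
  return Name.size() >= 2 && Name[0] == '_' && Name[1] == '_';
}

// Middle layer: forwards the same view, no conversion needed.
bool checkIdentifier(std::string_view Name) {
  assert(Name.find(' ') == std::string_view::npos &&
         "expected a single token");
  return !Name.empty() && !isReservedName(Name);
}

// Outer layer: takes the first space-delimited token of a line and passes
// it down. substr() only adjusts a pointer and a length.
bool checkFirstToken(std::string_view Line) {
  std::size_t End = Line.find(' ');
  return checkIdentifier(Line.substr(0, End));
}
```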

Who am I
--------

Hello, I am Kevin Cox, a Canadian student at Carleton University.  By
this summer I will have third-year standing in Software Engineering.

I have been using Linux for over 6 years and compiling my C and C++ code
with clang for nearly that long.  I love working with low level code and
really enjoy squeezing all of the performance out of it.  I think that
LLVM and Clang are great projects and am glad to have an opportunity to
help out.

Contact
-------

Communication is vitally important to success, especially when working
with a new code base.  To facilitate quick communication I will be
reading email constantly and idling in IRC whenever I am working on my
project (and most likely more often than that).  I also have a good
quality microphone so voice and video chats are a viable option for live
communication.

Email: kevincox at kevincox.ca
XMPP:  kevincox at kevincox.ca
PGP:   E394 3366 624E 7449 B9B4 85AE C075 8A3B 34D5 2E74
IRC:   kevincox
Phone: <omitted>

In addition to regular communication I propose weekly or bi-weekly
meetings with my mentor to keep in touch and ensure the project is
moving forward.

Goals
-----

Google Summer of Code is a three month program.  Over the course of
three months I will be working to convert as many APIs as possible to
use llvm::StringRef.  While I have identified a number of APIs that can
be converted to use StringRefs, listing them all here would be a waste
of time and energy.  Instead, I have used some incredibly high-tech
methods to count the uses of std::string& and const char* in Clang and
hope to reduce those numbers.  Please note that not all of these matches
can be converted; based on a quick analysis, about 1/8 to 1/4 of the
functions can be.

% grep 'const\s*char\s*\\*' **/*.{h,cpp} | wc -l
3272
% grep 'std::string\s*&' **/*.{h,cpp} | wc -l
506

Throughout the summer I expect to significantly reduce these numbers.  I
am also going to create and maintain a document of APIs that can and
can't be converted (think of a tri-color collector) and work through
this document throughout the summer.  Ideally, by the end of the summer
all APIs that can be converted will have been, and call sites will have
been updated to take advantage of the new APIs.
