[libc-commits] [PATCH] D85779: [libc] Add strtok_r implementation.

Siva Chandra via Phabricator via libc-commits libc-commits at lists.llvm.org
Wed Aug 12 22:26:26 PDT 2020

sivachandra added a comment.

LGTM but not yet accepting as I asked for some extension to the tests.

Comment at: libc/src/string/string_utils.h:40
+// terminating character is reached, returns a nullptr.
+inline char *string_token(char *src, const char *delimiter_string,
+                          char **saveptr) {

Comment at: libc/src/string/strtok_r.cpp:16
+char *LLVM_LIBC_ENTRYPOINT(strtok_r)(char *src, const char *delimiter_string,
+                                     char **saveptr) {
cgyurgyik wrote:
> cgyurgyik wrote:
> > abrachet wrote:
> > > cgyurgyik wrote:
> > > > cgyurgyik wrote:
> > > > > cgyurgyik wrote:
> > > > > > abrachet wrote:
> > > > > > > sivachandra wrote:
> > > > > > > > Would it make sense to implement `strtok` and `strtok_r` over a common function which is equivalent in functionality to `strtok_r`?
> > > > > > > Making the prototype for this function `restrict` but not the function itself doesn't allow for optimizations. `src` and `delimiter_string` should be marked `__restrict` here and I imagine in a couple of other string functions too
> > > > > > Yeah you're right, I haven't been using `__restrict` at all. I should've probably asked about this at an earlier point, wasn't sure if there was a reason to use it or not. Will fix this here, and other places in future revisions as well.
> > > > > I'm struggling to get a static inline function working for both a static char * and a char ** without running into lifetime issues with the `saveptr`. Was there something you had in mind? Maybe I'm missing something simple here.
> > > > One way I could do this is to have the utility function return a struct `{char *token, char *strtok_string}`, and then set the `strtok_string` to the respective value, `static char *` for strtok, and `dereferenced char**` for strtok_r. Then we can return the `struct.token`.
> > > `strtok` can just be
> > > ```lang=cpp
> > > char *LLVM_LIBC_ENTRYPOINT(strtok)(char *src, const char *delimiter_string) {
> > >   static char *strtok_str;
> > >   return __llvm_libc::strtok_r(src, delimiter_string, &strtok_str);
> > > }
> > > ```
> > > There is no lifetime issue.
> > Yep... that's the simple solution I avoided for much too long. I was trying to do the reverse. Will fix. Thanks abrachet :)
> Left a TODO for myself to address this in a separate revision, since this applies to other functions as well.
SGTM. And sorry I missed these in previous patches.

Comment at: libc/test/src/string/strtok_r_test.cpp:11
+#include "utils/UnitTest/Test.h"
+TEST(StrTokReentrantTest, NoTokenFound) {
While having these differently named tests is fine, I am mildly concerned with the fact that when I first read them, the tests seems incomplete. For example, most tests are making only a single call to `strtok_r`.  The results of subsequent calls to `strtok_r` are not tested. I suggest putting all of this in one test called `Various` with each of the tests you have in their own comment annotated scoped block. For example:

Test(StrTokRTest, Various) {
  { // Empty source string and delimiter string
    char empty[] = "";
    char *reserve = nullptr;
    ASSERT_STREQ(__llvm_libc::strtok_r(empty, "",  &reserve), nullptr);
    // Another call to ensure that a bad state was not written in to |reserve|.
    ASSERT_STREQ(__llvm_libc::strtok_r(empty, "",  &reserve), nullptr);
    ASSERT_STREQ(__llvm_libc::strtok_r(nullptr, "",  &reserve), nullptr);

That said, putting everything in one test is my personal preference. You can choose to the keep the different test classes as you have currently, but include follow up calls to `strtok_r` as suggested in the above example.

  rG LLVM Github Monorepo



More information about the libc-commits mailing list