[PATCH] D67867: [libc] Add few docs and implementation of strcpy and strcat.

Siva Chandra via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 27 13:48:45 PDT 2019


sivachandra added inline comments.


================
Comment at: libc/docs/header_generation.md:9
+problems in llvm-libc, we use a header generation mechanism. The mechanism is
+driven by a *header configuration language*.
+
----------------
theraven wrote:
> I'm not sure I understand how this will work.  For existing code, it's common to have to do things like `#define` or `#undef` macros like `_GNU_SOURCE` for a single compilation unit before including libc headers.  There are some annoying differences in philosophy between the header exports: most libc implementations export all interfaces but provide a mechanism for hiding non-standard (or later-standard) ones from portable code, glibc exports only standard functions (though I don't recall which set of standards) and requires feature macros to expose others.  Code that works with glibc and other libc implementations ends up jumping through hoops to support both models.  Are we adding a third mechanism?  Does this work for projects that want to use only standard interfaces in most compilation units but some non-standard extensions in their platform abstraction layer?
This is kind of related to the other question you have asked below. I have tried to address the two questions together below.


================
Comment at: libc/docs/header_generation.md:21
+2. Replace the line on which a command occurs with some other text as directed
+by the command. The replacment text can span multiple lines.
+
----------------
theraven wrote:
> This sounds like you will end up with only one set of headers per configuration, so you lose the ability to have different projects using the same generated headers but enforcing different sets of standards compliance in their use of the interfaces.
Yes, that is the general direction in which this is going. We are making the headers for a configuration much simpler to navigate at the cost of having multiple sets of headers. In this day and age, I do not think forcing multiple sets of header files is a bad thing. Note also that users' build systems already have the knowledge and capability to handle multiple configurations. Hence, we are not making the build systems any more complicated.

This is not what traditional libcs have done. So, yes we are introducing a "third mechanism". At the same time, one can also argue that we are doing away with such mechanisms as we require that each configuration have its own set of header files.


================
Comment at: libc/docs/implementation_standard.md:79
+a post build step. For example, for the `round` function, one can use `objcopy`
+to add an alias symbol as follows:
+
----------------
theraven wrote:
> There are a few things that are unclear to me in this description:
> 
> 1. How do we express the standards to which an entrypoint conforms?  For example, a function defined in C11 or POSIX2008?
> 2. How do we differentiate between things that we want to be preemptible versus things that we don't?  If we want to call the preemptible version of a symbol in other libc code, will we have the `::foo` symbol visible at library build time?
> 2. How are we exposing information for building subsets of the implementation that avoid dependencies on certain platform features?  For example, a CloudABI-compatible mode that does not provide (or consume) any functions that touch the global namespace.
Answers as per numbers in the comment.

1. We can do it in the public header file, or in the implementation .cpp files. Or, at both the places. Did I understand this question correctly? Is this question related to or similar to #3 below? Like, are you asking as to how we will add a new function without breaking the old standard? If yes, then the %%include mechanism is present to accommodate such scenarios: we start with a baseline standard and %%include new standards until they become baseline.
2. I frankly do not have a good answer and would prefer someone who cares about this use case to contribute. May be a real world example can help me think about this more clearly.
3. For the header files, we have the %%include_file mechanism. For the library files, we pick and choose to compose a suitable library target. For example, like the one in lib/CMakeLists.txt of this patch.

At some level, my answers are only guessing about how things would evolve. So, I wouldn't be surprised if my answers here aren't valid or relevant even in say 3 months from now.


================
Comment at: libc/include/ctype.h:14
+extern "C" {
+#endif
+
----------------
theraven wrote:
> BSD libcs include a `cdefs.h` that provides macros such as `__BEGIN_DECLS` and `__END_DECLS` wrapping this kind of pattern, and `__restrict` for language keywords that should not be in headers in certain versions of the standard.  It looks as if `__support/common.h` is the equivalent - we should probably have an explicit rule that this header is included first in all libc headers and provide sensible helpers there.
All this is great, so I have incorporated them now.


================
Comment at: libc/include/math.h:20
+
+long double acosl(long double);
+
----------------
theraven wrote:
> The `long long` functions should be exposed only for C99 or later and a version of C++ that supports the `long long` type.  
We are C17 and higher already. Should we still have such conditions?


================
Comment at: libc/include/string.h:12
+
+#include <stddef.h> // For size_t
+
----------------
theraven wrote:
> Namespace pollution.  The standard expects `size_t` to be exposed by this header, but not the other types in `stddef.h`.  Software that relies on this pollution is non-portable (and will break on existing libc implementations that follow the standard).
Fixed.


================
Comment at: libc/include/string.h:18
+
+void *memcpy(void *, const void *, size_t);
+
----------------
theraven wrote:
> Missing `restrict` qualifiers on this and many other standard functions in this file.  The standard defines `memcpy` as:
> 
> ```
> void *memcpy(void * restrict s1,
> const void * restrict s2,
> size_t n);
> ```
> 
> I'm surprised that clang doesn't warn about the declaration of `memcpy` with incorrect types - it usually notices missing qualifiers.
I used a script which scanned the standard document to produce these headers and missed restrict. I guess clang did not find it because we do a C++ compilation of the implementations.


================
Comment at: libc/src/string/strcpy/strcpy_test.cpp:16
+  std::string abc = "abc";
+  char *dest = new char[4];
+
----------------
theraven wrote:
> Having the libc test suite depend on a working C++ runtime and standard library is likely to make unit testing the library difficult.  The implementation of `std::string` almost certainly depends on several `libc` functions and `new` and `delete` probably do as well.  This means that we can test the namespaced libc functions in an environment that already has a libc, but we can't easily extend these tests to run in a process (or other environment) that has only LLVM libc.
Using gtest already brings in the C++ runtime. Should that also be avoided?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67867/new/

https://reviews.llvm.org/D67867





More information about the llvm-commits mailing list