[libcxx-commits] [PATCH] D64818: [libcxx] Construct path using a instance of std::locale
Jean Guegant via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Mon Jul 22 12:55:04 PDT 2019
jguegant marked 8 inline comments as done.
jguegant added a comment.
Thanks a lot @Quuxplusone! I am aware that you cannot take any final decision on what gets in but these are really helpful comments here.
I will fix some of you remarks and upload a better diff (with proper context to get a better picture) afterwards.
Any way to gently attract the three reviewers here?
================
Comment at: libcxx/include/__locale:1367
+ const int __sz = 32;
+ _InternT __buf[__sz];
+ _InternT* __bn;
----------------
Quuxplusone wrote:
> Peanut gallery says: I'm not 100% sure of the rules about `const`. I observe that this line seems to be okay even in `-pedantic` mode, but intuitively I would expect `__buf` to be a VLA. IMHO it would be clearer to make `__sz` a `constexpr int` instead of just a `const int` — or even my generally preferred formulation,
>
> _InternT __buf[32];
> constexpr ptrdiff_t __sz = sizeof(__buf) / sizeof(__buf[0]);
This function is a dumb generalisation of the functions 1389, 1426 which had this pattern.
Correct me if I am wrong but this will work as C-arrays have such syntax:
```
D1 [ constant-expressionopt ] attribute-specifier-seqopt
```
Where:
```
The constant-expression shall be a converted constant expression of type std::size_t ([expr.const]).
```
Here const int should work thanks to a special rule for const-qualified integrals:
```
A variable is usable in constant expressions after its initializing declaration is encountered if it is a constexpr variable, or it is of reference type or of const-qualified integral or enumeration type, and its initializer is a constant initializer.
```
================
Comment at: libcxx/include/__locale:1374
+ __throw_runtime_error("locale not supported");
+ for (const _InternT* __p = __buf; __p < __bn; ++__p, ++__s)
+ *__s = (wchar_t)*__p;
----------------
Quuxplusone wrote:
> `__s` is of user-provided type. Depending on level of library-maintainer paranoia, you might want to make this `++__p, void(), ++__s`, or pull `++__s` down into the body of the for-loop (and add curly braces). The current formulation will call a user-defined `operator,(const char*, _OutputIterator)` if one exists. (And will perform ADL to find out if one exists, regardless.)
Good idea! Now that this function accept more than one type, it seems a reasonable to be paranoid.
================
Comment at: libcxx/include/filesystem:749
+template <class _InputIt,
+ class = typename enable_if<!is_constructible<const char*, _InputIt>::value>::type>
+inline _LIBCPP_INLINE_VISIBILITY
----------------
Quuxplusone wrote:
> `class = _EnableIf<!is_constructible<const char*, _InputIt>::value>`
> But also, how is that constructibility relevant? Does the Standard say something like "If the input iterator is explicitly convertible to `const char*` then do X, else do Y"?
>
> Furthermore, this codepath is going to do heap-allocation with `string __s(__b, __e)`. Can we avoid the heap-allocation somehow?
The idea here is to give the priority to the overload with `const char*` which effectively make both `char*` and `const char*`being routed by the first overload line 739. Obviously, my intent is not really clear here, I wonder how I could express that in a better way if I continue in that direction. Maybe having 3 overloads? const char*, char* and generic? A more descriptive trait name?
================
Comment at: libcxx/include/filesystem:753
+ const string __s(__b, __e);
+ return __widen_from_char_iter_pair(__s.data(), __s.data() + __s.size(), __loc);
+}
----------------
Quuxplusone wrote:
> I'm worried about the way you use overloading on the name `__widen_from_char_iter_pair`. If some well-intentioned maintainer removed the `const` on line 752, then line 753 would turn into an infinite recursion. I don't see any reason to use overloading here. Maybe change `__widen_from_char_iter_pair` on lines 739, 753, 766, 776, 793, and 811 to `__widen_from_char_pointer_pair`? That would leave only line 889 as a potential source of extra heap-allocations...
I believe that this would not create an infinite recursion, the enable_if is sending both char* and const char* to the first overload line 739.
I am not sure if there is a smart way to avoid allocation here. Maybe if I call `codecvt::in` on every character from the iterator. But wouldn't that be slow? What happen if your code-point is made of 3-4 char instead of one? Should I then copy X amounts of characters into a buffer allocated on the stack and send process the iterator chunks by chunks?
I have a similar issue line 808 and if anyone has a clever solution, I am all for it!
I had a look at how other constructors for path already in place (like line 681 or 692) do it and they already heap-allocate on even more code-paths:
```
template <class _Iter>
static void __append_range(string& __dest, _Iter __b, _Iter __e) {
static_assert(!is_same<_Iter, _ECharT*>::value, "Call const overload");
if (__b == __e)
return;
basic_string<_ECharT> __tmp(__b, __e);
_Narrower()(back_inserter(__dest), __tmp.data(),
__tmp.data() + __tmp.length());
}
```
If we go in your direction, 889 would now heap-allocation in all scenarios, whereas now the 889 overload supplied with const char* and char* would avoid allocating a string.
================
Comment at: libcxx/include/filesystem:792
+ for (; *__e != '\0'; ++__e)
+ ;
+ return __widen_from_char_iter_pair(__b, __e, __loc);
----------------
Quuxplusone wrote:
> This could probably be more simply expressed as
> ```
> string_view __sv(__b);
> return __widen_from_char_iter_pair(__sv.data(), __sv.data() + __sv.size(), __loc);
> ```
Neat! Will do that.
================
Comment at: libcxx/include/filesystem:838
+ using _EnableIfWidenableFromCharSource =
+ typename enable_if<__is_widenable_from_char_source<_SourceOrIter>::value, _Tp>::type;
+
----------------
Quuxplusone wrote:
> ```
> template <class _SourceOrIter>
> using _EnableIfWidenableFromCharSource =
> _EnableIf<__is_widenable_from_char_source<_SourceOrIter>::value>;
> ```
> You don't seem to use the `_Tp` parameter for anything at the moment.
Indeed, I will remove it!
================
Comment at: libcxx/test/std/input.output/filesystems/class.path/path.member/path.construct/source_and_locale.pass.cpp:32
+template <class ...Args>
+void RunTestCase(const char* TestPath, const char* Expect, std::locale Locale, Args... args) {
+ using namespace fs;
----------------
Quuxplusone wrote:
> Should `std::locale` be passed as `const std::locale&` instead? Does it matter? (I don't know.)
I was thinking that locale is a pretty cheap type (a sort of reference counted pointer), but I might be wrong. Taking it by reference cannot make it worst ;)
================
Comment at: libcxx/test/std/input.output/filesystems/class.path/path.member/path.construct/source_and_locale.pass.cpp:97
+ static_assert(std::__is_input_iterator<It>::value, "");
+ //static_assert(std::is_constructible<path, It, std::locale>::value, "");
+ }
----------------
Quuxplusone wrote:
> Why is this commented out?
This should not be here. Thanks for spotting it!
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D64818/new/
https://reviews.llvm.org/D64818
More information about the libcxx-commits
mailing list