[PATCH] D88741: [SystemZ/z/OS] Add utility class for char set conversion.

Hubert Tong via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 6 21:01:32 PDT 2021


hubert.reinterpretcast added inline comments.


================
Comment at: llvm/CMakeLists.txt:386
 
+set(LLVM_ENABLE_ICONV "ON" CACHE STRING "Use iconv for for character set conversion if available. Can be ON, OFF, or FORCE_ON")
+
----------------
Minor nit: s/for for/for;


================
Comment at: llvm/include/llvm/Support/CharSet.h:76
+  ///
+  /// The following error codes can occuor, among others:
+  ///   - std::errc::invalid_argument: The requested conversion is not
----------------
Minor nit: s/occuor/occur/;


================
Comment at: llvm/include/llvm/Support/CharSet.h:90
+
+  CharSetConverter &operator=(CharSetConverter &&Other) {
+    this->Convert = Other.Convert;
----------------
If the current object had a cleanup action on entry to the move assignment operator, then that cleanup action should be run prior to replacing the value.


================
Comment at: llvm/include/llvm/Support/CharSet.h:107
+  ///
+  /// The following error codes can occuor, among others:
+  ///   - std::errc::argument_list_too_long: The result requires more than
----------------
Here too: s/occuor/occur/;


================
Comment at: llvm/include/llvm/Support/CharSet.h:42
+};
+}
+
----------------
ZarkoCA wrote:
> nit: may be more clear to add the ending comment.
Minor comment still to address here.


================
Comment at: llvm/include/llvm/Support/CharSet.h:98
+  /// \param[in] Source source string
+  /// \param[in,out] Result container for converted string
+  /// \return error code in case something went wrong
----------------
Kai wrote:
> hubert.reinterpretcast wrote:
> > The expectations around null termination and shift state should be documented. Note that https://wg21.link/p2029 recommends that numeric escapes do not trigger changes in shift state, so avoiding automatic transitions to the initial shift state at the end of the (potential mid-string) translation is probably the right thing to do. Using a null `StringRef` as the method to cause a return to the initial shift state would be consistent with the `iconv` interface.
> I updated the comment (and also the implementation) that an empty string resets the shift state. However, this feels a bit clumsy, and leaks a lot of the details of the iconv interface. So I wonder if a new method, e.g. `resetState()` would not be the better approach.
I think that empty string is maybe not unique enough for this operation, so yes, a new method is probably better. However, it may be useful to understand how using `StringRef("", 1u)` as the input fits in.


================
Comment at: llvm/lib/Support/CharSet.cpp:135
+        Ch = Ch2 | (Ch << 6);
+        Length--;
+      }
----------------



================
Comment at: llvm/lib/Support/CharSet.cpp:163
+  Result.resize_for_overwrite(Capacity);
+  char *Output = InputLength ? static_cast<char *>(Result.data()) : nullptr;
+  size_t OutputLength = Capacity;
----------------
It may be useful for the client of the interface to be able to retrieve the shift sequence (if any) to the initial shift state from the current internal output shift state. Although that use case might be achieved by using `StringRef("", 1u)`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88741/new/

https://reviews.llvm.org/D88741



More information about the llvm-commits mailing list