[llvm] Create a CharSetConverter class with both iconv and icu support (PR #74516)

Hubert Tong via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 26 10:41:35 PDT 2024


================
@@ -0,0 +1,136 @@
+//===-- CharSet.h - Characters set conversion class ---------------*- C++ -*-=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// This file provides a utility class to convert between different character
+/// set encodings.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_CHARSET_H
+#define LLVM_SUPPORT_CHARSET_H
+
+#include "llvm/ADT/SmallString.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Config/config.h"
+#include "llvm/Support/ErrorOr.h"
+
+#include <functional>
+#include <string>
+#include <system_error>
+
+namespace llvm {
+
+template <typename T> class SmallVectorImpl;
+
+namespace details {
+class CharSetConverterImplBase {
+public:
+  virtual ~CharSetConverterImplBase() = default;
+
+  /// Converts a string.
+  /// \param[in] Source source string
+  /// \param[out] Result container for converted string
+  /// \param[in] ShouldAutoFlush Append shift-back sequence after conversion
+  /// for stateful encodings if true.
----------------
hubert-reinterpretcast wrote:

If there is no reason to avoid the shift-back sequence (and resetting the current shift state for translating input), then `ShouldAutoFlush` should be removed as a parameter.

The description of the function could be extended to clarify that the output stream and the beginning of the source string is (for stateful encodings) assumed to be in the initial shift state.

https://github.com/llvm/llvm-project/pull/74516


More information about the llvm-commits mailing list