[LLVMbugs] [Bug 13759] New: codecvt::in() incorrectly handles partial multibyte inputs

bugzilla-daemon at llvm.org
Mon Sep 3 22:14:37 PDT 2012


http://llvm.org/bugs/show_bug.cgi?id=13759

             Bug #: 13759
           Summary: codecvt::in() incorrectly handles partial multibyte
                    inputs
           Product: libc++
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: All Bugs
        AssignedTo: hhinnant at apple.com
        ReportedBy: philippuryear at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


Created attachment 9151
  --> http://llvm.org/bugs/attachment.cgi?id=9151
testcase

When supplied with a partial multibyte input (i.e. a multibyte prefix),
codecvt::in() returns codecvt_base::error instead of codecvt_base::partial.

Steps to reproduce:
1. Compile the attached testcase with e.g.
       $ clang++ -std=c++11 -stdlib=libc++ -o test test.cc
   and run it.

Expected results: The testcase should print "partial".
Actual results: The testcase prints "error".

Additional notes:
I've reproduced this bug on both my Mac and Linux systems using the above
testcase, which calls codecvt::in() on the first byte of the three-byte UTF-8
string "ℝ".

The C++11 standard (22.4.1.4.2) seems to imply that codecvt_base::partial
should be returned, since "additional source elements are needed before another
destination element can be produced". However, in the above testcase, libstdc++
4.7 on my Linux machine returns codecvt_base::ok, so I'm not sure what the
correct behavior is.

The code responsible (locale.cpp:1386) seems to have some inverted logic. In
particular,
> n = mbsnrtowcs(...);
> if (n == size_t(-1))
>     ...
> if (n == 0)
>     return error;
seems wrong, since mbsnrtowcs() returns (size_t)-1 on error, not 0 (which simply
indicates that the length of the wide output string is 0).

I'm *fairly* sure this is a bug, but I'm kind of poking around in the dark
here, so please correct me if this is the intended behavior.

-- 
Configure bugmail: http://llvm.org/bugs/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


