[LLVMbugs] [Bug 13602] basic_filebuf's internal buffer is shrinking when using with some codecvt.
bugzilla-daemon at llvm.org
Fri Aug 17 13:52:48 PDT 2012
http://llvm.org/bugs/show_bug.cgi?id=13602
Hyeon-Bin Jeong <tuhertz at gmail.com> changed:
          What|Removed   |Added
----------------------------------------------------------------------------
        Status|RESOLVED  |REOPENED
    Resolution|INVALID   |
--- Comment #2 from Hyeon-Bin Jeong <tuhertz at gmail.com> 2012-08-17 15:52:48 CDT ---
I think I'm doing exactly what you are saying and what the standard intends.
(Am I missing something?)
I'm trying to write a codecvt that converts the external char type (UTF-8) to
the internal char32_t type (UTF-32).
A UTF-8 character takes 1 to 4 bytes, so this is an N:1 conversion. When
underflow() is called, it fills the 4096-byte external buffer with a char
(UTF-8) sequence from the FILE object, and then converts it into char32_t
(UTF-32) characters in the internal buffer by calling in().
    __r = __cv_->in(__st_, __extbuf_, __extbufend_, __extbufnext_,
                    this->eback() + __unget_sz,
                    this->egptr(), __inext);
But it produces only about 1300 char32_t characters when converting Asian
text, because most Asian-language characters are 3 bytes wide in UTF-8.
So __inext advances only a third of the buffer size.
The problem happens when it calls setg() a few lines below.
    this->setg(this->eback(), this->eback() + __unget_sz, __inext);
This line sets the end of the (internal) buffer (i.e. __einp_) to __inext. So
after this line, egptr() returns a position one third of the way from
__intbuf_ to __intbuf_ + __ibs_.
When underflow() is next called, it calculates the read size __nmemb from
egptr() - eback(), so it loads only 33% of the external buffer! As a result,
the buffer size keeps shrinking on each underflow() call until it is down to
1 byte.
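The shrinking can be illustrated with a simplified model of the arithmetic above. This is only a sketch, not the libc++ code: the function name shrinking_sizes and its parameters are hypothetical, __unget_sz is taken as 0, every character is assumed to have the same byte width, and leftover partial bytes are ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified model (an assumption, not the real basic_filebuf): each entry is
// the value of egptr() - eback() that one underflow() call would see.
//   ibs            - internal buffer size in characters (__ibs_ in the report)
//   ebs            - external buffer size in bytes
//   bytes_per_char - fixed width of every external character
std::vector<std::size_t> shrinking_sizes(std::size_t ibs, std::size_t ebs,
                                         std::size_t bytes_per_char) {
    std::vector<std::size_t> sizes;
    std::size_t avail = ibs;  // egptr() - eback() before the first call
    while (avail > 0) {
        sizes.push_back(avail);
        // Read size, as described for __nmemb above.
        std::size_t nmemb = std::min(avail, ebs);
        // in() can only produce this many whole characters from nmemb bytes.
        std::size_t produced = nmemb / bytes_per_char;
        if (produced == avail) break;  // 1:1 conversion: nothing shrinks
        // setg(..., __inext) moves egptr() down to the converted count.
        avail = produced;
    }
    return sizes;
}
```

Under these assumptions, shrinking_sizes(4096, 4096, 3) gives 4096, 1365, 455, 151, 50, 16, 5, 1, while a 1:1 conversion (bytes_per_char of 1) stays at 4096.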
Here is the sample code from the standard document. It uses intern_buf+ISIZE
as the buffer end, not egptr().
    char extern_buf[XSIZE];
    char* extern_end;
    charT intern_buf[ISIZE];
    charT* intern_end;
    codecvt_base::result r =
        a_codecvt.in(state, extern_buf, extern_buf+XSIZE, extern_end,
                     intern_buf, intern_buf+ISIZE, intern_end);
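For reference, a runnable variant of that call might look like the sketch below. It assumes the standard std::codecvt<char32_t, char, std::mbstate_t> facet (deprecated since C++20) stands in for my own codecvt, and the helper name converted_chars and its test data are made up for illustration.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <vector>

// Hypothetical helper: fill an external buffer of `xsize` bytes with one
// repeated 3-byte UTF-8 character, convert it into an internal buffer of
// `isize` char32_t elements, and return how many characters were produced.
std::size_t converted_chars(std::size_t xsize, std::size_t isize) {
    typedef std::codecvt<char32_t, char, std::mbstate_t> cvt_t;
    const cvt_t& a_codecvt = std::use_facet<cvt_t>(std::locale::classic());

    // U+AC00 encoded as UTF-8 is EA B0 80.
    const char ga[3] = {'\xEA', '\xB0', '\x80'};
    std::vector<char> extern_buf(xsize);
    for (std::size_t i = 0; i < xsize; ++i) extern_buf[i] = ga[i % 3];

    std::vector<char32_t> intern_buf(isize);
    std::mbstate_t state = std::mbstate_t();
    const char* extern_end = nullptr;
    char32_t* intern_end = nullptr;

    // The output range always ends at intern_buf + isize, as in the sample.
    a_codecvt.in(state, extern_buf.data(), extern_buf.data() + xsize,
                 extern_end,
                 intern_buf.data(), intern_buf.data() + isize, intern_end);
    return static_cast<std::size_t>(intern_end - intern_buf.data());
}
```

With a 4096-byte external buffer and a 4096-element internal buffer, converted_chars(4096, 4096) should report 1365 characters, which matches the "about 1300" figure above; the last byte is an incomplete sequence and is left unconverted.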
And one more thing. I think seekpos() should restore the conversion state from
its position argument. seekoff() saves the current state into the position it
returns, so seekpos() needs to restore the state from it.
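A minimal sketch of that save/restore symmetry, using only the state() accessors of std::fpos. The toy_positioner type, its members, and the tell()/seek() names are hypothetical stand-ins for seekoff()/seekpos(); it is templated on the state type only so it can be tried with something simpler than mbstate_t.

```cpp
#include <ios>

// Hypothetical stand-in for the part of a filebuf that tracks the file
// offset and the codecvt conversion state.
template <class StateT>
struct toy_positioner {
    std::streamoff offset;
    StateT state;

    toy_positioner() : offset(0), state() {}

    // Like seekoff(0, cur): package the offset and current state together.
    std::fpos<StateT> tell() const {
        std::fpos<StateT> pos(offset);
        pos.state(state);
        return pos;
    }

    // Like the seekpos() behavior suggested above: restore both the offset
    // and the conversion state stored in the position.
    void seek(const std::fpos<StateT>& pos) {
        offset = static_cast<std::streamoff>(pos);
        state = pos.state();
    }
};
```

If seek() copied only the offset, a later in() call would continue with whatever state was left over from the previous read position.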