Found this message in my own commit logs for the console rubygem and I figured I’d post it here in case some other poor fool
is stuck writing code with mbrtowc:
Well, this is a fun little feature of mbrtowc that I discovered! Give it one bad input with a NULL ps argument and it will barf on all future inputs for the rest of the program, unless you pass in your own cleared shift state object (an mbstate_t).
The mbrtowc manpage is helpfully vague:
“If the multibyte string starting at s contains an invalid multibyte sequence before the next complete character, mbrtowc() returns (size_t) -1 and sets errno to EILSEQ. In this case, the effects on *ps are undefined.”
and
“If ps is a NULL pointer, a static anonymous state only known to the mbrtowc function is used instead. Otherwise, *ps must be a valid mbstate_t object.”
The reader is left to infer, of course, that the “undefined effects” on the “static anonymous state only known to the mbrtowc function” amount to “breaking all successive calls for the rest of the execution of the program”.
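For anyone who wants the workaround spelled out, here is a minimal sketch of the "pass in your own cleared shift state" approach. The helper name decode_char is hypothetical (it is not from the console gem), and I'm assuming that zeroing an mbstate_t with memset yields the initial state, which the C standard permits. The idea is simply to own the state yourself and wipe it whenever mbrtowc reports an invalid sequence, so the damage never outlives the bad input.

```c
#include <errno.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode one character from s (at most n bytes) using a
 * caller-owned shift state instead of mbrtowc's hidden static one.
 * Returns whatever mbrtowc returns. */
static size_t decode_char(wchar_t *out, const char *s, size_t n, mbstate_t *ps)
{
    size_t r = mbrtowc(out, s, n, ps);
    if (r == (size_t)-1) {
        /* Invalid sequence: *ps is now undefined, so put it back into the
         * initial state. A zero-valued mbstate_t is an initial state. */
        memset(ps, 0, sizeof *ps);
    }
    return r;
}
```

To use it, declare an mbstate_t, zero it once with memset before the first call, and pass its address on every call. How far to skip ahead in the input after an error (one byte is a common choice) is left to the caller, and the set of byte sequences that count as invalid depends on the current LC_CTYPE locale.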