granicus.if.org Git - recode/commit

author	Reuben Thomas <rrt@sc3d.org>
	Wed, 17 Jan 2018 22:43:10 +0000 (22:43 +0000)
committer	Reuben Thomas <rrt@sc3d.org>
	Tue, 23 Jan 2018 07:02:42 +0000 (07:02 +0000)
commit	5f430e94ae31c4192be9c28d8f01d01591998537
tree	0867b644393ebfa76bb683a2a2461b4b778f9c2a
parent	66876d7f13cb943a190bef6fb992ea4c8e4a8233

Try to diagnose untranslatable input when using iconv

See Debian bug #348909.

The problem starts with the fact that iconv returns EILSEQ (invalid input)
when in fact the input is merely untranslatable.

It is possible to diagnose this situation by running another conversion with
the output encoding the same as the input (so that it will always succeed on
valid input) at the same point. This is what we now do. Unfortunately,
there’s no way I can see to work out how much input to skip (i.e. the length
of the untranslatable character in the source encoding). Hence, we still
just skip one byte. Typically the remaining bytes of that character are then
diagnosed as invalid input on the next step, so the end result is the same
as at present.
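
The check described above can be sketched roughly as follows. This is an
illustration only, not the code in src/iconv.c: the names probe_conversion
and enum probe_result are hypothetical, and the sketch assumes an iconv
implementation (such as glibc's) that reports untranslatable characters
with EILSEQ and validates input when the source and target encodings are
the same.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; not recode's identifiers. */
enum probe_result {
    PROBE_OK,             /* whole input converted */
    PROBE_UNTRANSLATABLE, /* valid in `from`, but no equivalent in `to` */
    PROBE_INVALID,        /* not valid (or incomplete) in `from` */
    PROBE_ERROR           /* iconv_open failed */
};

/* Convert `len` bytes of `in` from encoding `from` to encoding `to`,
   discarding the output, and classify the first failure (if any). */
enum probe_result
probe_conversion(const char *from, const char *to, const char *in, size_t len)
{
    assert(from != NULL && to != NULL && in != NULL);

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return PROBE_ERROR;

    char out[256];
    char *inp = (char *)in; /* iconv takes char **, but does not modify input */
    size_t inleft = len;
    enum probe_result result = PROBE_OK;

    while (inleft > 0) {
        char *outp = out;
        size_t outleft = sizeof out;

        if (iconv(cd, &inp, &inleft, &outp, &outleft) != (size_t)-1)
            break;            /* all remaining input was converted */
        if (errno == E2BIG)
            continue;         /* scratch buffer full: discard it and go on */
        if (errno != EILSEQ) {
            result = PROBE_INVALID; /* EINVAL: truncated sequence at the end */
            break;
        }

        /* EILSEQ: rerun from the failing position with the output encoding
           equal to the input encoding. If that makes any progress, the
           character at `inp` is valid and was merely untranslatable. */
        iconv_t check = iconv_open(from, from);
        if (check == (iconv_t)-1) {
            result = PROBE_ERROR;
            break;
        }
        char *cinp = inp;
        size_t cinleft = inleft;
        outp = out;
        outleft = sizeof out;
        (void)iconv(check, &cinp, &cinleft, &outp, &outleft);
        iconv_close(check);

        result = (cinp > inp) ? PROBE_UNTRANSLATABLE : PROBE_INVALID;
        break;
    }

    iconv_close(cd);
    return result;
}
```

Under those assumptions, probing the UTF-8 bytes "caf\xc3\xa9" against ASCII
should classify the failure as untranslatable, while a stray 0xff byte should
be classified as invalid. Note that the sketch only classifies the failure;
like the commit, it has no way to report how many bytes the untranslatable
character occupies.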

Two possible workarounds are to not use iconv, or to set abort_level to
RECODE_UNTRANSLATABLE (this is what test_2 in t80_error.py does).

doc/recode.texi
src/iconv.c
tests/t80_error.py