From: Andy Heninger
Date: Fri, 3 Feb 2017 02:46:43 +0000 (+0000)
Subject: ICU-12870 Charset Detector, have docs reference the Compact Encoding Detector.
X-Git-Tag: release-59-rc~155
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=415932a1e3bb86d12c297b172a903d89af861352;p=icu

ICU-12870 Charset Detector, have docs reference the Compact Encoding Detector.

X-SVN-Rev: 39640
---

diff --git a/icu4c/source/common/unicode/docmain.h b/icu4c/source/common/unicode/docmain.h
index 478856366c4..698e2ae596c 100644
--- a/icu4c/source/common/unicode/docmain.h
+++ b/icu4c/source/common/unicode/docmain.h
@@ -98,6 +98,11 @@
  * C API
  *
  *
+ * Codepage Detection
+ * ucsdet.h
+ * C API
+ *
+ *
  * Unicode Text Compression
  * ucnv.h (encoding name "SCSU" or "BOCU-1")
  * C API
diff --git a/icu4c/source/i18n/unicode/ucsdet.h b/icu4c/source/i18n/unicode/ucsdet.h
index 73f0ab587da..0d0bc3186d6 100644
--- a/icu4c/source/i18n/unicode/ucsdet.h
+++ b/icu4c/source/i18n/unicode/ucsdet.h
@@ -45,6 +45,10 @@
  * in a single language, and a minimum of a few hundred bytes worth of plain text
  * in the language are needed. The detection process will attempt to
  * ignore html or xml style markup that could otherwise obscure the content.
+ *
+ * An alternative to the ICU Charset Detector is the
+ * Compact Encoding Detector, https://github.com/google/compact_enc_det.
+ * It often gives more accurate results, especially with short input samples.
  */
@@ -395,7 +399,7 @@ ucsdet_getDetectableCharsets(const UCharsetDetector *ucsd, UErrorCode *status);
 /**
  * Enable or disable individual charset encoding.
  * A name of charset encoding must be included in the names returned by
- * {@link #getAllDetectableCharsets()}.
+ * {@link #ucsdet_getAllDetectableCharsets()}.
  *
  * @param ucsd a Charset detector.
  * @param encoding encoding the name of charset encoding.