Issue #13617: Document that the result of the conversion of a Unicode object to

author Victor Stinner <victor.stinner@haypocalc.com>

Sun, 18 Dec 2011 18:30:55 +0000 (19:30 +0100)

committer Victor Stinner <victor.stinner@haypocalc.com>

Sun, 18 Dec 2011 18:30:55 +0000 (19:30 +0100)
diff --cc Doc/ACKS.txt
Simple merge
diff --cc Doc/c-api/unicode.rst

index a6f3a69bfe3ec62a935aa019b3c3c2cf5953660a,35006547c6469020a91e42118a85daa7f62a22ff..43e3d2fef23b271a6b459f9cb38066a2ebfd42ba
--- 1/Doc/c-api/unicode.rst
--- 2/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@@ -527,158 -328,31 +527,164 @@@ APIs
      Identical to :c:func:`PyUnicode_FromFormat` except that it takes exactly two
      arguments.
   
-    :c:type:`Py_UNICODE` buffer, *NULL* if *unicode* is not a Unicode object.
-    This will create the :c:type:`Py_UNICODE` representation of the object if it
-    is not yet available.
+ +
+ +.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, \
+ +                               const char *encoding, const char *errors)
+ +
+ +   Coerce an encoded object *obj* to a Unicode object and return a reference with
+ +   incremented refcount.
+ +
+ +   :class:`bytes`, :class:`bytearray` and other char buffer compatible objects
+ +   are decoded according to the given *encoding* and using the error handling
+ +   defined by *errors*. Both can be *NULL* to have the interface use the default
+ +   values (see the next section for details).
+ +
+ +   All other objects, including Unicode objects, cause a :exc:`TypeError` to be
+ +   set.
+ +
+ +   The API returns *NULL* if there was an error.  The caller is responsible for
+ +   decref'ing the returned objects.
+ +
+ +
+ +.. c:function:: Py_ssize_t PyUnicode_GetLength(PyObject *unicode)
+ +
+ +   Return the length of the Unicode object, in code points.
+ +
+ +   .. versionadded:: 3.3
+ +
+ +
+ +.. c:function:: int PyUnicode_CopyCharacters(PyObject *to, Py_ssize_t to_start, \
+ +                        PyObject *from, Py_ssize_t from_start, Py_ssize_t how_many)
+ +
+ +   Copy characters from one Unicode object into another.  This function performs
+ +   character conversion when necessary and falls back to :c:func:`memcpy` if
+ +   possible.  Returns ``-1`` and sets an exception on error, otherwise returns
+ +   ``0``.
+ +
+ +   .. versionadded:: 3.3
+ +
+ +
+ +.. c:function:: int PyUnicode_WriteChar(PyObject *unicode, Py_ssize_t index, \
+ +                                        Py_UCS4 character)
+ +
+ +   Write a character to a string.  The string must have been created through
+ +   :c:func:`PyUnicode_New`.  Since Unicode strings are supposed to be immutable,
+ +   the string must not be shared, or have been hashed yet.
+ +
+ +   This function checks that *unicode* is a Unicode object, that the index is
+ +   not out of bounds, and that the object can be modified safely (i.e. that
+ +   its reference count is one), in contrast to the macro version
+ +   :c:func:`PyUnicode_WRITE_CHAR`.
+ +
+ +   .. versionadded:: 3.3
+ +
+ +
+ +.. c:function:: Py_UCS4 PyUnicode_ReadChar(PyObject *unicode, Py_ssize_t index)
+ +
+ +   Read a character from a string.  This function checks that *unicode* is a
+ +   Unicode object and the index is not out of bounds, in contrast to the macro
+ +   version :c:func:`PyUnicode_READ_CHAR`.
+ +
+ +   .. versionadded:: 3.3
+ +
+ +
+ +.. c:function:: PyObject* PyUnicode_Substring(PyObject *str, Py_ssize_t start, \
+ +                                              Py_ssize_t end)
+ +
+ +   Return a substring of *str*, from character index *start* (included) to
+ +   character index *end* (excluded).  Negative indices are not supported.
+ +
+ +   .. versionadded:: 3.3
+ +
+ +
+ +.. c:function:: Py_UCS4* PyUnicode_AsUCS4(PyObject *u, Py_UCS4 *buffer, \
+ +                                          Py_ssize_t buflen, int copy_null)
+ +
+ +   Copy the string *u* into a UCS4 buffer, including a null character, if
+ +   *copy_null* is set.  Returns *NULL* and sets an exception on error (in
+ +   particular, a :exc:`ValueError` if *buflen* is smaller than the length of
+ +   *u*).  *buffer* is returned on success.
+ +
+ +   .. versionadded:: 3.3
+ +
+ +
+ +.. c:function:: Py_UCS4* PyUnicode_AsUCS4Copy(PyObject *u)
+ +
+ +   Copy the string *u* into a new UCS4 buffer that is allocated using
+ +   :c:func:`PyMem_Malloc`.  If this fails, *NULL* is returned with a
+ +   :exc:`MemoryError` set.
+ +
+ +   .. versionadded:: 3.3
+ +
+ +
+ +Deprecated Py_UNICODE APIs
+ +""""""""""""""""""""""""""
+ +
+ +.. deprecated-removed:: 3.3 4.0
+ +
+ +These API functions are deprecated with the implementation of :pep:`393`.
+ +Extension modules can continue using them, as they will not be removed in Python
+ +3.x, but need to be aware that their use can now cause performance and memory hits.
+ +
+ +
+ +.. c:function:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size)
+ +
+ +   Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u*
+ +   may be *NULL* which causes the contents to be undefined. It is the user's
+ +   responsibility to fill in the needed data.  The buffer is copied into the new
+ +   object.
+ +
+ +   If the buffer is not *NULL*, the return value might be a shared object.
+ +   Therefore, modification of the resulting Unicode object is only allowed when
+ +   *u* is *NULL*.
+ +
+ +   If the buffer is *NULL*, :c:func:`PyUnicode_READY` must be called once the
+ +   string content has been filled before using any of the access macros such as
+ +   :c:func:`PyUnicode_KIND`.
+ +
+ +   Please migrate to using :c:func:`PyUnicode_FromKindAndData` or
+ +   :c:func:`PyUnicode_New`.
+ +
+ +
+ +.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
+ +
+ +   Return a read-only pointer to the Unicode object's internal
++   :c:type:`Py_UNICODE` buffer, or *NULL* on error. This will create the
++   :c:type:`Py_UNICODE*` representation of the object if it is not yet
++   available. Note that the resulting :c:type:`Py_UNICODE` string may contain
++   embedded null characters, which would cause the string to be truncated when
++   used in most C functions.
+ +
+ +   Please migrate to using :c:func:`PyUnicode_AsUCS4`,
+ +   :c:func:`PyUnicode_Substring`, :c:func:`PyUnicode_ReadChar` or similar new
+ +   APIs.
+ +
+ +
   .. c:function:: PyObject* PyUnicode_TransformDecimalToASCII(Py_UNICODE *s, Py_ssize_t size)
   
      Create a Unicode object by replacing all decimal digits in
      :c:type:`Py_UNICODE` buffer of the given *size* by ASCII digits 0--9
- -   according to their decimal value.  Return *NULL* if an exception
- -   occurs.
+ +   according to their decimal value.  Return *NULL* if an exception occurs.
   
   
- -.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
+ +.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeAndSize(PyObject *unicode, Py_ssize_t *size)
   
- -   Return a read-only pointer to the Unicode object's internal
- -   :c:type:`Py_UNICODE` buffer, *NULL* if *unicode* is not a Unicode object.
- -   Note that the resulting :c:type:`Py_UNICODE*` string may contain embedded
- -   null characters, which would cause the string to be truncated when used in
- -   most C functions.
+ +   Like :c:func:`PyUnicode_AsUnicode`, but also saves the :c:type:`Py_UNICODE`
-    array length in *size*.
++   array length in *size*. Note that the resulting :c:type:`Py_UNICODE*` string
++   may contain embedded null characters, which would cause the string to be
++   truncated when used in most C functions.
+ +
+ +   .. versionadded:: 3.3
   
   
   .. c:function:: Py_UNICODE* PyUnicode_AsUnicodeCopy(PyObject *unicode)
   
      Create a copy of a Unicode string ending with a nul character. Return *NULL*
      and raise a :exc:`MemoryError` exception on memory allocation failure,
-    otherwise return a new allocated buffer (use :c:func:`PyMem_Free` to free the
-    buffer).
+    otherwise return a new allocated buffer (use :c:func:`PyMem_Free` to free
- -   the buffer). Note that the resulting :c:type:`Py_UNICODE*` string may contain
- -   embedded null characters, which would cause the string to be truncated when
- -   used in most C functions.
++   the buffer). Note that the resulting :c:type:`Py_UNICODE*` string may
++   contain embedded null characters, which would cause the string to be
++   truncated when used in most C functions.
   
      .. versionadded:: 3.2
   
@@@ -850,10 -479,12 +857,12 @@@ wchar_t Suppor
      Copy the Unicode object contents into the :c:type:`wchar_t` buffer *w*.  At most
      *size* :c:type:`wchar_t` characters are copied (excluding a possibly trailing
      0-termination character).  Return the number of :c:type:`wchar_t` characters
--   copied or -1 in case of an error.  Note that the resulting :c:type:`wchar_t`
++   copied or -1 in case of an error.  Note that the resulting :c:type:`wchar_t*`
      string may or may not be 0-terminated.  It is the responsibility of the caller
--   to make sure that the :c:type:`wchar_t` string is 0-terminated in case this is
-    required by the application.
++   to make sure that the :c:type:`wchar_t*` string is 0-terminated in case this is
+    required by the application. Also, note that the :c:type:`wchar_t*` string
+    might contain null characters, which would cause the string to be truncated
+    when used with most C functions.
   
   
   .. c:function:: wchar_t* PyUnicode_AsWideCharString(PyObject *unicode, Py_ssize_t *size)
@@@ -863,9 -494,11 +872,11 @@@
      of wide characters (excluding the trailing 0-termination character) into
      *\*size*.
   
-    Returns a buffer allocated by :c:func:`PyMem_Alloc` (use :c:func:`PyMem_Free`
-    to free it) on success. On error, returns *NULL*, *\*size* is undefined and
-    raises a :exc:`MemoryError`.
+    Returns a buffer allocated by :c:func:`PyMem_Alloc` (use
+    :c:func:`PyMem_Free` to free it) on success. On error, returns *NULL*,
+    *\*size* is undefined and raises a :exc:`MemoryError`. Note that the
- -   resulting :c:type:`wchar_t*` string might contain null characters, which
++   resulting :c:type:`wchar_t` string might contain null characters, which
+    would cause the string to be truncated when used with most C functions.
   
      .. versionadded:: 3.2