An unrecognized format character causes all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
+ .. note::
+
+ The ``"%lld"`` and ``"%llu"`` format specifiers are only available
+ when :const:`HAVE_LONG_LONG` is defined.
+
+ .. versionchanged:: 3.2
+ Support for ``"%lld"`` and ``"%llu"`` added.
-.. cfunction:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
- Identical to :func:`PyUnicode_FromFormat` except that it takes exactly two
+.. c:function:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
+
+ Identical to :c:func:`PyUnicode_FromFormat` except that it takes exactly two
arguments.
- :c:type:`Py_UNICODE` buffer of the given size by ASCII digits 0--9
+.. c:function:: PyObject* PyUnicode_TransformDecimalToASCII(Py_UNICODE *s, Py_ssize_t size)
+
+ Create a Unicode object by replacing all decimal digits in the
++ :c:type:`Py_UNICODE` buffer of the given *size* by ASCII digits 0--9
+ according to their decimal value. Return *NULL* if an exception
+ occurs.
-.. cfunction:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
- Return a read-only pointer to the Unicode object's internal :ctype:`Py_UNICODE`
+.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
+
+ Return a read-only pointer to the Unicode object's internal :c:type:`Py_UNICODE`
buffer, or *NULL* if *unicode* is not a Unicode object.
-.. cfunction:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
+.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeCopy(PyObject *unicode)
+
- Create a copy of a unicode string ending with a nul character. Return *NULL*
++ Create a copy of a Unicode string ending with a nul character. Return *NULL*
+ and raise a :exc:`MemoryError` exception on memory allocation failure,
+ otherwise return a newly allocated buffer (use :c:func:`PyMem_Free` to free the
+ buffer).
+
+ .. versionadded:: 3.2
+
+
+.. c:function:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
Return the length of the Unicode object.
wchar_t Support
"""""""""""""""
- wchar_t support for platforms which support it:
-:ctype:`wchar_t` support for platforms which support it:
++:c:type:`wchar_t` support for platforms which support it:
-.. cfunction:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
+.. c:function:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
- Create a Unicode object from the :c:type:`wchar_t` buffer *w* of the given size.
- Passing -1 as the size indicates that the function must itself compute the length,
- Create a Unicode object from the :ctype:`wchar_t` buffer *w* of the given *size*.
++ Create a Unicode object from the :c:type:`wchar_t` buffer *w* of the given *size*.
+ Passing -1 as the *size* indicates that the function must itself compute the length,
using ``wcslen``.
Return *NULL* on failure.
Setting *encoding* to *NULL* causes the default encoding to be used,
which is ASCII. The file system calls should use
-:cfunc:`PyUnicode_FSConverter` for encoding file names. This uses the
-variable :cdata:`Py_FileSystemDefaultEncoding` internally. This
+:c:func:`PyUnicode_FSConverter` for encoding file names. This uses the
+variable :c:data:`Py_FileSystemDefaultEncoding` internally. This
- variable should be treated as read-only: On some systems, it will be a
+ variable should be treated as read-only: on some systems, it will be a
pointer to a static string, on others, it will change at run-time
(such as when the application invokes setlocale).
the codec.
-.. cfunction:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
+.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
- Encode the :c:type:`Py_UNICODE` buffer of the given size and return a Python
- Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* and return a Python
++ Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* and return a Python
bytes object. *encoding* and *errors* have the same meaning as the
parameters of the same name in the Unicode :meth:`encode` method. The codec
to be used is looked up using the Python codec registry. Return *NULL* if an
that have been decoded will be stored in *consumed*.
-.. cfunction:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
- Encode the :c:type:`Py_UNICODE` buffer of the given size using UTF-8 and
- Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
++ Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
return a Python bytes object. Return *NULL* if an exception was raised by
the codec.
These are the UTF-32 codec APIs:
-.. cfunction:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
+.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
- Decode *length* bytes from a UTF-32 encoded buffer string and return the
+ Decode *size* bytes from a UTF-32 encoded buffer string and return the
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
handling. It defaults to "strict".
These are the UTF-16 codec APIs:
-.. cfunction:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
+.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
- Decode *length* bytes from a UTF-16 encoded buffer string and return the
+ Decode *size* bytes from a UTF-16 encoded buffer string and return the
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
handling. It defaults to "strict".
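As a rough illustration of the *errors* argument, the Python-level ``bytes.decode`` method accepts the same handler names. The sketch below assumes the C function and ``bytes.decode`` share their error-handling behaviour; the sample bytes are invented for the example.

```python
# UTF-16-LE data with one stray trailing byte, so the last code unit is truncated.
data = "ab".encode("utf-16-le") + b"\x00"

# "strict" (the default) raises on malformed input.
try:
    data.decode("utf-16-le", "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" drops the malformed tail; "replace" substitutes U+FFFD for it.
ignored = data.decode("utf-16-le", "ignore")
replaced = data.decode("utf-16-le", "replace")
```

From C, the corresponding failure would surface as a *NULL* return with an exception set.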
string *s*. Return *NULL* if an exception was raised by the codec.
-.. cfunction:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
+.. c:function:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
- Encode the :c:type:`Py_UNICODE` buffer of the given size using Unicode-Escape and
- Encode the :ctype:`Py_UNICODE` buffer of the given size using Unicode-Escape and
++ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Unicode-Escape and
return a Python bytes object. Return *NULL* if an exception was raised by the
codec.
encoded string *s*. Return *NULL* if an exception was raised by the codec.
-.. cfunction:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
- Encode the :c:type:`Py_UNICODE` buffer of the given size using Raw-Unicode-Escape
- Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
++ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
and return a Python bytes object. Return *NULL* if an exception was raised by
the codec.
*s*. Return *NULL* if an exception was raised by the codec.
-.. cfunction:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
- Encode the :c:type:`Py_UNICODE` buffer of the given size using Latin-1 and
- Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Latin-1 and
++ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Latin-1 and
return a Python bytes object. Return *NULL* if an exception was raised by
the codec.
*s*. Return *NULL* if an exception was raised by the codec.
-.. cfunction:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
- Encode the :c:type:`Py_UNICODE` buffer of the given size using ASCII and
- Encode the :ctype:`Py_UNICODE` buffer of the given *size* using ASCII and
++ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using ASCII and
return a Python bytes object. Return *NULL* if an exception was raised by
the codec.
resp. Because of this, mappings only need to contain those mappings which map
characters to different code points.
+ These are the mapping codec APIs:
-.. cfunction:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
+.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
Create a Unicode object by decoding *size* bytes of the encoded string *s* using
the given *mapping* object. Return *NULL* if an exception was raised by the
treated as "undefined mapping".
-.. cfunction:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
+.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
- Encode the :c:type:`Py_UNICODE` buffer of the given size using the given
- Encode the :ctype:`Py_UNICODE` buffer of the given *size* using the given
++ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given
*mapping* object and return a Python bytes object. Return *NULL* if an
exception was raised by the codec.
The following codec API is special in that it maps Unicode to Unicode.
-.. cfunction:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
+.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
- Translate a :c:type:`Py_UNICODE` buffer of the given length by applying a
- Translate a :ctype:`Py_UNICODE` buffer of the given *size* by applying a
++ Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a
character mapping *table* to it and return the resulting Unicode object. Return
*NULL* when an exception was raised by the codec.
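A Python-level sketch of what a character mapping *table* looks like, using ``str.translate`` as a stand-in for the C function. It assumes both follow the same table conventions; the table contents are made up for illustration.

```python
# Keys are code points. Values may be a code point, a string, or None (delete).
table = {
    ord("h"): ord("H"),  # map to another code point
    ord("l"): "L",       # map to a string
    ord("o"): None,      # delete the character
}

# "e" has no entry in the table, so it is copied unchanged.
translated = "hello".translate(table)
```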
:exc:`LookupError`) are left untouched and are copied as-is.
- These are the MBCS codec APIs. They are currently only available on Windows and
- use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
- DBCS) is a class of encodings, not just one. The target encoding is defined by
- the user settings on the machine running the codec.
-
+
MBCS codecs for Windows
"""""""""""""""""""""""
+ These are the MBCS codec APIs. They are currently only available on Windows and
+ use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
+ DBCS) is a class of encodings, not just one. The target encoding is defined by
+ the user settings on the machine running the codec.
-
-.. cfunction:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the MBCS encoded string *s*.
Return *NULL* if an exception was raised by the codec.
in *consumed*.
-.. cfunction:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
- Encode the :c:type:`Py_UNICODE` buffer of the given size using MBCS and return
- Encode the :ctype:`Py_UNICODE` buffer of the given *size* using MBCS and return
++ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using MBCS and return
a Python bytes object. Return *NULL* if an exception was raised by the
codec.
Concatenate two strings, giving a new Unicode string.
-.. cfunction:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
+.. c:function:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
- Split a string giving a list of Unicode strings. If sep is *NULL*, splitting
+ Split a string giving a list of Unicode strings. If *sep* is *NULL*, splitting
will be done at all whitespace substrings. Otherwise, splits occur at the given
separator. At most *maxsplit* splits will be done. If negative, no limit is
set. Separators are not included in the resulting list.
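The *sep* and *maxsplit* rules can be tried out at the Python level with ``str.split``. This is only a sketch, assuming the C function behaves like that method, with ``None`` standing in for a *NULL* separator.

```python
# No separator (None here, NULL in C): split on runs of whitespace.
# A negative maxsplit means no limit.
parts_ws = "a b  c d".split(None, -1)

# Explicit separator with at most two splits; the remainder stays in one piece.
parts_limited = "a,b,c,d".split(",", 2)
```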
use the default error handling.
-.. cfunction:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
+.. c:function:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
- Join a sequence of strings using the given separator and return the resulting
+ Join a sequence of strings using the given *separator* and return the resulting
Unicode string.
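For comparison, the Python-level counterpart is ``str.join``, called on the separator. The sketch assumes the C function takes the same two inputs in the order shown in its signature.

```python
separator = "-"
seq = ["a", "b", "c"]

# The separator is placed between items, never before the first or after the last.
joined = separator.join(seq)
```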
-.. cfunction:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
+.. c:function:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
- Return 1 if *substr* matches *str*[*start*:*end*] at the given tail end
+ Return 1 if *substr* matches ``str[start:end]`` at the given tail end
(*direction* == -1 means to do a prefix match, *direction* == 1 a suffix match),
0 otherwise. Return ``-1`` if an error occurred.
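The *direction* convention can be mimicked in Python with ``str.startswith`` and ``str.endswith``. The helper ``tailmatch`` below is hypothetical, written only for this illustration, and assumes the C function matches against the same ``str[start:end]`` slice.

```python
def tailmatch(s, substr, start, end, direction):
    """Hypothetical Python analogue; not part of any real API."""
    if direction == -1:
        # Prefix match against s[start:end].
        return 1 if s.startswith(substr, start, end) else 0
    # Suffix match against s[start:end].
    return 1 if s.endswith(substr, start, end) else 0


prefix_hit = tailmatch("unicode", "uni", 0, 7, -1)
suffix_hit = tailmatch("unicode", "code", 0, 7, 1)
# With end=6 the slice is "unicod", which does not end with "code".
suffix_miss = tailmatch("unicode", "code", 0, 6, 1)
```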
-.. cfunction:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
+.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
- Return the first position of *substr* in *str*[*start*:*end*] using the given
+ Return the first position of *substr* in ``str[start:end]`` using the given
*direction* (*direction* == 1 means to do a forward search, *direction* == -1 a
backward search). The return value is the index of the first match; a value of
``-1`` indicates that no match was found, and ``-2`` indicates that an error