From 5b9da893d3590106398afed0383bc06738d8c095 Mon Sep 17 00:00:00 2001 From: Barry Warsaw Date: Tue, 1 Oct 2002 01:05:52 +0000 Subject: [PATCH] Vast update to email version 2. This could surely use proofreading. --- Doc/lib/email.tex | 217 ++++++++------------ Doc/lib/emailencoders.tex | 8 +- Doc/lib/emailexc.tex | 10 +- Doc/lib/emailgenerator.tex | 93 +++++++-- Doc/lib/emailheaders.tex | 409 +++++++++++++++++++++++++++++++++++++ Doc/lib/emailiter.tex | 32 +++ Doc/lib/emailmessage.tex | 403 ++++++++++++++++++++++++++---------- Doc/lib/emailmimebase.tex | 159 ++++++++++++++ Doc/lib/emailparser.tex | 111 ++++++---- Doc/lib/emailutil.tex | 68 +++--- 10 files changed, 1185 insertions(+), 325 deletions(-) create mode 100644 Doc/lib/emailheaders.tex create mode 100644 Doc/lib/emailmimebase.tex diff --git a/Doc/lib/email.tex b/Doc/lib/email.tex index 5ba0ceaea2..aa9f3e552e 100644 --- a/Doc/lib/email.tex +++ b/Doc/lib/email.tex @@ -1,4 +1,4 @@ -% Copyright (C) 2001 Python Software Foundation +% Copyright (C) 2001,2002 Python Software Foundation % Author: barry@zope.com (Barry Warsaw) \section{\module{email} --- @@ -19,13 +19,10 @@ such as \refmodule{rfc822}, \refmodule{mimetools}, \refmodule{multifile}, and other non-standard packages such as \module{mimecntl}. It is specifically \emph{not} designed to do any sending of email messages to SMTP (\rfc{2821}) servers; that is the -function of the \refmodule{smtplib} module\footnote{For this reason, -line endings in the \module{email} package are always native line -endings. The \module{smtplib} module is responsible for converting -from native line endings to \rfc{2821} line endings, just as your mail -server would be responsible for converting from \rfc{2821} line -endings to native line endings when it stores messages in a local -mailbox.}. +function of the \refmodule{smtplib} module. 
The \module{email} +package attempts to be as RFC-compliant as possible, supporting in +addition to \rfc{2822}, such MIME-related RFCs as +\rfc{2045}-\rfc{2047}, and \rfc{2231}. The primary distinguishing feature of the \module{email} package is that it splits the parsing and generating of email messages from the @@ -55,8 +52,8 @@ Also included are detailed specifications of all the classes and modules that the \module{email} package provides, the exception classes you might encounter while using the \module{email} package, some auxiliary utilities, and a few examples. For users of the older -\module{mimelib} package, from which the \module{email} package is -descended, a section on differences and porting is provided. +\module{mimelib} package, or previous versions of the \module{email} +package, a section on differences and porting is provided. \begin{seealso} \seemodule{smtplib}{SMTP protocol client} @@ -72,133 +69,10 @@ descended, a section on differences and porting is provided. \input{emailgenerator} \subsection{Creating email and MIME objects from scratch} +\input{emailmimebase} -Ordinarily, you get a message object tree by passing some text to a -parser, which parses the text and returns the root of the message -object tree. However you can also build a complete object tree from -scratch, or even individual \class{Message} objects by hand. In fact, -you can also take an existing tree and add new \class{Message} -objects, move them around, etc. This makes a very convenient -interface for slicing-and-dicing MIME messages. - -You can create a new object tree by creating \class{Message} -instances, adding payloads and all the appropriate headers manually. -For MIME messages though, the \module{email} package provides some -convenient classes to make things easier. Each of these classes -should be imported from a module with the same name as the class, from -within the \module{email} package. 
E.g.: - -\begin{verbatim} -import email.MIMEImage.MIMEImage -\end{verbatim} - -or - -\begin{verbatim} -from email.MIMEText import MIMEText -\end{verbatim} - -Here are the classes: - -\begin{classdesc}{MIMEBase}{_maintype, _subtype, **_params} -This is the base class for all the MIME-specific subclasses of -\class{Message}. Ordinarily you won't create instances specifically -of \class{MIMEBase}, although you could. \class{MIMEBase} is provided -primarily as a convenient base class for more specific MIME-aware -subclasses. - -\var{_maintype} is the \mailheader{Content-Type} major type -(e.g. \mimetype{text} or \mimetype{image}), and \var{_subtype} is the -\mailheader{Content-Type} minor type -(e.g. \mimetype{plain} or \mimetype{gif}). \var{_params} is a parameter -key/value dictionary and is passed directly to -\method{Message.add_header()}. - -The \class{MIMEBase} class always adds a \mailheader{Content-Type} header -(based on \var{_maintype}, \var{_subtype}, and \var{_params}), and a -\mailheader{MIME-Version} header (always set to \code{1.0}). -\end{classdesc} - -\begin{classdesc}{MIMEAudio}{_audiodata\optional{, _subtype\optional{, - _encoder\optional{, **_params}}}} - -A subclass of \class{MIMEBase}, the \class{MIMEAudio} class is used to -create MIME message objects of major type \mimetype{audio}. -\var{_audiodata} is a string containing the raw audio data. If this -data can be decoded by the standard Python module \refmodule{sndhdr}, -then the subtype will be automatically included in the -\mailheader{Content-Type} header. Otherwise you can explicitly specify the -audio subtype via the \var{_subtype} parameter. If the minor type could -not be guessed and \var{_subtype} was not given, then \exception{TypeError} -is raised. - -Optional \var{_encoder} is a callable (i.e. function) which will -perform the actual encoding of the audio data for transport. This -callable takes one argument, which is the \class{MIMEAudio} instance. 
-It should use \method{get_payload()} and \method{set_payload()} to -change the payload to encoded form. It should also add any -\mailheader{Content-Transfer-Encoding} or other headers to the message -object as necessary. The default encoding is \emph{Base64}. See the -\refmodule{email.Encoders} module for a list of the built-in encoders. - -\var{_params} are passed straight through to the \class{MIMEBase} -constructor. -\end{classdesc} - -\begin{classdesc}{MIMEImage}{_imagedata\optional{, _subtype\optional{, - _encoder\optional{, **_params}}}} - -A subclass of \class{MIMEBase}, the \class{MIMEImage} class is used to -create MIME message objects of major type \mimetype{image}. -\var{_imagedata} is a string containing the raw image data. If this -data can be decoded by the standard Python module \refmodule{imghdr}, -then the subtype will be automatically included in the -\mailheader{Content-Type} header. Otherwise you can explicitly specify the -image subtype via the \var{_subtype} parameter. If the minor type could -not be guessed and \var{_subtype} was not given, then \exception{TypeError} -is raised. - -Optional \var{_encoder} is a callable (i.e. function) which will -perform the actual encoding of the image data for transport. This -callable takes one argument, which is the \class{MIMEImage} instance. -It should use \method{get_payload()} and \method{set_payload()} to -change the payload to encoded form. It should also add any -\mailheader{Content-Transfer-Encoding} or other headers to the message -object as necessary. The default encoding is \emph{Base64}. See the -\refmodule{email.Encoders} module for a list of the built-in encoders. - -\var{_params} are passed straight through to the \class{MIMEBase} -constructor. -\end{classdesc} - -\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{, - _charset\optional{, _encoder}}}} - -A subclass of \class{MIMEBase}, the \class{MIMEText} class is used to -create MIME objects of major type \mimetype{text}. 
\var{_text} is the -string for the payload. \var{_subtype} is the minor type and defaults -to \mimetype{plain}. \var{_charset} is the character set of the text and is -passed as a parameter to the \class{MIMEBase} constructor; it defaults -to \code{us-ascii}. No guessing or encoding is performed on the text -data, but a newline is appended to \var{_text} if it doesn't already -end with a newline. - -The \var{_encoding} argument is as with the \class{MIMEImage} class -constructor, except that the default encoding for \class{MIMEText} -objects is one that doesn't actually modify the payload, but does set -the \mailheader{Content-Transfer-Encoding} header to \code{7bit} or -\code{8bit} as appropriate. -\end{classdesc} - -\begin{classdesc}{MIMEMessage}{_msg\optional{, _subtype}} -A subclass of \class{MIMEBase}, the \class{MIMEMessage} class is used to -create MIME objects of main type \mimetype{message}. \var{_msg} is used as -the payload, and must be an instance of class \class{Message} (or a -subclass thereof), otherwise a \exception{TypeError} is raised. - -Optional \var{_subtype} sets the subtype of the message; it defaults -to \mimetype{rfc822}. -\end{classdesc} +\subsection{Headers, Character sets, and Internationalization} +\input{emailheaders} \subsection{Encoders} \input{emailencoders} @@ -212,6 +86,77 @@ to \mimetype{rfc822}. \subsection{Iterators} \input{emailiter} +\subsection{Differences from \module{email} v1 (up to Python 2.2.1)} + +Version 1 of the \module{email} package was bundled with Python +releases up to Python 2.2.1. Version 2 was developed for the Python +2.3 release, and backported to Python 2.2.2. It was also available as +a separate distutils based package. \module{email} version 2 is +almost entirely backwards compatible with version 1, with the +following differences: + +\begin{itemize} +\item The \module{email.Header} and \module{email.Charset} modules + have been added. +\item The pickle format for \class{Message} instances has changed. 
+ Since this was never (and still isn't) formally defined, this + isn't considered a backwards incompatibility. However if your + application pickles and unpickles \class{Message} instances, be + aware that in \module{email} version 2, \class{Message} + instances now have private variables \var{_charset} and + \var{_default_type}. +\item Several methods in the \class{Message} class have been + deprecated, or their signatures changed. Also, many new methods + have been added. See the documentation for the \class{Message} + class for details. The changes should be completely backwards + compatible. +\item The object structure has changed in the face of + \mimetype{message/rfc822} content types. In \module{email} + version 1, such a type would be represented by a scalar payload, + i.e. the container message's \method{is_multipart()} returned + false, \method{get_payload()} was not a list object, and was + actually a \class{Message} instance. + + This structure was inconsistent with the rest of the package, so + the object representation for \mimetype{message/rfc822} content + types was changed. In \module{email} version 2, the container + \emph{does} return \code{True} from \method{is_multipart()}, and + \method{get_payload()} returns a list containing a single + \class{Message} item. + + Note that this is one place that backwards compatibility could + not be completely maintained. However, if you're already + testing the return type of \method{get_payload()}, you should be + fine. You just need to make sure your code doesn't do a + \method{set_payload()} with a \class{Message} instance on a + container with a content type of \mimetype{message/rfc822}. +\item The \class{Parser} constructor's \var{strict} argument was + added, and its \method{parse()} and \method{parsestr()} methods + grew a \var{headersonly} argument. The \var{strict} flag was + also added to functions \function{email.message_from_file()} + and \function{email.message_from_string()}.
+\item \method{Generator.__call__()} is deprecated; use + \method{Generator.flatten()} instead. The \class{Generator} + class has also grown the \method{clone()} method. +\item The \class{DecodedGenerator} class in the + \module{email.Generator} module was added. +\item The intermediate base classes \class{MIMENonMultipart} and + \class{MIMEMultipart} have been added, and interposed in the + class hierarchy for most of the other MIME-related derived + classes. +\item The \var{_encoder} argument to the \class{MIMEText} constructor + has been deprecated. Encoding now happens implicitly based + on the \var{_charset} argument. +\item The following functions in the \module{email.Utils} module have + been deprecated: \function{dump_address_pairs()}, + \function{decode()}, and \function{encode()}. The following + functions have been added to the module: + \function{make_msgid()}, \function{decode_rfc2231()}, + \function{encode_rfc2231()}, and \function{decode_params()}. +\item The non-public function \function{email.Iterators._structure()} + was added. +\end{itemize} + \subsection{Differences from \module{mimelib}} The \module{email} package was originally prototyped as a separate diff --git a/Doc/lib/emailencoders.tex b/Doc/lib/emailencoders.tex index 3e247a9256..4b4e6370c9 100644 --- a/Doc/lib/emailencoders.tex +++ b/Doc/lib/emailencoders.tex @@ -17,8 +17,8 @@ set the \mailheader{Content-Transfer-Encoding} header as appropriate. Here are the encoding functions provided: \begin{funcdesc}{encode_quopri}{msg} -Encodes the payload into \emph{Quoted-Printable} form and sets the -\code{Content-Transfer-Encoding:} header to +Encodes the payload into quoted-printable form and sets the +\mailheader{Content-Transfer-Encoding} header to \code{quoted-printable}\footnote{Note that encoding with \method{encode_quopri()} also encodes all tabs and space characters in the data.}. @@ -27,11 +27,11 @@ printable data, but contains a few unprintable characters.
\end{funcdesc} \begin{funcdesc}{encode_base64}{msg} -Encodes the payload into \emph{Base64} form and sets the +Encodes the payload into base64 form and sets the \mailheader{Content-Transfer-Encoding} header to \code{base64}. This is a good encoding to use when most of your payload is unprintable data since it is a more compact form than -Quoted-Printable. The drawback of Base64 encoding is that it +quoted-printable. The drawback of base64 encoding is that it renders the text non-human readable. \end{funcdesc} diff --git a/Doc/lib/emailexc.tex b/Doc/lib/emailexc.tex index 492924462c..824a276f17 100644 --- a/Doc/lib/emailexc.tex +++ b/Doc/lib/emailexc.tex @@ -21,7 +21,7 @@ a message, this class is derived from \exception{MessageParseError}. It can be raised from the \method{Parser.parse()} or \method{Parser.parsestr()} methods. -Situations where it can be raised include finding a \emph{Unix-From} +Situations where it can be raised include finding an envelope header after the first \rfc{2822} header of the message, finding a continuation line before the first \rfc{2822} header is found, or finding a line in the headers which is neither a header or a continuation @@ -35,7 +35,8 @@ It can be raised from the \method{Parser.parse()} or \method{Parser.parsestr()} methods. Situations where it can be raised include not being able to find the -starting or terminating boundary in a \mimetype{multipart/*} message. +starting or terminating boundary in a \mimetype{multipart/*} message +when strict parsing is used. \end{excclassdesc} \begin{excclassdesc}{MultipartConversionError}{} @@ -45,4 +46,9 @@ message's \mailheader{Content-Type} main type is not either \mimetype{multipart} or missing. \exception{MultipartConversionError} multiply inherits from \exception{MessageError} and the built-in \exception{TypeError}. + +Since \method{Message.add_payload()} is deprecated, this exception is +rarely raised in practice. 
However the exception may also be raised +if the \method{attach()} method is called on an instance of a class +derived from \class{MIMENonMultipart} (e.g. \class{MIMEImage}). \end{excclassdesc} diff --git a/Doc/lib/emailgenerator.tex b/Doc/lib/emailgenerator.tex index 63ceb73d1d..03fee9f6cc 100644 --- a/Doc/lib/emailgenerator.tex +++ b/Doc/lib/emailgenerator.tex @@ -1,11 +1,11 @@ \declaremodule{standard}{email.Generator} -\modulesynopsis{Generate flat text email messages from a message object tree.} +\modulesynopsis{Generate flat text email messages from a message structure.} One of the most common tasks is to generate the flat text of the email -message represented by a message object tree. You will need to do +message represented by a message object structure. You will need to do this if you want to send your message via the \refmodule{smtplib} module or the \refmodule{nntplib} module, or print the message on the -console. Taking a message object tree and producing a flat text +console. Taking a message object structure and producing a flat text document is the job of the \class{Generator} class. Again, as with the \refmodule{email.Parser} module, you aren't limited @@ -13,10 +13,9 @@ to the functionality of the bundled generator; you could write one from scratch yourself. However the bundled generator knows how to generate most email in a standards-compliant way, should handle MIME and non-MIME email messages just fine, and is designed so that the -transformation from flat text, to an object tree via the -\class{Parser} class, -and back to flat text, is idempotent (the input is identical to the -output). +transformation from flat text, to a message structure via the +\class{Parser} class, and back to flat text, is idempotent (the input +is identical to the output). Here are the public methods of the \class{Generator} class: @@ -27,14 +26,16 @@ object called \var{outfp} for an argument. 
\var{outfp} must support the \method{write()} method and be usable as the output file in a Python 2.0 extended print statement. -Optional \var{mangle_from_} is a flag that, when true, puts a \samp{>} -character in front of any line in the body that starts exactly as +Optional \var{mangle_from_} is a flag that, when \code{True}, puts a +\samp{>} character in front of any line in the body that starts exactly as \samp{From } (i.e. \code{From} followed by a space at the front of the line). This is the only guaranteed portable way to avoid having such -lines be mistaken for \emph{Unix-From} headers (see +lines be mistaken for a Unix mailbox format envelope header separator (see \ulink{WHY THE CONTENT-LENGTH FORMAT IS BAD} {http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html} -for details). +for details). \var{mangle_from_} defaults to \code{True}, but you +might want to set this to \code{False} if you are not writing Unix +mailbox format files. Optional \var{maxheaderlen} specifies the longest length for a non-continued header. When a header line is longer than @@ -47,20 +48,28 @@ recommended (but not required) by \rfc{2822}. The other public \class{Generator} methods are: -\begin{methoddesc}[Generator]{__call__}{msg\optional{, unixfrom}} -Print the textual representation of the message object tree rooted at +\begin{methoddesc}[Generator]{flatten}{msg\optional{, unixfrom}} +Print the textual representation of the message object structure rooted at \var{msg} to the output file specified when the \class{Generator} instance was created. Sub-objects are visited depth-first and the resulting text will be properly MIME encoded. Optional \var{unixfrom} is a flag that forces the printing of the -\emph{Unix-From} (a.k.a. envelope header or \code{From_} header) -delimiter before the first \rfc{2822} header of the root message -object. If the root object has no \emph{Unix-From} header, a standard -one is crafted.
By default, this is set to 0 to inhibit the printing -of the \emph{Unix-From} delimiter. +envelope header delimiter before the first \rfc{2822} header of the +root message object. If the root object has no envelope header, a +standard one is crafted. By default, this is set to \code{False} to +inhibit the printing of the envelope delimiter. + +Note that for sub-objects, no envelope header is ever printed. + +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Generator]{clone}{fp} +Return an independent clone of this \class{Generator} instance with +the exact same options. -Note that for sub-objects, no \emph{Unix-From} header is ever printed. +\versionadded{2.2.2} \end{methoddesc} \begin{methoddesc}[Generator]{write}{s} @@ -74,3 +83,49 @@ As a convenience, see the methods \method{Message.as_string()} and \code{str(aMessage)}, a.k.a. \method{Message.__str__()}, which simplify the generation of a formatted string representation of a message object. For more detail, see \refmodule{email.Message}. + +The \module{email.Generator} module also provides a derived class, +called \class{DecodedGenerator} which is like the \class{Generator} +base class, except that non-\mimetype{text} parts are substituted with +a format string representing the part. + +\begin{classdesc}{DecodedGenerator}{outfp\optional{, mangle_from_\optional{, + maxheaderlen\optional{, fmt}}}} + +This class, derived from \class{Generator}, walks through all the +subparts of a message. If the subpart is of main type +\mimetype{text}, then it prints the decoded payload of the subpart. +Optional \var{mangle_from_} and \var{maxheaderlen} are as with the +\class{Generator} base class. + +If the subpart is not of main type \mimetype{text}, optional \var{fmt} +is a format string that is used instead of the message +payload.
\var{fmt} is expanded with the following keywords (in +\samp{\%(keyword)s} format): + +type : Full MIME type of the non-\mimetype{text} part +maintype : Main MIME type of the non-\mimetype{text} part +subtype : Sub-MIME type of the non-\mimetype{text} part +filename : Filename of the non-\mimetype{text} part +description: Description associated with the non-\mimetype{text} part +encoding : Content transfer encoding of the non-\mimetype{text} part + +The default value for \var{fmt} is \code{None}, meaning + +\begin{verbatim} +[Non-text (%(type)s) part of message omitted, filename %(filename)s] +\end{verbatim} + +\versionadded{2.2.2} +\end{classdesc} + +\subsubsection{Deprecated methods} + +The following methods are deprecated in \module{email} version 2. +They are documented here for completeness. + +\begin{methoddesc}[Generator]{__call__}{msg\optional{, unixfrom}} +This method is identical to the \method{flatten()} method. + +\deprecated{2.2.2}{Use the \method{flatten()} method instead.} +\end{methoddesc} diff --git a/Doc/lib/emailheaders.tex b/Doc/lib/emailheaders.tex new file mode 100644 index 0000000000..172e5d6539 --- /dev/null +++ b/Doc/lib/emailheaders.tex @@ -0,0 +1,409 @@ +\declaremodule{standard}{email.Header} +\modulesynopsis{Representing non-ASCII headers} + +\rfc{2822} is the base standard that describes the format of email +messages. It derives from the older \rfc{822} standard which came +into widespread use at a time when most email was composed of \ASCII{} +characters only. \rfc{2822} is a specification written assuming email +contains only 7-bit \ASCII{} characters. + +Of course, as email has been deployed worldwide, it has become +internationalized, such that language specific character sets can now +be used in email messages.
The base standard still requires email +messages to be transferred using only 7-bit \ASCII{} characters, so a +slew of RFCs have been written describing how to encode email +containing non-\ASCII{} characters into \rfc{2822}-compliant format. +These RFCs include \rfc{2045}, \rfc{2046}, \rfc{2047}, and \rfc{2231}. +The \module{email} package supports these standards in its +\module{email.Header} and \module{email.Charset} modules. + +If you want to include non-\ASCII{} characters in your email headers, +say in the \mailheader{Subject} or \mailheader{To} fields, you should +use the \class{Header} class (in module \module{email.Header}) and +assign the field in the \class{Message} object to an instance of +\class{Header} instead of using a string for the header value. For +example: + +\begin{verbatim} +>>> from email.Message import Message +>>> from email.Header import Header +>>> msg = Message() +>>> h = Header('p\xf6stal', 'iso-8859-1') +>>> msg['Subject'] = h +>>> print msg.as_string() +Subject: =?iso-8859-1?q?p=F6stal?= + + +\end{verbatim} + +Notice here how we wanted the \mailheader{Subject} field to contain a +non-\ASCII{} character? We did this by creating a \class{Header} +instance and passing in the character set that the byte string was +encoded in. When the subsequent \class{Message} instance was +flattened, the \mailheader{Subject} field was properly \rfc{2047} +encoded. MIME-aware mail readers would show this header using the +embedded ISO-8859-1 character. + +\versionadded{2.2.2} + +Here is the \class{Header} class description: + +\begin{classdesc}{Header}{\optional{s\optional{, charset\optional{, + maxlinelen\optional{, header_name\optional{, continuation_ws}}}}}} +Create a MIME-compliant header that can contain many character sets. + +Optional \var{s} is the initial header value. If \code{None} (the +default), the initial header value is not set. You can later append +to the header with \method{append()} method calls.
\var{s} may be a +byte string or a Unicode string, but see the \method{append()} +documentation for semantics. + +Optional \var{charset} serves two purposes: it has the same meaning as +the \var{charset} argument to the \method{append()} method. It also +sets the default character set for all subsequent \method{append()} +calls that omit the \var{charset} argument. If \var{charset} is not +provided in the constructor (the default), the \code{us-ascii} +character set is used both as \var{s}'s initial charset and as the +default for subsequent \method{append()} calls. + +The maximum line length can be specified explicitly via +\var{maxlinelen}. For splitting the first line to a shorter value (to +account for the field header which isn't included in \var{s}, +e.g. \mailheader{Subject}) pass in the name of the field in +\var{header_name}. The default \var{maxlinelen} is 76, and the +default value for \var{header_name} is \code{None}, meaning it is not +taken into account for the first line of a long, split header. + +Optional \var{continuation_ws} must be RFC 2822 compliant folding +whitespace, and is usually either a space or a hard tab character. +This character will be prepended to continuation lines. +\end{classdesc} + +\begin{methoddesc}[Header]{append}{s\optional{, charset}} +Append the string \var{s} to the MIME header. + +Optional \var{charset}, if given, should be a \class{Charset} instance +(see \refmodule{email.Charset}) or the name of a character set, which +will be converted to a \class{Charset} instance. A value of +\code{None} (the default) means that the \var{charset} given in the +constructor is used. + +\var{s} may be a byte string or a Unicode string. If it is a byte +string (i.e. \code{isinstance(s, StringType)} is true), then +\var{charset} is the encoding of that byte string, and a +\exception{UnicodeError} will be raised if the string cannot be +decoded with that character set.
+ +If \var{s} is a Unicode string, then \var{charset} is a hint +specifying the character set of the characters in the string. In this +case, when producing an \rfc{2822}-compliant header using \rfc{2047} +rules, the Unicode string will be encoded using the following charsets +in order: \code{us-ascii}, the \var{charset} hint, \code{utf-8}. The +first character set to not provoke a \exception{UnicodeError} is used. +\end{methoddesc} + +\begin{methoddesc}[Header]{encode}{} +Encode a message header into an RFC-compliant format, possibly +wrapping long lines and encapsulating non-\ASCII{} parts in base64 or +quoted-printable encodings. +\end{methoddesc} + +The \class{Header} class also provides a number of methods to support +standard operators and built-in functions. + +\begin{methoddesc}[Header]{__str__}{} +A synonym for \method{Header.encode()}. Useful for +\code{str(aHeader)} calls. +\end{methoddesc} + +\begin{methoddesc}[Header]{__unicode__}{} +A helper for the built-in \function{unicode()} function. Returns the +header as a Unicode string. +\end{methoddesc} + +\begin{methoddesc}[Header]{__eq__}{other} +This method allows you to compare two \class{Header} instances for equality. +\end{methoddesc} + +\begin{methoddesc}[Header]{__ne__}{other} +This method allows you to compare two \class{Header} instances for inequality. +\end{methoddesc} + +The \module{email.Header} module also provides the following +convenient functions. + +\begin{funcdesc}{decode_header}{header} +Decode a message header value without converting the character set. +The header value is in \var{header}. + +This function returns a list of \code{(decoded_string, charset)} pairs +containing each of the decoded parts of the header. \var{charset} is +\code{None} for non-encoded parts of the header, otherwise a lower +case string containing the name of the character set specified in the +encoded string. 
+ +Here's an example: + +\begin{verbatim} +>>> from email.Header import decode_header +>>> decode_header('=?iso-8859-1?q?p=F6stal?=') +[('p\xf6stal', 'iso-8859-1')] +\end{verbatim} +\end{funcdesc} + +\begin{funcdesc}{make_header}{decoded_seq\optional{, maxlinelen\optional{, + header_name\optional{, continuation_ws}}}} +Create a \class{Header} instance from a sequence of pairs as returned +by \function{decode_header()}. + +\function{decode_header()} takes a header value string and returns a +sequence of pairs of the format \code{(decoded_string, charset)} where +\var{charset} is the name of the character set. + +This function takes one of those sequences of pairs and returns a +\class{Header} instance. Optional \var{maxlinelen}, +\var{header_name}, and \var{continuation_ws} are as in the +\class{Header} constructor. +\end{funcdesc} + +\declaremodule{standard}{email.Charset} +\modulesynopsis{Character Sets} + +This module provides a class \class{Charset} for representing +character sets and character set conversions in email messages, as +well as a character set registry and several convenience methods for +manipulating this registry. Instances of \class{Charset} are used in +several other modules within the \module{email} package. + +\versionadded{2.2.2} + +\begin{classdesc}{Charset}{\optional{input_charset}} +Map character sets to their email properties. + +This class provides information about the requirements imposed on +email for a specific character set. It also provides convenience +routines for converting between character sets, given the availability +of the applicable codecs. Given a character set, it will do its best +to provide information on how to use that character set in an email +message in an RFC-compliant way. + +Certain character sets must be encoded with quoted-printable or base64 +when used in email headers or bodies. Certain character sets must be +converted outright, and are not allowed in email.
+ +Optional \var{input_charset} is as described below. After being alias +normalized it is also used as a lookup into the registry of character +sets to find out the header encoding, body encoding, and output +conversion codec to be used for the character set. For example, if +\var{input_charset} is \code{iso-8859-1}, then headers and bodies will +be encoded using quoted-printable and no output conversion codec is +necessary. If \var{input_charset} is \code{euc-jp}, then headers will +be encoded with base64, bodies will not be encoded, but output text +will be converted from the \code{euc-jp} character set to the +\code{iso-2022-jp} character set. +\end{classdesc} + +\class{Charset} instances have the following data attributes: + +\begin{datadesc}{input_charset} +The initial character set specified. Common aliases are converted to +their \emph{official} email names (e.g. \code{latin_1} is converted to +\code{iso-8859-1}). Defaults to 7-bit \code{us-ascii}. +\end{datadesc} + +\begin{datadesc}{header_encoding} +If the character set must be encoded before it can be used in an +email header, this attribute will be set to \code{Charset.QP} (for +quoted-printable), \code{Charset.BASE64} (for base64 encoding), or +\code{Charset.SHORTEST} for the shortest of QP or BASE64 encoding. +Otherwise, it will be \code{None}. +\end{datadesc} + +\begin{datadesc}{body_encoding} +Same as \var{header_encoding}, but describes the encoding for the +mail message's body, which indeed may be different than the header +encoding. \code{Charset.SHORTEST} is not allowed for +\var{body_encoding}. +\end{datadesc} + +\begin{datadesc}{output_charset} +Some character sets must be converted before they can be used in +email headers or bodies. If the \var{input_charset} is one of +them, this attribute will contain the name of the character set +output will be converted to. Otherwise, it will be \code{None}.
+\end{datadesc} + +\begin{datadesc}{input_codec} +The name of the Python codec used to convert the \var{input_charset} to +Unicode. If no conversion codec is necessary, this attribute will be +\code{None}. +\end{datadesc} + +\begin{datadesc}{output_codec} +The name of the Python codec used to convert Unicode to the +\var{output_charset}. If no conversion codec is necessary, this +attribute will have the same value as the \var{input_codec}. +\end{datadesc} + +\class{Charset} instances also have the following methods: + +\begin{methoddesc}[Charset]{get_body_encoding}{} +Return the content transfer encoding used for body encoding. + +This is either the string \samp{quoted-printable} or \samp{base64} +depending on the encoding used, or it is a function, in which case you +should call the function with a single argument, the Message object +being encoded. The function should then set the +\mailheader{Content-Transfer-Encoding} header itself to whatever is +appropriate. + +Returns the string \samp{quoted-printable} if +\var{body_encoding} is \code{QP}, returns the string +\samp{base64} if \var{body_encoding} is \code{BASE64}, and returns the +string \samp{7bit} otherwise. +\end{methoddesc} + +\begin{methoddesc}{convert}{s} +Convert the string \var{s} from the \var{input_codec} to the +\var{output_codec}. +\end{methoddesc} + +\begin{methoddesc}{to_splittable}{s} +Convert a possibly multibyte string to a safely splittable format. +\var{s} is the string to split. + +Uses the \var{input_codec} to try and convert the string to Unicode, +so it can be safely split on character boundaries (even for multibyte +characters). + +Returns the string as-is if it isn't known how to convert \var{s} to +Unicode with the \var{input_charset}. + +Characters that could not be converted to Unicode will be replaced +with the Unicode replacement character \character{U+FFFD}. 
+\end{methoddesc} + +\begin{methoddesc}{from_splittable}{ustr\optional{, to_output}} +Convert a splittable string back into an encoded string. \var{ustr} +is a Unicode string to ``unsplit''. + +This method uses the proper codec to try and convert the string from +Unicode back into an encoded format. Return the string as-is if it is +not Unicode, or if it could not be converted from Unicode. + +Characters that could not be converted from Unicode will be replaced +with an appropriate character (usually \character{?}). + +If \var{to_output} is \code{True} (the default), uses +\var{output_codec} to convert to an +encoded format. If \var{to_output} is \code{False}, it uses +\var{input_codec}. +\end{methoddesc} + +\begin{methoddesc}{get_output_charset}{} +Return the output character set. + +This is the \var{output_charset} attribute if that is not \code{None}, +otherwise it is \var{input_charset}. +\end{methoddesc} + +\begin{methoddesc}{encoded_header_len}{} +Return the length of the encoded header string, properly calculating +for quoted-printable or base64 encoding. +\end{methoddesc} + +\begin{methoddesc}{header_encode}{s\optional{, convert}} +Header-encode the string \var{s}. + +If \var{convert} is \code{True}, the string will be converted from the +input charset to the output charset automatically. This is not useful +for multibyte character sets, which have line length issues (multibyte +characters must be split on a character, not a byte boundary); use the +higher-level \class{Header} class to deal with these issues (see +\refmodule{email.Header}). \var{convert} defaults to \code{False}. + +The type of encoding (base64 or quoted-printable) will be based on +the \var{header_encoding} attribute. +\end{methoddesc} + +\begin{methoddesc}{body_encode}{s\optional{, convert}} +Body-encode the string \var{s}. + +If \var{convert} is \code{True} (the default), the string will be +converted from the input charset to output charset automatically. 
+Unlike \method{header_encode()}, there are no issues with byte +boundaries and multibyte charsets in email bodies, so this is usually +pretty safe. + +The type of encoding (base64 or quoted-printable) will be based on +the \var{body_encoding} attribute. +\end{methoddesc} + +The \class{Charset} class also provides a number of methods to support +standard operations and built-in functions. + +\begin{methoddesc}[Charset]{__str__}{} +Returns \var{input_charset} as a string coerced to lower case. +\end{methoddesc} + +\begin{methoddesc}[Charset]{__eq__}{other} +This method allows you to compare two \class{Charset} instances for equality. +\end{methoddesc} + +\begin{methoddesc}[Charset]{__ne__}{other} +This method allows you to compare two \class{Charset} instances for inequality. +\end{methoddesc} + +The \module{email.Charset} module also provides the following +functions for adding new entries to the global character set, alias, +and codec registries: + +\begin{funcdesc}{add_charset}{charset\optional{, header_enc\optional{, + body_enc\optional{, output_charset}}}} +Add character properties to the global registry. + +\var{charset} is the input character set, and must be the canonical +name of a character set. + +Optional \var{header_enc} and \var{body_enc} are each either +\code{Charset.QP} for quoted-printable, \code{Charset.BASE64} for +base64 encoding, \code{Charset.SHORTEST} for the shortest of +quoted-printable or base64 encoding, or \code{None} for no encoding. \code{SHORTEST} is +only valid for \var{header_enc}. They describe how message headers and +message bodies in the input charset are to be encoded. Default is no +encoding. + +Optional \var{output_charset} is the character set that the output +should be in. Conversions will proceed from input charset, to +Unicode, to the output charset when the method +\method{Charset.convert()} is called. The default is to output in the +same character set as the input.
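A minimal sketch of registering properties for a character set follows. It assumes the Python 3 spelling \module{email.charset}; the particular choice of encodings for \code{utf-8} is only an illustration, not a recommendation from this document. Note that the call changes the module-wide registry for the rest of the process.

```python
# Assumption: Python 3 spelling of the module (email.charset).
from email.charset import Charset, add_charset, QP, SHORTEST

# Illustrative choice: shortest encoding for headers, quoted-printable
# for bodies, and no conversion (output stays utf-8).
add_charset('utf-8', SHORTEST, QP, 'utf-8')

# Charset instances created afterwards pick up the registered properties.
cs = Charset('utf-8')
print(cs.header_encoding == SHORTEST)
print(cs.body_encoding == QP)
print(cs.get_body_encoding())
```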
+ +Both \var{charset} and \var{output_charset} must have Unicode +codec entries in the module's character set-to-codec mapping; use +\function{add_codec(charset, codecname)} to add codecs the module does +not know about. See the \refmodule{codecs} module's documentation for +more information. + +The global character set registry is kept in the module global +dictionary \code{CHARSETS}. +\end{funcdesc} + +\begin{funcdesc}{add_alias}{alias, canonical} +Add a character set alias. \var{alias} is the alias name, +e.g. \code{latin-1}. \var{canonical} is the character set's canonical +name, e.g. \code{iso-8859-1}. + +The global charset alias registry is kept in the module global +dictionary \code{ALIASES}. +\end{funcdesc} + +\begin{funcdesc}{add_codec}{charset, codecname} +Add a codec that maps characters in the given character set to and from +Unicode. + +\var{charset} is the canonical name of a character set. +\var{codecname} is the name of a Python codec, as appropriate for the +second argument to the \function{unicode()} built-in, or to the +\method{encode()} method of a Unicode string. +\end{funcdesc} diff --git a/Doc/lib/emailiter.tex b/Doc/lib/emailiter.tex index eed98bef92..9180ac293e 100644 --- a/Doc/lib/emailiter.tex +++ b/Doc/lib/emailiter.tex @@ -29,3 +29,35 @@ Thus, by default \function{typed_subpart_iterator()} returns each subpart that has a MIME type of \mimetype{text/*}. \end{funcdesc} +The following function has been added as a useful debugging tool. It +should \emph{not} be considered part of the supported public interface +for the package. + +\begin{funcdesc}{_structure}{msg\optional{, fp\optional{, level}}} +Prints an indented representation of the content types of the +message object structure.
For example: + +\begin{verbatim} +>>> msg = email.message_from_file(somefile) +>>> _structure(msg) +multipart/mixed + text/plain + text/plain + multipart/digest + message/rfc822 + text/plain + message/rfc822 + text/plain + message/rfc822 + text/plain + message/rfc822 + text/plain + message/rfc822 + text/plain + text/plain +\end{verbatim} + +Optional \var{fp} is a file-like object to print the output to. It +must be suitable for Python's extended print statement. \var{level} +is used internally. +\end{funcdesc} diff --git a/Doc/lib/emailmessage.tex b/Doc/lib/emailmessage.tex index 1abe68c5cf..271619d684 100644 --- a/Doc/lib/emailmessage.tex +++ b/Doc/lib/emailmessage.tex @@ -12,12 +12,12 @@ values where the field name and value are separated by a colon. The colon is not part of either the field name or the field value. Headers are stored and returned in case-preserving form but are -matched case-insensitively. There may also be a single -\emph{Unix-From} header, also known as the envelope header or the +matched case-insensitively. There may also be a single envelope +header, also known as the \emph{Unix-From} header or the \code{From_} header. The payload is either a string in the case of -simple message objects, a list of \class{Message} objects for -multipart MIME documents, or a single \class{Message} instance for -\mimetype{message/rfc822} type objects. +simple message objects or a list of \class{Message} objects for +MIME container documents (e.g. \mimetype{multipart/*} and +\mimetype{message/rfc822}). \class{Message} objects provide a mapping style interface for accessing the message headers, and an explicit interface for accessing @@ -35,82 +35,96 @@ The constructor takes no arguments. \begin{methoddesc}[Message]{as_string}{\optional{unixfrom}} Return the entire formatted message as a string. Optional \var{unixfrom}, when true, specifies to include the \emph{Unix-From} -envelope header; it defaults to 0. +envelope header; it defaults to \code{False}. 
\end{methoddesc} \begin{methoddesc}[Message]{__str__}{} -Equivalent to \method{aMessage.as_string(unixfrom=1)}. +Equivalent to \method{aMessage.as_string(unixfrom=True)}. \end{methoddesc} \begin{methoddesc}[Message]{is_multipart}{} -Return 1 if the message's payload is a list of sub-\class{Message} -objects, otherwise return 0. When \method{is_multipart()} returns 0, -the payload should either be a string object, or a single -\class{Message} instance. +Return \code{True} if the message's payload is a list of +sub-\class{Message} objects, otherwise return \code{False}. When +\method{is_multipart()} returns False, the payload should be a string +object. \end{methoddesc} \begin{methoddesc}[Message]{set_unixfrom}{unixfrom} -Set the \emph{Unix-From} (a.k.a envelope header or \code{From_} -header) to \var{unixfrom}, which should be a string. +Set the message's envelope header to \var{unixfrom}, which should be a string. \end{methoddesc} \begin{methoddesc}[Message]{get_unixfrom}{} -Return the \emph{Unix-From} header. Defaults to \code{None} if the -\emph{Unix-From} header was never set. -\end{methoddesc} - -\begin{methoddesc}[Message]{add_payload}{payload} -Add \var{payload} to the message object's existing payload. If, prior -to calling this method, the object's payload was \code{None} -(i.e. never before set), then after this method is called, the payload -will be the argument \var{payload}. - -If the object's payload was already a list -(i.e. \method{is_multipart()} returns 1), then \var{payload} is -appended to the end of the existing payload list. - -For any other type of existing payload, \method{add_payload()} will -transform the new payload into a list consisting of the old payload -and \var{payload}, but only if the document is already a MIME -multipart document. This condition is satisfied if the message's -\mailheader{Content-Type} header's main type is either -\mimetype{multipart}, or there is no \mailheader{Content-Type} -header. 
In any other situation, -\exception{MultipartConversionError} is raised. +Return the message's envelope header. Defaults to \code{None} if the +envelope header was never set. \end{methoddesc} \begin{methoddesc}[Message]{attach}{payload} -Synonymous with \method{add_payload()}. +Add the given payload to the current payload, which must be +\code{None} or a list of \class{Message} objects before the call. +After the call, the payload will always be a list of \class{Message} +objects. If you want to set the payload to a scalar object (e.g. a +string), use \method{set_payload()} instead. \end{methoddesc} \begin{methoddesc}[Message]{get_payload}{\optional{i\optional{, decode}}} -Return the current payload, which will be a list of \class{Message} -objects when \method{is_multipart()} returns 1, or a scalar (either a -string or a single \class{Message} instance) when -\method{is_multipart()} returns 0. +Return a reference to the current payload, which will be a list of +\class{Message} objects when \method{is_multipart()} is \code{True}, or a +string when \method{is_multipart()} is \code{False}. If the +payload is a list and you mutate the list object, you modify the +message's payload in place. -With optional \var{i}, \method{get_payload()} will return the +With optional argument \var{i}, \method{get_payload()} will return the \var{i}-th element of the payload, counting from zero, if -\method{is_multipart()} returns 1. An \exception{IndexError} will be raised -if \var{i} is less than 0 or greater than or equal to the number of -items in the payload. If the payload is scalar -(i.e. \method{is_multipart()} returns 0) and \var{i} is given, a +\method{is_multipart()} is \code{True}. An \exception{IndexError} +will be raised if \var{i} is less than 0 or greater than or equal to +the number of items in the payload. If the payload is a string +(i.e. \method{is_multipart()} is \code{False}) and \var{i} is given, a \exception{TypeError} is raised.
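The indexing rules for \method{get_payload()} can be sketched as follows. This is a small illustration under the assumption of a Python 3 interpreter, where the MIME helper classes live in \module{email.mime.multipart} and \module{email.mime.text} rather than under the module names used in this document.

```python
# Assumption: Python 3 module names (email.mime.*); this document
# describes the older email.MIMEMultipart / email.MIMEText spellings.
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

first = MIMEText('first part')
second = MIMEText('second part')

outer = MIMEMultipart()
outer.attach(first)
outer.attach(second)

# With no index, a multipart message returns its list of subparts.
parts = outer.get_payload()

# With an index, a single subpart is returned, counting from zero.
picked = outer.get_payload(1)

# An index past the end of the list raises IndexError.
try:
    outer.get_payload(5)
    index_error = False
except IndexError:
    index_error = True

# Indexing a string payload raises TypeError.
try:
    first.get_payload(0)
    type_error = False
except TypeError:
    type_error = True

print(len(parts), picked is second, index_error, type_error)
```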
Optional \var{decode} is a flag indicating whether the payload should be decoded or not, according to the \mailheader{Content-Transfer-Encoding} header. -When true and the message is not a multipart, the payload will be +When \code{True} and the message is not a multipart, the payload will be decoded if this header's value is \samp{quoted-printable} or \samp{base64}. If some other encoding is used, or \mailheader{Content-Transfer-Encoding} header is missing, the payload is returned as-is (undecoded). If the message is -a multipart and the \var{decode} flag is true, then \code{None} is -returned. +a multipart and the \var{decode} flag is \code{True}, then \code{None} is +returned. The default for \var{decode} is \code{False}. \end{methoddesc} -\begin{methoddesc}[Message]{set_payload}{payload} +\begin{methoddesc}[Message]{set_payload}{payload\optional{, charset}} Set the entire message object's payload to \var{payload}. It is the -client's responsibility to ensure the payload invariants. +client's responsibility to ensure the payload invariants. Optional +\var{charset} sets the message's default character set (see +\method{set_charset()} for details). + +\versionchanged[\var{charset} argument added]{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{set_charset}{charset} +Set the character set of the payload to \var{charset}, which can +either be a \class{Charset} instance (see \refmodule{email.Charset}), a +string naming a character set, +or \code{None}. If it is a string, it will be converted to a +\class{Charset} instance. If \var{charset} is \code{None}, the +\code{charset} parameter will be removed from the +\mailheader{Content-Type} header. Anything else will generate a +\exception{TypeError}. + +The message will be assumed to be of type \mimetype{text/*} encoded with +\code{charset.input_charset}. It will be converted to +\code{charset.output_charset} +and encoded properly, if needed, when generating the plain text +representation of the message.
MIME headers +(\mailheader{MIME-Version}, \mailheader{Content-Type}, +\mailheader{Content-Transfer-Encoding}) will be added as needed. + +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_charset}{} +Return the \class{Charset} instance associated with the message's payload. +\versionadded{2.2.2} \end{methoddesc} The following methods implement a mapping-like interface for accessing @@ -123,8 +137,8 @@ in dictionaries there is no guaranteed order to the keys returned by order. These semantic differences are intentional and are biased toward maximal convenience. -Note that in all cases, any optional \emph{Unix-From} header the message -may have is not included in the mapping interface. +Note that in all cases, any envelope header present in the message is +not included in the mapping interface. \begin{methoddesc}[Message]{__len__}{} Return the total number of headers, including duplicates. @@ -177,32 +191,32 @@ present in the headers. \end{methoddesc} \begin{methoddesc}[Message]{has_key}{name} -Return 1 if the message contains a header field named \var{name}, -otherwise return 0. +Return true if the message contains a header field named \var{name}, +otherwise return false. \end{methoddesc} \begin{methoddesc}[Message]{keys}{} Return a list of all the message's header field names. These keys -will be sorted in the order in which they were added to the message -via \method{__setitem__()}, and may contain duplicates. Any fields -deleted and then subsequently re-added are always appended to the end -of the header list. +will be sorted in the order in which they appeared in the original +message, or were added to the message and may contain +duplicates. Any fields deleted and then subsequently re-added are +always appended to the end of the header list. \end{methoddesc} \begin{methoddesc}[Message]{values}{} Return a list of all the message's field values. 
These will be sorted -in the order in which they were added to the message via -\method{__setitem__()}, and may contain duplicates. Any fields -deleted and then subsequently re-added are always appended to the end -of the header list. +in the order in which they appeared in the original message, or were +added to the message, and may contain +duplicates. Any fields deleted and then subsequently re-added are +always appended to the end of the header list. \end{methoddesc} \begin{methoddesc}[Message]{items}{} -Return a list of 2-tuples containing all the message's field headers and -values. These will be sorted in the order in which they were added to -the message via \method{__setitem__()}, and may contain duplicates. -Any fields deleted and then subsequently re-added are always appended -to the end of the header list. +Return a list of 2-tuples containing all the message's field headers +and values. These will be sorted in the order in which they appeared +in the original message, or were added to the message, and may contain +duplicates. Any fields deleted and then subsequently re-added are +always appended to the end of the header list. \end{methoddesc} \begin{methoddesc}[Message]{get}{name\optional{, failobj}} @@ -215,10 +229,9 @@ Here are some additional useful methods: \begin{methoddesc}[Message]{get_all}{name\optional{, failobj}} Return a list of all the values for the field named \var{name}. These -will be sorted in the order in which they were added to the message -via \method{__setitem__()}. Any fields -deleted and then subsequently re-added are always appended to the end -of the list. +will be sorted in the order in which they appeared in the original +message, or were added to the message. Any fields deleted and then +subsequently re-added are always appended to the end of the list. If there are no such named headers in the message, \var{failobj} is returned (defaults to \code{None}). @@ -227,8 +240,8 @@ returned (defaults to \code{None}). 
\begin{methoddesc}[Message]{add_header}{_name, _value, **_params} Extended header setting. This method is similar to \method{__setitem__()} except that additional header parameters can be -provided as keyword arguments. \var{_name} is the header to set and -\var{_value} is the \emph{primary} value for the header. +provided as keyword arguments. \var{_name} is the header field to add +and \var{_value} is the \emph{primary} value for the header. For each item in the keyword argument dictionary \var{_params}, the key is taken as the parameter name, with underscores converted to @@ -249,43 +262,84 @@ Content-Disposition: attachment; filename="bud.gif" \end{verbatim} \end{methoddesc} -\begin{methoddesc}[Message]{get_type}{\optional{failobj}} -Return the message's content type, as a string of the form -\mimetype{maintype/subtype} as taken from the -\mailheader{Content-Type} header. -The returned string is coerced to lowercase. +\begin{methoddesc}[Message]{replace_header}{_name, _value} +Replace a header. Replace the first header found in the message that +matches \var{_name}, retaining header order and field name case. If +no matching header was found, a \exception{KeyError} is raised. -If there is no \mailheader{Content-Type} header in the message, -\var{failobj} is returned (defaults to \code{None}). +\versionadded{2.2.2} \end{methoddesc} -\begin{methoddesc}[Message]{get_main_type}{\optional{failobj}} -Return the message's \emph{main} content type. This essentially returns the -\var{maintype} part of the string returned by \method{get_type()}, with the -same semantics for \var{failobj}. +\begin{methoddesc}[Message]{get_content_type}{} +Return the message's content type. The returned string is coerced to +lower case and is of the form \mimetype{maintype/subtype}. If there was no +\mailheader{Content-Type} header in the message, the default type as +given by \method{get_default_type()} will be returned.
Since +according to \rfc{2045}, messages always have a default type, +\method{get_content_type()} will always return a value. + +\rfc{2045} defines a message's default type to be +\mimetype{text/plain} unless it appears inside a +\mimetype{multipart/digest} container, in which case it would be +\mimetype{message/rfc822}. If the \mailheader{Content-Type} header +has an invalid type specification, \rfc{2045} mandates that the +default type be \mimetype{text/plain}. + +\versionadded{2.2.2} \end{methoddesc} -\begin{methoddesc}[Message]{get_subtype}{\optional{failobj}} -Return the message's sub-content type. This essentially returns the -\var{subtype} part of the string returned by \method{get_type()}, with the -same semantics for \var{failobj}. +\begin{methoddesc}[Message]{get_content_maintype}{} +Return the message's main content type. This is the +\mimetype{maintype} part of the string returned by +\method{get_content_type()}. + +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_content_subtype}{} +Return the message's sub-content type. This is the \mimetype{subtype} +part of the string returned by \method{get_content_type()}. + +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_default_type}{} +Return the default content type. Most messages have a default content +type of \mimetype{text/plain}, except for messages that are subparts +of \mimetype{multipart/digest} containers. Such subparts have a +default content type of \mimetype{message/rfc822}. + +\versionadded{2.2.2} \end{methoddesc} -\begin{methoddesc}[Message]{get_params}{\optional{failobj\optional{, header}}} +\begin{methoddesc}[Message]{set_default_type}{ctype} +Set the default content type. \var{ctype} should either be +\mimetype{text/plain} or \mimetype{message/rfc822}, although this is +not enforced. The default content type is not stored in the +\mailheader{Content-Type} header. 
+ +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_params}{\optional{failobj\optional{, + header\optional{, unquote}}}} Return the message's \mailheader{Content-Type} parameters, as a list. The elements of the returned list are 2-tuples of key/value pairs, as split on the \character{=} sign. The left hand side of the \character{=} is the key, while the right hand side is the value. If there is no \character{=} sign in the parameter the value is the empty -string. The value is always unquoted with \method{Utils.unquote()}. +string, otherwise the value is as described in \method{get_param()} and is +unquoted if optional \var{unquote} is \code{True} (the default). Optional \var{failobj} is the object to return if there is no \mailheader{Content-Type} header. Optional \var{header} is the header to search instead of \mailheader{Content-Type}. + +\versionchanged[\var{unquote} argument added]{2.2.2} \end{methoddesc} \begin{methoddesc}[Message]{get_param}{param\optional{, - failobj\optional{, header}}} + failobj\optional{, header\optional{, unquote}}}} Return the value of the \mailheader{Content-Type} header's parameter \var{param} as a string. If the message has no \mailheader{Content-Type} header or if there is no such parameter, then \var{failobj} is @@ -293,20 +347,80 @@ returned (defaults to \code{None}). Optional \var{header} if given, specifies the message header to use instead of \mailheader{Content-Type}. + +Parameter keys are always compared case insensitively. The return +value can either be a string, or a 3-tuple if the parameter was +\rfc{2231} encoded. When it's a 3-tuple, the elements of the value are of +the form \samp{(CHARSET, LANGUAGE, VALUE)}, where \var{LANGUAGE} may +be the empty string. 
Your application should be prepared to deal with +3-tuple return values, which it can convert to a Unicode +string like so: + +\begin{verbatim} +param = msg.get_param('foo') +if isinstance(param, tuple): +    param = unicode(param[2], param[0]) +\end{verbatim} + +In any case, the parameter value (either the returned string, or the +\var{VALUE} item in the 3-tuple) is always unquoted, unless +\var{unquote} is set to \code{False}. + +\versionchanged[\var{unquote} argument added, and 3-tuple return value +possible]{2.2.2} \end{methoddesc} -\begin{methoddesc}[Message]{get_charsets}{\optional{failobj}} -Return a list containing the character set names in the message. If -the message is a \mimetype{multipart}, then the list will contain one -element for each subpart in the payload, otherwise, it will be a list -of length 1. +\begin{methoddesc}[Message]{set_param}{param, value\optional{, + header\optional{, requote\optional{, charset\optional{, language}}}}} -Each item in the list will be a string which is the value of the -\code{charset} parameter in the \mailheader{Content-Type} header for the -represented subpart. However, if the subpart has no -\mailheader{Content-Type} header, no \code{charset} parameter, or is not of -the \mimetype{text} main MIME type, then that item in the returned list -will be \var{failobj}. +Set a parameter in the \mailheader{Content-Type} header. If the +parameter already exists in the header, its value will be replaced +with \var{value}. If the \mailheader{Content-Type} header has not yet +been defined for this message, it will be set to \mimetype{text/plain} +and the new parameter value will be appended as per \rfc{2045}. + +Optional \var{header} specifies an alternative header to +\mailheader{Content-Type}, and all parameters will be quoted as +necessary unless optional \var{requote} is \code{False} (the default +is \code{True}). + +If optional \var{charset} is specified, the parameter will be encoded +according to \rfc{2231}.
Optional \var{language} specifies the RFC +2231 language, defaulting to the empty string. Both \var{charset} and +\var{language} should be strings. + +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{del_param}{param\optional{, header\optional{, + requote}}} +Remove the given parameter completely from the +\mailheader{Content-Type} header. The header will be re-written in +place without the parameter or its value. All values will be quoted +as necessary unless \var{requote} is \code{False} (the default is +\code{True}). Optional \var{header} specifies an alternative to +\mailheader{Content-Type}. + +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{set_type}{type\optional{, header\optional{, + requote}}} +Set the main type and subtype for the \mailheader{Content-Type} +header. \var{type} must be a string in the form +\mimetype{maintype/subtype}, otherwise a \exception{ValueError} is +raised. + +This method replaces the \mailheader{Content-Type} header, keeping all +the parameters in place. If \var{requote} is \code{False}, this +leaves the existing header's quoting as is, otherwise the parameters +will be quoted (the default). + +An alternative header can be specified in the \var{header} argument. +When the \mailheader{Content-Type} header is set, we'll always also +add a \mailheader{MIME-Version} header. + +\versionadded{2.2.2} \end{methoddesc} \begin{methoddesc}[Message]{get_filename}{\optional{failobj}} @@ -340,6 +454,32 @@ However, it does \emph{not} preserve any continuation lines which may have been present in the original \mailheader{Content-Type} header. \end{methoddesc} +\begin{methoddesc}[Message]{get_content_charset}{\optional{failobj}} +Return the \code{charset} parameter of the \mailheader{Content-Type} +header. If there is no \mailheader{Content-Type} header, or if that +header has no \code{charset} parameter, \var{failobj} is returned.
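A short sketch of \method{get_content_charset()} and its \var{failobj} fallback is shown here. It assumes a Python 3 interpreter, where \class{Message} is importable from \module{email.message}; the sample header text is made up for the illustration.

```python
# Assumption: Python 3 module layout (email.message, email.message_from_string).
from email import message_from_string
from email.message import Message

# Hypothetical message text, used only for this illustration.
text = 'Content-Type: text/plain; charset="iso-8859-1"\n\nhello\n'
parsed = message_from_string(text)

# The charset parameter is read from the Content-Type header.
found = parsed.get_content_charset()

# A message with no Content-Type header falls back to failobj.
empty = Message()
fallback = empty.get_content_charset('us-ascii')

print(found, fallback)
```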
+ +Note that this method differs from \method{get_charset} which returns +the \class{Charset} instance for the default encoding of the message +body. + +\versionadded{2.2.2} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_charsets}{\optional{failobj}} +Return a list containing the character set names in the message. If +the message is a \mimetype{multipart}, then the list will contain one +element for each subpart in the payload, otherwise, it will be a list +of length 1. + +Each item in the list will be a string which is the value of the +\code{charset} parameter in the \mailheader{Content-Type} header for the +represented subpart. However, if the subpart has no +\mailheader{Content-Type} header, no \code{charset} parameter, or is not of +the \mimetype{text} main MIME type, then that item in the returned list +will be \var{failobj}. +\end{methoddesc} + \begin{methoddesc}[Message]{walk}{} The \method{walk()} method is an all-purpose generator which can be used to iterate over all the parts and subparts of a message object @@ -380,7 +520,8 @@ the headers but before the first boundary string, it assigns this text to the message's \var{preamble} attribute. When the \class{Generator} is writing out the plain text representation of a MIME message, and it finds the message has a \var{preamble} attribute, it will write this -text in the area between the headers and the first boundary. +text in the area between the headers and the first boundary. See +\refmodule{email.Parser} and \refmodule{email.Generator} for details. Note that if the message object has no preamble, the \var{preamble} attribute will be \code{None}. @@ -401,3 +542,59 @@ practical sense. The upshot is that if you want to ensure that a newline get printed after your closing \mimetype{multipart} boundary, set the \var{epilogue} to the empty string. \end{datadesc} + +\subsubsection{Deprecated methods} + +The following methods are deprecated in \module{email} version 2. 
+They are documented here for completeness. + +\begin{methoddesc}[Message]{add_payload}{payload} +Add \var{payload} to the message object's existing payload. If, prior +to calling this method, the object's payload was \code{None} +(i.e. never before set), then after this method is called, the payload +will be the argument \var{payload}. + +If the object's payload was already a list +(i.e. \method{is_multipart()} returns 1), then \var{payload} is +appended to the end of the existing payload list. + +For any other type of existing payload, \method{add_payload()} will +transform the new payload into a list consisting of the old payload +and \var{payload}, but only if the document is already a MIME +multipart document. This condition is satisfied if the message's +\mailheader{Content-Type} header's main type is either +\mimetype{multipart}, or there is no \mailheader{Content-Type} +header. In any other situation, +\exception{MultipartConversionError} is raised. + +\deprecated{2.2.2}{Use the \method{attach()} method instead.} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_type}{\optional{failobj}} +Return the message's content type, as a string of the form +\mimetype{maintype/subtype} as taken from the +\mailheader{Content-Type} header. +The returned string is coerced to lowercase. + +If there is no \mailheader{Content-Type} header in the message, +\var{failobj} is returned (defaults to \code{None}). + +\deprecated{2.2.2}{Use the \method{get_content_type()} method instead.} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_main_type}{\optional{failobj}} +Return the message's \emph{main} content type. This essentially returns the +\var{maintype} part of the string returned by \method{get_type()}, with the +same semantics for \var{failobj}. + +\deprecated{2.2.2}{Use the \method{get_content_maintype()} method instead.} +\end{methoddesc} + +\begin{methoddesc}[Message]{get_subtype}{\optional{failobj}} +Return the message's sub-content type. 
This essentially returns the +\var{subtype} part of the string returned by \method{get_type()}, with the +same semantics for \var{failobj}. + +\deprecated{2.2.2}{Use the \method{get_content_subtype()} method instead.} +\end{methoddesc} + diff --git a/Doc/lib/emailmimebase.tex b/Doc/lib/emailmimebase.tex new file mode 100644 index 0000000000..97c3edac20 --- /dev/null +++ b/Doc/lib/emailmimebase.tex @@ -0,0 +1,159 @@ +Ordinarily, you get a message object structure by passing a file or +some text to a parser, which parses the text and returns the root of +the message object structure. However, you can also build a complete +object structure from scratch, or even individual \class{Message} +objects by hand. In fact, you can also take an existing structure and +add new \class{Message} objects, move them around, etc. This makes a +very convenient interface for slicing-and-dicing MIME messages. + +You can create a new object structure by creating \class{Message} +instances, adding attachments and all the appropriate headers manually. +For MIME messages though, the \module{email} package provides some +convenient subclasses to make things easier. Each of these classes +should be imported from a module with the same name as the class, from +within the \module{email} package. E.g.: + +\begin{verbatim} +from email.MIMEImage import MIMEImage +\end{verbatim} + +or + +\begin{verbatim} +from email.MIMEText import MIMEText +\end{verbatim} + +Here are the classes: + +\begin{classdesc}{MIMEBase}{_maintype, _subtype, **_params} +This is the base class for all the MIME-specific subclasses of +\class{Message}. Ordinarily you won't create instances specifically +of \class{MIMEBase}, although you could. \class{MIMEBase} is provided +primarily as a convenient base class for more specific MIME-aware +subclasses. + +\var{_maintype} is the \mailheader{Content-Type} major type +(e.g. \mimetype{text} or \mimetype{image}), and \var{_subtype} is the +\mailheader{Content-Type} minor type +(e.g. 
\mimetype{plain} or \mimetype{gif}). \var{_params} is a parameter +key/value dictionary and is passed directly to +\method{Message.add_header()}. + +The \class{MIMEBase} class always adds a \mailheader{Content-Type} header +(based on \var{_maintype}, \var{_subtype}, and \var{_params}), and a +\mailheader{MIME-Version} header (always set to \code{1.0}). +\end{classdesc} + +\begin{classdesc}{MIMENonMultipart}{} +A subclass of \class{MIMEBase}, this is an intermediate base class for +MIME messages that are not \mimetype{multipart}. The primary purpose +of this class is to prevent the use of the \method{attach()} method, +which only makes sense for \mimetype{multipart} messages. If +\method{attach()} is called, a \exception{MultipartConversionError} +exception is raised. + +\versionadded{2.2.2} +\end{classdesc} + +\begin{classdesc}{MIMEMultipart}{\optional{_subtype\optional{, + boundary\optional{, _subparts\optional{, _params}}}}} + +A subclass of \class{MIMEBase}, this is an intermediate base class for +MIME messages that are \mimetype{multipart}. Optional \var{_subtype} +defaults to \mimetype{mixed}, but can be used to specify the subtype +of the message. A \mailheader{Content-Type} header of +\mimetype{multipart/}\var{_subtype} will be added to the message +object. A \mailheader{MIME-Version} header will also be added. + +Optional \var{boundary} is the multipart boundary string. When +\code{None} (the default), the boundary is calculated when needed. + +\var{_subparts} is a sequence of initial subparts for the payload. It +must be possible to convert this sequence to a list. You can always +attach new subparts to the message by using the +\method{Message.attach()} method. + +Additional parameters for the \mailheader{Content-Type} header are +taken from the keyword arguments, or passed into the \var{_params} +argument, which is a keyword dictionary. 
+ +\versionadded{2.2.2} +\end{classdesc} + +\begin{classdesc}{MIMEAudio}{_audiodata\optional{, _subtype\optional{, + _encoder\optional{, **_params}}}} + +A subclass of \class{MIMENonMultipart}, the \class{MIMEAudio} class +is used to create MIME message objects of major type \mimetype{audio}. +\var{_audiodata} is a string containing the raw audio data. If this +data can be decoded by the standard Python module \refmodule{sndhdr}, +then the subtype will be automatically included in the +\mailheader{Content-Type} header. Otherwise you can explicitly specify the +audio subtype via the \var{_subtype} parameter. If the minor type could +not be guessed and \var{_subtype} was not given, then \exception{TypeError} +is raised. + +Optional \var{_encoder} is a callable (i.e. function) which will +perform the actual encoding of the audio data for transport. This +callable takes one argument, which is the \class{MIMEAudio} instance. +It should use \method{get_payload()} and \method{set_payload()} to +change the payload to encoded form. It should also add any +\mailheader{Content-Transfer-Encoding} or other headers to the message +object as necessary. The default encoding is \emph{Base64}. See the +\refmodule{email.Encoders} module for a list of the built-in encoders. + +\var{_params} are passed straight through to the base class constructor. +\end{classdesc} + +\begin{classdesc}{MIMEImage}{_imagedata\optional{, _subtype\optional{, + _encoder\optional{, **_params}}}} + +A subclass of \class{MIMENonMultipart}, the \class{MIMEImage} class is +used to create MIME message objects of major type \mimetype{image}. +\var{_imagedata} is a string containing the raw image data. If this +data can be decoded by the standard Python module \refmodule{imghdr}, +then the subtype will be automatically included in the +\mailheader{Content-Type} header. Otherwise you can explicitly specify the +image subtype via the \var{_subtype} parameter. 
If the minor type could +not be guessed and \var{_subtype} was not given, then \exception{TypeError} +is raised. + +Optional \var{_encoder} is a callable (i.e. function) which will +perform the actual encoding of the image data for transport. This +callable takes one argument, which is the \class{MIMEImage} instance. +It should use \method{get_payload()} and \method{set_payload()} to +change the payload to encoded form. It should also add any +\mailheader{Content-Transfer-Encoding} or other headers to the message +object as necessary. The default encoding is \emph{Base64}. See the +\refmodule{email.Encoders} module for a list of the built-in encoders. + +\var{_params} are passed straight through to the \class{MIMEBase} +constructor. +\end{classdesc} + +\begin{classdesc}{MIMEMessage}{_msg\optional{, _subtype}} +A subclass of \class{MIMENonMultipart}, the \class{MIMEMessage} class +is used to create MIME objects of main type \mimetype{message}. +\var{_msg} is used as the payload, and must be an instance of class +\class{Message} (or a subclass thereof), otherwise a +\exception{TypeError} is raised. + +Optional \var{_subtype} sets the subtype of the message; it defaults +to \mimetype{rfc822}. +\end{classdesc} + +\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{, + _charset\optional{, _encoder}}}} + +A subclass of \class{MIMENonMultipart}, the \class{MIMEText} class is +used to create MIME objects of major type \mimetype{text}. +\var{_text} is the string for the payload. \var{_subtype} is the +minor type and defaults to \mimetype{plain}. \var{_charset} is the +character set of the text and is passed as a parameter to the +\class{MIMENonMultipart} constructor; it defaults to \code{us-ascii}. No +guessing or encoding is performed on the text data, but a newline is +appended to \var{_text} if it doesn't already end with a newline. + +\deprecated{2.2.2}{The \var{_encoder} argument has been deprecated. 
+Encoding now happens implicitly based on the \var{_charset} argument.} +\end{classdesc} diff --git a/Doc/lib/emailparser.tex b/Doc/lib/emailparser.tex index 40ce853028..b5d9900497 100644 --- a/Doc/lib/emailparser.tex +++ b/Doc/lib/emailparser.tex @@ -1,20 +1,20 @@ \declaremodule{standard}{email.Parser} \modulesynopsis{Parse flat text email messages to produce a message - object tree.} + object structure.} -Message object trees can be created in one of two ways: they can be +Message object structures can be created in one of two ways: they can be created from whole cloth by instantiating \class{Message} objects and -stringing them together via \method{add_payload()} and +stringing them together via \method{attach()} and \method{set_payload()} calls, or they can be created by parsing a flat text representation of the email message. The \module{email} package provides a standard parser that understands most email document structures, including MIME documents. You can pass the parser a string or a file object, and the parser will return -to you the root \class{Message} instance of the object tree. For +to you the root \class{Message} instance of the object structure. For simple, non-MIME messages the payload of this root object will likely be a string containing the text of the message. For MIME -messages, the root object will return true from its +messages, the root object will return \code{True} from its \method{is_multipart()} method, and the subparts can be accessed via the \method{get_payload()} and \method{walk()} methods. @@ -27,28 +27,46 @@ message object trees any way it finds necessary. The primary parser class is \class{Parser} which parses both the headers and the payload of the message. In the case of \mimetype{multipart} messages, it will recursively parse the body of -the container message. 
The \module{email.Parser} module also provides -a second class, called \class{HeaderParser} which can be used if -you're only interested in the headers of the message. -\class{HeaderParser} can be much faster in this situations, since it -does not attempt to parse the message body, instead setting the -payload to the raw body as a string. \class{HeaderParser} has the -same API as the \class{Parser} class. +the container message. Two modes of parsing are supported, +\emph{strict} parsing, which will usually reject any non-RFC compliant +message, and \emph{lax} parsing, which attempts to adjust for common +MIME formatting problems. + +The \module{email.Parser} module also provides a second class, called +\class{HeaderParser} which can be used if you're only interested in +the headers of the message. \class{HeaderParser} can be much faster in +these situations, since it does not attempt to parse the message body, +instead setting the payload to the raw body as a string. +\class{HeaderParser} has the same API as the \class{Parser} class. \subsubsection{Parser class API} -\begin{classdesc}{Parser}{\optional{_class}} -The constructor for the \class{Parser} class takes a single optional +\begin{classdesc}{Parser}{\optional{_class\optional{, strict}}} +The constructor for the \class{Parser} class takes an optional argument \var{_class}. This must be a callable factory (such as a function or a class), and it is used whenever a sub-message object needs to be created. It defaults to \class{Message} (see \refmodule{email.Message}). The factory will be called without arguments. + +The optional \var{strict} flag specifies whether strict or lax parsing +should be performed. Normally, when things like MIME terminating +boundaries are missing, or when messages contain other formatting +problems, the \class{Parser} will raise a +\exception{MessageParseError}. 
However, when lax parsing is enabled, +the \class{Parser} will attempt to work around such broken formatting +to produce a usable message structure (this doesn't mean +\exception{MessageParseError}s are never raised; some ill-formatted +messages just can't be parsed). The \var{strict} flag defaults to +\code{False} since lax parsing usually provides the most convenient +behavior. + +\versionchanged[The \var{strict} flag was added]{2.2.2} \end{classdesc} The other public \class{Parser} methods are: -\begin{methoddesc}[Parser]{parse}{fp} +\begin{methoddesc}[Parser]{parse}{fp\optional{, headersonly}} Read all the data from the file-like object \var{fp}, parse the resulting text, and return the root message object. \var{fp} must support both the \method{readline()} and the \method{read()} methods @@ -56,32 +74,49 @@ on file-like objects. The text contained in \var{fp} must be formatted as a block of \rfc{2822} style headers and header continuation lines, optionally preceeded by a -\emph{Unix-From} header. The header block is terminated either by the +envelope header. The header block is terminated either by the end of the data or by a blank line. Following the header block is the body of the message (which may contain MIME-encoded subparts). + +Optional \var{headersonly} is a flag specifying whether to stop +parsing after reading the headers or not. The default is \code{False}, +meaning it parses the entire contents of the file. + +\versionchanged[The \var{headersonly} flag was added]{2.2.2} \end{methoddesc} -\begin{methoddesc}[Parser]{parsestr}{text} +\begin{methoddesc}[Parser]{parsestr}{text\optional{, headersonly}} Similar to the \method{parse()} method, except it takes a string object instead of a file-like object. Calling this method on a string is exactly equivalent to wrapping \var{text} in a \class{StringIO} instance first and calling \method{parse()}. + +Optional \var{headersonly} is a flag specifying whether to stop +parsing after reading the headers or not. 
The default is \code{False}, +meaning it parses the entire contents of the string. + +\versionchanged[The \var{headersonly} flag was added]{2.2.2} \end{methoddesc} -Since creating a message object tree from a string or a file object is -such a common task, two functions are provided as a convenience. They -are available in the top-level \module{email} package namespace. +Since creating a message object structure from a string or a file +object is such a common task, two functions are provided as a +convenience. They are available in the top-level \module{email} +package namespace. -\begin{funcdesc}{message_from_string}{s\optional{, _class}} +\begin{funcdesc}{message_from_string}{s\optional{, _class\optional{, strict}}} Return a message object tree from a string. This is exactly -equivalent to \code{Parser().parsestr(s)}. Optional \var{_class} is -interpreted as with the \class{Parser} class constructor. +equivalent to \code{Parser().parsestr(s)}. Optional \var{_class} and +\var{strict} are interpreted as with the \class{Parser} class constructor. + +\versionchanged[The \var{strict} flag was added]{2.2.2} \end{funcdesc} -\begin{funcdesc}{message_from_file}{fp\optional{, _class}} +\begin{funcdesc}{message_from_file}{fp\optional{, _class\optional{, strict}}} Return a message object tree from an open file object. This is exactly -equivalent to \code{Parser().parse(fp)}. Optional \var{_class} is -interpreted as with the \class{Parser} class constructor. +equivalent to \code{Parser().parse(fp)}. Optional \var{_class} and +\var{strict} are interpreted as with the \class{Parser} class constructor. 
+ +\versionchanged[The \var{strict} flag was added]{2.2.2} \end{funcdesc} Here's an example of how you might use this at an interactive Python @@ -99,15 +134,17 @@ Here are some notes on the parsing semantics: \begin{itemize} \item Most non-\mimetype{multipart} type messages are parsed as a single message object with a string payload. 
These objects will return - 0 for \method{is_multipart()}. -\item One exception is for \mimetype{message/delivery-status} type - messages. Because the body of such messages consist of - blocks of headers, \class{Parser} will create a non-multipart - object containing non-multipart subobjects for each header - block. -\item Another exception is for \mimetype{message/*} types (more - general than \mimetype{message/delivery-status}). These are - typically \mimetype{message/rfc822} messages, represented as a - non-multipart object containing a singleton payload which is - another non-multipart \class{Message} instance. + \code{False} for \method{is_multipart()}. Their + \method{get_payload()} method will return a string object. +\item All \mimetype{multipart} type messages will be parsed as a + container message object with a list of sub-message objects for + their payload. These messages will return \code{True} for + \method{is_multipart()} and their \method{get_payload()} method + will return a list of \class{Message} instances. +\item Most messages with a content type of \mimetype{message/*} + (e.g. \mimetype{message/delivery-status} and + \mimetype{message/rfc822}) will also be parsed as a container + object containing a list payload of length 1. Their + \method{is_multipart()} method will return \code{True}. The + single element in the list payload will be a sub-message object. \end{itemize} diff --git a/Doc/lib/emailutil.tex b/Doc/lib/emailutil.tex index 75f3798704..e2ff752330 100644 --- a/Doc/lib/emailutil.tex +++ b/Doc/lib/emailutil.tex @@ -21,10 +21,10 @@ Parse address -- which should be the value of some address-containing field such as \mailheader{To} or \mailheader{Cc} -- into its constituent \emph{realname} and \emph{email address} parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple of -\code{(None, None)} is returned. +\code{('', '')} is returned. 
\end{funcdesc} -\begin{funcdesc}{dump_address_pair}{pair} +\begin{funcdesc}{formataddr}{pair} The inverse of \method{parseaddr()}, this takes a 2-tuple of the form \code{(realname, email_address)} and returns the string value suitable for a \mailheader{To} or \mailheader{Cc} header. If the first element of @@ -48,27 +48,6 @@ all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs) \end{verbatim} \end{funcdesc} -\begin{funcdesc}{decode}{s} -This method decodes a string according to the rules in \rfc{2047}. It -returns the decoded string as a Python unicode string. -\end{funcdesc} - -\begin{funcdesc}{encode}{s\optional{, charset\optional{, encoding}}} -This method encodes a string according to the rules in \rfc{2047}. It -is not actually the inverse of \function{decode()} since it doesn't -handle multiple character sets or multiple string parts needing -encoding. In fact, the input string \var{s} must already be encoded -in the \var{charset} character set (Python can't reliably guess what -character set a string might be encoded in). The default -\var{charset} is \samp{iso-8859-1}. - -\var{encoding} must be either the letter \character{q} for -Quoted-Printable or \character{b} for Base64 encoding. If -neither, a \exception{ValueError} is raised. Both the \var{charset} and -the \var{encoding} strings are case-insensitive, and coerced to lower -case in the returned string. -\end{funcdesc} - \begin{funcdesc}{parsedate}{date} Attempts to parse a date according to the rules in \rfc{2822}. however, some mailers don't follow that format as specified, so @@ -116,7 +95,48 @@ Optional \var{timeval} if given is a floating point time value as accepted by \function{time.gmtime()} and \function{time.localtime()}, otherwise the current time is used. 
-Optional \var{localtime} is a flag that when true, interprets +Optional \var{localtime} is a flag that when \code{True}, interprets \var{timeval}, and returns a date relative to the local timezone instead of UTC, properly taking daylight savings time into account. +The default is \code{False}, meaning UTC is used. +\end{funcdesc} + +\begin{funcdesc}{make_msgid}{\optional{idstring}} +Returns a string suitable for an \rfc{2822}-compliant +\mailheader{Message-ID} header. Optional \var{idstring}, if given, is +a string used to strengthen the uniqueness of the message id. +\end{funcdesc} + +\begin{funcdesc}{decode_rfc2231}{s} +Decode the string \var{s} according to \rfc{2231}. +\end{funcdesc} + +\begin{funcdesc}{encode_rfc2231}{s\optional{, charset\optional{, language}}} +Encode the string \var{s} according to \rfc{2231}. Optional +\var{charset} and \var{language}, if given, are the character set name +and language name to use. If neither is given, \var{s} is returned +as-is. If \var{charset} is given but \var{language} is not, the +string is encoded using the empty string for \var{language}. \end{funcdesc} + +\begin{funcdesc}{decode_params}{params} +Decode a parameters list according to \rfc{2231}. \var{params} is a +sequence of 2-tuples containing elements of the form +\code{(content-type, string-value)}. +\end{funcdesc} + +The following functions have been deprecated: + +\begin{funcdesc}{dump_address_pair}{pair} +\deprecated{2.2.2}{Use \function{formataddr()} instead.} +\end{funcdesc} + +\begin{funcdesc}{decode}{s} +\deprecated{2.2.2}{Use \method{Header.decode_header()} instead.} +\end{funcdesc} + + +\begin{funcdesc}{encode}{s\optional{, charset\optional{, encoding}}} +\deprecated{2.2.2}{Use \method{Header.encode()} instead.} +\end{funcdesc} + -- 2.40.0
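The address helpers and parsing semantics documented in this patch can be illustrated with a short sketch. It assumes the modern Python 3 spellings (`email.utils`, `email.message_from_string`) rather than the `email.Utils` and `email.Parser` module names used in the patch, and the sample message text is hypothetical.

```python
# Assumption: Python 3 module names; the patch documents the older
# email.Utils / email.Parser spellings of the same functionality.
import email
from email.utils import formataddr, parseaddr

# parseaddr() splits an address header value into (realname, email_address).
realname, address = parseaddr("Barry Warsaw <barry@zope.com>")

# formataddr() is the inverse: it rebuilds a string suitable for To: or Cc:.
rebuilt = formataddr((realname, address))

# A failed parse yields a 2-tuple of empty strings, not (None, None).
failed = parseaddr("")

# Hypothetical sample message, used only to show the parsing semantics.
raw = (
    "From: Barry Warsaw <barry@zope.com>\n"
    "Subject: hello\n"
    "\n"
    "Just some body text.\n"
)
msg = email.message_from_string(raw)

# A non-multipart message is a single object with a string payload.
is_container = msg.is_multipart()
payload = msg.get_payload()
```

For a `multipart` message the same two calls would instead report a container and return a list of sub-message objects, as described in the parsing notes.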