Big email 3.0 API changes, with updated unit tests and documentation.

author Barry Warsaw <barry@python.org>

Sun, 3 Oct 2004 03:16:19 +0000 (03:16 +0000)

committer Barry Warsaw <barry@python.org>

Sun, 3 Oct 2004 03:16:19 +0000 (03:16 +0000)
author Barry Warsaw <barry@python.org>
Sun, 3 Oct 2004 03:16:19 +0000 (03:16 +0000)
committer Barry Warsaw <barry@python.org>
Sun, 3 Oct 2004 03:16:19 +0000 (03:16 +0000)
diff --git a/Doc/lib/email.tex b/Doc/lib/email.tex

index debed70a709171aaa3ed03b20f2a918014278a33..56affa5369c1e0d0c1fe72ecfe4f4e95a0f54598 100644 (file)
--- a/Doc/lib/email.tex
+++ b/Doc/lib/email.tex
@@ -1,5 +1,5 @@
-% Copyright (C) 2001,2002 Python Software Foundation
-% Author: barry@zope.com (Barry Warsaw)
+% Copyright (C) 2001-2004 Python Software Foundation
+% Author: barry@python.org (Barry Warsaw)
  
  \section{\module{email} ---
          An email and MIME handling package}
@@ -7,8 +7,8 @@
  \declaremodule{standard}{email}
  \modulesynopsis{Package supporting the parsing, manipulating, and
      generating email messages, including MIME documents.}
-\moduleauthor{Barry A. Warsaw}{barry@zope.com}
-\sectionauthor{Barry A. Warsaw}{barry@zope.com}
+\moduleauthor{Barry A. Warsaw}{barry@python.org}
+\sectionauthor{Barry A. Warsaw}{barry@python.org}
  
  \versionadded{2.2}
  
@@ -22,7 +22,7 @@ sending of email messages to SMTP (\rfc{2821}) servers; that is the
  function of the \refmodule{smtplib} module.  The \module{email}
  package attempts to be as RFC-compliant as possible, supporting in
  addition to \rfc{2822}, such MIME-related RFCs as
-\rfc{2045}-\rfc{2047}, and \rfc{2231}.
+\rfc{2045}, \rfc{2046}, \rfc{2047}, and \rfc{2231}.
  
  The primary distinguishing feature of the \module{email} package is
  that it splits the parsing and generating of email messages from the
@@ -79,7 +79,7 @@ package, a section on differences and porting is provided.
  \subsection{Encoders}
  \input{emailencoders}
  
-\subsection{Exception classes}
+\subsection{Exception and Defect classes}
  \input{emailexc}
  
  \subsection{Miscellaneous utilities}
@@ -88,14 +88,41 @@ package, a section on differences and porting is provided.
  \subsection{Iterators}
  \input{emailiter}
  
-\subsection{Differences from \module{email} v1 (up to Python 2.2.1)}
+\subsection{Package History}
  
  Version 1 of the \module{email} package was bundled with Python
  releases up to Python 2.2.1.  Version 2 was developed for the Python
  2.3 release, and backported to Python 2.2.2.  It was also available as
-a separate distutils based package.  \module{email} version 2 is
-almost entirely backward compatible with version 1, with the
-following differences:
+a separate distutils-based package, and is compatible back to Python 2.1.
+
+\module{email} version 3.0 was released with Python 2.4 and as a separate
+distutils-based package.  It is compatible back to Python 2.3.
+
+Here are the differences between \module{email} version 3 and version 2:
+
+\begin{itemize}
+\item The \class{FeedParser} class was introduced, and the \class{Parser}
+      class was implemented in terms of the \class{FeedParser}.  All parsing
+      there for is non-strict, and parsing will make a best effort never to
+      raise an exception.  Problems found while parsing messages are stored in
+      the message's \var{defect} attribute.
+
+\item All aspects of the API which raised \exception{DeprecationWarning}s in
+      version 2 have been removed.  These include the \var{_encoder} argument
+      to the \class{MIMEText} constructor, the \method{Message.add_payload()}
+      method, the \function{Utils.dump_address_pair()} function, and the
+      functions \function{Utils.decode()} and \function{Utils.encode()}.
+
+\item New \exception{DeprecationWarning}s have been added to:
+      \method{Generator.__call__()}, \method{Message.get_type()},
+      \method{Message.get_main_type()}, \method{Message.get_subtype()}, and
+      the \var{strict} argument to the \class{Parser} class.  These are
+      expected to be removed in email 3.1.
+
+\item Support for Pythons earlier than 2.3 has been removed.
+\end{itemize}
+
+Here are the differences between \module{email} version 2 and version 1:
  
  \begin{itemize}
  \item The \module{email.Header} and \module{email.Charset} modules
diff --git a/Doc/lib/emailencoders.tex b/Doc/lib/emailencoders.tex

index cd54d68be9b1e7e30ada045f7ab9fe6d047061b2..a49e04d9b05fac136b17721e632e19f69896d54c 100644 (file)
--- a/Doc/lib/emailencoders.tex
+++ b/Doc/lib/emailencoders.tex
@@ -8,11 +8,11 @@ type messages containing binary data.
  
  The \module{email} package provides some convenient encodings in its
  \module{Encoders} module.  These encoders are actually used by the
-\class{MIMEImage} and \class{MIMEText} class constructors to provide default
-encodings.  All encoder functions take exactly one argument, the
-message object to encode.  They usually extract the payload, encode
-it, and reset the payload to this newly encoded value.  They should also
-set the \mailheader{Content-Transfer-Encoding} header as appropriate.
+\class{MIMEAudio} and \class{MIMEImage} class constructors to provide default
+encodings.  All encoder functions take exactly one argument, the message
+object to encode.  They usually extract the payload, encode it, and reset the
+payload to this newly encoded value.  They should also set the
+\mailheader{Content-Transfer-Encoding} header as appropriate.
  
  Here are the encoding functions provided:
  
diff --git a/Doc/lib/emailexc.tex b/Doc/lib/emailexc.tex

index 824a276f1738ce0ccf11cf3f8f95b83384e33601..6ac088949cdbd4212a69a4dc53cccbff8bab1de4 100644 (file)
--- a/Doc/lib/emailexc.tex
+++ b/Doc/lib/emailexc.tex
@@ -52,3 +52,36 @@ rarely raised in practice.  However the exception may also be raised
  if the \method{attach()} method is called on an instance of a class
  derived from \class{MIMENonMultipart} (e.g. \class{MIMEImage}).
  \end{excclassdesc}
+
+Here's the list of the defects that the \class{FeedParser} can find while
+parsing messages.  Note that the defects are added to the message where the
+problem was found, so for example, if a message nested inside a
+\mimetype{multipart/alternative} had a malformed header, that nested message
+object would have a defect, but the containing messages would not.
+
+All defect classes are subclassed from \class{email.Errors.MessageDefect}, but
+this class is \emph{not} an exception!
+
+\versionadded[All the defect classes were added]{2.4}
+
+\begin{itemize}
+\item \class{NoBoundaryInMultipartDefect} -- A message claimed to be a
+      multipart, but had no \mimetype{boundary} parameter.
+
+\item \class{StartBoundaryNotFoundDefect} -- The start boundary claimed in the
+      \mailheader{Content-Type} header was never found.
+
+\item \class{FirstHeaderLineIsContinuationDefect} -- The message had a
+      continuation line as its first header line.
+
+\item \class{MisplacedEnvelopeHeaderDefect} - A ``Unix From'' header was found
+      in the middle of a header block.
+
+\item \class{MalformedHeaderDefect} -- A header was found that was missing a
+      colon, or was otherwise malformed.
+
+\item \class{MultipartInvariantViolationDefect} -- A message claimed to be a
+      \mimetype{multipart}, but no subparts were found.  Note that when a
+      message has this defect, its \method{is_multipart()} method may return
+      false even though its content type claims to be \mimetype{multipart}.
+\end{itemize}
diff --git a/Doc/lib/emailmessage.tex b/Doc/lib/emailmessage.tex

index 1943273a13c1a3f2ad6eaa84f2fbdad1735d2add..f732054350ce306bbfa86f12c5f6f924be633993 100644 (file)
--- a/Doc/lib/emailmessage.tex
+++ b/Doc/lib/emailmessage.tex
@@ -359,13 +359,16 @@ the form \code{(CHARSET, LANGUAGE, VALUE)}.  Note that both \code{CHARSET} and
  \code{VALUE} to be encoded in the \code{us-ascii} charset.  You can
  usually ignore \code{LANGUAGE}.
  
-Your application should be prepared to deal with 3-tuple return
-values, and can convert the parameter to a Unicode string like so:
+If your application doesn't care whether the parameter was encoded as in
+\rfc{2231}, you can collapse the parameter value by calling
+\function{email.Utils.collapse_rfc2231_value()}, passing in the return value
+from \method{get_param()}.  This will return a suitably decoded Unicode string
+whn the value is a tuple, or the original string unquoted if it isn't.  For
+example:
  
  \begin{verbatim}
-param = msg.get_param('foo')
-if isinstance(param, tuple):
-    param = unicode(param[2], param[0] or 'us-ascii')
+rawparam = msg.get_param('foo')
+param = email.Utils.collapse_rfc2231_value(rawparam)
  \end{verbatim}
  
  In any case, the parameter value (either the returned string, or the
@@ -549,32 +552,21 @@ newline get printed after your closing \mimetype{multipart} boundary,
  set the \var{epilogue} to the empty string.
  \end{datadesc}
  
-\subsubsection{Deprecated methods}
-
-The following methods are deprecated in \module{email} version 2.
-They are documented here for completeness.
+\begin{datadesc}{defects}
+The \var{defects} attribute contains a list of all the problems found when
+parsing this message.  See \refmodule{email.Errors} for a detailed description
+of the possible parsing defects.
  
-\begin{methoddesc}[Message]{add_payload}{payload}
-Add \var{payload} to the message object's existing payload.  If, prior
-to calling this method, the object's payload was \code{None}
-(i.e. never before set), then after this method is called, the payload
-will be the argument \var{payload}.
+\versionadded{2.4}
+\end{datadesc}
  
-If the object's payload was already a list
-(i.e. \method{is_multipart()} returns \code{True}), then \var{payload} is
-appended to the end of the existing payload list.
+\subsubsection{Deprecated methods}
  
-For any other type of existing payload, \method{add_payload()} will
-transform the new payload into a list consisting of the old payload
-and \var{payload}, but only if the document is already a MIME
-multipart document.  This condition is satisfied if the message's
-\mailheader{Content-Type} header's main type is either
-\mimetype{multipart}, or there is no \mailheader{Content-Type}
-header.  In any other situation,
-\exception{MultipartConversionError} is raised.
+\versionchanged[The \method{add_payload()} method was removed; use the
+\method{attach()} method instead]{2.4}
  
-\deprecated{2.2.2}{Use the \method{attach()} method instead.}
-\end{methoddesc}
+The following methods are deprecated.  They are documented here for
+completeness.
  
  \begin{methoddesc}[Message]{get_type}{\optional{failobj}}
  Return the message's content type, as a string of the form
diff --git a/Doc/lib/emailmimebase.tex b/Doc/lib/emailmimebase.tex

index 3318d6aa8dcbb9304ee0dab23350f086e0c31983..070c9a27b208cef4fd2eb8454473d815c73790f4 100644 (file)
--- a/Doc/lib/emailmimebase.tex
+++ b/Doc/lib/emailmimebase.tex
@@ -142,9 +142,7 @@ Optional \var{_subtype} sets the subtype of the message; it defaults
  to \mimetype{rfc822}.
  \end{classdesc}
  
-\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{,
-    _charset\optional{, _encoder}}}}
-
+\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{, _charset}}}
  A subclass of \class{MIMENonMultipart}, the \class{MIMEText} class is
  used to create MIME objects of major type \mimetype{text}.
  \var{_text} is the string for the payload.  \var{_subtype} is the
@@ -153,6 +151,7 @@ character set of the text and is passed as a parameter to the
  \class{MIMENonMultipart} constructor; it defaults to \code{us-ascii}.  No
  guessing or encoding is performed on the text data.
  
-\deprecated{2.2.2}{The \var{_encoding} argument has been deprecated.
-Encoding now happens implicitly based on the \var{_charset} argument.}
+\versionchanged[The previously deprecated \var{_encoding} argument has
+been removed.  Encoding happens implicitly based on the \var{_charset}
+argument]{2.4}
  \end{classdesc}
diff --git a/Doc/lib/emailparser.tex b/Doc/lib/emailparser.tex

index 1e8597c66f4a6a8101f78df268e592b43bc8233b..5fac92f41f9fccb6383f426087cfd46fb98c673f 100644 (file)
--- a/Doc/lib/emailparser.tex
+++ b/Doc/lib/emailparser.tex
@@ -18,29 +18,79 @@ messages, the root object will return \code{True} from its
  \method{is_multipart()} method, and the subparts can be accessed via
  the \method{get_payload()} and \method{walk()} methods.
  
+There are actually two parser interfaces available for use, the classic
+\class{Parser} API and the incremental \class{FeedParser} API.  The classic
+\class{Parser} API is fine if you have the entire text of the message in
+memory as a string, or if the entire message lives in a file on the file
+system.  \class{FeedParser} is more appropriate for when you're reading the
+message from a stream which might block waiting for more input (e.g. reading
+an email message from a socket).  The \class{FeedParser} can consume and parse
+the message incrementally, and only returns the root object when you close the
+parser\footnote{As of email package version 3.0, introduced in
+Python 2.4, the classic \class{Parser} was re-implemented in terms of the
+\class{FeedParser}, so the semantics and results are identical between the two
+parsers.}.
+
  Note that the parser can be extended in limited ways, and of course
  you can implement your own parser completely from scratch.  There is
  no magical connection between the \module{email} package's bundled
  parser and the \class{Message} class, so your custom parser can create
  message object trees any way it finds necessary.
  
-The primary parser class is \class{Parser} which parses both the
-headers and the payload of the message.  In the case of
-\mimetype{multipart} messages, it will recursively parse the body of
-the container message.  Two modes of parsing are supported,
-\emph{strict} parsing, which will usually reject any non-RFC compliant
-message, and \emph{lax} parsing, which attempts to adjust for common
-MIME formatting problems.
+\subsubsection{FeedParser API}
+
+\versionadded{2.4}
+
+The \class{FeedParser} provides an API that is conducive to incremental
+parsing of email messages, such as would be necessary when reading the text of
+an email message from a source that can block (e.g. a socket).  The
+\class{FeedParser} can of course be used to parse an email message fully
+contained in a string or a file, but the classic \class{Parser} API may be
+more convenient for such use cases.  The semantics and results of the two
+parser APIs are identical.
+
+The \class{FeedParser}'s API is simple; you create an instance, feed it a
+bunch of text until there's no more to feed it, then close the parser to
+retrieve the root message object.  The \class{FeedParser} is extremely
+accurate when parsing standards-compliant messages, and it does a very good
+job of parsing non-compliant messages, providing information about how a
+message was deemed broken.  It will populate a message object's \var{defects}
+attribute with a list of any problems it found in a message.  See the
+\refmodule{email.Errors} module for the list of defects that it can find.
+
+Here is the API for the \class{FeedParser}:
+
+\begin{classdesc}{FeedParser}{\optional{_factory}}
+Create a \class{FeedParser} instance.  Optional \var{_factory} is a
+no-argument callable that will be called whenever a new message object is
+needed.  It defaults to the \class{email.Message.Message} class.
+\end{classdesc}
+
+\begin{methoddesc}[FeedParser]{feed}{data}
+Feed the \class{FeedParser} some more data.  \var{data} should be a
+string containing one or more lines.  The lines can be partial and the
+\class{FeedParser} will stitch such partial lines together properly.  The
+lines in the string can have any of the common three line endings, carriage
+return, newline, or carriage return and newline (they can even be mixed).
+\end{methoddesc}
+
+\begin{methoddesc}[FeedParser]{close}{}
+Closing a \class{FeedParser} completes the parsing of all previously fed data,
+and returns the root message object.  It is undefined what happens if you feed
+more data to a closed \class{FeedParser}.
+\end{methoddesc}
  
-The \module{email.Parser} module also provides a second class, called
+\subsubsection{Parser class API}
+
+The \class{Parser} provides an API that can be used to parse a message when
+the complete contents of the message are available in a string or file.  The
+\module{email.Parser} module also provides a second class, called
  \class{HeaderParser} which can be used if you're only interested in
  the headers of the message. \class{HeaderParser} can be much faster in
  these situations, since it does not attempt to parse the message body,
  instead setting the payload to the raw body as a string.
  \class{HeaderParser} has the same API as the \class{Parser} class.
  
-\subsubsection{Parser class API}
-
  \begin{classdesc}{Parser}{\optional{_class\optional{, strict}}}
  The constructor for the \class{Parser} class takes an optional
  argument \var{_class}.  This must be a callable factory (such as a
@@ -49,19 +99,14 @@ needs to be created.  It defaults to \class{Message} (see
  \refmodule{email.Message}).  The factory will be called without
  arguments.
  
-The optional \var{strict} flag specifies whether strict or lax parsing
-should be performed.  Normally, when things like MIME terminating
-boundaries are missing, or when messages contain other formatting
-problems, the \class{Parser} will raise a
-\exception{MessageParseError}.  However, when lax parsing is enabled,
-the \class{Parser} will attempt to work around such broken formatting
-to produce a usable message structure (this doesn't mean
-\exception{MessageParseError}s are never raised; some ill-formatted
-messages just can't be parsed).  The \var{strict} flag defaults to
-\code{False} since lax parsing usually provides the most convenient
-behavior.
+The optional \var{strict} flag is ignored.  \deprecated{2.4}{Because the
+\class{Parser} class is a backward compatible API wrapper around the
+new-in-Python 2.4 \class{FeedParser}, \emph{all} parsing is effectively
+non-strict.  You should simply stop passing a \var{strict} flag to the
+\class{Parser} constructor.}
  
  \versionchanged[The \var{strict} flag was added]{2.2.2}
+\versionchanged[The \var{strict} flag was deprecated]{2.4}
  \end{classdesc}
  
  The other public \class{Parser} methods are:
@@ -149,4 +194,13 @@ Here are some notes on the parsing semantics:
        object containing a list payload of length 1.  Their
        \method{is_multipart()} method will return \code{True}.  The
        single element in the list payload will be a sub-message object.
+
+\item Some non-standards compliant messages may not be internally consistent
+      about their \mimetype{multipart}-edness.  Such messages may have a
+      \mailheader{Content-Type} header of type \mimetype{multipart}, but their
+      \method{is_multipart()} method may return \code{False}.  If such
+      messages were parsed with the \class{FeedParser}, they will have an
+      instance of the \class{MultipartInvariantViolationDefect} class in their
+      \var{defects} attribute list.  See \refmodule{email.Errors} for
+      details.
  \end{itemize}
diff --git a/Doc/lib/emailutil.tex b/Doc/lib/emailutil.tex

index 80f0acfd37131c313e07da847bfef7cfbf701366..c41f06696e320d219a103bfc4b61a0c6624286b7 100644 (file)
--- a/Doc/lib/emailutil.tex
+++ b/Doc/lib/emailutil.tex
@@ -119,24 +119,33 @@ as-is.  If \var{charset} is given but \var{language} is not, the
  string is encoded using the empty string for \var{language}.
  \end{funcdesc}
  
+\begin{funcdesc}{collapse_rfc2231_value}{value\optional{, errors\optional{,
+    fallback_charset}}}
+When a header parameter is encoded in \rfc{2231} format,
+\method{Message.get_param()} may return a 3-tuple containing the character
+set, language, and value.  \function{collapse_rfc2231_value()} turns this into
+a unicode string.  Optional \var{errors} is passed to the \var{errors}
+argument of the built-in \function{unicode()} function; it defaults to
+\code{replace}.  Optional \var{fallback_charset} specifies the character set
+to use if the one in the \rfc{2231} header is not known by Python; it defaults
+to \code{us-ascii}.
+
+For convenience, if the \var{value} passed to
+\function{collapse_rfc2231_value()} is not a tuple, it should be a string and
+it is returned unquoted.
+\end{funcdesc}
+
  \begin{funcdesc}{decode_params}{params}
  Decode parameters list according to \rfc{2231}.  \var{params} is a
  sequence of 2-tuples containing elements of the form
  \code{(content-type, string-value)}.
  \end{funcdesc}
  
-The following functions have been deprecated:
-
-\begin{funcdesc}{dump_address_pair}{pair}
-\deprecated{2.2.2}{Use \function{formataddr()} instead.}
-\end{funcdesc}
-
-\begin{funcdesc}{decode}{s}
-\deprecated{2.2.2}{Use \method{Header.decode_header()} instead.}
-\end{funcdesc}
-
+\versionchanged[The \function{dump_address_pair()} function has been removed;
+use \function{formataddr()} instead.]{2.4}
  
-\begin{funcdesc}{encode}{s\optional{, charset\optional{, encoding}}}
-\deprecated{2.2.2}{Use \method{Header.encode()} instead.}
-\end{funcdesc}
+\versionchanged[The \function{decode()} function has been removed; use the
+\method{Header.decode_header()} method instead.]{2.4}
  
+\versionchanged[The \function{encode()} function has been removed; use the
+\method{Header.encode()} method instead.]{2.4}
diff --git a/Lib/email/Charset.py b/Lib/email/Charset.py

index 3c8f7a4d594e3f24b2925c61e0fe039e865ab112..6a3e3ca10f8b349d1c22f29df64e93922d01acf9 100644 (file)
--- a/Lib/email/Charset.py
+++ b/Lib/email/Charset.py
@@ -1,18 +1,6 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: che@debian.org (Ben Gertzfield), barry@python.org (Barry Warsaw)
-
-# XXX The following information needs updating.
-
-# Python 2.3 doesn't come with any Asian codecs by default.  Two packages are
-# currently available and supported as of this writing (30-Dec-2003):
-#
-# CJKCodecs
-# http://cjkpython.i18n.org
-# This package contains Chinese, Japanese, and Korean codecs
-
-# JapaneseCodecs
-# http://www.asahi-net.or.jp/~rd6t-kjym/python
-# Some Japanese users prefer this codec package
+# Author: Ben Gertzfield, Barry Warsaw
+# Contact: email-sig@python.org
  
  import email.base64MIME
  import email.quopriMIME
@@ -21,9 +9,9 @@ from email.Encoders import encode_7or8bit
  
  \f
  # Flags for types of header encodings
-QP     = 1   # Quoted-Printable
-BASE64 = 2   # Base64
-SHORTEST = 3 # the shorter of QP and base64, but only for headers
+QP          = 1 # Quoted-Printable
+BASE64      = 2 # Base64
+SHORTEST    = 3 # the shorter of QP and base64, but only for headers
  
  # In "=?charset?q?hello_world?=", the =?, ?q?, and ?= add up to 7
  MISC_LEN = 7
@@ -128,7 +116,7 @@ def add_charset(charset, header_enc=None, body_enc=None, output_charset=None):
      documentation for more information.
      """
      if body_enc == SHORTEST:
-        raise ValueError, 'SHORTEST not allowed for body_enc'
+        raise ValueError('SHORTEST not allowed for body_enc')
      CHARSETS[charset] = (header_enc, body_enc, output_charset)
  
  
diff --git a/Lib/email/Encoders.py b/Lib/email/Encoders.py

index 6851094e9bd73f2cd9103ecbcbcd51c7cebdc245..baac2a345883e548164dc313bf46f615e2ec5039 100644 (file)
--- a/Lib/email/Encoders.py
+++ b/Lib/email/Encoders.py
@@ -1,37 +1,16 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: barry@python.org (Barry Warsaw)
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
  """Encodings and related functions."""
  
  import base64
+from quopri import encodestring as _encodestring
  
-
-\f
-# Helpers
-try:
-    from quopri import encodestring as _encodestring
-
-    def _qencode(s):
-        enc = _encodestring(s, quotetabs=1)
-        # Must encode spaces, which quopri.encodestring() doesn't do
-        return enc.replace(' ', '=20')
-except ImportError:
-    # Python 2.1 doesn't have quopri.encodestring()
-    from cStringIO import StringIO
-    import quopri as _quopri
-
-    def _qencode(s):
-        if not s:
-            return s
-        hasnewline = (s[-1] == '\n')
-        infp = StringIO(s)
-        outfp = StringIO()
-        _quopri.encode(infp, outfp, quotetabs=1)
-        # Python 2.x's encode() doesn't encode spaces even when quotetabs==1
-        value = outfp.getvalue().replace(' ', '=20')
-        if not hasnewline and value[-1] == '\n':
-            return value[:-1]
-        return value
+def _qencode(s):
+    enc = _encodestring(s, quotetabs=True)
+    # Must encode spaces, which quopri.encodestring() doesn't do
+    return enc.replace(' ', '=20')
  
  
  def _bencode(s):
diff --git a/Lib/email/Errors.py b/Lib/email/Errors.py

index e233219225da279feb273b2906ffeb4da68d1e37..e13a2c75c8fdfd60ed3ad2746cf720d3dfd9f3a7 100644 (file)
--- a/Lib/email/Errors.py
+++ b/Lib/email/Errors.py
@@ -1,5 +1,6 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: barry@python.org (Barry Warsaw)
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
  """email package exception classes."""
  
@@ -33,17 +34,20 @@ class MessageDefect:
      def __init__(self, line=None):
          self.line = line
  
-class NoBoundaryInMultipart(MessageDefect):
+class NoBoundaryInMultipartDefect(MessageDefect):
      """A message claimed to be a multipart but had no boundary parameter."""
  
-class StartBoundaryNotFound(MessageDefect):
+class StartBoundaryNotFoundDefect(MessageDefect):
      """The claimed start boundary was never found."""
  
-class FirstHeaderLineIsContinuation(MessageDefect):
+class FirstHeaderLineIsContinuationDefect(MessageDefect):
      """A message had a continuation line as its first header line."""
  
-class MisplacedEnvelopeHeader(MessageDefect):
+class MisplacedEnvelopeHeaderDefect(MessageDefect):
      """A 'Unix-from' header was found in the middle of a header block."""
  
-class MalformedHeader(MessageDefect):
-    """Found a header that was missing a colon, or was otherwise malformed"""
+class MalformedHeaderDefect(MessageDefect):
+    """Found a header that was missing a colon, or was otherwise malformed."""
+
+class MultipartInvariantViolationDefect(MessageDefect):
+    """A message claimed to be a multipart but no subparts were found."""
diff --git a/Lib/email/FeedParser.py b/Lib/email/FeedParser.py

index dc3027dbda4d21b1dccfc876191f547f64cfca55..de2754e099ae598abb2780ece703b7f16f43cbce 100644 (file)
--- a/Lib/email/FeedParser.py
+++ b/Lib/email/FeedParser.py
@@ -1,5 +1,6 @@
  # Copyright (C) 2004 Python Software Foundation
  # Authors: Baxter, Wouters and Warsaw
+# Contact: email-sig@python.org
  
  """FeedParser - An email feed parser.
  
@@ -15,7 +16,7 @@ This completes the parsing and returns the root message object.
  The other advantage of this parser is that it will never throw a parsing
  exception.  Instead, when it finds something unexpected, it adds a 'defect' to
  the current message.  Defects are just instances that live on the message
-object's .defect attribute.
+object's .defects attribute.
  """
  
  import re
@@ -100,7 +101,7 @@ class BufferedSubFile(object):
          # and the eol character(s).  Gather up a list of lines after
          # re-attaching the newlines.
          lines = []
-        for i in range(len(parts) / 2):
+        for i in range(len(parts) // 2):
              lines.append(parts[i*2] + parts[i*2+1])
          self.pushlines(lines)
  
@@ -156,6 +157,10 @@ class FeedParser:
          self._call_parse()
          root = self._pop_message()
          assert not self._msgstack
+        # Look for final set of defects
+        if root.get_content_maintype() == 'multipart' \
+               and not root.is_multipart():
+            root.defects.append(Errors.MultipartInvariantViolationDefect())
          return root
  
      def _new_message(self):
@@ -166,7 +171,6 @@ class FeedParser:
              self._msgstack[-1].attach(msg)
          self._msgstack.append(msg)
          self._cur = msg
-        self._cur.defects = []
          self._last = msg
  
      def _pop_message(self):
@@ -259,7 +263,7 @@ class FeedParser:
                  # defined a boundary.  That's a problem which we'll handle by
                  # reading everything until the EOF and marking the message as
                  # defective.
-                self._cur.defects.append(Errors.NoBoundaryInMultipart())
+                self._cur.defects.append(Errors.NoBoundaryInMultipartDefect())
                  lines = []
                  for line in self._input:
                      if line is NeedMoreData:
@@ -305,6 +309,8 @@ class FeedParser:
                              if eolmo:
                                  preamble[-1] = lastline[:-len(eolmo.group(0))]
                              self._cur.preamble = EMPTYSTRING.join(preamble)
+                        #import pdb ; pdb.set_trace()
+                        # See SF bug #1030941
                          capturing_preamble = False
                          self._input.unreadline(line)
                          continue
@@ -363,7 +369,7 @@ class FeedParser:
              # that as a defect and store the captured text as the payload.
              # Otherwise everything from here to the EOF is epilogue.
              if capturing_preamble:
-                self._cur.defects.append(Errors.StartBoundaryNotFound())
+                self._cur.defects.append(Errors.StartBoundaryNotFoundDefect())
                  self._cur.set_payload(EMPTYSTRING.join(preamble))
                  return
              # If the end boundary ended in a newline, we'll need to make sure
@@ -408,7 +414,7 @@ class FeedParser:
                      # The first line of the headers was a continuation.  This
                      # is illegal, so let's note the defect, store the illegal
                      # line, and ignore it for purposes of headers.
-                    defect = Errors.FirstHeaderLineIsContinuation(line)
+                    defect = Errors.FirstHeaderLineIsContinuationDefect(line)
                      self._cur.defects.append(defect)
                      continue
                  lastvalue.append(line)
@@ -436,13 +442,13 @@ class FeedParser:
                  else:
                      # Weirdly placed unix-from line.  Note this as a defect
                      # and ignore it.
-                    defect = Errors.MisplacedEnvelopeHeader(line)
+                    defect = Errors.MisplacedEnvelopeHeaderDefect(line)
                      self._cur.defects.append(defect)
                      continue
              # Split the line on the colon separating field name from value.
              i = line.find(':')
              if i < 0:
-                defect = Errors.MalformedHeader(line)
+                defect = Errors.MalformedHeaderDefect(line)
                  self._cur.defects.append(defect)
                  continue
              lastheader = line[:i]
diff --git a/Lib/email/Generator.py b/Lib/email/Generator.py

index 7fe634fb8f927b8438d0840bc02c0e6a279a9fe4..9411a9e6ca2e402746f3e9e7701c3c1ffdeb5851 100644 (file)
--- a/Lib/email/Generator.py
+++ b/Lib/email/Generator.py
@@ -1,13 +1,14 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: barry@python.org (Barry Warsaw)
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Classes to generate plain text from a message object tree.
-"""
+"""Classes to generate plain text from a message object tree."""
  
  import re
  import sys
  import time
  import random
+import warnings
  from cStringIO import StringIO
  
  from email.Header import Header
@@ -81,7 +82,10 @@ class Generator:
          self._write(msg)
  
      # For backwards compatibility, but this is slower
-    __call__ = flatten
+    def __call__(self, msg, unixfrom=False):
+        warnings.warn('__call__() deprecated; use flatten()',
+                      DeprecationWarning, 2)
+        self.flatten(msg, unixfrom)
  
      def clone(self, fp):
          """Clone this generator with the exact same options."""
@@ -175,7 +179,7 @@ class Generator:
          if cset is not None:
              payload = cset.body_encode(payload)
          if not isinstance(payload, basestring):
-            raise TypeError, 'string payload expected: %s' % type(payload)
+            raise TypeError('string payload expected: %s' % type(payload))
          if self._mangle_from_:
              payload = fcre.sub('>From ', payload)
          self._fp.write(payload)
@@ -271,6 +275,8 @@ class Generator:
  
  
  \f
+_FMT = '[Non-text (%(type)s) part of message omitted, filename %(filename)s]'
+
  class DecodedGenerator(Generator):
      """Generator a text representation of a message.
  
@@ -301,13 +307,13 @@ class DecodedGenerator(Generator):
          """
          Generator.__init__(self, outfp, mangle_from_, maxheaderlen)
          if fmt is None:
-            fmt = ('[Non-text (%(type)s) part of message omitted, '
-                   'filename %(filename)s]')
-        self._fmt = fmt
+            self._fmt = _FMT
+        else:
+            self._fmt = fmt
  
      def _dispatch(self, msg):
          for part in msg.walk():
-            maintype = part.get_main_type('text')
+            maintype = part.get_content_maintype()
              if maintype == 'text':
                  print >> self, part.get_payload(decode=True)
              elif maintype == 'multipart':
@@ -315,9 +321,9 @@ class DecodedGenerator(Generator):
                  pass
              else:
                  print >> self, self._fmt % {
-                    'type'       : part.get_type('[no MIME type]'),
-                    'maintype'   : part.get_main_type('[no main MIME type]'),
-                    'subtype'    : part.get_subtype('[no sub-MIME type]'),
+                    'type'       : part.get_content_type(),
+                    'maintype'   : part.get_content_maintype(),
+                    'subtype'    : part.get_content_subtype(),
                      'filename'   : part.get_filename('[no filename]'),
                      'description': part.get('Content-Description',
                                              '[no description]'),
diff --git a/Lib/email/Header.py b/Lib/email/Header.py

index 21acaf1179014bb8c172f9f45af24e947fe7da83..5e24afede013f775e836a5931c35f6ca3edfd1a9 100644 (file)
--- a/Lib/email/Header.py
+++ b/Lib/email/Header.py
@@ -1,5 +1,6 @@
  # Copyright (C) 2002-2004 Python Software Foundation
-# Author: che@debian.org (Ben Gertzfield), barry@python.org (Barry Warsaw)
+# Author: Ben Gertzfield, Barry Warsaw
+# Contact: email-sig@python.org
  
  """Header encoding and decoding functionality."""
  
diff --git a/Lib/email/Iterators.py b/Lib/email/Iterators.py

index af6095e7c76785544a7990a65989edda57ef7aaf..74a93c7b93094ed5355dd877d44d5901e8f442d5 100644 (file)
--- a/Lib/email/Iterators.py
+++ b/Lib/email/Iterators.py
@@ -1,8 +1,8 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: Barry Warsaw <barry@python.org>
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Various types of useful iterators and generators.
-"""
+"""Various types of useful iterators and generators."""
  
  import sys
  from cStringIO import StringIO
diff --git a/Lib/email/MIMEAudio.py b/Lib/email/MIMEAudio.py

index dda7689a4c87a9b81202edac54f8ae296bdec2f6..266ec4c4095f5fe0482c054c24a05471af5dbc30 100644 (file)
--- a/Lib/email/MIMEAudio.py
+++ b/Lib/email/MIMEAudio.py
@@ -1,7 +1,8 @@
+# Copyright (C) 2001-2004 Python Software Foundation
  # Author: Anthony Baxter
+# Contact: email-sig@python.org
  
-"""Class representing audio/* type MIME documents.
-"""
+"""Class representing audio/* type MIME documents."""
  
  import sndhdr
  from cStringIO import StringIO
@@ -65,7 +66,7 @@ class MIMEAudio(MIMENonMultipart):
          if _subtype is None:
              _subtype = _whatsnd(_audiodata)
          if _subtype is None:
-            raise TypeError, 'Could not find audio MIME subtype'
+            raise TypeError('Could not find audio MIME subtype')
          MIMENonMultipart.__init__(self, 'audio', _subtype, **_params)
          self.set_payload(_audiodata)
          _encoder(self)
diff --git a/Lib/email/MIMEBase.py b/Lib/email/MIMEBase.py

index 7485d855c4fa022b68f34fdea073101162ec6519..88691f802e316b1670abb89a7a4d843652259dfc 100644 (file)
--- a/Lib/email/MIMEBase.py
+++ b/Lib/email/MIMEBase.py
@@ -1,8 +1,8 @@
-# Copyright (C) 2001,2002 Python Software Foundation
-# Author: barry@zope.com (Barry Warsaw)
+# Copyright (C) 2001-2004 Python Software Foundation
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Base class for MIME specializations.
-"""
+"""Base class for MIME specializations."""
  
  from email import Message
  
diff --git a/Lib/email/MIMEImage.py b/Lib/email/MIMEImage.py

index 5306e537066d5dec4de9f182ea42e5d39602a5e6..a658067839f21ec1ac7b51ce7a601bc982b724ed 100644 (file)
--- a/Lib/email/MIMEImage.py
+++ b/Lib/email/MIMEImage.py
@@ -1,8 +1,8 @@
-# Copyright (C) 2001,2002 Python Software Foundation
-# Author: barry@zope.com (Barry Warsaw)
+# Copyright (C) 2001-2004 Python Software Foundation
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Class representing image/* type MIME documents.
-"""
+"""Class representing image/* type MIME documents."""
  
  import imghdr
  
@@ -39,7 +39,7 @@ class MIMEImage(MIMENonMultipart):
          if _subtype is None:
              _subtype = imghdr.what(None, _imagedata)
          if _subtype is None:
-            raise TypeError, 'Could not guess image MIME subtype'
+            raise TypeError('Could not guess image MIME subtype')
          MIMENonMultipart.__init__(self, 'image', _subtype, **_params)
          self.set_payload(_imagedata)
          _encoder(self)
diff --git a/Lib/email/MIMEMessage.py b/Lib/email/MIMEMessage.py

index 2042dd97529f62f3a5ad86ee70914f376bf74257..3021934ef9996e9b2a47a442f838d289770a1af3 100644 (file)
--- a/Lib/email/MIMEMessage.py
+++ b/Lib/email/MIMEMessage.py
@@ -1,8 +1,8 @@
-# Copyright (C) 2001,2002 Python Software Foundation
-# Author: barry@zope.com (Barry Warsaw)
+# Copyright (C) 2001-2004 Python Software Foundation
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Class representing message/* MIME documents.
-"""
+"""Class representing message/* MIME documents."""
  
  from email import Message
  from email.MIMENonMultipart import MIMENonMultipart
@@ -24,7 +24,7 @@ class MIMEMessage(MIMENonMultipart):
          """
          MIMENonMultipart.__init__(self, 'message', _subtype)
          if not isinstance(_msg, Message.Message):
-            raise TypeError, 'Argument is not an instance of Message'
+            raise TypeError('Argument is not an instance of Message')
          # It's convenient to use this base class method.  We need to do it
          # this way or we'll get an exception
          Message.Message.attach(self, _msg)
diff --git a/Lib/email/MIMEMultipart.py b/Lib/email/MIMEMultipart.py

index ea6ae0c5edd0e1e916dde8d4b0ba6f3b3201af49..9072a6421c9f1cf6433a054b7602b75e85b93c43 100644 (file)
--- a/Lib/email/MIMEMultipart.py
+++ b/Lib/email/MIMEMultipart.py
@@ -1,8 +1,8 @@
  # Copyright (C) 2002-2004 Python Software Foundation
-# Author: barry@python.org (Barry Warsaw)
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Base class for MIME multipart/* type messages.
-"""
+"""Base class for MIME multipart/* type messages."""
  
  from email import MIMEBase
  
diff --git a/Lib/email/MIMENonMultipart.py b/Lib/email/MIMENonMultipart.py

index 1b3bcfd353a01e1e27e9e3e1f9bb5d9bca4299a6..4195d2a66894e395e4143e32b3a94a975d1939b9 100644 (file)
--- a/Lib/email/MIMENonMultipart.py
+++ b/Lib/email/MIMENonMultipart.py
@@ -1,8 +1,8 @@
-# Copyright (C) 2002 Python Software Foundation
-# Author: barry@zope.com (Barry Warsaw)
+# Copyright (C) 2002-2004 Python Software Foundation
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Base class for MIME type messages that are not multipart.
-"""
+"""Base class for MIME type messages that are not multipart."""
  
  from email import Errors
  from email import MIMEBase
diff --git a/Lib/email/MIMEText.py b/Lib/email/MIMEText.py

index d049ad9fd80bf92fdb795adbbc5a42c94faebf83..5ef187673a1bea260d6925f3192c9e27eb94c798 100644 (file)
--- a/Lib/email/MIMEText.py
+++ b/Lib/email/MIMEText.py
@@ -1,10 +1,9 @@
-# Copyright (C) 2001,2002 Python Software Foundation
-# Author: barry@zope.com (Barry Warsaw)
+# Copyright (C) 2001-2004 Python Software Foundation
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
-"""Class representing text/* type MIME documents.
-"""
+"""Class representing text/* type MIME documents."""
  
-import warnings
  from email.MIMENonMultipart import MIMENonMultipart
  from email.Encoders import encode_7or8bit
  
@@ -13,8 +12,7 @@ from email.Encoders import encode_7or8bit
  class MIMEText(MIMENonMultipart):
      """Class for generating text/* type MIME documents."""
  
-    def __init__(self, _text, _subtype='plain', _charset='us-ascii',
-                 _encoder=None):
+    def __init__(self, _text, _subtype='plain', _charset='us-ascii'):
          """Create a text/* type MIME document.
  
          _text is the string for this message object.
@@ -24,22 +22,7 @@ class MIMEText(MIMENonMultipart):
          _charset is the character set parameter added to the Content-Type
          header.  This defaults to "us-ascii".  Note that as a side-effect, the
          Content-Transfer-Encoding header will also be set.
-
-        The use of the _encoder is deprecated.  The encoding of the payload,
-        and the setting of the character set parameter now happens implicitly
-        based on the _charset argument.  If _encoder is supplied, then a
-        DeprecationWarning is used, and the _encoder functionality may
-        override any header settings indicated by _charset.  This is probably
-        not what you want.
          """
          MIMENonMultipart.__init__(self, 'text', _subtype,
                                    **{'charset': _charset})
          self.set_payload(_text, _charset)
-        if _encoder is not None:
-            warnings.warn('_encoder argument is obsolete.',
-                          DeprecationWarning, 2)
-            # Because set_payload() with a _charset will set its own
-            # Content-Transfer-Encoding header, we need to delete the
-            # existing one or will end up with two of them. :(
-            del self['content-transfer-encoding']
-            _encoder(self)
diff --git a/Lib/email/Message.py b/Lib/email/Message.py

index 2245f9b8340bd81881996a609627ec8ddb1cd2b4..d23a26f8c87a128a5b1ef888ea4615564e42073f 100644 (file)
--- a/Lib/email/Message.py
+++ b/Lib/email/Message.py
@@ -1,5 +1,6 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: barry@python.org (Barry Warsaw)
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
  """Basic message object for the email package object model."""
  
@@ -69,6 +70,10 @@ def _parseparam(s):
  
  
  def _unquotevalue(value):
+    # This is different than Utils.collapse_rfc2231_value() because it doesn't
+    # try to convert the value to a unicode.  Message.get_param() and
+    # Message.get_params() are both currently defined to return the tuple in
+    # the face of RFC 2231 parameters.
      if isinstance(value, tuple):
          return value[0], value[1], Utils.unquote(value[2])
      else:
@@ -98,6 +103,7 @@ class Message:
          self._charset = None
          # Defaults for multipart messages
          self.preamble = self.epilogue = None
+        self.defects = []
          # Default content type
          self._default_type = 'text/plain'
  
@@ -124,9 +130,7 @@ class Message:
  
      def is_multipart(self):
          """Return True if the message consists of multiple parts."""
-        if isinstance(self._payload, list):
-            return True
-        return False
+        return isinstance(self._payload, list)
  
      #
      # Unix From_ line
@@ -140,26 +144,6 @@ class Message:
      #
      # Payload manipulation.
      #
-    def add_payload(self, payload):
-        """Add the given payload to the current payload.
-
-        If the current payload is empty, then the current payload will be made
-        a scalar, set to the given value.
-
-        Note: This method is deprecated.  Use .attach() instead.
-        """
-        warnings.warn('add_payload() is deprecated, use attach() instead.',
-                      DeprecationWarning, 2)
-        if self._payload is None:
-            self._payload = payload
-        elif isinstance(self._payload, list):
-            self._payload.append(payload)
-        elif self.get_main_type() not in (None, 'multipart'):
-            raise Errors.MultipartConversionError(
-                'Message main content type must be "multipart" or missing')
-        else:
-            self._payload = [self._payload, payload]
-
      def attach(self, payload):
          """Add the given payload to the current payload.
  
@@ -195,7 +179,7 @@ class Message:
          if i is None:
              payload = self._payload
          elif not isinstance(self._payload, list):
-            raise TypeError, 'Expected list, got %s' % type(self._payload)
+            raise TypeError('Expected list, got %s' % type(self._payload))
          else:
              payload = self._payload[i]
          if decode:
@@ -254,7 +238,7 @@ class Message:
          if isinstance(charset, str):
              charset = Charset.Charset(charset)
          if not isinstance(charset, Charset.Charset):
-            raise TypeError, charset
+            raise TypeError(charset)
          # BAW: should we accept strings that can serve as arguments to the
          # Charset constructor?
          self._charset = charset
@@ -267,9 +251,9 @@ class Message:
              self.set_param('charset', charset.get_output_charset())
          if not self.has_key('Content-Transfer-Encoding'):
              cte = charset.get_body_encoding()
-            if callable(cte):
+            try:
                  cte(self)
-            else:
+            except TypeError:
                  self.add_header('Content-Transfer-Encoding', cte)
  
      def get_charset(self):
@@ -290,7 +274,7 @@ class Message:
          Return None if the header is missing instead of raising an exception.
  
          Note that if the header appeared multiple times, exactly which
-        occurrance gets returned is undefined.  Use getall() to get all
+        occurrance gets returned is undefined.  Use get_all() to get all
          the values matching a header field name.
          """
          return self.get(name)
@@ -320,7 +304,7 @@ class Message:
  
      def has_key(self, name):
          """Return true if the message contains the header."""
-        missing = []
+        missing = object()
          return self.get(name, missing) is not missing
  
      def keys(self):
@@ -422,11 +406,10 @@ class Message:
                  self._headers[i] = (k, _value)
                  break
          else:
-            raise KeyError, _name
+            raise KeyError(_name)
  
      #
-    # These methods are silently deprecated in favor of get_content_type() and
-    # friends (see below).  They will be noisily deprecated in email 3.0.
+    # Deprecated methods.  These will be removed in email 3.1.
      #
  
      def get_type(self, failobj=None):
@@ -436,7 +419,9 @@ class Message:
          string of the form `maintype/subtype'.  If there was no Content-Type
          header in the message, failobj is returned (defaults to None).
          """
-        missing = []
+        warnings.warn('get_type() deprecated; use get_content_type()',
+                      DeprecationWarning, 2)
+        missing = object()
          value = self.get('content-type', missing)
          if value is missing:
              return failobj
@@ -444,7 +429,9 @@ class Message:
  
      def get_main_type(self, failobj=None):
          """Return the message's main content type if present."""
-        missing = []
+        warnings.warn('get_main_type() deprecated; use get_content_maintype()',
+                      DeprecationWarning, 2)
+        missing = object()
          ctype = self.get_type(missing)
          if ctype is missing:
              return failobj
@@ -454,7 +441,9 @@ class Message:
  
      def get_subtype(self, failobj=None):
          """Return the message's content subtype if present."""
-        missing = []
+        warnings.warn('get_subtype() deprecated; use get_content_subtype()',
+                      DeprecationWarning, 2)
+        missing = object()
          ctype = self.get_type(missing)
          if ctype is missing:
              return failobj
@@ -479,7 +468,7 @@ class Message:
          appears inside a multipart/digest container, in which case it would be
          message/rfc822.
          """
-        missing = []
+        missing = object()
          value = self.get('content-type', missing)
          if value is missing:
              # This should have no parameters
@@ -529,7 +518,7 @@ class Message:
      def _get_params_preserve(self, failobj, header):
          # Like get_params() but preserves the quoting of values.  BAW:
          # should this be part of the public interface?
-        missing = []
+        missing = object()
          value = self.get(header, missing)
          if value is missing:
              return failobj
@@ -560,7 +549,7 @@ class Message:
          header.  Optional header is the header to search instead of
          Content-Type.  If unquote is True, the value is unquoted.
          """
-        missing = []
+        missing = object()
          params = self._get_params_preserve(missing, header)
          if params is missing:
              return failobj
@@ -713,17 +702,11 @@ class Message:
          The filename is extracted from the Content-Disposition header's
          `filename' parameter, and it is unquoted.
          """
-        missing = []
+        missing = object()
          filename = self.get_param('filename', missing, 'content-disposition')
          if filename is missing:
              return failobj
-        if isinstance(filename, tuple):
-            # It's an RFC 2231 encoded parameter
-            newvalue = _unquotevalue(filename)
-            return unicode(newvalue[2], newvalue[0] or 'us-ascii')
-        else:
-            newvalue = _unquotevalue(filename.strip())
-            return newvalue
+        return Utils.collapse_rfc2231_value(filename).strip()
  
      def get_boundary(self, failobj=None):
          """Return the boundary associated with the payload if present.
@@ -731,15 +714,11 @@ class Message:
          The boundary is extracted from the Content-Type header's `boundary'
          parameter, and it is unquoted.
          """
-        missing = []
+        missing = object()
          boundary = self.get_param('boundary', missing)
          if boundary is missing:
              return failobj
-        if isinstance(boundary, tuple):
-            # RFC 2231 encoded, so decode.  It better end up as ascii
-            charset = boundary[0] or 'us-ascii'
-            return unicode(boundary[2], charset).encode('us-ascii')
-        return _unquotevalue(boundary.strip())
+        return Utils.collapse_rfc2231_value(boundary).strip()
  
      def set_boundary(self, boundary):
          """Set the boundary parameter in Content-Type to 'boundary'.
@@ -751,7 +730,7 @@ class Message:
  
          HeaderParseError is raised if the message has no Content-Type header.
          """
-        missing = []
+        missing = object()
          params = self._get_params_preserve(missing, 'content-type')
          if params is missing:
              # There was no Content-Type header, and we don't know what type
@@ -793,7 +772,7 @@ class Message:
          Content-Type header, or if that header has no charset parameter,
          failobj is returned.
          """
-        missing = []
+        missing = object()
          charset = self.get_param('charset', missing)
          if charset is missing:
              return failobj
diff --git a/Lib/email/Parser.py b/Lib/email/Parser.py

index 8c5661db99be649daeb704ab76f25c2e579cf032..0c05224489e2df45b0f517308cde6158a3fadd4e 100644 (file)
--- a/Lib/email/Parser.py
+++ b/Lib/email/Parser.py
@@ -4,17 +4,15 @@
  
  """A parser of RFC 2822 and MIME email messages."""
  
-import re
+import warnings
  from cStringIO import StringIO
  from email.FeedParser import FeedParser
  from email.Message import Message
  
-NLCRE = re.compile('\r\n|\r|\n')
-
  
  \f
  class Parser:
-    def __init__(self, _class=Message, strict=False):
+    def __init__(self, *args, **kws):
          """Parser of RFC 2822 and MIME email messages.
  
          Creates an in-memory object tree representing the email message, which
@@ -29,14 +27,28 @@ class Parser:
          _class is the class to instantiate for new message objects when they
          must be created.  This class must have a constructor that can take
          zero arguments.  Default is Message.Message.
-
-        Optional strict tells the parser to be strictly RFC compliant or to be
-        more forgiving in parsing of ill-formatted MIME documents.  When
-        non-strict mode is used, the parser will try to make up for missing or
-        erroneous boundaries and other peculiarities seen in the wild.
-        Default is non-strict parsing.
          """
-        self._class = _class
+        if len(args) >= 1:
+            if '_class' in kws:
+                raise TypeError("Multiple values for keyword arg '_class'")
+            kws['_class'] = args[0]
+        if len(args) == 2:
+            if 'strict' in kws:
+                raise TypeError("Multiple values for keyword arg 'strict'")
+            kws['strict'] = args[1]
+        if len(args) > 2:
+            raise TypeError('Too many arguments')
+        if '_class' in kws:
+            self._class = kws['_class']
+            del kws['_class']
+        else:
+            self._class = Message
+        if 'strict' in kws:
+            warnings.warn("'strict' argument is deprecated (and ignored)",
+                          DeprecationWarning, 2)
+            del kws['strict']
+        if kws:
+            raise TypeError('Unexpected keyword arguments')
  
      def parse(self, fp, headersonly=False):
          """Create a message structure from the data in a file.
diff --git a/Lib/email/Utils.py b/Lib/email/Utils.py

index 3a4bbc83b63f9e56fa2885710c034b8f03392da1..e786d265863eb81cb90dc6f5b81d7e36c6ce737c 100644 (file)
--- a/Lib/email/Utils.py
+++ b/Lib/email/Utils.py
@@ -1,5 +1,6 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: barry@python.org (Barry Warsaw)
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
  """Miscellaneous utilities."""
  
@@ -80,12 +81,6 @@ def formataddr(pair):
          return '%s%s%s <%s>' % (quotes, name, quotes, address)
      return address
  
-# For backwards compatibility
-def dump_address_pair(pair):
-    warnings.warn('Use email.Utils.formataddr() instead',
-                  DeprecationWarning, 2)
-    return formataddr(pair)
-
  
  \f
  def getaddresses(fieldvalues):
@@ -107,46 +102,6 @@ ecre = re.compile(r'''
    ''', re.VERBOSE | re.IGNORECASE)
  
  
-def decode(s):
-    """Return a decoded string according to RFC 2047, as a unicode string.
-
-    NOTE: This function is deprecated.  Use Header.decode_header() instead.
-    """
-    warnings.warn('Use Header.decode_header() instead.', DeprecationWarning, 2)
-    # Intra-package import here to avoid circular import problems.
-    from email.Header import decode_header
-    L = decode_header(s)
-    if not isinstance(L, list):
-        # s wasn't decoded
-        return s
-
-    rtn = []
-    for atom, charset in L:
-        if charset is None:
-            rtn.append(atom)
-        else:
-            # Convert the string to Unicode using the given encoding.  Leave
-            # Unicode conversion errors to strict.
-            rtn.append(unicode(atom, charset))
-    # Now that we've decoded everything, we just need to join all the parts
-    # together into the final string.
-    return UEMPTYSTRING.join(rtn)
-
-
-\f
-def encode(s, charset='iso-8859-1', encoding='q'):
-    """Encode a string according to RFC 2047."""
-    warnings.warn('Use Header.Header.encode() instead.', DeprecationWarning, 2)
-    encoding = encoding.lower()
-    if encoding == 'q':
-        estr = _qencode(s)
-    elif encoding == 'b':
-        estr = _bencode(s)
-    else:
-        raise ValueError, 'Illegal encoding code: ' + encoding
-    return '=?%s?%s?%s?=' % (charset.lower(), encoding, estr)
-
-
  \f
  def formatdate(timeval=None, localtime=False):
      """Returns a date string as specified by RFC 2822, e.g.:
@@ -179,7 +134,7 @@ def formatdate(timeval=None, localtime=False):
              sign = '-'
          else:
              sign = '+'
-        zone = '%s%02d%02d' % (sign, hours, minutes / 60)
+        zone = '%s%02d%02d' % (sign, hours, minutes // 60)
      else:
          now = time.gmtime(timeval)
          # Timezone offset is always -0000
@@ -314,3 +269,16 @@ def decode_params(params):
              new_params.append(
                  (name, (charset, language, '"%s"' % quote(value))))
      return new_params
+
+def collapse_rfc2231_value(value, errors='replace',
+                           fallback_charset='us-ascii'):
+    if isinstance(value, tuple):
+        rawval = unquote(value[2])
+        charset = value[0] or 'us-ascii'
+        try:
+            return unicode(rawval, charset, errors)
+        except LookupError:
+            # XXX charset is unknown to Python.
+            return unicode(rawval, fallback_charset, errors)
+    else:
+        return unquote(value)
diff --git a/Lib/email/__init__.py b/Lib/email/__init__.py

index 74b9b731e1ade8af2dca8acd9d5868073c678051..8a46fec28801eb3ede8e59ea4fc54bb0eeef8920 100644 (file)
--- a/Lib/email/__init__.py
+++ b/Lib/email/__init__.py
@@ -1,5 +1,6 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: barry@python.org (Barry Warsaw)
+# Author: Barry Warsaw
+# Contact: email-sig@python.org
  
  """A package for parsing, handling, and generating email messages."""
  
@@ -33,25 +34,19 @@ __all__ = [
  # Some convenience routines.  Don't import Parser and Message as side-effects
  # of importing email since those cascadingly import most of the rest of the
  # email package.
-def message_from_string(s, _class=None, strict=False):
+def message_from_string(s, *args, **kws):
      """Parse a string into a Message object model.
  
      Optional _class and strict are passed to the Parser constructor.
      """
      from email.Parser import Parser
-    if _class is None:
-        from email.Message import Message
-        _class = Message
-    return Parser(_class, strict=strict).parsestr(s)
+    return Parser(*args, **kws).parsestr(s)
  
  
-def message_from_file(fp, _class=None, strict=False):
+def message_from_file(fp, *args, **kws):
      """Read a file and parse its contents into a Message object model.
  
      Optional _class and strict are passed to the Parser constructor.
      """
      from email.Parser import Parser
-    if _class is None:
-        from email.Message import Message
-        _class = Message
-    return Parser(_class, strict=strict).parse(fp)
+    return Parser(*args, **kws).parse(fp)
diff --git a/Lib/email/_parseaddr.py b/Lib/email/_parseaddr.py

index 4a31affe9f95a097afe90d111a66c07d04e806b7..f6efcd5c1ea3a9b996d20434caa63f41a9195d13 100644 (file)
--- a/Lib/email/_parseaddr.py
+++ b/Lib/email/_parseaddr.py
@@ -1,4 +1,5 @@
  # Copyright (C) 2002-2004 Python Software Foundation
+# Contact: email-sig@python.org
  
  """Email address parsing code.
  
@@ -115,7 +116,7 @@ def parsedate_tz(data):
              tzoffset = -tzoffset
          else:
              tzsign = 1
-        tzoffset = tzsign * ( (tzoffset/100)*3600 + (tzoffset % 100)*60)
+        tzoffset = tzsign * ( (tzoffset//100)*3600 + (tzoffset % 100)*60)
      tuple = (yy, mm, dd, thh, tmm, tss, 0, 1, 0, tzoffset)
      return tuple
  
diff --git a/Lib/email/base64MIME.py b/Lib/email/base64MIME.py

index af85949763e11db3135c47d8f41a2ed3a2b38129..6ed1d535841b7a58046900717f038a93562b3678 100644 (file)
--- a/Lib/email/base64MIME.py
+++ b/Lib/email/base64MIME.py
@@ -1,5 +1,6 @@
-# Copyright (C) 2002 Python Software Foundation
-# Author: che@debian.org (Ben Gertzfield)
+# Copyright (C) 2002-2004 Python Software Foundation
+# Author: Ben Gertzfield
+# Contact: email-sig@python.org
  
  """Base64 content transfer encoding per RFCs 2045-2047.
  
diff --git a/Lib/email/quopriMIME.py b/Lib/email/quopriMIME.py

index 2802bc2f1c0134bf42a40f35751cb825a9309fa5..a9b5d490ae7a8e6df12642d7a6cd952600017e4d 100644 (file)
--- a/Lib/email/quopriMIME.py
+++ b/Lib/email/quopriMIME.py
@@ -1,5 +1,6 @@
  # Copyright (C) 2001-2004 Python Software Foundation
-# Author: che@debian.org (Ben Gertzfield)
+# Author: Ben Gertzfield
+# Contact: email-sig@python.org
  
  """Quoted-printable content transfer encoding per RFCs 2045-2047.
  
@@ -43,12 +44,12 @@ bqre = re.compile(r'[^ !-<>-~\t]')
  # Helpers
  def header_quopri_check(c):
      """Return True if the character should be escaped with header quopri."""
-    return hqre.match(c) and True
+    return bool(hqre.match(c))
  
  
  def body_quopri_check(c):
      """Return True if the character should be escaped with body quopri."""
-    return bqre.match(c) and True
+    return bool(bqre.match(c))
  
  
  def header_quopri_len(s):
diff --git a/Lib/email/test/__init__.py b/Lib/email/test/__init__.py

index b8a7774849860d6ca1a4c457f43344e7cc9e684e..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 100644 (file)
--- a/Lib/email/test/__init__.py
+++ b/Lib/email/test/__init__.py
@@ -1,2 +0,0 @@
-# Copyright (C) 2002 Python Software Foundation
-# Author: barry@zope.com (Barry Warsaw)
diff --git a/Lib/email/test/data/msg_41.txt b/Lib/email/test/data/msg_41.txt

new file mode 100644 (file)

index 0000000..76cdd1c
--- /dev/null
+++ b/Lib/email/test/data/msg_41.txt
@@ -0,0 +1,8 @@
+From: "Allison Dunlap" <xxx@example.com>
+To: yyy@example.com
+Subject: 64423
+Date: Sun, 11 Jul 2004 16:09:27 -0300
+MIME-Version: 1.0
+Content-Type: multipart/alternative;
+
+Blah blah blah
diff --git a/Lib/email/test/test_email.py b/Lib/email/test/test_email.py

index d079b9e546264c312645d54d4de1840ae65f379b..a55ef5c93586aff71e32a3b00ebcc2d752d6408c 100644 (file)
--- a/Lib/email/test/test_email.py
+++ b/Lib/email/test/test_email.py
@@ -1,4 +1,5 @@
  # Copyright (C) 2001-2004 Python Software Foundation
+# Contact: email-sig@python.org
  # email package unit tests
  
  import os
@@ -51,25 +52,20 @@ def openfile(filename, mode='r'):
  \f
  # Base test class
  class TestEmailBase(unittest.TestCase):
-    if hasattr(difflib, 'ndiff'):
-        # Python 2.2 and beyond
-        def ndiffAssertEqual(self, first, second):
-            """Like failUnlessEqual except use ndiff for readable output."""
-            if first <> second:
-                sfirst = str(first)
-                ssecond = str(second)
-                diff = difflib.ndiff(sfirst.splitlines(), ssecond.splitlines())
-                fp = StringIO()
-                print >> fp, NL, NL.join(diff)
-                raise self.failureException, fp.getvalue()
-    else:
-        # Python 2.1
-        ndiffAssertEqual = unittest.TestCase.assertEqual
-
-    def _msgobj(self, filename, strict=False):
+    def ndiffAssertEqual(self, first, second):
+        """Like failUnlessEqual except use ndiff for readable output."""
+        if first <> second:
+            sfirst = str(first)
+            ssecond = str(second)
+            diff = difflib.ndiff(sfirst.splitlines(), ssecond.splitlines())
+            fp = StringIO()
+            print >> fp, NL, NL.join(diff)
+            raise self.failureException, fp.getvalue()
+
+    def _msgobj(self, filename):
          fp = openfile(findfile(filename))
          try:
-            msg = email.message_from_file(fp, strict=strict)
+            msg = email.message_from_file(fp)
          finally:
              fp.close()
          return msg
@@ -493,44 +489,12 @@ class TestMessageAPI(TestEmailBase):
  \f
  # Test the email.Encoders module
  class TestEncoders(unittest.TestCase):
-    def test_encode_noop(self):
-        eq = self.assertEqual
-        msg = MIMEText('hello world', _encoder=Encoders.encode_noop)
-        eq(msg.get_payload(), 'hello world')
-
-    def test_encode_7bit(self):
-        eq = self.assertEqual
-        msg = MIMEText('hello world', _encoder=Encoders.encode_7or8bit)
-        eq(msg.get_payload(), 'hello world')
-        eq(msg['content-transfer-encoding'], '7bit')
-        msg = MIMEText('hello \x7f world', _encoder=Encoders.encode_7or8bit)
-        eq(msg.get_payload(), 'hello \x7f world')
-        eq(msg['content-transfer-encoding'], '7bit')
-
-    def test_encode_8bit(self):
-        eq = self.assertEqual
-        msg = MIMEText('hello \x80 world', _encoder=Encoders.encode_7or8bit)
-        eq(msg.get_payload(), 'hello \x80 world')
-        eq(msg['content-transfer-encoding'], '8bit')
-
      def test_encode_empty_payload(self):
          eq = self.assertEqual
          msg = Message()
          msg.set_charset('us-ascii')
          eq(msg['content-transfer-encoding'], '7bit')
  
-    def test_encode_base64(self):
-        eq = self.assertEqual
-        msg = MIMEText('hello world', _encoder=Encoders.encode_base64)
-        eq(msg.get_payload(), 'aGVsbG8gd29ybGQ=')
-        eq(msg['content-transfer-encoding'], 'base64')
-
-    def test_encode_quoted_printable(self):
-        eq = self.assertEqual
-        msg = MIMEText('hello world', _encoder=Encoders.encode_quopri)
-        eq(msg.get_payload(), 'hello=20world')
-        eq(msg['content-transfer-encoding'], 'quoted-printable')
-
      def test_default_cte(self):
          eq = self.assertEqual
          msg = MIMEText('hello world')
@@ -932,16 +896,6 @@ class TestMIMEAudio(unittest.TestCase):
          au = MIMEAudio(self._audiodata, 'fish')
          self.assertEqual(im.get_type(), 'audio/fish')
  
-    def test_custom_encoder(self):
-        eq = self.assertEqual
-        def encoder(msg):
-            orig = msg.get_payload()
-            msg.set_payload(0)
-            msg['Content-Transfer-Encoding'] = 'broken64'
-        au = MIMEAudio(self._audiodata, _encoder=encoder)
-        eq(au.get_payload(), 0)
-        eq(au['content-transfer-encoding'], 'broken64')
-
      def test_add_header(self):
          eq = self.assertEqual
          unless = self.failUnless
@@ -985,16 +939,6 @@ class TestMIMEImage(unittest.TestCase):
          im = MIMEImage(self._imgdata, 'fish')
          self.assertEqual(im.get_type(), 'image/fish')
  
-    def test_custom_encoder(self):
-        eq = self.assertEqual
-        def encoder(msg):
-            orig = msg.get_payload()
-            msg.set_payload(0)
-            msg['Content-Transfer-Encoding'] = 'broken64'
-        im = MIMEImage(self._imgdata, _encoder=encoder)
-        eq(im.get_payload(), 0)
-        eq(im['content-transfer-encoding'], 'broken64')
-
      def test_add_header(self):
          eq = self.assertEqual
          unless = self.failUnless
@@ -1396,8 +1340,8 @@ class TestNonConformant(TestEmailBase):
          eq = self.assertEqual
          msg = self._msgobj('msg_14.txt')
          eq(msg.get_type(), 'text')
-        eq(msg.get_main_type(), None)
-        eq(msg.get_subtype(), None)
+        eq(msg.get_content_maintype(), 'text')
+        eq(msg.get_content_subtype(), 'plain')
  
      def test_same_boundary_inner_outer(self):
          unless = self.failUnless
@@ -1406,14 +1350,17 @@ class TestNonConformant(TestEmailBase):
          inner = msg.get_payload(0)
          unless(hasattr(inner, 'defects'))
          self.assertEqual(len(inner.defects), 1)
-        unless(isinstance(inner.defects[0], Errors.StartBoundaryNotFound))
+        unless(isinstance(inner.defects[0],
+                          Errors.StartBoundaryNotFoundDefect))
  
      def test_multipart_no_boundary(self):
          unless = self.failUnless
          msg = self._msgobj('msg_25.txt')
          unless(isinstance(msg.get_payload(), str))
-        self.assertEqual(len(msg.defects), 1)
-        unless(isinstance(msg.defects[0], Errors.NoBoundaryInMultipart))
+        self.assertEqual(len(msg.defects), 2)
+        unless(isinstance(msg.defects[0], Errors.NoBoundaryInMultipartDefect))
+        unless(isinstance(msg.defects[1],
+                          Errors.MultipartInvariantViolationDefect))
  
      def test_invalid_content_type(self):
          eq = self.assertEqual
@@ -1464,40 +1411,19 @@ Subject: here's something interesting
  counter to RFC 2822, there's no separating newline here
  """)
  
+    def test_lying_multipart(self):
+        unless = self.failUnless
+        msg = self._msgobj('msg_41.txt')
+        unless(hasattr(msg, 'defects'))
+        self.assertEqual(len(msg.defects), 2)
+        unless(isinstance(msg.defects[0], Errors.NoBoundaryInMultipartDefect))
+        unless(isinstance(msg.defects[1],
+                          Errors.MultipartInvariantViolationDefect))
+
  
  \f
  # Test RFC 2047 header encoding and decoding
  class TestRFC2047(unittest.TestCase):
-    def test_iso_8859_1(self):
-        eq = self.assertEqual
-        s = '=?iso-8859-1?q?this=20is=20some=20text?='
-        eq(Utils.decode(s), 'this is some text')
-        s = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?='
-        eq(Utils.decode(s), u'Keld J\xf8rn Simonsen')
-        s = '=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=' \
-            '=?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?='
-        eq(Utils.decode(s), 'If you can read this you understand the example.')
-        s = '=?iso-8859-8?b?7eXs+SDv4SDp7Oj08A==?='
-        eq(Utils.decode(s),
-           u'\u05dd\u05d5\u05dc\u05e9 \u05df\u05d1 \u05d9\u05dc\u05d8\u05e4\u05e0')
-        s = '=?iso-8859-1?q?this=20is?= =?iso-8859-1?q?some=20text?='
-        eq(Utils.decode(s), u'this issome text')
-        s = '=?iso-8859-1?q?this=20is_?= =?iso-8859-1?q?some=20text?='
-        eq(Utils.decode(s), u'this is some text')
-
-    def test_encode_header(self):
-        eq = self.assertEqual
-        s = 'this is some text'
-        eq(Utils.encode(s), '=?iso-8859-1?q?this=20is=20some=20text?=')
-        s = 'Keld_J\xf8rn_Simonsen'
-        eq(Utils.encode(s), '=?iso-8859-1?q?Keld_J=F8rn_Simonsen?=')
-        s1 = 'If you can read this yo'
-        s2 = 'u understand the example.'
-        eq(Utils.encode(s1, encoding='b'),
-           '=?iso-8859-1?b?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=')
-        eq(Utils.encode(s2, charset='iso-8859-2', encoding='b'),
-           '=?iso-8859-2?b?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=')
-
      def test_rfc2047_multiline(self):
          eq = self.assertEqual
          s = """Re: =?mac-iceland?q?r=8Aksm=9Arg=8Cs?= baz
@@ -1517,10 +1443,7 @@ class TestRFC2047(unittest.TestCase):
          s = '=?ISO-8859-1?Q?Andr=E9?= Pirard <pirard@dom.ain>'
          dh = decode_header(s)
          eq(dh, [('Andr\xe9', 'iso-8859-1'), ('Pirard <pirard@dom.ain>', None)])
-        # Python 2.1's unicode() builtin doesn't call the object's
-        # __unicode__() method.  Use the following alternative instead.
-        #hu = unicode(make_header(dh)).encode('latin-1')
-        hu = make_header(dh).__unicode__().encode('latin-1')
+        hu = unicode(make_header(dh)).encode('latin-1')
          eq(hu, 'Andr\xe9 Pirard <pirard@dom.ain>')
  
      def test_whitespace_eater_unicode_2(self):
@@ -1870,8 +1793,8 @@ class TestIdempotent(TestEmailBase):
          eq = self.assertEquals
          msg, text = self._msgobj('msg_01.txt')
          eq(msg.get_type(), 'text/plain')
-        eq(msg.get_main_type(), 'text')
-        eq(msg.get_subtype(), 'plain')
+        eq(msg.get_content_maintype(), 'text')
+        eq(msg.get_content_subtype(), 'plain')
          eq(msg.get_params()[1], ('charset', 'us-ascii'))
          eq(msg.get_param('charset'), 'us-ascii')
          eq(msg.preamble, None)
@@ -2712,11 +2635,7 @@ class TestHeader(TestEmailBase):
          eq(decode_header(enc),
             [(g_head, "iso-8859-1"), (cz_head, "iso-8859-2"),
              (utf8_head, "utf-8")])
-        # Test for conversion to unicode.  BAW: Python 2.1 doesn't support the
-        # __unicode__() protocol, so do things this way for compatibility.
-        ustr = h.__unicode__()
-        # For Python 2.2 and beyond
-        #ustr = unicode(h)
+        ustr = unicode(h)
          eq(ustr.encode('utf-8'),
             'Die Mieter treten hier ein werden mit einem Foerderband '
             'komfortabel den Korridor entlang, an s\xc3\xbcdl\xc3\xbcndischen '
@@ -2956,6 +2875,15 @@ Content-Type: text/plain;
          self.assertEqual(msg.get_content_charset(),
                           'this is even more ***fun*** is it not.pdf')
  
+    def test_rfc2231_unknown_encoding(self):
+        m = """\
+Content-Transfer-Encoding: 8bit
+Content-Disposition: inline; filename*0=X-UNKNOWN''myfile.txt
+
+"""
+        msg = email.message_from_string(m)
+        self.assertEqual(msg.get_filename(), 'myfile.txt')
+
  
  \f
  def _testclasses():
diff --git a/Misc/NEWS b/Misc/NEWS

index 448ef29889f0961e75fca2ae72438cc52401792b..cf7a321b557590d5387929894e62da4bf6338769 100644 (file)
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -34,6 +34,19 @@ Extension modules
  Library
  -------
  
+- Updates for the email package:
+  + All deprecated APIs that in email 2.x issued warnings have been removed:
+    _encoder argument to the MIMEText constructor, Message.add_payload(),
+    Utils.dump_address_pair(), Utils.decode(), Utils.encode()
+  + New deprecations: Generator.__call__(), Message.get_type(),
+    Message.get_main_type(), Message.get_subtype(), the 'strict' argument to
+    the Parser constructor.  These will be removed in email 3.1.
+  + Support for Python earlier than 2.3 has been removed (see PEP 291).
+  + All defect classes have been renamed to end in 'Defect'.
+  + Some FeedParser fixes; also a MultipartInvariantViolationDefect will be
+    added to messages that claim to be multipart but really aren't.
+  + Updates to documentation.
+
  - re's findall() and finditer() functions now take an optional flags argument
    just like the compile(), search(), and match() functions.  Also, documented
    the previously existing start and stop parameters for the findall() and
author	Barry Warsaw <barry@python.org>
	Sun, 3 Oct 2004 03:16:19 +0000 (03:16 +0000)
committer	Barry Warsaw <barry@python.org>
	Sun, 3 Oct 2004 03:16:19 +0000 (03:16 +0000)
Doc/lib/email.tex		patch \| blob \| history
Doc/lib/emailencoders.tex		patch \| blob \| history
Doc/lib/emailexc.tex		patch \| blob \| history
Doc/lib/emailmessage.tex		patch \| blob \| history
Doc/lib/emailmimebase.tex		patch \| blob \| history
Doc/lib/emailparser.tex		patch \| blob \| history
Doc/lib/emailutil.tex		patch \| blob \| history
Lib/email/Charset.py		patch \| blob \| history
Lib/email/Encoders.py		patch \| blob \| history
Lib/email/Errors.py		patch \| blob \| history
Lib/email/FeedParser.py		patch \| blob \| history
Lib/email/Generator.py		patch \| blob \| history
Lib/email/Header.py		patch \| blob \| history
Lib/email/Iterators.py		patch \| blob \| history
Lib/email/MIMEAudio.py		patch \| blob \| history
Lib/email/MIMEBase.py		patch \| blob \| history
Lib/email/MIMEImage.py		patch \| blob \| history
Lib/email/MIMEMessage.py		patch \| blob \| history
Lib/email/MIMEMultipart.py		patch \| blob \| history
Lib/email/MIMENonMultipart.py		patch \| blob \| history
Lib/email/MIMEText.py		patch \| blob \| history
Lib/email/Message.py		patch \| blob \| history
Lib/email/Parser.py		patch \| blob \| history
Lib/email/Utils.py		patch \| blob \| history
Lib/email/__init__.py		patch \| blob \| history
Lib/email/_parseaddr.py		patch \| blob \| history
Lib/email/base64MIME.py		patch \| blob \| history
Lib/email/quopriMIME.py		patch \| blob \| history
Lib/email/test/__init__.py		patch \| blob \| history
Lib/email/test/data/msg_41.txt	[new file with mode: 0644]	patch \| blob
Lib/email/test/test_email.py		patch \| blob \| history
Misc/NEWS		patch \| blob \| history