:mod:`email.generator` module:
-.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78)
+.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78, *, \
+ policy=policy.default)
The constructor for the :class:`Generator` class takes a :term:`file-like object`
called *outfp* for an argument. *outfp* must support the :meth:`write` method
:class:`~email.header.Header` class. Set to zero to disable header wrapping.
The default is 78, as recommended (but not required) by :rfc:`2822`.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the generator's operation. The default policy
+ maintains backward compatibility.
+
+ .. versionchanged:: 3.3 Added the *policy* keyword.
+
The other public :class:`Generator` methods are:
- .. method:: flatten(msg, unixfrom=False, linesep='\\n')
+ .. method:: flatten(msg, unixfrom=False, linesep=None)
Print the textual representation of the message object structure rooted at
*msg* to the output file specified when the :class:`Generator` instance
Note that for subparts, no envelope header is ever printed.
Optional *linesep* specifies the line separator character used to
- terminate lines in the output. It defaults to ``\n`` because that is
- the most useful value for Python application code (other library packages
- expect ``\n`` separated lines). ``linesep=\r\n`` can be used to
- generate output with RFC-compliant line separators.
+ terminate lines in the output. If specified it overrides the value
+ specified by the ``Generator``\'s ``policy``.
- Messages parsed with a Bytes parser that have a
+ Because strings cannot represent non-ASCII bytes, ``Generator`` ignores
+ the value of the :attr:`~email.policy.Policy.must_be_7bit`
+ :mod:`~email.policy` setting and operates as if it were set ``True``.
+ This means that messages parsed with a Bytes parser that have a
:mailheader:`Content-Transfer-Encoding` of 8bit will be converted to a
use a 7bit Content-Transfer-Encoding. Non-ASCII bytes in the headers
will be :rfc:`2047` encoded with a charset of `unknown-8bit`.
formatted string representation of a message object. For more detail, see
:mod:`email.message`.
-.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78)
+.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, \
+ policy=policy.default)
The constructor for the :class:`BytesGenerator` class takes a binary
:term:`file-like object` called *outfp* for an argument. *outfp* must
wrapping. The default is 78, as recommended (but not required) by
:rfc:`2822`.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the generator's operation. The default policy
+ maintains backward compatibility.
+
+ .. versionchanged:: 3.3 Added the *policy* keyword.
+
The other public :class:`BytesGenerator` methods are:
- .. method:: flatten(msg, unixfrom=False, linesep='\n')
+ .. method:: flatten(msg, unixfrom=False, linesep=None)
Print the textual representation of the message object structure rooted
at *msg* to the output file specified when the :class:`BytesGenerator`
instance was created. Subparts are visited depth-first and the resulting
- text will be properly MIME encoded. If the input that created the *msg*
- contained bytes with the high bit set and those bytes have not been
- modified, they will be copied faithfully to the output, even if doing so
- is not strictly RFC compliant. (To produce strictly RFC compliant
- output, use the :class:`Generator` class.)
+ text will be properly MIME encoded. If the :mod:`~email.policy` option
+ :attr:`~email.policy.Policy.must_be_7bit` is ``False`` (the default),
+ then any bytes with the high bit set in the original parsed message that
+ have not been modified will be copied faithfully to the output. If
+ ``must_be_7bit`` is true, the bytes will be converted as needed using an
+ ASCII content-transfer-encoding. In particular, RFC-invalid non-ASCII
+ bytes in headers will be encoded using the MIME ``unknown-8bit``
+ character set, thus rendering them RFC-compliant.
+
+ .. XXX: There should be a complementary option that just does the RFC
+ compliance transformation but leaves CTE 8bit parts alone.
Messages parsed with a Bytes parser that have a
:mailheader:`Content-Transfer-Encoding` of 8bit will be reconstructed
Note that for subparts, no envelope header is ever printed.
Optional *linesep* specifies the line separator character used to
- terminate lines in the output. It defaults to ``\n`` because that is
- the most useful value for Python application code (other library packages
- expect ``\n`` separated lines). ``linesep=\r\n`` can be used to
- generate output with RFC-compliant line separators.
+ terminate lines in the output. If specified it overrides the value
+ specified by the ``Generator``\ 's ``policy``.
.. method:: clone(fp)
Here is the API for the :class:`FeedParser`:
-.. class:: FeedParser(_factory=email.message.Message)
+.. class:: FeedParser(_factory=email.message.Message, *, policy=policy.default)
Create a :class:`FeedParser` instance. Optional *_factory* is a no-argument
callable that will be called whenever a new message object is needed. It
defaults to the :class:`email.message.Message` class.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the parser's operation. The default policy maintains
+ backward compatibility.
+
+ .. versionchanged:: 3.3 Added the *policy* keyword.
+
.. method:: feed(data)
Feed the :class:`FeedParser` some more data. *data* should be a string
.. versionadded:: 3.3 BytesHeaderParser
-.. class:: Parser(_class=email.message.Message)
+.. class:: Parser(_class=email.message.Message, *, policy=policy.default)
The constructor for the :class:`Parser` class takes an optional argument
*_class*. This must be a callable factory (such as a function or a class), and
:class:`~email.message.Message` (see :mod:`email.message`). The factory will
be called without arguments.
- .. versionchanged:: 3.2
- Removed the *strict* argument that was deprecated in 2.4.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the parser's operation. The default policy maintains
+ backward compatibility.
+
+ .. versionchanged:: 3.3
+ Removed the *strict* argument that was deprecated in 2.4. Added the
+ *policy* keyword.
The other public :class:`Parser` methods are:
the entire contents of the file.
-.. class:: BytesParser(_class=email.message.Message, strict=None)
+.. class:: BytesParser(_class=email.message.Message, *, policy=policy.default)
This class is exactly parallel to :class:`Parser`, but handles bytes input.
The *_class* and *strict* arguments are interpreted in the same way as for
- the :class:`Parser` constructor. *strict* is supported only to make porting
- code easier; it is deprecated.
+ the :class:`Parser` constructor.
+
+ The *policy* keyword specifies a :mod:`~email.policy` object that
+ controls a number of aspects of the parser's operation. The default
+ policy maintains backward compatibility.
+
+ .. versionchanged:: 3.3
+ Removed the *strict* argument. Added the *policy* keyword.
.. method:: parse(fp, headeronly=False)
.. currentmodule:: email
-.. function:: message_from_string(s, _class=email.message.Message, strict=None)
+.. function:: message_from_string(s, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure from a string. This is exactly equivalent to
- ``Parser().parsestr(s)``. Optional *_class* and *strict* are interpreted as
+ ``Parser().parsestr(s)``. *_class* and *policy* are interpreted as
with the :class:`Parser` class constructor.
+ .. versionchanged:: removed *strict*, added *policy*
+
.. function:: message_from_bytes(s, _class=email.message.Message, strict=None)
Return a message object structure from a byte string. This is exactly
*strict* are interpreted as with the :class:`Parser` class constructor.
.. versionadded:: 3.2
+ .. versionchanged:: 3.3 removed *strict*, added *policy*
-.. function:: message_from_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_file(fp, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure tree from an open :term:`file object`.
- This is exactly equivalent to ``Parser().parse(fp)``. Optional *_class*
- and *strict* are interpreted as with the :class:`Parser` class constructor.
+ This is exactly equivalent to ``Parser().parse(fp)``. *_class*
+ and *policy* are interpreted as with the :class:`Parser` class constructor.
+
+ .. versionchanged:: 3.3 removed *strict*, added *policy*
-.. function:: message_from_binary_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_binary_file(fp, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure tree from an open binary :term:`file
object`. This is exactly equivalent to ``BytesParser().parse(fp)``.
- Optional *_class* and *strict* are interpreted as with the :class:`Parser`
+ *_class* and *policy* are interpreted as with the :class:`Parser`
class constructor.
.. versionadded:: 3.2
+ .. versionchanged:: 3.3 removed *strict*, added *policy*
Here's an example of how you might use this at an interactive Python prompt::
--- /dev/null
+:mod:`email`: Policy Objects
+----------------------------
+
+.. module:: email.policy
+ :synopsis: Controlling the parsing and generating of messages
+
+
+The :mod:`email` package's prime focus is the handling of email messages as
+described by the various email and MIME RFCs. However, the general format of
+email messages (a block of header fields each consisting of a name followed by
+a colon followed by a value, the whole block followed by a blank line and an
+arbitrary 'body'), is a format that has found utility outside of the realm of
+email. Some of these uses conform fairly closely to the main RFCs, some do
+not. And even when working with email, there are times when it is desirable to
+break strict compliance with the RFCs.
+
+Policy objects are the mechanism used to provide the email package with the
+flexibility to handle all these disparate use cases,
+
+A :class:`Policy` object encapsulates a set of attributes and methods that
+control the behavior of various components of the email package during use.
+:class:`Policy` instances can be passed to various classes and methods in the
+email package to alter the default behavior. The settable values and their
+defaults are described below. The :mod:`policy` module also provides some
+pre-created :class:`Policy` instances. In addition to a :const:`default`
+instance, there are instances tailored for certain applications. For example
+there is an :const:`SMTP` :class:`Policy` with defaults appropriate for
+generating output to be sent to an SMTP server. These are listed :ref:`below
+<Policy Instances>`.
+
+In general an application will only need to deal with setting the policy at the
+input and output boundaries. Once parsed, a message is represented by a
+:class:`~email.message.Message` object, which is designed to be independent of
+the format that the message has "on the wire" when it is received, transmitted,
+or displayed. Thus, a :class:`Policy` can be specified when parsing a message
+to create a :class:`~email.message.Message`, and again when turning the
+:class:`~email.message.Message` into some other representation. While often a
+program will use the same :class:`Policy` for both input and output, the two
+can be different.
+
+As an example, the following code could be used to read an email message from a
+file on disk and pass it to the system ``sendmail`` program on a ``unix``
+system::
+
+ >>> from email import msg_from_binary_file
+ >>> from email.generator import BytesGenerator
+ >>> import email.policy
+ >>> from subprocess import Popen, PIPE
+ >>> with open('mymsg.txt', 'b') as f:
+ >>> msg = msg_from_binary_file(f, policy=email.policy.mbox)
+ >>> p = Popen(['sendmail', msg['To'][0].address], stdin=PIPE)
+ >>> g = BytesGenerator(p.stdin, email.policy.policy=SMTP)
+ >>> g.flatten(msg)
+ >>> p.stdin.close()
+ >>> rc = p.wait()
+
+Some email package methods accept a *policy* keyword argument, allowing the
+policy to be overridden for that method. For example, the following code use
+the :meth:`email.message.Message.as_string` method to the *msg* object from the
+previous example and re-write it to a file using the native line separators for
+the platform on which it is running::
+
+ >>> import os
+ >>> mypolicy = email.policy.Policy(linesep=os.linesep)
+ >>> with open('converted.txt', 'wb') as f:
+ ... f.write(msg.as_string(policy=mypolicy))
+
+Policy instances are immutable, but they can be cloned, accepting the same
+keyword arguments as the class constructor and returning a new :class:`Policy`
+instance that is a copy of the original but with the specified attributes
+values changed. For example, the following creates an SMTP policy that will
+raise any defects detected as errors::
+
+ >>> strict_SMTP = email.policy.SMTP.clone(raise_on_defect=True)
+
+Policy objects can also be combined using the addition operator, producing a
+policy object whose settings are a combination of the non-default values of the
+summed objects::
+
+ >>> strict_SMTP = email.policy.SMTP + email.policy.strict
+
+This operation is not commutative; that is, the order in which the objects are
+added matters. To illustrate::
+
+ >>> Policy = email.policy.Policy
+ >>> apolicy = Policy(max_line_length=100) + Policy(max_line_length=80)
+ >>> apolicy.max_line_length
+ 80
+ >>> apolicy = Policy(max_line_length=80) + Policy(max_line_length=100)
+ >>> apolicy.max_line_length
+ 100
+
+
+.. class:: Policy(**kw)
+
+ The valid constructor keyword arguments are any of the attributes listed
+ below.
+
+ .. attribute:: max_line_length
+
+ The maximum length of any line in the serialized output, not counting the
+ end of line character(s). Default is 78, per :rfc:`5322`. A value of
+ ``0`` or :const:`None` indicates that no line wrapping should be
+ done at all.
+
+ .. attribute:: linesep
+
+ The string to be used to terminate lines in serialized output. The
+ default is '\\n' because that's the internal end-of-line discipline used
+ by Python, though '\\r\\n' is required by the RFCs. See `Policy
+ Instances`_ for policies that use an RFC conformant linesep. Setting it
+ to :attr:`os.linesep` may also be useful.
+
+ .. attribute:: must_be_7bit
+
+ If :const:`True`, data output by a bytes generator is limited to ASCII
+ characters. If :const:`False` (the default), then bytes with the high
+ bit set are preserved and/or allowed in certain contexts (for example,
+ where possible a content transfer encoding of ``8bit`` will be used).
+ String generators act as if ``must_be_7bit`` is `True` regardless of the
+ policy in effect, since a string cannot represent non-ASCII bytes.
+
+ .. attribute:: raise_on_defect
+
+ If :const:`True`, any defects encountered will be raised as errors. If
+ :const:`False` (the default), defects will be passed to the
+ :meth:`register_defect` method.
+
+ .. method:: handle_defect(obj, defect)
+
+ *obj* is the object on which to register the defect. *defect* should be
+ an instance of a subclass of :class:`~email.errors.Defect`.
+ If :attr:`raise_on_defect`
+ is ``True`` the defect is raised as an exception. Otherwise *obj* and
+ *defect* are passed to :meth:`register_defect`. This method is intended
+ to be called by parsers when they encounter defects, and will not be
+ called by code that uses the email library unless that code is
+ implementing an alternate parser.
+
+ .. method:: register_defect(obj, defect)
+
+ *obj* is the object on which to register the defect. *defect* should be
+ a subclass of :class:`~email.errors.Defect`. This method is part of the
+ public API so that custom ``Policy`` subclasses can implement alternate
+ handling of defects. The default implementation calls the ``append``
+ method of the ``defects`` attribute of *obj*.
+
+ .. method:: clone(obj, *kw):
+
+ Return a new :class:`Policy` instance whose attributes have the same
+ values as the current instance, except where those attributes are
+ given new values by the keyword arguments.
+
+
+Policy Instances
+................
+
+The following instances of :class:`Policy` provide defaults suitable for
+specific common application domains.
+
+.. data:: default
+
+ An instance of :class:`Policy` with all defaults unchanged.
+
+.. data:: SMTP
+
+ Output serialized from a message will conform to the email and SMTP
+ RFCs. The only changed attribute is :attr:`linesep`, which is set to
+ ``\r\n``.
+
+.. data:: HTTP
+
+ Suitable for use when serializing headers for use in HTTP traffic.
+ :attr:`linesep` is set to ``\r\n``, and :attr:`max_line_length` is set to
+ :const:`None` (unlimited).
+
+.. data:: strict
+
+ :attr:`raise_on_defect` is set to :const:`True`.
\f
# These are parsing defects which the parser was able to work around.
-class MessageDefect:
+class MessageDefect(Exception):
"""Base class for a message defect."""
def __init__(self, line=None):
from email import errors
from email import message
+from email import policy
NLCRE = re.compile('\r\n|\r|\n')
NLCRE_bol = re.compile('(\r\n|\r|\n)')
class FeedParser:
"""A feed-style parser of email."""
- def __init__(self, _factory=message.Message):
- """_factory is called with no arguments to create a new message obj"""
+ def __init__(self, _factory=message.Message, *, policy=policy.default):
+ """_factory is called with no arguments to create a new message obj
+
+ The policy keyword specifies a policy object that controls a number of
+ aspects of the parser's operation. The default policy maintains
+ backward compatibility.
+
+ """
self._factory = _factory
+ self.policy = policy
self._input = BufferedSubFile()
self._msgstack = []
self._parse = self._parsegen().__next__
# Look for final set of defects
if root.get_content_maintype() == 'multipart' \
and not root.is_multipart():
- root.defects.append(errors.MultipartInvariantViolationDefect())
+ defect = errors.MultipartInvariantViolationDefect()
+ self.policy.handle_defect(root, defect)
return root
def _new_message(self):
# defined a boundary. That's a problem which we'll handle by
# reading everything until the EOF and marking the message as
# defective.
- self._cur.defects.append(errors.NoBoundaryInMultipartDefect())
+ defect = errors.NoBoundaryInMultipartDefect()
+ self.policy.handle_defect(self._cur, defect)
lines = []
for line in self._input:
if line is NeedMoreData:
# that as a defect and store the captured text as the payload.
# Everything from here to the EOF is epilogue.
if capturing_preamble:
- self._cur.defects.append(errors.StartBoundaryNotFoundDefect())
+ defect = errors.StartBoundaryNotFoundDefect()
+ self.policy.handle_defect(self._cur, defect)
self._cur.set_payload(EMPTYSTRING.join(preamble))
epilogue = []
for line in self._input:
# is illegal, so let's note the defect, store the illegal
# line, and ignore it for purposes of headers.
defect = errors.FirstHeaderLineIsContinuationDefect(line)
- self._cur.defects.append(defect)
+ self.policy.handle_defect(self._cur, defect)
continue
lastvalue.append(line)
continue
import warnings
from io import StringIO, BytesIO
+from email import policy
from email.header import Header
from email.message import _has_surrogates
+import email.charset as _charset
UNDERSCORE = '_'
NL = '\n' # XXX: no longer used by the code below.
# Public interface
#
- def __init__(self, outfp, mangle_from_=True, maxheaderlen=78):
+ def __init__(self, outfp, mangle_from_=True, maxheaderlen=None, *,
+ policy=policy.default):
"""Create the generator for message flattening.
outfp is the output file-like object for writing the message to. It
defined in the Header class. Set maxheaderlen to zero to disable
header wrapping. The default is 78, as recommended (but not required)
by RFC 2822.
+
+ The policy keyword specifies a policy object that controls a number of
+ aspects of the generator's operation. The default policy maintains
+ backward compatibility.
+
"""
self._fp = outfp
self._mangle_from_ = mangle_from_
- self._maxheaderlen = maxheaderlen
+ self._maxheaderlen = (maxheaderlen if maxheaderlen is not None else
+ policy.max_line_length)
+ self.policy = policy
def write(self, s):
# Just delegate to the file object
self._fp.write(s)
- def flatten(self, msg, unixfrom=False, linesep='\n'):
+ def flatten(self, msg, unixfrom=False, linesep=None):
r"""Print the message object tree rooted at msg to the output file
specified when the Generator instance was created.
Note that for subobjects, no From_ line is printed.
linesep specifies the characters used to indicate a new line in
- the output. The default value is the most useful for typical
- Python applications, but it can be set to \r\n to produce RFC-compliant
- line separators when needed.
+ the output. The default value is determined by the policy.
"""
# We use the _XXX constants for operating on data that comes directly
# from the msg, and _encoded_XXX constants for operating on data that
# has already been converted (to bytes in the BytesGenerator) and
# inserted into a temporary buffer.
- self._NL = linesep
- self._encoded_NL = self._encode(linesep)
+ self._NL = linesep if linesep is not None else self.policy.linesep
+ self._encoded_NL = self._encode(self._NL)
self._EMPTY = ''
self._encoded_EMTPY = self._encode('')
if unixfrom:
Functionally identical to the base Generator except that the output is
bytes and not string. When surrogates were used in the input to encode
- bytes, these are decoded back to bytes for output.
+ bytes, these are decoded back to bytes for output. If the policy has
+ must_be_7bit set true, then the message is transformed such that the
+ non-ASCII bytes are properly content transfer encoded, using the
+ charset unknown-8bit.
The outfp object must accept bytes in its write method.
"""
# strings with 8bit bytes.
for h, v in msg._headers:
self.write('%s: ' % h)
- if isinstance(v, Header):
- self.write(v.encode(maxlinelen=self._maxheaderlen)+NL)
- elif _has_surrogates(v):
- # If we have raw 8bit data in a byte string, we have no idea
- # what the encoding is. There is no safe way to split this
- # string. If it's ascii-subset, then we could do a normal
- # ascii split, but if it's multibyte then we could break the
- # string. There's no way to know so the least harm seems to
- # be to not split the string and risk it being too long.
- self.write(v+NL)
- else:
- # Header's got lots of smarts and this string is safe...
- header = Header(v, maxlinelen=self._maxheaderlen,
- header_name=h)
- self.write(header.encode(linesep=self._NL)+self._NL)
+ if isinstance(v, str):
+ if _has_surrogates(v):
+ if not self.policy.must_be_7bit:
+ # If we have raw 8bit data in a byte string, we have no idea
+ # what the encoding is. There is no safe way to split this
+ # string. If it's ascii-subset, then we could do a normal
+ # ascii split, but if it's multibyte then we could break the
+ # string. There's no way to know so the least harm seems to
+ # be to not split the string and risk it being too long.
+ self.write(v+NL)
+ continue
+ h = Header(v, charset=_charset.UNKNOWN8BIT, header_name=h)
+ else:
+ h = Header(v, header_name=h)
+ self.write(h.encode(linesep=self._NL,
+ maxlinelen=self._maxheaderlen)+self._NL)
# A blank line always separates headers from body
self.write(self._NL)
# just write it back out.
if msg._payload is None:
return
- if _has_surrogates(msg._payload):
+ if _has_surrogates(msg._payload) and not self.policy.must_be_7bit:
self.write(msg._payload)
else:
super(BytesGenerator,self)._handle_text(msg)
from email.feedparser import FeedParser
from email.message import Message
+from email import policy
\f
class Parser:
- def __init__(self, _class=Message):
+ def __init__(self, _class=Message, *, policy=policy.default):
"""Parser of RFC 2822 and MIME email messages.
Creates an in-memory object tree representing the email message, which
_class is the class to instantiate for new message objects when they
must be created. This class must have a constructor that can take
zero arguments. Default is Message.Message.
+
+ The policy keyword specifies a policy object that controls a number of
+ aspects of the parser's operation. The default policy maintains
+ backward compatibility.
+
"""
self._class = _class
+ self.policy = policy
def parse(self, fp, headersonly=False):
"""Create a message structure from the data in a file.
parsing after reading the headers or not. The default is False,
meaning it parses the entire contents of the file.
"""
- feedparser = FeedParser(self._class)
+ feedparser = FeedParser(self._class, policy=self.policy)
if headersonly:
feedparser._set_headersonly()
while True:
--- /dev/null
+"""Policy framework for the email package.
+
+Allows fine grained feature control of how the package parses and emits data.
+"""
+
+__all__ = [
+ 'Policy',
+ 'default',
+ 'strict',
+ 'SMTP',
+ 'HTTP',
+ ]
+
+
+class _PolicyBase:
+
+ """Policy Object basic framework.
+
+ This class is useless unless subclassed. A subclass should define
+ class attributes with defaults for any values that are to be
+ managed by the Policy object. The constructor will then allow
+ non-default values to be set for these attributes at instance
+ creation time. The instance will be callable, taking these same
+ attributes keyword arguments, and returning a new instance
+ identical to the called instance except for those values changed
+ by the keyword arguments. Instances may be added, yielding new
+ instances with any non-default values from the right hand
+ operand overriding those in the left hand operand. That is,
+
+ A + B == A(<non-default values of B>)
+
+ The repr of an instance can be used to reconstruct the object
+ if and only if the repr of the values can be used to reconstruct
+ those values.
+
+ """
+
+ def __init__(self, **kw):
+ """Create new Policy, possibly overriding some defaults.
+
+ See class docstring for a list of overridable attributes.
+
+ """
+ for name, value in kw.items():
+ if hasattr(self, name):
+ super(_PolicyBase,self).__setattr__(name, value)
+ else:
+ raise TypeError(
+ "{!r} is an invalid keyword argument for {}".format(
+ name, self.__class__.__name__))
+
+ def __repr__(self):
+ args = [ "{}={!r}".format(name, value)
+ for name, value in self.__dict__.items() ]
+ return "{}({})".format(self.__class__.__name__, args if args else '')
+
+ def clone(self, **kw):
+ """Return a new instance with specified attributes changed.
+
+ The new instance has the same attribute values as the current object,
+ except for the changes passed in as keyword arguments.
+
+ """
+ for attr, value in self.__dict__.items():
+ if attr not in kw:
+ kw[attr] = value
+ return self.__class__(**kw)
+
+ def __setattr__(self, name, value):
+ if hasattr(self, name):
+ msg = "{!r} object attribute {!r} is read-only"
+ else:
+ msg = "{!r} object has no attribute {!r}"
+ raise AttributeError(msg.format(self.__class__.__name__, name))
+
+ def __add__(self, other):
+ """Non-default values from right operand override those from left.
+
+ The object returned is a new instance of the subclass.
+
+ """
+ return self.clone(**other.__dict__)
+
+
+class Policy(_PolicyBase):
+
+ """Controls for how messages are interpreted and formatted.
+
+ Most of the classes and many of the methods in the email package
+ accept Policy objects as parameters. A Policy object contains a set
+ of values and functions that control how input is interpreted and how
+ output is rendered. For example, the parameter 'raise_on_defect'
+ controls whether or not an RFC violation throws an error or not,
+ while 'max_line_length' controls the maximum length of output lines
+ when a Message is serialized.
+
+ Any valid attribute may be overridden when a Policy is created by
+ passing it as a keyword argument to the constructor. Policy
+ objects are immutable, but a new Policy object can be created
+ with only certain values changed by calling the Policy instance
+ with keyword arguments. Policy objects can also be added,
+ producing a new Policy object in which the non-default attributes
+ set in the right hand operand overwrite those specified in the
+ left operand.
+
+ Settable attributes:
+
+ raise_on_defect -- If true, then defects should be raised
+ as errors. Default False.
+
+ linesep -- string containing the value to use as
+ separation between output lines. Default '\n'.
+
+ must_be_7bit -- output must contain only 7bit clean data.
+ Default False.
+
+ max_line_length -- maximum length of lines, excluding 'linesep',
+ during serialization. None means no line
+ wrapping is done. Default is 78.
+
+ Methods:
+
+ register_defect(obj, defect)
+ defect is a Defect instance. The default implementation appends defect
+ to the objs 'defects' attribute.
+
+ handle_defect(obj, defect)
+ intended to be called by parser code that finds a defect. If
+ raise_on_defect is True, defect is raised as an error, otherwise
+ register_defect is called.
+
+ """
+
+ raise_on_defect = False
+ linesep = '\n'
+ must_be_7bit = False
+ max_line_length = 78
+
+ def handle_defect(self, obj, defect):
+ """Based on policy, either raise defect or call register_defect.
+
+ handle_defect(obj, defect)
+
+ defect should be a Defect subclass, but in any case must be an
+ Exception subclass. obj is the object on which the defect should be
+ registered if it is not raised. If the raise_on_defect is True, the
+ defect is raised as an error, otherwise the object and the defect are
+ passed to register_defect.
+
+ This class is intended to be called by parsers that discover defects,
+ and will not be called from code using the library unless that code is
+ implementing an alternate parser.
+
+ """
+ if self.raise_on_defect:
+ raise defect
+ self.register_defect(obj, defect)
+
+ def register_defect(self, obj, defect):
+ """Record 'defect' on 'obj'.
+
+ Called by handle_defect if raise_on_defect is False. This method is
+ part of the Policy API so that Policy subclasses can implement custom
+ defect handling. The default implementation calls the append method
+ of the defects attribute of obj.
+
+ """
+ obj.defects.append(defect)
+
+
+default = Policy()
+strict = default.clone(raise_on_defect=True)
+SMTP = default.clone(linesep='\r\n')
+HTTP = default.clone(linesep='\r\n', max_line_length=None)
# Base test class
class TestEmailBase(unittest.TestCase):
+ maxDiff = None
+
def __init__(self, *args, **kw):
super().__init__(*args, **kw)
self.addTypeEqualityFunc(bytes, self.assertBytesEqual)
# Test some badly formatted messages
-class TestNonConformant(TestEmailBase):
+class TestNonConformantBase:
+
+ def _msgobj(self, filename):
+ with openfile(filename) as fp:
+ return email.message_from_file(fp, policy=self.policy)
+
def test_parse_missing_minor_type(self):
eq = self.assertEqual
msg = self._msgobj('msg_14.txt')
# XXX We can probably eventually do better
inner = msg.get_payload(0)
unless(hasattr(inner, 'defects'))
- self.assertEqual(len(inner.defects), 1)
- unless(isinstance(inner.defects[0],
+ self.assertEqual(len(self.get_defects(inner)), 1)
+ unless(isinstance(self.get_defects(inner)[0],
errors.StartBoundaryNotFoundDefect))
def test_multipart_no_boundary(self):
unless = self.assertTrue
msg = self._msgobj('msg_25.txt')
unless(isinstance(msg.get_payload(), str))
- self.assertEqual(len(msg.defects), 2)
- unless(isinstance(msg.defects[0], errors.NoBoundaryInMultipartDefect))
- unless(isinstance(msg.defects[1],
+ self.assertEqual(len(self.get_defects(msg)), 2)
+ unless(isinstance(self.get_defects(msg)[0],
+ errors.NoBoundaryInMultipartDefect))
+ unless(isinstance(self.get_defects(msg)[1],
errors.MultipartInvariantViolationDefect))
def test_invalid_content_type(self):
unless = self.assertTrue
msg = self._msgobj('msg_41.txt')
unless(hasattr(msg, 'defects'))
- self.assertEqual(len(msg.defects), 2)
- unless(isinstance(msg.defects[0], errors.NoBoundaryInMultipartDefect))
- unless(isinstance(msg.defects[1],
+ self.assertEqual(len(self.get_defects(msg)), 2)
+ unless(isinstance(self.get_defects(msg)[0],
+ errors.NoBoundaryInMultipartDefect))
+ unless(isinstance(self.get_defects(msg)[1],
errors.MultipartInvariantViolationDefect))
def test_missing_start_boundary(self):
#
# [*] This message is missing its start boundary
bad = outer.get_payload(1).get_payload(0)
- self.assertEqual(len(bad.defects), 1)
- self.assertTrue(isinstance(bad.defects[0],
+ self.assertEqual(len(self.get_defects(bad)), 1)
+ self.assertTrue(isinstance(self.get_defects(bad)[0],
errors.StartBoundaryNotFoundDefect))
def test_first_line_is_continuation_header(self):
eq = self.assertEqual
m = ' Line 1\nLine 2\nLine 3'
- msg = email.message_from_string(m)
+ msg = email.message_from_string(m, policy=self.policy)
eq(msg.keys(), [])
eq(msg.get_payload(), 'Line 2\nLine 3')
- eq(len(msg.defects), 1)
- self.assertTrue(isinstance(msg.defects[0],
+ eq(len(self.get_defects(msg)), 1)
+ self.assertTrue(isinstance(self.get_defects(msg)[0],
errors.FirstHeaderLineIsContinuationDefect))
- eq(msg.defects[0].line, ' Line 1\n')
+ eq(self.get_defects(msg)[0].line, ' Line 1\n')
+
+
+class TestNonConformant(TestNonConformantBase, TestEmailBase):
+
+ policy=email.policy.default
+
+ def get_defects(self, obj):
+ return obj.defects
+
+
+class TestNonConformantCapture(TestNonConformantBase, TestEmailBase):
+
+ class CapturePolicy(email.policy.Policy):
+ captured = None
+ def register_defect(self, obj, defect):
+ self.captured.append(defect)
+
+ def setUp(self):
+ self.policy = self.CapturePolicy(captured=list())
+
+ def get_defects(self, obj):
+ return self.policy.captured
+
+class TestRaisingDefects(TestEmailBase):
+
+ def _msgobj(self, filename):
+ with openfile(filename) as fp:
+ return email.message_from_file(fp, policy=email.policy.strict)
+
+ def test_same_boundary_inner_outer(self):
+ with self.assertRaises(errors.StartBoundaryNotFoundDefect):
+ self._msgobj('msg_15.txt')
+
+ def test_multipart_no_boundary(self):
+ with self.assertRaises(errors.NoBoundaryInMultipartDefect):
+ self._msgobj('msg_25.txt')
+
+ def test_lying_multipart(self):
+ with self.assertRaises(errors.NoBoundaryInMultipartDefect):
+ self._msgobj('msg_41.txt')
+
+
+ def test_missing_start_boundary(self):
+ with self.assertRaises(errors.StartBoundaryNotFoundDefect):
+ self._msgobj('msg_42.txt')
+
+ def test_first_line_is_continuation_header(self):
+ m = ' Line 1\nLine 2\nLine 3'
+ with self.assertRaises(errors.FirstHeaderLineIsContinuationDefect):
+ msg = email.message_from_string(m, policy=email.policy.strict)
# Test RFC 2047 header encoding and decoding
g.flatten(msg, linesep='\r\n')
self.assertEqual(s.getvalue(), text)
+ def test_crlf_control_via_policy(self):
+ with openfile('msg_26.txt', newline='\n') as fp:
+ text = fp.read()
+ msg = email.message_from_string(text)
+ s = StringIO()
+ g = email.generator.Generator(s, policy=email.policy.SMTP)
+ g.flatten(msg)
+ self.assertEqual(s.getvalue(), text)
+
+ def test_flatten_linesep_overrides_policy(self):
+ # msg_27 is lf separated
+ with openfile('msg_27.txt', newline='\n') as fp:
+ text = fp.read()
+ msg = email.message_from_string(text)
+ s = StringIO()
+ g = email.generator.Generator(s, policy=email.policy.SMTP)
+ g.flatten(msg, linesep='\n')
+ self.assertEqual(s.getvalue(), text)
+
maxDiff = None
def test_multipart_digest_with_extra_mime_headers(self):
g.flatten(msg)
self.assertEqual(s.getvalue(), source)
+ def test_crlf_control_via_policy(self):
+ # msg_26 is crlf terminated
+ with openfile('msg_26.txt', 'rb') as fp:
+ text = fp.read()
+ msg = email.message_from_bytes(text)
+ s = BytesIO()
+ g = email.generator.BytesGenerator(s, policy=email.policy.SMTP)
+ g.flatten(msg)
+ self.assertEqual(s.getvalue(), text)
+
+ def test_flatten_linesep_overrides_policy(self):
+ # msg_27 is lf separated
+ with openfile('msg_27.txt', 'rb') as fp:
+ text = fp.read()
+ msg = email.message_from_bytes(text)
+ s = BytesIO()
+ g = email.generator.BytesGenerator(s, policy=email.policy.SMTP)
+ g.flatten(msg, linesep='\n')
+ self.assertEqual(s.getvalue(), text)
+
+ def test_must_be_7bit_handles_unknown_8bit(self):
+ msg = email.message_from_bytes(self.non_latin_bin_msg)
+ out = BytesIO()
+ g = email.generator.BytesGenerator(out,
+ policy=email.policy.default.clone(must_be_7bit=True))
+ g.flatten(msg)
+ self.assertEqual(out.getvalue(),
+ self.non_latin_bin_msg_as7bit_wrapped.encode('ascii'))
+
+ def test_must_be_7bit_transforms_8bit_cte(self):
+ msg = email.message_from_bytes(self.latin_bin_msg)
+ out = BytesIO()
+ g = email.generator.BytesGenerator(out,
+ policy=email.policy.default.clone(must_be_7bit=True))
+ g.flatten(msg)
+ self.assertEqual(out.getvalue(),
+ self.latin_bin_msg_as7bit.encode('ascii'))
+
maxDiff = None
--- /dev/null
+import io
+import textwrap
+import unittest
+from email import message_from_string, message_from_bytes
+from email.generator import Generator, BytesGenerator
+from email import policy
+from test.test_email import TestEmailBase
+
+# XXX: move generator tests from test_email into here at some point.
+
+
+class TestGeneratorBase():
+
+ long_subject = {
+ 0: textwrap.dedent("""\
+ To: whom_it_may_concern@example.com
+ From: nobody_you_want_to_know@example.com
+ Subject: We the willing led by the unknowing are doing the
+ impossible for the ungrateful. We have done so much for so long with so little
+ we are now qualified to do anything with nothing.
+
+ None
+ """),
+ 40: textwrap.dedent("""\
+ To: whom_it_may_concern@example.com
+ From:\x20
+ nobody_you_want_to_know@example.com
+ Subject: We the willing led by the
+ unknowing are doing the
+ impossible for the ungrateful. We have
+ done so much for so long with so little
+ we are now qualified to do anything
+ with nothing.
+
+ None
+ """),
+ 20: textwrap.dedent("""\
+ To:\x20
+ whom_it_may_concern@example.com
+ From:\x20
+ nobody_you_want_to_know@example.com
+ Subject: We the
+ willing led by the
+ unknowing are doing
+ the
+ impossible for the
+ ungrateful. We have
+ done so much for so
+ long with so little
+ we are now
+ qualified to do
+ anything with
+ nothing.
+
+ None
+ """),
+ }
+ long_subject[100] = long_subject[0]
+
+ def maxheaderlen_parameter_test(self, n):
+ msg = self.msgmaker(self.long_subject[0])
+ s = self.ioclass()
+ g = self.genclass(s, maxheaderlen=n)
+ g.flatten(msg)
+ self.assertEqual(s.getvalue(), self.long_subject[n])
+
+ def test_maxheaderlen_parameter_0(self):
+ self.maxheaderlen_parameter_test(0)
+
+ def test_maxheaderlen_parameter_100(self):
+ self.maxheaderlen_parameter_test(100)
+
+ def test_maxheaderlen_parameter_40(self):
+ self.maxheaderlen_parameter_test(40)
+
+ def test_maxheaderlen_parameter_20(self):
+ self.maxheaderlen_parameter_test(20)
+
+ def maxheaderlen_policy_test(self, n):
+ msg = self.msgmaker(self.long_subject[0])
+ s = self.ioclass()
+ g = self.genclass(s, policy=policy.default.clone(max_line_length=n))
+ g.flatten(msg)
+ self.assertEqual(s.getvalue(), self.long_subject[n])
+
+ def test_maxheaderlen_policy_0(self):
+ self.maxheaderlen_policy_test(0)
+
+ def test_maxheaderlen_policy_100(self):
+ self.maxheaderlen_policy_test(100)
+
+ def test_maxheaderlen_policy_40(self):
+ self.maxheaderlen_policy_test(40)
+
+ def test_maxheaderlen_policy_20(self):
+ self.maxheaderlen_policy_test(20)
+
+ def maxheaderlen_parm_overrides_policy_test(self, n):
+ msg = self.msgmaker(self.long_subject[0])
+ s = self.ioclass()
+ g = self.genclass(s, maxheaderlen=n,
+ policy=policy.default.clone(max_line_length=10))
+ g.flatten(msg)
+ self.assertEqual(s.getvalue(), self.long_subject[n])
+
+ def test_maxheaderlen_parm_overrides_policy_0(self):
+ self.maxheaderlen_parm_overrides_policy_test(0)
+
+ def test_maxheaderlen_parm_overrides_policy_100(self):
+ self.maxheaderlen_parm_overrides_policy_test(100)
+
+ def test_maxheaderlen_parm_overrides_policy_40(self):
+ self.maxheaderlen_parm_overrides_policy_test(40)
+
+ def test_maxheaderlen_parm_overrides_policy_20(self):
+ self.maxheaderlen_parm_overrides_policy_test(20)
+
+
+class TestGenerator(TestGeneratorBase, TestEmailBase):
+
+ msgmaker = staticmethod(message_from_string)
+ genclass = Generator
+ ioclass = io.StringIO
+
+
+class TestBytesGenerator(TestGeneratorBase, TestEmailBase):
+
+ msgmaker = staticmethod(message_from_bytes)
+ genclass = BytesGenerator
+ ioclass = io.BytesIO
+ long_subject = {key: x.encode('ascii')
+ for key, x in TestGeneratorBase.long_subject.items()}
+
+
+if __name__ == '__main__':
+ unittest.main()
--- /dev/null
+import types
+import unittest
+import email.policy
+
+class PolicyAPITests(unittest.TestCase):
+
+ longMessage = True
+
+ # These default values are the ones set on email.policy.default.
+ # If any of these defaults change, the docs must be updated.
+ policy_defaults = {
+ 'max_line_length': 78,
+ 'linesep': '\n',
+ 'must_be_7bit': False,
+ 'raise_on_defect': False,
+ }
+
+ # For each policy under test, we give here the values of the attributes
+ # that are different from the defaults for that policy.
+ policies = {
+ email.policy.Policy(): {},
+ email.policy.default: {},
+ email.policy.SMTP: {'linesep': '\r\n'},
+ email.policy.HTTP: {'linesep': '\r\n', 'max_line_length': None},
+ email.policy.strict: {'raise_on_defect': True},
+ }
+
+ def test_defaults(self):
+ for policy, changed_defaults in self.policies.items():
+ expected = self.policy_defaults.copy()
+ expected.update(changed_defaults)
+ for attr, value in expected.items():
+ self.assertEqual(getattr(policy, attr), value,
+ ("change {} docs/docstrings if defaults have "
+ "changed").format(policy))
+
+ def test_all_attributes_covered(self):
+ for attr in dir(email.policy.default):
+ if (attr.startswith('_') or
+ isinstance(getattr(email.policy.Policy, attr),
+ types.FunctionType)):
+ continue
+ else:
+ self.assertIn(attr, self.policy_defaults,
+ "{} is not fully tested".format(attr))
+
+ def test_policy_is_immutable(self):
+ for policy in self.policies:
+ for attr in self.policy_defaults:
+ with self.assertRaisesRegex(AttributeError, attr+".*read-only"):
+ setattr(policy, attr, None)
+ with self.assertRaisesRegex(AttributeError, 'no attribute.*foo'):
+ policy.foo = None
+
+ def test_set_policy_attrs_when_calledl(self):
+ testattrdict = { attr: None for attr in self.policy_defaults }
+ for policyclass in self.policies:
+ policy = policyclass.clone(**testattrdict)
+ for attr in self.policy_defaults:
+ self.assertIsNone(getattr(policy, attr))
+
+ def test_reject_non_policy_keyword_when_called(self):
+ for policyclass in self.policies:
+ with self.assertRaises(TypeError):
+ policyclass(this_keyword_should_not_be_valid=None)
+ with self.assertRaises(TypeError):
+ policyclass(newtline=None)
+
+ def test_policy_addition(self):
+ expected = self.policy_defaults.copy()
+ p1 = email.policy.default.clone(max_line_length=100)
+ p2 = email.policy.default.clone(max_line_length=50)
+ added = p1 + p2
+ expected.update(max_line_length=50)
+ for attr, value in expected.items():
+ self.assertEqual(getattr(added, attr), value)
+ added = p2 + p1
+ expected.update(max_line_length=100)
+ for attr, value in expected.items():
+ self.assertEqual(getattr(added, attr), value)
+ added = added + email.policy.default
+ for attr, value in expected.items():
+ self.assertEqual(getattr(added, attr), value)
+
+ def test_register_defect(self):
+ class Dummy:
+ def __init__(self):
+ self.defects = []
+ obj = Dummy()
+ defect = object()
+ policy = email.policy.Policy()
+ policy.register_defect(obj, defect)
+ self.assertEqual(obj.defects, [defect])
+ defect2 = object()
+ policy.register_defect(obj, defect2)
+ self.assertEqual(obj.defects, [defect, defect2])
+
+ class MyObj:
+ def __init__(self):
+ self.defects = []
+
+ class MyDefect(Exception):
+ pass
+
+ def test_handle_defect_raises_on_strict(self):
+ foo = self.MyObj()
+ defect = self.MyDefect("the telly is broken")
+ with self.assertRaisesRegex(self.MyDefect, "the telly is broken"):
+ email.policy.strict.handle_defect(foo, defect)
+
+ def test_handle_defect_registers_defect(self):
+ foo = self.MyObj()
+ defect1 = self.MyDefect("one")
+ email.policy.default.handle_defect(foo, defect1)
+ self.assertEqual(foo.defects, [defect1])
+ defect2 = self.MyDefect("two")
+ email.policy.default.handle_defect(foo, defect2)
+ self.assertEqual(foo.defects, [defect1, defect2])
+
+ class MyPolicy(email.policy.Policy):
+ defects = []
+ def register_defect(self, obj, defect):
+ self.defects.append(defect)
+
+ def test_overridden_register_defect_still_raises(self):
+ foo = self.MyObj()
+ defect = self.MyDefect("the telly is broken")
+ with self.assertRaisesRegex(self.MyDefect, "the telly is broken"):
+ self.MyPolicy(raise_on_defect=True).handle_defect(foo, defect)
+
+ def test_overriden_register_defect_works(self):
+ foo = self.MyObj()
+ defect1 = self.MyDefect("one")
+ my_policy = self.MyPolicy()
+ my_policy.handle_defect(foo, defect1)
+ self.assertEqual(my_policy.defects, [defect1])
+ self.assertEqual(foo.defects, [])
+ defect2 = self.MyDefect("two")
+ my_policy.handle_defect(foo, defect2)
+ self.assertEqual(my_policy.defects, [defect1, defect2])
+ self.assertEqual(foo.defects, [])
+
+ # XXX: Need subclassing tests.
+ # For adding subclassed objects, make sure the usual rules apply (subclass
+ # wins), but that the order still works (right overrides left).
+
+if __name__ == '__main__':
+ unittest.main()
Library
-------
+- Issue #11731: simplify/enhance email parser/generator API by introducing
+ policy objects.
+
- Issue #11768: The signal handler of the signal module only calls
Py_AddPendingCall() for the first signal to fix a deadlock on reentrant or
parallel calls. PyErr_SetInterrupt() writes also into the wake up file.