.. seealso::
- Module :mod:`urllib2`
+ Module :mod:`urllib.request`
URL opening with automatic cookie handling.
Module :mod:`http.cookies`
the :class:`CookieJar`'s :class:`CookiePolicy` instance are true and false
respectively), the :mailheader:`Cookie2` header is also added when appropriate.
- The *request* object (usually a :class:`urllib2.Request` instance) must support
- the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`get_type`,
- :meth:`unverifiable`, :meth:`get_origin_req_host`, :meth:`has_header`,
- :meth:`get_header`, :meth:`header_items`, and :meth:`add_unredirected_header`,as
- documented by :mod:`urllib2`.
+ The *request* object (usually a :class:`urllib.request.Request` instance)
+ must support the methods :meth:`get_full_url`, :meth:`get_host`,
+ :meth:`get_type`, :meth:`unverifiable`, :meth:`get_origin_req_host`,
+ :meth:`has_header`, :meth:`get_header`, :meth:`header_items`, and
+ :meth:`add_unredirected_header`, as documented by :mod:`urllib.request`.
.. method:: CookieJar.extract_cookies(response, request)
as appropriate (subject to the :meth:`CookiePolicy.set_ok` method's approval).
The *response* object (usually the result of a call to
- :meth:`urllib2.urlopen`, or similar) should support an :meth:`info` method,
- which returns a :class:`email.message.Message` instance.
+ :meth:`urllib.request.urlopen`, or similar) should support an :meth:`info`
+ method, which returns a :class:`email.message.Message` instance.
- The *request* object (usually a :class:`urllib2.Request` instance) must support
- the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`unverifiable`, and
- :meth:`get_origin_req_host`, as documented by :mod:`urllib2`. The request is
- used to set default values for cookie-attributes as well as for checking that
- the cookie is allowed to be set.
+ The *request* object (usually a :class:`urllib.request.Request` instance)
+ must support the methods :meth:`get_full_url`, :meth:`get_host`,
+ :meth:`unverifiable`, and :meth:`get_origin_req_host`, as documented by
+ :mod:`urllib.request`. The request is used to set default values for
+ cookie-attributes as well as for checking that the cookie is allowed to be
+ set.
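The round trip described above (cookies extracted from a response, then returned on a later request) can be sketched offline. This is a minimal illustration that assumes a current Python 3 standard library. ``FakeResponse`` and the ``session=abc123`` cookie are hypothetical stand-ins invented for the example and are not part of :mod:`http.cookiejar`.

```python
import email.message
import http.cookiejar
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for a urlopen() result; only info() is provided."""

    def __init__(self, headers):
        self._headers = headers

    def info(self):
        # extract_cookies() reads the Set-Cookie headers from this object.
        return self._headers


# Build response headers carrying one (made-up) session cookie.
headers = email.message.Message()
headers["Set-Cookie"] = "session=abc123; Path=/"

request = urllib.request.Request("http://example.com/")
jar = http.cookiejar.CookieJar()

# Store the cookie, subject to the policy's approval.
jar.extract_cookies(FakeResponse(headers), request)

# A later request to the same host gets the Cookie header added.
followup = urllib.request.Request("http://example.com/page")
jar.add_cookie_header(followup)
print(followup.get_header("Cookie"))
```

No network access is needed, because the response object only has to provide ``info()``.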
.. method:: CookieJar.set_policy(policy)
The first example shows the most common usage of :mod:`http.cookiejar`::
- import http.cookiejar, urllib2
+ import http.cookiejar, urllib.request
cj = http.cookiejar.CookieJar()
- opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
+ opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
This example illustrates how to open a URL using your Netscape, Mozilla, or Lynx
cookies (assumes Unix/Netscape convention for location of the cookies file)::
- import os, http.cookiejar, urllib2
+ import os, http.cookiejar, urllib.request
cj = http.cookiejar.MozillaCookieJar()
cj.load(os.path.join(os.environ["HOME"], ".netscape/cookies.txt"))
- opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
+ opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
The next example illustrates the use of :class:`DefaultCookiePolicy`. Turn on
Netscape cookies, and block some domains from setting cookies or having them
returned::
- import urllib2
+ import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy
policy = DefaultCookiePolicy(
rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict,
blocked_domains=["ads.net", ".ads.net"])
cj = CookieJar(policy)
- opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
+ opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
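The effect of ``blocked_domains`` can be checked without opening a URL. The sketch below assumes a current Python 3 standard library. It spells the strictness constant as ``DefaultCookiePolicy.DomainStrict``, since that is the class imported here. The host names are illustrative only.

```python
from http.cookiejar import CookieJar, DefaultCookiePolicy

policy = DefaultCookiePolicy(
    rfc2965=True,
    strict_ns_domain=DefaultCookiePolicy.DomainStrict,
    blocked_domains=["ads.net", ".ads.net"],
)
cj = CookieJar(policy)

# "ads.net" matches the exact entry; ".ads.net" covers its subdomains.
for host in ("ads.net", "www.ads.net", "example.com"):
    print(host, policy.is_blocked(host))
```

Listing both ``"ads.net"`` and ``".ads.net"`` blocks the bare domain as well as any host under it.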
Use the *headers* argument to the :class:`Request` constructor, or::
- import urllib
+ import urllib.request
req = urllib.request.Request('http://www.example.com/')
req.add_header('Referer', 'http://www.python.org/')
r = urllib.request.urlopen(req)
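The *headers* argument mentioned above might look like the following. This is a minimal sketch assuming a current Python 3 standard library. It builds the request but does not send it.

```python
import urllib.request

# Pass the extra header at construction time instead of calling add_header().
req = urllib.request.Request(
    "http://www.example.com/",
    headers={"Referer": "http://www.python.org/"},
)

# The header is stored on the request and can be inspected before sending.
print(req.get_header("Referer"))
```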
:class:`OpenerDirector` automatically adds a :mailheader:`User-Agent` header to
every :class:`Request`. To change this::
- import urllib
+ import urllib.request
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
opener.open('http://www.example.com/')
return attrs
def add_cookie_header(self, request):
- """Add correct Cookie: header to request (urllib2.Request object).
+ """Add correct Cookie: header to request (urllib.request.Request object).
The Cookie2 header is also added unless policy.hide_cookie2 is true.
Send the record to the Web server as an URL-encoded dictionary
"""
try:
- import http.client, urllib
+ import http.client, urllib.parse
host = self.host
h = http.client.HTTP(host)
url = self.url
- data = urllib.urlencode(self.mapLogRecord(record))
+ data = urllib.parse.urlencode(self.mapLogRecord(record))
if self.method == "GET":
if (url.find('?') >= 0):
sep = '&'
Example usage:
-import urllib2
+import urllib.request
# set up authentication info
-authinfo = urllib2.HTTPBasicAuthHandler()
+authinfo = urllib.request.HTTPBasicAuthHandler()
authinfo.add_password(realm='PDQ Application',
uri='https://mahler:8092/site-updates.py',
user='klem',
passwd='geheim$parole')
-proxy_support = urllib2.ProxyHandler({"http" : "http://ahad-haam:3128"})
+proxy_support = urllib.request.ProxyHandler({"http" : "http://ahad-haam:3128"})
# build a new opener that adds authentication and caching FTP handlers
-opener = urllib2.build_opener(proxy_support, authinfo, urllib2.CacheFTPHandler)
+opener = urllib.request.build_opener(proxy_support, authinfo,
+ urllib.request.CacheFTPHandler)
# install it
-urllib2.install_opener(opener)
+urllib.request.install_opener(opener)
-f = urllib2.urlopen('http://www.python.org/')
+f = urllib.request.urlopen('http://www.python.org/')
"""
# XXX issues:
# Strictly (according to RFC 2616), 301 or 302 in response to
# a POST MUST NOT cause a redirection without confirmation
- # from the user (of urllib2, in this case). In practice,
+ # from the user (of urllib.request, in this case). In practice,
# essentially all clients do redirect in this case, so we do
# the same.
# be conciliant with URIs containing a space
if proxy_type is None:
proxy_type = orig_type
if user and password:
- user_pass = '%s:%s' % (unquote(user),
+ user_pass = '%s:%s' % (urllib.parse.unquote(user),
urllib.parse.unquote(password))
creds = base64.b64encode(user_pass.encode()).decode("ascii")
req.add_header('Proxy-authorization', 'Basic ' + creds)
def http_error_407(self, req, fp, code, msg, headers):
# http_error_auth_reqed requires that there is no userinfo component in
- # authority. Assume there isn't one, since urllib2 does not (and
+ # authority. Assume there isn't one, since urllib.request does not (and
# should not, RFC 3986 s. 3.2.1) support requests for URLs containing
# userinfo.
authority = req.get_host()
return urllib.response.addinfourl(open(localfile, 'rb'),
headers, 'file:'+file)
except OSError as msg:
- # urllib2 users shouldn't expect OSErrors coming from urlopen()
+ # users shouldn't expect OSErrors coming from urlopen()
raise urllib.error.URLError(msg)
raise urllib.error.URLError('file not on local host')
Usage: see USAGE variable in the script.
"""
-import platform, os, sys, getopt, textwrap, shutil, urllib2, stat, time, pwd
+import platform, os, sys, getopt, textwrap, shutil, stat, time, pwd
+import urllib.request
import grp
INCLUDE_TIMESTAMP = 1
if KNOWNSIZES.get(url) == size:
print("Using existing file for", url)
return
- fpIn = urllib2.urlopen(url)
+ fpIn = urllib.request.urlopen(url)
fpOut = open(fname, 'wb')
block = fpIn.read(10240)
try:
re Regular Expressions.
reprlib Redo repr() but with limits on most sizes.
rlcompleter Word completion for GNU readline 2.0.
-robotparser Parse robots.txt files, useful for web spiders.
sched A generally useful event scheduler class.
shelve Manage shelves of pickled objects.
shlex Lexical analyzer class for simple shell-like syntaxes.
types Define names for all type symbols in the std interpreter.
tzparse Parse a timezone specification.
unicodedata Interface to unicode properties.
-urllib Open an arbitrary URL.
-urlparse Parse URLs according to latest draft of standard.
+urllib.parse Parse URLs according to latest draft of standard.
+urllib.request Open an arbitrary URL.
+urllib.robotparser Parse robots.txt files, useful for web spiders.
user Hook to allow user-specified customization code to run.
uu UUencode/UUdecode.
unittest Utilities for implementing unit testing.