--- /dev/null
+:mod:`packaging.pypi.simple` --- Crawler using the PyPI "simple" interface
+==========================================================================
+
+.. module:: packaging.pypi.simple
+ :synopsis: Crawler using the screen-scraping "simple" interface to fetch info
+ and distributions.
+
+
+The class provided by :mod:`packaging.pypi.simple` can access project indexes
+and provide useful information about distributions. PyPI, other indexes and
+local indexes are supported.
+
+You should use this module to search for distributions by name and version,
+to process index external pages, and to download distributions. It is not
+suited for queries that require overly long index processing (such as "find
+all distributions with a specific version, no matter the name"); use
+:mod:`packaging.pypi.xmlrpc` for that.
+
+
+API
+---
+
+.. class:: Crawler(index_url=DEFAULT_SIMPLE_INDEX_URL, \
+ prefer_final=False, prefer_source=True, \
+ hosts=('*',), follow_externals=False, \
+ mirrors_url=None, mirrors=None, timeout=15, \
+ mirrors_max_tries=0)
+
+ *index_url* is the address of the index to use for requests.
+
+ The first two parameters control the query results. *prefer_final*
+ indicates whether a final version (not alpha, beta or candidate) is to be
+   preferred over a newer but non-final version (for example, whether to pick
+ up 1.0 over 2.0a3). It is used only for queries that don't give a version
+ argument. Likewise, *prefer_source* tells whether to prefer a source
+   distribution over a binary one, if no distribution argument was provided.
+
+   Other parameters are related to external links (that is, links that go
+   outside the simple index): *hosts* is a list of hosts allowed to be
+   processed when *follow_externals* is true (the default is to allow all
+   hosts), and *follow_externals* enables or disables the following of
+   external links (the default is false, meaning disabled).
+
+   The remaining parameters are related to the mirroring infrastructure
+   defined in :PEP:`381`. *mirrors_url* gives the URL to query for the DNS
+   records that list mirror addresses; *mirrors* is a list of mirror URLs
+   (see the PEP). If both *mirrors* and *mirrors_url* are given, *mirrors_url*
+   will only be used if *mirrors* is set to ``None``. *timeout* is the time
+   (in seconds) to wait before considering that a URL has timed out;
+   *mirrors_max_tries* is the number of times to try requesting information
+   on mirrors before switching.
+
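+   For example, a crawler that prefers final versions and follows external
+   links only to a couple of trusted hosts could be built like this (a
+   sketch; the host names are hypothetical)::
+
+      >>> from packaging.pypi.simple import Crawler
+      >>> crawler = Crawler(prefer_final=True, follow_externals=True,
+      ...                   hosts=('example.org', 'downloads.example.net'))
+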
+ The following methods are defined:
+
+ .. method:: get_distributions(project_name, version)
+
+ Return the distributions found in the index for the given release.
+
+ .. method:: get_metadata(project_name, version)
+
+ Return the metadata found on the index for this project name and
+ version. Currently downloads and unpacks a distribution to read the
+ PKG-INFO file.
+
+ .. method:: get_release(requirements, prefer_final=None)
+
+ Return one release that fulfills the given requirements.
+
+ .. method:: get_releases(requirements, prefer_final=None, force_update=False)
+
+ Search for releases and return a
+ :class:`~packaging.pypi.dist.ReleasesList` object containing the
+ results.
+
+ .. method:: search_projects(name=None)
+
+      Search the index for projects containing the given name, and return a
+      list of matching projects.
+
+ See also the base class :class:`packaging.pypi.base.BaseClient` for inherited
+ methods.
+
+
+.. data:: DEFAULT_SIMPLE_INDEX_URL
+
+ The address used by default by the crawler class. It is currently
+ ``'http://a.pypi.python.org/simple/'``, the main PyPI installation.
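+
+   You can inspect it at runtime (a quick sketch)::
+
+      >>> from packaging.pypi.simple import DEFAULT_SIMPLE_INDEX_URL
+      >>> DEFAULT_SIMPLE_INDEX_URL
+      'http://a.pypi.python.org/simple/'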
+
+
+
+
+Usage Examples
+--------------
+
+To help you understand how to use the `Crawler` class, here are some basic
+usage examples.
+
+Request the simple index to get a specific distribution
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Suppose you want to scan an index to get a list of releases for the
+"foobar" project. You can use the `get_releases` method for that. It
+will browse the project page and return :class:`ReleaseInfo` objects for
+each download link found. ::
+
+   >>> from packaging.pypi.simple import Crawler
+   >>> client = Crawler()
+   >>> client.get_releases("FooBar")
+   [<ReleaseInfo "FooBar 1.1">, <ReleaseInfo "FooBar 1.2">]
+
+
+Note that you can also ask the client for specific versions, using version
+specifiers (described in `PEP 345
+<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::
+
+ >>> client.get_releases("FooBar < 1.2")
+ [<ReleaseInfo "FooBar 1.1">, ]
+
+
+`get_releases` returns a list of :class:`ReleaseInfo` objects, but you can
+also get the best distribution that fulfills your requirements, using
+`get_release`::
+
+ >>> client.get_release("FooBar < 1.2")
+ <ReleaseInfo "FooBar 1.1">
+
+
+Download distributions
+^^^^^^^^^^^^^^^^^^^^^^
+
+As it can get the URLs of the distributions provided by PyPI, the `Crawler`
+client can also download those distributions and store them for you in a
+temporary destination::
+
+   >>> client.download("foobar")
+   '/tmp/temp_dir/foobar-1.2.tar.gz'
+
+
+You can also specify the directory you want to download to::
+
+   >>> client.download("foobar", "/path/to/my/dir")
+   '/path/to/my/dir/foobar-1.2.tar.gz'
+
+
+While downloading, the MD5 sum of the archive is checked; if it does not
+match, the download is tried a second time, and if it fails again,
+`MD5HashDoesNotMatchError` is raised.
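+
+If you prefer to handle verification failures yourself, you can catch the
+exception (a minimal sketch; the exact import location of the exception
+class is an assumption)::
+
+   >>> from packaging.pypi.errors import MD5HashDoesNotMatchError  # assumed module
+   >>> try:
+   ...     path = client.download("foobar")
+   ... except MD5HashDoesNotMatchError:
+   ...     print("checksum mismatch; the archive could not be verified")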
+
+Internally, it is not the `Crawler` that downloads the distributions, but
+the `DistributionInfo` class. Please refer to its documentation for more
+details.
+
+
+Following PyPI external links
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The default behavior for packaging is to *not* follow the links provided
+by HTML pages in the "simple index" when looking for distribution
+downloads.
+
+It's possible to tell the crawler to follow external links by setting the
+`follow_externals` attribute, on instantiation or after::
+
+ >>> client = Crawler(follow_externals=True)
+
+or ::
+
+ >>> client = Crawler()
+ >>> client.follow_externals = True
+
+
+Working with external indexes and mirrors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The default `Crawler` behavior is to rely on the Python Package Index served
+on PyPI (http://pypi.python.org/simple).
+
+If you need to work with a local index, or with private indexes, you can
+specify it using the *index_url* parameter::
+
+ >>> client = Crawler(index_url="file://filesystem/path/")
+
+or ::
+
+ >>> client = Crawler(index_url="http://some.specific.url/")
+
+
+You can also specify mirrors to fall back on in case the first *index_url*
+you provided doesn't respond, or doesn't respond correctly. The default
+behavior for `Crawler` is to use the list provided by the Python.org DNS
+records, as described in :PEP:`381` about the mirroring infrastructure.
+
+If you don't want to rely on these, you can specify the list of mirrors you
+want to try via the `mirrors` argument. It's a simple iterable::
+
+   >>> mirrors = ["http://first.mirror", "http://second.mirror"]
+   >>> client = Crawler(mirrors=mirrors)
+
+
+Searching in the simple index
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+It's possible to search for projects with specific names in the package index.
+Assuming you want to find all projects containing the "distutils" keyword::
+
+   >>> client.search_projects("distutils")
+   [<Project "collective.recipe.distutils">, <Project "Distutils">, <Project
+   "Packaging">, <Project "distutilscross">, <Project "lpdistutils">, <Project
+   "taras.recipe.distutils">, <Project "zerokspot.recipe.distutils">]
+
+
+You can also search for projects whose names start or end with a specific
+text, using a wildcard::
+
+   >>> client.search_projects("distutils*")
+   [<Project "Distutils">, <Project "Packaging">, <Project "distutilscross">]
+
+   >>> client.search_projects("*distutils")
+   [<Project "collective.recipe.distutils">, <Project "Distutils">, <Project
+   "lpdistutils">, <Project "taras.recipe.distutils">, <Project
+   "zerokspot.recipe.distutils">]
--- /dev/null
+:mod:`packaging.pypi.xmlrpc` --- Crawler using the PyPI XML-RPC interface
+=========================================================================
+
+.. module:: packaging.pypi.xmlrpc
+ :synopsis: Client using XML-RPC requests to fetch info and distributions.
+
+
+Indexes can be queried using XML-RPC calls, and Packaging provides a simple
+way to interface with XML-RPC.
+
+You should **use** XML-RPC when:
+
+* Searching the index for projects **on fields other than project
+  names**. For instance, you can search for projects based on the
+  author_email field.
+* Searching all the versions that have existed for a project.
+* Retrieving metadata information from releases or distributions.
+
+
+You should **avoid using** XML-RPC method calls when:
+
+* Retrieving the latest version of a project.
+* Getting the projects with a specific name and version.
+* The simple index can satisfy your needs.
+
+
+When dealing with indexes, keep in mind that index queries will always
+return :class:`packaging.pypi.dist.ReleaseInfo` and
+:class:`packaging.pypi.dist.ReleasesList` objects.
+
+Some methods here share a common API with the ones you can find in
+:mod:`packaging.pypi.simple`; internally, :class:`Client` inherits from
+:class:`packaging.pypi.base.BaseClient`, like the other clients.
+
+
+API
+---
+
+.. class:: Client
+
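+   A minimal instantiation sketch (the examples below assume the module has
+   been imported as ``xmlrpc``)::
+
+      >>> from packaging.pypi import xmlrpc
+      >>> client = xmlrpc.Client()
+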
+
+Usage examples
+--------------
+
+The use cases described here are the ones that are not common to the other
+clients. If you want to see all the methods, please refer to the API or to
+the usage examples described in :class:`packaging.pypi.client.Client`.
+
+
+Finding releases
+^^^^^^^^^^^^^^^^
+
+It's a common use case to search for "things" within the index. We can
+basically search for projects by their name, which is the most common need
+for users (e.g. "give me the latest version of the FooBar project").
+
+This can be accomplished using the following syntax::
+
+   >>> client = xmlrpc.Client()
+   >>> client.get_release("FooBar (<= 1.3)")
+   <FooBar 1.2.1>
+   >>> client.get_releases("FooBar (<= 1.3)")
+   [<FooBar 1.1>, <FooBar 1.1.1>, <FooBar 1.2>, <FooBar 1.2.1>]
+
+
+And we can also search on specific fields::
+
+ >>> client.search_projects(field=value)
+
+
+You can specify the operator to use; the default is "or" (a combined
+example follows the field list below)::
+
+ >>> client.search_projects(field=value, operator="and")
+
+
+The specific fields you can search are:
+
+* name
+* version
+* author
+* author_email
+* maintainer
+* maintainer_email
+* home_page
+* license
+* summary
+* description
+* keywords
+* platform
+* download_url
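+
+For instance, to find projects by a given author and under a given license
+at the same time (the field values here are hypothetical)::
+
+   >>> client.search_projects(author="John Doe", license="GPL",
+   ...                        operator="and")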
+
+
+Getting metadata information
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+XML-RPC is the preferred way to retrieve metadata information from indexes.
+It's really simple to do so::
+
+ >>> client = xmlrpc.Client()
+ >>> client.get_metadata("FooBar", "1.1")
+ <ReleaseInfo FooBar 1.1>
+
+
+Assuming we already have a :class:`packaging.pypi.dist.ReleaseInfo` object
+defined, it's possible to pass it to the xmlrpc client to retrieve and
+complete its metadata::
+
+ >>> foobar11 = ReleaseInfo("FooBar", "1.1")
+ >>> client = xmlrpc.Client()
+ >>> returned_release = client.get_metadata(release=foobar11)
+ >>> returned_release
+ <ReleaseInfo FooBar 1.1>
+
+
+Get all the releases of a project
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To retrieve all the releases of a project, you can use `get_releases`::
+
+   >>> client = xmlrpc.Client()
+   >>> client.get_releases("FooBar")
+   [<ReleaseInfo FooBar 0.9>, <ReleaseInfo FooBar 1.0>, <ReleaseInfo FooBar 1.1>]
+
+
+Get information about distributions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Indexes have information about projects, releases **and** distributions.
+If you're not familiar with those, please refer to the documentation of
+:mod:`packaging.pypi.dist`.
+
+It's possible to retrieve information about distributions, e.g. "what are the
+existing distributions for this release? How do I retrieve them?"::
+
+ >>> client = xmlrpc.Client()
+ >>> release = client.get_distributions("FooBar", "1.1")
+ >>> release.dists
+ {'sdist': <FooBar 1.1 sdist>, 'bdist': <FooBar 1.1 bdist>}
+
+As you can see, this does not return a list of distributions but a release
+object, because a release can be used like a list of distributions.
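+
+For example, building on the ``release.dists`` dictionary shown above, you
+can walk through the available distributions (a sketch reusing the objects
+from the previous example)::
+
+   >>> for dist_type, dist in release.dists.items():
+   ...     print(dist_type, dist)
+   sdist <FooBar 1.1 sdist>
+   bdist <FooBar 1.1 bdist>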