From: Regina Obe
Date: Fri, 12 Jul 2013 12:35:58 +0000 (+0000)
Subject: flesh out pagc_normalize_address and point out issue with batch and workaround for...
X-Git-Tag: 2.2.0rc1~1447
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=db5eb557f57a93ca9ed06e55423f74c8c86283e6;p=postgis

flesh out pagc_normalize_address and point out issue with batch and workaround for issue.

git-svn-id: http://svn.osgeo.org/postgis/trunk@11674 b70326c6-7e19-0410-871a-916f4a2858ee
---

diff --git a/doc/extras_tigergeocoder.xml b/doc/extras_tigergeocoder.xml
index 5a63924f6..8e6450a2e 100644
--- a/doc/extras_tigergeocoder.xml
+++ b/doc/extras_tigergeocoder.xml
@@ -989,13 +989,13 @@ CREATE INDEX idx_tiger_data_ma_faces_countyfp ON tiger_data.ma_faces USING btree
 Pagc_Normalize_Address
 Given a textual street address, returns a composite norm_addy type that has road suffix, prefix and type standardized, street, streetname etc. broken into separate fields. This function
- will work with just the lookup data packaged with the tiger_geocoder (no need for tiger census data).
+ will work with just the lookup data packaged with the tiger_geocoder (no need for tiger census data). Requires address_standardizer extension.
- norm_addy normalize_address
+ norm_addy pagc_normalize_address
 varchar in_address
@@ -1006,13 +1006,16 @@ CREATE INDEX idx_tiger_data_ma_faces_countyfp ON tiger_data.ma_faces USING btree
 Given a textual street address, returns a composite norm_addy type that has road suffix, prefix and type standardized, street, streetname etc. broken into separate fields. This is the first step in the geocoding process to get all addresses into normalized postal form. No other data is required aside from what is packaged with the geocoder.
- This function just uses the various direction/state/suffix lookup tables preloaded with the tiger_geocoder and located in the tiger schema, so it doesn't need you to download tiger census data or any other additional data to make use of it.
+ This function just uses the various pagc_* lookup tables preloaded with the tiger_geocoder and located in the tiger schema, so it doesn't need you to download tiger census data or any other additional data to make use of it.
 You may find the need to add more abbreviations or alternative namings to the various lookup tables in the tiger schema.
 It uses various control lookup tables located in tiger schema to normalize the input address.
 Fields in the norm_addy type object returned by this function in this order where () indicates a field required by the geocoder, [] indicates an optional field:
- This version uses the PAGC address standardizer
+ This version uses the PAGC address standardizer C extension which you can download. There are slight variations in casing and formatting and also provides a richer breakout.
 Availability: 2.1.0
 (address) [predirAbbrev] (streetName) [streetTypeAbbrev] [postdirAbbrev] [internal] [location] [stateAbbrev] [zip]
+ The native standardaddr of address_standardizer extension is at this time a bit richer than norm_addy since its designed to support international addresses (including country). standardaddr equivalent fields are:
+ house_num,predir, name, suftype, sufdir, unit, city, state, postcode
+
 address is an integer: The street number
@@ -1048,7 +1051,40 @@ CREATE INDEX idx_tiger_data_ma_faces_countyfp ON tiger_data.ma_faces USING btree
+
+ Examples
+ Single call example
+
+SELECT addy.*
+FROM pagc_normalize_address('9000 E ROO ST STE 999, Springfield, CO') AS addy;
+
+ address | predirabbrev | streetname | streettypeabbrev | postdirabbrev | internal  |  location   | stateabbrev | zip | parsed
+ --------+--------------+------------+------------------+---------------+-----------+-------------+-------------+-----+--------
+    9000 | E            | ROO        | St               |               | SUITE 999 | SPRINGFIELD | CO          |     | t
+
+ Batch call. There are currently speed issues with the way postgis_tiger_geocoder wraps the address_standardizer. These will hopefully
+be resolved in later editions. To work around them, if you need speed for batch geocoding to call generate a normaddy in batch mode, you are encouraged
+to directly call the address_standardizer standardize_address function as shown below which is similar exercise to what we did in .
+
+ WITH g AS (SELECT address, ROW((sa).house_num, (sa).predir, (sa).name
+ , (sa).suftype, (sa).sufdir, (sa).unit , (sa).city, (sa).state, (sa).postcode, true)::norm_addy As na
+ FROM (SELECT address, standardize_address('tiger.pagc_lex'
+ , 'tiger.pagc_gaz'
+ , 'tiger.pagc_rules', address) As sa
+ FROM addresses_to_geocode) As g)
+SELECT address As orig, (g.na).streetname, (g.na).streettypeabbrev
+ FROM g;
+
+                        orig                         |  streetname   | streettypeabbrev
+-----------------------------------------------------+---------------+------------------
+ 529 Main Street, Boston MA, 02129                   | MAIN          | St
+ 77 Massachusetts Avenue, Cambridge, MA 02139        | MASSACHUSETTS | Ave
+ 25 Wizard of Oz, Walaford, KS 99912323              | WIZARD OF     |
+ 26 Capen Street, Medford, MA                        | CAPEN         | St
+ 124 Mount Auburn St, Cambridge, Massachusetts 02138 | MOUNT AUBURN  | St
+ 950 Main Street, Worcester, MA 01610                | MAIN          | St
+
 See Also