From: John Coggeshall
Date: Mon, 12 Jan 2004 10:02:04 +0000 (+0000)
Subject: This was way out of date.
X-Git-Tag: php_ibase_before_split~265
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=9116f2750467d97e110b7f536855f30641420343;p=php

This was way out of date.
---

diff --git a/ext/tidy/README b/ext/tidy/README
index 2d4e015176..19f6b9ff6a 100644
--- a/ext/tidy/README
+++ b/ext/tidy/README
@@ -1,7 +1,6 @@
 README FOR ext/tidy by John Coggeshall
 
-Tidy Version: 0.7b
 
 Tidy is an extension based on Libtidy (http://tidy.sf.net/) and allows a PHP developer
 to clean, repair, and traverse HTML, XHTML, and XML documents -- including ones with
@@ -10,113 +9,8 @@ embedded scripting languages such as PHP or ASP within them using OO constructs.
 ---------------------------------------------------------------------------------------
 !! Important Note !!
 ---------------------------------------------------------------------------------------
-At this time libtidy has a small memory leak inside the ParseConfigFileEnc() function
+Older versions of libtidy have a small memory leak inside the ParseConfigFileEnc() function
 used to load configuration from a file. If you intend to use this functionality apply
 the "libtidy.txt" patch (cd tidy/src/; patch -p0 < libtidy.txt) to libtidy sources and
 then recompile libtidy.
 ---------------------------------------------------------------------------------------
-
-The Tidy extension has two separate APIs, one for general parsing, cleaning, and
-repairing and another for document traversal. The general API is provided below:
-
-    tidy_create()               Reinitialize the tidy engine
-    tidy_parse_file($file)      Parse the document stored in $file
-    tidy_parse_string($str)     Parse the string stored in $str
-
-    tidy_clean_repair()         Clean and repair the document
-    tidy_diagnose()             Diagnose a parsed document
-
-    tidy_setopt($opt, $val)     Set a configuration option $opt to $val
-    tidy_getopt($opt)           Retrieve a configuration option
-
-    ** note: $opt is a string representing the option. Although no formal
-    documentation yet exists for PHP, you can find a description of many
-    of them at http://www.w3.org/People/Raggett/tidy/ and a list of supported
-    options in the phpinfo(); output**
-
-    tidy_get_output()           Return the cleaned tidy HTML as a string
-    tidy_get_error_buffer()     Return a log of the errors and warnings
-                                returned by tidy
-
-    tidy_get_release()          Return the Libtidy release date
-    tidy_get_status()           Return the status of the document
-    tidy_get_html_ver()         Return the major HTML version detected for
-                                the document;
-
-    tidy_is_xhtml()             Determines if the document is XHTML
-    tidy_is_xml()               Determines if the document is a generic XML
-
-    tidy_error_count()          Returns the number of errors in the document
-    tidy_warning_count()        Returns the number of warnings in the document
-    tidy_access_count()         Returns the number of accessibility-related
-                                warnings in the document.
-    tidy_config_count()         Returns the number of configuration errors found
-
-    tidy_load_config($file)     Loads the specified configuration file
-    tidY_load_config_enc($file,
-                         $enc)  Loads the specified config file using the specified
-                                character encoding
-    tidy_set_encoding($enc)     Sets the current character encoding for the document
-    tidy_save_config($file)     Saves the current config to $file
-
-
-Beyond these general-purpose API functions, Tidy also supports the following
-functions which are used to retrieve an object for document traversal:
-
-    tidy_get_root()             Returns an object starting at the root of the
-                                document
-    tidy_get_head()             Returns an object starting at the <HEAD> tag
-    tidy_get_html()             Returns an object starting at the <HTML> tag
-    tidy_get_body()             Returns an object starting at the <BODY> tag
-
-All Navigation of the specified document is done via the PHP5 object constructs.
-There are two types of objects which Tidy can create. The first is TidyNode, which
-represents HTML Tags, Text, and more (see the TidyNode_Type Constants). The second
-is TidyAttr, which represents an attribute within an HTML tag (TidyNode). The
-functionality of these objects is represented by the following schema:
-
-class TidyNode {
-
-    public $name;                   // name of node (i.e. HEAD)
-    public $value;                  // value of node (everything between tags)
-    public $type;                   // type of node (text, php, asp, etc.)
-    public $id;                     // id of node (i.e. TIDY_TAG_HEAD)
-
-    public function attributes();   // an array of attributes (see TidyAttr)
-    public function children();     // an array of child nodes
-
-    function has_siblings();        // any sibling nodes?
-    function has_children();        // any child nodes?
-
-    function is_comment();          // is node a comment?
-    function is_xhtml();            // is document XHTML?
-    function is_xml();              // is document generic XML (not HTML/XHTML)
-    function is_text();             // is node text?
-    function is_html();             // is node an HTML tag?
-
-    function is_jste();             // is jste block?
-    function is_asp();              // is Microsoft ASP block?
-    function is_php();              // is PHP block?
-
-    function next();                // returns next node
-    function prev();                // returns prev node
-
-    /* Searches for a particular attribute in the current node based
-       on node ID. If found returns a TidyAttr object for it */
-    function get_attr($attr_id);
-
-    /*
-}
-
-class TidyAttr {
-
-    public $name;                   // attribute name i.e. HREF
-    public $value;                  // attribute value
-    public $id;                     // attribute id i.e. TIDY_ATTR_HREF
-
-}
-
-Examples of using these objects to navigate the tree can be found in the examples/
-directory (I suggest looking at urlgrab.php and dumpit.php)
-
-E-mail thoughts, suggestions, patches, etc. to