granicus.if.org Git - php/commitdiff
- Added my doc so that other people can work on it.
author Derick Rethans <derick@php.net>
Wed, 3 May 2006 11:35:00 +0000 (11:35 +0000)
committer Derick Rethans <derick@php.net>
Wed, 3 May 2006 11:35:00 +0000 (11:35 +0000)
#- Please keep it RST compatible.

ext/filter/docs/filter.txt [new file with mode: 0644]

diff --git a/ext/filter/docs/filter.txt b/ext/filter/docs/filter.txt
new file mode 100644
index 0000000..5331afa
--- /dev/null
+++ b/ext/filter/docs/filter.txt
@@ -0,0 +1,303 @@
+Input Filter Extension
+~~~~~~~~~~~~~~~~~~~~~~
+
+Introduction
+============
+Input variables should always be checked, but PHP does not offer good
+built-in functionality for doing this safely. The Input Filter extension
+addresses this by implementing a set of filters and mechanisms that users
+can use to safely access their input data.
+
+
+Change Log
+==========
+2005-10-27
+    * Updated filter_data prototype
+    * Added filter constants
+    * Fixed minor problems
+    * Changes by David Tulloh
+
+2005-10-05
+    * Changed "input_filter.paranoid_admin_default_filter" to
+      "filter.default".
+    * Updated API prototypes to reflect implementation.
+    * Added 'on' and 'off' to the boolean filter.
+    * Removed min_range and max_range flags from the float filter.
+    * Added validate_url, validate_email and validate_ip filters.
+    * Updated allowed flags for all filters.
+
+2005-08-15
+    * *source* is no longer a bitmask, as that did not make sense.
+    * Changed return value of filters which got invalid data from 'false' to
+      'null'.
+    * Failed filters do not throw an E_NOTICE any longer.
+    * Added a magic_quotes sanitizing filter.
+
+
+General Considerations
+======================
+* For logical filters, if the provided data does not match the filter's
+  expected input data mask, the filter function returns "null".
+* Character filters always return a string.
+* With the input filter extension enabled and the "filter.default" setting
+  (formerly input_filter.paranoid_admin_default_filter) set to something
+  other than 'raw', all entries in the affected super globals are passed
+  through the configured filter. The 'callback' filter cannot be used here,
+  as that requires a PHP script to be running already.
+* As the input filter acts on input data before the magic quotes function
+  mangles data, all access through the filter() function will not have any
+  quotes or slashes added - it will be the pure data as sent by the browser.
+* All flags mentioned here should be prefixed with ``FILTER_FLAG_`` when used
+  with PHP.
+
+  
+API
+===
+mixed *input_get* (int *source*, string *name* [, int *filter* [, mixed *filter_options* [, string *characterset* ]]]);
+    Returns the filtered variable *$name* from source *$source*. It uses the
+    filter as specified in *$filter* with a constant, and additional options
+    to the filter through *$filter_options*.
+
+bool *input_has_variable* (int *source*, string *name*);
+    Returns *true* if the variable with the name *name* exists in *source*, or
+    *false* otherwise.
+
+array *input_filters_list* ();
+    Returns a list with all supported filter names.
+
+mixed *filter_data* (mixed *variable*, int *filter* [, mixed *filter_options* [, string *characterset* ]]);
+    Filters the user supplied variable *$variable* in the same manner as
+    *input_get*.
+
+*$source*:
+
+* INPUT_POST     0
+* INPUT_GET      1
+* INPUT_COOKIE   2
+* INPUT_ENV      4
+* INPUT_SERVER   5
+* INPUT_SESSION  6
+
+
+Logical Filters
+===============
+
+These filters check whether the passed data is valid. They never mangle input
+variables, but of course they can keep the whole input variable from reaching
+the application by returning null.
+
+================ ========== =========== ==================================================
+Name             Constant   Return Type Description      
+================ ========== =========== ==================================================
+int              FL_INT     integer     Returns the input variable as an integer
+
+                                        $filter_options - an array with the optional
+                                        elements:
+
+                                        * min_range: Minimal number that is allowed
+                                          (inclusive)
+                                        * max_range: Maximum number that is allowed
+                                          (inclusive)
+                                        * flags: A bitmask supporting the following flags:
+                          
+                                          - ALLOW_OCTAL: allow octal numbers with the format
+                                            0nn as input too.
+                                          - ALLOW_HEX: allow hexadecimal numbers with the
+                                            format 0xnn or 0Xnn too.
+
+boolean          FL_BOOLEAN boolean     Returns *true* for '1', 'on' and 'true' and *false*
+                                        for '0', 'off' and 'false'
+
+float            FL_FLOAT   float       Returns the input variable as a floating point value
+
+validate_regexp  FL_REGEXP  string      Matches the input value as a string against the
+                                        regular expression. If there is a match then the
+                                        string is returned, otherwise the filter returns
+                                        *null*.
+                                        Remarks: Only available if pcre has been compiled
+                                        into PHP.
+
+validate_url     FL_URL     string      Validates a URL's format.
+
+                                        $filter_options - a bitmask that supports the
+                                        following flags:
+
+                                        * SCHEME_REQUIRED: The 'scheme' part of the URL
+                                          must be present in the passed URL.
+                                        * HOST_REQUIRED: The 'host' part of the URL
+                                          must be present in the passed URL.
+                                        * PATH_REQUIRED: The 'path' part of the URL
+                                          must be present in the passed URL.
+                                        * QUERY_REQUIRED: The 'query' part of the URL
+                                          must be present in the passed URL.
+
+validate_email   FL_EMAIL   string      Validates the passed string against a reasonably
+                                        good regular expression for validating an email
+                                        address.
+
+validate_ip      FL_IP      string      Validates a string representing an IP address.
+
+                                        $filter_options - a bitmask that supports the
+                                        following flags:
+
+                                        * IPV4: Allows IPv4 addresses.
+                                        * IPV6: Allows IPv6 addresses.
+                                        * NO_RES_RANGE: Disallows addresses in reserved
+                                          ranges (IPv4 only)
+                                        * NO_PRIV_RANGE: Disallows addresses in private
+                                          ranges (IPv4 only)
+================ ========== =========== ==================================================
+
+
+Sanitizing Filters
+==================
+
+These filters remove data, or change data depending on the filter, and the
+set rules for this specific filter. Instead of taking an *options* array, they
+use this parameter for flags for the specific filter.
+                          
+============= ================ =========== =====================================================
+Name          Constant         Return Type Description      
+============= ================ =========== =====================================================
+string        FS_STRING        string      Returns the input variable as a string after it has
+                                           been stripped of XML/HTML tags and other evil things
+                                           that can cause XSS problems.
+
+                                           $filter_options - a bitmask that supports the
+                                           following flags:
+                 
+                                           * NO_ENCODE_QUOTES: Prevents single and double
+                                             quotes from being encoded as numerical HTML
+                                             entities.
+                                           * STRIP_LOW: excludes all characters < 0x20 from the
+                                             allowed character list
+                                           * STRIP_HIGH: excludes all characters >= 0x80 from
+                                             the allowed character list
+                                           * ENCODE_LOW: allows characters < 0x20 but encodes
+                                             them as numerical HTML entities
+                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
+                                             them as numerical HTML entities
+                                           * ENCODE_AMP: encodes & as &amp;
+
+                                           The flags STRIP_LOW and ENCODE_LOW are mutually
+                                           exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
+                                           the case they clash, the characters will be
+                                           stripped.
+
+stripped      FS_STRIPPED      string      Alias for 'string'.
+
+encoded       FS_ENCODED       string      Encodes all characters outside the range
+                                           "a-zA-Z0-9-._" as URL encoded values.
+
+                                           $filter_options - a bitmask that supports the
+                                           following flags:
+
+                                           * STRIP_LOW: excludes all characters < 0x20 from the
+                                             allowed character list
+                                           * STRIP_HIGH: excludes all characters >= 0x80 from
+                                             the allowed character list
+                                           * ENCODE_LOW: allows characters < 0x20 but encodes
+                                             them as numerical HTML entities
+                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
+                                             them as numerical HTML entities
+
+special_chars FS_SPECIAL_CHARS string      Encodes the 'special' characters ' " < > &, \0 and
+                                           everything below 0x20 as numerical HTML entities.
+                 
+                                           $filter_options - a bitmask that supports the
+                                           following flags:
+                 
+                                           * STRIP_LOW: excludes all characters < 0x20 from the
+                                             allowed character list. If this is not set, then
+                                             those characters are encoded as numerical HTML
+                                             entities
+                                           * STRIP_HIGH: excludes all characters >= 0x80 from
+                                             the allowed character list
+                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
+                                             them as numerical HTML entities
+
+unsafe_raw    FS_UNSAFE_RAW    string      Returns the input variable as a string without
+                                           XML/HTML being stripped from the input value.
+                 
+                                           $filter_options - a bitmask that supports the
+                                           following flags:
+                 
+                                           * STRIP_LOW: excludes all characters < 0x20 from the
+                                             allowed character list
+                                           * STRIP_HIGH: excludes all characters >= 0x80 from
+                                             the allowed character list
+                                           * ENCODE_LOW: allows characters < 0x20 but encodes
+                                             them as numerical HTML entities
+                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
+                                             them as numerical HTML entities
+                                           * ENCODE_AMP: encodes & as &amp;
+                 
+                                           The flags STRIP_LOW and ENCODE_LOW are mutually
+                                           exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
+                                           the case they clash, the characters will be
+                                           stripped.
+
+email         FS_EMAIL         string      Removes all characters that cannot be part of a
+                                           correctly formed e-mail address, with the exception
+                                           of comments in the email address (a-z A-Z 0-9 " ! #
+                                           $ % & ' * + - / = ? ^ _ ` { | } ~ @ . [ ]). This
+                                           filter does `not` validate whether the e-mail
+                                           address has the correct format; use the
+                                           validate_email filter for that.
+                 
+url           FS_URL           string      Removes all characters that cannot be part of a
+                                           correctly formed URI (a-z A-Z 0-9 $ - _ . + ! * ' (
+                                           ) , { } | \ ^ ~ [ ] ` < > # % " ; / ? : @ & =). This
+                                           filter does `not` validate whether a URI has the
+                                           correct format; use the validate_url filter for
+                                           that.
+                 
+number_int    FS_NUMBER_INT    int         Removes all characters that are [^0-9+-].
+
+number_float  FS_NUMBER_FLOAT  float       Removes all characters that are [^0-9+-].
+
+                                           $filter_options - a bitmask that supports the
+                                           following flags:
+                 
+                                           * ALLOW_FRACTION: adds "." to the characters that
+                                             are not stripped.
+                                           * ALLOW_THOUSAND: adds "," to the characters that
+                                             are not stripped.
+                                           * ALLOW_SCIENTIFIC: adds "eE" to the characters that
+                                             are not stripped.
+
+magic_quotes  FS_MAGIC_QUOTES  string      BC filter for people who like magic quotes.
+============= ================ =========== =====================================================
+
+
+Callback Filter
+===============
+
+This filter calls the callback function specified in the *filter_options*
+parameter. All variants of callback functions are supported:
+
+* function with *'functionname'*
+* static method with *array('classname', 'methodname')*
+* dynamic method with *array(&$this, 'methodname')*
+                          
+============= =========== =========== =====================================================
+Name          Constant    Return Type Description      
+============= =========== =========== =====================================================
+callback      FC_CALLBACK mixed       Calls the callback function/method with the input
+                                      variable's value by reference, so that the callback
+                                      can filter and modify the input value. If the
+                                      callback function returns "false" then the input
+                                      value is considered incorrect and the returned value
+                                      will be 'false' (and an E_NOTICE will be raised).
+============= =========== =========== =====================================================
+
+The callback function's prototype is:
+
+boolean callback(&$value, $characterset);
+    With *$value* being a reference to the input variable and *$characterset*
+    containing the same value as this parameter's value in the call to
+    *input_get()* or *input_get_array()*. If the *$characterset* parameter was
+    not passed, it defaults to *'null'*.
+
+
+.. vim: et syn=rst tw=78
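
The rules that the Logical Filters table gives for the 'int' filter (optional min_range and max_range, the ALLOW_OCTAL and ALLOW_HEX flags, and "null" for data that does not match) can be modelled in plain PHP. The following is only a sketch of those rules and not the extension's implementation: the function name `sketch_filter_int` and the `SKETCH_*` constants, including their numeric values, are hypothetical and chosen for illustration.

```php
<?php
// Hypothetical flag values; the draft document does not assign numbers.
const SKETCH_ALLOW_OCTAL = 1;
const SKETCH_ALLOW_HEX   = 2;

/**
 * Models the documented behaviour of the 'int' logical filter:
 * returns an integer, or null when the input does not match.
 */
function sketch_filter_int(string $input, array $options = []): ?int
{
    $flags = $options['flags'] ?? 0;
    $value = null;

    if (preg_match('/^[+-]?(0|[1-9][0-9]*)$/', $input)) {
        // Plain decimal number without leading zeros.
        $value = (int) $input;
    } elseif (($flags & SKETCH_ALLOW_HEX) && preg_match('/^0[xX][0-9a-fA-F]+$/', $input)) {
        // Format 0xnn or 0Xnn, only when ALLOW_HEX is set.
        $value = (int) hexdec(substr($input, 2));
    } elseif (($flags & SKETCH_ALLOW_OCTAL) && preg_match('/^0[0-7]+$/', $input)) {
        // Format 0nn, only when ALLOW_OCTAL is set.
        $value = (int) octdec($input);
    }

    if ($value === null) {
        return null;
    }
    // Both range bounds are inclusive, as stated in the table.
    if (isset($options['min_range']) && $value < $options['min_range']) {
        return null;
    }
    if (isset($options['max_range']) && $value > $options['max_range']) {
        return null;
    }
    return $value;
}

var_dump(sketch_filter_int('42', ['min_range' => 1, 'max_range' => 100])); // int(42)
var_dump(sketch_filter_int('0x1A', ['flags' => SKETCH_ALLOW_HEX]));        // int(26)
var_dump(sketch_filter_int('0x1A'));                                       // NULL
var_dump(sketch_filter_int('101', ['max_range' => 100]));                  // NULL
```

The sketch never alters the input: it either returns the integer or rejects the value as a whole, which is the distinction the document draws between logical and sanitizing filters.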
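
For the sanitizing side, the 'number_float' entry describes a character whitelist that grows with each flag. A minimal sketch of that stripping step, assuming hypothetical names (`sketch_sanitize_number_float`, `SKETCH_ALLOW_*`) and made-up flag values, might look like this. It returns the stripped string and leaves any conversion to a float to the caller, which is a simplification made for this example.

```php
<?php
// Hypothetical flag values; the draft document does not assign numbers.
const SKETCH_ALLOW_FRACTION   = 1;
const SKETCH_ALLOW_THOUSAND   = 2;
const SKETCH_ALLOW_SCIENTIFIC = 4;

/**
 * Strips every character that is not in the allowed set. The base set is
 * 0-9, "+" and "-"; each flag adds the characters listed in the table.
 */
function sketch_sanitize_number_float(string $input, int $flags = 0): string
{
    $allowed = '0-9+\-';
    if ($flags & SKETCH_ALLOW_FRACTION) {
        $allowed .= '.';
    }
    if ($flags & SKETCH_ALLOW_THOUSAND) {
        $allowed .= ',';
    }
    if ($flags & SKETCH_ALLOW_SCIENTIFIC) {
        $allowed .= 'eE';
    }
    // Remove everything outside the allowed character class.
    return preg_replace('/[^' . $allowed . ']/', '', $input);
}

$all = SKETCH_ALLOW_FRACTION | SKETCH_ALLOW_THOUSAND | SKETCH_ALLOW_SCIENTIFIC;

var_dump(sketch_sanitize_number_float('1,234.5e3 USD'));                        // string(6) "123453"
var_dump(sketch_sanitize_number_float('1,234.5e3 USD', SKETCH_ALLOW_FRACTION)); // string(7) "1234.53"
var_dump(sketch_sanitize_number_float('1,234.5e3 USD', $all));                  // string(9) "1,234.5e3"
```

Unlike a logical filter, this step never rejects the input. It only removes characters, so the result may still need validation afterwards.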
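
The callback prototype takes the value by reference and signals rejection by returning false. As a rough illustration of that contract, the sketch below pairs a sample callback with a stand-in dispatcher. Both `sketch_trim_required` and `sketch_run_callback` are hypothetical names, the dispatcher only imitates what the extension is described as doing, and it omits the E_NOTICE mentioned in the table.

```php
<?php
/**
 * Sample callback following the documented prototype: it may modify
 * $value in place and returns false when the value is unacceptable.
 */
function sketch_trim_required(&$value, $characterset = null): bool
{
    $value = trim((string) $value);
    return $value !== '';
}

/**
 * Hypothetical stand-in for the extension: passes the value by reference
 * to the callback and returns false when the callback rejects it.
 */
function sketch_run_callback(callable $callback, $value, $characterset = null)
{
    $ok = $callback($value, $characterset);
    return $ok === false ? false : $value;
}

var_dump(sketch_run_callback('sketch_trim_required', '  abc ')); // string(3) "abc"
var_dump(sketch_run_callback('sketch_trim_required', '   '));    // bool(false)
```

Because the value travels by reference, one callback can both sanitize (here, trimming) and validate (here, rejecting empty strings) in a single pass.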