<directivesynopsis>
<name>HttpProtocolOptions</name>
<description>Modify restrictions on HTTP Request Messages</description>
-<syntax>HttpProtocolOptions [Strict|Unsafe] [Allow0.9|Require1.0]
-[StrictWhitespace|LenientWhitespace] [RegisteredMethods|LenientMethods]</syntax>
-<default>HttpProtocolOptions Strict Allow0.9 LenientWhitespace
-LenientMethods</default>
+<syntax>HttpProtocolOptions [Strict|Unsafe] [StrictURL|UnsafeURL]
+ [StrictWhitespace|LenientWhitespace] [RegisteredMethods|LenientMethods]
+ [Allow0.9|Require1.0]</syntax>
+<default>HttpProtocolOptions Strict StrictURL LenientWhitespace
+LenientMethods Allow0.9</default>
<contextlist><context>server config</context>
<context>virtual host</context></contextlist>
<compatibility>2.2.32 or 2.4.24 and later</compatibility>
(<a href="https://tools.ietf.org/html/rfc7230#section-3.2"
>RFC 7230 §3.2</a>), which are now applied by default or using
the <code>Strict</code> option. Due to legacy modules, applications or
- custom user-agents which must be deperecated, an <code>Unsafe</code>
- option has been added to revert to the legacy behavior. These rules are
- applied prior to request processing, so must be configured at the global
- or default (first) matching virtual host section, by interface and not
- by name, to be honored.</p>
+ custom user-agents which must be deperecated, <code>Unsafe</code>
+ and <code>UnsafeURL</code> options have been added to revert to the legacy
+ behaviors. These rules are applied prior to request processing, so must be
+ configured at the global or default (first) matching virtual host section,
+ by IP/port interface and not by name, to be honored.</p>
<p>Prior to the introduction of this directive, the Apache HTTP Server
request message parsers were tolerant of a number of forms of input
of this directive, all grammer rules of the specification are enforced in
the default <code>Strict</code> operating mode.</p>
+ <p><a href="https://tools.ietf.org/html/rfc3986#section-2.2"
+ >RFC 7230 §2.2 and 2.3</a> define "Reserved Characters" and
+ "Unreserved Characters". All other character octets are required to
+ be %XX encoded under this spec, and RFC7230 defers to these requirements.
+ By default the <code>StrictURI</code> option will reject all requests
+ containing invalid characters. This rule can be relaxed with the
+ <code>UnsafeURI</code> option to support badly written user-agents.</p>
+
<p>Users are strongly cautioned against toggling the <code>Unsafe</code>
- mode of operation for these reasons, most especially on outward-facing,
- publicly accessible server deployments. Reviewing the messages within the
- <directive>ErrorLog</directive>, configured with
- <directive>LogLevel</directive> <code>info</code> level or below,
+ and <code>UnsafeURI</code> modes of operation, most especially on
+ outward-facing, publicly accessible server deployments. If an interface
+ is required of faulty monitoring or other custom software running only
+ on an intranet, users should consider toggling these only for a specific
+ virtual host configured on their private subnet.</p>
+
+ <p>Reviewing the messages logged to the <directive>ErrorLog</directive>,
+ configured with <directive>LogLevel</directive> <code>info</code> level,
can help identify such faulty requests along with their origin.
Users should pay particular attention to any 400 responses in the access
log for indiciations that valid requests are unexpectedly rejected.</p>
- <p><a href="https://tools.ietf.org/html/rfc2616#section-19.6"
- >RFC 2616 §19.6</a> "Compatibility With Previous Versions" had
- encouraged HTTP servers to support legacy HTTP/0.9 requests. RFC 7230
- superceeds this with "The expectation to support HTTP/0.9 requests has
- been removed" and offers additional comments in
- <a href="https://tools.ietf.org/html/rfc7230#appendix-A"
- >RFC 2616 Appendix A</a>. The <code>Require1.0</code> option allows
- the user to remove support of the default <code>Allow0.9</code> option's
- behavior.</p>
-
<p><a href="https://tools.ietf.org/html/rfc7230#section-3.5"
>RFC 7230 §3.5</a> "Message Parsing Robustness" permits, and
identifies potential risks of parsing messages containing non-space
origin servers shall respond with an error when an unsupported method
is encountered in the request line. This already happens when the
<code>LenientMethods</code> option is used, but administrators may wish
- to toggle the <code>RegisteredMethods</code> option and register all
- permitted method tokens using the <directive>RegisterHttpMethod</directive>
+ to toggle the <code>RegisteredMethods</code> option and register any
+ non-standard methods using the <directive>RegisterHttpMethod</directive>
directive, particularly if the <code>Unsafe</code> option has been toggled.
The <code>RegisteredMethods</code> option should <strong>not</strong>
be toggled for forward proxy hosts, as the methods supported by the
origin servers are unknown to the proxy server.</p>
+
+ <p><a href="https://tools.ietf.org/html/rfc2616#section-19.6"
+ >RFC 2616 §19.6</a> "Compatibility With Previous Versions" had
+ encouraged HTTP servers to support legacy HTTP/0.9 requests. RFC 7230
+ superceeds this with "The expectation to support HTTP/0.9 requests has
+ been removed" and offers additional comments in
+ <a href="https://tools.ietf.org/html/rfc7230#appendix-A"
+ >RFC 2616 Appendix A</a>. The <code>Require1.0</code> option allows
+ the user to remove support of the default <code>Allow0.9</code> option's
+ behavior.</p>
</usage>
</directivesynopsis>
#define AP_HTTP_METHODS_REGISTERED 2
char http_methods;
+#define AP_HTTP_URI_UNSET 0
+#define AP_HTTP_URI_UNSAFE 1
+#define AP_HTTP_URI_STRICT 2
+ char http_stricturi;
+
#define AP_HTTP_CL_HEAD_ZERO_UNSET 0
#define AP_HTTP_CL_HEAD_ZERO_ENABLE 1
#define AP_HTTP_CL_HEAD_ZERO_DISABLE 2
*/
AP_DECLARE(const char *) ap_scan_http_token(const char *ptr);
+/* Scan a string for valid URI characters per RFC3986, and
+ * return a pointer to the first non-URI character encountered.
+ * @param ptr The string to scan
+ * @return A pointer to the first non-token character.
+ */
+AP_DECLARE(const char *) ap_scan_http_uri_safe(const char *ptr);
+
/* Retrieve a token, advancing the pointer to the first non-token character
* and returning a copy of the token string.
* @param ptr The string to scan. On return, this points to the first non-token
if (virt->http_conformance != AP_HTTP_CONFORMANCE_UNSET)
conf->http_conformance = virt->http_conformance;
+ if (virt->http_stricturi != AP_HTTP_URI_UNSET)
+ conf->http_stricturi = virt->http_stricturi;
+
if (virt->http_whitespace != AP_HTTP_WHITESPACE_UNSET)
conf->http_whitespace = virt->http_whitespace;
conf->http_conformance |= AP_HTTP_CONFORMANCE_STRICT;
else if (strcasecmp(arg, "unsafe") == 0)
conf->http_conformance |= AP_HTTP_CONFORMANCE_UNSAFE;
+ else if (strcasecmp(arg, "stricturi") == 0)
+ conf->http_stricturi |= AP_HTTP_URI_STRICT;
+ else if (strcasecmp(arg, "unsafeuri") == 0)
+ conf->http_stricturi |= AP_HTTP_URI_UNSAFE;
else if (strcasecmp(arg, "strictwhitespace") == 0)
conf->http_whitespace |= AP_HTTP_WHITESPACE_STRICT;
else if (strcasecmp(arg, "lenientwhitespace") == 0)
else if (strcasecmp(arg, "lenientmethods") == 0)
conf->http_methods |= AP_HTTP_METHODS_LENIENT;
else
- return "HttpProtocolOptions accepts 'Allow0.9' (default) or "
- "'Require1.0', 'Unsafe' or 'Strict' (default), "
+ return "HttpProtocolOptions accepts "
+ "'Unsafe' or 'Strict' (default), "
+ "'UnsafeURI' or 'StrictURI' (default), "
+ "'Require1.0' or 'Allow0.9' (default), "
"'StrictWhitespace' or 'LenientWhitespace (default), and "
"'RegisteredMethods' or 'LenientMethods (default)";
return "HttpProtocolOptions 'Allow0.9' and 'Require1.0'"
" are mutually exclusive";
+ if ((conf->http_stricturi & AP_HTTP_URI_STRICT)
+ && (conf->http_stricturi & AP_HTTP_URI_UNSAFE))
+ return "HttpProtocolOptions 'StrictURI' and 'UnsafeURI'"
+ " are mutually exclusive";
+
if ((conf->http_conformance & AP_HTTP_CONFORMANCE_STRICT)
&& (conf->http_conformance & AP_HTTP_CONFORMANCE_UNSAFE))
return "HttpProtocolOptions 'Strict' and 'Unsafe'"
#define T_ESCAPE_FORENSIC (0x20)
#define T_ESCAPE_URLENCODED (0x40)
#define T_HTTP_CTRLS (0x80)
+#define T_URI_RFC3986 (0x100)
int main(int argc, char *argv[])
{
unsigned c;
- unsigned char flags;
+ unsigned short flags;
printf("/* this file is automatically generated by gen_test_char, "
"do not edit */\n"
"#define T_ESCAPE_FORENSIC (%u)\n"
"#define T_ESCAPE_URLENCODED (%u)\n"
"#define T_HTTP_CTRLS (%u)\n"
+ "#define T_URI_RFC3986 (%u)\n"
"\n"
- "static const unsigned char test_char_table[256] = {",
+ "static const unsigned short test_char_table[256] = {",
T_ESCAPE_SHELL_CMD,
T_ESCAPE_PATH_SEGMENT,
T_OS_ESCAPE_PATH,
T_ESCAPE_LOGITEM,
T_ESCAPE_FORENSIC,
T_ESCAPE_URLENCODED,
- T_HTTP_CTRLS);
+ T_HTTP_CTRLS,
+ T_URI_RFC3986);
for (c = 0; c < 256; ++c) {
flags = 0;
* and "tspecials" (RFC2068) a.k.a. "separators" (RFC2616), which
* is easer to express as characters remaining in the ASCII token set
*/
- if (!(apr_isalnum(c) || strchr("!#$%&'*+-.^_`|~", c))) {
+ if (!c || !(apr_isalnum(c) || strchr("!#$%&'*+-.^_`|~", c))) {
flags |= T_HTTP_TOKEN_STOP;
}
flags |= T_HTTP_CTRLS;
}
+ /* From RFC3986, the specific sets of gen-delims, sub-delims (2.2),
+ * and unreserved (2.3) that are possible somewhere within a URI.
+ * Spec requires all others to be %XX encoded, including obs-text.
+ */
+ if (c && strchr(":/?#[]@" /* gen-delims */
+ "!$&'()*+,;=" /* sub-delims */
+ "-._~", c) || apr_isalnum(c)) { /* unreserved */
+ flags |= T_URI_RFC3986;
+ }
+
/* For logging, escape all control characters,
* double quotes (because they delimit the request in the log file)
* backslashes (because we use backslash for escaping)
flags |= T_ESCAPE_FORENSIC;
}
- printf("0x%02x%c", flags, (c < 255) ? ',' : ' ');
+ printf("0x%03x%c", flags, (c < 255) ? ',' : ' ');
}
printf("\n};\n");
{
enum {
rrl_none, rrl_badmethod, rrl_badwhitespace, rrl_excesswhitespace,
- rrl_missinguri, rrl_badprotocol, rrl_trailingtext
+ rrl_missinguri, rrl_baduri, rrl_badprotocol, rrl_trailingtext
} deferred_error = rrl_none;
char *ll;
char *uri;
int strict = (conf->http_conformance != AP_HTTP_CONFORMANCE_UNSAFE);
const char *badwhitespace = strict ? "\t\n\v\f\r" : "\n\v\f\r";
int strictspaces = (conf->http_whitespace == AP_HTTP_WHITESPACE_STRICT);
+ int stricturi = (conf->http_stricturi != AP_HTTP_URI_UNSAFE);
/* Read past empty lines until we get a real request line,
* a read error, the connection closes (EOF), or we timeout.
if (!*uri && deferred_error == rrl_none)
deferred_error = rrl_missinguri;
- if (!(ll = strpbrk(uri, " \t\n\v\f\r"))) {
- r->protocol = "";
- len = 0;
- goto rrl_done;
+ if (stricturi) {
+ ll = (char*) ap_scan_http_uri_safe(uri);
+ if (ll == uri || (*ll && !apr_isspace(*ll))) {
+ deferred_error = rrl_baduri;
+ ll = strpbrk(ll, "\t\n\v\f\r ");
+ }
+ }
+ else {
+ ll = strpbrk(uri, "\t\n\v\f\r ");
+ }
+ if (!ll) {
+ r->protocol = "";
+ len = 0;
+ goto rrl_done;
}
for (r->protocol = ll; apr_isspace(*r->protocol); ++r->protocol)
if (ap_strchr_c(badwhitespace, *r->protocol) && deferred_error == rrl_none)
else if (deferred_error == rrl_missinguri)
ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(03446)
"HTTP Request Line; Missing URI");
+ else if (deferred_error == rrl_baduri)
+ ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(03454)
+ "HTTP Request Line; URI incorrectly encoded: '%.*s'",
+ field_name_len(r->method), r->method);
else if (deferred_error == rrl_badwhitespace)
ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(03447)
"HTTP Request Line; Invalid whitespace");
}
if (strict) {
- if (ap_has_cntrl(r->the_request)) {
+ /* No sense re-testing here for what was evaulated above */
+ if (!stricturi && ap_has_cntrl(r->the_request)) {
ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(02420)
"HTTP Request Line; URI must not contain control"
" characters");
return tok;
}
+/* Scan a string for valid URI characters per RFC3986, and
+ * return a pointer to the first non-URI character encountered.
+ */
+AP_DECLARE(const char *) ap_scan_http_uri_safe(const char *ptr)
+{
+ for ( ; TEST_CHAR(*ptr, T_URI_RFC3986); ++ptr) ;
+
+ return ptr;
+}
+
/* Retrieve a token, spacing over it and returning a pointer to
* the first non-white byte afterwards. Note that these tokens
* are delimited by semis and commas; and can also be delimited