X-Git-Url: https://granicus.if.org/sourcecode?a=blobdiff_plain;f=ROADMAP;h=4c8078002e9725d7f11923879167a976e634a729;hb=58dbbce194bf3706c5dc024e2e0a93a59516de47;hp=a8787683f1bb1f684ec9d36782d518ea40fac15d;hpb=011fbb220a3fc668539da426507d053ef7d3f201;p=apache diff --git a/ROADMAP b/ROADMAP index a8787683f1..4c8078002e 100644 --- a/ROADMAP +++ b/ROADMAP @@ -1,8 +1,10 @@ -APACHE 2.1+ ROADMAP: +APACHE 2.x ROADMAP +================== +Last modified at [$Date$] -Last modified at [$Date: 2002/02/14 05:21:45 $] -DEFERRRED FOR APACHE 2.1 +WORKS IN PROGRESS +----------------- * Source code should follow style guidelines. OK, we all agree pretty code is good. Probably best to clean this @@ -16,23 +18,19 @@ DEFERRRED FOR APACHE 2.1 David says: The style guide needs to be reviewed before this can be done. - http://dev.apache.org/styleguide.html + http://httpd.apache.org/dev/styleguide.html The current file is dated April 20th 1998! - Also the file should be moved to the correct location for - future use. Q: should APR have it's own copy as well? - - * revamp the input filter syntax to provide for ordering of - filters created with the Set{Input|Output}Filter and the - Add{Input|Output}Filter directives. A 'relative to filterx' - syntax is definately preferable, but not realistic for 2.0. - - * Platforms that do not support fork (primarily Win32 and AS/400) - Architect start-up code that avoids initializing all the modules - in the parent process on platforms that do not support fork. - Better yet - not only inform the startup of which phase it's in, - but allow the parent 'process' to initialize shared memory, etc, - and create a module-by-module stream to pass to the child, so the - parent can actually arbitrate the important stuff. + + OtherBill offers: + It's survived since '98 because it's welldone :-) Suggest we + simply follow whatever is documented in styleguide.html as we + branch the next tree. Really sort of straightforward, if you + dislike a bit within that doc, bring it up on the dev@httpd + list prior to the next branch. + + So Bill sums up ... let's get the code cleaned up in CVS head. + Remember, it just takes cvs diff -b (that is, --ignore-space-change) + to see the code changes and ignore that cruft. Get editing Justin :) * Replace stat [deferred open] with open/fstat in directory_walk. Justin, Ian, OtherBill all interested in this. Implies setting up @@ -40,17 +38,192 @@ DEFERRRED FOR APACHE 2.1 that file, and allow the cleanup to close it [if it isn't a shared, cached file handle.] - * Refactor auth into auth protocols and auth database stores. - Many interested hackers, too destabilizing for 2.0 inclusion. - -DEFERRRED FOR APACHE 3.0 - * The Async Apache Server implemented in terms of APR. [Bill Stoddard's pet project.] Message-ID: <008301c17d42$9b446970$01000100@sashimi> (dev@apr) - * Add a string "class" that combines a char* with a length - and a reference count. This will help reduce the number - of strlen and strdup operations during request processing. - Including both the length and allocation will save us a ton - of reallocation we do today, in terms of string manipulation. + OtherBill notes that this can proceed in two parts... + + Async accept, setup, and tear-down of the request + e.g. dealing with the incoming request headers, prior to + dispatching the request to a thread for processing. + This doesn't need to wait for a 2.x/3.0 bump. + + Async delegation of the entire request processing chain + Too many handlers use stack storage and presume it is + available for the life of the request, so a complete + async implementation would need to happen 3.0 release. + + Brian notes that async writes will provide a bigger + scalability win than async reads for most servers. + We may want to try a hybrid sync-read/async-write MPM + as a next step. This should be relatively easy to + build: start with the current worker or leader/followers + model, but hand off each response brigade to a "completion + thread" that multiplexes writes on many connections, so + that the worker thread doesn't have to wait around for + the sendfile to complete. + + +MAKING APACHE REPOSITORY-AGNOSTIC +(or: remove knowledge of the filesystem) + +[ 2002/10/01: discussion in progress on items below; this isn't + planned yet ] + + * dav_resource concept for an HTTP resource ("ap_resource") + + * r->filename, r->canonical_filename, r->finfo need to + disappear. All users need to use new APIs on the ap_resource + object. + + (backwards compat: today, when this occurs with mod_dav and a + custom backend, the above items refer to the topmost directory + mapped by a location; e.g. docroot) + + Need to preserve a 'filename'-like string for mime-by-name + sorts of operations. But this only needs to be the name itself + and not a full path. + + Justin: Can we leverage the path info, or do we not trust the + user? + + gstein: well, it isn't the "path info", but the actual URI of + the resource. And of course we trust the user... that is + the resource they requested. + + dav_resource->uri is the field you want. path_info might + still exist, but that portion might be related to the + CGI concept of "path translated" or some other further + resolution. + + To continue, I would suggest that "path translated" and + having *any* path info is Badness. It means that you did + not fully resolve a resource for the given URI. The + "abs_path" in a URI identifies a resource, and that + should get fully resolved. None of this "resolve to + and then we have a magical second resolution + (inside the CGI script)" or somesuch. + + Justin: Well, let's consider mod_mbox for a second. It is sort of + a virtual filesystem in its own right - as it introduces + it's own notion of a URI space, but it is intrinsically + tied to the filesystem to do the lookups. But, for the + portion that isn't resolved on the file system, it has + its own addressing scheme. Do we need the ability to + layer resolution? + + * The translate_name hook goes away + + Wrowe altogether disagrees. translate_name today even operates + on URIs ... this mechansim needs to be preserved. + + * The doc for map_to_storage is totally opaque to me. It has + something to do with filesystems, but it also talks about + security and per_dir_config and other stuff. I presume something + needs to happen there -- at least better doc. + + Wrowe agrees and will write it up. + + * The directory_walk concept disappears. All configuration is + tagged to Locations. The "mod_filesystem" module might have some + internal concept of the same config appearing in multiple + places, but that is handled internally rather than by Apache + core. + + Wrowe suggests this is wrong, instead it's private to filesystem + requests, and is already invoked from map_to_storage, not the core + handler. and blocks are preserved as-is, + but sections become specific to the filesystem handler + alone. Because alternate filesystem schemes could be loaded, this + should be exposed, from the core, for other file-based stores to + share. Consider an archive store where the layers become + -> -> + + Justin: How do we map Directory entries to Locations? + + * The "Location tree" is an in-memory representation of the URL + namespace. Nodes of the tree have configuration specific to that + location in the namespace. + + Something like: + + typedef struct { + const char *name; /* name of this node relative to parent */ + + struct ap_conf_vector_t *locn_config; + + apr_hash_t *children; /* NULL if no child configs */ + } ap_locn_node; + + The following config: + + + SetHandler server-status + Order deny,allow + Deny from all + Allow from 127.0.0.1 + + + Creates a node with name=="server_status", and the node is a + child of the "/" node. (hmm. node->name is redundant with the + hash key; maybe drop node->name) + + In the config vector, mod_access has stored its Order, Deny, and + Allow configs. mod_core has stored the SetHandler. + + During the Location walk, we merge the config vectors normally. + + Note that an Alias simply associates a filesystem path (in + mod_filesystem) with that Location in the tree. Merging + continues with child locations, but a merge is never done + through filesystem locations. Config on a specific subdir needs + to be mapped back into the corresponding point in the Location + tree for proper merging. + + * Config is parsed into a tree, as we did for the 2.0 timeframe, + but that tree is just a representation of the config (for + multiple runs and for in-memory manipulation and usage). It is + unrelated to the "Location tree". + + * Calls to apr_file_io functions generally need to be replaced + with operations against the ap_resource. For example, rather + than calling apr_dir_open/read/close(), a caller uses + resource->repos->get_children() or somesuch. + + Note that things like mod_dir, mod_autoindex, and mod_negotiation + need to be converted to use these mechanisms so that their + functions will work on logical repositories rather than just + filesystems. + + * How do we handle CGI scripts? Especially when the resource may + not be backed by a file? Ideally, we should be able to come up + with some mechanism to allow CGIs to work in a + repository-independent manner. + + - Writing the virtual data as a file and then executing it? + - Can a shell be executed in a streamy manner? (Portably?) + - Have an 'execute_resource' hook/func that allows the + repository to choose its manner - be it exec() or whatever. + - Won't this approach lead to duplication of code? Helper fns? + + gstein: PHP, Perl, and Python scripts are nominally executed by + a filter inserted by mod_php/perl/python. I'd suggest + that shell/batch scripts are similar. + + But to ask further: what if it is an executable + *program* rather than just a script? Do we yank that out + of the repository, drop it onto the filesystem, and run + it? eeewwwww... + + I'll vote -0.9 for CGIs as a filter. Keep 'em handlers. + + Justin: So, do we give up executing CGIs from virtual repositories? + That seems like a sad tradeoff to make. I'd like to have + my CGI scripts under DAV (SVN) control. + + * How do we handle overlaying of Location and Directory entries? + Right now, we have a problem when /cgi-bin/ is ScriptAlias'd and + mod_dav has control over /. Some people believe that /cgi-bin/ + shouldn't be under DAV control, while others do believe it + should be. What's the right strategy?