{2:overview Overview over the HTTP daemon} This library implements an HTTP 1.1 server. Because it is a library and not a stand-alone server like Apache, it can be used in very flexible ways. The disadvantage is that the user of the library must do more to get a running program than just configuring the daemon. The daemon has five modules: - {!Nethttpd_types} is just a module with common type definitions used by the other modules - {!Nethttpd_kernel} is the implementation of the HTTP protocol. If we talk about the "kernel" we mean this module. The kernel has two interface sides: There is the "socket side" and the "message side" that are connected by bidirectional data flow. The task of the kernel is to decode input received by the socket side and to deliver it to a consumer on the message side, and conversely to encode input coming in through the message side and to send it to the socket. The kernel is a quite low-level module; the socket is accessed in a multiplexing-compatible style, and the messages are sequences of tokens like "status line", "header", "body chunk" and so on. Normally a user of the daemon does not program the kernel directly. It is, however, possible to pass certain configuration options to the kernel even if an encapsulation is used. - {!Nethttpd_reactor} is an encapsulation of the kernel with a nicer interface. An instance of the reactor processes, like the kernel, only a single HTTP connection. It is used as follows: The user of the reactor pulls the arriving HTTP requests from the reactor, processes them, and writes the responses back to the reactor. This means that the requests are processed in a strictly sequential way. The reactor hides the details of the HTTP protocol. The reactor is able to perform socket input and output at the same time, i.e. when the response is sent to the client the next request(s) can already be read (pipelining). The reactor can be configured such that buffering of requests and responses is avoided, even if large messages are transferred. As mentioned, the reactor can only deal with one connection at the same time. In order to serve several connections, one must use multi-threading. - {!Nethttpd_engine} is another encapsulation of the kernel. It is event-based, and this makes it possible that several instances can work at the same time without using multi-threading. The user of the engine is called back when the beginning of the next HTTP request arrives and at certain other events. The user processes the event, and generates the response. The engine is a highly efficient implementation of an HTTP server, but there are also drawbacks, so user may feel more comfortable with the reactor. Especially, the engine needs large buffers for input and output (there is an idea to use helper threads to avoid these buffers, but this has not been implemented yet). Of course, the engine also supports pipelining. - {!Nethttpd_services} has functions to compose complex service functions from simpler ones. In particular, one can configure name- or IP-based virtual hosting, one can bind services to URLs, and one can define static file serving, directory listings, and dynamic services. It is quite easy to turn a Netcgi program into a dynamic service for Nethttpd. - {!Nethttpd_plex} provides nice integration into [netplex]. Most features provided by {!Nethttpd_services} can be activated by simply referencing them in the netplex configuration file. It is also important to mention what Nethttpd does not include: - There is no function to create the socket, and to accept connections. - There is no function to manage threads or subprocesses Ocamlnet provides this in the [netplex] library. {2 Suggested strategy} First, look at {!Nethttpd_services}. This module allows the user to define the services of the web server. For example, the following code defines a single host with an URL space: {[ let fs_spec = { file_docroot = "/data/docroot"; file_uri = "/"; file_suffix_types = [ "txt", "text/plain"; "html", "text/html" ]; file_default_type = "application/octet-stream"; file_options = [ `Enable_gzip; `Enable_listings simple_listing ] } let srv = host_distributor [ default_host ~pref_name:"localhost" ~pref_port:8765 (), uri_distributor [ "*", (options_service()); "/files", (file_service fs_spec); "/service", (dynamic_service { dyn_handler = process_request; dyn_activation = std_activation `Std_activation_buffered; dyn_uri = Some "/service"; dyn_translator = file_translator fs_spec; dyn_accept_all_conditionals = false }) ] ] ]} The [/files] path is bound to a static service, i.e. the files found in the directory [/data/docroot] can be accessed over the web. The record [fs_spec] configures the static service. The [/service] path is bound to a dynamic service, i.e. the requests are processed by the user-defined function [process_request]. This function is very similar to the request processors used in Netcgi. The symbolic [*] path is only bound for the [OPTIONS] method. This is recommended, because clients can use this method to find out the capabilities of the server. Second, select an encapsulation. As mentioned, the reactor is much simpler to use, but you must take a multi-threaded approach to serve multiple connections simultaneously. The engine is more efficient, but may use more memory (unless it is only used for static pages). Third, write the code to create the socket and to accept connections. For the reactor, you should do this in a multi-threaded way (but multi-processing is also possible). For the engine, you should do this in an event-based way. Now, just call {!Nethttpd_reactor.process_connection} or {!Nethttpd_engine.process_connection}, and pass the socket descriptor as argument. These functions do all the rest. The Ocamlnet source tarball includes examples for several approaches. Especially look at [file_reactor.ml], [file_mt_reactor.ml], and [file_engine.ml]. {2 Configuration} One of the remaining questions is: How to set all these configuration options. The user configures the daemon by passing a configuration object. This object has a number of methods that usually return constants, but there are also a few functions, e.g. {[ let config : http_reactor_config = object method config_timeout_next_request = 15.0 method config_timeout = 300.0 method config_reactor_synch = `Write method config_cgi = Netcgi_env.default_config method config_error_response n = "<html>Error " ^ string_of_int n ^ "</html>" method config_log_error _ _ _ _ msg = printf "Error log: %s\n" msg method config_max_reqline_length = 256 method config_max_header_length = 32768 method config_max_trailer_length = 32768 method config_limit_pipeline_length = 5 method config_limit_pipeline_size = 250000 end ]} Some of the options are interpreted by the encapsulation, and some by the kernel. The object approach has been taken, because it can be arranged that the layers of the daemon correspond to a hierarchy of class types. The options are documented in the modules where the class types are defined. Some of them are difficult to understand. In doubt, it is recommended to just copy the values found in the examples, because these are quite reasonable for typical usage scenarios. {2 Linking} In Ocamlnet 3 the nethttpd library can be both referenced as [-package nethttpd] or [-package nethttpd-for-netcgi2] (the latter being an alias). As the [netcgi1] library was dropped, there is no reason for [nethttpd-for-netcgi1] anymore - this name is now unavailable.