Plasma GitLab Archive
Projects Blog Knowledge

Module Netcgi


module Netcgi: sig .. end
Common data-structures for CGI-like connectors.

This library tries to minimize the use of unsafe practices. It cannot be bullet proof however and you should read about security.

REMARK: It happens frequently that hard to predict random numbers are needed in Web applications. The previous version of this library used to include some facilities for that (in the Netcgi_jserv module). They have been dropped in favor of Cryptokit.



Arguments


class type cgi_argument = object .. end
Represent a key-value pair of data passed to the script (including file uploads).
module Argument: sig .. end
Operations on arguments and lists of thereof.
class type rw_cgi_argument = object .. end
Old deprecated writable argument type.
class simple_argument : ?ro:bool -> string -> string -> rw_cgi_argument
Old deprecated simple argument class.
class mime_argument : ?work_around_backslash_bug:bool -> string -> Netmime.mime_message -> rw_cgi_argument
Old deprecated MIME argument class.

Cookies


module Cookie: sig .. end
Functions to manipulate cookies.

The environment of a request


type http_method = [ `DELETE | `GET | `HEAD | `POST | `PUT ] 
type config = Netcgi_common.config = {
   tmp_directory : string; (*The directory where to create temporary files. This should be an absolute path name.*)
   tmp_prefix : string; (*The name prefix for temporary files. This must be a non-empty string. It must not contain '/'.*)
   permitted_http_methods : http_method list; (*The list of accepted HTTP methods*)
   permitted_input_content_types : string list; (*The list of accepted content types in requests. Content type parameters (like "charset") are ignored. If the list is empty, all content types are allowed.*)
   input_content_length_limit : int; (*The maximum size of the request, in bytes.*)
   max_arguments : int; (*The maximum number of CGI arguments*)
   workarounds : [ `Backslash_bug
| `MSIE_Content_type_bug
| `Work_around_MSIE_Content_type_bug
| `Work_around_backslash_bug ] list
;
(*The list of enabled workarounds.

  • `MSIE_Content_type_bug
  • `Backslash_bug A common error in MIME implementations is not to escape backslashes in quoted string and comments. This workaround mandates that backslashes are handled as normal characters. This is important for e.g. DOS filenames because, in the absence of this, the wrongly encoded "C:\dir\file" will be decoded as "C:dirfile".
Remark: `Work_around_MSIE_Content_type_bug and `Work_around_backslash_bug are deprecated versions of, respectively, `MSIE_Content_type_bug and `Backslash_bug.
*)
   default_exn_handler : bool; (*Whether to catch exceptions raised by the script and display an error page. This will keep the connector running even if your program has bugs in some of its components. This will however also prevent a stack trace to be printed; if you want this turn this option off.*)
}
val default_config : config
The default configuration is:
  • tmp_directory: Netsys_tmp.tmp_directory()
  • tmp_prefix: "netcgi"
  • permitted_http_methods: `GET, `HEAD, `POST.
  • permitted_input_content_types: "multipart/form-data", "application/x-www-form-urlencoded".
  • input_content_length_limit: maxint (i.e., no limit).
  • max_arguments = 10000 (for security reasons)
  • workarounds: all of them.
  • default_exn_handler: set to true.
To create a custom configuration, it is recommended to use this syntax:
      let custom_config = { default_config with tmp_prefix = "my_prefix" }
      
(This syntax is also robust w.r.t. the possible addition of new config flields.)
class type cgi_environment = object .. end
The environment of a request consists of the information available besides the data sent by the user (as key-value pairs).

CGI object


type other_url_spec = [ `Env | `None | `This of string ] 
Determines how an URL part is generated:
  • `Env: Take the value from the environment.
  • `This v: Use this value v. It must already be URL-encoded.
  • `None: Do not include this part into the URL.

type query_string_spec = [ `Args of rw_cgi_argument list
| `Env
| `None
| `This of cgi_argument list ]
Determines how the query part of URLs is generated:
  • `Env: The query string of the current request.
  • `This l: The query string is created from the specified argument list l.
  • `None: The query string is omitted.
  • `Args: deprecated, use `This (left for backward compatibility).

type cache_control = [ `Max_age of int | `No_cache | `Unspecified ] 
This is only a small subset of the HTTP 1.1 cache control features, but they are usually sufficient, and they work for HTTP/1.0 as well. The directives mean:

  • `No_cache: Caches are disabled. The following headers are sent: Cache-control: no-cache, Pragma: no-cache, Expires: (now - 1 second). Note that many versions of Internet Explorer have problems to process non-cached contents when TLS/SSL is used to transfer the file. Use `Max_age in such cases (see http://support.microsoft.com/kb/316431).
  • `Max_age n: Caches are allowed to store a copy of the response for n seconds. After that, the response must be revalidated. The following headers are sent: Cache-control: max-age n, Cache-control: must-revalidate, Expires: (now + n seconds)
  • `Unspecified: No cache control header is added to the response.
Notes:
  • Cache control directives only apply to GET requests; POST requests are never cached.
  • Not only proxies are considered as cache, but also the local disk cache of the browser.
  • HTTP/1.0 did not specify cache behaviour as strictly as HTTP/1.1 does. Because of this the Pragma and Expires headers are sent, too. These fields are not interpreted by HTTP/1.1 clients because Cache-control has higher precedence.

class type cgi = object .. end
Object symbolizing a CGI-like request/response cycle.
class type cgi_activation = cgi
Alternate, more descriptive name for cgi

Connectors


type output_type = [ `Direct of string
| `Transactional of
config ->
Netchannels.out_obj_channel -> Netchannels.trans_out_obj_channel ]
The ouput type determines how generated data is buffered.
  • `Direct sep: Data written to the output channel of the activation object is not collected in a transaction buffer, but directly sent to the browser (the normal I/O buffering is still active, however, so call #flush to ensure that data is really sent). The method #commit_work of the output channel is the same as #flush. The method #rollback_work causes that the string sep is sent, meant as a separator between the already generated output, and the now following error message.
  • `Transactional f: A transactional channel tc is created from the real output channel ch by calling f cfg ch (here, cfg is the CGI configuration). The channel tc is propagated as the output channel of the activation object. This means that the methods commit_work and rollback_work are implemented by tc, and the intended behaviour is that data is buffered in a special transaction buffer until commit_work is called. This invocation forces the buffered data to be sent to the browser. If, however, rollback_work is called, the buffer is cleared.
Two important examples for `Transactional are:
  • The transaction buffer is implemented in memory:
          let buffered _ ch = new Netchannels.buffered_trans_channel ch in
          `Transactional buffered
          
  • The transaction buffer is implemented as an external file:
          `Transactional(fun _ ch -> new Netchannels.tempfile_output_channel ch)
          

val buffered_transactional_outtype : output_type
The output_type implementing transactions with a RAM-based buffer
val buffered_transactional_optype : output_type
Deprecated name for buffered_transactional_outtype
val tempfile_transactional_outtype : output_type
The output_type implementing transactions with a tempfile-based buffer
val tempfile_transactional_optype : output_type
Deprecated name for tempfile_transactional_outtype
type arg_store = cgi_environment ->
string ->
Netmime.mime_header_ro ->
[ `Automatic
| `Automatic_max of float
| `Discard
| `File
| `File_max of float
| `Memory
| `Memory_max of float ]
This is the type of functions arg_store so that arg_store env name header tells whether to `Discard the argument or to store it into a `File or in `Memory. The parameters passed to arg_store are as follows:

  • env is the CGI environment. Thus, for example, you can have different policies for different cgi_path_info.
  • name is the name of the argument.
  • header is the MIME header of the argument (if any).
Any exception raised by arg_store will be treated like if it returned `Discard. Note that the `File will be treated like `Memory except for `POST "multipart/form-data" and `PUT queries.

`Automatic means to store it into a file if the header contains a file name and otherwise in memory (strictly speaking `Automatic is not necessary since arg_store can check the header but is provided for your convenience).

`Memory_max (resp. `File_max, resp. `Automatic_max) is the same as `Memory (resp. `File, resp. `Automatic) except that the parameter indicates the maximum size in kB of the argument value. If the size is bigger, the Netcgi.cgi_argument methods #value and #open_value_rd methods will raise Netcgi.Argument.Oversized.

Remark: this allows for fine grained size constraints while Netcgi.config.input_content_length_limit option is a limit on the size of the entire request.

type exn_handler = cgi_environment -> (unit -> unit) -> unit 
A function of type exn_handler allows to define a custom handler of uncaught exceptions raised by the unit -> unit parameter. A typical example of exn_handler is as follows:
      let exn_handler env f =
        try f()
        with
        | Exn1 -> (* generate error page *)
            env#set_output_header_fields [...];
            env#send_output_header();
            env#out_channel#output_string "...";
            env#out_channel#close_out()
        | ...
      

type connection_directive = [ `Conn_close | `Conn_close_linger | `Conn_error of exn | `Conn_keep_alive ] 
Directive how to go on with the current connection:
  • `Conn_close: Just shut down and close descriptor
  • `Conn_close_linger: Linger, shut down, and close descriptor
  • `Conn_keep_alive: Check for another request on the same connection
  • `Conn_error e: Shut down and close descriptor, and handle the exception e


Specific connectors can be found in separate modules. For example:

A typical use is as follows:
    open Netcgi

    let main (cgi:cgi) =
       let arg = cgi#argument_value "name" in
       ...
       cgi#out_channel#commit_work()

    let () =
      let buffered _ ch = new Netchannels.buffered_trans_channel ch in
      Netcgi_cgi.run ~output_type:(`Transactional buffered) main
    

This web site is published by Informatikbüro Gerd Stolpmann
Powered by Caml