
Module Nethttp_client

module Nethttp_client: sig .. end
HTTP 1.1 client


Note for beginners: There is a simplified interface called Nethttp_client.Convenience.

Implements the following advanced features:
  • chunked message transport
  • persistent connections
  • connections in pipelining mode ("full duplex" connections)
  • modular authentication methods, currently Basic, Digest, and Negotiate
  • event-driven implementation; allows concurrent service for several network connections
  • HTTP proxy support, also with Basic and Digest authentication
  • SOCKS proxy support (1)
  • HTTPS support is now built-in, but requires that a TLS provider is initialized (see Tls for more information). HTTPS proxies are also supported (CONNECT method) (1)
  • Automatic and configurable retry of failed idempotent requests
  • Redirections can be followed automatically
  • Compressed message bodies can be automatically decoded (gzip only, method set_accept_encoding) (1)
Left out:
  • multipart messages, including multipart/byterange
  • content digests specified by RFC 2068 and 2069 (2)
  • content negotiation (2)
  • conditional and partial GET (2)
  • client-side caching (2)
  • HTTP/0.9 compatibility
(1) Since Ocamlnet-3.3

(2) These features can be implemented on top of this module if really needed, but there is no special support for them.

Related modules/software:

  • Nethttp_fs allows you to access HTTP servers in the style of filesystems
  • WebDAV: If you are looking for WebDAV there is an extension of this module: Webdav, which is separately available.


Thread safety

The module can be compiled such that it is thread-safe. In particular, one has to link the http_client_mt.cmxo object, and thread-safety is restricted to the following kinds of usage:

  • The golden rule is that threads must not share pipeline objects. If every thread uses its own pipeline, every thread has its own set of state variables. If two threads erroneously share a pipeline, this is not detected, neither by an error message nor by implicit serialization, and unpredictable behaviour may result.
  • The same applies to the other objects, e.g. http_call objects
  • The Convenience module even serializes; see below.


Types and Exceptions


exception Bad_message of string
The server sent a message which cannot be interpreted. The string indicates the reason.
exception No_reply
There was no response to the request because some other request failed earlier and it was not allowed to send the request again.
exception Too_many_redirections
While following redirections the limit has been reached
exception Name_resolution_error of string
Could not resolve this name - same as Uq_engines.Host_not_found
exception URL_syntax_error of string
This URL cannot be parsed after a redirection has been followed.
exception Timeout of string
A timeout. The string explains which connection is affected. New since Ocamlnet-3.3.
exception Proxy_error of int
An error status from a proxy. This is only used when extra proxy messages are used to configure the proxy (e.g. the CONNECT message).
exception Response_too_large
The length of the response exceeds the configured maximum
exception Http_protocol of exn
The request could not be processed because the given exception condition was raised. The inner exception is one of the exceptions defined above.
exception Http_error of (int * string)
Deprecated in the scope of pipeline. The server sent an error message. The left component of the pair is the error code, the right component is the error text. This exception is only used by get_resp_body, and by the Nethttp_client.Convenience module. Note that for the latter usage the exception does not count as deprecated.
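Since Http_protocol wraps one of the other exceptions, error reporting usually has to unwrap it first. The following is a minimal sketch of that unwrapping using only the standard library. The exceptions are re-declared locally so that the fragment compiles on its own (in real code you would match on the ones exported by Nethttp_client), and describe_error is a hypothetical helper name.

```ocaml
(* Local stand-ins for some of the exceptions documented above. In a real
   program, use the ones from Nethttp_client instead of redeclaring them. *)
exception Bad_message of string
exception No_reply
exception Too_many_redirections
exception Timeout of string
exception Proxy_error of int
exception Response_too_large
exception Http_protocol of exn

(* describe_error is a hypothetical helper: it unwraps Http_protocol and
   turns the inner exception into a readable message. *)
let rec describe_error (e : exn) : string =
  match e with
  | Http_protocol inner -> "protocol error: " ^ describe_error inner
  | Bad_message reason -> "bad message (" ^ reason ^ ")"
  | No_reply -> "no reply"
  | Too_many_redirections -> "too many redirections"
  | Timeout what -> "timeout (" ^ what ^ ")"
  | Proxy_error code -> "proxy error " ^ string_of_int code
  | Response_too_large -> "response too large"
  | other -> Printexc.to_string other

let () =
  print_endline (describe_error (Http_protocol (Timeout "connection 1")))
```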
type status = [ `Client_error
| `Http_protocol_error of exn
| `Redirection
| `Server_error
| `Successful
| `Unserved ]
Condensed status information of an HTTP call:
  • `Unserved: The call has not yet been finished
  • `Http_protocol_error e: An error on HTTP level occurred. Corresponds to the exception Http_protocol.
  • `Successful: The call is successful, and the response code is between 200 and 299.
  • `Redirection: The call is successful, and the response code is between 300 and 399.
  • `Client_error: The call failed with a response code between 400 and 499.
  • `Server_error: The call failed for any other reason.
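The code ranges listed above can be written down as a small function. This is only a sketch of the documented ranges, not the library's own classification code: the type code_status is a local subset of status (the `Unserved and protocol-error cases do not depend on a response code), and classify_code is a hypothetical name.

```ocaml
(* Local subset of the status type: only the cases derived from a
   response code. Not the library's definition. *)
type code_status =
  [ `Successful | `Redirection | `Client_error | `Server_error ]

(* classify_code is a hypothetical helper following the ranges in the
   list above; everything outside 200..499 falls into `Server_error. *)
let classify_code (code : int) : code_status =
  if code >= 200 && code <= 299 then `Successful
  else if code >= 300 && code <= 399 then `Redirection
  else if code >= 400 && code <= 499 then `Client_error
  else `Server_error

let () =
  List.iter
    (fun code ->
      let label =
        match classify_code code with
        | `Successful -> "Successful"
        | `Redirection -> "Redirection"
        | `Client_error -> "Client_error"
        | `Server_error -> "Server_error"
      in
      Printf.printf "%d -> %s\n" code label)
    [ 200; 302; 404; 503 ]
```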

type 'a auth_status = [ `Auth_error
| `Continue of 'a
| `Continue_reroute of 'a * int
| `None
| `OK
| `Reroute of int ]
Status of HTTP-level authentication:
  • `None: Authentication wasn't tried.
  • `OK: The authentication protocol finished. What this means exactly depends on the protocol. For most protocols it just means that the server authenticated the client. For some protocols it may also mean that the client authenticated the server (mutual authentication).
  • `Auth_error: The authentication protocol did not succeed. Note that this state can also be reached for an otherwise successful HTTP response (i.e. code 200) when the client could not authenticate the server and the protocol demands this.
  • `Reroute trans_id: The request should be retried on a new connection for the transport identified by trans_id
  • `Continue: The authentication is still in progress. Normally the user should not see this state as the engine automatically continues the protocol. The argument of `Continue is private.
  • `Continue_reroute: the combination of continue and reroute: the auth protocol continues, but the next request must be sent on the indicated transport.
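As an illustration of how the rerouting cases differ from the others, the sketch below extracts the transport ID from a status value. The type is copied locally from the definition above so that the fragment is self-contained, and reroute_target is a hypothetical helper, not part of the module.

```ocaml
(* Local copy of the auth_status type shown above. *)
type 'a auth_status =
  [ `Auth_error
  | `Continue of 'a
  | `Continue_reroute of 'a * int
  | `None
  | `OK
  | `Reroute of int ]

(* reroute_target (hypothetical) returns the transport ID on which the
   next request has to be sent, if the status asks for rerouting. *)
let reroute_target (st : 'a auth_status) : int option =
  match st with
  | `Reroute trans_id | `Continue_reroute (_, trans_id) -> Some trans_id
  | `None | `OK | `Auth_error | `Continue _ -> None

let () =
  match reroute_target (`Reroute 3) with
  | Some id -> Printf.printf "retry on transport %d\n" id
  | None -> print_endline "no rerouting requested"
```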

type 'message_class how_to_reconnect = 
| Send_again (*Send the request automatically again*)
| Request_fails (*Drop the request*)
| Inquire of ('message_class -> bool) (*If the function returns true send again, otherwise drop the request.*)
| Send_again_if_idem (*Default behaviour: Send_again for idempotent methods (GET, HEAD), Request_fails for the rest*)
How to deal with automatic reconnections, especially when the connection crashes.
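The decision encoded by the four constructors can be sketched as a plain function. The type is re-declared locally, and the message class is replaced by a method name (a string) purely for illustration. In the library the argument of Inquire is a call object. The names is_idempotent and should_resend are hypothetical.

```ocaml
(* Local copy of the type shown above. *)
type 'message_class how_to_reconnect =
  | Send_again
  | Request_fails
  | Inquire of ('message_class -> bool)
  | Send_again_if_idem

(* Stand-in for a call object: only the HTTP method name is modelled.
   The text above names GET and HEAD as the idempotent methods. *)
let is_idempotent (meth : string) : bool = meth = "GET" || meth = "HEAD"

(* should_resend (hypothetical) answers: is the request sent again after
   the connection crashed? *)
let should_resend (mode : string how_to_reconnect) (meth : string) : bool =
  match mode with
  | Send_again -> true
  | Request_fails -> false
  | Inquire f -> f meth
  | Send_again_if_idem -> is_idempotent meth

let () =
  List.iter
    (fun m ->
      Printf.printf "%s: resend=%b\n" m (should_resend Send_again_if_idem m))
    [ "GET"; "HEAD"; "POST" ]
```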
type 'message_class how_to_redirect = 
| Redirect (*Perform the redirection*)
| Do_not_redirect (*No redirection*)
| Redirect_inquire of ('message_class -> bool) (*If the function returns true redirect, otherwise do not redirect. It is legal to set the Location header as part of the action performed by the function. (Should be an absolute http URL.)*)
| Redirect_if_idem (*Default behaviour: Redirect for idempotent methods (GET, HEAD), Do_not_redirect for the rest*)
type private_api 
The private part of the http_call class type
type response_body_storage = [ `Body of unit -> Netmime.mime_body
| `Device of unit -> Uq_io.out_device
| `File of unit -> string
| `Memory ]
How to create the response body:
  • `Memory: The response body is in-memory
  • `File f: The response body is stored into the file whose name is returned by f()
  • `Body f: The response body is stored into the object returned by f()
  • `Device f: The response is directly forwarded to the device obtained by f() (new since Ocamlnet-3.3)
When the function f is called in the latter cases the response header has already been received, and can be retrieved with the response_header method of the call object. Also, response_status_text, response_status_code, and response_status return meaningful values.
type synchronization = 
| Sync (*The next request begins after the response of the last request has been received.*)
| Pipeline of int (*The client is allowed to send several requests without waiting for responses. The number is the maximum number of unreplied requests that are allowed. A typical value: 5. If you increase this value, the risk becomes higher that requests must be sent repeatedly to the server if the connection crashes. Increasing is recommended if you send a large number of GET or HEAD requests to the server. Decreasing is recommended if you send large POST or PUT requests to the server. Values > 8 are interpreted as 8.*)
This type determines whether to keep requests and responses synchronized or not.

The first request/response round is always done in Sync mode, because the protocol version of the other side is not known at that moment. Pipeline requires HTTP/1.1.

In previous versions of netclient there was a third option, Sync_with_handshake_before_request_body. This option is no longer necessary because the HTTP specification has been updated in the meantime, and there is a better mechanism now (the Expect header is set).
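The cap on the pipeline depth can be illustrated with a short sketch. The type is re-declared locally and max_unreplied is a hypothetical helper. Treating Sync as "at most one unreplied request" is an interpretation of the description above, not something the library exposes as a function.

```ocaml
(* Local copy of the synchronization type. *)
type synchronization =
  | Sync
  | Pipeline of int

(* max_unreplied (hypothetical): how many requests may be outstanding.
   Values above 8 are treated as 8, as stated for Pipeline. *)
let max_unreplied (s : synchronization) : int =
  match s with
  | Sync -> 1
  | Pipeline n -> min n 8

let () =
  Printf.printf "Pipeline 5 -> %d, Pipeline 20 -> %d\n"
    (max_unreplied (Pipeline 5))
    (max_unreplied (Pipeline 20))
```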

type resolver = Unixqueue.unix_event_system ->
string -> (Unix.inet_addr option -> unit) -> unit
A name resolver is a function r called as r esys name reply. The name to resolve is passed as name. The resolver must finally call reply with either the resolved address or with None, the latter indicating an error. The event system esys can be used to carry out the resolution process in an asynchronous way, but this is optional.

Only 1:1 resolution is supported; 1:n resolution is not.
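The calling convention r esys name reply can be sketched with a resolver that looks names up in a fixed table. To keep the fragment free of external dependencies, the event system and address types are left polymorphic here. In the real resolver type they are Unixqueue.unix_event_system and Unix.inet_addr. The names generic_resolver and table_resolver are hypothetical.

```ocaml
(* Generalized shape of the resolver type. In the library, 'esys is
   Unixqueue.unix_event_system and 'addr is Unix.inet_addr. *)
type ('esys, 'addr) generic_resolver =
  'esys -> string -> ('addr option -> unit) -> unit

(* table_resolver (hypothetical) ignores the event system and answers
   synchronously: reply is called with Some addr for a known name and
   with None otherwise. *)
let table_resolver (table : (string * 'addr) list) :
    ('esys, 'addr) generic_resolver =
 fun _esys name reply -> reply (List.assoc_opt name table)

let () =
  let r = table_resolver [ ("localhost", "127.0.0.1") ] in
  r () "localhost" (function
    | Some a -> print_endline ("resolved: " ^ a)
    | None -> print_endline "not found")
```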

type transport_layer_id = int 
The ID identifies a requirement for the transport channel, especially whether plain HTTP is sufficient, or HTTPS needs to be used, and if so, whether there are further requirements for the TLS context. The predefined IDs are:
  • http_trans_id
  • https_trans_id
  • spnego_trans_id
  • proxy_only_trans_id


type http_options = {
   synchronization : synchronization; (*Default: Pipeline 5.*)
   maximum_connection_failures : int; (*This option limits the number of connection attempts. Default: 2*)
   maximum_message_errors : int; (*This option limits the number of protocol errors tolerated per request. If a request leads to a protocol error, the connection is shut down, the server is connected again, and the request is tried again (if the kind of the message allows retransmission). If a request repeatedly fails, this option limits the number of retransmissions. Default: 2*)
   inhibit_persistency : bool; (*This option turns persistent connections off. Default: false. It is normally not necessary to change this option.*)
   connection_timeout : float; (*If there is no network transmission for this period of time, the connection is shut down, and tried again. Default: 300.0 (seconds). It may be necessary to increase this value if HTTP is used for batch applications that contact extremely slow services.*)
   number_of_parallel_connections : int; (*The client keeps up to this number of parallel connections to a single content server or proxy. Default: 2. You may increase this value if you are mainly connected with an HTTP/1.0 proxy.*)
   maximum_redirections : int; (*The maximum number of redirections per message*)
   handshake_timeout : float; (*The timeout when waiting for "100 Continue". Default: 1.0*)
   resolver : resolver; (*The function for name resolution*)
   configure_socket : Unix.file_descr -> unit; (*A function to configure socket options*)
   tls : Netsys_crypto_types.tls_config option; (*The TLS configuration to use by default for https URLs. (This can be overridden per request by using a different transport_layer_id.)*)
   schemes : (string * Neturl.url_syntax * int option * transport_layer_id) list; (*The list of supported URL schemes. The tuples mean (scheme, syntax, default_port, cb). By default, the schemes "http", "https", and "ipp" are supported.*)
   verbose_status : bool;
   verbose_request_header : bool;
   verbose_response_header : bool;
   verbose_request_contents : bool;
   verbose_response_contents : bool;
   verbose_connection : bool;
   verbose_events : bool; (*Enable various debugging message types.
  • verbose_status: reports about status of received documents
  • verbose_request_header: prints the header sent to the server
  • verbose_request_contents: prints the document sent to the server
  • verbose_response_header: prints the header of the answer from the server
  • verbose_response_contents: prints the document received from the server
  • verbose_connection: reports many connection events; authentication, too.
  • verbose_events: everything about the interaction with Unixqueue
By default, verbose_status and verbose_connection are enabled. Note that you also have to set Debug.enable to true to see any log message at all!
*)
}
Options for the whole pipeline. It is recommended to change options in the following way:
    let opts = pipeline # get_options in
    let new_opts = { opts with <field> = <value>; ... } in
    pipeline # set_options new_opts
 

New fields can be added to this record at any time; code that changes options in this style keeps working when fields are added.

type header_kind = [ `Base | `Effective ] 
The `Base header is set by the user of http_call and is never changed while the call is processed. The `Effective header is a copy of the base header at the time the request is sent. The effective header contains additions like Content-length and authentication info.
class type http_call = object .. end
The container for HTTP calls
class type tls_cache = object .. end
A cache object for storing TLS session data
val null_tls_cache : unit -> tls_cache
This "cache" does not remember any session
val unlim_tls_cache : unit -> tls_cache
Returns a simple cache without limit for the number of cached sessions
class type transport_channel_type = object .. end

HTTP methods


class virtual generic_call : object .. end
This class is an implementation of http_call.

The following classes are implementations for the various HTTP methods. These classes do not initialize the call object.
class get_call : http_call
class trace_call : http_call
class options_call : http_call
class head_call : http_call
class post_call : http_call
class put_call : http_call
class delete_call : http_call

The following classes initialize the request message of the call (header and body). These classes are also backward compatible with the classes found in earlier versions of netclient.
class get : string -> http_call
Argument: URI
class trace : string -> int -> http_call
Arguments: URI, maximum number of hops
class options : string -> http_call
Argument: URI or "*"
class head : string -> http_call
Argument: URI
class post : string -> (string * string) list -> http_call
Arguments: URI, parameter list to be transferred as application/x-www-form-urlencoded body
class post_raw : string -> string -> http_call
Arguments: URI, body
class put : string -> string -> http_call
Arguments: URI, body
class delete : string -> http_call
Argument: URI
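To make the body format used by the post class concrete, the following sketch encodes a parameter list as application/x-www-form-urlencoded. It uses only the standard library and is an illustration of the format, not the encoder the library uses. The helper names encode_component and form_urlencode are hypothetical, and the exact set of characters left unescaped is an assumption.

```ocaml
(* encode_component (hypothetical): percent-encodes one name or value.
   Letters, digits and a few punctuation characters are kept, a space
   becomes '+', every other byte becomes %XX. The set of unescaped
   characters is an assumption made for this sketch. *)
let encode_component (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '-' | '_' | '.' | '*' ->
          Buffer.add_char buf c
      | ' ' -> Buffer.add_char buf '+'
      | _ -> Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf

(* form_urlencode (hypothetical): joins name=value pairs with '&'. *)
let form_urlencode (params : (string * string) list) : string =
  params
  |> List.map (fun (k, v) -> encode_component k ^ "=" ^ encode_component v)
  |> String.concat "&"

let () = print_endline (form_urlencode [ ("q", "a b"); ("x", "1&2") ])
```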

Authentication


class type key = object .. end
A key is a user/password combination for a certain realm
val key : user:string ->
password:string ->
realm:string -> domain:Neturl.url list -> key
Create a key object
val key_creds : user:string ->
creds:(string * string * (string * string) list) list ->
http_options -> key
Create a key object from a list of credentials
class type key_handler = object .. end
class key_ring : ?uplink:#key_handler -> ?no_invalidation:bool -> unit -> object .. end
The key_ring is a cache for keys.
class type auth_session = object .. end
An auth_session represents an authenticated session
class type auth_handler = object .. end
An authentication handler has the capability of adding the necessary headers to messages.
class basic_auth_handler : ?enable_reauth:bool -> ?skip_challenge:bool -> #key_handler -> auth_handler
Basic authentication.
class digest_auth_handler : #key_handler -> auth_handler
Digest authentication.
class unified_auth_handler : ?insecure:bool -> #key_handler -> auth_handler
Supports both digest and basic authentication, with a preference for digest.
class generic_auth_handler : #key_handler -> (module Nethttp.HTTP_MECHANISM) list -> auth_handler
Authenticate with the passed generic HTTP mechanisms

For the Negotiate method (SPNEGO/GSSAPI), have a look at Netmech_spnego_http.

Transport



A connection cache is an object that keeps connections open that are currently unused. A connection cache can be shared by several pipelines.
type connection_cache = Nethttp_client_conncache.connection_cache 
val close_connection_cache : connection_cache -> unit
Closes all descriptors known to the cache
val create_restrictive_cache : unit -> connection_cache
A restrictive cache closes connections as soon as there are no pending requests.
val create_aggressive_cache : unit -> connection_cache
This type of cache tries to keep connections open as long as possible. The consequence is that users are responsible for closing the descriptors (by calling close_connection_cache) when the cache is no longer in use.
val http_trans_id : transport_layer_id
Identifies the pure HTTP transport (without SSL), with or without web proxies
val https_trans_id : transport_layer_id
Identifies anonymous HTTPS transport (i.e. no client certificates), with or without web proxies.
val spnego_trans_id : transport_layer_id
Identifies an anonymous HTTPS transport that is additionally authenticated via SPNEGO (as described in RFC 4559)
val proxy_only_trans_id : transport_layer_id
Identifies web proxy connections. Use this to e.g. send an FTP URL to a web proxy via HTTP
val new_trans_id : unit -> transport_layer_id
Allocates and returns a new ID
val http_transport_channel_type : transport_channel_type
Transport via HTTP
val https_transport_channel_type : Netsys_crypto_types.tls_config -> transport_channel_type
Create a new transport for HTTPS and this configuration. As of OCamlnet-4, https is automatically enabled if Netsys_crypto.current_tls returns something, i.e. if TLS is globally initialized, so there is often no reason to configure a special https transport with this function. You still need it if you want to enable special configurations per request:

        let my_trans_id = Nethttp_client.new_trans_id() in
        let my_tct = Nethttp_client.https_transport_channel_type my_tls_config in
        pipeline # configure_transport my_trans_id my_tct;
      

Now you can enable this special configuration for a request object call:

        call # set_transport_layer my_trans_id
      

If you want to change the TLS configuration for the whole pipeline, just set the tls field of Nethttp_client.http_options.

type proxy_type = [ `Http_proxy | `Socks5 ] 

Pipelines


class pipeline : object .. end
A pipeline is a queue of HTTP calls to perform

Example using the pipeline:

 let call = new get "http://server/path" in
 let pipeline = new pipeline in
 pipeline # add call;
 pipeline # run();    (* Now the HTTP client is working... *)
 match call # status with
 | `Successful -> ...
 | ...
 


Auxiliary pipeline functions


val parse_proxy_setting : insecure:bool -> string -> string * int * (string * string * bool) option
Parses the value of an environment variable like http_proxy, i.e. an HTTP URL. The argument is the URL. Returns (host,port,auth) where auth may include user name, password, and the insecure flag.
val parse_no_proxy : string -> string list
Parses the value of an environment variable like no_proxy. Returns the list of domains.

Convenience module for simple applications



Do open Nethttp_client.Convenience for simple applications.
module Convenience: sig .. end

Debugging


module Debug: sig .. end
This web site is published by Informatikbüro Gerd Stolpmann