module Rpc_proxy: sig .. end
RPC proxies
The Rpc_proxy module provides an improved reliability layer on top of Rpc_client.
The proxy functionality is implemented in two layers, the managed clients and the managed sets. The former layer can handle only one TCP connection (with reconnect), whereas the latter is able to manage several connections to the same service. Both layers can profit from a reliability cache that knows which services had errors in the past.
See below for a tutorial.
There is also a blog article explaining RPC proxies: "The next server, please!"
module ReliabilityCache: sig .. end
module ManagedClient: sig .. end
module ManagedSet: sig .. end
Rpc_proxy tutorial
Managed clients
A normal RPC client has a very limited lifecycle: It is created, then a connection is made to an RPC service, messages are exchanged, and finally the connection is terminated. After that the client becomes unusable. In short, it is a "use once" client.
In contrast to this, managed clients can be recycled. This is especially useful for dealing with socket errors, and connection terminations triggered by the RPC server.
How to use managed clients: For a normal RPC client the generator ocamlrpcgen creates all required glue code to easily start RPC calls. For example, if a file proto.x is taken as input for ocamlrpcgen, a piece of code doing a call could look like:
let client =
  Proto_clnt.PROG.VERS.create_client connector protocol
let result =
  Proto_clnt.PROG.VERS.procedure client argument
(Here, PROG, VERS, procedure are just placeholders for the name of the program, the version identifier, and the procedure name.)
For RPC proxies, however, this is slightly more complicated. ocamlrpcgen does not produce a managed client that is ready for use. Instead, only a functor is provided that can take the Rpc_proxy.ManagedClient module as input:
module M = Proto_clnt.Make'PROG(Rpc_proxy.ManagedClient)
let esys =
  Unixqueue.create_unix_event_system()
let mclient_config =
  Rpc_proxy.ManagedClient.create_mclient_config
    ~programs:[ Proto_clnt.PROG.VERS._program ]
    ()
let mclient =
  Rpc_proxy.ManagedClient.create_mclient mclient_config connector esys
let result =
  M.VERS.procedure mclient argument
(The functor approach has been chosen because it gives the user more flexibility: it is possible to apply the functor to other implementations of improved clients than Rpc_proxy.ManagedClient.)
Note that esys is always explicit, even if the user only performs synchronous calls. In that case the user should create a new esys, pass it to mclient, and otherwise ignore it.
Now, how does the recycling feature work? The managed client can be in one of three states:
`Down: The client is not connected. This is the initial state, and the state after errors and terminated connections (no matter whether triggered by the client or by the server)
`Connecting: The client is busy (re)connecting (only used in some cases)
`Up sockaddr: The client is connected and has the socket address sockaddr
The state can be queried by calling Rpc_proxy.ManagedClient.mclient_state.
When the state is `Down, the next RPC call automatically starts the reconnect to the service. Once the connection is established, the call is carried out, i.e. the messages representing the call are exchanged. After the call, the state remains `Up.
When the call stops because of an error, the error is reported to the user in the normal way, and the client is shut down, i.e. after an error the state is `Down. If the user decides to try the call again, the client automatically reconnects following the outlined rules. Note that managed clients never automatically retry calls by themselves.
When the TCP connection is regularly shut down (either by the server or by the client calling Rpc_proxy.ManagedClient.shut_down), the client state is changed to `Down at the next opportunity. In particular, a server-driven shutdown may only be detected when the next RPC call is tried on the connection. This may or may not lead to an error, depending on the exact timing. In any case, the connection is eventually established again.
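The transitions described above can be summarized as a small state machine. The following is only a toy model for illustration, not the Ocamlnet implementation: the event type, the step function, and the use of a plain string instead of a socket address are all assumptions made for this sketch. It also always passes through `Connecting`, whereas the real client uses that state only in some cases.

```ocaml
(* Toy model of the managed-client lifecycle; NOT the Ocamlnet code.
   The address is a plain string here instead of a real socket address. *)
type state = Down | Connecting | Up of string

(* Hypothetical events that drive the model. *)
type event =
  | Call_started          (* the user issues an RPC call *)
  | Connected of string   (* the (re)connect succeeded *)
  | Call_failed           (* the call ended with an error *)
  | Shut_down             (* regular shutdown by client or server *)

(* One transition of the model. *)
let step state event =
  match state, event with
  | Down, Call_started -> Connecting      (* a call triggers the reconnect *)
  | Connecting, Connected addr -> Up addr
  | Up addr, Call_started -> Up addr      (* already connected: stay up *)
  | _, Call_failed -> Down                (* errors shut the client down *)
  | _, Shut_down -> Down
  | s, _ -> s                             (* anything else: no change *)

(* Replay a list of events, starting in the initial state Down. *)
let run events = List.fold_left step Down events
```

In this model, a failed call leaves the client in `Down`, and only the next `Call_started` event moves it towards a new connection, which mirrors the rule that managed clients never retry calls by themselves.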
Of course, managed clients must be shut down after use, because there is no other (automatic) way of recognizing that they are no longer used. Call Rpc_proxy.ManagedClient.shut_down for this.
Managed clients also have a few more features that can be enabled in mclient_config, especially:
esys is running. For synchronous calls in particular this is typically not the case, so one would have to call Unixqueue.run esys now and then to create opportunities for detecting the idle timeout.
Managed sets
Managed sets are another layer on top of the managed clients. These sets are able to manage several connections, each of which is implemented as a managed client. The connections can go to the same server endpoint in order to parallelize RPCs at the client side, or to several server endpoints that provide the same service. The latter can be used for client-driven load balancing, and for client-driven failover management of HA setups (HA = high availability).
To create a managed set, the code looks like:
module M = Proto_clnt.Make'PROG(Rpc_proxy.ManagedClient)
let esys =
  Unixqueue.create_unix_event_system()
let mclient_config =
  Rpc_proxy.ManagedClient.create_mclient_config
    ~programs:[ Proto_clnt.PROG.VERS._program ]
    ()
let mset_config =
  Rpc_proxy.ManagedSet.create_mset_config
    ~mclient_config
    ()
let services =
  [| connector, n_connections; ... |]
let mset =
  Rpc_proxy.ManagedSet.create_mset
    mset_config
    services
    esys
let mclient, idx =
  Rpc_proxy.ManagedSet.mset_pick mset
let result =
  M.VERS.procedure mclient argument
The managed clients are created internally by the set. One only has to pass in mclient_config so the set knows what kind of client is preferred. For the simple application of maintaining several connections to the same server, one would create the mset with a one-element service array:
let services =
  [| connector, n_connections |]
where connector describes the server port, and n_connections is the maximum number of connections to create and maintain.
The Rpc_proxy.ManagedSet.mset_pick function internally creates up to n_connections managed clients, and returns one of them. By default, it is not guaranteed that the client is idle (meaning no previous call is pending): if all connections are already busy, mset_pick starts returning busy connections (but the least busy one first).
A number of options allow modifying the default behavior:
One can enforce that only idle clients are returned by mset_pick. To do this, pass the argument ~mset_pending_calls_max:1 to Rpc_proxy.ManagedSet.create_mset_config. It can then happen that no client is idle, and mset_pick will raise Rpc_proxy.ManagedSet.Cluster_service_unavailable.
If the services array has more than one element, the elements are considered as equivalent service endpoints. mset_pick will pick one of the endpoints. There are two policies controlling the selection: With ~policy:`Balance_load the set aims at sending roughly the same number of calls to all endpoints. With ~policy:`Failover the services are assigned precedences by their position in the array (i.e. the first service is used as long as possible, then the second service is used, etc.). The policy argument is again to be passed to Rpc_proxy.ManagedSet.create_mset_config.
Managed sets must be shut down after use. Call Rpc_proxy.ManagedSet.shut_down for this.
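The difference between the two selection policies can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not the selection code of Rpc_proxy.ManagedSet: the endpoint record, its fields, and the pick function are hypothetical names, and "load" is approximated by a simple counter of calls sent so far.

```ocaml
(* Hypothetical endpoint description used only by this sketch. *)
type endpoint = {
  name : string;        (* label of the service endpoint *)
  available : bool;     (* false if the endpoint is currently disabled *)
  calls_sent : int;     (* number of calls sent to it so far *)
}

(* Pick an endpoint according to the policy, or None if none is available. *)
let pick policy (services : endpoint array) =
  let candidates =
    List.filter (fun e -> e.available) (Array.to_list services) in
  match policy, candidates with
  | _, [] -> None
  | `Failover, first :: _ ->
      (* array position is the precedence: first available one wins *)
      Some first.name
  | `Balance_load, first :: rest ->
      (* choose the available endpoint that has received the fewest calls *)
      let least =
        List.fold_left
          (fun best e -> if e.calls_sent < best.calls_sent then e else best)
          first rest in
      Some least.name
```

With the endpoints a (unavailable), b (5 calls) and c (2 calls) in that order, the failover policy of this model selects b, while the balancing policy selects c.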
Caching reliability data
The cache makes it possible to disable certain hosts or ports when the error counter reaches a limit. The service is disabled for a limited time span. This is especially useful when there is an alternate port that can jump in for the failing one, i.e. when the services array of a managed set has two or more elements.
There is a single global cache object, but one can also create specific cache objects. Generally, cache objects can be shared by many managed clients and managed sets. Sharing is expected to be useful because more data becomes available to the users of services. If you do not want to use the global cache object, you can create your own, and configure it in mclient_config.
The global cache object is automatically used when nothing else is specified. By default, however, it is configured so that it does not have any effect. So we have to change this in order to enable the cache:
let rcache_config =
  Rpc_proxy.ReliabilityCache.create_rcache_config
    ~policy:`Independent
    ~threshold:3
    () in
Rpc_proxy.ReliabilityCache.set_global_rcache_config rcache_config
This means that 3 errors in sequence disable a service port. `Independent means that each port is handled independently in this respect.
The first time, the port is disabled for only one second. The duration of the time span is increased with each additional error until it reaches 64 seconds. These durations can be changed, of course.
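As an illustration of how such a growing time span could behave, here is a minimal sketch. The text only states the two bounds (1 second and 64 seconds); that the duration doubles with each additional error is an assumption of this sketch, and disable_duration, min_s and max_s are hypothetical names, not part of the Rpc_proxy API.

```ocaml
(* Sketch only. ASSUMPTION: the disable time doubles with each additional
   error; the documentation above only gives the bounds 1s and 64s. *)
let disable_duration ?(min_s = 1.0) ?(max_s = 64.0) n_extra_errors =
  (* n_extra_errors = 0 is the first time the port gets disabled *)
  min max_s (min_s *. (2.0 ** float_of_int n_extra_errors))

(* Durations for the first eight consecutive disable periods. *)
let first_durations = List.init 8 (fun n -> disable_duration n)
(* = [1.; 2.; 4.; 8.; 16.; 32.; 64.; 64.] under the doubling assumption *)
```

The cap ensures that a persistently failing port is still retried at least once per max_s seconds in this model.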
As the impact of changing the global cache object is sometimes unpredictable, one can also create a private cache object (Rpc_proxy.ReliabilityCache.create_rcache). Another way is to derive a semi-private object from the global one. This means that the error counters are global, but the interpretation can be set individually in each use. This would look like:
let rcache_config =
  Rpc_proxy.ReliabilityCache.create_rcache_config
    ~policy:`Independent
    ~threshold:3
    () in
let rcache =
  Rpc_proxy.ReliabilityCache.derive_rcache
    (Rpc_proxy.ReliabilityCache.global_rcache())
    rcache_config in
...
let mclient_config =
  Rpc_proxy.ManagedClient.create_mclient_config
    ...
    ~rcache
    ...
    ()
Idempotent calls
In the layer of managed sets there is some limited support for automatically repeating failing idempotent RPC calls.
Instead of calling the RPC with
let mclient, idx =
  Rpc_proxy.ManagedSet.mset_pick mset in
let result =
  M.VERS.procedure mclient argument
one uses
let result =
  Rpc_proxy.ManagedSet.idempotent_sync_call
    mset
    M.VERS.procedure'async
    argument
The effect is that Rpc_proxy.ManagedSet.idempotent_sync_call automatically repeats the call when an error occurs. It is assumed that the call is idempotent so it can be repeated without changing the meaning.
The call may be repeated several times. This is configured in the managed set mset (parameter mset_idempotent_max).
Note that one has to pass the asynchronous version (suffix 'async) of the RPC wrapper even when doing a synchronous call.
Also see the documentation for Rpc_proxy.ManagedSet.idempotent_async_call.
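The general idea of repeating an idempotent call can be sketched with a plain retry loop. This is not how Rpc_proxy.ManagedSet.idempotent_sync_call is implemented; retry and max_attempts are hypothetical names, and how max_attempts would relate to mset_idempotent_max is not specified here. The sketch also retries on any exception, whereas a real client would likely distinguish error kinds.

```ocaml
(* Generic retry loop (illustration only): call [f arg] up to
   [max_attempts] times. If the last allowed attempt also raises,
   its exception propagates to the caller. *)
let rec retry ~max_attempts f arg =
  try f arg
  with _ when max_attempts > 1 ->
    (* an attempt failed and at least one more is allowed: repeat *)
    retry ~max_attempts:(max_attempts - 1) f arg
```

Repeating is only safe because the call is assumed to be idempotent: running it two or three times must have the same effect as running it once.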