module Xdr_mstring: Managed Strings
A managed string ms is declared in the XDR file as in

   typedef _managed string ms<>;

In the encoded XDR stream there is no difference between strings and
managed strings, i.e. the wire representation is identical. Only
the Ocaml type to which the managed string is mapped differs. This
type is mstring (see below).
In the RPC context there is often the problem that the I/O backend would profit from a different string representation than the user of the RPC layer. To bridge this gap, managed strings have been invented. Generally, the user can determine how to represent strings (usually either as an Ocaml string, or as memory), and the I/O backend can request to transform to a different representation when this leads to an improvement (i.e. copy operations can be saved).
Only large managed strings (at least several kilobytes) result in a speedup of the program.
How to practically use managed strings
There are two cases: the encoding case, and the decoding case.
In the encoding case the mstring object is created by the user
and passed to the RPC library. This happens when a client prepares
an argument for calling a remote procedure, or when the server
sends a response back to the caller. In the decoding case the client
analyzes the response from an RPC call, or the server looks at the
arguments of an RPC invocation. The difference here is that in the
encoding case user code can directly create mstring objects by
calling functions of this module, whereas in the decoding case the
RPC library creates the mstring objects.
For simplicity, let us only look at this problem from the perspective of an RPC client.
Encoding. Imagine a client wants to call an RPC, and one of the
arguments is a managed string. This means we finally need an mstring
object that can be put into the argument list of the call.
This library gives special support to two string representations: the
normal string type, and Netsys_mem.memory, which is actually just
a bigarray of chars. There are two factories fac,
string_based_mstrings and memory_based_mstrings, and both can be used
to create the mstring to pass to the RPC layer. It should be noted
that this layer can process the memory representation a bit better.
So, if the original data value is a string, the factory for string
should be used, and if it is a char bigarray, the factory for memory
should be used. Now, the mstring object is created by

let mstring = fac # create_from_string data pos len copy_flag, or by
let mstring = fac # create_from_memory data pos len copy_flag.

Of course, if fac is the factory for strings, the create_from_string
method works better, and if fac is the factory for memory, the
create_from_memory method works better. pos and len can select a
substring of data. If copy_flag is false, the mstring object does not
copy the data if possible, but just keeps a reference to data until it
is accessed; otherwise, if copy_flag is true, a copy is made
immediately. Of course, delaying the copy is better, but this requires
that data is not modified until the RPC call is completed.
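
The two creation paths described above can be sketched as follows. This is only an illustration, assuming Ocamlnet 3.x (where the module is still named Xdr_mstring) with its netsys and rpc libraries loaded; the names data, mem, ms_from_string and ms_from_memory are made up for the example.

```ocaml
(* Sketch only: requires Ocamlnet 3.x (libraries netsys and rpc). *)

let data = "payload: hello world"

(* String-typed source data, so the string-based factory is used. *)
let ms_from_string : Xdr_mstring.mstring =
  let fac = Xdr_mstring.string_based_mstrings in
  (* pos = 9 and len = 11 select the substring "hello world".
     copy_flag = false asks the factory not to copy right away, so
     [data] must stay unmodified until the RPC call has completed. *)
  fac # create_from_string data 9 11 false

(* The same payload held in a char bigarray (Netsys_mem.memory). *)
let mem : Netsys_mem.memory =
  let m =
    Bigarray.Array1.create Bigarray.char Bigarray.c_layout
      (String.length data) in
  String.iteri (fun i c -> Bigarray.Array1.set m i c) data;
  m

(* Memory-typed source data, so the memory-based factory is used. *)
let ms_from_memory : Xdr_mstring.mstring =
  let fac = Xdr_mstring.memory_based_mstrings in
  (* copy_flag = true requests an immediate copy, so [mem] may be
     reused or overwritten afterwards. *)
  fac # create_from_memory mem 9 11 true
```

Either value can then be placed into the argument list of the call wherever the XDR type is a managed string.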
Decoding. Now, the call is done, and the client looks at the
result. There is also an mstring object in the result. As noted
above, this mstring object was already created by the RPC library
(and currently this library prefers string-based objects if not
told otherwise). The user code can now access this mstring
object with the access methods of the mstring class (see below).
As these methods are quite limited, it normally only makes sense
to output the mstring contents to a file descriptor.
The user can request a different factory for managed strings. The
function Rpc_client.set_mstring_factories can be used for this
purpose. (Similar ways exist for managed clients, and for RPC servers.)
Potential. Before introducing managed strings, a careful analysis
was done of how many copy operations can be avoided by using this
technique. Example: The first N bytes of a file are taken as
argument of an RPC call. Instead of reading these bytes into a
normal Ocaml string, an optimal implementation now uses a memory
buffer for this purpose. With memory-based mstrings the data is then
copied only twice: the first copy reads it from the file into the
input buffer (a memory value), and the second copy writes the data
into the socket. Part of the optimization is that Unix.read and
Unix.write do a completely avoidable copy of the data, which is
prevented by switching to Netsys_mem.mem_read and
Netsys_mem.mem_write, respectively. The latter two functions exploit
an optimization that is only possible when the data is held in a
memory value.

The possible optimizations for the decoding side of the problem
are slightly less impressive, but still worth doing.
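
The file example can be sketched roughly as below. It assumes Ocamlnet 3.x (libraries netsys and rpc) plus the unix library, and it assumes that Netsys_mem.mem_read takes the same argument order as Unix.read (descriptor, buffer, position, length). The function name read_prefix_as_mstring is hypothetical.

```ocaml
(* Sketch only: requires Ocamlnet 3.x (netsys, rpc) and unix. *)

(* Read up to [n] bytes from the start of [path] directly into a memory
   buffer, then wrap the filled part as an mstring without another copy. *)
let read_prefix_as_mstring (path : string) (n : int) : Xdr_mstring.mstring =
  let buf = Bigarray.Array1.create Bigarray.char Bigarray.c_layout n in
  let fd = Unix.openfile path [Unix.O_RDONLY] 0 in
  (* A single read may return fewer bytes than requested, so loop until
     the buffer is full or end of file is reached. *)
  let rec fill pos =
    if pos >= n then pos
    else
      let k = Netsys_mem.mem_read fd buf pos (n - pos) in
      if k = 0 then pos else fill (pos + k)
  in
  let got =
    try fill 0
    with e -> Unix.close fd; raise e
  in
  Unix.close fd;
  (* memory_to_mstring only references [buf]; the caller must not modify
     the buffer while the mstring is in use. *)
  Xdr_mstring.memory_to_mstring ~pos:0 ~len:got buf
```

The returned mstring is memory-based, which is the representation the text says the RPC layer handles a bit better.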
class type mstring_factory = object .. end
val string_based_mstrings : mstring_factory
val string_to_mstring : ?pos:int -> ?len:int -> string -> mstring
val memory_based_mstrings : mstring_factory
val memory_to_mstring : ?pos:int -> ?len:int -> Netsys_mem.memory -> mstring
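val paligned_memory_based_mstrings : mstring_factory
Uses memory to represent mstrings. The memory bigarrays are allocated with Netsys_mem.alloc_memory_pages if available, and with Bigarray.Array1.create if not.
val memory_pool_based_mstrings : Netsys_mem.memory_pool -> mstring_factory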
val length_mstrings : mstring list -> int
val concat_mstrings : mstring list -> string
val prefix_mstrings : mstring list -> int -> string
prefix_mstrings l n: returns the first n chars of the concatenated mstrings l as a single string
val blit_mstrings_to_memory : mstring list -> Netsys_mem.memory -> unit
val shared_sub_mstring : mstring -> int -> int -> mstring
shared_sub_mstring ms pos len: returns an mstring that includes a substring of ms, starting at pos, and with len bytes. The returned mstring shares the buffer with the original mstring.
type named_mstring_factories = (string, mstring_factory) Hashtbl.t
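
As a small illustration of the list helpers length_mstrings, concat_mstrings and prefix_mstrings, the following sketch builds a two-element mstring list and inspects it. It assumes Ocamlnet 3.x with the rpc library loaded; the names parts, total, whole and head are made up for the example.

```ocaml
(* Sketch only: requires Ocamlnet 3.x (library rpc). *)

(* Two string-backed mstrings; the second one selects only the first
   five bytes of its underlying string via ~pos and ~len. *)
let parts : Xdr_mstring.mstring list =
  [ Xdr_mstring.string_to_mstring "hello ";
    Xdr_mstring.string_to_mstring ~pos:0 ~len:5 "world, and more" ]

(* Total number of bytes over all list elements. *)
let total : int = Xdr_mstring.length_mstrings parts

(* All elements joined into one freshly allocated string. *)
let whole : string = Xdr_mstring.concat_mstrings parts

(* Only the first 8 chars of the concatenation. *)
let head : string = Xdr_mstring.prefix_mstrings parts 8
```

Both concat_mstrings and prefix_mstrings return an ordinary string, so they are most useful for small payloads or for peeking at a header; for large data, blit_mstrings_to_memory keeps the contents in a memory buffer.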