Module Plasma_client

module Plasma_client: sig .. end

Client access to the Plasma Filesystem

This is a client library providing full access to the Plasma filesystem. The interface is meant to be largely self-explanatory; if questions come up, please consult the page Plasmafs_protocol, which explains the background concepts of the PlasmaFS protocol.

Many of the following functions return so-called engines. These functions have the suffix _e. Each of them has a "normal", i.e. synchronous, variant that returns the result directly instead of an engine computing it. Engines make it possible to send queries asynchronously. For more information about engines, see the module Uq_engines of Ocamlnet.

Some of the following types are defined in Plasma_rpcapi_aux, especially

type plasma_cluster

an open Plasma cluster

type plasma_trans

plasma transaction

type inode = int64

inodes are int64 numbers

type errno = [ `eaccess
       | `econflict
       | `ecoord
       | `eexist
       | `efailed
       | `efailedcommit
       | `efbig
       | `efhier
       | `einval
       | `eio
       | `elongtrans
       | `enametoolong
       | `enoent
       | `enoient
       | `enonode
       | `enospc
       | `enotrans
       | `eperm
       | `erofs
       | `etbusy ]

type topology = [ `Chain | `Star ]

see copy_in

exception Plasma_error of errno

Error reported by the server

exception Cluster_down of string

No access to the cluster possible

Open cluster

val open_cluster : string ->
       (string * int) list -> Unixqueue.event_system -> plasma_cluster

open_cluster name namenodes esys: Opens the cluster called name with the given namenodes (given as (hostname,port) pairs), using the event system esys. The client automatically determines which namenode is the coordinator.

val open_cluster_cc : Plasma_client_config.client_config ->
       Unixqueue.event_system -> plasma_cluster

Same, but takes a Plasma_client_config.client_config object which can in turn be obtained via Plasma_client_config.get_config.

val buffer_stats : plasma_cluster -> string

Statistics report

val close_cluster : plasma_cluster -> unit

Closes all file descriptors permanently

val abort_cluster : plasma_cluster -> unit

Closes the descriptors to remote services as far as possible, but does not permanently shut down the client functionality. The descriptors are automatically reopened when needed. Besides temporarily releasing resources, this also aborts the pending transactions.

val cluster_name : plasma_cluster -> string

Returns the cluster name

val cluster_namenodes : plasma_cluster -> (string * int) list

Returns the namenodes passed to open_cluster

val configure_buffer : plasma_cluster -> int -> unit

configure_buffer c n: Configures the client to use n buffers. Each buffer holds one block. These buffers are only used for buffered I/O, i.e. for Plasma_client.read and Plasma_client.write, but not for Plasma_client.copy_in and Plasma_client.copy_out.

val configure_pref_nodes : plasma_cluster -> string list -> unit

Configures the data nodes with the given identities as the preferred nodes for the allocation of new blocks. This setting stays active until it is changed again. It is useful for configuring local identities (see local_identities below), i.e. for enforcing that blocks are allocated on the same machine as far as possible.

val configure_shm_manager : plasma_cluster -> Plasma_shm.shm_manager -> unit

Configures a shared memory manager. This is an optional feature. The manager must be configured before the cluster is used.

val shm_manager : plasma_cluster -> Plasma_shm.shm_manager

Returns the current manager

val blocksize_e : plasma_cluster -> int Uq_engines.engine

val blocksize : plasma_cluster -> int

Returns the blocksize

val fsstat_e : plasma_cluster -> Plasma_rpcapi_aux.fsstat Uq_engines.engine

val fsstat : plasma_cluster -> Plasma_rpcapi_aux.fsstat

Return statistics

val local_identities_e : plasma_cluster -> string list Uq_engines.engine

val local_identities : plasma_cluster -> string list

Return the identities of the data nodes running on this machine (for configure_pref_nodes)

Transactions

All functions requiring a plasma_trans value as argument must be run inside a transaction. This means one first has to call start to open the transaction, then call the functions covered by the transaction, and finally call either commit or abort.

It is allowed to open several transactions simultaneously.

If you use the engine-based interface, it is important to ensure that the next function in a transaction is only called after the current function has delivered its result. This restriction applies only within the same transaction; other transactions are totally independent in this respect.

val start_e : plasma_cluster -> plasma_trans Uq_engines.engine

val start : plasma_cluster -> plasma_trans

Starts a transaction

val commit_e : plasma_trans -> unit Uq_engines.engine

val commit : plasma_trans -> unit

Commits a transaction, and makes the changes of the transaction permanent.

val abort_e : plasma_trans -> unit Uq_engines.engine

val abort : plasma_trans -> unit

Aborts a transaction, and abandons the changes of the transaction

val cluster : plasma_trans -> plasma_cluster

the cluster to which a transaction belongs
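The start/commit/abort pattern described in this section can be wrapped in a small helper. The following is a minimal sketch, not part of Plasma_client: the module type TRANS repeats only the declarations shown above, and the names TRANS, Make, and with_trans are hypothetical. The functor form is used so that the sketch compiles without the library; it assumes that aborting is the right reaction to any exception raised by the body.

```ocaml
(* Subset of the Plasma_client interface used by this sketch,
   copied from the declarations in this section. *)
module type TRANS = sig
  type plasma_cluster
  type plasma_trans
  val start : plasma_cluster -> plasma_trans
  val commit : plasma_trans -> unit
  val abort : plasma_trans -> unit
end

(* Make and with_trans are hypothetical names, not library functions. *)
module Make (C : TRANS) = struct
  (* Runs [f] inside a fresh transaction: commits if [f] returns,
     aborts and re-raises if [f] raises an exception. An exception
     raised by commit itself is passed through unchanged. *)
  let with_trans cluster f =
    let t = C.start cluster in
    match f t with
    | result -> C.commit t; result
    | exception e -> C.abort t; raise e
end
```

Under these assumptions, Make (Plasma_client) would yield a with_trans that can be used as with_trans c (fun t -> Plasma_client.lookup t "/some/path"), where the path is a placeholder.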

File creation/access over the inode interface

val create_inode_e : plasma_trans ->
       Plasma_rpcapi_aux.inodeinfo -> inode Uq_engines.engine

val create_inode : plasma_trans ->
       Plasma_rpcapi_aux.inodeinfo -> inode

Create a new inode. Initially, the inode does not have a name.

Inodes that do not have a name at the end of the transaction are automatically deleted. Use link_e to assign names (below).

See also Plasma_client.create_file below, which immediately links the inode to a name. See also Plasma_client.regular_ii, Plasma_client.dir_ii, and Plasma_client.symlink_ii for how to create inodeinfo values.

val delete_inode_e : plasma_trans -> inode -> unit Uq_engines.engine

val delete_inode : plasma_trans -> inode -> unit

Delete the inode

val get_inodeinfo_e : plasma_trans ->
       inode -> Plasma_rpcapi_aux.inodeinfo Uq_engines.engine

val get_inodeinfo : plasma_trans ->
       inode -> Plasma_rpcapi_aux.inodeinfo

get info about inode

val set_inodeinfo_e : plasma_trans ->
       inode -> Plasma_rpcapi_aux.inodeinfo -> unit Uq_engines.engine

val set_inodeinfo : plasma_trans ->
       inode -> Plasma_rpcapi_aux.inodeinfo -> unit

set info about inode

Fast data access

The function copy_in writes a local file to the cluster. copy_out reads a file from the cluster and copies it into a local file.

In particular, copy_in works only in units of whole blocks. The function never reads a block from the filesystem, modifies it, and writes it back. Instead, it writes the block with the data it has, and if there is still space to fill, it pads the block with zero bytes. If you need to access only parts of a block, use the buffered access below instead.

The local file must be a seekable file (pipes etc. not supported). It is possible to pass descriptors for shared memory files, though. This way you can write data from RAM, or read into RAM.

val copy_in_e : plasma_cluster ->
       inode ->
       int64 ->
       Unix.file_descr -> int64 -> topology -> unit Uq_engines.engine

val copy_in : plasma_cluster ->
       inode ->
       int64 -> Unix.file_descr -> int64 -> topology -> unit

copy_in_e c inode pos fd len topology: Copies the data from the file descriptor fd to the file given by inode. The data is taken from the current position of the descriptor. Exactly len bytes are copied. The data is written to position pos of the file referenced by the inode. If data is written past the EOF position, the EOF position is advanced.

fd must be seekable.

topology says how to transfer data from the client to the data nodes. `Star means the client organizes the writes to the data nodes as independent streams. `Chain means that the data is first written to one of the data nodes, and the replicas are transferred from there to the next data node.

Limitation: pos must be a multiple of the blocksize. The file is written in units of the blocksize.

copy_in performs its operations always in separate transactions.
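The block-alignment rule and the zero padding described above can be made concrete with a little arithmetic. This is a sketch only: the helper names are hypothetical, the block size would in practice come from Plasma_client.blocksize, and len is assumed to be non-negative.

```ocaml
(* Hypothetical helpers, not part of Plasma_client. *)

(* True if [pos] is a multiple of the block size, as copy_in and
   copy_out require. *)
let is_block_aligned ~blocksize pos =
  Int64.rem pos (Int64.of_int blocksize) = 0L

(* Number of whole blocks written when [len] bytes are copied to a
   block-aligned position. *)
let blocks_needed ~blocksize len =
  let bs = Int64.of_int blocksize in
  Int64.div (Int64.add len (Int64.pred bs)) bs

(* Number of zero bytes needed to fill up the last block. *)
let padding ~blocksize len =
  let bs = Int64.of_int blocksize in
  let r = Int64.rem len bs in
  if r = 0L then 0L else Int64.sub bs r
```

For example, with a block size of 1024, copying 1025 bytes occupies two blocks, and the second block is filled up with 1023 zero bytes.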

val copy_out_e : plasma_cluster ->
       inode ->
       int64 -> Unix.file_descr -> int64 -> unit Uq_engines.engine

val copy_out : plasma_cluster ->
       inode -> int64 -> Unix.file_descr -> int64 -> unit

copy_out_e c inode pos fd len: Copies the data from the file referenced by inode to file descriptor fd. The data is taken from position pos to pos+len-1 of the file, and it is written to the current position of fd.

fd must be seekable.

Limitation: pos must be a multiple of the blocksize.

copy_out performs its operations always in separate transactions.

Buffered data access

For getting well-performing buffered access, you should configure the size of the buffer via Plasma_client.configure_buffer.

type strmem = [ `Memory of Netsys_mem.memory | `String of string ]

The buffer for read and write can be given as string or as bigarray (memory). The latter is very advantageous: often, copying from and to the buffer can be completely avoided, as only the page table of the OS is modified ("copy-on-write" optimization). However, the bigarray must start at a page boundary for getting this effect (using map_file or one of the functions in Ocamlnet's Netsys_mem). Also, the buffer should be quickly discarded after the read or write, and it must not be reused for further reads or writes (otherwise the copy is only delayed but not avoided). Not meeting these requirements is not an error, though - the only downside is that the read/write is slightly more costly.

val read_e : plasma_cluster ->
       inode ->
       int64 ->
       strmem ->
       int -> int -> (int * bool * Plasma_rpcapi_aux.inodeinfo) Uq_engines.engine

val read : plasma_cluster ->
       inode ->
       int64 ->
       strmem ->
       int -> int -> int * bool * Plasma_rpcapi_aux.inodeinfo

read_e c inode pos s spos len: Reads data from inode, and returns (n,eof,ii) where n is the number of read bytes, and eof the indicator that EOF was reached. This number n may be less than len only if EOF is reached. ii is the current inodeinfo.

Before a read is served from a clean buffer, it is checked whether the buffer is still up to date.
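Reading a file sequentially amounts to calling read repeatedly until the eof flag is set. The sketch below shows such a loop. It does not call the library directly: read_fn is a hypothetical closure standing for something like fun pos len -> Plasma_client.read c inode pos s 0 len, and on_chunk is a hypothetical callback that consumes the n bytes just placed in the buffer.

```ocaml
(* Hypothetical helper, not part of Plasma_client.
   [read_fn pos len] is expected to behave like read: it returns
   (n, eof, ii) with n < len only if EOF was reached. *)
let read_until_eof read_fn ~chunk ~on_chunk =
  let rec loop pos total =
    let (n, eof, _ii) = read_fn pos chunk in
    if n > 0 then on_chunk n;
    let total = total + n in
    (* Also stop on n = 0 so that the loop cannot spin forever. *)
    if eof || n = 0 then total
    else loop (Int64.add pos (Int64.of_int n)) total
  in
  loop 0L 0
```

The function returns the total number of bytes read. Choosing chunk as a multiple of the block size is an assumption made here for efficiency, not a requirement stated in this interface.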

val write_e : plasma_cluster ->
       inode ->
       int64 -> strmem -> int -> int -> int Uq_engines.engine

val write : plasma_cluster ->
       inode -> int64 -> strmem -> int -> int -> int

write_e c inode pos s spos len: Writes data to inode and returns the number of written bytes. This number n may be less than len for arbitrary reasons (unlike read - to be fixed).

A write that is not aligned to a block implies that the old version of the block is read first (if not available in a buffer). This is a big performance penalty, and best avoided.

It is not ensured that the write is completed when the return value becomes available. The write is actually done in the background, and can be explicitly triggered with the flush_e operation. Also, note that the write happens in a separate transaction. (With "background" we do not mean a separate kernel thread, but an execution thread modeled with engines.)

Writing also ensures that the EOF position is set to at least the position after the last written byte. However, this only happens when the blocks are flushed in the background.

As writing happens in the background, special attention has to be paid to the way errors are reported. At the first error the write thread stops, and an error code is set. This code is reported at the next write or flush. After being reported, the code is cleared again. Writing is not automatically resumed; only further write and flush invocations will restart the writing thread. Also, the data buffers are kept intact after errors, so the client will try to write everything again (which may run into the same error). The function drop_inode can be invoked to have all dirty buffers of the inode dropped in the near future.
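Because write may accept fewer than len bytes, callers typically loop until the whole range has been handed over. The following is a sketch of such a loop, not a library function: write_fn is a hypothetical closure standing for something like fun pos spos len -> Plasma_client.write c inode pos s spos len, and treating a result of 0 as an error is an assumption made here to guarantee termination.

```ocaml
(* Hypothetical helper, not part of Plasma_client.
   [write_fn pos spos len] is expected to behave like write: it
   returns the number of bytes accepted, which may be less than len. *)
let write_all write_fn pos spos len =
  let rec loop pos spos len =
    if len > 0 then begin
      let n = write_fn pos spos len in
      (* Assumption: no progress is treated as a failure. *)
      if n <= 0 then failwith "write_all: no progress";
      loop (Int64.add pos (Int64.of_int n)) (spos + n) (len - n)
    end
  in
  loop pos spos len
```

Note that returning from this loop only means that the data is buffered. As described above, a subsequent flush is still needed to be sure the data is written, and a pending error may be reported there.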

val flush_e : plasma_cluster ->
       inode -> int64 -> int64 -> unit Uq_engines.engine

val flush : plasma_cluster -> inode -> int64 -> int64 -> unit

flush_e c inode pos len: Flushes all buffered data of inode from pos to pos+len-1, or to the end of the file if len=0. This ensures that the data is really written.

val drop_inode : plasma_cluster -> inode -> unit

Drops all dirty buffers of this inode. This prevents further attempts to write them, and frees up buffer space.

val flush_all_e : plasma_cluster -> unit Uq_engines.engine

val flush_all : plasma_cluster -> unit

Flushes all buffers. (No error reporting, though.)

Filename interface

val lookup_e : plasma_trans -> string -> inode Uq_engines.engine

val lookup : plasma_trans -> string -> inode

Looks up the filename and returns the inode number.

Inside a transaction the filename is read-locked: a competing transaction cannot delete/replace it while the transaction exists.

val dir_lookup_e : plasma_trans ->
       inode -> string -> inode Uq_engines.engine

val dir_lookup : plasma_trans ->
       inode -> string -> inode

Looks up the filename relative to a directory (given as inode) and returns the inode number.

This also places a read-lock on the filename like lookup.

val rev_lookup_e : plasma_trans ->
       inode -> string list Uq_engines.engine

val rev_lookup : plasma_trans -> inode -> string list

Returns the filenames linked with this inode number

val link_count_e : plasma_trans -> inode -> int Uq_engines.engine

val link_count : plasma_trans -> inode -> int

Returns the number of links

val link_e : plasma_trans ->
       string -> inode -> unit Uq_engines.engine

val link : plasma_trans -> string -> inode -> unit

Links a name with an inode

For directories there is the restriction that at most one name may be linked with the inode.

val unlink_e : plasma_trans -> string -> unit Uq_engines.engine

val unlink : plasma_trans -> string -> unit

Unlinks the name. If the count of links drops to 0 this also removes the inode.

This also works for directories! (They must be empty, of course.)

val list_inode_e : plasma_trans ->
       inode -> (string * inode) list Uq_engines.engine

val list_inode : plasma_trans ->
       inode -> (string * inode) list

Lists the contents of the directory, given by inode

Note that this operation can result in `econflict (although read-only)

val list_e : plasma_trans ->
       string -> (string * inode) list Uq_engines.engine

val list : plasma_trans -> string -> (string * inode) list

Lists the contents of the directory, given by filename

Note that this operation can result in `econflict (although read-only)

val create_file_e : plasma_trans ->
       string ->
       Plasma_rpcapi_aux.inodeinfo -> inode Uq_engines.engine

val create_file : plasma_trans ->
       string -> Plasma_rpcapi_aux.inodeinfo -> inode

Creates a regular file (inode plus name) or a symlink. The file type must be `ftype_regular or `ftype_symlink.

val mkdir_e : plasma_trans ->
       string ->
       Plasma_rpcapi_aux.inodeinfo -> inode Uq_engines.engine

val mkdir : plasma_trans ->
       string -> Plasma_rpcapi_aux.inodeinfo -> inode

Creates a directory

val regular_ii : plasma_cluster -> int -> Plasma_rpcapi_aux.inodeinfo

regular_ii c mode: Creates an inodeinfo record for a new empty regular file, where the mode field is set to mode modulo the current mask

val symlink_ii : plasma_cluster -> string -> Plasma_rpcapi_aux.inodeinfo

symlink_ii c target: Creates an inodeinfo record for a symlink pointing to target

val dir_ii : plasma_cluster -> int -> Plasma_rpcapi_aux.inodeinfo

dir_ii c mode: Creates an inodeinfo record for a new directory, where the mode field is set to mode modulo the current mask

Low-level functions

val get_blocklist_e : plasma_trans ->
       inode ->
       int64 -> int -> Plasma_rpcapi_aux.blockinfo list Uq_engines.engine

val get_blocklist : plasma_trans ->
       inode -> int64 -> int -> Plasma_rpcapi_aux.blockinfo list

get_blocklist_e t inode block n: Returns the list of blocks for blocks block to block+n-1. This is useful for analyzing where the blocks are actually physically stored.
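When the region of interest is known as a byte range, it first has to be translated into the block and n arguments. The helper below is a sketch under the assumption that block is a zero-based block index and that the block size is the one returned by Plasma_client.blocksize; the name block_range is hypothetical.

```ocaml
(* Hypothetical helper, not part of Plasma_client.
   Maps the byte range [pos, pos+len-1] to (block, n), i.e. the index
   of the first block touched and the number of blocks touched. *)
let block_range ~blocksize pos len =
  if len <= 0L then invalid_arg "block_range: len must be positive";
  let bs = Int64.of_int blocksize in
  let first = Int64.div pos bs in
  let last = Int64.div (Int64.add pos (Int64.pred len)) bs in
  (first, Int64.to_int (Int64.sub last first) + 1)
```

With a block size of 1024, the 100 bytes starting at position 1000 span blocks 0 and 1, so the helper yields block 0 and n = 2.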

Utilities

val retry : string -> int -> ('a -> 'b) -> 'a -> 'b

retry name n f arg: Executes f arg and returns the result. If a Plasma_error occurs the execution is repeated, up to n times.

Errors are logged (Netlog). name is used in log output.
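The behaviour of retry can be pictured with the following self-contained sketch. It is not the library's implementation: the exception is a local stand-in for Plasma_client.Plasma_error with only two of the error codes, logging goes to stderr instead of Netlog, and "up to n times" is interpreted here as at most n attempts in total, after which the last error is re-raised.

```ocaml
(* Local stand-in for Plasma_client.Plasma_error, so that the sketch
   compiles without the library. *)
exception Plasma_error of [ `econflict | `eio ]

(* retry_sketch is a hypothetical name. Assumption: at most [n]
   attempts in total; other exceptions are not caught. *)
let retry_sketch name n f arg =
  let rec attempt k =
    match f arg with
    | result -> result
    | exception (Plasma_error _ as e) ->
        if k >= n then raise e
        else begin
          (* The real function logs via Netlog; stderr is used here. *)
          prerr_endline
            (name ^ ": attempt " ^ string_of_int k ^ " failed, retrying");
          attempt (k + 1)
        end
  in
  attempt 1
```

A typical use would wrap a whole transaction, so that a conflict such as `econflict leads to a fresh attempt of the complete start, operate, commit sequence.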

This web site is published by Informatikbüro Gerd Stolpmann
