Plasma GitLab Archive

Module Mapred_fs

module Mapred_fs: sig .. end
Filesystem abstraction


This module defines the filesystem abstraction used for map/reduce jobs. A filesystem is usually backed by PlasmaFS, but it could be backed by essentially anything. An implementation for local files is also provided here.

This module is also responsible for resolving paths with a tree prefix, such as treename::/path. See "Multiple Trees" below for more.

The class type filesystem



This type is an extension of the type Netfs.stream_fs provided by Ocamlnet.

The following flags are explained where used in Mapred_fs.filesystem:

type read_flag = [ `Binary | `Dummy | `Skip of int64 | `Streaming ] 
type read_file_flag = [ `Binary | `Destination of string | `Dummy | `Temp of string * string ] 
type write_flag = [ `Binary
| `Create
| `Dummy
| `Exclusive
| `Location of string list
| `Repl of int
| `Streaming
| `Truncate ]
type write_file_flag = [ `Binary
| `Create
| `Dummy
| `Exclusive
| `Link
| `Location of string list
| `Repl of int
| `Truncate ]
type write_common = [ `Binary
| `Create
| `Dummy
| `Exclusive
| `Location of string list
| `Repl of int
| `Truncate ]
The intersection of write_flag and write_file_flag
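As a small illustration of why write_common is useful, the following sketch copies the three type definitions from above into a standalone file and shows that a list of write_common flags can be coerced to either of the wider flag types. The value names (common, for_write, for_write_file) are made up for this example and are not part of Mapred_fs.

```ocaml
(* Type definitions copied from the listing above so that the example
   compiles without the Plasma libraries. *)
type write_flag =
  [ `Binary | `Create | `Dummy | `Exclusive | `Location of string list
  | `Repl of int | `Streaming | `Truncate ]

type write_file_flag =
  [ `Binary | `Create | `Dummy | `Exclusive | `Link
  | `Location of string list | `Repl of int | `Truncate ]

type write_common =
  [ `Binary | `Create | `Dummy | `Exclusive | `Location of string list
  | `Repl of int | `Truncate ]

(* Hypothetical flag list; the chosen flags are only an illustration. *)
let common : write_common list = [ `Create; `Truncate; `Repl 2 ]

(* Every tag of write_common also occurs in both wider types, so the
   subtyping coercion is accepted in both directions. *)
let for_write : write_flag list =
  (common : write_common list :> write_flag list)

let for_write_file : write_file_flag list =
  (common : write_common list :> write_file_flag list)
```

A flag such as `Streaming or `Link cannot appear in a write_common list, which is what makes the list safe to pass on to either kind of operation.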
type size_flag = [ `Dummy ] 
type test_flag = [ `Dummy | `Link ] 
type remove_flag = [ `Dummy | `Recursive ] 
type rename_flag = [ `Dummy ] 
type symlink_flag = [ `Dummy ] 
type readdir_flag = [ `Dummy ] 
type readlink_flag = [ `Dummy ] 
type mkdir_flag = [ `Dummy | `Nonexcl | `Path ] 
type rmdir_flag = [ `Dummy ] 
type copy_flag = [ `Dummy | `Location of string list | `Repl of int ] 
type link_flag = [ `Dummy ] 
type test_type = [ `D | `E | `F | `H | `N | `R | `S | `W | `X ] 
Tests:
  • `N: the file name exists
  • `E: the file exists
  • `D: the file exists and is a directory
  • `F: the file exists and is regular
  • `H: the file exists and is a symlink (possibly to a non-existing target)
  • `R: the file exists and is readable
  • `W: the file exists and is writable
  • `X: the file exists and is executable
  • `S: the file exists and is non-empty
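To make the meaning of some of these tests concrete, here is a minimal sketch that evaluates a subset of them against the local filesystem using only the OCaml standard library. The function local_test is hypothetical and is not how Mapred_fs implements its test method. It approximates `F as "exists and is not a directory", and it returns None for the tests that would need the Unix library (`N, `H, `R, `W, `X).

```ocaml
type test_type = [ `D | `E | `F | `H | `N | `R | `S | `W | `X ]

(* Hypothetical helper: evaluate a test on a local path.
   Returns None for tests this stdlib-only sketch does not cover. *)
let local_test (path : string) (t : test_type) : bool option =
  let exists = Sys.file_exists path in
  let is_dir = exists && Sys.is_directory path in
  match t with
  | `E -> Some exists
  | `D -> Some is_dir
  | `F ->
      (* Approximation: anything that exists and is not a directory. *)
      Some (exists && not is_dir)
  | `S ->
      (* Non-empty: only checked for non-directories. *)
      Some
        (exists && not is_dir
         &&
         let ic = open_in_bin path in
         Fun.protect
           ~finally:(fun () -> close_in ic)
           (fun () -> in_channel_length ic > 0))
  | `N | `H | `R | `W | `X -> None
```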

class type local_file = object .. end
class type filesystem = object .. end
Abstract access to filesystems

Implementations


val plasma_filesystem : ?plasma_root:string ->
Plasma_client_config.client_config ->
(Plasma_client.plasma_cluster -> unit) -> filesystem
plasma_filesystem cc configure:

Access a PlasmaFS filesystem

plasma_root: If the filesystem is NFS-mounted, one can pass the mount directory here. This has an effect on the local_root method.

val local_filesystem : string -> filesystem
local_filesystem root: Access the local filesystem where the directory root is the assumed root.

Multiple Trees


val multi_tree : (string * filesystem) list -> string -> filesystem
multi_tree trees default_tree:

Returns a filesystem supporting paths like treename::/path. The treename names the filesystem in the trees argument. The path is taken relative to the chosen filesystem.

The default_tree is assumed if no treename:: prefix is specified.

Not all operations need to be supported on all trees. In particular, operations involving two files in different trees (e.g. rename, copy) may fail.
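The path resolution described above can be modelled in a few lines of plain OCaml. This is only a sketch of the idea, not the code used by multi_tree: the names split_tree_path and resolve are invented here, the rule that a "/" before any "::" means "no prefix" is an assumption, and the error message for an unknown tree is made up. The trees are represented by an arbitrary type so that the example runs without the Plasma libraries.

```ocaml
(* Split "treename::/path" into (treename, "/path"). If there is no
   "treename::" prefix, the default tree is used and the path is kept. *)
let split_tree_path ~(default_tree : string) (p : string) : string * string =
  let n = String.length p in
  let rec find i =
    if i + 1 >= n then None
    else if p.[i] = ':' && p.[i + 1] = ':' then Some i
    else if p.[i] = '/' then None (* assumption: prefix must precede any '/' *)
    else find (i + 1)
  in
  match find 0 with
  | Some i -> (String.sub p 0 i, String.sub p (i + 2) (n - i - 2))
  | None -> (default_tree, p)

(* Look up the tree by name; argument order mirrors
   multi_tree trees default_tree. *)
let resolve (trees : (string * 'fs) list) (default_tree : string)
    (p : string) : ('fs * string, string) result =
  let tree, path = split_tree_path ~default_tree p in
  match List.assoc_opt tree trees with
  | Some fs -> Ok (fs, path)
  | None -> Error ("unknown tree: " ^ tree)

(* Usage with string stand-ins in place of real filesystem objects. *)
let example_trees = [ ("file", "local fs"); ("plasma", "PlasmaFS") ]
let with_prefix = resolve example_trees "file" "plasma::/data/input"
let without_prefix = resolve example_trees "file" "/tmp/scratch"
```

Here with_prefix selects the "plasma" entry with path "/data/input", while without_prefix falls back to the "file" entry and keeps the path unchanged.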


Standard Configuration


val standard_fs : ?custom:(string * filesystem) list ->
?default:string ->
?configure_cluster:(Plasma_client.plasma_cluster -> unit) ->
Mapred_config.mapred_config -> filesystem
standard_fs conf: Returns the standard filesystem that is used for map/reduce jobs. There is always a "file" tree for local file access. Unless PlasmaFS is disabled, there is also a "plasma" tree for PlasmaFS access.

One can add further trees using the custom argument. With default one can override the default tree.

configure_cluster may be used to configure aspects of the "plasma" tree. By default, only daemon authentication is configured.

val standard_fs_cc : ?custom:(string * filesystem) list ->
?default:string ->
?configure_cluster:(Plasma_client.plasma_cluster -> unit) ->
?client_config:Plasma_client_config.client_config ->
?file_root:string -> ?plasma_root:string -> unit -> filesystem
Another version of standard_fs: If client_config is passed, PlasmaFS is enabled; otherwise it is disabled.

  • file_root: the local directory corresponding to "file::/"
  • plasma_root: the local directory corresponding to "plasma::/" (if PlasmaFS is locally mounted)
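The correspondence between tree paths and local directories can be pictured with the following sketch. It is a simplified model under stated assumptions, not the library's implementation: local_path is a hypothetical helper, the input is assumed to be already split into a tree name and an absolute path, and trees other than "file" and "plasma" are treated as having no local counterpart.

```ocaml
(* Concatenate a root directory and an absolute path, dropping a
   trailing slash on the root so that no "//" is produced. *)
let join (root : string) (path : string) : string =
  let n = String.length root in
  let root =
    if n > 0 && root.[n - 1] = '/' then String.sub root 0 (n - 1) else root
  in
  root ^ path

(* Hypothetical helper: map (tree, path) to a local path, if any.
   plasma_root is optional, mirroring the case where PlasmaFS is not
   locally mounted. *)
let local_path ~(file_root : string) ?plasma_root (tree, path) : string option =
  match tree with
  | "file" -> Some (join file_root path)
  | "plasma" -> Option.map (fun root -> join root path) plasma_root
  | _ -> None
```

For example, with file_root set to "/data", the pair ("file", "/a/b") maps to "/data/a/b", while a "plasma" path has no local equivalent unless plasma_root is given.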

This web site is published by Informatikbüro Gerd Stolpmann