
Cmd_plasma



plasma - command-line access to PlasmaFS files

Synopsis

plasma (list|ls)   <options> file ...
plasma (delete|rm) <options> file ...
plasma (rename|mv) <options> <see below>
plasma (link|ln)   <options> <see below>
plasma (copy|cp)   <options> <see below>
plasma mkdir       <options> file ...
plasma create      <options> file ...
plasma put         <options> (local_file | -stdin) pfs_file
plasma get         <options> pfs_file local_file
plasma cat         <options> file ...
plasma blocks      <options> file ...
plasma fsstat      <options>

General options for all or many commands:

  -cluster <name> 
  -namenode <host>:<port>
  -rep <n>
  -tree <prefix>=<url>
  -glob
  -no-glob

Description

The plasma utility gives direct access to files stored in PlasmaFS via the PlasmaFS-specific RPC protocol. It also includes methods for accessing a number of non-PlasmaFS file systems.

All pfs_file arguments refer to the file hierarchy of the PlasmaFS cluster. For now, all such paths must be absolute, e.g. /a/plasmafs/file.

The file arguments can additionally include prefixed files. The prefix is separated from the path by a colon, e.g. file:/a/local/path. The prefix "file" refers to files of the local filesystem. Additional prefixes can be defined with the -tree switch. Unknown prefixes are taken as host names, and the files are then accessed via the ssh protocol, e.g. host:/a/remote/path.
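
The prefix rules above can be illustrated with a small Python sketch. It is not the utility's actual parser: the function name classify_argument, the returned tuple layout, and the rule that a colon only counts as a prefix separator when it appears before the first slash are all assumptions made for illustration.

```python
def classify_argument(arg, tree_prefixes=()):
    """Split a file argument into (kind, prefix, path).

    Hypothetical helper: `tree_prefixes` stands in for the prefixes
    that would have been defined with the -tree switch.
    """
    prefix, sep, path = arg.partition(":")
    if not sep or "/" in prefix:
        # No colon before the first slash: a plain PlasmaFS path.
        return ("plasmafs", None, arg)
    if prefix == "file":
        # The "file" prefix refers to the local filesystem.
        return ("local", prefix, path)
    if prefix in tree_prefixes:
        # A prefix defined with -tree.
        return ("tree", prefix, path)
    # Unknown prefixes are taken as host names reached via ssh.
    return ("ssh", prefix, path)
```

For example, classify_argument("host:/a/remote/path") yields the kind "ssh", while the same call with tree_prefixes={"host"} yields "tree".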

Selecting the cluster

  • -cluster name: Specifies the name of the PlasmaFS cluster. This can also be given by setting the environment variable PLASMAFS_CLUSTER.
  • -namenode <host>:<port>: Specifies the namenode to contact. This option can be given several times; the system then searches for the right namenode.

The cluster to use is determined as follows:

  1. If both the -cluster and -namenode options are given, this cluster is used.
  2. If there is a configuration file ~/.plasmafs, the name set via -cluster selects which cluster is accessed.
  3. If there is a configuration file ~/.plasmafs but no -cluster option is passed to the command, the first configuration in the file is taken.

If the environment variable PLASMAFS_CLUSTER is set and no -cluster option is given, the cluster name is taken from this variable instead. If PLASMAFS_CLUSTER is set, PLASMAFS_NAMENODES is also consulted. If the latter is set, the namenodes are taken from there (comma-separated, in host:port syntax).
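
The selection order can be summarized in a short Python sketch. This is only one reading of the rules above, not the utility's code: the function resolve_cluster is hypothetical, the configuration file is modeled as an ordered list of (name, namenodes) pairs rather than parsed from ~/.plasmafs, and letting -namenode options take precedence over PLASMAFS_NAMENODES is an assumption.

```python
def resolve_cluster(cluster_opt, namenode_opts, env, config):
    """Return (cluster_name, namenodes) following the documented order.

    cluster_opt   -- value of -cluster, or None
    namenode_opts -- list of "host:port" strings from -namenode
    env           -- mapping of environment variables
    config        -- ordered list of (name, namenodes) pairs, a stand-in
                     for the parsed ~/.plasmafs file (assumed shape)
    """
    name = cluster_opt
    namenodes = list(namenode_opts)
    if name is None and "PLASMAFS_CLUSTER" in env:
        # The environment only supplies the name if -cluster is absent.
        name = env["PLASMAFS_CLUSTER"]
        if not namenodes and env.get("PLASMAFS_NAMENODES"):
            namenodes = env["PLASMAFS_NAMENODES"].split(",")
    if name is not None and namenodes:
        # Rule 1: name and namenodes are both known.
        return name, namenodes
    if config:
        if name is None:
            # Rule 3: no name given, take the first configuration.
            return config[0][0], list(config[0][1])
        for cfg_name, cfg_nodes in config:
            if cfg_name == name:
                # Rule 2: the name selects the configuration.
                return cfg_name, list(cfg_nodes)
    raise LookupError("cannot determine the PlasmaFS cluster")
```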

The ~/.plasmafs configuration file

See Plasma_client_config.parse_config_file for a description.

Additional file trees

The -tree option defines additional prefixes, for example

plasma ls -tree homepage=http://host/root homepage:/

The URL in the -tree argument can be as follows:

  • file://<path>: Accesses the local filesystem at <path>
  • http://<host>/<path>: Accesses an HTTP filesystem
  • ssh://<host>: Accesses a remote filesystem via ssh
  • plasma://<name>@: Accesses the PlasmaFS filesystem <name> as configured in ~/.plasmafs (note the trailing "@")
  • plasma://<name>@<host>:<port>: Accesses the PlasmaFS filesystem <name> via the namenode at <host> and <port>
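
To make the shape of a -tree argument concrete, the following Python sketch splits <prefix>=<url> and classifies the URL by the schemes listed above. The function parse_tree_option and the dictionary keys it returns are hypothetical; the real utility may validate these arguments differently.

```python
def parse_tree_option(spec):
    """Parse a '<prefix>=<url>' string as passed to -tree (illustrative)."""
    prefix, sep, url = spec.partition("=")
    if not sep or not prefix:
        raise ValueError("expected <prefix>=<url>")
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("expected a URL with a scheme")
    if scheme == "file":
        # file://<path>: everything after '://' is the local path.
        return {"prefix": prefix, "kind": "file", "path": rest}
    if scheme == "http":
        host, _, path = rest.partition("/")
        return {"prefix": prefix, "kind": "http", "host": host, "path": "/" + path}
    if scheme == "ssh":
        return {"prefix": prefix, "kind": "ssh", "host": rest}
    if scheme == "plasma":
        # The '@' is mandatory, even when no namenode follows it.
        name, at, namenode = rest.partition("@")
        if not at:
            raise ValueError("plasma URLs need an '@' after the name")
        return {"prefix": prefix, "kind": "plasma", "name": name,
                "namenode": namenode or None}
    raise ValueError("unsupported scheme: " + scheme)
```

A namenode of None here stands for "look the cluster up in ~/.plasmafs".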

Globbing

Globbing is supported on all filesystem trees. The wildcards *, ?, brackets, and braces are supported.

The switches -no-glob and -glob can be used to turn globbing off and on, respectively. They affect only files following these switches on the command line.
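
The position-dependent effect of -glob and -no-glob can be sketched in Python as a single left-to-right pass over the arguments. The helper mark_glob_arguments is hypothetical, and the assumption that globbing starts out enabled is inferred only from the examples in this document, which use wildcards without passing -glob.

```python
def mark_glob_arguments(args, default=True):
    """Pair each file argument with the globbing state in effect for it.

    `default=True` (globbing initially on) is an assumption.
    """
    enabled = default
    marked = []
    for arg in args:
        if arg == "-glob":
            enabled = True       # affects only the files that follow
        elif arg == "-no-glob":
            enabled = False      # affects only the files that follow
        else:
            marked.append((arg, enabled))
    return marked
```

With the arguments /a/* -no-glob /b/* -glob /c/?, only /b/* would be taken literally.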

list subcommand

list lists files (like Unix ls -l).

Synopsis:

plasma (list|ls) <options> file ...

Options:

  • -1: Outputs one file per line
  • -l: Long lines
  • -a: also include entries starting with "." (but not "." and "..")
  • -d: list directory entries instead of contents
  • -i: output inode numbers
  • -r: reverse sorting order
  • -S: sort by file size
  • -t: sort by modification time

The default is -l if stdout is a tty, and -1 otherwise.

For non-PlasmaFS filesystems only the -1 format is supported.

delete subcommand

delete removes an existing file (like Unix rm), or an existing and empty directory (like Unix rmdir).

Synopsis:

plasma (delete|rm) <options> file ...

Options:

  • -r: recursively delete directories when they are non-empty

rename subcommand

rename renames files, symlinks or directories like Unix mv.

Synopsis:

  rename <source> <destination>
  rename <source> ... <destination_directory>
  rename -t <destination_directory> <source> ...

link subcommand

link creates another link for an existing file. If the option -s is given, a symbolic link is created, otherwise a hard link is created.

Synopsis:

  link [-s] <target> <new_link>                           (1)
  link [-s] <target> ... <new_link_directory>             (2)
  link [-s] -t <new_link_directory> <target> ...          (3)

With form (1) a new link is created for an existing target (which must be a regular file or a symlink). With form (2) the links for the targets are created in a directory (keeping the base names of the files). (3) is a syntactic variant of (2).

copy subcommand

copy copies files like Unix cp.

Synopsis:

  copy <source> <destination>
  copy <source> ... <destination_directory>
  copy -t <destination_directory> <source> ...

mkdir subcommand

mkdir creates a new directory (which must not exist already).

Synopsis:

plasma mkdir <options> file ...

Options:

  • -p: Create parent directories as necessary

create subcommand

create creates a new file (which must not exist already).

Synopsis:

plasma create <options> file ...

Option:

  • -rep n: Creates the file with n replicas. n=0 means the server default, which is also the default if there is no -rep option.

put subcommand

put creates a new file in PlasmaFS, and copies the contents of local_file to it. local_file can be a seekable or non-seekable file (pipe). If -stdin is given, the standard input is copied to the new PlasmaFS file.

Tree prefixes are not supported here, and will be rejected.

Synopsis:

plasma put <options> (local_file | -stdin) pfs_file

Options:

  • -rep n: Creates the file with n replicas. n=0 means the server default, which is also the default if there is no -rep option.
  • -chain: By default, the file is copied using star topology (i.e. a block is independently copied to all datanodes holding replicas). The -chain switch changes this to the chain topology where a block is first copied to one datanode, and from there to the other datanodes storing the replicas.
  • -stdin: Takes the input from standard input
  • -f: If pfs_file already exists, it is deleted before the put operation is executed.
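
The difference between the star and chain topologies described for -chain can be pictured with a small Python sketch that lists who sends a block to whom. It is purely illustrative: the function transfer_plan and the "client" label are made up, and the chain variant assumes that each datanode forwards the block to the next one in order, which is only one possible reading of the description.

```python
def transfer_plan(datanodes, chain=False):
    """Return (sender, receiver) pairs for copying one block.

    datanodes -- datanodes that will hold a replica, in order
    chain     -- False for star topology, True for chain topology
    """
    datanodes = list(datanodes)
    if not datanodes:
        return []
    if chain:
        # Chain: the client feeds the first node, each node feeds the next.
        senders = ["client"] + datanodes[:-1]
    else:
        # Star: the client sends the block to every node itself.
        senders = ["client"] * len(datanodes)
    return list(zip(senders, datanodes))
```

In the star case the client's uplink carries every replica; in the chain case it carries the block only once.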

get subcommand

get downloads a file from PlasmaFS to the local filesystem.

Tree prefixes are not supported here, and will be rejected.

Synopsis:

plasma get <options> pfs_file local_file

cat subcommand

cat downloads files from PlasmaFS, concatenates them, and outputs everything to standard output.

Synopsis:

plasma cat <options> file ...

blocks subcommand

blocks shows the block list for each file.

Synopsis:

plasma blocks <options> pfs_file ...

Example output:

/input/words_10M:
        0 -      1022: 127.0.1.1:2728[3072-4094]
     1023 -      1023: 127.0.1.1:2728[4096-4096]
     1024 -      1308: 127.0.1.1:2728[7168-7452]
     1309 -      1309: 127.0.1.1:2728[7489-7489]
     1310 -      1464: 127.0.1.1:2728[7526-7680]
     1465 -      1634: 127.0.1.1:2728[7718-7887]
  blocks:                1635
  actual replication:    1
  requested replication: 1

For instance, the line 1310-1464 means that the blocks 1310 to 1464 of the file are stored on one datanode (127.0.1.1:2728), and the positions 7526-7680 in this blockstore are used.

The actual replication is the minimum, taken over all blocks of the file, of the number of replicas on which a block is stored. Dead nodes are not counted.
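
For scripts that post-process this output, a range line can be parsed roughly as follows. This Python sketch is based only on the example output above; the regular expression, the function parse_block_line, and the returned keys are assumptions, and the output format for files with more than one replica is not covered.

```python
import re

# One range line, e.g. "     1310 -      1464: 127.0.1.1:2728[7526-7680]"
_RANGE = re.compile(r"^\s*(\d+)\s*-\s*(\d+):\s*(\S+)\[(\d+)-(\d+)\]\s*$")

def parse_block_line(line):
    """Parse one range line; return None for any other kind of line."""
    m = _RANGE.match(line)
    if m is None:
        return None
    first, last, node, store_first, store_last = m.groups()
    return {
        "file_blocks": (int(first), int(last)),                # blocks of the file
        "datanode": node,                                      # host:port
        "store_blocks": (int(store_first), int(store_last)),   # positions in the blockstore
    }

def range_length(bounds):
    """Number of blocks in an inclusive (first, last) range."""
    first, last = bounds
    return last - first + 1
```

In the example, both the file range 1310-1464 and the blockstore range 7526-7680 have the same length, as expected for a contiguous extent.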

fsstat subcommand

fsstat shows how many blocks are free in the file system.

Synopsis:

plasma fsstat <options>

Example output:

Total:                       10000 blocks
Used:                         1860 blocks
Transitional:                    0 blocks
Free:                         8140 blocks

Transitional blocks are actually in use, but their state will change soon. They are either unused blocks that have been allocated by a transaction that is not yet committed, or used blocks that have been freed by an uncommitted transaction.
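
The example output can be read into a dictionary with a few lines of Python, for instance to monitor free space from a script. The function parse_fsstat and the SAMPLE constant are illustrative; the sketch assumes the line format shown in the example. In that example the used, transitional, and free counts add up to the total, which the sketch does not enforce.

```python
import re

# Copy of the example output shown above.
SAMPLE = """\
Total:                       10000 blocks
Used:                         1860 blocks
Transitional:                    0 blocks
Free:                         8140 blocks
"""

def parse_fsstat(text):
    """Map lower-cased labels ('total', 'used', ...) to block counts."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"^\s*(\w+):\s*(\d+)\s+blocks\s*$", line)
        if m:
            stats[m.group(1).lower()] = int(m.group(2))
    return stats
```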

Implementation restrictions

This utility is not yet perfect:

  • It is not always possible to rename local files (in particular, it is not supported to rename a local file with an absolute path to a local file given by a relative path, or vice versa).
  • The option -rep only affects the default PlasmaFS filesystem, but not additional filesystems defined with -tree.
  • Moving files between file systems is not supported.
  • There might still be bugs in the handling of character sets.

Examples

List files in a PlasmaFS directory:

plasma ls /this/is/a/directory

Concatenate many files: (Note the single quotes.)

plasma cat '/my/dir/file*'

Copy a file from an HTTP server to PlasmaFS:

plasma cp -tree h=http://server/path/to/directory h:/file /plasmafs/dir/file

Grab some files via ssh:

plasma cp "host:/remote/dir/file*" /plasmafs/dir

This web site is published by Informatikbüro Gerd Stolpmann