PlasmaFS allows datanodes to be hot-added and hot-removed without
interrupting the service. This can be done with a few commands,
explained here. We assume that the sysop works from the
deployment directory clusterconfig.
Concepts
Each datanode volume is uniquely identified by a random string, called the identity of the volume. It is absolutely required that the identity remains unique at all times.
Right now there can only be one volume on a node (this might change in the future). In the data directory you can find two files:
$ ls -l /data/plasma/data/
total 268435472
-rw-r--r-- 1 gerd gerd 41 2011-10-05 22:51 config
-rw-r--r-- 1 gerd gerd 274877906944 2011-10-11 17:20 data
The data file contains the blocks. The config file contains
the identity and the blocksize.
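The config file is small enough to inspect directly. The exact contents shown here are illustrative; only the fact that the file stores the identity and the blocksize is documented:
$ cat /data/plasma/data/config
f4f605ff5c7c09f40f395a4696087512 1048576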
Caveat: It is possible to copy these files to another node in order to relocate the datanode volume. However, please be very careful! It must never happen that datanode servers are started on both nodes, as this would make both copies of the volume available to the system. If you move data, make the original version inaccessible, e.g. by renaming the directory.
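A minimal sketch of such a relocation, assuming the data directory layout shown above (the host names are illustrative). The original directory is renamed first, so the old node can never serve the volume again by accident:
$ ssh old.example.com mv /data/plasma/data /data/plasma/data.retired
$ ssh new.example.com rsync -a old.example.com:/data/plasma/data.retired/ /data/plasma/data/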
A volume can be in one of three states:

disabled: This is the state just after creation, and it is intended
for maintenance. The volume is not available to the system.

alive: The datanode server is up, and serves requests for this volume.

dead: The datanode server is down (unexpectedly). A dead volume will
automatically be set to alive when the datanode server is back.
Listing volumes
The managedn_inst.sh script can be used to list the datanodes, together
with their states and how full they are:
$ ./managedn_inst.sh list ssd2
f4f605ff5c7c09f40f395a4696087512 alive 3% 192.168.5.30:2728
a20b364f71009d51d4f1ed2687083e48 alive 3% 192.168.5.40:2728
4a7bcc383b30de2472490c121bc165e1 alive 0% 192.168.5.10:2728
OK
The "ssd2" argument is the name of the instance.
The columns are: the identity of the volume, its state, the fill grade in percent, and the address (IP and port) of the serving datanode.
With managedn_inst.sh one can also disable volumes:
$ ./managedn_inst.sh disable ssd2 4a7bcc383b30de2472490c121bc165e1
OK
A disabled volume is no longer considered by the namenode for file operations. The datanode server remains up, though, and continues to respond to requests.
Note that it may take some time until no more requests are emitted for the volume. All currently running transactions are allowed to finish; depending on the transactions, this can take a few seconds to a few minutes.
Disabling a volume is a safe way of removing the serving datanode from the system. Once the number of requests for this volume drops to zero, the datanode server can be turned off.
It is also recommended to disable dead volumes. This prevents requests from being served by accident, e.g. when the crashed node is rebooted for maintenance.
Volumes remain disabled after a namenode restart.
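Putting this together, a safe decommissioning sequence could look as follows. This is a sketch: the identity and address are taken from the listing above, and the stop argument to rc_dn.ssh is an assumption mirroring the start command used later in this document.
$ ./managedn_inst.sh disable ssd2 4a7bcc383b30de2472490c121bc165e1
OK
(wait until the request count for the volume drops to zero)
$ ssh 192.168.5.10 <dir>/etc/rc_dn.ssh stop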
Enabling volumes
With managedn_inst.sh one can also enable volumes:
$ ./managedn_inst.sh enable ssd2 4a7bcc383b30de2472490c121bc165e1 192.168.5.10:2728
OK
Here, one has to pass an additional argument saying where the volume can be found. The datanode server must be running on this machine. It is checked whether the server actually has the right volume.
After a volume has been enabled once, the system will search for it on the network whenever PlasmaFS starts up. (This search is done with multicast messages.)
Adding a new data node to the system
First, add the node to the file instances/<inst>/datanode.hosts.
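Assuming datanode.hosts simply lists one host name per line (an assumption; compare with the entries already in the file), this can be done with:
$ echo 192.168.5.50 >> instances/ssd2/datanode.hosts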
Install the PlasmaFS binaries:
$ ./deploy_inst.sh -only-host <host> <inst>
Replace <host> with the host name, and <inst> with the instance name.
Initialize and start the datanode:
$ ./initdn_inst.sh <inst> <size> <host>
(where <size> is the amount of data in bytes; a K, M, or G suffix can be used).
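As a concrete sketch, adding a 256G datanode on host 192.168.5.50 to the "ssd2" instance would look like this (host name and size are illustrative):
$ ./deploy_inst.sh -only-host 192.168.5.50 ssd2
$ ./initdn_inst.sh ssd2 256G 192.168.5.50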
That's it!
Moving data disks from one node to another
Imagine the machine breaks, but the disks remain intact. You want to put the disks into a different machine. How to proceed?
First: Disable the volume! (See above.) Once you have done this, you can put the disks into the new machine.
Second: Add the new node to instances/<inst>/datanode.hosts. Remove the broken node.
Third: Install the PlasmaFS binaries on the new node:
$ ./deploy_inst.sh -only-host <host> <inst>
Replace <host> with the host name, and <inst> with the instance name.
Fourth: Start the datanode server:
$ ssh <host> <dir>/etc/rc_dn.ssh start
(Here, <dir> is the installation prefix.)
Fifth: Enable the volume again (see above).
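Taken together, the whole move could look like this sketch (the host addresses, the identity, and the installation prefix /opt/plasma are illustrative assumptions):
$ ./managedn_inst.sh disable ssd2 4a7bcc383b30de2472490c121bc165e1
OK
(move the disks to 192.168.5.50; update instances/ssd2/datanode.hosts)
$ ./deploy_inst.sh -only-host 192.168.5.50 ssd2
$ ssh 192.168.5.50 /opt/plasma/etc/rc_dn.ssh start
$ ./managedn_inst.sh enable ssd2 4a7bcc383b30de2472490c121bc165e1 192.168.5.50:2728
OK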