Plasmafs_dn_discovery

Datanode discovery

This article explains how the namenode discovers and monitors datanodes.

Multicast discovery

Multicast discovery is the default. It works out of the box if there is only a layer-2 switch between the namenode and the datanodes. It needs tweaking if there is a router between the nodes, and it does not work at all if the network does not support multicast (e.g. the Amazon EC2 network does not).

It works as follows: The namenode sends multicast messages to 224.0.0.1:2728 and waits for responses from the datanodes in the network. Every IP address that responds is then checked more closely. The TTL value is set to 1 by default, so the messages cannot pass routers.
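The mechanism described above can be illustrated with a minimal Python sketch. It is not Plasma's implementation: the function names and the payload are placeholders chosen for illustration, and because the real message format is not described in this article, actual datanodes should not be expected to answer this probe. What it does show is how a sender addresses the group 224.0.0.1:2728 and how the TTL restricts the packet to the local network.

```python
import socket
import struct

DISCOVERY_GROUP = "224.0.0.1"  # default discovery address named in this article
DISCOVERY_PORT = 2728          # port named in this article


def make_probe_socket(ttl=1, timeout=1.0):
    """Create a UDP socket whose outgoing multicast packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # A TTL of 1 keeps the packet on the local network: routers drop it.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("B", ttl))
    sock.settimeout(timeout)
    return sock


def probe(payload=b"placeholder", ttl=1, timeout=1.0):
    """Send one datagram to the group and return the set of IPs that answer.

    The payload is hypothetical; it is not the Plasma discovery message.
    """
    responders = set()
    with make_probe_socket(ttl, timeout) as sock:
        sock.sendto(payload, (DISCOVERY_GROUP, DISCOVERY_PORT))
        try:
            while True:
                _data, (ip, _port) = sock.recvfrom(4096)
                responders.add(ip)
        except socket.timeout:
            pass  # nobody (else) answered within the timeout
    return responders
```

Calling probe() on a host without a usable multicast route typically raises an OSError from sendto, which corresponds to the "cannot be routed" situation the namenode reports in its log.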

This default works automatically for many users, but not for everybody. The following subsections will guide you through the configuration. If it turns out that multicasting does not work for you, the alternative is to enter the IP addresses of all datanodes directly into the configuration file. This is described at the end of this article.

There is another advantage of the multicast discovery: If more datanodes are added to the system, it is not necessary to restart the namenode. The new datanodes will automatically be found, and it is sufficient to enable them (with a command).

Inspect your network

If you are in a company network, you should ask the operator whether multicasting is possible and how.

If you are the operator, please note:

If all Plasma nodes are attached to the same unmanaged switch, multicasting will work. (An unmanaged switch does not allow any configuration. In particular, all SOHO switches fall into this category. Remember that the backplane bandwidth is often limited for small switches, and they might be inappropriate for Plasma.)
If the nodes are attached to a managed switch, it might be necessary to enable multicasting in the switch. A "managed switch" in this sense is a device that connects the hosts on layer 2 and allows configuration.
If there is a router between the nodes, multicasting must be enabled in the router, and a protocol such as IGMP must be enabled.
For more complicated topologies (e.g. tunnels or VPNs) I cannot give any advice.
The public Internet does not support multicast.
Plasma generates very little multicast traffic, usually only 1-2 small messages per second.
Plasma does not yet support joining a multicast group.

Inspect your systems

In particular, check whether eth0 (or your default network device) has multicast enabled:

$ /sbin/ifconfig eth0
eth0      Link encap:Ethernet  HWaddr bc:ae:c5:6c:b4:1e  
          inet addr:192.168.5.10  Bcast:192.168.5.255  Mask:255.255.255.0
          inet6 addr: fe80::beae:c5ff:fe6c:b41e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:645651689 errors:0 dropped:0 overruns:0 frame:0
          TX packets:980018099 errors:0 dropped:4360 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:248979503115 (231.8 GiB)  TX bytes:1133161951156 (1.0 TiB)
          Interrupt:251 Base address:0x6000

The keyword "MULTICAST" in the flags line indicates this. Some devices do not support multicast (e.g. lo and Wifi cards).
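The same check can be scripted. The following sketch assumes a Linux system, where the kernel exposes the interface flags as a hexadecimal number in /sys/class/net/<device>/flags and the multicast bit is IFF_MULTICAST (0x1000). The function names are made up for this example.

```python
from pathlib import Path

IFF_MULTICAST = 0x1000  # Linux interface flag bit for multicast support


def flags_have_multicast(flags_text):
    """Return True if a hex flags string such as '0x1043' has the multicast bit set."""
    return bool(int(flags_text.strip(), 16) & IFF_MULTICAST)


def interface_has_multicast(ifname="eth0"):
    """Read the flags of a network device from sysfs (Linux only) and test the bit."""
    flags_file = Path("/sys/class/net") / ifname / "flags"
    return flags_have_multicast(flags_file.read_text())
```

For example, flags_have_multicast("0x1043") is true (UP, BROADCAST, RUNNING and MULTICAST set), whereas a loopback value such as "0x49" (UP, LOOPBACK, RUNNING) yields false.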

Also, there must be a route for "224.0.0.0/4" pointing to eth0. A default route is sufficient, i.e. a line with destination "0.0.0.0" in the output of

$ /sbin/route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.5.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
0.0.0.0         192.168.5.1     0.0.0.0         UG    0      0        0 eth0

If there is no such line, you need to set a special route for multicast traffic:

$ /sbin/route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0

Inspect the Plasma configuration

The configuration of the namenode is relevant here (namenode.conf). Check the following settings (all in the datanodes section):

discovery: This parameter says to which IP address the multicast messages are sent. The default is "224.0.0.1", which is the "all hosts" address. All hosts on the local network are automatically members of this group, so messages sent to this address reach them much like a broadcast. In order to change it, use the syntax discovery { addr = "224.0.0.2" };
multicast_ttl: This parameter limits the number of hops. Set it to n+1, where n is the maximum number of routers between the namenode and any datanode.

Try it, and check the logs

After making the necessary changes and redeploying, check the log files of the namenode to see whether any datanodes are discovered. The messages look like

[Mon Jan 30 21:05:54 2012] [Nn_monitor] [info] Discovered datanode f758c8022530ce0ea8c23b35f28dedb1 at 127.0.0.1:2728 with size 6553600 (enabled)

If there is an error with multicasting, one possible error message is

[Mon Jan 30 18:34:38 2012] [Nn_monitor] [alert] Datanode discovery: Multicast request to 224.0.0.1 cannot be routed. This seems to be an error in the multicast configuration of this host

Not all problems can be detected, however. Sometimes the only visible effect is that no datanodes are found.
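Searching the log for these two kinds of messages can be automated. The sketch below is based only on the two sample messages in this section and assumes that each message occupies a single line in the log file; other message variants may exist and would not be matched. The function names are invented for this example.

```python
import re

# Pattern derived from the sample "Discovered datanode" message above.
DISCOVERED = re.compile(
    r"\[(?P<time>[^\]]+)\] \[Nn_monitor\] \[info\] Discovered datanode "
    r"(?P<identity>[0-9a-f]+) at (?P<addr>[^ ]+):(?P<port>\d+) "
    r"with size (?P<size>\d+) \((?P<state>[^)]+)\)"
)


def discovered_datanodes(log_text):
    """Return one dict per discovery message (all values are strings)."""
    return [m.groupdict() for m in DISCOVERED.finditer(log_text)]


def discovery_alerts(log_text):
    """Return the alert lines that the monitor wrote about datanode discovery."""
    marker = "[Nn_monitor] [alert] Datanode discovery"
    return [line for line in log_text.splitlines() if marker in line]
```

If discovered_datanodes returns an empty list and discovery_alerts does too, this is the silent case mentioned above, and the network and system checks from the previous sections are the next thing to revisit.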

Unicast discovery

The alternative solution to the discovery problem is to enter the IP addresses to check manually. The list should at minimum include all hosts that are currently used as datanodes, but it is also allowed to add the addresses of hosts that might become datanodes in the future.

There are two methods: First, put the IP addresses directly into namenode.conf. The datanodes section must be extended by one discovery subsection per IP address (or hostname), as in

  datanodes {
    discovery { addr="10.0.0.1" };
    discovery { addr="10.0.0.2" };
    discovery { addr="10.0.0.3" };
    ...
  }

The second method is to enter only the name of a file containing the IP addresses or hostnames:

  datanodes {
    discovery_list = "/path/to/file";
  }

The file should contain one IP or hostname per line. Of course, one possible candidate for this file is datanode.hosts.
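The two methods use the same information, so a host list can be converted into discovery subsections mechanically. The following sketch does this; the function names are invented for this example, and skipping blank lines is a choice made here for robustness, not a documented property of the file format.

```python
def read_host_list(text):
    """Return the non-empty lines of a host list, stripped of surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def discovery_entries(hosts, indent="    "):
    """Render one discovery subsection per host, in the syntax shown above."""
    return "\n".join(f'{indent}discovery {{ addr="{host}" }};' for host in hosts)
```

For a file containing 10.0.0.1 and 10.0.0.2 on separate lines, discovery_entries(read_host_list(text)) produces two lines of the form discovery { addr="10.0.0.1" }; that can be pasted into the datanodes section.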

This web site is published by Informatikbüro Gerd Stolpmann
