Plasma_release_0_3
Release Notes For Plasma
This is version: 0.3 "Einmaleins". This is an alpha release to make
Plasma known to interested developers.
Changes
Changed in 0.3
New features in PlasmaFS:
- Name nodes can now also be referenced by IP addresses
- Added Plasma_netfs module for easy access to PlasmaFS files
- Improved plasma command-line utility (wildcards, tree mounting)
Implementation improvements in PlasmaFS:
- The representation of blocklists has been optimized. Wherever possible,
  a compressed format is now used, including in the database table. This
  is expected to fix performance problems for files with long blocklists.
- Improved block allocation algorithm: it now tries to avoid
  discontiguities as much as possible
- The DB accesses are now completely non-blocking
- Access tickets are allocated in advance by the namenode
- Enhanced versions of Plasma_client.copy_in and copy_out allow
  buffered reading and writing to some extent
- Better scheme for emitting debug log messages
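The notes above do not say which compressed blocklist format is used. As a rough illustration only, the following Python sketch assumes a run-length ("extent") encoding, in which consecutive block numbers are stored as a single range. The names Extent, compress_blocklist and expand_blocklist are hypothetical and not part of Plasma.

```python
from typing import List, Tuple

# Hypothetical extent: (first_file_index, first_block, length), meaning
# `length` consecutive file positions map to `length` consecutive blocks.
Extent = Tuple[int, int, int]


def compress_blocklist(blocks: List[int]) -> List[Extent]:
    """Collapse a per-position list of block numbers into extents."""
    extents: List[Extent] = []
    for index, block in enumerate(blocks):
        if extents:
            start_index, start_block, length = extents[-1]
            # Extend the last extent if this block directly follows it.
            if block == start_block + length:
                extents[-1] = (start_index, start_block, length + 1)
                continue
        extents.append((index, block, 1))
    return extents


def expand_blocklist(extents: List[Extent]) -> List[int]:
    """Inverse of compress_blocklist."""
    blocks: List[int] = []
    for _start_index, start_block, length in extents:
        blocks.extend(range(start_block, start_block + length))
    return blocks
```

Under this assumption, the fewer discontiguities the allocator produces, the fewer extents a file needs, which is one plausible reason why the two improvements go together.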
New features in the map/reduce framework:
- It is now possible to execute several jobs at the same time. However,
  the jobs must use the same task server, and hence the same m/r algorithm.
  (The latter is no problem for streaming jobs.)
Implementation improvements in the map/reduce framework:
- Optimized record reader
- Optimized sorting algorithm
- If RAM is tight, the execution of tasks can be delayed
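How the framework decides to delay a task is not described here. A minimal sketch of the general idea, assuming each task declares its RAM need up front and a fixed budget is enforced, could look like this in Python. The class RamGate and its methods are hypothetical names chosen for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple


class RamGate:
    """Admit tasks only while their declared RAM fits into a fixed budget."""

    def __init__(self, budget_bytes: int) -> None:
        self.budget = budget_bytes
        self.in_use = 0
        self.waiting: Deque[Tuple[str, int]] = deque()

    def submit(self, name: str, need_bytes: int) -> bool:
        """Start the task now if it fits, otherwise queue it.

        Returns True if the task was started immediately."""
        if need_bytes > self.budget:
            raise ValueError("task can never fit into the budget")
        # Keep FIFO order: do not overtake tasks that are already waiting.
        if not self.waiting and self.in_use + need_bytes <= self.budget:
            self.in_use += need_bytes
            return True
        self.waiting.append((name, need_bytes))
        return False

    def finish(self, need_bytes: int) -> List[str]:
        """Release RAM and start delayed tasks in FIFO order.

        Returns the names of the tasks that were started."""
        self.in_use -= need_bytes
        started: List[str] = []
        while self.waiting and self.in_use + self.waiting[0][1] <= self.budget:
            name, need = self.waiting.popleft()
            self.in_use += need
            started.append(name)
        return started
```

The strict FIFO order is a design choice of this sketch: it prevents small tasks from starving a large one, at the cost of leaving some RAM idle.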
Improvements for both PlasmaFS and map/reduce:
- RAM management: The RAM for large buffers and shared memory buffers is
  actively managed, and this now works in a fairly automatic way
Compatibility:
- Existing PlasmaFS filesystems are incompatible (db schema changes)
- There are incompatible protocol changes
What is working and not working in PlasmaFS
Generally, PlasmaFS works as described in the documentation. Crashes
have not been observed for quite some time now, but occasionally one
might see critical exceptions in the log file.
PlasmaFS has so far only been tested on 64 bit, and only on Linux
as operating system. There are known issues for 32 bit machines;
in particular, the blocksize must not be larger than 4M.
Data safety: Cannot be guaranteed. Storing valuable data in PlasmaFS
is not recommended.
Known problems:
- It is still unclear whether the timeout settings are acceptable.
- There might be name clashes for generated file names. Right now it is
   assumed that the random number generator returns unique names, but this
   is certainly not guaranteed.
- The generated inode numbers are not necessarily unique after namenode 
   restarts.
- The cp subcommand of the plasma utility is slower than necessary.
   Avoid it for now! Alternatives: Use the get, put, and cat subcommands instead.
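The name-clash problem listed above can be put into numbers with the birthday bound. The Python sketch below estimates the probability that at least two randomly generated names coincide. The number of random bits per name is an assumption of this example; the notes do not state how many bits Plasma actually uses.

```python
import math


def clash_probability(num_names: int, random_bits: int) -> float:
    """Approximate probability that at least two of `num_names` names,
    each drawn uniformly from 2**random_bits values, are identical.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2N))."""
    space = 2.0 ** random_bits
    exponent = -num_names * (num_names - 1) / (2.0 * space)
    # -expm1(x) equals 1 - exp(x) and stays accurate for tiny probabilities.
    return -math.expm1(exponent)
```

For example, clash_probability(100_000, 32) is roughly 0.69, whereas with 64 random bits the same number of names gives a probability below one in a billion. Uniqueness therefore depends on the size of the name space and cannot simply be assumed.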
Not implemented features:
- The namenodes cannot yet detect crashed datanodes. Datanodes are always
   reported as alive.
- The ticket system is not fully implemented (support for "read").
- There is no authorization system (file access rights are ignored)
- There is no authentication system to secure filesystem accesses (such
   as Kerberos)
- There are too many hard-coded constants.
- The file name read/lookup functions should never return ECONFLICT errors. (This has been improved in 0.2, though.)
- Translation of access rights to NFS
- Support for checksums
- Support for "host groups", so that it is easier to control which machines
   may store which blocks. The semantics have yet to be specified.
- Define how blocks are handled that are allocated but never written.
- Recognition of the death of the coordinator, and restart of the
   election algorithm.
- Multicast discovery of datanodes
- Lock manager (avoid that clients have to busy wait on locks)
- Restoration of missing replicas
- Rebalancing of the cluster
- Automated copying of the namenode database to freshly added namenode slaves
What is working and not working in Plasma MapReduce
Not implemented features:
- Task servers should be able to provide several kinds of jobs
- Think about dynamically extensible task servers
- Run jobs only defining map but no reduce.
- Support for combining (an additional fold function run after each
   shuffle task to reduce the amount of data)
- Nice web interface
- Support user counters as in Hadoop
- Restart/relocation of failed tasks
- Recompute intermediate files that are no longer accessible due to node
   failures
- Speculative execution of tasks
- Support job management (remember which jobs have been run etc.)
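To illustrate the combining feature mentioned in the list above, here is a minimal Python sketch of a fold function applied to key-sorted records, so that each key is emitted once instead of many times. The record shape (key, integer value) and the function name combine are assumptions of this example and do not reflect a Plasma API.

```python
from itertools import groupby
from operator import itemgetter
from typing import Callable, Iterable, List, Tuple

# Hypothetical record shape for this example.
Record = Tuple[str, int]


def combine(sorted_records: Iterable[Record],
            fold: Callable[[int, int], int]) -> List[Record]:
    """Fold all values of equal, adjacent keys into one record per key.

    The input must already be sorted by key, as it would be after a
    sort or shuffle step."""
    combined: List[Record] = []
    for key, group in groupby(sorted_records, key=itemgetter(0)):
        values = (value for _key, value in group)
        total = next(values)  # every group has at least one record
        for value in values:
            total = fold(total, value)
        combined.append((key, total))
    return combined
```

With a word-count style fold such as addition, five input records for three distinct keys shrink to three output records, which is the data reduction the feature is meant to achieve.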
What we will never implement:
- Jobs only consisting of reduce but no map cannot be supported
   due to the task scheme. (Reason: Input files for sort tasks must
   not exceed sort_limit.)
Changed in 0.3.1
Release 0.3.1 was tested in a configuration with four nodes and
lots of RAM. The difficulty encountered was that the periodic
fsync call took so long that client requests timed out.
The following measures improved that:
- Increasing general timeouts in Plasma client
- Changed flow control in datanode servers: Read requests are never
   throttled - it is assumed the OS throttles automatically. Write
   requests are still throttled if the fsync syscall takes long.
- Datanode servers: Local write requests are now included in the
   throttling implementation.
- Added a statistics report to the datanode servers
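The flow control described in the list above (reads never throttled, writes throttled when fsync is slow) can be sketched as follows. This is not the datanode implementation; the thresholds, the smoothing scheme and the class name WriteThrottle are all assumptions made for illustration.

```python
class WriteThrottle:
    """Delay write requests in proportion to how slow recent fsync calls were.

    Read requests are never delayed."""

    def __init__(self, fsync_target_s: float = 1.0, max_delay_s: float = 0.5,
                 smoothing: float = 0.5) -> None:
        self.fsync_target_s = fsync_target_s  # hypothetical acceptable fsync time
        self.max_delay_s = max_delay_s        # hypothetical cap per request
        self.smoothing = smoothing            # weight of the newest sample
        self.avg_fsync_s = 0.0

    def record_fsync(self, duration_s: float) -> None:
        """Update the exponentially smoothed fsync duration."""
        self.avg_fsync_s = (self.smoothing * duration_s
                            + (1.0 - self.smoothing) * self.avg_fsync_s)

    def delay_for(self, kind: str) -> float:
        """Seconds to wait before serving a 'read' or 'write' request."""
        if kind == "read" or self.avg_fsync_s <= self.fsync_target_s:
            return 0.0
        overshoot = self.avg_fsync_s / self.fsync_target_s - 1.0
        return min(self.max_delay_s, overshoot * self.max_delay_s)
```

In this sketch local and remote writes would both pass through delay_for("write"), mirroring the note that local write requests are now included in the throttling.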
Other bugfixes:
- Wrong parsing of the alive_min config parameter
- Tickets: Avoiding a race between a reset request and the
   normal housekeeping loop
Also:
- Increasing the backlog for all servers in the standard config