class type mapred_config = object
.. end
General
method nn_clustername : string
The cluster name
method nn_nodes : string list
The name nodes in "host:port" syntax
method mr_task_nodes : string list
Task nodes (only hostname)
method mr_task_port : int
The port number
method mr_task_tmpdir : string
A directory for executables, logs, etc.
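As an illustration of the "host:port" syntax used by nn_nodes, the following minimal sketch splits such a string into its two parts. The function name split_host_port and the example host name are hypothetical and not part of this interface; the real library may parse these strings differently.

```ocaml
(* Hypothetical helper, not part of mapred_config: split a "host:port"
   string at the last colon. Returns None if the colon is missing, the
   host part is empty, or the port is not a number in 1..65535. *)
let split_host_port (s : string) : (string * int) option =
  match String.rindex_opt s ':' with
  | None -> None
  | Some i ->
      let host = String.sub s 0 i in
      let port = String.sub s (i + 1) (String.length s - i - 1) in
      (match int_of_string_opt port with
       | Some p when host <> "" && p > 0 && p < 65536 -> Some (host, p)
       | _ -> None)
```

Entries of mr_task_nodes would not go through such a function, since they carry only the hostname and share the single mr_task_port.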
Resource parameters
There are two ways to limit resource consumption:
- by setting parameters to absolute numbers
- by setting parameters relative to an automatically determined
maximum
The first method always takes precedence. The second method is nicer
because it also works well when the cluster is not homogeneous
and the systems differ in the amount of RAM and the number of cores.
However, determining the available resources is very OS-dependent, and
there are routines for only a handful of operating systems. Linux, BSD,
and Solaris should work here.
Note that the maximum for shared memory is assumed to be 1/8 of
physical RAM (independent of the real OS settings, which are really
hard to find out).
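The precedence rule can be sketched as follows. This is only an illustration under stated assumptions: the helper names effective_limit and assumed_shm_available are hypothetical, and the library's actual computation (including how it rounds) is not documented here.

```ocaml
(* Hypothetical helper: resolve one limit from its absolute setting
   (e.g. mr_shm_low) and its factor (e.g. mr_shm_low_factor).
   An absolute value always wins; the factor is used only for None. *)
let effective_limit ~(absolute : int64 option) ~(factor : float)
    ~(available : int64) : int64 =
  match absolute with
  | Some n -> n
  | None -> Int64.of_float (factor *. Int64.to_float available)

(* The documented assumption: shared memory is capped at 1/8 of
   physical RAM, regardless of the real OS settings. *)
let assumed_shm_available ~(physical_ram : int64) : int64 =
  Int64.div physical_ram 8L
```

For the shared memory parameters, available would be the result of assumed_shm_available; for the buffer memory parameters, it would be the physical RAM itself.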
method mr_task_load_limit : float
Load limit per task server (in number of tasks). Should be set to
a small multiple of the number of cores of the biggest machine.
This is a required parameter.
method mr_shm_low : int64 option
Low watermark for shared memory. If shm consumption drops below this
value, shm is no longer considered a scarce resource. Default: None
method mr_shm_low_factor : float
Alternative way of setting the low watermark, as a fraction of
available shared memory. This should be a number between 0 and 1.0.
The factor is only considered if mr_shm_low = None.
Default: 0.25
method mr_shm_high : int64 option
High watermark for shared memory. If shm consumption is above this
value, shm is considered a scarce resource. Default: None
method mr_shm_high_factor : float
Alternative way of setting the high watermark, as a fraction of
available shared memory. This should be a number between 0 and 1.0.
The factor is only considered if mr_shm_high = None.
Default: 0.5
method mr_shm_max : int64 option
Maximum for shared memory. If this amount of shm consumption is
reached, shm is considered unavailable. Default: None
method mr_shm_max_factor : float
Alternative way of setting the maximum, as a fraction of
available shared memory. This should be a number between 0 and 1.0.
The factor is only considered if mr_shm_max = None.
Default: 0.75
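Taken together, the low watermark, high watermark, and maximum describe a hysteresis: the resource becomes scarce above the high watermark and stops being scarce only once consumption drops below the low watermark. The sketch below is one possible reading of that behaviour. The type availability and the function classify are hypothetical, and what happens between the two watermarks (keeping the previous state) is an assumption, not something the descriptions above state.

```ocaml
(* Hypothetical model of the three documented conditions. *)
type availability = Plentiful | Scarce | Unavailable

(* Classify the current consumption [used], given the previous state.
   Assumption: between the low and high watermark the previous
   classification is kept, with Unavailable relaxing to Scarce. *)
let classify ~(low : int64) ~(high : int64) ~(maximum : int64)
    ~(previous : availability) (used : int64) : availability =
  if Int64.compare used maximum >= 0 then Unavailable
  else if Int64.compare used high > 0 then Scarce
  else if Int64.compare used low < 0 then Plentiful
  else
    match previous with
    | Plentiful -> Plentiful
    | Scarce | Unavailable -> Scarce
```

The same model would apply to the buffer memory parameters, with the thresholds taken from the mr_buf_* methods.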
method mr_buf_low : int64 option
Low watermark for buffer memory. If bufmem consumption drops below this
value, bufmem is no longer considered a scarce resource. Default: None
method mr_buf_low_factor : float
Alternative way of setting the low watermark, as a fraction of
available physical RAM. This should be a number between 0 and 1.0.
The factor is only considered if mr_buf_low = None.
Default: 0.25
method mr_buf_high : int64 option
High watermark for buffer memory. If bufmem consumption is above this
value, bufmem is considered a scarce resource. Default: None
method mr_buf_high_factor : float
Alternative way of setting the high watermark, as a fraction of
available physical RAM. This should be a number between 0 and 1.0.
The factor is only considered if mr_buf_high = None.
Default: 0.5
method mr_buf_max : int64 option
Maximum for buffer memory. If this amount of bufmem consumption is
reached, bufmem is considered unavailable. Default: None
method mr_buf_max_factor : float
Alternative way of setting the maximum, as a fraction of
available physical RAM. This should be a number between 0 and 1.0.
The factor is only considered if mr_buf_max = None.
Default: 0.75
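To make the default factors concrete, here is a worked example for a hypothetical node with 16 GiB of physical RAM, with all absolute parameters left at None. It assumes that "available shared memory" equals the 1/8-of-RAM figure mentioned under Resource parameters; the helper scale and all value names are illustrative only.

```ocaml
(* One GiB in bytes. *)
let gib = Int64.shift_left 1L 30

(* Hypothetical helper: apply a factor to a total amount of memory. *)
let scale (factor : float) (total : int64) : int64 =
  Int64.of_float (factor *. Int64.to_float total)

let ram = Int64.mul 16L gib          (* 16 GiB of physical RAM *)
let shm_total = Int64.div ram 8L     (* assumed shm maximum: 2 GiB *)

(* Buffer memory thresholds are fractions of physical RAM. *)
let buf_low = scale 0.25 ram         (* 4 GiB *)
let buf_high = scale 0.5 ram         (* 8 GiB *)
let buf_max = scale 0.75 ram         (* 12 GiB *)

(* Shared memory thresholds are fractions of the assumed shm total. *)
let shm_low = scale 0.25 shm_total   (* 512 MiB *)
let shm_high = scale 0.5 shm_total   (* 1 GiB *)
let shm_max = scale 0.75 shm_total   (* 1.5 GiB *)
```

On a non-homogeneous cluster each node would arrive at its own thresholds this way, which is the advantage of the factor-based method.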
Buffer parameters
method mr_buffer_size : int
The normal size of I/O buffers. E.g. 64M
method mr_buffer_size_tight : int
The size of I/O buffers when RAM is tight. E.g. 16M
method mr_sort_size : int
The size of the buffers for sorting. E.g. 128M
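The three buffer parameters are plain int values. The sketch below fills them with the example sizes given above, assuming they are expressed in bytes and that "M" means 2^20 bytes (neither is stated explicitly). The object example_buffers and the function io_buffer_size are hypothetical; in particular, how the library decides that RAM is tight is not described here.

```ocaml
(* Assumption: sizes are byte counts, and "M" means 2^20 bytes. *)
let mib = 1024 * 1024

(* A partial, illustrative object providing only the buffer methods. *)
let example_buffers =
  object
    method mr_buffer_size = 64 * mib
    method mr_buffer_size_tight = 16 * mib
    method mr_sort_size = 128 * mib
  end

(* Hypothetical selection rule: use the smaller I/O buffer size
   whenever the caller reports that RAM is tight. *)
let io_buffer_size ~(ram_is_tight : bool) cfg : int =
  if ram_is_tight then cfg#mr_buffer_size_tight else cfg#mr_buffer_size
```

Because OCaml objects are structurally typed, io_buffer_size accepts any object that has these two methods, including a full mapred_config.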