module Mapred_streaming: sig .. end
Support for streaming
The following additional job configs are interpreted:
map_exec: The command to execute for mapping. This command is
      run on the task node.
reduce_exec: The command to execute for reducing. This command is
      run on the task node.
extract_mode: How to split a line into keys and values. This
      mode is only applied to lines written by the map command.
      Possible values are:
      key: The whole line is taken as key. This is the default.
      key_tab_value: The field before the first TAB is taken as key,
           and the rest of the line as value.
      key_tab_partition_tab_value: The field before the first TAB
           is taken as key. The field between the first and the second TAB
           is taken as partition number (decimal number). The rest of the
           line is taken as value.
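The three extract_mode values can be illustrated with a small OCaml sketch. This is not the Plasma implementation: the type extract_mode and the functions split_first_tab and extract are hypothetical names. The handling of a line without a TAB (empty value) and of a non-numeric partition field (None) are assumptions, since neither case is specified above.

```ocaml
(* Hypothetical model of the three extract modes; not part of Plasma. *)
type extract_mode =
  | Key
  | Key_tab_value
  | Key_tab_partition_tab_value

(* Split [s] at the first TAB. Assumption: if there is no TAB, the
   second component is the empty string. *)
let split_first_tab s =
  match String.index_opt s '\t' with
  | None -> (s, "")
  | Some i ->
      (String.sub s 0 i, String.sub s (i + 1) (String.length s - i - 1))

(* Return (key, partition, value) for one line written by the map command. *)
let extract mode line =
  match mode with
  | Key -> (line, None, "")
  | Key_tab_value ->
      let key, value = split_first_tab line in
      (key, None, value)
  | Key_tab_partition_tab_value ->
      let key, rest = split_first_tab line in
      let part, value = split_first_tab rest in
      (* Assumption: an unparsable partition field yields None. *)
      (key, int_of_string_opt part, value)
```

For instance, under key_tab_value only the first TAB separates key and value, so any further TABs remain part of the value.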
    The job config task_files is useful for installing the executable
    for the map and reduce commands on the task nodes. E.g.:
       task_files = "my_command";
       map_exec = "./my_command -map arg1 arg2 ...";
       reduce_exec = "./my_command -reduce arg1 arg2 ...";
    
    The working directory when starting the command is exactly the
    directory where the files are installed by the task_files
    directive.
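    As a rough illustration, a command like my_command could be written in OCaml along the following lines. This is only a sketch under assumptions: it supposes that the map command receives its input records on stdin and writes its result lines to stdout, which is the usual streaming convention but is not stated here. The functions map_line and iter_lines are hypothetical, and the word-count logic is just an example workload. The output lines are meant for extract_mode = key_tab_value.

```ocaml
(* Emit one "word<TAB>1" line per non-empty space-separated word. *)
let map_line line =
  String.split_on_char ' ' line
  |> List.filter (fun w -> w <> "")
  |> List.map (fun w -> w ^ "\t1")

(* Apply [f] to every line of the channel until end of file. *)
let rec iter_lines f ic =
  match input_line ic with
  | line -> f line; iter_lines f ic
  | exception End_of_file -> ()

(* Dispatch on the first argument, as in
   map_exec = "./my_command -map ...". Assumption: records come in
   on stdin and go out on stdout. *)
let () =
  match Array.to_list Sys.argv with
  | _ :: "-map" :: _ ->
      iter_lines (fun l -> List.iter print_endline (map_line l)) stdin
  | _ -> ()
```

    A -reduce branch would be added to the same dispatch in an analogous way.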
The following environment variables are also set:
PLASMAMR_LOCAL_DIR: The local directory
PLASMAMR_LOCAL_LOG_DIR: The log directory. Files whose names begin with
      PLASMAMR_TASK_PREFIX are immediately moved to the PlasmaFS log
      directory when the task is finished. Files with other names are also
      moved, but only when the job finishes, because it cannot be tracked
      which task created them.
PLASMAMR_REQ_ID: The request ID of the task
PLASMAMR_PARTITION: The partition (only reduce)
PLASMAMR_NAME: The job name
PLASMAMR_JOB_ID: The job ID
PLASMAMR_INPUT_DIR: The input directory in PlasmaFS
PLASMAMR_OUTPUT_DIR: The output directory in PlasmaFS
PLASMAMR_WORK_DIR: The work directory in PlasmaFS
PLASMAMR_LOG_DIR: The log directory in PlasmaFS
PLASMAMR_BIGBLOCK_SIZE: The size of bigblocks
PLASMAMR_PARTITIONS: The number of partitions
PLASMAMR_CONF: The task server configuration file (use this file
      to restore Mapred_config.mapred_config fully if needed). The
      job-specific settings from Mapred_def.mapred_job_config cannot
      be retrieved from here, though.
PLASMAFS_CLUSTER: The name of the PlasmaFS cluster
PLASMAFS_NAMENODES: The list of namenodes

val job : unit -> Mapred_def.mapred_job
      The Plasma distribution already includes a program, mr_streaming,
      that runs this job via Mapred_main.
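The environment variables listed above can be read from inside a map or reduce command with the standard library alone. The sketch below is illustrative: task_log_path and partition are hypothetical helpers, and it assumes that PLASMAMR_TASK_PREFIX is available as an environment variable in the same way as the listed ones. The optional getenv argument exists only so that the lookup can be replaced in tests.

```ocaml
(* Build the path of a per-task log file in the local log directory.
   The PLASMAMR_TASK_PREFIX name prefix is what marks the file as
   belonging to this task. Returns None if a variable is missing. *)
let task_log_path ?(getenv = Sys.getenv_opt) suffix =
  match getenv "PLASMAMR_LOCAL_LOG_DIR", getenv "PLASMAMR_TASK_PREFIX" with
  | Some dir, Some prefix -> Some (Filename.concat dir (prefix ^ suffix))
  | _ -> None

(* The partition number, which is only set for reduce tasks.
   Returns None for map tasks or if the value is not a number. *)
let partition ?(getenv = Sys.getenv_opt) () =
  Option.bind (getenv "PLASMAMR_PARTITION") int_of_string_opt
```

A reduce command could, for example, call task_log_path "debug.log" and write diagnostics to the resulting file, which would then be moved to the PlasmaFS log directory when the task finishes.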