Plasma GitLab Archive

Module Mapred_job_exec


module Mapred_job_exec: sig .. end
Execute a scheduled job by submitting tasks to servers

type runtime_job_config 
val create_runtime_job_config : ?map_weight:float * float ->
?emap_weight:float * float ->
?sort_weight:float * float ->
?shuffle_weight:float * float ->
?reduce_weight:float * float ->
?dump_plan_when_complete:bool ->
?dump_plan_as_svg:string ->
?shm_low:int64 ->
?shm_high:int64 ->
?shm_max:int64 ->
?buf_low:int64 ->
?buf_high:int64 ->
?buf_max:int64 ->
?shm_low_factor:float ->
?shm_high_factor:float ->
?shm_max_factor:float ->
?buf_low_factor:float ->
?buf_high_factor:float ->
?buf_max_factor:float ->
?simulate:bool ->
?pre_sort_algo:string ->
?keep_temp_files:bool ->
?report:bool ->
?report_to:Netchannels.out_obj_channel ->
Mapred_job_config.m_job_config -> runtime_job_config
Creates the runtime configuration of a job. In particular, the weight parameters control how many tasks are started per task server. Each task counts with a certain weight, given as (io_weight, cpu_weight). The sum of the I/O weights must not exceed io_load, and the sum of the CPU weights must not exceed cpu_load.
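The weight rule can be illustrated with a small standalone model. This is only a sketch of the constraint as it is described here, not Plasma's scheduler code: the names total_weight and fits are hypothetical, and io_load and cpu_load are passed in as plain floats because this page does not say where they are configured.

```ocaml
(* Hypothetical model of the admission rule; not part of Mapred_job_exec. *)
type weight = float * float   (* (io_weight, cpu_weight) *)

(* Sum the I/O and the CPU components of a list of task weights. *)
let total_weight (ws : weight list) : weight =
  List.fold_left
    (fun (io, cpu) (w_io, w_cpu) -> (io +. w_io, cpu +. w_cpu))
    (0.0, 0.0) ws

(* A further task fits on a task server if neither sum exceeds its limit
   once the candidate is added to the tasks already running there. *)
let fits ~io_load ~cpu_load (running : weight list) (candidate : weight) =
  let (io, cpu) = total_weight (candidate :: running) in
  io <= io_load && cpu <= cpu_load
```

Under this model, a server with io_load 1.0 and cpu_load 2.0 that already runs one task of weight (0.5, 1.0) can accept a second task of the same weight, but not a task of weight (0.75, 0.5), because the I/O sum would then be 1.25.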
val create_runtime_job_config_from_mapred_config : ?dump_plan_when_complete:bool ->
?dump_plan_as_svg:string ->
?simulate:bool ->
?pre_sort_algo:string ->
?keep_temp_files:bool ->
?report:bool ->
?report_to:Netchannels.out_obj_channel ->
Mapred_job_config.m_job_config ->
Mapred_config.mapred_config -> runtime_job_config
Like create_runtime_job_config, but takes many of the values from a mapred_config object instead of from optional arguments.
type machine_params 
val investigate_machines : Mapred_config.mapred_config ->
string -> string -> machine_params
investigate_machines config auth_ticket job_id:

Queries the machine parameters directly from the machines that run the tasks.

val planning_capacity : machine_params -> float
The total capacity of all machines (for planning)
type running_job 
val start : Unixqueue.event_system ->
Mapred_sched.plan ->
Mapred_def.mapred_job ->
runtime_job_config ->
machine_params ->
Mapred_taskfiles.taskfile_manager ->
Mapred_def.mapred_env ->
int -> (running_job -> unit) -> running_job
Starts the job. The job then runs asynchronously, driven by the event system.
val kill : running_job -> unit
Kills the job (also asynchronously)
val cleanup : running_job -> unit
Deletes all temporary files (synchronously)
type status = [ `Errors of string list | `Killed | `Running | `Successful ] 
val status : running_job -> status
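Since status is a polymorphic variant, callers typically handle it with a pattern match. The following sketch repeats the type definition so that it compiles on its own. The helpers string_of_status and is_finished are hypothetical and not part of this module, and treating every state other than `Running as final is an assumption.

```ocaml
(* Copy of the status type from this module, so the sketch is standalone. *)
type status = [ `Errors of string list | `Killed | `Running | `Successful ]

(* Hypothetical helper: render a status for log output. *)
let string_of_status (st : status) : string =
  match st with
  | `Running -> "running"
  | `Successful -> "successful"
  | `Killed -> "killed"
  | `Errors msgs -> "errors: " ^ String.concat "; " msgs

(* Hypothetical helper: assumes that only `Running is a non-final state. *)
let is_finished (st : status) : bool =
  match st with
  | `Running -> false
  | `Successful | `Killed | `Errors _ -> true
```

With a running_job value j, such helpers would be applied to the result of status j, for example string_of_status (status j).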
val event_system : running_job -> Unixqueue.event_system
val stats : running_job -> Mapred_stats.stats
val get_simulate : runtime_job_config -> bool
val get_report : runtime_job_config -> bool
val get_report_to : runtime_job_config -> Netchannels.out_obj_channel
val get_dump_plan_when_complete : runtime_job_config -> bool
val get_keep_temp_files : runtime_job_config -> bool
val get_job_config : runtime_job_config -> Mapred_job_config.m_job_config
These functions return the values of the corresponding config settings
val get_rjc : running_job -> runtime_job_config
val get_job : running_job -> Mapred_def.mapred_job
val get_tm : running_job -> Mapred_taskfiles.taskfile_manager
val get_plan : running_job -> Mapred_sched.plan
val get_env : running_job -> Mapred_def.mapred_env
val message_rjc : runtime_job_config -> string -> unit
val message : running_job -> string -> unit
Logs a message
This web site is published by Informatikbüro Gerd Stolpmann