module Mapred_sched: sig .. end
Scheduler
type plan_config
val configure_plan : ?keep_temp_files:bool ->
planning_capacity:float ->
internal_suffix:string ->
output_suffix:string ->
Mapred_def.mapred_job_config ->
Mapred_config.mapred_config -> plan_config
configure_plan jc conf
Parameters:
keep_temp_files: if true, temporary files created during the
map/reduce execution are not immediately deleted
planning_capacity: the total number of cores available. This
parameter can be retrieved at runtime via
Mapred_job_exec.planning_capacity.
internal_suffix: the filename suffix added to the names
of intermediate files
output_suffix: the filename suffix added to the names
of final files
type plan
val create_plan : ?dn_identities:string list ->
Mapred_fs.filesystem -> plan_config -> plan
dn_identities: This is an optional list of datanode identities.
These are used in some circumstances as preferences for data
blocks (currently only for the output of emap jobs)
val bigblock_size : plan -> int
The bigblock size passed to configure_plan (via jc), rounded up to the next multiple of blocks.
val add_inputs : plan -> unit
val add_map_output : plan ->
int ->
(Mapred_tasks.file_tag * Mapred_tasks.file) list -> Unix.inet_addr -> unit
The IP address identifies the machine that executed the map or emap task
(which is also the likely storage location for the files)
val plan_complete : plan -> bool
val complete_inputs : plan -> unit
val executable_tasks : plan -> Mapred_tasks.task list
val hosts : plan -> (string * Unix.inet_addr) list
val mark_as_finished : plan -> Mapred_tasks.task -> unit
val mark_as_started : plan ->
Mapred_tasks.task -> Unix.inet_addr -> int -> bool -> unit
val remove_marks : plan -> Mapred_tasks.task -> unit
Removes the marks set by mark_as_started or mark_as_finished
val task_depends_on_list : plan -> Mapred_tasks.task -> Mapred_tasks.task list
val plan_finished : plan -> bool
val n_running : plan -> int
val n_finished : plan -> int
val n_total : plan -> int
val avg_running : plan -> float
val avg_runnable : plan -> float
val avg_runqueue : plan -> float
val round_points : plan -> Mapred_tasks.task -> float
val greediness_points : plan -> Mapred_tasks.task -> float
If the (imagined) completion of the task t would make other tasks
runnable, t gets one point for each task that becomes newly
runnable. The idea is to prefer tasks that make the most other
tasks runnable ("greediness").
If completing t does not yet make a task u runnable, but only fulfills
one more of its preconditions, t does not get a whole point for
u but only the fraction r/n, where r is the number of fulfilled
preconditions of u and n is the total number of preconditions of u.
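The point rule described above can be illustrated with a small standalone sketch. This is not the Mapred_sched implementation: the toy model (tasks as strings, a deps association list, a finished set) and the function body are hypothetical and only mirror the name greediness_points. It also assumes that r counts the fulfilled preconditions of u after t is imagined to be complete, so that a task u that becomes runnable yields r/n = 1, i.e. a whole point.

```ocaml
(* Hypothetical toy model, not part of Mapred_sched: tasks are strings,
   and [deps] maps each task to the tasks it depends on (its
   preconditions). *)
module S = Set.Make (String)

let greediness_points
    ~(deps : (string * string list) list)
    ~(finished : S.t)
    (t : string) : float =
  (* Imagine that t has already completed. *)
  let finished' = S.add t finished in
  List.fold_left
    (fun acc (u, pre) ->
       (* Only unfinished tasks u that depend on t contribute points. *)
       if u = t || S.mem u finished || not (List.mem t pre) then acc
       else begin
         let n = List.length pre in
         (* Assumption: r includes t itself, so r = n means that u
            becomes runnable and contributes a whole point. *)
         let r =
           List.length (List.filter (fun p -> S.mem p finished') pre) in
         acc +. float_of_int r /. float_of_int n
       end)
    0.0 deps

(* Example: r1 needs m1 and m2; r2 needs only m1. *)
let example_deps =
  [ ("m1", []); ("m2", []); ("r1", ["m1"; "m2"]); ("r2", ["m1"]) ]
```

With nothing finished, `greediness_points ~deps:example_deps ~finished:S.empty "m1"` evaluates to 1.5 (half a point for r1, a whole point for r2), while "m2" only gets 0.5. Once m2 is finished, m1 makes both r1 and r2 runnable and gets 2.0.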
val print_plan : Netchannels.out_obj_channel -> plan -> unit
val generate_svg : plan -> string
val task_stats : plan -> Mapred_tasks.task -> int * int
Raises Not_found if the task has never been started