class type mapred_job = object .. end
method custom_params : string list
The list of allowed custom parameters
method check_config : mapred_env -> mapred_job_config -> unit
Checks the config. If the config is not acceptable, this method can
raise an exception to stop the job
method pre_job_start : mapred_env -> mapred_job_config -> unit
This is run by the job process before the first task is started
method post_job_finish : mapred_env -> mapred_job_config -> unit
This is run by the job process after the last task is finished
method map : mapred_env ->
mapred_job_config ->
task_info ->
Mapred_io.record_reader -> Mapred_io.record_writer -> unit
The mapper reads records, maps them, and writes them into a
second file.
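
The read-map-write loop described above can be sketched as follows. This is only an illustration under stated assumptions: the class types `reader` and `writer`, with `input_record` raising `End_of_file` at the end of input and `output_record` appending one record, are hypothetical in-memory stand-ins introduced here, not the actual `Mapred_io` interfaces. The `mapred_env`, `mapred_job_config`, and `task_info` arguments are replaced by `unit`.

```ocaml
(* Hypothetical stand-ins for Mapred_io.record_reader / record_writer. *)
class type reader = object
  method input_record : unit -> string   (* raises End_of_file when done *)
end

class type writer = object
  method output_record : string -> unit
end

(* In-memory reader over a list of records. *)
let reader_of_list (records : string list) : reader =
  let rest = ref records in
  object
    method input_record () =
      match !rest with
      | [] -> raise End_of_file
      | r :: tl -> rest := tl; r
  end

(* In-memory writer that collects records (in reverse order) in [out]. *)
let writer_to_list (out : string list ref) : writer =
  object
    method output_record r = out := r :: !out
  end

(* Mapper sketch: reads each record, upper-cases it, writes the result.
   The three unit arguments stand for me, jc, and the task info. *)
let map () () () (rd : reader) (wr : writer) : unit =
  try
    while true do
      let r = rd#input_record () in
      wr#output_record (String.uppercase_ascii r)
    done
  with End_of_file -> ()

(* Runs the mapper over a list and returns the output in input order. *)
let run_map (records : string list) : string list =
  let out = ref [] in
  map () () () (reader_of_list records) (writer_to_list out);
  List.rev !out
```

In this sketch `run_map` exists only to drive the mapper in memory; in a real job the reader and writer would be supplied by the framework.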
method extract_key : mapred_env -> mapred_job_config -> string -> string
Extracts the key from a record. This method is always called by
first evaluating let f = job#extract_key me jc, and then calling
f line for each input line. Because of this, it is possible to
factor initializations out, as in
  method extract_key me jc =
    ...; (* init stuff *)
    (fun line -> ... (* real extraction *) )
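
The two-stage calling convention can be illustrated with a small standalone sketch. It is not taken from the library: the record type `key_config` and its `separator` field are hypothetical stand-ins for `mapred_job_config`, the environment argument is replaced by `unit`, and taking the text before the first separator as the key is an assumed record layout. The counter shows that the initialization stage runs once, not once per line.

```ocaml
(* Hypothetical stand-in for the job configuration: only a separator. *)
type key_config = { separator : char }

(* Counts how often the initialization stage runs. *)
let init_count = ref 0

(* Staged key extraction: everything before the inner [fun] runs once
   per partial application; the inner function runs once per line. *)
let extract_key () (jc : key_config) : string -> string =
  incr init_count;                      (* init stuff *)
  let sep = jc.separator in
  (fun line ->                          (* real extraction *)
     match String.index_opt line sep with
     | Some i -> String.sub line 0 i    (* text before the separator *)
     | None -> line)                    (* no separator: whole line is the key *)

(* Mirrors the calling convention: evaluate [f] once, then apply it
   to every input line. *)
let keys =
  let f = extract_key () { separator = '\t' } in
  List.map f [ "a\t1"; "b\t2"; "c" ]
```

Because `f` is bound once, any costly setup placed before the inner function is shared across all lines.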
method partition_of_key : mapred_env -> mapred_job_config -> string -> int
Determines the partition of a key. Can be something simple like
fun k -> (Hashtbl.hash k) mod partitions, or something more
elaborate. This method is always called by first evaluating
let f = job#partition_of_key me jc, and then calling f key for
each key. Because of this, it is possible to factor
initializations out, as in
  method partition_of_key me jc =
    ...; (* init stuff *)
    (fun key -> ... (* real partitioning *) )
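
A minimal sketch of the simple hash-based variant, written in the staged style. The record type `part_config` and its `partitions` field are hypothetical stand-ins for wherever the job configuration keeps the partition count, and the environment argument is replaced by `unit`. `Hashtbl.hash` returns a non-negative integer, so the result always lies in the range 0 to `partitions - 1`.

```ocaml
(* Hypothetical stand-in for the job configuration: only the
   number of partitions. *)
type part_config = { partitions : int }

(* Staged partitioning: the partition count is read and validated once;
   the returned function is then applied to each key. *)
let partition_of_key () (jc : part_config) : string -> int =
  let n = jc.partitions in              (* init stuff *)
  if n <= 0 then
    invalid_arg "partition_of_key: partitions must be positive";
  (fun key -> Hashtbl.hash key mod n)   (* real partitioning *)

(* Equal keys always land in the same partition. *)
let parts =
  let f = partition_of_key () { partitions = 4 } in
  List.map f [ "apple"; "banana"; "apple" ]
```

Hashing gives a deterministic assignment but no ordering guarantee across partitions; a job that needs range partitioning would replace the inner function with its own logic.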
method reduce : mapred_env ->
mapred_job_config ->
task_info ->
Mapred_io.record_reader -> Mapred_io.record_writer -> unit
The reducer reads all the records of one partition, and puts them
into an output file.