Plasma GitLab Archive

Class type Mapred_io.record_reader


class type record_reader = object .. end

method pos_in : int
The ordinal number of the record that will be read next. Numbering starts at 0.
method input_record : unit -> string
Reads one record and advances the cursor. May raise End_of_file.
method peek_record : unit -> string
Reads the next record without advancing the cursor, i.e. the next peek_record or input_record will return the same record again. May raise End_of_file.
method input_records : string Queue.t -> unit
Reads some records and appends them to the passed queue. At least one record is read. Raises End_of_file if the end of the file is reached.
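The interplay of peek_record, input_record and input_records can be illustrated with a small self-contained sketch. It does not use Mapred_io: mini_reader is a hypothetical stand-in that declares only the four reading methods with the signatures listed above, reader_of_array is an invented in-memory implementation, and read_all is an invented helper. The batch size of at most two records per input_records call is an arbitrary choice for the illustration.

```ocaml
(* Hypothetical stand-in for the reading subset of record_reader. *)
class type mini_reader = object
  method pos_in : int
  method input_record : unit -> string
  method peek_record : unit -> string
  method input_records : string Queue.t -> unit
end

(* In-memory reader over an array of records (illustration only). *)
let reader_of_array (recs : string array) : mini_reader =
  object (self)
    val mutable pos = 0
    method pos_in = pos
    method peek_record () =
      (* Does not advance the cursor. *)
      if pos >= Array.length recs then raise End_of_file;
      recs.(pos)
    method input_record () =
      let r = self#peek_record () in
      pos <- pos + 1;
      r
    method input_records q =
      (* At least one record is read; End_of_file propagates if none is left. *)
      Queue.add (self#input_record ()) q;
      (* This stand-in opportunistically reads one more record if available. *)
      if pos < Array.length recs then Queue.add (self#input_record ()) q
  end

(* Drain a reader batch by batch until End_of_file is raised. *)
let read_all (rd : mini_reader) : string list =
  let q = Queue.create () in
  (try
     while true do
       rd#input_records q
     done
   with End_of_file -> ());
  List.rev (Queue.fold (fun acc r -> r :: acc) [] q)
```

With a real record_reader the same loop shape should apply: keep calling input_records (or input_record) and treat End_of_file as the normal termination signal.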
method close_in : unit -> unit
Releases resources (e.g. closes transactions).
method abort : unit -> unit
Drops resources. Intended to be used for error cleanup.
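One way to pair close_in and abort is a wrapper that calls close_in after a successful computation and abort when the computation raises. The sketch below is an assumption about usage, not part of Mapred_io: closable is a hypothetical class type containing only the two cleanup methods, and with_reader is an invented helper name.

```ocaml
(* Hypothetical subset of record_reader: only the cleanup methods. *)
class type closable = object
  method close_in : unit -> unit
  method abort : unit -> unit
end

(* Run f on the reader. On normal return, release resources with close_in.
   If f raises, drop resources with abort and re-raise the exception. *)
let with_reader (rd : closable) (f : closable -> 'a) : 'a =
  match f rd with
  | result ->
      rd#close_in ();
      result
  | exception e ->
      rd#abort ();
      raise e
```

Because the exception branch only covers f, an exception raised by close_in itself is not followed by a call to abort in this sketch.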
method to_fd_e : Unix.file_descr -> Unixqueue.event_system -> unit Uq_engines.engine
to_fd_e fd esys: The records are written to fd. The position pos_in is not updated. The length of the records is not checked except for a few records that are crucial for interpreting the boundaries of the bigblocks.

Use either to_fd_e or input_record on a given reader; switching between these APIs is not allowed.

While the engine is running, no other method may be called.

This method is only available if the underlying filesystem is PlasmaFS.

method to_dev_e : Uq_io.out_device -> Unixqueue.event_system -> unit Uq_engines.engine
Similar to to_fd_e, except that this method writes to a Uq_io.out_device.

This method is only available if the underlying filesystem is PlasmaFS.

method to_any_e : (Netsys_mem.memory -> int -> int -> unit Uq_engines.engine) ->
Unixqueue.event_system -> unit Uq_engines.engine
to_any_e dest esys: like to_fd_e, but the data is not written to a file descriptor. Instead, the function dest is called as dest m pos len to output some data.

This is an experimental method! It might not be defined on every record reader.

This method is only available if the underlying filesystem is PlasmaFS.

method filesystem : Mapred_fs.filesystem
The filesystem
method record_config : record_config
The record_config
method stats : Mapred_stats.stats
Returns statistics:
  • read_blocks: how many blocks have been read in
  • read_lines: how many lines have been processed (unavailable if to_fd_e is used)
  • read_bytes: how many bytes have been processed
  • read_fs_time: the time spent waiting on the filesystem layer for new data

This web site is published by Informatikbüro Gerd Stolpmann