
Class type Mapred_io.record_writer


class type record_writer = object .. end

method output_record : string -> unit
Outputs this record. The record must not contain LF chars. Its length is limited to blocksize-1.
method output_records : string Queue.t -> unit
Outputs the records in the queue. The queue is empty when this method returns.
method flush : unit -> unit
Flushes the records from the buffer to disk.
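A minimal sketch of the three record-oriented calls above. So that it compiles without Plasma installed, it declares a local class type writer_subset that mirrors only these methods, together with a hypothetical in-memory implementation buffer_writer. Both names are assumptions made for illustration and are not part of Plasma. Raising Invalid_argument for records that contain LF or are too long is likewise an assumption; the text above only states the restrictions, not how a real writer reacts when they are violated.

```ocaml
(* Hypothetical subset of Mapred_io.record_writer, declared locally so the
   sketch compiles without Plasma. *)
class type writer_subset = object
  method output_record : string -> unit
  method output_records : string Queue.t -> unit
  method flush : unit -> unit
end

(* Hypothetical in-memory stand-in: records are collected in [pending] and
   moved to [sink] on flush, one LF-terminated line per record. *)
class buffer_writer (blocksize : int) (sink : Buffer.t) : writer_subset =
  object (self)
    val pending = Buffer.create 256

    method output_record r =
      (* Assumption: violations are reported with Invalid_argument. *)
      if String.contains r '\n' then
        invalid_arg "output_record: record contains LF";
      if String.length r > blocksize - 1 then
        invalid_arg "output_record: record too long";
      Buffer.add_string pending r;
      Buffer.add_char pending '\n'

    method output_records q =
      (* Queue.take removes each element, so q is empty afterwards. *)
      while not (Queue.is_empty q) do
        self#output_record (Queue.take q)
      done

    method flush () =
      Buffer.add_buffer sink pending;
      Buffer.clear pending
  end

let sink = Buffer.create 256
let w = new buffer_writer 1024 sink
let q : string Queue.t = Queue.create ()

let () =
  w#output_record "first";
  Queue.add "second" q;
  Queue.add "third" q;
  w#output_records q;
  w#flush ()
```

After this runs, sink holds the three records as LF-terminated lines and q is empty, matching the contract of output_records.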
method from_fd_e : Unix.file_descr -> Unixqueue.event_system -> unit Uq_engines.engine
Outputs the records coming from this file. The file is read until EOF. While the engine is running, output_record and flush must not be called.

The file must be line-structured. If the LF after the last line is missing it is silently added. The length of the lines is not checked.

Use either from_fd_e or output_record, but not both. When mixing both styles, it is undefined which data is read by which method.

Note that this method is not implemented for all writers! In particular, this method is only available if the underlying filesystem is PlasmaFS.

method from_dev_e : string Queue.t -> Uq_io.in_bdevice -> int64 option -> int option -> Unixqueue.event_system -> bool Uq_engines.engine
from_dev_e q dev size_limit lines_limit esys: Generalization of from_fd_e. In this version, the total size and the number of records read from the input device can be limited. If a size_limit is set, at most this number of bytes is read (not counting record framing). If a lines_limit is set, at most this number of records is read.

The data is read from the queue q first, and then from dev.

When a limit is hit, it can happen that there are records which have not been processed. These records are left behind in q.

The engine's result is true if EOF has been reached.

Note that this method is not implemented for all writers! In particular, this method is only available if the underlying filesystem is PlasmaFS.
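The interplay of the two limits with the queue q can be modelled without any I/O. The sketch below is not the Plasma implementation: drain is a hypothetical function, the input device is replaced by a plain list of records, and the rule that a record is only accepted if it fits completely within size_limit is an assumption. It shows records being taken from q first and then from the device, unprocessed records staying behind in q, and true being returned only once nothing is left to read.

```ocaml
(* Hypothetical model of the limit handling described for from_dev_e.
   [dev] stands in for the input device and holds the records not yet read;
   [out] receives the records that were processed. Sizes count the record
   payload only, i.e. without the LF framing. *)
let drain (q : string Queue.t) (dev : string list ref)
    (size_limit : int64 option) (lines_limit : int option)
    (out : string Queue.t) : bool =
  let bytes = ref 0L and lines = ref 0 in
  (* Assumption: a record is accepted only if it stays within both limits. *)
  let fits r =
    let n = Int64.of_int (String.length r) in
    (match size_limit with
     | Some lim -> Int64.compare (Int64.add !bytes n) lim <= 0
     | None -> true)
    && (match lines_limit with
        | Some lim -> !lines < lim
        | None -> true)
  in
  let rec loop () =
    (* Refill q from the device only when q is empty: q is consumed first. *)
    if Queue.is_empty q then begin
      match !dev with
      | r :: rest -> dev := rest; Queue.add r q
      | [] -> ()
    end;
    if Queue.is_empty q then true   (* EOF: nothing left in q or dev *)
    else if fits (Queue.peek q) then begin
      let r = Queue.take q in
      bytes := Int64.add !bytes (Int64.of_int (String.length r));
      incr lines;
      Queue.add r out;
      loop ()
    end
    else false                      (* limit hit: the record stays in q *)
  in
  loop ()

let q : string Queue.t = Queue.create ()
let () = Queue.add "a" q
let dev = ref ["bb"; "ccc"; "d"]
let out : string Queue.t = Queue.create ()

(* Process at most two records, with no size limit. *)
let eof = drain q dev None (Some 2) out
```

With lines_limit set to 2, the call processes "a" (from q) and "bb" (from the device). The record "ccc" has already been fetched when the limit is hit, so it is left behind in q for a later call, and the result is false.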

method close_out : unit -> unit
Releases resources (e.g. closes transactions).
method abort : unit -> unit
Drops resources. Intended to be used for error cleanup.
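One way to combine close_out and abort is to close on success and abort when an exception escapes. The helper with_writer below is a hypothetical name, and the local class type closable only mirrors the two methods so that the sketch compiles without Plasma; logging_writer is a stand-in that merely records which method was called. Whether abort can itself raise is not stated here, so the sketch simply lets such an exception propagate.

```ocaml
(* Hypothetical subset of Mapred_io.record_writer. *)
class type closable = object
  method close_out : unit -> unit
  method abort : unit -> unit
end

(* Run [f w]. On success, release resources with close_out and return the
   result. If [f] raises, drop resources with abort and re-raise. An
   exception raised by close_out itself is not caught here. *)
let with_writer (w : #closable) f =
  match f w with
  | v -> w#close_out (); v
  | exception e -> w#abort (); raise e

(* Stand-in writer that only logs the cleanup call it receives. *)
class logging_writer (log : string list ref) = object
  method close_out () = log := "close_out" :: !log
  method abort () = log := "abort" :: !log
end

let log : string list ref = ref []

(* Success path: close_out is called. *)
let ok = with_writer (new logging_writer log) (fun _ -> 42)

(* Failure path: abort is called and the exception reaches the caller. *)
let failed =
  try with_writer (new logging_writer log) (fun _ -> failwith "boom")
  with Failure _ -> -1
```

The match ... with exception form keeps the two paths apart: close_out runs only after f has returned normally, and abort runs only for exceptions raised by f.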
method filesystem : Mapred_fs.filesystem
The underlying filesystem.
method record_config : record_config
The record_config
method stats : Mapred_stats.stats
Returns statistics:
  • write_blocks: how many blocks have been written out
  • write_lines: how many lines have been processed (unavailable if from_fd_e is used)
  • write_bytes: how many bytes have been processed
  • write_fs_time: the time spent waiting on the filesystem layer to write out data

This web site is published by Informatikbüro Gerd Stolpmann