class type record_writer = object .. end
method output_record : string -> unit
method output_records : string Queue.t -> unit
method flush : unit -> unit
method from_fd_e : Unix.file_descr -> Unixqueue.event_system -> unit Uq_engines.engine
output_record and flush must not be called.
The file must be line-structured. If the LF after the last line is missing, it is silently added. The length of the lines is not checked.
Use either from_fd_e or output_record, but not both. When the two styles are mixed, it is undefined which data is read by which method.
Note that this method is not implemented for all writers!
In particular, this method is only available if the underlying
filesystem is PlasmaFS.
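The line-structure rule for from_fd_e can be pictured with a small stand-alone sketch. The helper records_of_contents is hypothetical and not part of the library; it only models how file contents map to records when the final LF is optional and line lengths are not checked.

```ocaml
(* Hypothetical helper, not part of the library: split the contents of a
   line-structured file into records. A missing LF after the last line
   is tolerated, and line lengths are not checked. *)
let records_of_contents (s : string) : string list =
  if s = "" then []
  else
    let s =
      (* Silently add the final LF if it is missing. *)
      if s.[String.length s - 1] = '\n' then s else s ^ "\n"
    in
    (* Drop the final LF, then split on the remaining ones. *)
    String.split_on_char '\n' (String.sub s 0 (String.length s - 1))
```

With this model, the inputs "a\nb\n" and "a\nb" yield the same two records.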
method from_dev_e : string Queue.t ->
Uq_io.in_bdevice ->
int64 option ->
int option -> Unixqueue.event_system -> bool Uq_engines.engine
from_dev_e q dev size_limit lines_limit esys: Generalization of from_fd_e. In this version, the size and the number of records that are read from the input device can be limited.
If a size_limit is set, only up to this number of bytes are read (not counting record framing). If a lines_limit is set, only up to this number of records are read.
The data is read from the queue q first, and then from dev.
When a limit is hit, there may be records that have not been processed. These records are left behind in q.
The method returns true if EOF is reached.
Note that this method is not implemented for all writers!
In particular, this method is only available if the underlying
filesystem is PlasmaFS.
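The limit handling of from_dev_e can be illustrated with a simplified synchronous model. This is a sketch under stated assumptions, not the library's implementation: read_limited and emit are hypothetical names, the device is replaced by a mutable list of pending lines, and a record that would exceed size_limit is assumed to stay in q unprocessed.

```ocaml
(* Simplified synchronous model of the limit semantics. The real method
   is asynchronous and reads from a Uq_io.in_bdevice; here the device is
   a hypothetical stand-in: a mutable list of pending lines. *)
let read_limited
    (q : string Queue.t)
    (dev : string list ref)
    (size_limit : int64 option)
    (lines_limit : int option)
    (emit : string -> unit) : bool =
  let bytes = ref 0L and lines = ref 0 in
  (* Would processing record r stay within both limits? Only the payload
     length is counted, not the LF framing. *)
  let within_limits r =
    (match lines_limit with None -> true | Some n -> !lines < n)
    && (match size_limit with
        | None -> true
        | Some n ->
            Int64.add !bytes (Int64.of_int (String.length r)) <= n)
  in
  let rec loop () =
    (* Refill q from the device only when q is empty, so that q is
       always drained first. *)
    if Queue.is_empty q then begin
      match !dev with
      | [] -> ()
      | r :: rest -> dev := rest; Queue.add r q
    end;
    if Queue.is_empty q then true          (* EOF: nothing left anywhere *)
    else if within_limits (Queue.peek q) then begin
      let r = Queue.take q in
      bytes := Int64.add !bytes (Int64.of_int (String.length r));
      incr lines;
      emit r;
      loop ()
    end
    else false             (* limit hit: the unprocessed record stays in q *)
  in
  loop ()
```

A caller that receives false can invoke the function again with the same q and dev to continue where the previous call stopped.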
method close_out : unit -> unit
method abort : unit -> unit
method filesystem : Mapred_fs.filesystem
method record_config : record_config
method stats : Mapred_stats.stats
write_blocks: how many blocks have been written out
write_lines: how many lines have been processed (unavailable if from_fd_e is used)
write_bytes: how many bytes have been processed
write_fs_time: the time spent waiting on the filesystem layer for writing out data
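To make the line and byte counters concrete, here is a toy in-memory writer that maintains them. Everything in it is an assumption for illustration: the counters record and the counting_writer class are hypothetical, the real Mapred_stats.stats type is not shown in this section and may look different, and it is assumed here that the LF framing byte is not counted and that output_records drains its queue.

```ocaml
(* Hypothetical counter record; the real Mapred_stats.stats type is not
   shown in this section and may look different. *)
type counters = {
  mutable write_lines : int;
  mutable write_bytes : int;
}

(* Toy writer that appends LF-terminated records to a Buffer.t. *)
class counting_writer (buf : Buffer.t) =
  object (self)
    val c = { write_lines = 0; write_bytes = 0 }

    method output_record (r : string) : unit =
      Buffer.add_string buf r;
      Buffer.add_char buf '\n';
      c.write_lines <- c.write_lines + 1;
      (* Assumption: the LF framing byte is not counted. *)
      c.write_bytes <- c.write_bytes + String.length r

    method output_records (q : string Queue.t) : unit =
      (* Assumption: the queue is drained, writing records in order. *)
      while not (Queue.is_empty q) do
        self#output_record (Queue.take q)
      done

    method counters = c
  end
```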