
Module Netfs

module Netfs: sig .. end
Class type stream_fs for filesystems with stream access to files


The class type Netfs.stream_fs is an abstraction for both kernel-level and user-level filesystems. It is used as a parameter for algorithms (like globbing) that operate on filesystems but do not want to assume any particular filesystem. Only stream access is provided (no seek).

File paths:

The filesystem supports hierarchical file names. File paths use Unix conventions, i.e.

  • / is the root
  • Path components are separated by slashes. Several consecutive slashes are allowed but mean the same as a single slash.
  • . is the same directory
  • .. is the parent directory
All paths need to be absolute (i.e. start with /).
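
The path conventions above can be illustrated with a small, self-contained sketch. It is not part of Netfs: the function normalize is a hypothetical helper that resolves "." and ".." purely lexically (no symlinks are consulted), and its choice to reject relative paths and paths that climb above / is an assumption of this sketch, not a requirement on implementations.

```ocaml
(* Sketch only, not the Netfs implementation: normalizes an absolute path
   according to the conventions listed above. Returns None for relative
   paths and for paths that would climb above "/". *)
let normalize (path : string) : string option =
  if path = "" || path.[0] <> '/' then None
  else
    let rec go stack = function
      | [] -> Some ("/" ^ String.concat "/" (List.rev stack))
      (* empty components come from consecutive slashes; "." is a no-op *)
      | ("" | ".") :: rest -> go stack rest
      | ".." :: rest ->
          (match stack with
           | [] -> None                      (* would go beyond the root *)
           | _ :: parent -> go parent rest)
      | name :: rest -> go (name :: stack) rest
    in
    go [] (String.split_on_char '/' path)
```

For instance, normalize "/a//b/./c/../d" yields Some "/a/b/d", while normalize "a/b" yields None.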

There can be additional constraints on paths:

  • Character encoding restriction: A certain ASCII-compatible character encoding is assumed (including UTF-8)
  • Character exclusion: Certain characters may be excluded
Implementations may impose more constraints that cannot be expressed here (case insensitivity, path length, exclusion of special names etc.).

Virtuality:

There is no assumption that / is the real root of the local filesystem. It can actually be anywhere: a local subdirectory, a remote directory, or a fictitious root. There need not be any protection against "running beyond root", e.g. with the path "/..".

This class type also supports remote filesystems, and thus there is no concept of file handle (because this would exclude a number of implementations).

Errors:

Errors should generally be indicated by raising Unix_error. For many error codes the interpretation is already given by POSIX. Here are some more special cases:

  • EINVAL: should also be used for invalid paths, or when a flag cannot be supported (and it is non-ignorable)
  • ENOSYS: should also be used if an operation is generally unavailable
In case of hard errors (like socket errors when communicating with the remote server) there is no need to stick to Unix_error, though.

Subtyping:

The class type Netfs.stream_fs is subtypable, and subtypes can add more features by:

  • adding more methods
  • adding more flags to existing methods
Omitted:

Real filesystems usually provide a lot more features than what is represented here, such as:

  • Access control and file permissions
  • Metadata like timestamps
  • Random access to files
This definition is intentionally minimalistic. In the future this class type will be extended, and more common filesystem features will be covered. See Netfs.empty_fs for a way to ensure that your definition of a stream_fs can still be built after stream_fs has been extended.

The class type stream_fs


type read_flag = [ `Binary | `Dummy | `Skip of int64 | `Streaming ] 
type read_file_flag = [ `Binary | `Dummy ] 
type write_flag = [ `Binary | `Create | `Dummy | `Exclusive | `Streaming | `Truncate ] 
type write_file_flag = [ `Binary | `Create | `Dummy | `Exclusive | `Link | `Truncate ] 
type write_common = [ `Binary | `Create | `Dummy | `Exclusive | `Truncate ] 
The intersection of write_flag and write_file_flag
type size_flag = [ `Dummy ] 
type test_flag = [ `Dummy | `Link ] 
type remove_flag = [ `Dummy | `Recursive ] 
type rename_flag = [ `Dummy ] 
type symlink_flag = [ `Dummy ] 
type readdir_flag = [ `Dummy ] 
type readlink_flag = [ `Dummy ] 
type mkdir_flag = [ `Dummy | `Nonexcl | `Path ] 
type rmdir_flag = [ `Dummy ] 
type copy_flag = [ `Dummy ] 

Note `Dummy: this flag is always ignored. There are two reasons for having it:
  • OCaml does not allow empty variants
  • it is sometimes convenient to have it (e.g. in: if <condition> then `Create else `Dummy)
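
The second reason can be sketched as follows. The type below is a local copy of write_flag so that the snippet compiles without Ocamlnet; write_flags and effective are hypothetical helpers introduced only for illustration.

```ocaml
(* Local copy of the write_flag type shown above, so that this sketch
   compiles without Ocamlnet. *)
type write_flag =
  [ `Binary | `Create | `Dummy | `Exclusive | `Streaming | `Truncate ]

(* Hypothetical helper: builds a flag list from booleans. `Dummy fills
   the slot of a flag that is not wanted. *)
let write_flags ~(create : bool) ~(truncate : bool) : write_flag list =
  [ (if create then `Create else `Dummy);
    (if truncate then `Truncate else `Dummy) ]

(* What "always ignored" amounts to: dropping `Dummy changes nothing. *)
let effective (flags : write_flag list) : write_flag list =
  List.filter (fun f -> f <> `Dummy) flags
```

Here effective (write_flags ~create:true ~truncate:false) evaluates to [`Create].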

type test_type = [ `D | `E | `F | `H | `N | `R | `S | `W | `X ] 
Tests:
  • `N: the file name exists
  • `E: the file exists
  • `D: the file exists and is a directory
  • `F: the file exists and is regular
  • `H: the file exists and is a symlink (possibly to a non-existing target)
  • `R: the file exists and is readable
  • `W: the file exists and is writable
  • `X: the file exists and is executable
  • `S: the file exists and is non-empty

class type local_file = object .. end
class type stream_fs = object .. end
class empty_fs : string -> stream_fs
This is a class where all methods fail with ENOSYS.
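
The idea behind empty_fs can be shown with a self-contained model. The class type mini_fs, its two simplified methods, and the exception Not_implemented are assumptions of this sketch. The real stream_fs has more methods and signals ENOSYS via Unix_error. With the real library one would presumably write inherit Netfs.empty_fs "name" in the same position.

```ocaml
(* Stands in for Unix_error(ENOSYS, ...) in this model. *)
exception Not_implemented of string

(* Hypothetical, much reduced stand-in for stream_fs. *)
class type mini_fs = object
  method readdir : string -> string list
  method size : string -> int64
end

(* Every method fails, like empty_fs. *)
class empty_mini_fs (name : string) : mini_fs = object
  method readdir (_ : string) : string list =
    raise (Not_implemented (name ^ ".readdir"))
  method size (_ : string) : int64 =
    raise (Not_implemented (name ^ ".size"))
end

(* Inherit the failing defaults and override only what is supported. If
   mini_fs later gains a method, only empty_mini_fs must be updated and
   this class still compiles. *)
class one_dir_fs : mini_fs = object
  inherit empty_mini_fs "one_dir_fs"
  method! readdir path = if path = "/" then [ "a"; "b" ] else []
end
```

In this model (new one_dir_fs)#readdir "/" returns ["a"; "b"], whereas calling size raises Not_implemented.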
val local_fs : ?encoding:Netconversion.encoding ->
?root:string -> ?enable_relative_paths:bool -> unit -> stream_fs
local_fs(): Returns a filesystem object for the local filesystem.

  • encoding: Specifies the character encoding of paths. The default is system-dependent.
  • root: the directory of the local filesystem that serves as the root of the returned object. If omitted, the root is the root of the local filesystem (i.e. / for Unix, and see comments for Windows below). Use root="." to make the current working directory the root. Note that ".", like other relative paths, is interpreted at the time when the access method is executed.
  • enable_relative_paths: Normally, only absolute paths can be passed to the access methods like read. By setting this option to true one can also enable relative paths. These are taken relative to the working directory, and not relative to root. Relative names are off by default because there is usually no counterpart in network filesystems.


OS Notes



Unix in general: There is no notion of character encoding of paths. Paths are just bytes. Because of this, the default encoding is None. If a different encoding is passed to local_fs, these bytes are just interpreted in this encoding. There is no conversion.

For desktop programs, though, usually the character encoding of the locale is taken for filenames. You can get this by passing

    let encoding = 
      Netconversion.user_encoding()
    

as encoding argument.

Windows: If the root argument is not passed to local_fs it is possible to access the whole filesystem:

  • Paths starting with drive letters like c:/ are also considered as absolute
  • Additionally, paths starting with slashes like /c:/ mean the same
  • UNC paths starting with two slashes like //hostname are supported
However, when a root directory is passed, these additional notations are not possible anymore - paths must start with /, and there is neither support for drive letters nor for UNC paths.

The encoding argument defaults to the current ANSI code page, and requesting a different encoding is not supported. (The difficulty is that the Win32 bindings of the relevant OS functions always assume the ANSI encoding.)

There is no support for backslashes as path separators (such paths will be rejected), for better compatibility with other platforms.



Other implementations:

  • Http_fs allows one to access HTTP-based filesystems
  • Ftp_fs allows one to access filesystems via FTP
  • Shell_fs allows one to access filesystems by executing shell commands. This works locally and via ssh.
There are even some implementations outside Ocamlnet:
  • Webdav provides an extension of Http_fs for the full WebDAV set of filesystem operations


Algorithms


val copy : ?replace:bool ->
?streaming:bool ->
#stream_fs -> string -> #stream_fs -> string -> unit
copy orig_fs orig_name dest_fs dest_name: Copies the file orig_name from orig_fs to the file dest_name in dest_fs. By default, the destination file is truncated and overwritten if it already exists.

If orig_fs and dest_fs are the same object, the copy method is called to perform the operation. Otherwise, the data is read chunk by chunk from the file in orig_fs and then written to the destination file in dest_fs.

Symlinks are resolved, and the linked file is copied, not the link as such.

The copy does not preserve ownerships, file permissions, or timestamps. (The stream_fs object does not represent these.) There is no protection against copying an object to itself.

  • replace: If set, the destination file is removed and created again if it already exists
  • streaming: use streaming mode for reading and writing files

val copy_into : ?replace:bool ->
?subst:(int -> string) ->
?streaming:bool ->
#stream_fs -> string -> #stream_fs -> string -> unit
copy_into orig_fs orig_name dest_fs dest_name: Like copy, but this version also supports recursive copies. The dest_name must be an existing directory, and the file or tree at orig_name is copied into it.

Symlinks are copied as symlinks.

If replace is set and the destination file/directory already exists, it is deleted before doing the copy.

  • subst: See Netfs.convert_path
  • streaming: use streaming mode for reading and writing files

type file_kind = [ `Directory | `None | `Other | `Regular | `Symlink ] 
val iter : pre:(string -> file_kind -> file_kind -> unit) ->
?post:(string -> unit) -> #stream_fs -> string -> unit
iter pre fs start: Iterates over the file hierarchy at start. The function pre is called for every filename. The filenames passed to pre are relative to start. The start must be a directory.

For directories, the pre function is called for the directory before it is called for the members of that directory. The function post can additionally be passed. It is only called for directories, and only after their members have been visited.

pre is called as pre rk lk where rk is the file kind after following symlinks and lk the file kind without following symlinks (the link itself).

Example: iter pre fs "/foo" would call

  • pre "dir" `Directory `Directory (meaning the directory "/foo/dir")
  • pre "dir/file1" `Regular `Regular
  • pre "dir/file2" `Regular `Symlink
  • post "dir"
Note: symlinks to non-existing files are reported as pre name `None `Symlink.
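
The calling order of pre and post can be reproduced with a self-contained model. It does not use Netfs: the type node and the functions kinds, walk, string_of_kind, and trace are hypothetical, and the traversal runs over an in-memory tree instead of a stream_fs. Regular files are reported with the file_kind tag `Regular.

```ocaml
type file_kind = [ `Directory | `None | `Other | `Regular | `Symlink ]

(* Hypothetical in-memory tree standing in for a filesystem. *)
type node =
  | File                        (* regular file *)
  | Link_to_file                (* symlink to an existing regular file *)
  | Dangling_link               (* symlink to a non-existing target *)
  | Dir of (string * node) list

(* (kind after following symlinks, kind of the entry itself) *)
let kinds (n : node) : file_kind * file_kind =
  match n with
  | File -> (`Regular, `Regular)
  | Link_to_file -> (`Regular, `Symlink)
  | Dangling_link -> (`None, `Symlink)
  | Dir _ -> (`Directory, `Directory)

(* pre runs before the members of a directory, post after them. *)
let rec walk ~pre ?(post = fun (_ : string) -> ()) prefix members =
  List.iter
    (fun (name, node) ->
      let rel = if prefix = "" then name else prefix ^ "/" ^ name in
      let rk, lk = kinds node in
      pre rel rk lk;
      match node with
      | Dir sub -> walk ~pre ~post rel sub; post rel
      | File | Link_to_file | Dangling_link -> ())
    members

let string_of_kind : file_kind -> string = function
  | `Directory -> "Directory" | `None -> "None" | `Other -> "Other"
  | `Regular -> "Regular" | `Symlink -> "Symlink"

(* Records the calls made for the tree from the example above. *)
let trace () : string list =
  let log = ref [] in
  let add s = log := s :: !log in
  walk
    ~pre:(fun p rk lk ->
      add (Printf.sprintf "pre %s %s %s" p
             (string_of_kind rk) (string_of_kind lk)))
    ~post:(fun p -> add ("post " ^ p))
    ""
    [ ("dir", Dir [ ("file1", File); ("file2", Link_to_file) ]) ];
  List.rev !log
```

The list returned by trace () contains the four calls in the order given in the example.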
val convert_path : ?subst:(int -> string) ->
#stream_fs -> #stream_fs -> string -> string
convert_path oldfs newfs oldpath: The encoding of oldpath (which is assumed to reside in oldfs) is converted to the encoding of newfs and returned.

The conversion may fail for individual code points. In that case the function subst is called with the problematic code point as argument (in the encoding of oldfs). The default subst function just raises Netconversion.Cannot_represent.

If one of the filesystem objects does not specify an encoding, the file name is not converted, but simply returned as-is. This may result in errors when newfs has an encoding while oldfs does not have one because the file name might use byte representations that are illegal in newfs.
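
The role of subst can be illustrated with a self-contained model that does not call Netfs or Netconversion. It re-encodes a Latin-1 name as ASCII. The function latin1_to_ascii and the local exception Cannot_represent (which merely stands in for Netconversion.Cannot_represent) are assumptions of this sketch.

```ocaml
(* Stands in for Netconversion.Cannot_represent in this model. *)
exception Cannot_represent of int

(* Model: re-encode a Latin-1 name as ASCII. In Latin-1 every byte is one
   code point, so code points >= 128 have no ASCII representation and are
   handed to subst. By default subst raises, as described above. *)
let latin1_to_ascii ?(subst = fun cp -> raise (Cannot_represent cp))
    (name : string) : string =
  let buf = Buffer.create (String.length name) in
  String.iter
    (fun c ->
      let cp = Char.code c in
      if cp < 128 then Buffer.add_char buf c
      else Buffer.add_string buf (subst cp))
    name;
  Buffer.contents buf
```

With ~subst:(fun _ -> "_") the name "caf\233" becomes "caf_". Without subst the same call raises Cannot_represent 233.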

This web site is published by Informatikbüro Gerd Stolpmann