module Pxp_reader: sig .. end
Resolving identifiers and associating resources
The Pxp_reader module allows you to specify exactly how external
identifiers (SYSTEM or PUBLIC) are mapped to files or channels. This is
normally only necessary for advanced configurations, as the built-in
functions Pxp_types.from_file, Pxp_types.from_channel, and
Pxp_types.from_string often suffice.
There are two ways to use this module. First, you can compose the
desired behaviour by combining several predefined resolver objects
or functions. See the example section at the end of the file.
Second, you can inherit from the classes (or define a resolver class
from scratch). I hope this is seldom necessary as this way is much
more complicated; however it allows you to implement any required magic.
Types and exceptions
exception Not_competent
Raised by the open_in method if the object does not know how to
handle the passed external ID.
exception Not_resolvable of exn
Not_resolvable(Not_found) serves as an indicator for an unknown reason.
type lexer_source = {
  lsrc_lexbuf : ..;
  lsrc_unicode_lexbuf : ..;
}
The resolver class type
The class type resolver is the official type of all "resolvers".
Resolvers take file names (or better, external identifiers) and
return lexbufs, scanning the file for tokens. Resolvers may be
cloned, and clones can interpret relative file names relative to
their creator.
Example of cloning:
Suppose the resolver r reads this text from file:/dir/f1.xml:
<tag>some XML text &e; </tag>
The task is to switch to a resolver for reading from the entity e
(which is referenced by &e;), and to switch back to the original
resolver when the parser is done with e. Let us assume that e has the
SYSTEM ID subdir/f2.xml. Our approach is to first create a clone of the
original resolver so that we can do the switch to e in a copy. That
makes switching back easy: we give up the cloned resolver and continue
with the original, unmodified resolver. This gives us the freedom to
modify the clone in order to switch to e. We do this by changing the
input file:
let r' = <create clone of r>
<tell r' to open the file subdir/f2.xml>
r' must still know the directory of the file r is reading, otherwise
it would not be able to resolve subdir/f2.xml, which expands to
file:/dir/subdir/f2.xml.
Actually, this example can be coded as:
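(* resolve_as_file takes a final unit argument after its optional ones *)
let r = new resolve_as_file () in
let lbuf = r # open_in "file:/dir/f1.xml" in
(* ... read from lbuf ... *)
(* the clone remembers where r is reading, so the relative ID resolves *)
let r' = r # clone in
let lbuf' = r' # open_in "subdir/f2.xml" in
(* ... read from lbuf' ... *)
r' # close_in;
(* ... read from lbuf ... *)
r # close_in;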
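The expansion of subdir/f2.xml to file:/dir/subdir/f2.xml can be illustrated with a small stand-alone sketch. It does not use PXP at all; resolve_relative is a hypothetical helper written for this illustration, and it deliberately ignores cases such as ../ segments or absolute paths that a real URL library handles.

```ocaml
(* Toy model, not PXP code: resolve a relative system ID against the
   URL of the document that references it. *)
let resolve_relative ~base rel =
  if String.contains rel ':' then
    (* The ID already carries a scheme such as "file:", so keep it. *)
    rel
  else
    match String.rindex_opt base '/' with
    | Some i ->
        (* Keep everything up to and including the last '/' of the base. *)
        String.sub base 0 (i + 1) ^ rel
    | None -> rel

let () =
  print_endline (resolve_relative ~base:"file:/dir/f1.xml" "subdir/f2.xml")
```

This is the information a clone has to carry over from its creator: without the base file:/dir/f1.xml, the relative ID alone does not identify a file.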
class type resolver = object .. end
type accepted_id =
Netchannels.in_obj_channel * Pxp_core_types.I.encoding option *
Pxp_core_types.I.resolver_id option
The in_obj_channel is the channel to read data from, the encoding option
may enforce a certain character encoding, and the resolver_id option
may detail the ID (this ID will be returned by active_id).
If None is passed as encoding option, the standard autodetection of
the encoding is performed.
If None is passed as resolver_id option, the original ID is taken
unchanged.
class resolve_to_this_obj_channel : ?id:Pxp_core_types.I.ext_id -> ?rid:Pxp_core_types.I.resolver_id -> ?fixenc:Pxp_core_types.I.encoding -> ?close:(Netchannels.in_obj_channel -> unit) -> Netchannels.in_obj_channel -> resolver
Reads from the passed in_obj_channel.
class resolve_to_any_obj_channel : ?close:(Netchannels.in_obj_channel -> unit) -> channel_of_id:(Pxp_core_types.I.resolver_id -> accepted_id) -> unit -> resolver
Calls channel_of_id to open a new channel for the passed resolver_id.
class resolve_to_url_obj_channel : ?close:(Netchannels.in_obj_channel -> unit) -> url_of_id:(Pxp_core_types.I.resolver_id -> Neturl.url) -> base_url_of_id:(Pxp_core_types.I.resolver_id -> Neturl.url) -> channel_of_url:(Pxp_core_types.I.resolver_id -> Neturl.url -> accepted_id) -> unit -> resolver
Calls url_of_id to get the corresponding URL (such IDs are normally
system IDs, but it is also possible to map other kinds of IDs to URLs).
class resolve_as_file : ?file_prefix:[ `Allowed | `Not_recognized | `Required ] -> ?host_prefix:[ `Allowed | `Not_recognized | `Required ] -> ?system_encoding:Pxp_core_types.I.encoding -> ?map_private_id:(Pxp_core_types.I.private_id -> Neturl.url) -> ?open_private_id:(Pxp_core_types.I.private_id -> Pervasives.in_channel * Pxp_core_types.I.encoding option) -> ?base_url_defaults_to_cwd:bool -> ?not_resolvable_if_not_found:bool -> unit -> resolver
val make_file_url : ?system_encoding:Pxp_core_types.I.encoding ->
?enc:Pxp_core_types.I.encoding -> string -> Neturl.url
Builds a file URL, prepending Sys.getcwd() to the passed file name if it
is relative.
system_encoding: Specifies the encoding of file names of the local file
system. Default: UTF-8. (This argument is necessary to interpret
Sys.getcwd() correctly.)
enc: The encoding of the passed string. Defaults to `Enc_utf8.
Note: To get a string representation of the URL, apply
Neturl.string_of_url to the result.
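The role of Sys.getcwd() can be sketched with the standard library alone. The helper absolutize below is hypothetical and covers only the step of making a file name absolute; it does not build a URL, and it ignores the encoding arguments.

```ocaml
(* Sketch of the "prepend the current directory" step only; this is not
   how make_file_url is implemented, and no URL is constructed here. *)
let absolutize (name : string) : string =
  if Filename.is_relative name then
    (* Relative names are interpreted against the current directory. *)
    Filename.concat (Sys.getcwd ()) name
  else
    name

let () =
  print_endline (absolutize "doc.xml")
```

Because the result depends on the working directory at call time, a resolver set up this way should be created before the program changes directory.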
class lookup_id : (Pxp_core_types.I.ext_id * resolver) list -> resolver
class lookup_id_as_file : ?fixenc:Pxp_core_types.I.encoding -> (Pxp_core_types.I.ext_id * string) list -> resolver
The list (catalog) argument specifies pairs (xid,file) mapping external
IDs xid to files.
class lookup_id_as_string : ?fixenc:Pxp_core_types.I.encoding -> (Pxp_core_types.I.ext_id * string) list -> resolver
The list (catalog) argument specifies pairs (xid,s) mapping external
IDs xid to strings s.
class lookup_public_id : (string * resolver) list -> resolver
Builder for PUBLIC ID catalog resolvers: the list (catalog) argument
specifies pairs (pubid, r) mapping PUBLIC identifiers to subresolvers.
class lookup_public_id_as_file : ?fixenc:Pxp_core_types.I.encoding -> (string * string) list -> resolver
Makes a resolver for PUBLIC identifiers.
class lookup_public_id_as_string : ?fixenc:Pxp_core_types.I.encoding -> (string * string) list -> resolver
Makes a resolver for PUBLIC identifiers.
class lookup_system_id : (string * resolver) list -> resolver
The list (catalog) argument specifies pairs (url, r) mapping URLs to
subresolvers.
class lookup_system_id_as_file : ?fixenc:Pxp_core_types.I.encoding -> (string * string) list -> resolver
The list (catalog) argument specifies pairs (url, filename) mapping URLs
to filenames.
class lookup_system_id_as_string : ?fixenc:Pxp_core_types.I.encoding -> (string * string) list -> resolver
The list (catalog) argument specifies pairs (url, text) mapping URLs to
XML text (which must begin with <?xml ...?>).
class norm_system_id : resolver -> resolver
class rewrite_system_id : ?forward_unmatching_urls:bool -> (string * string) list -> resolver -> resolver
type combination_mode =
| Public_before_system
| System_before_public
class combine : ?mode:combination_mode -> resolver list -> resolver
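How catalog resolvers and Not_competent interact when several resolvers are combined can be pictured with a toy model. The sketch below does not use the PXP classes. It assumes that a combined resolver asks its members in list order and moves on whenever one of them raises Not_competent. The names toy_resolver, toy_lookup and toy_combine are invented for this illustration, and the exception is redeclared locally so that the block stands alone.

```ocaml
(* Toy model, not the PXP classes: a "resolver" here is just a function
   from a system ID to the XML text it stands for. *)
exception Not_competent

type toy_resolver = string -> string

(* Catalog lookup: competent only for IDs listed in the catalog. *)
let toy_lookup (catalog : (string * string) list) : toy_resolver =
  fun id ->
    match List.assoc_opt id catalog with
    | Some text -> text
    | None -> raise Not_competent

(* Combination: ask each resolver in turn; Not_competent means
   "try the next one". If nobody is competent, the exception escapes. *)
let rec toy_combine (resolvers : toy_resolver list) : toy_resolver =
  fun id ->
    match resolvers with
    | [] -> raise Not_competent
    | r :: rest -> (try r id with Not_competent -> toy_combine rest id)

let () =
  let r =
    toy_combine
      [ toy_lookup [ ("http://example.org/a.xml", "<?xml version=\"1.0\"?><a/>") ];
        toy_lookup [ ("http://example.org/b.xml", "<?xml version=\"1.0\"?><b/>") ] ]
  in
  (* The first catalog is not competent for b.xml, so the second answers. *)
  print_endline (r "http://example.org/b.xml")
```

In this model the order of the list matters: if two catalogs list the same ID, the earlier one wins.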
val set_debug_mode : bool -> unit