(* $Id: pxp_document.mli,v 1.23 2001/12/03 23:46:29 gerd Exp $
 * ----------------------------------------------------------------------
 * PXP: The polymorphic XML parser for Objective Caml.
 * Copyright by Gerd Stolpmann. See LICENSE for details.
 *)

(**********************************************************************)
(*                                                                    *)
(* Pxp_document:                                                      *)
(*     Object model of the document/element instances                 *)
(*                                                                    *)
(**********************************************************************)

(* QUESTIONS:
 * - T_attribute of (string * att_value)
 *   may be better. Attributes do not have attributes (XPATH?)
 *)


(* ======================================================================
 * OVERVIEW
 *
 * class type node ............. The common class type of the nodes of
 *                               the element tree. Nodes are either
 *                               elements (inner nodes) or data nodes
 *                               (leaves)
 * class type extension ........ The minimal properties of the so-called
 *                               extensions of the nodes: Nodes can be
 *                               customized by applying a class parameter
 *                               that adds methods/values to nodes.
 * class data_impl : node ...... Implements data nodes.
 * class element_impl : node ... Implements element nodes
 * class document .............. A document is an element with some additional
 *                               properties
 *
 * ======================================================================
 *
 * THE STRUCTURE OF NODE TREES:
 *
 * Every node except the root node has a parent node. The parent node is
 * always an element, because data nodes never contain other nodes.
 * In the other direction, element nodes may have children; both elements
 * and data nodes are possible as children.
 * Every node knows its parent (if any) and all its children (if any);
 * the linkage is maintained in both directions. A node without a parent
 * is called a root.
 * A node cannot be the child of two nodes, neither of two different nodes
 * nor more than once of the same node.
 * You can break the connection between a node and its parent; the method
 * "delete" performs this operation and deletes the node from the parent's
 * list of children. The node is now a root, for itself and for all
 * subordinate nodes. In this context, the node is also called an orphan,
 * because it has lost its parent (this is a bit misleading because the
 * parent is not always the creator of a node).
 * In order to simplify complex operations, you can also set the list of
 * children of an element. Nodes that have been children before are unchanged;
 * new nodes are added (and the linkage is set up); nodes no longer occurring
 * in the list are handled as if they had been deleted.
 * If you try to add a node that is not a root (either by an "add" or by a
 * "set" operation) the operation fails.
 *
 * CREATION OF NODES
 *
 * The class interface supports creation of nodes by cloning a so-called
 * exemplar. The idea is that it is sometimes useful to implement different
 * element types by different classes, and to implement this by looking up
 * exemplars.
 * Imagine you have three element types A, B, and C, and three classes
 * a, b, and c implementing the node interface (for example, by providing
 * different extensions, see below). The XML parser can be configured to
 * have a lookup table
 *   { A --> a0,  B --> b0, C --> c0 }
 * where a0, b0, c0 are exemplars of the classes a, b, and c, i.e. empty
 * objects belonging to these classes. If the parser finds an instance of
 * A, it looks up the exemplar a0 of A and clones it (actually, the method
 * "create_element" performs this for elements, and "create_data" for data
 * nodes). Clones belong to the same class as the original nodes, so the
 * instances of the elements have the same classes as the configured
 * exemplars.
 * Note: This technique assumes that the interface of all exemplars is the
 * same!
 *
 * THE EXTENSION
 *
 * The class type node and all its implementations have a class parameter
 * 'ext which must at least fulfil the properties of the class type "extension".
 * The idea is that you can add properties, for example:
 *
 * class my_extension =
 *   object
 *     (* minimal properties required by class type "extension": *)
 *     method clone = ...
 *     method node = ...
 *     method set_node n = ...
 *     (* here my own methods: *)
 *     method do_this_and_that ...
 *   end
 *
 * class my_element_impl = [ my_extension ] element_impl
 * class my_data_impl    = [ my_extension ] data_impl
 *
 * The whole XML parser is parameterized with 'ext, so your extension is
 * visible everywhere (this is the reason why extensibility is solved by
 * parametric polymorphism and not by inclusive polymorphism (subtyping)).
 *
 *
 * SOME COMPLICATED TYPE EXPRESSIONS
 *
 * Sometimes the following type expressions turn out to be necessary:
 *
 * 'a node extension as 'a
 *      This is the type of an extension that belongs to a node that
 *      has an extension that is the same as we started with.
 *
 * 'a extension node as 'a
 *      This is the type of a node that has an extension that belongs to a
 *      node of the type we started with.
 *
 *
 * DOCUMENTS
 * ...
 *
 * ======================================================================
 *
 * SIMPLE USAGE: ...
 *)
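
(* ----------------------------------------------------------------------
 * EXAMPLE (sketch): a minimal extension class
 * ----------------------------------------------------------------------
 *
 * The skeleton shown under THE EXTENSION above leaves the method bodies
 * open. One possible way to fill them in is sketched here. The instance
 * variable node is illustrative only; the sketch assumes that this module
 * has been opened (open Pxp_document) and that the extension should simply
 * remember the node passed to set_node.
 *
 * class my_extension =
 *   object
 *     (* Assumption: the node is unset until set_node has been called *)
 *     val mutable node = (None : my_extension node option)
 *     method clone = {< >}
 *     method node =
 *       match node with
 *           None -> assert false
 *         | Some n -> n
 *     method set_node n = node <- Some n
 *   end
 *
 * Note that {< >} copies only the object itself, so instance variables
 * holding mutable structures would be shared between the original and the
 * clone; an extension with such state has to copy it explicitly in clone.
 *)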

(* ======================================================================
 * THE DYNAMIC MODIFICATION OF NODE TREES AND VALIDATION
 * ======================================================================
 *
 * The parser creates a node tree while parsing the input text, and the
 * node tree can be modified later by some transformation algorithm. For
 * both tasks the same interface may be used. However, PXP 1.0 introduced
 * an interface that did not separate the two aspects "modification of the
 * tree" and "validation of the tree", i.e. modification methods also
 * did some validation. The following two sections describe: The PXP 1.0
 * model, and the PXP 1.1 changes.
 *
 * -------
 * PXP 1.0
 * -------
 *
 * Method add_node: There are two different modes selected by the optional
 *   argument ~force. ~force:true simply adds the node as last child to the
 *   current node. However, ~force:false (the default) performs some validation
 *   checks that may have three results: (1) The node is added, (2) The node
 *   is silently dropped, (3) An error condition is detected, and an exception
 *   is raised. The mode ~force:false is used by the parser, and historically,
 *   add_node was designed as the parser's method of adding new nodes; ~force
 *   was added later.
 * 
 *   The checks are only performed if the added node is a text node (node type
 *   is T_data), and if the current element node has a type restricting the
 *   addition of text nodes. In detail, the following is checked:
 *    - If the element has type EMPTY, the addition of whitespace text is
 *      not rejected, but the text is dropped (case 2). The addition of
 *      other text material is an error (case 3).
 *    - If the element has a regexp type, the addition of whitespace text is
 *      not rejected, but the text is dropped (case 2); however, there is
 *      a special mode that forces such whitespace text nodes to be added
 *      (see below).
 *      The addition of other text material is an error (case 3).
 *      Furthermore, it is also an error if whitespace text is added, and the
 *      document is stand-alone, and the element is declared in an external
 *      entity.
 *
 * Method keep_always_whitespace_mode: turns on a special mode in which
 *   whitespace text nodes inside regexp-type elements are always added.
 *
 * Method internal_init (i.e. object creation): When an element node is
 *   created, the attribute list is passed as (string * string) list. This
 *   method compares this list with the declared attlist of the DTD, and
 *    - adds missing attributes if the DTD has a default value
 *    - rejects nondeclared attributes
 *    - checks whether required attributes are passed
 *    - parses and normalizes attribute values
 *    - checks some conditions for stand-alone documents
 *
 * Method local_validate: Checks whether the subnodes of the element match
 *   the type of the element.
 *
 * ---------------------
 * PROBLEMS WITH PXP 1.0
 * ---------------------
 *
 * - It is not very obvious when validation checks are performed (which
 *   methods do them and under which conditions)
 * - It is difficult to transform trees because the transformation algorithm
 *   might call a modification method that also performs some validation checks,
 *   but the tree is not yet valid because the algorithm is in the middle of
 *   the transformation
 *
 * -------
 * PXP 1.1
 * -------
 *
 * New method append_node: always adds the node to the node list (same as
 *   add_node ~force:true)
 *
 * New method classify_data_node: performs the checks of add_node ~force:false,
 *   and returns the result:
 *     - CD_other: The node to add is not a text node and cannot be classified
 *     - CD_normal: The text node can be added
 *     - CD_empty: The node is ignorable (= empty), and the containing
 *       element is declared as EMPTY. The parser must not add the node.
 *     - CD_ignorable: The node contains ignorable whitespace, and the parser
 *       should not add the node unless a special configuration forces the
 *       addition
 *     - CD_error: the rules do not allow the text node to be added here
 *
 * Method add_node: is now deprecated. For compatibility, the method
 *   classifies the node to add, and decides whether to add the node, not
 *   to add the node, or whether to raise an exception.
 *
 * Method keep_always_whitespace_mode: is removed. A new parser option 
 *   modifies the behaviour of the parser such that ignorable whitespace is
 *   added anyway (option drop_ignorable_whitespace = false).
 *
 * Object creation: You can pass attributes as (string * string) list, and
 *   as (string * att_value) list; internal_init simply processes both lists.
 *   Attributes passed as att_value are already normalized (and compatible
 *   with the stand-alone declaration, if any); the method does not normalize
 *   them again. Two options control validation:
 *     ~valcheck: (default true) It is checked that there
 *         is an element type declaration, or that the DTD is in well-formed
 *         mode. Passing 'false' means that it is not checked whether the
 *         element type exists, and that you can add any attributes.
 *
 * New method validate_contents: The new name for local_validate; it is
 *   checked whether the elements contained in the list of sub nodes match
 *   the declared content model. 
 * (The name local_validate is deprecated.)
 *
 * New method validate_attlist: Checks whether the attlist matches the
 *   ATTLIST declaration.
 *   (Impl.: Call create_element again with valcheck options ON.)
 *
 * New method validate: This method can be called after manual modifications
 *   of the tree to ensure that the changed tree is still valid:
 *    - All text subnodes must be classified as non-erroneous
 *    - All element subnodes are validated by validate_subelements
 *    - The attributes are validated by validate_attlist
 *   Note that this method is not used by the parser.
 *)

open Pxp_dtd


type node_type =
    T_element of string
  | T_data
  | T_super_root                        (* XPath calls them simply root nodes *)
  | T_pinstr of string                  (* The string is the target of the PI *)
  | T_comment
  | T_none
  | T_attribute of string          (* The string is the name of the attribute *)
  | T_namespace of string            (* The string is the namespace srcprefix *)
  (* <ID:type-node-type>
   * <TYPE:type>
   * <CALL>   [node_type]
   * <SIG>    AUTO
   * <DESCR>  This type enumerates the possible node types:
   *   - [T_element name]: The node is an element and has element type [name]
   *   - [T_data]: The node is a data node
   *   - [T_super_root]: The node is a super root node
   *   - [T_pinstr name]: The node contains a processing instruction with
   *     target [name]
   *   - [T_comment]: The node is a comment
   *   - [T_attribute name]: The node contains an attribute called [name]
   *   - [T_namespace prefix]: The node identifies a namespace for the
   *     [prefix]
   *   - [T_none]: This is a "bottom value" used if there is no reasonable
   *     type.
   *     --
   * </ID>
   *)

  (* About T_super_root, T_pinstr, T_comment: 
   * These types are extensions to my original design. They have mainly
   * been added to simplify the implementation of standards (such as
   * XPath) that require that nodes of these types are included into the
   * main document tree.
   * There are options (see Pxp_yacc) forcing the parser to insert such
   * nodes; in this case, the nodes are actually element nodes serving
   * as wrappers for the additional data structures. The options are:
   * enable_super_root_node, enable_pinstr_nodes, enable_comment_nodes.
   * By default, such nodes are not created.
   *)

  (* About T_attribute, T_namespace:
   * These types are fully virtual. This means that it is impossible
   * to make the parser insert such nodes into the regular tree. They are
   * normally created by special methods to allow additional views on the
   * document tree.
   *)


(* The result type of the method classify_data_node: *)
type data_node_classification =
    CD_normal
  | CD_other
  | CD_empty
  | CD_ignorable
  | CD_error of exn
  (* <ID:type-data-node-classification>
   * <CALL>   [data_node_classification]
   * <SIG>    AUTO
   * <DESCR>  This type enumerates the result values of the method
   *   [classify_data_node]. See the description of this method.
   * </ID>
   *)


(* QUESTION: Perhaps we should reexport att_value here. It is the only
 * type from Pxp_types that is needed regularly.
 *)


(* Regular definition: *)


class type [ 'node ] extension =
  object ('self)
    method clone : 'self
      (* "clone" should return an exact deep copy of the object. *)
    method node : 'node
      (* "node" returns the corresponding node of this extension. This method
       * is intended to return exactly what has previously been set by "set_node".
       *)
    method set_node : 'node -> unit
      (* "set_node" is invoked once the extension is associated to a new
       * node object.
       *)
  end
;;


class type [ 'ext ] node =
  (* <ID:type-node>
   * <CALL>   'ext [node]
   * <SIG>    class type 'ext node = object ... end
   * <DESCR>  This is the common class type of all classes representing 
   *    nodes.
   *
   *    Not all classes implement all methods. As the type system of O'Caml
   *    demands that there must be always a method definition for all
   *    methods of the type, methods will raise the exception
   *    [Method_not_applicable] if they are called on a class not supporting
   *    them. The exception [Namespace_method_not_applicable] is reserved
   *    for the special case that a namespace method is invoked on a 
   *    class that does not support namespaces.
   * <SEE> sig-class-type-node
   * </ID>
   *)

  object ('self)
    constraint 'ext = 'ext node #extension

    method extension : 'ext
      (* <ID:type-node-extension>
       * <TYPE:method>
       * <CALL>   obj # [extension]
       * <SIG>    AUTO
       * <DESCR>  Returns the extension object of the node object [obj].
       * <DOMAIN> Applicable to element, data, comment, processing instruction,
       *   and super root nodes.
       * </ID>
       *)

    method remove : unit -> unit
      (* <ID:type-node-remove>
       * <CALL>   obj # [remove] ()
       * <SIG>    AUTO
       * <DESCR>  Removes [obj] from the tree. After this
       *    operation, [obj] is no longer the child of the former parent node,
       *    i.e. it no longer occurs in the former parent's list of children,
       *    and the former parent is no longer the parent of [obj]. The node
       *    [obj] becomes orphaned.
       *
       *    If [obj] is already a root, [remove] does nothing.
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN>  Elements, comments, processing instructions, data nodes,
       *   super root nodes.
       * <SEE>     node-delete
       * </ID>
       *)

    method delete : unit
      (* DEPRECATED METHOD
       * remove() does exactly the same
       *)

    method remove_nodes : ?pos:int -> ?len:int -> unit -> unit
      (* <ID:type-node-remove-nodes>
       * <CALL>   obj # [remove_nodes] ~pos ~len ()
       * <SIG>    AUTO
       * <DESCR>  Removes the specified nodes from the list of children of
       *    [obj]. The method deletes the nodes from position [pos] to 
       *    [pos+len-1]. The optional argument [pos] defaults to 0. The 
       *    optional argument [len] defaults to the length of the children
       *    list.
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN> Elements.
       * </ID>
       *)

    method parent : 'ext node
      (* <ID:type-node-parent>
       * <CALL>   obj # [parent]
       * <SIG>    AUTO
       * <DESCR>  Get the parent node, or raise [Not_found] if this node is
       *   a root node. For attribute and namespace nodes, the parent is
       *   artificially defined as the element to which these nodes apply.
       * <DOMAIN> All node types.
       * </ID>
       *)

    method root : 'ext node
      (* <ID:type-node-root>
       * <CALL>   obj # [root]
       * <SIG>    AUTO
       * <DESCR>  Gets the root node of the tree.
       *   Every node is contained in a tree with a root, so this method always 
       *   succeeds. Note that this method searches the root,
       *   which costs time proportional to the length of the path to the root.
       * <DOMAIN> All node types.
       * </ID>
       *)

    method orphaned_clone : 'self
      (* <ID:type-node-orphaned-clone>
       * <CALL>   obj # [orphaned_clone]
       * <SIG>    AUTO
       * <DESCR>  Returns a clone of the node and the complete tree below
       *    this node (deep clone). The clone does not have a parent (i.e. the
       *    reference to the parent node is not cloned). Strings are not
       *    copied while the subtree is cloned; normally the original tree and
       *    the copy share strings. Extension objects are cloned by invoking
       *    the [clone] method on the original objects; how much of
       *    the extension objects is cloned depends on the implementation of
       *    this method.
       * <DOMAIN> All node types.
       * <SEE> node-clone
       * </ID>
       *)

    method orphaned_flat_clone : 'self
      (* <ID:type-node-orphaned-flat-clone>
       * <CALL>   obj # [orphaned_flat_clone]
       * <SIG>    AUTO
       * <DESCR>  Returns a clone of this element where all subnodes are omitted.
       *     The type of the node, and the attributes are the same as in the
       *     original node. The clone has no parent.
       * <DOMAIN> All node types.
       * </ID>
       *)

    method append_node : 'ext node -> unit
      (* <ID:type-node-append-node>
       * <CALL>   obj # [append_node] n
       * <SIG>    AUTO
       * <DESCR>  Adds the node [n] to the list of children of [obj]. The 
       *   method expects that [n] is a root, and it requires that [n] and
       *   [obj] share the same DTD.
       *
       *   Note: This method does not check whether the modified XML tree
       *   is still valid.
       * <DOMAIN> This method is only applicable to element nodes.
       * <SEE> node-add
       * </ID>
       *)

    method classify_data_node : 'ext node -> data_node_classification
      (* <ID:type-node-classify-data-node>
       * <CALL>   obj # [classify_data_node] n
       * <SIG>    AUTO
       * <DESCR>  Classifies the passed data node [n], and returns whether it
       *      is reasonable to append the data node to the list of subnodes
       *     (using [append_node]). The following return values are possible:
       *      - [CD_normal]: Adding [n] does not violate any validation 
       *        constraint
       *      - [CD_other]: [n] is not a data node
       *      - [CD_empty]: The element [obj] is declared as [EMPTY], and
       *        [n] contains the empty string. It is allowed to append
       *        [n] but it does not make sense
       *      - [CD_ignorable]: The element [obj] is declared such that
       *        it is forbidden to put character data into it. However,
       *        the node [n] only contains white space which is allowed
       *        as an exception to this rule. This means that it is allowed
       *        to append [n] but [n] would not contain any information
       *        except formatting hints.
       *      - [CD_error e]: It is an error to append [n]. The exception
       *        [e], usually a [Validation_error], contains details about
       *        the problem.
       *      --
       * Note that the method always returns and never raises an exception.
       * <DOMAIN> Elements.
       * </ID>
       *)
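
      (* Sketch (not part of the interface): how a caller might combine
       * [classify_data_node] and [append_node] to obtain the behaviour
       * described for the deprecated add_node ~force:false. The function
       * name add_checked and the flag keep_ignorable are hypothetical.
       *
       * let add_checked ~keep_ignorable (obj : 'ext node) (n : 'ext node) =
       *   match obj # classify_data_node n with
       *       CD_normal | CD_other -> obj # append_node n
       *     | CD_empty -> ()
       *     | CD_ignorable -> if keep_ignorable then obj # append_node n
       *     | CD_error e -> raise e
       *)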

    method add_node : ?force:bool -> 'ext node -> unit
      (* add_node is now DEPRECATED; use append_node instead! *)
      (* Append new sub nodes -- mainly used by the parser itself, but
       * of course open for everybody. If an element is added, it must be
       * an orphan (i.e. does not have a parent node); and after addition
       * *this* node is the new parent.
       * The method performs some basic validation checks if the current node
       * has a regular expression as content model, or is EMPTY. You can
       * turn these checks off by passing ~force:true to the method.
       *)

    method insert_nodes : ?pos:int -> 'ext node list -> unit
      (* <ID:type-node-insert-nodes>
       * <CALL>   obj # [insert_nodes] ~pos nl
       * <SIG>    AUTO
       * <DESCR>  Inserts the list of nodes [nl] in-place into the list of
       *    children of [obj]. The insertion is performed at position [pos],
       *    i.e. in the modified list of children, the first element of
       *    [nl] will have position [pos]. If the optional argument [pos]
       *    is not passed to the method, the list [nl] is appended
       *    to the list of children. 
       *
       *    The method requires that all elements of
       *    the list [nl] are roots, and that all elements and [obj]
       *    share the same DTD.
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN>  Elements.
       * </ID>
       *)

    method set_nodes : 'ext node list -> unit
      (* <ID:type-node-set-nodes>
       * <CALL>   obj # [set_nodes] l
       * <SIG>    AUTO
       * <DESCR>  Sets the list of children to [l]. It is required that
       *     every member of [l] is either a root or was already a child
       *     of this node before the method call, and it is required that 
       *     all members and the current object share the same DTD.
       *
       *     Former children which are not members of [l] are removed from
       *     the tree and get orphaned (see method [remove]).
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN>  Elements.
       * </ID>
       *)
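
      (* Sketch: because [set_nodes] accepts nodes that already are children
       * of [obj], it can be used to reorder children in one step. Assuming
       * obj is an element node, the following reverses its children:
       *
       * obj # set_nodes (List.rev (obj # sub_nodes))
       *)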

    method add_pinstr : proc_instruction -> unit
      (* <ID:type-node-add-pinstr>
       * <CALL>   obj # [add_pinstr] pi
       * <SIG>    AUTO
       * <DESCR>  Adds the processing instruction [pi] to the set of
       *   processing instructions contained in [obj]. If [obj] is an
       *   element node, you can add any number of processing instructions.
       *   If [obj] is a processing instruction node, you can put at most
       *   one processing instruction into this node.
       * <DOMAIN> Elements, and processing instruction nodes.
       * </ID>
       *)

    method pinstr : string -> proc_instruction list
      (* <ID:type-node-pinstr>
       * <CALL>   obj # [pinstr] n
       * <SIG>    AUTO
       * <DESCR>  Returns all processing instructions that are
       *   directly contained in [obj] and that have a target
       *   specification of [n].
       * <DOMAIN> All node types. However, this method is only reasonable
       *   for processing instruction nodes, and for elements; for all
       *   other node types the method will return the empty list. Note
       *   that the parser can be configured such that it creates 
       *   processing instruction nodes or not; in the first case, only
       *   the processing instruction nodes contain processing instructions;
       *   in the latter case, only the elements embracing the instructions
       *   contain them.
       * </ID>
       *)

    method pinstr_names : string list
      (* <ID:type-node-pinstr-names>
       * <CALL>   obj # [pinstr_names]
       * <SIG>    AUTO
       * <DESCR>  Returns the targets of all processing instructions that are
       *   directly contained in [obj].
       * <DOMAIN> All node types. However, this method is only reasonable
       *   for processing instruction nodes, and for elements; for all
       *   other node types the method will return the empty list. Note
       *   that the parser can be configured such that it creates 
       *   processing instruction nodes or not; in the first case, only
       *   the processing instruction nodes contain processing instructions;
       *   in the latter case, only the elements embracing the instructions
       *   contain them.
       * </ID>
       *)

    method node_position : int
      (* <ID:type-node-node-position>
       * <CALL>   obj # [node_position]
       * <SIG>    AUTO
       * <DESCR>  Returns the position of [obj] among all children of the parent
       *   node. Positions are counted from 0. There are several cases:
       *    - The regular nodes get positions from 0 to l-1 where l is the
       *      length of the list of regular children.
       *    - Attribute nodes and namespace nodes are irregular nodes, 
       *      which means here that their positions are counted separately.
       *      All attribute nodes have positions from 0 to m-1; all namespace
       *      nodes have positions from 0 to n-1.
       *    - If [obj] is a root, this method raises [Not_found]
       *      --
       * <DOMAIN> All node types.
       * </ID>
       *)

    method node_path : int list
      (* <ID:type-node-node-path>
       * <CALL>  obj # [node_path]
       * <SIG>    AUTO
       * <DESCR> Returns the list of node positions describing
       *   the location of this node in the whole tree. The list describes 
       *   the path from the root node down to this node; the first path
       *   element is the index of the child of the root, the second path
       *   element is the index of the child of the child, and so on, and
       *   the last path element is the index of this node. The method returns 
       *   [[[]]] if this node is the root node.
       *
       *   Attribute and namespace nodes are not part of the regular tree, so 
       *   there is a special rule for them. Attribute nodes of an element 
       *   node [x] have the node path [[x # node_path @ [-1; p]]] where 
       *   [p] is the position of the attribute node. Namespace nodes of an 
       *   element node [x] have the node path [[x # node_path @ [-2; p]]] 
       *   where [p] is the position of the namespace node.
       *   (This definition respects the document order.)
       * <DOMAIN> All node types.
       * </ID>
       *)
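
      (* Sketch: for regular nodes, a node path can be followed back down
       * from the root with [nth_node]. The function name resolve_path is
       * hypothetical; negative path elements (attribute and namespace
       * nodes) are not handled.
       *
       * let resolve_path (root : 'ext node) (path : int list) : 'ext node =
       *   List.fold_left (fun (n : 'ext node) i -> n # nth_node i) root path
       *
       * Under these assumptions, resolve_path (n # root) (n # node_path)
       * is expected to return n itself for every regular node n.
       *)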

    method sub_nodes : 'ext node list
      (* <ID:type-node-sub-nodes>
       * <CALL>   obj # [sub_nodes]
       * <SIG>    AUTO
       * <DESCR>  Returns the regular children of the node as list. Only
       *   elements, data nodes, comments, and processing instructions can
       *   occur in this list; attributes and namespace nodes are not
       *   considered as regular nodes, and super root nodes can only
       *   be root nodes and will never be children of another node.
       *   The returned list is always empty if [obj] is a data node,
       *   comment, processing instruction, attribute, or namespace.
       * <DOMAIN> All node types.
       * </ID>
       *)

    method iter_nodes : ('ext node -> unit) -> unit
      (* <ID:type-node-iter-nodes>
       * <CALL>   obj # [iter_nodes] f
       * <SIG>    AUTO
       * <DESCR>  Iterates over the regular children of [obj], and
       *   calls the function [f] for every child ch: [f ch]. The
       *   regular children are the nodes returned by [sub_nodes], see
       *   there for an explanation.
       * <DOMAIN> All node types.
       * <SEE>    document-iterators
       * </ID>
       *)
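
      (* Sketch: a recursive traversal built on [iter_nodes]. The function
       * name count_elements is hypothetical; it counts the element nodes
       * in the subtree starting at n, including n itself.
       *
       * let rec count_elements (n : 'ext node) : int =
       *   let count = ref 0 in
       *   (match n # node_type with
       *        T_element _ -> incr count
       *      | _ -> ());
       *   n # iter_nodes (fun ch -> count := !count + count_elements ch);
       *   !count
       *)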

    method iter_nodes_sibl :
      ('ext node option -> 'ext node -> 'ext node option -> unit) -> unit
      (* <ID:type-node-iter-nodes-sibl>
       * <CALL>   obj # [iter_nodes_sibl] f
       * <SIG>    AUTO
       * <DESCR>  Iterates over the regular children of [obj], and
       *   calls the function [f] for every child: [f pred ch succ].
       *   - [ch] is the child
       *   - [pred] is [None] if the child is the first in the list,
       *     and [Some p] otherwise; [p] is the predecessor of [ch]
       *   - [succ] is [None] if the child is the last in the list,
       *     and [Some s] otherwise; [s] is the successor of [ch]
       *     --
       *   The
       *   regular children are the nodes returned by [sub_nodes], see
       *   there for an explanation.
       * <DOMAIN> All node types.
       * <SEE>    document-iterators
       * </ID>
       *)

    method nth_node : int -> 'ext node
      (* <ID:type-node-nth-node>
       * <CALL>   obj # [nth_node] n
       * <SIG>    AUTO
       * <DESCR>  Returns the n-th regular child of [obj], [n >= 0].
       *    Raises [Not_found] if the index [n] is out of the valid range.
       * <DOMAIN> All node types.
       * </ID>
       *)

    method previous_node : 'ext node
      (* <ID:type-node-previous-node>
       * <CALL>   obj # [previous_node]
       * <SIG>    AUTO
       * <DESCR>  Returns the predecessor of [obj]
       *   in the list of regular children of the parent, or raises [Not_found]
       *   if this node is the first child. This is equivalent to
       *   [obj # parent # nth_node (obj # node_position - 1)].
       * <DOMAIN> All node types.
       * </ID>
       *)

    method next_node : 'ext node
      (* <ID:type-node-next-node>
       * <CALL>   obj # [next_node]
       * <SIG>    AUTO
       * <DESCR>  Returns the successor of [obj]
       *   in the list of regular children of the parent, or raises [Not_found]
       *   if this node is the last child. This is equivalent to
       *   [obj # parent # nth_node (obj # node_position + 1)].
       * <DOMAIN> All node types.
       * </ID>
       *)

    method data : string
      (* <ID:type-node-data>
       * <CALL>   obj # [data]
       * <SIG>    AUTO
       * <DESCR>  This method returns what is considered as
       *   the data of the node which depends on the node type:
       *    - Data nodes: the method returns the character string the node 
       *      represents
       *    - Element nodes, super root nodes: the method returns the
       *      concatenated character strings of all (direct or indirect)
       *      data nodes below [obj]
       *    - Comment nodes: the method returns the
       *      comment string (without delimiters), or it raises [Not_found] if the
       *      comment string is not set
       *    - Processing instructions: the
       *      method returns the data part of the instruction, or "" if the data
       *      part is missing
       *    - Attribute nodes: the method returns the attribute
       *      value as string, or it raises [Not_found] if the attribute
       *      is implied.
       *    - Namespace nodes: the method returns the namespace
       *      URI
       *      --
       * <DOMAIN> All node types.
       * </ID>
       *)

    method set_data : string -> unit
      (* <ID:type-node-set-data>
       * <CALL>   obj # [set_data] s
       * <SIG>    AUTO
       * <DESCR>  This method sets the character string contained in 
       *   data nodes.
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN> Data nodes.
       * </ID>
       *)

    method node_type : node_type
      (* <ID:type-node-node-type>
       * <CALL>   obj # [node_type]
       * <SIG>    AUTO
       * <DESCR>  Returns the type of [obj]:
       *   - [T_element t]: The node is an element with type [t]
       *   - [T_data]: The node is a data node
       *   - [T_comment]: The node is a comment node
       *   - [T_pinstr n]: The node is a processing instruction with
       *     target [n]
       *   - [T_super_root]: The node is a super root node
       *   - [T_attribute n]: The node is an attribute with name [n]
       *   - [T_namespace p]: The node is a namespace with prefix [p]
       *     --
       * <DOMAIN> All node types.
       * </ID>
       * XXX: <SEE> Where attribute and namespace nodes are discussed
       *)

    method position : (string * int * int)
      (* <ID:type-node-position>
       * <CALL>   obj # [position]
       * <SIG>    AUTO
       * <DESCR>  Returns a triple [(entity,line,pos)] describing the 
       *   location of the element in the original XML text. This triple is
       *   only available for elements, and only if the parser has been
       *   configured to store positions (see parser option
       *   [store_element_positions]). If available, [entity] describes 
       *   the entity where the element occurred, [line] is the line number
       *   [>= 1], and [pos] is the byte position of the first character
       *   of the element in the line. 
       *
       *   If unavailable, the method will return the triple [("?",0,0)].
       * <DOMAIN> All node types. Note that the method will always return
       *   [("?",0,0)] for non-element nodes.
       * </ID>
       *)

    method attribute : string -> Pxp_types.att_value
      (* <ID:type-node-attribute>
       * <CALL>   obj # [attribute] name
       * <SIG>    AUTO
       * <DESCR>  Returns the value of the attribute [name].
       *
       *   If the parser is in validating mode, the method is able to return
       *   values for declared attributes, and it raises [Not_found] for any 
       *   undeclared attribute. Note that it even returns a value if the
       *   attribute is actually missing but is declared as [#IMPLIED] or 
       *   has a default value. 
       *
       *   If the parser (more precisely, the DTD object) is in 
       *   well-formedness mode, the method is able to return values for 
       *   defined attributes, and it raises [Not_found] for any
       *   unknown attribute name.
       *
       *   Possible return values are:
       *     -  [Implied_value]: The attribute has been declared with the
       *        keyword [#IMPLIED], and the attribute definition is missing
       *        in the attribute list of the element.
       *     -  [Value s]: The attribute has been declared as type [CDATA], 
       *        as [ID], as [IDREF], as [ENTITY], or as [NMTOKEN], or as 
       *        enumeration or notation, and one of the two conditions holds: 
       *        (1) The attribute value is defined in the attribute list in
       *        which case this value is returned in the string [s]. (2) The
       *        attribute has been omitted, and the DTD declares the attribute
       *        with a default value. The default value is returned in [s]. 
       *        
       *        Summarized, [Value s] is returned for non-implied, non-list 
       *        attribute values.
       *
       *        Furthermore, [Value s] is returned for non-declared attributes
       *        if the DTD object allows this, for instance, if the DTD
       *        object specifies well-formedness mode.
       *     -  [Valuelist l]: The attribute has been declared as type
       *        [IDREFS], as [ENTITIES], or [NMTOKENS], and one of the two
       *        conditions holds: (1) The attribute value is defined in the 
       *        attribute list in which case the space-separated tokens of
       *        the value are returned in the string list [l]. (2) The
       *        attribute has been omitted, and the DTD declares the attribute 
       *        with a default value. The default value is returned in [l]. 
       *
       *        Summarized, [Valuelist l] is returned for all list-type
       *        attribute values.
       *        --
       *   Note that before the attribute value is returned, the value is
       *   normalized. This means that newlines are converted to spaces, and
       *   that references to character entities (i.e. [&#n;]) and
       *   general entities (i.e. [&name;]) are expanded; if necessary, 
       *   the expansion is performed recursively.
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return values, all other node types always raise [Not_found].
       * </ID>
       *)
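
      (* EXAMPLE (illustrative sketch, not part of the interface):
       *   A caller typically dispatches on the three constructors of
       *   [Pxp_types.att_value] and catches [Not_found] for unknown
       *   attribute names. The helper name [show_attribute] and the
       *   placeholder result strings are hypothetical:
       *
       *   let show_attribute (n : 'ext node) name =
       *     try
       *       match n # attribute name with
       *         | Pxp_types.Value s       -> s
       *         | Pxp_types.Valuelist l   -> String.concat " " l
       *         | Pxp_types.Implied_value -> "(implied)"
       *     with Not_found -> "(unknown)"
       *)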


    method attribute_names : string list
      (* <ID:type-node-attribute-names>
       * <CALL>    obj # [attribute_names]
       * <SIG>    AUTO
       * <DESCR>   Returns the list of all attribute names of this element.
       *   In validating mode, this list is simply the list of declared
       *   attributes. In well-formedness mode, this list is the list of
       *   defined attributes.
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return a non-empty list, all other node types always return
       *   the empty list.
       * </ID>
       *)

    method attribute_type : string -> Pxp_types.att_type
      (* <ID:type-node-attribute-type>
       * <CALL>   obj # [attribute_type] name
       * <SIG>    AUTO
       * <DESCR>  Returns the type of the attribute [name]. If the attribute
       *   is declared, the declared type is returned. If the attribute is
       *   defined but undeclared, the type [A_cdata] will be returned.
       *   (The module [Pxp_types] contains the Caml type of attribute types.)
       *   This method raises [Not_found] if the attribute is unknown.
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return values, all other node types always raise [Not_found].
       * </ID>
       *)

    method attributes : (string * Pxp_types.att_value) list
      (* <ID:type-node-attributes>
       * <CALL>   obj # [attributes]
       * <SIG>    AUTO
       * <DESCR>  Returns the list of [(name,value)] pairs describing
       *    all attributes (declared attributes plus defined attributes).
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return non-empty values, all other node types always
       *   return the empty list.
       * </ID>
       *)

    method required_string_attribute : string -> string
      (* <ID:type-node-required-string-attribute>
       * <CALL>    obj # [required_string_attribute] name
       * <SIG>    AUTO
       * <DESCR>   Returns the value of the attribute [name] as string,
       *    i.e. if the value of the attribute is [Value s], this method
       *    will return simply [s], and if the value is [Valuelist l],
       *    this method will return the elements of [l] separated by
       *    spaces. If the attribute value is [Implied_value], the method
       *    will fail.
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return values, all other node types always fail.
       * </ID>
       *)

    method required_list_attribute : string -> string list
      (* <ID:type-node-required-list-attribute>
       * <CALL>    obj # [required_list_attribute] name
       * <SIG>    AUTO
       * <DESCR>   Returns the value of the attribute [name] as string list,
       *    i.e. if the value of the attribute is [Valuelist l], this method
       *    will return simply [l], and if the value is [Value s],
       *    this method will return the one-element list [[[s]]].
       *    If the attribute value is [Implied_value], the method
       *    will fail.
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return values, all other node types always fail.
       * </ID>
       *)

    method optional_string_attribute : string -> string option
      (* <ID:type-node-optional-string-attribute>
       * <CALL>    obj # [optional_string_attribute] name
       * <SIG>    AUTO
       * <DESCR>   Returns the value of the attribute [name] as optional string,
       *    i.e. if the value of the attribute is [Value s], this method
       *    will return [Some s], and if the value is [Valuelist l],
       *    this method will return [Some s] where [s] consists of the
       *    concatenated elements of [l] separated by spaces. If the
       *    attribute value is [Implied_value], the method will return [None].
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return [Some] values, all other node types always return [None].
       * </ID>
       *)

    method optional_list_attribute : string -> string list
      (* <ID:type-node-optional-list-attribute>
       * <CALL>    obj # [optional_list_attribute] name
       * <SIG>    AUTO
       * <DESCR>   Returns the value of the attribute [name] as string list,
       *    i.e. if the value of the attribute is [Valuelist l], this method
       *    will return simply [l], and if the value is [Value s],
       *    this method will return the one-element list [[[s]]].
       *    If the attribute value is [Implied_value], the method
       *    will return the empty list [[[]]].
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return non-empty values, all other node types always
       *   return the empty list.
       * </ID>
       *)

    method id_attribute_name : string
      (* <ID:type-node-id-attribute-name>
       * <CALL>   obj # [id_attribute_name]
       * <SIG>    AUTO
       * <DESCR>  Returns the name of the (at most one) attribute being
       *    declared as type [ID]. The method raises [Not_found] if there 
       *    is no declared [ID] attribute for the element type.
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return names, all other node types always raise [Not_found].
       * </ID>
       *)

    method id_attribute_value : string
      (* <ID:type-node-id-attribute-value>
       * <CALL>   obj # [id_attribute_value]
       * <SIG>    AUTO
       * <DESCR>  Returns the string value of the (at most one) attribute being
       *    declared as type [ID]. The method raises [Not_found] if there 
       *    is no declared [ID] attribute for the element type.
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return values, all other node types always raise [Not_found].
       * </ID>
       *)

    method idref_attribute_names : string list
      (* <ID:type-node-idref-attribute-names>
       * <CALL>   obj # [idref_attribute_names]
       * <SIG>    AUTO
       * <DESCR>  Returns the names of the attributes being
       *    declared as type [IDREF] or [IDREFS]. 
       * <DOMAIN> All node types. However, only elements and attribute nodes
       *   will return names, all other node types always return the empty
       *   list.
       * </ID>
       *)

    method quick_set_attributes : (string * Pxp_types.att_value) list -> unit
      (* DEPRECATED METHOD! set_attributes does exactly the same. *)

    method set_attributes : (string * Pxp_types.att_value) list -> unit
      (* <ID:type-node-set-attributes>
       * <CALL>   obj # [set_attributes] al
       * <SIG>    AUTO
       * <DESCR>  Sets the attributes of this element to [al].
       *
       *    Note that this method does not add missing attributes that are
       *    declared in the DTD. It also never rejects undeclared attributes.
       *    The passed values are not checked.
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN>   Elements.
       * </ID>
       *)

    method set_attribute : ?force:bool -> string -> Pxp_types.att_value -> unit
      (* <ID:type-node-set-attribute>
       * <CALL>   obj # [set_attribute] ~force n v
       * <SIG>    AUTO
       * <DESCR>  Sets the attribute [n] of this element to the value [v].
       *    By default, the attribute [n] must already have some value.
       *    If you pass [~force:true], the attribute is added
       *    to the attribute list if it is missing.
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN>   Elements.
       * </ID>
       *)
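
      (* EXAMPLE (illustrative sketch, not part of the interface):
       *   Adding an attribute that may not yet be present requires
       *   [~force:true]. The attribute name and value below are
       *   hypothetical. As noted above, validity is not checked:
       *
       *   let add_lang (elem : 'ext node) =
       *     elem # set_attribute ~force:true "lang" (Pxp_types.Value "en")
       *)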

    method reset_attribute : string -> unit
      (* <ID:type-node-reset-attribute>
       * <CALL>   obj # [reset_attribute] n
       * <SIG>    AUTO
       * <DESCR>  If the attribute [n] is a declared attribute, it is set
       *   to its default value, or to [Implied_value] if there is no default 
       *   (the latter is performed even if the attribute is [#REQUIRED]).
       *   If the attribute is an undeclared attribute, it is removed
       *   from the attribute list.
       *
       *   The idea of this method is to simulate what would have happened
       *   if [n] had not been defined in the attribute list of the XML element.
       *   In validating mode, the parser would have chosen the default
       *   value if possible, or [Implied_value] otherwise, and in 
       *   well-formedness mode, the attribute would be simply missing
       *   in the attribute list.
       *
       *   Note: It is intentionally not possible to remove a declared
       *   attribute with this method. (You can still remove it by calling
       *   [set_attributes], but this would be very inefficient.)
       *
       *    Note: This method does not check whether the modified XML tree
       *    is still valid.
       * <DOMAIN>   Elements.
       * </ID>
       *)


    method attributes_as_nodes : 'ext node list
      (* <ID:type-node-attributes-as-nodes>
       * <CALL>   obj # [attributes_as_nodes]
       * <SIG>    AUTO
       * <DESCR>  Returns all attributes (i.e. declared plus defined
       *   attributes) as a list of attribute nodes with node type 
       *   [T_attribute name]. 
       *
       *   This method should be used if it is required for typing reasons
       *   that the attributes also have type [node]. A common example
       *   is a set that may contain both elements and attributes, as
       *   used in the XPath language.
       *
       *   The attribute nodes are read-only; any call to a method
       *   modifying their contents will raise [Method_not_applicable].
       *   In order to get the value of such an attribute node [anode],
       *   one can invoke the method [attribute]:
       *
       *   [anode # attribute name]
       *
       *   where [name] is the name of the attribute represented by
       *   [anode]. This will return the attribute value as [att_value]. Of
       *   course, the other attribute observers can be applied as well. 
       *   Furthermore, the method [data] will return the attribute value as
       *   string. However, every attribute node only contains the value of the
       *   one attribute it represents, and it does not make sense to pass
       *   names of other attributes to the observer methods.
       *   
       *   The attribute nodes live outside of the regular XML tree, and
       *   they are not considered as children of the element node. However,
       *   the element node is the parent node of the attribute nodes 
       *   (i.e. the children/parent relationship is asymmetric).
       *
       *   The method [attributes_as_nodes] computes the list of attribute
       *   nodes when it is first invoked, and it will return the same list
       *   again in subsequent invocations.
       * <DOMAIN>  This method is only applicable to elements.
       * </ID>
       *)

    method set_comment : string option -> unit
      (* <ID:type-node-set-comment>
       * <CALL>   obj # [set_comment] c
       * <SIG>    AUTO
       * <DESCR>  Sets the comment string contained in comment nodes, if
       *    [c = Some s]. Otherwise, this method removes the comment string
       *    ([c = None]).
       *
       *    Note that the comment string must not include the delimiters
       *    [<!--] and [-->]. Furthermore, it must not contain any character
       *    or character sequence that is forbidden in comments, such
       *    as ["--"]. However, this method does not check this condition.
       * <DOMAIN>  Comment nodes.
       * </ID>
       *)

    method comment : string option
      (* <ID:type-node-comment>
       * <CALL>   obj # [comment]
       * <SIG>    AUTO
       * <DESCR>  Returns [Some text] if the node is a comment node and if
       *    [text] is the comment string (without the delimiters [<!--] and
       *    [-->]). Otherwise, [None] is passed back.
       *
       *    Note: The [data] method also returns the comment string, but it
       *    raises [Not_found] if the string is not available.
       * <DOMAIN> All node types. Note that the method will always return
       *    [None] for non-comment nodes.
       * </ID>
       *)

    method normprefix : string
      (* <ID:type-node-normprefix>
       * <CALL>   obj # [normprefix]
       * <SIG>    AUTO
       * <DESCR>  For namespace-aware implementations of the node class, this
       *     method returns the normalized prefix of the element or attribute.
       *     If the object does not have a prefix, "" will be passed back.
       *
       *     The normalized prefix is the part of the name before the 
       *     colon. It is normalized because the parser ensures that every
       *     prefix corresponds only to one namespace. Note that the
       *     prefix can be different than in the parsed XML source because
       *     the normalization step needs to change the prefix to avoid
       *     prefix conflicts.
       * <DOMAIN> Elements and attributes supporting namespaces.
       * </ID>
       *)

    method localname : string
      (* <ID:type-node-localname>
       * <CALL>   obj # [localname]
       * <SIG>    AUTO
       * <DESCR>  For namespace-aware implementations of the node class, this
       *     method returns the local part of the name of the element or
       *     attribute.
       *
       *     The local name is the part of the name after the colon, or
       *     the whole name if there is no colon.
       * <DOMAIN> Elements and attributes supporting namespaces.
       * </ID>
       *)

    method namespace_uri : string
      (* <ID:type-node-namespace-uri>
       * <CALL>   obj # [namespace_uri]
       * <SIG>    AUTO
       * <DESCR>  For namespace-aware implementations of the node class, this
       *     method returns the namespace URI of the element, attribute or
       *     namespace. It is required that a namespace manager is available.
       *
       *     If the node does not have a namespace prefix, and there is no
       *     default namespace, this method returns "".
       *
       *     The namespace URI is the unique name of the namespace.
       * <DOMAIN> Elements and attributes supporting namespaces; furthermore
       *     namespace nodes.
       * </ID>
       *)

    method namespace_manager : namespace_manager
      (* <ID:type-node-namespace-manager>
       * <CALL>   obj # [namespace_manager]
       * <SIG>    AUTO
       * <DESCR>  For namespace-aware implementations of the node class,
       *      this method returns the namespace manager. If the namespace
       *      manager has not been set, the exception [Not_found] is raised.
       *
       *      The namespace manager is an object that holds the mapping
       *      from namespace prefixes to namespace URIs, and vice versa.
       *      It is contained in the DTD.
       * <DOMAIN> Elements and attributes supporting namespaces; furthermore
       *     namespace nodes.
       * </ID>
       *)

    method namespace_info : 'ext namespace_info
      (* <ID:type-node-namespace-info>
       * <CALL>   obj # [namespace_info]
       * <SIG>    AUTO
       * <DESCR>  Returns additional information about the namespace prefixes
       *     in the parsed XML source. This method has been added for
       *     better XPath conformance. Note that it is still experimental
       *     and it is likely that it will be changed.
       *
       *     This record is only available if the parser has been configured
       *     to support namespaces, and if the parser has been configured
       *     to set this record (requires a lot of memory). Furthermore, only
       *     the implementation namespace_element_impl supports this method.
       *
       *     This method raises [Not_found] if the [namespace_info] field has not
       *     been set.
       * <DOMAIN> Elements supporting namespaces.
       * </ID>
       *)

    method dtd : dtd
      (* <ID:type-node-dtd>
       * <CALL>   obj # [dtd]
       * <SIG>    AUTO
       * <DESCR>  Returns the DTD.
       * <DOMAIN> All node types. Note (1) that exemplars need not have
       *   an associated DTD, in which case this method fails. (2) Even
       *   in well-formedness mode every node has a DTD object;
       *   this object specifies well-formedness mode.
       * </ID>
       *)

    method encoding : Pxp_types.rep_encoding
      (* <ID:type-node-encoding>
       * <CALL>   obj # [encoding]
       * <SIG>    AUTO
       * <DESCR>  Returns the encoding, which is always the same as the encoding of 
       *   the DTD. See also method [dtd]. (Note: This method fails, too, if
       *   no DTD is present.)
       * <DOMAIN> All node types. Note that exemplars need not have
       *   an associated DTD, in which case this method fails.
       * </ID>
       *)

    method create_element :
             ?name_pool_for_attribute_values:Pxp_types.pool ->
             ?position:(string * int * int) ->
	     ?valcheck:bool ->      (* default: true *)
	     ?att_values:((string * Pxp_types.att_value) list) ->
             dtd -> node_type -> (string * string) list -> 'ext node
      (* <ID:type-node-create-element>
       * <CALL>   obj # [create_element] ~name_pool_for_attribute_values ~position ~valcheck ~att_values 
       *          dtd ntype att_list
       * <SIG>    AUTO
       * <DESCR>  Returns a flat copy of this element node with the following
       *    modifications: 
       *     - The DTD is set to [dtd]
       *     - The node type is set to [ntype] (which must be [T_element name])
       *     - The attribute list is set to the concatenation of 
       *       [att_list] and [att_values]; [att_list] passes attribute values
       *       as strings while [att_values] passes attribute values as
       *       type [att_value]
       *     - The copy has neither children nor a parent
       *     - The copy does not contain processing instructions.
       *     - The position triple is set to [position]
       *       --
       *   Note that the extension object is copied, too.
       *
       *   If [valcheck = true] (the default), it is checked whether the 
       *   element type exists and whether the passed attributes match the
       *   declared attribute list. Missing attributes are automatically
       *   added, if possible. If [valcheck = false], any element type
       *   and any attributes are accepted.
       *
       *   If a [name_pool_for_attribute_values] is passed, the attribute
       *   values in [att_list] are put into this pool.
       *
       *   The optional arguments have the following defaults:
       *    - [~name_pool_for_attribute_values]: No pool is used
       *    - [~position]: The position is not available in the copy
       *    - [~valcheck]: true
       *    - [~att_values]: empty
       *      --
       * <DOMAIN> Elements.
       * <SEE> type-node-ex-create-element
       * </ID>
       *)

    method create_data : dtd -> string -> 'ext node
      (* <ID:type-node-create-data>
       * <CALL>   obj # [create_data] dtd cdata
       * <SIG>    AUTO
       * <DESCR>  Returns a flat copy of this data node with the following
       *    modifications: 
       *     - The DTD is set to [dtd]
       *     - The character string is set to [cdata]
       *       --
       *   Note that the extension object is copied, too.
       * <DOMAIN> Data nodes.
       * <SEE> type-node-ex-create-data
       * </ID>
       *)

    method create_other : 
             ?position:(string * int * int) ->
             dtd -> node_type -> 'ext node
      (* <ID:type-node-create-other>
       * <CALL>   obj # [create_other] ~position dtd ntype
       * <SIG>    AUTO
       * <DESCR>  Returns a flat copy of this node with the following
       *   modification:
       *     - The DTD is set to [dtd]
       *     - The position triple is set to [position]
       *       --
       *   Note that the extension object is copied, too.
       *
       *   The passed node type [ntype] must match the node type
       *   of [obj].
       * <DOMAIN> Super root nodes, processing instruction nodes,
       *    comment nodes
       * </ID>
       *)


    method local_validate : 
              ?use_dfa:bool -> ?check_data_nodes:bool -> unit -> unit
      (* DEPRECATED NAME of validate_contents. *)

    method validate_contents : 
              ?use_dfa:bool -> ?check_data_nodes:bool -> unit -> unit
      (* <ID:type-node-validate-contents>
       * <CALL>   obj # [validate_contents] ?use_dfa ?check_data_nodes ()
       * <SIG>    AUTO
       * <DESCR>  Checks that the subnodes of this element match the declared
       *     content model of this element. The method returns [()] if
       *     the element is okay, and it raises an exception if an error
       *     is found (in most cases [Validation_error]).
       *
       *     This check is always performed by the parser, such that
       *     software that only reads parsed XML trees need not call
       *     this method. However, if software modifies the tree itself,
       *     an invocation of this method ensures that the validation
       *     constraints about content models are fulfilled.
       *
       *     Note that the check is not performed recursively.
       *  
       *     - Option [~use_dfa]: If true, the deterministic finite automaton of
       *       regexp content models is used for validation, if available.
       *       Defaults to false.
       *     - Option [~check_data_nodes]: If true, it is checked whether data
       *       nodes only occur at valid positions. If false, these checks
       *       are left out. Defaults to true. (Usually, the parser turns
       *       this feature off because the parser already performs a similar
       *       check.)
       *
       *       See [classify_data_node] for details about what is checked.
       *       --
       *
       *     In previous releases of PXP, this method was called [local_validate].
       * <DOMAIN> All node types. However, there are only real checks for
       *    elements; for other nodes, this method is a no-op.
       * </ID>
       *)

    method complement_attlist : unit -> unit
      (* <ID:type-node-complement-attlist>
       * <CALL>   obj # [complement_attlist] ()
       * <SIG>    AUTO
       * <DESCR>  Adds attributes that are declared in the DTD but are
       *     currently missing: [#IMPLIED] attributes are added with 
       *     [Implied_value], and if there is a default value for an attribute, 
       *     this value is added. [#REQUIRED] attributes are set to
       *     [Implied_value], too.
       * 
       *     It is only necessary to call this method if the element is created
       *     with ~valcheck:false, or the attribute list has been modified,
       *     and the element must be validated.
       * <DOMAIN> Elements.
       * </ID>
       *)

    method validate_attlist : unit -> unit
      (* <ID:type-node-validate-attlist>
       * <CALL>   obj # [validate_attlist] ()
       * <SIG>    AUTO
       * <DESCR>  Checks whether the attribute list of the element [obj] 
       *    matches the declared attribute list. The method returns [()]
       *    if the attribute list is formed correctly, and it raises an
       *    exception (usually a [Validation_error]) if there is an error.
       *
       *    This check is implicitly performed by [create_element] unless
       *    the option [~valcheck:false] has been passed. This means that it
       *    is usually not necessary to call this method; however, if the
       *    attribute list has been changed by [set_attributes] or if 
       *    [~valcheck:false] is in effect, the invocation of this method
       *    ensures the validity of the attribute list.
       *
       *    Note that the method complains about missing attributes even
       *    if these attributes have been declared with a default value or as
       *    being [#IMPLIED]; this method only checks the attributes but does
       *    not modify the attribute list. If you know that attributes are
       *    missing and you want to add them automatically just as 
       *    [create_element] does, you can call [complement_attlist] before
       *    doing this check.
       * <DOMAIN> All node types. However, for non-element nodes this
       *   check is a no-op.
       * </ID>
       *)

    method validate : unit -> unit
      (* <ID:type-node-validate>
       * <CALL>   obj # [validate] ()
       * <SIG>    AUTO
       * <DESCR>  Calls [validate_contents] and [validate_attlist], and
       *     ensures that this element is locally valid. The method 
       *     returns [()] if the element is valid, and raises an exception
       *     otherwise.
       * <DOMAIN> All node types. However, for non-element nodes this
       *   check is a no-op.
       * </ID>
       *)
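
      (* EXAMPLE (illustrative sketch, not part of the interface):
       *   After changing the attribute list by hand, the declared but
       *   missing attributes can be added with [complement_attlist] before
       *   the element is checked again. The helper name
       *   [set_and_revalidate] is hypothetical. An exception (usually a
       *   [Validation_error]) propagates to the caller if the element is
       *   not locally valid:
       *
       *   let set_and_revalidate (elem : 'ext node) al =
       *     elem # set_attributes al;
       *     elem # complement_attlist ();
       *     elem # validate ()
       *)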

    (* method keep_always_whitespace_mode : unit *)
      (* This method has been removed. You can now set the handling of
       * ignorable whitespace by a new Pxp_yacc.config option:
       * [drop_ignorable_whitespace]
       *)

    method write : 
             ?prefixes:string list ->
	     ?default:string ->
             Pxp_types.output_stream -> Pxp_types.encoding -> unit
      (* <ID:type-node-write>
       * <CALL>   obj # [write] ~prefixes ~default stream enc
       * <SIG>    AUTO
       * <DESCR>  Writes the contents of this node and the subtrees to the passed
       *    [stream] encoded as [enc]. The generated output is again XML.
       *    The output style is rather compact and should not be considered
       *    as "pretty printing".
       *
       *   Option [~prefixes]: The class [namespace_element_impl] interprets 
       *   this option and passes it recursively to subordinate invocations of
       *   [write]. The meaning is that the normprefixes enumerated by this list
       *   have already been declared by surrounding elements. The option
       *   defaults to [] forcing the method to output all necessary prefix
       *   declarations.
       *
       *   Option [~default]: Specifies the normprefix that becomes the
       *   default namespace in the output.
       *
       *   KNOWN BUG: comment nodes are not printed.
       * <DOMAIN> All regular node types (elements, data nodes, comments,
       *   processing instructions, super root nodes).
       * </ID>
       *)
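
      (* EXAMPLE (editorial sketch): writing a tree with [write]. The names
       * [root] and [out] are assumptions: [root] stands for any node of a
       * regular type, and [out] for a value of type
       * [Pxp_types.output_stream] that was created elsewhere.
       *
       *   root # write out `Enc_utf8
       *
       * Because [~prefixes] is omitted here, it defaults to [] and all
       * necessary prefix declarations are output.
       *)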

    (* ---------------------------------------- *)
    (* internal methods: *)
    method internal_adopt : 'ext node option -> int -> unit
    method internal_set_pos : int -> unit
    method internal_delete : 'ext node -> unit
    method internal_init : (string * int * int) ->
                           Pxp_types.pool option ->
			   bool -> 
                           dtd -> string -> (string * string) list -> 
			   (string * Pxp_types.att_value) list -> unit
    method internal_init_other : (string * int * int) ->
                                 dtd -> node_type -> unit

    method set_namespace_info : 'ext namespace_info option -> unit
      (* Sets the namespace_info field.
       * Only the implementation namespace_element_impl supports this
       * method.
       *)

    method dump : Format.formatter -> unit

  end

and ['ext] namespace_info =
  (* IMPORTANT: namespace_info is very very very very experimental. It is
   * very likely that the signature will change in the future, or that
   * the class will be removed.
   *)
  object
    method srcprefix : string
      (* Returns the prefix before it is normalized *)

    method declaration : 'ext node list
      (* Returns the currently active namespace declaration. The list
       * enumerates all namespace objects with
       *   namespace # node_type = T_namespace "srcprefix"
       * meaning that the srcprefix is declared to correspond to the
       * namespace URI
       *   namespace # data.
       * This list always declares the prefix "xml". If there is a default
       * namespace, it is declared for the prefix "".
       *)
  end
;;


class [ 'ext ] data_impl : 'ext -> [ 'ext ] node
  (* <ID:class-data-impl>
   * <TYPE:class>
   * <CALL>   'ext [data_impl]
   * <SIG>    AUTO
   * <DESCR>  This class is an implementation of [node] which
   *   realizes data nodes. You can create a new object by
   *
   *   [let exemplar = new data_impl ext_obj]
   *
   *   which creates a special form of empty data node which already contains a
   *   reference to the [ext_obj], but is otherwise empty. This special form
   *   is called a data exemplar. In order to get a working data node
   *   that can be used in a node tree it is required to apply the method
   *   [create_data] on the exemplar object.
   * </ID>
   *)


class [ 'ext ] element_impl : 'ext -> [ 'ext ] node
  (* <ID:class-element-impl>
   * <TYPE:class>
   * <CALL>   'ext [element_impl]
   * <SIG>    AUTO
   * <DESCR>  This class is an implementation of [node] which
   *   realizes element nodes. You can create a new object by
   *
   *   [let exemplar = new element_impl ext_obj]
   *
   *   which creates a special form of empty element which already contains a
   *   reference to the [ext_obj], but is otherwise empty. This special form
   *   is called an element exemplar. In order to get a working element
   *   that can be used in a node tree it is required to apply the method
   *   [create_element] on the exemplar object.
   *
   *   Note that the class [element_impl] is not namespace-aware.
   * </ID>
   *)
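
  (* EXAMPLE (editorial sketch): from exemplars to working nodes. The names
   * [ext_obj] and [dtd] are assumptions standing for an extension object
   * and a DTD object created elsewhere. The calls assume the signatures
   * of [create_data] and [create_element] as declared in class type [node],
   * and that [dtd] declares the element type used here.
   *
   *   let data_exemplar = new data_impl ext_obj
   *   let elem_exemplar = new element_impl ext_obj
   *
   *   (* The exemplars themselves are never put into a tree: *)
   *   let text = data_exemplar # create_data dtd "Hello"
   *   let para = elem_exemplar # create_element dtd (T_element "para") []
   *)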


class [ 'ext ] comment_impl : 'ext -> [ 'ext ] node ;;
  (* <ID:class-comment-impl>
   * <TYPE:class>
   * <CALL>   'ext [comment_impl]
   * <SIG>    AUTO
   * <DESCR>  This class is an implementation of [node] which
   *   realizes comment nodes. You can create a new object by
   *
   *   [let exemplar = new comment_impl ext_obj]
   *
   *   which creates a special form of empty comment node which already contains a
   *   reference to the [ext_obj], but is otherwise empty. This special form
   *   is called a comment exemplar. In order to get a working node
   *   that can be used in a node tree it is required to apply the method
   *   [create_other] on the exemplar object, e.g.
   *
   *   [let comment = exemplar # create_other dtd]
   * </ID>
   *)

class [ 'ext ] super_root_impl : 'ext -> [ 'ext ] node ;;
  (* <ID:class-super-root-impl>
   * <TYPE:class>
   * <CALL>   'ext [super_root_impl]
   * <SIG>    AUTO
   * <DESCR>  This class is an implementation of [node] which
   *   realizes super root nodes. You can create a new object by
   *
   *   [let exemplar = new super_root_impl ext_obj]
   *
   *   which creates a special form of empty super root which already contains a
   *   reference to the [ext_obj], but is otherwise empty. This special form
   *   is called a super root exemplar. In order to get a working node
   *   that can be used in a node tree it is required to apply the method
   *   [create_other] on the exemplar object, e.g.
   *
   *   [let root = exemplar # create_other dtd]
   * </ID>
   *)

class [ 'ext ] pinstr_impl : 'ext -> [ 'ext ] node ;;
  (* <ID:class-pinstr-impl>
   * <TYPE:class>
   * <CALL>   'ext [pinstr_impl]
   * <SIG>    AUTO
   * <DESCR>  This class is an implementation of [node] which
   *   realizes processing instruction nodes. You can create a new object by
   *
   *   [let exemplar = new pinstr_impl ext_obj]
   *
   *   which creates a special form of empty node which already contains a
   *   reference to the [ext_obj], but is otherwise empty. This special form
   *   is called a processing instruction exemplar. In order to get a working node
   *   that can be used in a node tree it is required to apply the method
   *   [create_other] on the exemplar object, e.g.
   *
   *   [let pi = exemplar # create_other dtd]
   * </ID>
   *)

val pinstr : 'ext node -> proc_instruction
  (* <ID:val-pinstr>
   * <TYPE:fun>
   * <CALL>   [pinstr] n
   * <SIG>    AUTO
   * <DESCR>  Returns the processing instruction contained in a
   *   processing instruction node.
   *   This function raises [Invalid_argument] if invoked for a node whose
   *   type is not [T_pinstr].
   * </ID>
   *)

class [ 'ext ] attribute_impl :
  element:string -> name:string -> Pxp_types.att_value -> dtd -> [ 'ext ] node
;;
    (* Creation:
     *   new attribute_impl element_name attribute_name attribute_value dtd
     * Note that attribute nodes intentionally do not have extensions.
     *
     * Attribute nodes are created on demand by the first invocation of
     * attributes_as_nodes of the element node. Attribute nodes are
     * created directly and not by copying exemplar nodes, so you never
     * need to create them yourself.
     *
     * Attribute nodes have the following properties:
     * - The node type is T_attribute name.
     * - The parent node is the element node.
     * - The method "attributes" returns [ name, value ], i.e. such nodes
     *   have a single attribute "name". To get the value, call
     *   n # attribute name.
     * - The method "data" returns the string representation of the 
     *   attribute value.
     * - Attribute nodes are leaves of the tree.
     *
     * Attribute nodes are designed to be members of XPath node sets, and
     * are only useful if you need such sets.
     *)

val attribute_name  : 'ext node -> string
  (* <ID:val-attribute-name>
   * <TYPE:fun>
   * <CALL>   [attribute_name] n
   * <SIG>    AUTO
   * <DESCR>  Returns the name of the attribute contained in an attribute
   *    node. Raises [Invalid_argument] if [n] does not have node type
   *    [T_attribute].
   * </ID>
   *)

val attribute_value : 'ext node -> Pxp_types.att_value
  (* <ID:val-attribute-value>
   * <TYPE:fun>
   * <CALL>   [attribute_value] n
   * <SIG>    AUTO
   * <DESCR>  Returns the value of the attribute contained in an attribute
   *    node. Raises [Invalid_argument] if [n] does not have node type
   *    [T_attribute].
   * </ID>
   *)

val attribute_string_value : 'ext node -> string
  (* <ID:val-attribute-string-value>
   * <TYPE:fun>
   * <CALL>   [attribute_string_value] n
   * <SIG>    AUTO
   * <DESCR>  Returns the string value of the attribute contained in an attribute
   *    node. Raises [Invalid_argument] if [n] does not have node type
   *    [T_attribute].
   * </ID>
   *)

(* Very experimental namespace support: *)

class [ 'ext ] namespace_element_impl : 'ext -> [ 'ext ] node
  (* <ID:class-namespace-element-impl>
   * <TYPE:class>
   * <CALL>   'ext [namespace_element_impl]
   * <SIG>    AUTO
   * <DESCR>  This class is an implementation of [node] which
   *   realizes element nodes. In contrast to [element_impl], this class
   *   also implements the namespace methods.
   *   You can create a new object by
   *
   *   [let exemplar = new namespace_element_impl ext_obj]
   *
   *   which creates a special form of empty element which already contains a
   *   reference to the [ext_obj], but is otherwise empty. This special form
   *   is called an element exemplar. In order to get a working element
   *   that can be used in a node tree it is required to apply the method
   *   [create_element] on the exemplar object.
   * </ID>
   *)

  (* namespace_element_impl: the namespace-aware implementation of element
   * nodes. 
   *
   * This class has an extended definition of the create_element method.
   * It accepts element names of the form "normprefix:localname" where
   * normprefix must be a prefix managed by the namespace_manager. Note
   * that create_element does not itself normalize prefixes; it is expected
   * that the prefixes are already normalized.
   *
   * Such nodes have a node type T_element "normprefix:localname".
   *
   * Furthermore, this class implements the methods:
   * - normprefix
   * - localname
   * - namespace_uri
   * - namespace_info
   * - set_namespace_info
   * - namespace_manager
   *)


class [ 'ext ] namespace_attribute_impl :
  element:string -> name:string -> Pxp_types.att_value -> dtd -> [ 'ext ] node
;;

  (* namespace_attribute_impl: the namespace-aware implementation of
   * attribute nodes.
   *)


class [ 'ext ] namespace_impl :
  (* srcprefix: *) string -> (* normprefix: *) string -> dtd -> [ 'ext ] node
;;
  (* Namespace objects are only used to represent the namespace declarations
   * occurring in the attribute lists of elements.
   * They are stored in the namespace_info object if that is requested.
   *)

val namespace_normprefix : 'ext node -> string
val namespace_srcprefix : 'ext node -> string
val namespace_uri : 'ext node -> string
  (* These functions return the normprefix, the srcprefix, and the URI
   * stored in a namespace object. 
   * If invoked for a different node type, the functions raise Invalid_argument.
   *)


class [ 'ext ] namespace_info_impl :
  (* srcprefix: *) string -> 
  (* element: *)   'ext node -> 
  (* src_norm_mapping: *) (string * string) list ->
     [ 'ext ] namespace_info
;;


(********************************** spec *********************************)

type 'ext spec
constraint 'ext = 'ext node #extension
  (* <ID:type-spec>
   * <TYPE:type>
   * <CALL>   'ext [spec]
   * <SIG>    AUTO
   * <DESCR>  The abstract data type specifying which objects are actually
   *    created by the parser.
   * </ID>
   *)

val make_spec_from_mapping :
      ?super_root_exemplar : 'ext node ->
      ?comment_exemplar : 'ext node ->
      ?default_pinstr_exemplar : 'ext node ->
      ?pinstr_mapping : (string, 'ext node) Hashtbl.t ->
      data_exemplar: 'ext node ->
      default_element_exemplar: 'ext node ->
      element_mapping: (string, 'ext node) Hashtbl.t ->
      unit ->
        'ext spec
  (* <ID:val-make-spec-from-mapping>
   * <TYPE:fun>
   * <CALL>   [make_spec_from_mapping]
   *            ~super_root_exemplar ~comment_exemplar ~default_pinstr_exemplar
   *            ~pinstr_mapping ~data_exemplar ~default_element_exemplar
   *            ~element_mapping
   *            ()
   * <SIG>    AUTO
   * <DESCR>  Creates a [spec] from the arguments. Some arguments are optional,
   *    some arguments are mandatory.
   *      - [~super_root_exemplar]: Specifies the exemplar to be used for
   *        new super root nodes. This exemplar is optional.
   *      - [~comment_exemplar]: Specifies the exemplar to be used for
   *        new comment nodes. This exemplar is optional.
   *      - [~pinstr_mapping]: Specifies the exemplar to be used for
   *        new processing instruction nodes by a hashtable mapping target
   *        names to exemplars. This hashtable is optional.
   *      - [~default_pinstr_exemplar]: Specifies the exemplar to be used for
   *        new processing instruction nodes. This exemplar will be used
   *        for targets that are not contained in the [~pinstr_mapping]
   *        hashtable. This exemplar is optional.
   *      - [~data_exemplar]: Specifies the exemplar to be used for
   *        new data nodes. This exemplar is mandatory.
   *      - [~element_mapping]: Specifies the exemplar to be used for
   *        new element nodes by a hashtable mapping element types to
   *        exemplars. This hashtable is mandatory (but may be empty).
   *      - [~default_element_exemplar]: Specifies the exemplar to be used for
   *        new element nodes. This exemplar will be used
   *        for element types that are not contained in the [~element_mapping]
   *        hashtable. This exemplar is mandatory.
   *        --
   * </ID>
   *)

val make_spec_from_alist :
      ?super_root_exemplar : 'ext node ->
      ?comment_exemplar : 'ext node ->
      ?default_pinstr_exemplar : 'ext node ->
      ?pinstr_alist : (string * 'ext node) list ->
      data_exemplar: 'ext node ->
      default_element_exemplar: 'ext node ->
      element_alist: (string * 'ext node) list ->
      unit ->
        'ext spec
  (* <ID:val-make-spec-from-alist>
   * <TYPE:fun>
   * <CALL>   [make_spec_from_alist]
   *            ~super_root_exemplar ~comment_exemplar ~default_pinstr_exemplar
   *            ~pinstr_alist ~data_exemplar ~default_element_exemplar
   *            ~element_alist
   *            ()
   * <SIG>    AUTO
   * <DESCR>  Creates a [spec] from the arguments. This is a convenience
   *    function for [make_spec_from_mapping]; instead of requiring hashtables
   *    the function accepts association lists.
   * </ID>
   *)
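
  (* EXAMPLE (editorial sketch): the smallest useful spec. Only the mandatory
   * arguments are passed, so every element type is created from the default
   * element exemplar. The name [ext_obj] is an assumption standing for an
   * extension object created elsewhere.
   *
   *   let spec =
   *     make_spec_from_alist
   *       ~data_exemplar:(new data_impl ext_obj)
   *       ~default_element_exemplar:(new element_impl ext_obj)
   *       ~element_alist:[]
   *       ()
   *
   * Since no super root, comment, or processing instruction exemplars are
   * given, [create_super_root_node], [create_comment_node], and
   * [create_pinstr_node] fail for this spec.
   *)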

val create_data_node :
      'ext spec -> dtd -> string -> 'ext node
  (* <ID:val-create-data-node>
   * <TYPE:fun>
   * <CALL>   [create_data_node] spec dtd datastring
   * <SIG>    AUTO
   * <DESCR>  Creates a new data node from the exemplar contained in [spec].
   *    The new node contains [datastring] and is connected with the [dtd].
   * </ID>
   *)

val create_element_node :
      ?name_pool_for_attribute_values:Pxp_types.pool ->
      ?position:(string * int * int) ->
      ?valcheck:bool ->
      ?att_values:((string * Pxp_types.att_value) list) ->
      'ext spec -> dtd -> string -> (string * string) list -> 'ext node
  (* <ID:val-create-element-node>
   * <CALL>    [create_element_node] ~name_pool_for_attribute_values
   *              ~position ~valcheck ~att_values spec dtd eltype
   *              att_list
   * <SIG>     AUTO
   * <DESCR>   Creates a new element node from the exemplar(s) contained in
   *    [spec]:
   *      - The new node will be connected to the passed [dtd].
   *      - The new node will have the element type [eltype].
   *      - The attributes of the new node will be the concatenation of
   *        [att_list] and [att_values]; [att_list] passes attribute values
   *        as strings while [att_values] passes attribute values as
   *        type [att_value]
   *      - The source position is set to [~position] (if passed)
   *      - The [~name_pool_for_attribute_values] will be used, if passed.
   *      - If [~valcheck = true] (the default), the attribute list is 
   *        immediately validated. If [~valcheck = false], the validation
   *        is left out; in this case you can pass any element type
   *        and any attributes, and it does not matter whether and how
   *        they are declared.
   *        --
   * </ID>
   *)
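
  (* EXAMPLE (editorial sketch): creating nodes through a spec. The names
   * [spec] and [dtd] are assumptions standing for objects created
   * elsewhere; the element type and attribute are made up for illustration.
   *
   *   let para =
   *     create_element_node ~valcheck:false spec dtd "para" [ "id", "p1" ]
   *   let text = create_data_node spec dtd "Hello"
   *
   * With [~valcheck:false] neither the element type nor the attribute has
   * to be declared in [dtd].
   *)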

val create_super_root_node :
      ?position:(string * int * int) ->
      'ext spec -> dtd -> 'ext node
  (* <ID:val-create-super-root-node>
   * <CALL>   [create_super_root_node] ~position spec dtd
   * <SIG>    AUTO
   * <DESCR>  Creates a new super root node from the exemplar contained in
   *    [spec]. The new node is connected to [dtd], and the position
   *    triple is set to [~position].
   *
   *    The function fails if there is no super root exemplar in [spec].
   * </ID>
   *)

val create_comment_node :
      ?position:(string * int * int) ->
      'ext spec -> dtd -> string -> 'ext node
  (* <ID:val-create-comment-node>
   * <CALL>   [create_comment_node] ~position spec dtd commentstring
   * <SIG>    AUTO
   * <DESCR>  Creates a new comment node from the exemplar contained in
   *    [spec]. The new node is connected to [dtd], and the position
   *    triple is set to [~position]. The contents of the node are set
   *    to [commentstring].
   *
   *    The function fails if there is no comment exemplar in [spec].
   * </ID>
   *)


val create_pinstr_node :
      ?position:(string * int * int) ->
      'ext spec -> dtd -> proc_instruction -> 'ext node
  (* <ID:val-create-pinstr-node>
   * <CALL>   [create_pinstr_node] ~position spec dtd pi
   * <SIG>    AUTO
   * <DESCR>  Creates a new processing instruction node from the exemplar 
   *    contained in [spec]. The new node is connected to [dtd], and the 
   *    position triple is set to [~position]. The contents of the node are set
   *    to [pi].
   *
   *    The function fails if there is no processing instruction exemplar in
   *    [spec].
   * </ID>
   *)

val create_no_node :
       ?position:(string * int * int) -> 'ext spec -> dtd -> 'ext node
  (* Creates a T_none node with limited functionality 
   * NOTE: This function is conceptually broken and may be dropped in the
   * future.
   *)

val get_data_exemplar :
      'ext spec -> 'ext node
val get_element_exemplar :
      'ext spec -> string -> (string * string) list -> 'ext node
val get_super_root_exemplar :
      'ext spec -> 'ext node
val get_comment_exemplar :
      'ext spec -> 'ext node
val get_pinstr_exemplar :
      'ext spec -> proc_instruction -> 'ext node
  (* These functions just return the exemplars (or raise Not_found).
   * Notes:
   * (1) In future versions, it may be possible that the element exemplar
   *     depends on attributes, too, so the attlist must be passed
   *     to get_element_exemplar
   * (2) In future versions, it may be possible that the pinstr exemplar
   *     depends on the full value of the processing instruction and
   *     not only on the target, so the full proc_instruction must be
   *     passed to get_pinstr_exemplar.
   *)


(*********************** Ordering of nodes ******************************)

(* The functions compare and ord_compare implement the so-called
 * "document order". The basic principle is that the nodes are linearly
 * ordered by their occurrence in the textual XML representation of the
 * tree. While this is clear for element nodes, data nodes, comments, and
 * processing instructions, a more detailed definition is necessary for the
 * other node types. In particular, attribute nodes of an element node
 * occur before any regular subnode of the element, and namespace nodes
 * of that element occur even before the attribute nodes. So the order
 * of nodes of
 *   <sample a1="5" a2="6"><subnode/></sample> 
 * is
 *   1. element "sample"
 *   2. attribute "a1"
 *   3. attribute "a2"
 *   4. element "subnode"
 * Note that the order of the attributes of the same element is unspecified,
 * so "a2" may alternatively be ordered before "a1". If there were namespace
 * nodes, they would occur between 1 and 2.
 *   If there is a super root node, it will be handled as the very first
 * node.
 *)

val compare : 'ext node -> 'ext node -> int
  (* <ID:val-compare>
   * <TYPE:fun>
   * <CALL>   [compare] n1 n2
   * <SIG>    AUTO
   * <DESCR>  Returns -1 if [n1] occurs before [n2], or +1 if [n1] occurs
   * after [n2], or 0 if both nodes are identical.
   * If the nodes are unrelated (do not have a common ancestor), the result
   * is undefined (Note: this case is different from [ord_compare]).
   * This test is rather slow, but it works even if the XML tree changes
   * dynamically (in contrast to [ord_compare] below).
   * </ID>
   *)

type 'ext ord_index
constraint 'ext = 'ext node #extension
  (* <ID:type-ord-index>
   * <TYPE:type>
   * <CALL>   'ext [ord_index]
   * <SIG>    AUTO
   * <DESCR>  The type of ordinal indexes.
   * </ID>
   *)

val create_ord_index : 'ext node -> 'ext ord_index
  (* <ID:val-create-ord-index>
   * <TYPE:fun>
   * <CALL>   [create_ord_index] startnode
   * <SIG>    AUTO
   * <DESCR>   
   * Creates an ordinal index for the subtree starting at [startnode].
   * This index assigns to every node an ordinal number (beginning with 0) such
   * that nodes are numbered in the order in which their first characters occur
   * in the XML representation (document order).
   * Note that the index is not automatically updated when the tree is
   * modified.
   * </ID>
   *)

val ord_number : 'ext ord_index -> 'ext node -> int
  (* Returns the ordinal number of the node, or raises Not_found.
   * Note that attribute nodes and namespace nodes are treated specially:
   * All attribute nodes for a certain element node have the _same_
   * ordinal number. All namespace nodes for a certain element node
   * have the _same_ ordinal number.
   * (So ord_number x = ord_number y does not imply x == y for these
   * nodes. However, this is true for the other node types.)
   * It is not recommended to work with the ordinal number directly but
   * to call ord_compare which already handles the special cases.
   *)

val ord_compare : 'ext ord_index -> 'ext node -> 'ext node -> int
  (* <ID:val-ord-compare>
   * <TYPE:fun>
   * <CALL>   [ord_compare] idx n1 n2
   * <SIG>    AUTO
   * <DESCR>
   * Compares two nodes like [compare]:
   * Returns -1 if [n1] occurs before [n2], or +1 if [n1] occurs
   * after [n2], or 0 if both nodes are identical.
   * If one of the nodes does not occur in the ordinal index, [Not_found]
   * is raised. (Note that this differs from [compare], whose result is
   * undefined for unrelated nodes.)
   *
   * This test is much faster than [compare].
   * </ID>
   *)
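
  (* EXAMPLE (editorial sketch): sorting nodes into document order. The names
   * [root] and [nodes] are assumptions: [root] is the top of a tree that
   * is not modified while the index is in use, and [nodes] is a list of
   * nodes taken from that tree.
   *
   *   let idx = create_ord_index root
   *   let sorted = List.sort (ord_compare idx) nodes
   *
   * If [nodes] contains a node that is not in the index, [Not_found] is
   * raised. The slower alternative that needs no index is
   * [List.sort Pxp_document.compare nodes].
   *)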


(***************************** Iterators ********************************)

(* General note: The iterators ignore attribute and namespace nodes *)

val find : ?deeply:bool ->
           ('ext node -> bool) -> 'ext node -> 'ext node
  (* <ID:val-find>
   * <TYPE:fun>
   * <CALL>    [find] ~deeply f startnode
   * <SIG>     AUTO
   * <DESCR>   Searches for the first node in the tree below [startnode] for
   *     which the predicate [f] is true, and returns it. Raises [Not_found]
   *     if there is no such node.
   *
   *     By default, [~deeply=false]. In this case, only the children of
   *     [startnode] are searched.
   *
   *     If passing [~deeply=true], the children are searched recursively
   *     (depth-first search). Note that even in this case [startnode] itself
   *     is not checked.
   *
   *     Attribute and namespace nodes are ignored.
   * </ID>
   *)

val find_all : ?deeply:bool ->
               ('ext node -> bool) -> 'ext node -> 'ext node list
  (* <ID:val-find-all>
   * <CALL>    [find_all] ~deeply f startnode
   * <SIG>     AUTO
   * <DESCR>   Searches for all nodes in the tree below [startnode] for which
   *     the predicate [f] is true, and returns them.
   *
   *     By default, [~deeply=false]. In this case, only the children of
   *     [startnode] are searched.
   *
   *     If passing [~deeply=true], the children are searched recursively
   *     (depth-first search). Note that even in this case [startnode] itself
   *     is not checked.
   *
   *     Attribute and namespace nodes are ignored.
   * </ID>
   *)

val find_element : ?deeply:bool ->
                   string -> 'ext node -> 'ext node
  (* <ID:val-find-element>
   * <TYPE:fun>
   * <CALL>    [find_element] ~deeply eltype startnode
   * <SIG>     AUTO
   * <DESCR>   Searches for the first element in the tree below [startnode]
   *     that has the element type [eltype], and returns it. Raises [Not_found]
   *     if there is no such node.
   *
   *     By default, [~deeply=false]. In this case, only the children of
   *     [startnode] are searched.
   *
   *     If passing [~deeply=true], the children are searched recursively
   *     (depth-first search). Note that even in this case [startnode] itself
   *     is not checked.
   * </ID>
   *)

val find_all_elements : ?deeply:bool ->
                        string -> 'ext node -> 'ext node list
  (* <ID:val-find-all-elements>
   * <TYPE:fun>
   * <CALL>    [find_all_elements] ~deeply eltype startnode
   * <SIG>     AUTO
   * <DESCR>   Searches for all elements in the tree below [startnode]
   *     having the element type [eltype], and returns them.
   *
   *     By default, [~deeply=false]. In this case, only the children of
   *     [startnode] are searched.
   *
   *     If passing [~deeply=true], the children are searched recursively
   *     (depth-first search). Note that even in this case [startnode] itself
   *     is not checked.
   * </ID>
   *)
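
  (* EXAMPLE (editorial sketch): typical searches. The name [root] is an
   * assumption standing for the top node of a tree, and the element types
   * are made up for illustration.
   *
   *   (* First "title" element anywhere below root, if any: *)
   *   let title_opt =
   *     try Some (find_element ~deeply:true "title" root)
   *     with Not_found -> None
   *
   *   (* All "para" elements that are direct children of root: *)
   *   let paras = find_all_elements "para" root
   *
   *   (* All data nodes below root: *)
   *   let texts =
   *     find_all ~deeply:true (fun n -> n # node_type = T_data) root
   *)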

exception Skip
  (* <ID:exc-skip>
   * <TYPE:exception>
   * <CALL>   [Skip]
   * <SIG>    AUTO
   * <DESCR>  This exception can be used in the functions passed to
   *    [map_tree], [map_tree_sibl], [iter_tree], and [iter_tree_sibl]
   *    to skip the current node, and to proceed with the next node.
   *    See these functions for details.
   * </ID>
   *)

val map_tree :  pre:('exta node -> 'extb node) ->
               ?post:('extb node -> 'extb node) ->
               'exta node ->
                   'extb node
  (* <ID:val-map-tree>
   * <TYPE:fun>
   * <CALL>   [map_tree] ~pre ~post startnode
   * <SIG>    AUTO
   * <DESCR>  Maps the tree beginning at [startnode] to a second tree
   *    using the following algorithm.
   *
   *    [startnode] and the whole tree below it are recursively traversed.
   *    After entering a node, the function ~pre is called. The result of
   *    this function must be a new node; it must not have children nor a
   *    parent. For example, you can pass
   *      [~pre:(fun n -> n # orphaned_flat_clone)] 
   *    to copy the original node. After that, the children are processed
   *    in the same way (from left to right) resulting in a list of
   *    mapped children. These are added to the mapped node as its 
   *    children.
   * 
   *    Now, the ~post function is invoked with the mapped node as argument, and
   *    the result is the result of the function (~post should return a root
   *    node, too; if not specified, the identity is the ~post function).
   *
   *    Both ~pre and ~post may raise [Skip], which causes the node to be
   *    left out (i.e. the mapped tree contains neither the node nor
   *    any children of the node).
   *    If the top node is skipped, the exception [Not_found] is
   *    raised.
   *
   *    For example, the following piece of code duplicates a tree, but
   *    removes all comment nodes:
   *
   *    [ map_tree ~pre:(fun n -> if n # node_type = T_comment then raise Skip else n # orphaned_flat_clone) startnode ]
   *
   *     Attribute and namespace nodes are ignored.
   * </ID>
   *)

val map_tree_sibl :
        pre: ('exta node option -> 'exta node -> 'exta node option ->
                  'extb node) ->
       ?post:('extb node option -> 'extb node -> 'extb node option ->
                  'extb node) ->
       'exta node ->
           'extb node
  (* <ID:val-map-tree-sibl>
   * <TYPE:fun>
   * <CALL>   [map_tree_sibl] ~pre ~post startnode
   * <SIG>    AUTO
   * <DESCR>  Maps the tree beginning at [startnode] to a second tree
   *    using the following algorithm.
   *
   *    [startnode] and the whole tree below it are recursively traversed.
   *    After entering a node, the function ~pre is called with three
   *    arguments: some previous node, the current node, and some next node.
   *    The previous and the next node may not exist because the current
   *    node is the first or the last in the current list of nodes.
   *    In this case, [None] is passed as previous or next node, resp.
   *    The result of this function invocation must be a new node; 
   *    it must not have children nor a parent. For example, you can pass
   *      [~pre:(fun prev n next -> n # orphaned_flat_clone)] 
   *    to copy the original node. After that, the children are processed
   *    in the same way (from left to right) resulting in a list of
   *    mapped children. 
   * 
   *    Now, the ~post function is applied to the list of mapped children
   *    resulting in a list of postprocessed children. (Note: this part
   *    works rather differently than [map_tree].) ~post has three arguments:
   *    some previous child, the current child, and some next child.
   *    The previous and the next child are [None] if non-existing.
   *    The postprocessed children are appended to the mapped node resulting
   *    in the mapped tree.
   *
   *    Both ~pre and ~post may raise [Skip], which causes the node to be
   *    left out (i.e. the mapped tree contains neither the node nor
   *    any children of the node).
   *    If the top node is skipped, the exception [Not_found] is
   *    raised.
   *
   *     Attribute and namespace nodes are ignored.
   * </ID>
   *)

val iter_tree : ?pre:('ext node -> unit) ->
                ?post:('ext node -> unit) ->
                'ext node ->
                    unit
  (* <ID:val-iter-tree>
   * <TYPE:fun>
   * <CALL>   [iter_tree] ~pre ~post startnode
   * <SIG>    AUTO
   * <DESCR>  Iterates over the tree beginning at [startnode] 
   *    using the following algorithm.
   *
   *    [startnode] and the whole tree below it are recursively traversed.
   *    After entering a node, the function ~pre is called. Now, the children
   *    are processed recursively. Finally, the ~post function is invoked.
   *
   *    The ~pre function may raise [Skip], in which case the children
   *    and the invocation of the ~post function are skipped.
   *    If the ~post function raises [Skip], nothing special happens.
   *
   *     Attribute and namespace nodes are ignored.
   * </ID>
   *)
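
  (* EXAMPLE (editorial sketch): counting elements while skipping a subtree.
   * The function name [count_elements] and the element type "appendix" are
   * assumptions made up for illustration.
   *
   *   let count_elements root =
   *     let count = ref 0 in
   *     iter_tree
   *       ~pre:(fun n ->
   *               match n # node_type with
   *                 T_element "appendix" -> raise Skip
   *               | T_element _ -> incr count
   *               | _ -> ())
   *       root;
   *     !count
   *
   * Raising [Skip] in [~pre] means that neither the "appendix" elements
   * nor anything below them is counted.
   *)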


val iter_tree_sibl :
       ?pre: ('ext node option -> 'ext node -> 'ext node option -> unit) ->
       ?post:('ext node option -> 'ext node -> 'ext node option -> unit) ->
       'ext node ->
           unit
  (* <ID:val-iter-tree-sibl>
   * <TYPE:fun>
   * <CALL>   [iter_tree_sibl] ~pre ~post startnode
   * <SIG>    AUTO
   * <DESCR>  Iterates over the tree beginning at [startnode] 
   *    using the following algorithm.
   *
   *    [startnode] and the whole tree below it are recursively traversed.
   *    After entering a node, the function ~pre is called with three
   *    arguments: some previous node, the current node, and some next node.
   *    The previous and the next node may be [None] if non-existing.
   *    Now, the children are processed recursively. 
   *    Finally, the ~post function is invoked with the same three
   *    arguments.
   *
   *    The ~pre function may raise [Skip], in which case the children
   *    and the invocation of the ~post function are skipped.
   *    If the ~post function raises [Skip], nothing special happens.
   *
   *     Attribute and namespace nodes are ignored.
   * </ID>
   *)


(************************ Whitespace handling ***************************)

type stripping_mode =
  [ `Strip_one_lf
  | `Strip_one
  | `Strip_seq
  | `Disabled
  ]
  (* <ID:type-stripping-mode>
   * <TYPE:type>
   * <CALL>   [stripping_mode]
   * <SIG>    AUTO
   * <DESCR>  The different ways to strip whitespace from a single
   *    data node:
   * - [`Strip_one_lf]: If there is a linefeed character at the beginning/at
   *   the end, it will be removed. If there are more linefeed characters, 
   *   only the first/the last is removed.
   *   (This is the SGML rule to strip whitespace.)
   * - [`Strip_one]: If there is a whitespace character at the beginning/at
   *   the end, it will be removed. If there are more whitespace characters, 
   *   only the first/the last is removed. Whitespace characters are space, 
   *   newline, carriage return, tab.
   * - [`Strip_seq]: All whitespace characters at the beginning/at the end are
   *   removed.
   * - [`Disabled]: Do not strip whitespace.
   *   --
   * </ID>
   *)
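  (* EXAMPLE (worked by hand from the rules above, not taken from a test
   * run): The data string " \n abc \n " begins and ends with a space.
   * If the same mode is applied to both ends:
   * - [`Strip_one_lf] leaves it unchanged (neither end is a linefeed),
   * - [`Strip_one] yields "\n abc \n",
   * - [`Strip_seq] yields "abc",
   * - [`Disabled] leaves it unchanged.
   *)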


val strip_whitespace : 
      ?force:bool -> ?left:stripping_mode -> ?right:stripping_mode ->
      ?delete_empty_nodes:bool ->
      'ext node ->
      unit
  (* <ID:val-strip-whitespace>
   * <TYPE:fun>
   * <CALL>   [strip_whitespace] ~force ~left ~right ~delete_empty_nodes
   *             startnode
   * <SIG>    AUTO
   * <DESCR>
   * Modifies the passed tree in-place by the following rules:
   * - In general, whitespace stripping is not applied to nodes inside
   *   an [xml:space="preserve"] region, unless [~force:true] is passed
   *   to the function (default is [~force:false]). The following rules
   *   are carried out only if whitespace stripping is allowed.
   *   Note that the detection of regions with preserved whitespace takes
   *   the parent nodes of the passed [startnode] into account.
   * - If applied to a data node, whitespace at the beginning of the node
   *   is removed according to [~left], and whitespace at the end of the node
   *   is removed according to [~right].
   * - If applied to an element, whitespace at the beginning of the first
   *   data subnode is removed according to [~left], and whitespace at the end
   *   of the last data subnode is removed according to [~right]. Furthermore,
   *   these rules are recursively applied to all subelements (but not to
   *   other node types).
   * - If applied to the super root node, this node is treated as if it
   *   were an element.
   * - Whitespace in other node types is left as is, as is whitespace
   *   occurring in attributes.
   * - Option [~delete_empty_nodes] (default true):
   *   If data nodes become empty after removal of whitespace, they are
   *   deleted from the XML tree. 
   *   --
   * 
   * Defaults:
   *   - [~force:false]
   *   - [~left:`Disabled]
   *   - [~right:`Disabled]
   *   - [~delete_empty_nodes:true]
   * </ID>
   *)
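  (* EXAMPLE (a minimal sketch; [trim_tree] and [root] are hypothetical
   * names): Removes all leading and trailing whitespace according to the
   * rules above, and then normalizes the tree with [normalize].
   *
   *   let trim_tree root =
   *     strip_whitespace ~left:`Strip_seq ~right:`Strip_seq root;
   *     normalize root
   *)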


(****************************** normalization ****************************)

val normalize : 'ext node -> unit
  (* <ID:val-normalize>
   * <TYPE:fun>
   * <CALL>   [normalize] startnode
   * <SIG>    AUTO
   * <DESCR>  Normalizes the tree denoted by [startnode]  such that
   *    neither empty data nodes nor adjacent data nodes exist. Normalization
   *    works in-place.
   * </ID>
   *)

(******************************** validation *****************************)

val validate : 'ext node -> unit
  (* <ID:val-validate>
   * <TYPE:fun>
   * <CALL>   [validate] startnode
   * <SIG>    AUTO
   * <DESCR>  Validates the tree denoted by [startnode]. In contrast to
   *   [startnode # validate()] this function validates recursively.
   * </ID>
   *)
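  (* EXAMPLE (a minimal sketch; [is_valid] is a hypothetical name, and it
   * is an assumption that validation failures are reported by the
   * exception [Pxp_types.Validation_error]):
   *
   *   let is_valid root =
   *     try validate root; true
   *     with Pxp_types.Validation_error _ -> false
   *)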

(******************************* document ********************************)

class [ 'ext ] document :
  Pxp_types.collect_warnings -> Pxp_types.rep_encoding ->
  object
    (* Documents: These are containers for root elements and for DTDs.
     *
     * Important invariant: A document is either empty (no root element,
     * no DTD), or it has both a root element and a DTD.
     *
     * A fresh document created by 'new' is empty.
     *)

    method init_xml_version : string -> unit
	(* Set the XML version string of the XML declaration. *)

    method init_root : 'ext node -> string -> unit
	(* Set the root element. It is expected that the root element has
	 * a DTD.
	 * The second argument is the original name of the root element
	 * (without namespace prefix processing).
	 * Note that 'init_root' checks whether the passed root element
	 * has the type expected by the DTD. The check takes into account
	 * that the root element might be a virtual root node.
	 *)

    method xml_version : string
      (* Returns the XML version from the XML declaration. Returns "1.0"
       * if the declaration is missing.
       *)

    method xml_standalone : bool
      (* Returns whether this document is declared as being standalone.
       * This method returns the same value as 'standalone_declaration'
       * of the DTD (if there is a DTD).
       * Returns 'false' if there is no DTD.
       *)

    method dtd : dtd
      (* Returns the DTD of the root element.
       * Fails if there is no root element.
       *)

    method encoding : Pxp_types.rep_encoding
      (* Returns the string encoding of the document = the encoding of
       * the root element = the encoding of the element tree = the
       * encoding of the DTD.
       * Fails if there is no root element.
       *)

    method root : 'ext node
      (* Returns the root element, or fails if there is not any. *)

    method add_pinstr : proc_instruction -> unit
      (* Adds a processing instruction to the document container.
       * The parser does this for PIs occurring outside the DTD and outside
       * the root element.
       *)

    method pinstr : string -> proc_instruction list
      (* Return all PIs for a passed target string. *)

    method pinstr_names : string list
      (* Return all target strings of all PIs. *)

    method write : ?default : string ->
                   ?prefer_dtd_reference : bool ->
                   Pxp_types.output_stream -> Pxp_types.encoding -> unit
      (* Write the document to the passed output stream, using the
       * passed encoding. The format is compact (the opposite of
       * "pretty printing").
       * If a DTD is present, the DTD is included in the internal subset.
       *
       * Option [~default]: Specifies the normprefix that becomes the
       * default namespace in the output.
       *
       * Option [~prefer_dtd_reference]: If true, the function tries to
       * print the DTD as a reference, i.e. with a SYSTEM or PUBLIC
       * identifier. This works only if the DTD has an [External]
       * identifier. If the DTD cannot be printed as a reference, it is
       * included as text. The default is not to try DTD references,
       * i.e. to always include the DTD as text.
       *)

    method dump : Format.formatter -> unit

  end
;;
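
(* EXAMPLE (a minimal sketch; [describe] is a hypothetical name): Prints
 * the XML version and the standalone flag of a document, using only
 * methods of the class above.
 *
 *   let describe doc =
 *     Printf.printf "XML version %s, standalone: %b\n"
 *       (doc # xml_version) (doc # xml_standalone)
 *)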


(* Printers for toploop: *)

val print_node :
    'ext node -> unit ;;

val print_doc :
    'ext document -> unit ;;


(* ======================================================================
 * History:
 *
 * $Log: pxp_document.mli,v $
 * Revision 1.23  2001/12/03 23:46:29  gerd
 * 	New option ~prefer_dtd_reference for [write].
 *
 * Revision 1.22  2001/06/30 00:05:12  gerd
 * 	Fix: When checking the type of the root element, namespace
 * rewritings are taken into account.
 *
 * Revision 1.21  2001/06/28 22:42:07  gerd
 * 	Fixed minor problems:
 * 	- Comments must be contained in one entity
 * 	- Pxp_document.document is now initialized with encoding.
 *           the DTD encoding may be initialized too late.
 *
 * Revision 1.20  2001/06/27 23:35:43  gerd
 * 	Minor fixes: create_other, write.
 *
 * Revision 1.19  2001/06/25 21:04:18  gerd
 * 	Updated documentation. Most docs are now structured comments
 * that can be extracted and included into the docbook manual.
 * 	New option ~default for method [write].
 * 	New method create_other.
 *
 * Revision 1.18  2001/06/09 22:33:14  gerd
 * 	Added 'dump' methods to 'node' and 'document'. Also print_node,
 * print_doc.
 * 	Fixed namespace_info.
 *
 * Revision 1.17  2001/06/08 01:15:46  gerd
 * 	Moved namespace_manager from Pxp_document to Pxp_dtd. This
 * makes it possible that the DTD can recognize the processing instructions
 * <?pxp:dtd namespace prefix="..." uri="..."?>, and add the namespace
 * declaration to the manager.
 *
 * Revision 1.16  2001/06/08 00:12:40  gerd
 * 	Numerous changes:
 * 	- Method add_node has been deprecated in favor of
 * classify_data_node and append_node
 * 	- keep_always_whitespace_mode has been dropped in favor
 * of the new Pxp_yacc.config option drop_ignorable_whitespace
 * 	- create_element_node: accepts the arguments ~att_values
 * and ~valcheck. The first of them contains the already preprocessed
 * and normalized attribute values as att_value (and not as string).
 * The latter may be used to switch off the validation of attribute
 * lists.
 * 	- validate_contents, validate_attlist, validate: these
 * are now the core validation methods
 * 	New methods:
 * 	- complement_attlist
 * 	- improved namespace_manager
 * 	- delete_nodes, insert_nodes
 * 	- set_data
 * 	- set_attributes now official
 * 	New functions:
 * 	- strip_whitespace
 * 	- normalize
 * 	- validate
 *
 * Revision 1.15  2001/05/17 21:40:55  gerd
 * 	Changed comments.
 *
 * Revision 1.14  2001/05/17 21:38:12  gerd
 * 	Updated signatures for namespace functionality:
 * 	- methods namespace_manager, set_namespace_manager
 * 	- classes namespace_element_impl, namespace_impl, namespace_info_impl
 *
 * 	Added comments for attribute_impl, document order
 *
 * Revision 1.13  2001/04/26 23:59:36  gerd
 * 	Experimental support for namespaces: classes namespace_impl,
 * namespace_element_impl, namespace_attribute_impl.
 * 	New classes comment_impl, pinstr_impl, super_root_impl. These
 * classes have been added for stricter (runtime) type checking.
 *
 * Revision 1.12  2000/09/21 21:29:41  gerd
 * 	New functions get_*_exemplar.
 *
 * Revision 1.11  2000/09/09 16:41:03  gerd
 * 	Effort to reduce the amount of allocated memory: The number of
 * instance variables in document nodes has been minimized; the class
 * default_ext no longer stores anything; string pools have been implemented.
 *
 * Revision 1.10  2000/08/30 15:47:37  gerd
 * 	New method node_path.
 * 	New function compare.
 * 	New type ord_index with functions.
 *
 * Revision 1.9  2000/08/26 23:27:53  gerd
 * 	New function: make_spec_from_alist.
 * 	New iterators: find, find_all, find_element, find_all_elements,
 * map_tree, map_tree_sibl, iter_tree, iter_tree_sibl.
 * 	New node methods: node_position, nth_node, previous_node,
 * next_node.
 * 	Attribute and namespace types have now a string argument:
 * the name/prefix. I hope this simplifies the handling of view nodes.
 * 	First implementation of view nodes: attribute_impl. The
 * method attributes_as_nodes returns the attributes wrapped into
 * T_attribute nodes which reside outside the document tree.
 *
 * Revision 1.8  2000/08/18 20:14:00  gerd
 * 	New node_types: T_super_root, T_pinstr, T_comment, (T_attribute),
 * (T_none), (T_namespace).
 *
 * Revision 1.7  2000/07/23 02:16:34  gerd
 * 	Support for DFAs.
 *
 * Revision 1.6  2000/07/16 16:34:41  gerd
 * 	New method 'write', the successor of 'write_compact_as_latin1'.
 *
 * Revision 1.5  2000/07/14 13:56:11  gerd
 * 	Added methods id_attribute_name, id_attribute_value,
 * idref_attribute_names.
 *
 * Revision 1.4  2000/07/09 17:51:14  gerd
 * 	Element nodes can store positions.
 *
 * Revision 1.3  2000/07/04 22:05:10  gerd
 * 	New functions make_spec_from_mapping, create_data_node,
 * create_element_node.
 *
 * Revision 1.2  2000/06/14 22:19:06  gerd
 * 	Added checks such that it is impossible to mix encodings.
 *
 * Revision 1.1  2000/05/29 23:48:38  gerd
 * 	Changed module names:
 * 		Markup_aux          into Pxp_aux
 * 		Markup_codewriter   into Pxp_codewriter
 * 		Markup_document     into Pxp_document
 * 		Markup_dtd          into Pxp_dtd
 * 		Markup_entity       into Pxp_entity
 * 		Markup_lexer_types  into Pxp_lexer_types
 * 		Markup_reader       into Pxp_reader
 * 		Markup_types        into Pxp_types
 * 		Markup_yacc         into Pxp_yacc
 * See directory "compatibility" for (almost) compatible wrappers emulating
 * Markup_document, Markup_dtd, Markup_reader, Markup_types, and Markup_yacc.
 *
 * ======================================================================
 * Old logs from markup_document.mli:
 *
 * Revision 1.13  2000/05/27 19:15:08  gerd
 * 	Removed the method init_xml_standalone.
 *
 * Revision 1.12  2000/05/01 20:42:34  gerd
 *         New method write_compact_as_latin1.
 *
 * Revision 1.11  2000/04/30 18:15:57  gerd
 * 	Beautifications.
 * 	New method keep_always_whitespace_mode.
 *
 * Revision 1.10  2000/03/11 22:58:15  gerd
 * 	Updated to support Markup_codewriter.
 *
 * Revision 1.9  2000/01/27 21:51:56  gerd
 * 	Added method 'attributes'.
 *
 * Revision 1.8  2000/01/27 21:19:07  gerd
 * 	Added further methods.
 *
 * Revision 1.7  1999/11/09 22:20:14  gerd
 * 	Removed method init_dtd from class "document". The DTD is
 * implicitly passed to the document by the root element.
 *
 * Revision 1.6  1999/09/01 22:51:40  gerd
 * 	Added methods to store processing instructions.
 *
 * Revision 1.5  1999/09/01 16:19:57  gerd
 * 	The "document" class has now a "warner" as class argument.
 *
 * Revision 1.4  1999/08/19 21:59:13  gerd
 * 	Added method "reset_finder".
 *
 * Revision 1.3  1999/08/19 01:08:29  gerd
 * 	Added method "find".
 *
 * Revision 1.2  1999/08/15 02:19:41  gerd
 * 	Some new explanations: That unknown elements are not rejected
 * if the DTD allows them.
 *
 * Revision 1.1  1999/08/10 00:35:51  gerd
 * 	Initial revision.
 *
 *
 *)
