Plasma GitLab Archive

Class Pxp_dtd.dtd

class dtd : ?swarner:Pxp_core_types.I.symbolic_warnings -> Pxp_core_types.I.collect_warnings -> Pxp_core_types.I.rep_encoding -> object .. end
DTD objects have two purposes:
  • They are containers for global declarations that apply to the whole XML document. This includes the character set, the standalone declaration, and all declarations that can appear in the "DTD part" of a document.
  • They also express formal constraints the document must fulfill, such as validity or (less ambitiously) well-formedness.
Normally, programmers need neither to create such objects nor to fill them with data, as the parser already does this. If a DTD object must be created explicitly, the recommended function is Pxp_dtd.create_dtd.

Despite its name, this object defines not only the DTD as such (i.e. what would be found in a ".dtd" file), but also all formal requirements of documents that are needed by PXP. This also includes:

  • The name of the root element
  • The character encoding of the document
  • Whether validation is on or off
  • The namespace manager
  • Whether the document is declared as standalone
A consequence of this is that even documents that only have to comply with the relatively weak well-formedness constraints have a DTD object.

For some introductory words about well-formedness mode, see Parsing in well-formedness mode.


method root : string option
get the name of the root element if present. This is the name following "<!DOCTYPE". If there is no DOCTYPE declaration, this method will return None.
method set_root : string -> unit
set the name of the root element. This method can be invoked only once (usually by the parser).
method id : Pxp_core_types.I.dtd_id option
get the identifier for this DTD. Possible return values:
  • None: There is no DOCTYPE declaration, or only <!DOCTYPE name>
  • Some Internal: There is a DOCTYPE declaration with material in brackets like <!DOCTYPE name [ declarations ... ]>
  • Some(External xid): There is a DOCTYPE declaration with a SYSTEM or PUBLIC identifier (described by xid), but without brackets, i.e. <!DOCTYPE name SYSTEM '...'> or <!DOCTYPE name PUBLIC '...' '...'> .
  • Some(Derived xid): There is a DOCTYPE declaration with a SYSTEM or PUBLIC identifier (described by xid), and with brackets
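
The four cases above map naturally onto a pattern match. The following is a minimal sketch, not PXP code: the types ext_id and dtd_id are local stand-ins that only mirror the constructor names mentioned on this page (the payloads of Public and Private are assumptions and are simplified), and describe_id is a hypothetical helper.

```ocaml
(* Local stand-ins mirroring the constructor names used on this page.
   They are NOT the PXP definitions; the payloads are simplified. *)
type ext_id =
  | System of string
  | Public of string * string
  | Anonymous
  | Private

type dtd_id =
  | External of ext_id
  | Derived of ext_id
  | Internal

(* Hypothetical helper: summarise what kind of DOCTYPE declaration
   a value returned by the id method stands for. *)
let describe_id (id : dtd_id option) : string =
  match id with
  | None -> "no DOCTYPE declaration, or only <!DOCTYPE name>"
  | Some Internal -> "declarations in brackets only"
  | Some (External _) -> "SYSTEM or PUBLIC identifier, no brackets"
  | Some (Derived _) -> "SYSTEM or PUBLIC identifier, with brackets"

let () =
  print_endline (describe_id (Some (External (System "doc.dtd"))))
```

Because the match is exhaustive, the compiler reports any case that is forgotten.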

method set_id : Pxp_core_types.I.dtd_id -> unit
set the identifier. This method can be invoked only once.
method encoding : Pxp_core_types.I.rep_encoding
returns the encoding used for character representation
method lexer_factory : Pxp_lexer_types.lexer_factory
Returns a lexer factory for the character encoding
method allow_arbitrary : unit
This method sets the arbitrary_allowed flag. This flag disables a specific validation constraint, namely that all elements need to be declared in the DTD. This feature is used to implement the well-formedness mode: In this mode, the element, attribute, and notation declarations found in the textual DTD are ignored, and not added to this DTD object. As the arbitrary_allowed flag is also set, the net effect is that all validation checks regarding the values of elements and attributes are omitted. The flag is automatically set if the parser is called using one of the "wf" functions, e.g. Pxp_tree_parser.parse_wfdocument_entity.

Technically, the arbitrary_allowed flag changes the behaviour of the element and notation methods defined below so that they raise Undeclared instead of Validation_error when an unknown element or notation name is encountered.
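
The effect of the flag on lookups can be illustrated with a toy model. This sketch does not use PXP: the exceptions Undeclared and Validation_error are local stand-ins, toy_dtd stores content models as plain strings, and content_model_opt is a hypothetical caller-side helper that treats Undeclared as "no constraints for this name".

```ocaml
(* Stand-in exceptions; PXP defines its own exceptions of these names. *)
exception Undeclared
exception Validation_error of string

(* Toy model of the flag: declarations are name/content-model strings. *)
class toy_dtd = object
  val declared : (string, string) Hashtbl.t = Hashtbl.create 8
  val mutable arbitrary = false
  method allow_arbitrary = arbitrary <- true
  method disallow_arbitrary = arbitrary <- false
  method arbitrary_allowed = arbitrary
  method add_element name model = Hashtbl.replace declared name model
  method element name =
    match Hashtbl.find_opt declared name with
    | Some model -> model
    | None ->
        (* The flag only decides which exception reports the miss. *)
        if arbitrary then raise Undeclared
        else raise (Validation_error ("undeclared element: " ^ name))
end

(* Hypothetical helper: Undeclared means "nothing to check". *)
let content_model_opt dtd name =
  match dtd#element name with
  | model -> Some model
  | exception Undeclared -> None

let () =
  let d = new toy_dtd in
  d#add_element "a" "EMPTY";
  d#allow_arbitrary;
  assert (content_model_opt d "a" = Some "EMPTY");
  assert (content_model_opt d "b" = None)
```

With the flag cleared, the same lookup of "b" lets Validation_error propagate to the caller.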

method disallow_arbitrary : unit
Clears the arbitrary_allowed flag again
method arbitrary_allowed : bool
Returns whether arbitrary contents are allowed or not.
method standalone_declaration : bool
Whether there is a 'standalone' declaration or not.
method set_standalone_declaration : bool -> unit
Sets the 'standalone' declaration.
method namespace_manager : namespace_manager
For namespace-aware implementations of the node class, this method returns the namespace manager. If the namespace manager has not been set, the exception Not_found is raised.
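
Callers that prefer an option over the Not_found exception can wrap the call. A minimal sketch: namespace_manager_opt is a hypothetical helper, written against any object that has a namespace_manager method, and the two toy objects below merely stand in for a DTD with and without a manager.

```ocaml
(* The row type < namespace_manager : 'a; .. > is inferred, so the helper
   accepts any object offering this method. *)
let namespace_manager_opt dtd =
  try Some dtd#namespace_manager with Not_found -> None

(* Toy objects standing in for a DTD; the string is a placeholder for a
   real namespace manager. *)
let without_manager = object
  method namespace_manager : string = raise Not_found
end

let with_manager = object
  method namespace_manager = "manager"
end

let () =
  assert (namespace_manager_opt without_manager = None);
  assert (namespace_manager_opt with_manager = Some "manager")
```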
method set_namespace_manager : namespace_manager -> unit
Sets the namespace manager as returned by namespace_manager.
method add_element : dtd_element -> unit
add the given element declaration to this DTD. Raises Not_found if there is already an element declaration with the same name.
method add_gen_entity : Pxp_entity.entity -> bool -> unit
add_gen_entity e extdecl: add the entity e as general entity to this DTD (general entities are those represented by &name;). If there is already a declaration with the same name, the second definition is ignored. As an exception to this rule, entities with the names "lt", "gt", "amp", "quot", and "apos" may only be redeclared with a definition that is equivalent to the standard definition; otherwise a Validation_error is raised.

extdecl: true indicates that the entity declaration occurs in an external entity. (Used for the standalone check.)
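
The "first declaration wins" rule and the special treatment of the five predefined names can be modelled in a few lines. This is a sketch under simplifying assumptions, not the PXP implementation: entities are reduced to their replacement text, "equivalent to the standard definition" is approximated by string equality, and Validation_error is a local stand-in exception.

```ocaml
(* Stand-in for the PXP exception of the same name. *)
exception Validation_error of string

(* Predefined entities with their replacement text. Comparing with (<>)
   is a simplification of "equivalent to the standard definition". *)
let predefined =
  [ ("lt", "<"); ("gt", ">"); ("amp", "&"); ("quot", "\""); ("apos", "'") ]

(* table maps an entity name to (replacement text, extdecl). *)
let add_gen_entity table name text extdecl =
  (match List.assoc_opt name predefined with
   | Some standard when text <> standard ->
       raise (Validation_error ("bad redeclaration of &" ^ name ^ ";"))
   | _ -> ());
  (* A later declaration of an existing name is silently ignored. *)
  if not (Hashtbl.mem table name) then Hashtbl.add table name (text, extdecl)

let () =
  let table = Hashtbl.create 16 in
  add_gen_entity table "copy" "(c)" false;
  add_gen_entity table "copy" "Copyright" true;
  assert (Hashtbl.find table "copy" = ("(c)", false))
```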

method add_par_entity : Pxp_entity.entity -> unit
add the given entity as parameter entity to this DTD (parameter entities are those represented by %name;). If there is already a declaration with the same name, the second definition is ignored.
method add_notation : dtd_notation -> unit
add the given notation to this DTD. If there is already a declaration with the same name, a Validation_error is raised.
method add_pinstr : proc_instruction -> unit
add the given processing instruction to this DTD.
method element : string -> dtd_element
looks up the element declaration with the given name. Raises Validation_error if the element cannot be found. If the arbitrary_allowed flag is set, however, Undeclared is raised instead.
method element_names : string list
returns the list of the names of all element declarations.
method gen_entity : string -> Pxp_entity.entity * bool
let e, extdecl = obj # gen_entity n: looks up the general entity e with the name n. Raises WF_error if the entity cannot be found.

extdecl: indicates whether the entity declaration occurred in an external entity.

method gen_entity_names : string list
returns the list of all general entity names
method par_entity : string -> Pxp_entity.entity
looks up the parameter entity with the given name. Raises WF_error if the entity cannot be found.
method par_entity_names : string list
returns the list of all parameter entity names
method notation : string -> dtd_notation
looks up the notation declaration with the given name. Raises Validation_error if the notation cannot be found. If the arbitrary_allowed flag is set, however, Undeclared is raised instead.
method notation_names : string list
Returns the list of the names of all added notations
method pinstr : string -> proc_instruction list
looks up all processing instructions with the given target. The "target" is the identifier following <?.
method pinstr_names : string list
Returns the list of the names (targets) of all added pinstrs
method validate : unit
ensures that the DTD is valid. This method is optimized such that actual validation is only performed if the DTD has been changed. If the DTD is invalid, in most cases a Validation_error is raised, but other exceptions are possible, too.
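
One common way to get this "only revalidate after a change" behaviour is a flag that every modification resets. The sketch below illustrates that general technique with a toy record; it is an assumption about the idea, not a description of how PXP implements validate, and all names in it are hypothetical.

```ocaml
type checker = {
  mutable names : string list;  (* stand-in for the stored declarations *)
  mutable validated : bool;     (* true while the last check is still current *)
  mutable runs : int;           (* counts how often the real check executed *)
}

let create () = { names = []; validated = false; runs = 0 }

let add c name =
  c.names <- name :: c.names;
  c.validated <- false          (* any change invalidates the cached result *)

let validate c =
  if not c.validated then begin
    c.runs <- c.runs + 1;
    (* Placeholder check; a failure leaves validated = false. *)
    List.iter (fun n -> if n = "" then failwith "empty name") c.names;
    c.validated <- true
  end

let () =
  let c = create () in
  add c "a";
  validate c;
  validate c;                   (* skipped: nothing changed *)
  assert (c.runs = 1)
```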
method only_deterministic_models : unit
Succeeds if all regexp content models are deterministic. Otherwise, Validation_error is raised.
method write : ?root:string ->
Pxp_core_types.I.output_stream -> Pxp_core_types.I.encoding -> bool -> unit
write os enc doctype: Writes the DTD as enc-encoded string to os. If doctype, a DTD like <!DOCTYPE root [ ... ]> is written. If not doctype, only the declarations are written (the material within the square brackets).

The entity definitions are not written. However, it is ensured that the generated string does not contain any reference to an entity. The reason for the omission of the entities is that there is no generic way of writing references to external entities.

Option root: Override the name of the root element in the DOCTYPE clause.

method write_ref : ?root:string ->
Pxp_core_types.I.output_stream -> Pxp_core_types.I.encoding -> unit
write_ref os enc: Writes a reference to the DTD as enc-encoded string to os. The reference looks as follows:
   <!DOCTYPE root SYSTEM ... > or
   <!DOCTYPE root PUBLIC ... >
 
Of course, the DTD must have an external ID:
  • dtd#id = Some(External(System ...)) or
  • dtd#id = Some(External(Public ...))
If the DTD is internal or mixed, the method write_ref will fail. If the ID is anonymous or private, the method will fail, too.

Option root: Override the name of the root element in the DOCTYPE clause.
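
The conditions under which write_ref succeeds or fails can be summarised as a pattern match. This is a sketch only: ext_id and dtd_id are local stand-ins that mirror the constructor names used on this page (payloads are assumptions), doctype_ref is a hypothetical function returning a string instead of writing to an output stream, and the quoting is naive, so the result is not claimed to match PXP's output byte for byte.

```ocaml
(* Local stand-ins; NOT the PXP definitions. *)
type ext_id =
  | System of string
  | Public of string * string
  | Anonymous
  | Private

type dtd_id =
  | External of ext_id
  | Derived of ext_id
  | Internal

(* Build a DOCTYPE reference, or fail in the cases where the text above
   says a reference cannot be written. Identifiers that themselves contain
   double quotes are not handled here. *)
let doctype_ref (root : string) (id : dtd_id option) : string =
  match id with
  | Some (External (System sysid)) ->
      Printf.sprintf "<!DOCTYPE %s SYSTEM \"%s\">" root sysid
  | Some (External (Public (pubid, sysid))) ->
      Printf.sprintf "<!DOCTYPE %s PUBLIC \"%s\" \"%s\">" root pubid sysid
  | Some (External (Anonymous | Private)) ->
      failwith "anonymous or private ID cannot be referenced"
  | Some (Derived _ | Internal) | None ->
      failwith "DTD has no purely external ID"

let () =
  print_endline (doctype_ref "doc" (Some (External (System "doc.dtd"))))
```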

This web site is published by Informatikbüro Gerd Stolpmann