The PXP user's guide
Chapter 3. The objects representing the document

3.5. Namespaces

3.5.1. Prefix normalization

Every namespace has a unique identifier, the so-called namespace URI. For example, http://www.w3.org/1999/xhtml is the namespace URI of XHTML 1.0. As URIs are quite long, it would be cumbersome to use them directly to refer to namespaces. Because of this, namespaces are primarily referred to by a shorthand notation, the namespace prefix. For example, in the following XML snippet the prefix "h" is declared as a shorthand for the XHTML namespace:

<h:html xmlns:h="http://www.w3.org/1999/xhtml"> 
  <h:head>
    <h:title>Virtual Library</h:title> 
  </h:head> 
  <h:body> 
    <h:p>Moved to <h:a href="http://vlib.org/">vlib.org</h:a>.</h:p> 
  </h:body> 
</h:html>

The meaning of a prefix can be changed anywhere in the document. In particular, a prefix can be redeclared for the scope of a single element:

<x:address xmlns:x="http://addresses.org">
  <x:name xmlns:x="http://names.org">
    Gerd Stolpmann
  </x:name>
</x:address>

Here, the inner declaration of "x" temporarily overrides the outer declaration of "x". This limits the usability of namespace prefixes, as they do not identify namespaces throughout the whole document.
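The scoping rule just described can be illustrated with a small stand-alone OCaml sketch. It is a toy model and not part of PXP: the type scope and the functions enter and resolve are hypothetical names introduced only for this illustration. A scope is modelled as a list of bindings with the innermost declaration first, so a lookup finds the innermost declaration of a prefix.

```ocaml
(* Toy model of prefix scoping; not part of PXP. *)

(* A scope is a list of (prefix, uri) bindings, innermost first. *)
type scope = (string * string) list

(* Entering an element puts its xmlns declarations in front of the
   bindings inherited from the enclosing element. *)
let enter (decls : (string * string) list) (outer : scope) : scope =
  decls @ outer

(* List.assoc_opt returns the first match, i.e. the innermost binding. *)
let resolve (sc : scope) (prefix : string) : string option =
  List.assoc_opt prefix sc

(* The two scopes of the x:address example. *)
let address_scope = enter [ ("x", "http://addresses.org") ] []
let name_scope = enter [ ("x", "http://names.org") ] address_scope

let () =
  let show sc =
    match resolve sc "x" with
    | Some uri -> uri
    | None -> "(unbound)"
  in
  Printf.printf "x inside x:address -> %s\n" (show address_scope);
  Printf.printf "x inside x:name    -> %s\n" (show name_scope)
```

The same prefix "x" resolves to two different URIs depending on where it is looked up, which is why a prefix alone cannot identify a namespace in the unnormalized document.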

Many other parsers represent the namespace declarations explicitly by creating namespace nodes for the declarations. However, this has the disadvantage that you have to resort to the namespace URIs in order to identify namespaces in your programs, as the prefixes are not unique.

PXP has a different mode of processing namespaces. The prefixes are transformed while the document is being parsed such that they become unique. This transformation is called "prefix normalization". For example, the above x:address example would be transformed to

<x:address xmlns:x="http://addresses.org">
  <x1:name xmlns:x1="http://names.org">
    Gerd Stolpmann
  </x1:name>
</x:address>

and the parsed tree would have an outer element with node type T_element "x:address", and an inner element with node type T_element "x1:name". The obvious advantage is that the names of elements are still simple strings, and that it is not necessary to deal with pairs (namespace_uri,localname).
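The idea of prefix normalization can be sketched in a few lines of stand-alone OCaml. This is only an illustration under stated assumptions, not PXP's implementation: the record type manager and the functions create, bind and normalize are hypothetical, and the rule for inventing a fresh prefix (appending the first free number) is an assumption chosen so that the result matches the x1 of the example above. PXP may choose fresh prefixes differently.

```ocaml
(* Toy prefix normalizer; a sketch, not PXP's implementation. *)
type manager = {
  by_uri : (string, string) Hashtbl.t;    (* uri -> normalized prefix *)
  by_prefix : (string, string) Hashtbl.t; (* normalized prefix -> uri *)
}

let create () = { by_uri = Hashtbl.create 8; by_prefix = Hashtbl.create 8 }

(* Record that [prefix] is the normalized prefix of [uri]. *)
let bind m prefix uri =
  Hashtbl.replace m.by_uri uri prefix;
  Hashtbl.replace m.by_prefix prefix uri

(* Return the normalized prefix for [uri]. If the URI is new, try the
   prefix used in the document first, then prefix1, prefix2, ...
   (assumed naming rule) until an unused one is found. *)
let normalize m prefix uri =
  match Hashtbl.find_opt m.by_uri uri with
  | Some p -> p
  | None ->
    let rec pick n =
      let candidate = if n = 0 then prefix else prefix ^ string_of_int n in
      if Hashtbl.mem m.by_prefix candidate then pick (n + 1) else candidate
    in
    let p = pick 0 in
    bind m p uri;
    p

(* The x:address example: both elements use the prefix "x". *)
let m = create ()
let outer = normalize m "x" "http://addresses.org" ^ ":address"
let inner = normalize m "x" "http://names.org" ^ ":name"

let () = Printf.printf "%s contains %s\n" outer inner
```

After normalization every prefix stands for exactly one URI, so the element names can be compared as plain strings.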

Furthermore, it is possible to control which prefixes are preferred. By manipulating the namespace_manager it is possible to demand a certain prefix for a certain namespace URI:

dtd # namespace_manager # add_namespace "addr" "http://addresses.org";
dtd # namespace_manager # add_namespace "nm" "http://names.org";

Now the normalized text reads:

<addr:address xmlns:addr="http://addresses.org">
  <nm:name xmlns:nm="http://names.org">
    Gerd Stolpmann
  </nm:name>
</addr:address>

This has the advantage that you know in advance which prefixes will be used, which simplifies programming a lot.
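Why fixed prefixes simplify programming can be shown with a stand-alone OCaml sketch. The table and the functions add_namespace, normalized_name and describe below are hypothetical stand-ins written for this illustration. They imitate the idea of the namespace manager but are not the PXP class and do not use the PXP library.

```ocaml
(* Hypothetical stand-in for a namespace manager; not the PXP class. *)
let table : (string, string) Hashtbl.t = Hashtbl.create 8 (* uri -> prefix *)

(* Demand a certain normalized prefix for a namespace URI. *)
let add_namespace prefix uri = Hashtbl.replace table uri prefix

(* Map a (uri, localname) pair to its normalized "prefix:localname". *)
let normalized_name uri localname =
  match Hashtbl.find_opt table uri with
  | Some p -> Some (p ^ ":" ^ localname)
  | None -> None

let () =
  add_namespace "addr" "http://addresses.org";
  add_namespace "nm" "http://names.org"

(* Because the prefixes are known in advance, code that inspects
   element names can match on fixed strings. *)
let describe name =
  match name with
  | "addr:address" -> "an address element"
  | "nm:name" -> "a name element"
  | other -> "unknown element " ^ other

let () =
  match normalized_name "http://names.org" "name" with
  | Some n -> print_endline (describe n)
  | None -> print_endline "namespace not registered"
```

The pattern match in describe never mentions a namespace URI. It relies only on the prefixes registered beforehand, whatever prefixes the input document happens to use.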

3.5.2. DTDs

PXP defines a processing instruction for DTDs doing the same as the add_namespace method:

<?pxp:dtd namespace prefix="p" uri="u"?>

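This makes it possible to declare elements (and attributes) for documents no matter which prefixes are actually used. For example, the following document is valid although it uses the prefix "x" for both namespaces, because prefix normalization maps its element names to addr:address and nm:name, the names declared in the DTD: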

<?xml version="1.0"?>
<!DOCTYPE addr:address [

<!ELEMENT addr:address (nm:name)>
<!ELEMENT nm:name (#PCDATA)>

<?pxp:dtd namespace prefix="addr" uri="http://addresses.org"?>
<?pxp:dtd namespace prefix="nm"   uri="http://names.org"?>
]>

<x:address xmlns:x="http://addresses.org">
  <x:name xmlns:x="http://names.org">
    Gerd Stolpmann
  </x:name>
</x:address>

3.5.3. How to enable namespace processing

By default, PXP performs no namespace processing. To enable it, set the parser option enable_namespace_processing to true. This makes the parser recognize namespace declarations, and it also enables prefix normalization.

Furthermore, it is recommended to use the class namespace_element_impl instead of element_impl.

Class: 'ext namespace_element_impl

Description: This class is an implementation of node which realizes element nodes. In contrast to element_impl, this class also implements the namespace methods. You can create a new object by

let exemplar = new namespace_element_impl ext_obj

which creates a special form of empty element which already contains a reference to ext_obj, but is otherwise empty. This special form is called an element exemplar. To get a working element that can be used in a node tree, you must apply the method create_element to the exemplar object.


This web site is published by Informatikbüro Gerd Stolpmann
