4XPath API


Application Programming Interface for 4XPath.

Usage

The XPath system has a single interface for parsing both Paths and Expressions. In both cases, the parser returns a Parsed Token that represents the path or expression as a tree. This token can then be used to process the nodes of a DOM tree.

Paths

A Path in XPath is used to select a set of nodes in a DOM tree. The top level EBNF production for a Path is the LocationPath production (see http://www.w3.org/TR/xpath). To create a parsed Path, first create an instance of XPathParser:

from xml.xpath import XPathParser
p = XPathParser.XPathParser()

Use the parseExpression method to parse a string into a parsed tree. The path expression used in this example can be broken into steps separated by '/'. The leading // selects the root node and all of its descendants.

Then child::ENTRY selects all elements named ENTRY that are children of any node in that set. Lastly, NAME[position() = 1] selects, for each of those ENTRY elements, the child element that is named NAME and is the first such NAME element. In short, this selects the first NAME child of every ENTRY element in the DOM tree.

path = p.parseExpression('//child::ENTRY/NAME[position() = 1]')
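To make the effect of this path concrete without requiring xml.xpath to be installed, here is a minimal sketch using the standard library's xml.etree.ElementTree, which supports a small XPath subset. It is a stand-in, not 4XPath itself: the sample document, the first_names helper, and the abbreviated path './/ENTRY/NAME[1]' are assumptions chosen to mirror the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names follow the example path.
SAMPLE = """
<ADDRBOOK>
  <ENTRY><NAME>Pieter</NAME><NAME>Piet</NAME><PHONENUM>555-0100</PHONENUM></ENTRY>
  <ENTRY><NAME>Emeka</NAME><PHONENUM>555-0101</PHONENUM></ENTRY>
</ADDRBOOK>
"""

def first_names(xml_text):
    """Return the text of the first NAME child of every ENTRY element."""
    root = ET.fromstring(xml_text)
    # './/ENTRY' finds ENTRY elements below the root element;
    # 'NAME[1]' keeps only the first NAME child of each one.
    return [el.text for el in root.findall(".//ENTRY/NAME[1]")]

print(first_names(SAMPLE))
```

Each ENTRY contributes at most one NAME, so the second NAME of the first entry is not selected.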

The object returned from parseExpression has the following two methods. The dump method writes the path expression to the given open file object:

import sys
path.dump(sys.stdout)

The select method uses a context to select a set of nodes. The context is made up of the context node and a context list. For top-level selects, the context list usually contains one item, the context node:

from xml.xpath import Context
c = Context.Context(node,[node])

rt = path.select(c)

rt will contain a list of the nodes selected by the expression:

print rt

Expressions

Expressions function much the same as Paths. In the EBNF grammar, an Expression's root is the Expr production. An Expression can contain Paths and a Path can contain Expressions. It all depends on what you need.

from xml.xpath import XPathParser
p = XPathParser.XPathParser()

Use the parseExpression method to parse an expression string into parsed tokens. The expression used in the example returns all child elements of the context node that have the tag name NAME or PHONENUM:

exp = p.parseExpression('NAME | PHONENUM')
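As an illustration of what the union 'NAME | PHONENUM' selects, the following sketch performs the same selection by hand with the standard library's xml.dom.minidom. The helper child_elements_named and the sample ENTRY document are hypothetical and are not part of 4XPath.

```python
from xml.dom import minidom

def child_elements_named(context_node, names):
    """Return the child elements of context_node whose tag name is in names.

    Children are visited in order, so the result is in document order,
    as a node set produced by a union would be.
    """
    return [
        child
        for child in context_node.childNodes
        if child.nodeType == child.ELEMENT_NODE and child.tagName in names
    ]

# Hypothetical context node: a single ENTRY element.
doc = minidom.parseString(
    "<ENTRY><NAME>Pieter</NAME><EMAIL>p@example.org</EMAIL>"
    "<PHONENUM>555-0100</PHONENUM></ENTRY>"
)
selected = child_elements_named(doc.documentElement, {"NAME", "PHONENUM"})
print([el.tagName for el in selected])
```

Only direct children are considered, because an unprefixed name test such as NAME uses the child axis.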

Expressions also have a dump method:

import sys
exp.dump(sys.stdout)

Expressions can be evaluated against a context. An expression returns one of four types:

  1. Boolean
  2. Number
  3. String
  4. Node Set

from xml.xpath import Context
c = Context.Context(node,[node])

rt = exp.evaluate(c)

A common use of expressions is for matching. Rules for matching a context node are described in the XPath specification. These rules have been combined into a single function call, which returns only a boolean result from the expression evaluated at the given context:

rt = Boolean.BooleanEvaluate(exp,c)
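The reduction of any of the four result types to a boolean follows the conversion rules of the XPath 1.0 boolean() function. The sketch below shows those rules using plain Python values as stand-ins (bool, int or float, str, and a list for a node set). The function xpath_boolean is hypothetical and does not show how BooleanEvaluate is implemented.

```python
import math

def xpath_boolean(value):
    """Convert a Python stand-in for an XPath value to a boolean.

    Follows the XPath 1.0 boolean() rules:
      boolean  -> unchanged
      number   -> true unless it is zero or NaN
      string   -> true if its length is non-zero
      node set -> true if it is non-empty (a list stands in for a node set)
    """
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return value
    if isinstance(value, (int, float)):
        return not (value == 0 or math.isnan(value))
    if isinstance(value, (str, list)):
        return len(value) > 0
    raise TypeError("not an XPath value stand-in: %r" % (value,))

print(xpath_boolean("false"))  # a non-empty string is true
print(xpath_boolean([]))       # an empty node set is false
```

Note that the string "false" converts to true, since only the string's length is considered.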

Optimization: Document Indexing

XPath location paths require node lists to be sorted in document order. This can be an expensive operation, so 4XPath allows users to pre-index documents for faster sorting. To do so:

from xml.xpath import Util
...
Util.IndexDocument(document_node)
...XPath operations...
Util.FreeDocumentIndex(document_node)

Do be sure to free the index to avoid memory leaks. Also note that it's a bad idea to mutate any node in the document while it is indexed.
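To show why an index makes document-order sorting cheap, and why mutation invalidates it, here is a rough sketch of the general idea using xml.dom.minidom. It is not the implementation behind Util.IndexDocument: the function build_order_index and the choice to key the index on id(node) are assumptions made for illustration, and attribute nodes are left out for brevity.

```python
from xml.dom import minidom

def build_order_index(document):
    """Map id(node) to the node's position in a preorder traversal.

    Preorder traversal visits nodes in document order, so sorting by
    this number sorts nodes into document order with one lookup each.
    """
    index = {}
    stack = [document]
    while stack:
        node = stack.pop()
        index[id(node)] = len(index)
        # Push children reversed so the first child is popped first.
        stack.extend(reversed(node.childNodes))
    return index

doc = minidom.parseString("<A><B/><C><D/></C></A>")
index = build_order_index(doc)
try:
    shuffled = [doc.getElementsByTagName(tag)[0] for tag in ("D", "B", "C")]
    ordered = sorted(shuffled, key=lambda node: index[id(node)])
    print([node.tagName for node in ordered])
finally:
    # Analogous to freeing the index once the XPath operations are done.
    index.clear()
```

Inserting or removing a node after the index is built would leave the stored positions stale, which is the reason to avoid mutating an indexed document.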

Module xml.xpath.Context

XPath context

Module Summary

Classes

Class Summary
Context Represents the context used for XPath processing at any given point

Class Context

Represents the context used for XPath processing at any given point

Attribute Summary
node The context node, as used for computing XPath expressions
position The context node's position in the context node list, as returned by the XPath position() function
size The size of the context node list
varBindings Maps variables and parameters by expanded name to the value of the variable
processorNss Provides expansion from namespace prefixes to URIs for expanded names in name tests, variable names, etc.

Method Summary
nss Get a dictionary representing namespace nodes defined at the context node

Method Details

__init__(node, position, size, varBindings, processorNss)
      

Parameters
node of type Python DOM binding node object

The context node, as used for computing XPath expressions

position of type positive integer

The context node's position in the context node list, as returned by the XPath position() function

size of type positive integer

The size of the context node list

varBindings of type dictionary whose keys are tuples of two strings and whose values are a string, integer, BooleanType or node set (list of nodes)

Maps variables and parameters by expanded name to the value of the variable. Defaults to an empty dictionary.

processorNss of type dictionary with string key and value

Provides expansion from namespace prefixes to URIs for expanded names in name tests, variable names, etc. Defaults to an empty dictionary.

Return Value
None
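The relationship between processorNss and the expanded-name keys of varBindings can be sketched as follows. This is an illustration only: the helper expand_name is hypothetical, and both the (namespace URI, local name) ordering of the tuple and the use of an empty string for "no namespace" are assumptions, since the reference above states only that keys are tuples of two strings.

```python
def expand_name(qname, processor_nss):
    """Expand 'prefix:local' to an assumed (namespace_uri, local_name) tuple.

    Unprefixed names are given an empty namespace URI. An undeclared
    prefix raises KeyError.
    """
    prefix, sep, local = qname.rpartition(":")
    if not sep:
        return ("", qname)
    return (processor_nss[prefix], local)

# Hypothetical prefix mapping and variable bindings.
processor_nss = {"ab": "http://example.org/addressbook"}
var_bindings = {
    expand_name("ab:limit", processor_nss): 10,
    expand_name("owner", processor_nss): "Pieter",
}

print(var_bindings[("http://example.org/addressbook", "limit")])
print(var_bindings[("", "owner")])
```

Keying on the expanded name rather than the prefixed name means two different prefixes bound to the same URI refer to the same variable.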


nss

nss()
      

Get a dictionary representing namespace nodes defined at the context node

Parameters
None
Return Value
dictionary with string key and string value

Maps prefixes to namespace URIs
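For readers who want to see what such a prefix-to-URI dictionary looks like, the sketch below computes the namespaces in scope at an element using xml.dom.minidom. It is not the code behind nss(): the function in_scope_namespaces is hypothetical, the empty-string key for the default namespace is an assumption, and undeclaring a default namespace with xmlns="" is not handled.

```python
from xml.dom import minidom

XML_NS = "http://www.w3.org/XML/1998/namespace"

def in_scope_namespaces(node):
    """Collect the prefix -> URI declarations visible at an element.

    Walks from the element up through its ancestors, then applies the
    declarations outermost first so that nearer declarations win.
    """
    nss = {"xml": XML_NS}  # the xml prefix is always implicitly declared
    chain = []
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        chain.append(node)
        node = node.parentNode
    for element in reversed(chain):
        attrs = element.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.name == "xmlns":
                nss[""] = attr.value  # default namespace (assumed key)
            elif attr.name.startswith("xmlns:"):
                nss[attr.name[len("xmlns:"):]] = attr.value
    return nss

doc = minidom.parseString(
    '<a xmlns:x="urn:one"><b xmlns:x="urn:two" xmlns:y="urn:why"><c/></b></a>'
)
context_node = doc.getElementsByTagName("c")[0]
print(in_scope_namespaces(context_node))
```

Here the inner declaration of the x prefix on b overrides the one on a, so the context node sees urn:two.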