The previous section used signals and slots for GUI building.
You can, of course, use signals and slots to link
GUI widgets with each other, and most of your slot
implementations will be in subclasses of
QWidget, but the mechanism works
equally well in other circumstances. A GUI is not necessary.
In this section, I will show how signals
and slots make a natural extension to the event-driven nature of
XML parsers. As you probably know, XML is a fairly simple
mark-up language that can be used to represent hierarchical
data. There are basically two ways to look at XML data. One is
to convert the data in one fell swoop into some hierarchical
representation (for example, dictionaries containing
dictionaries). This method is the DOM (Document Object Model)
representation. Alternatively, you can parse the data character
by character, generating an event every time a certain chunk has
been completed; this is the SAX (Simple API for XML) parser model.
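The difference between the two models can be sketched with the standard library alone. This is only an illustration: it assumes a current Python with the xml.dom.minidom and xml.sax modules, and the sample document and the TagCollector class are made up for the occasion.

```python
import xml.dom.minidom
import xml.sax

# A made-up sample document, used only for this comparison.
DOCUMENT = "<book><chapter>Signals</chapter><chapter>Slots</chapter></book>"

# DOM style: the whole document is converted into a tree first,
# and only then do we walk around in it.
dom = xml.dom.minidom.parseString(DOCUMENT)
dom_tags = [node.tagName for node in dom.getElementsByTagName("chapter")]


# SAX style: the parser calls back into a handler while it is reading.
class TagCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


collector = TagCollector()
xml.sax.parseString(DOCUMENT.encode("utf-8"), collector)
```

After the run, dom_tags holds the chapter elements found in the finished tree, while collector.events holds the open and close events in the order the parser met them. It is this stream of events that maps so naturally onto signals.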
Python contains support for both XML
handling models in its standard libraries. The currently
preferred module is xml.sax, which can make use of the fast
expat parser. However, at the time of writing, expat is not part
of standard Python.
There is an older, deprecated module, xmllib, which uses regular
expressions for parsing. While deprecated, this module is still
the most convenient introduction to XML handling with Python.
It's also far more ‘Pythonic’ in feel than the sax module,
which is based on the way Java does things.
We'll create a special module that will use
xmllib to parse an XML document and generate PyQt signals for
all elements of that document. It is easy to connect these
signals to another object (for instance, a PyQt
QListView which can show the XML document
in a treeview). But it would be just as easy to create a
formatter object that would present the data as HTML. A slightly
more complicated task would be to create a formatter object that
would apply XSLT transformations to the XML document —
that is, it would format the XML using stylesheets. Using
signals and slots, you can connect more than one transformation
to the same run of the parser. A good example would be a
combination of a GUI, a validator, and a statistics
calculator.
The next example is very simple. It is easy to extend,
though, with special nodes for comments, a warning message
box for errors, and more columns for attributes.
Example 7-9. An XML parser with signals and slots
#
# qtparser.py — a simple parser that, using xmllib,
# generates a signal for every parsed XML document.
#
import sys
import xmllib
from qt import *
TRUE=1
FALSE=0
We import the deprecated
xmllib module. It is deprecated because the sax module,
which uses the expat library, is a lot faster.
The xmllib module is far easier to use, however, and since it
uses regular expressions for its parsing, it is
available everywhere, while the expat library must be
compiled separately.
It is often convenient to
define constants for the boolean values true and false.
This is the Parser class. It
inherits the XMLParser class from
the xmllib module. The XMLParser
class can be used in two ways: by overriding a set of
special methods that are called when the parser
encounters a certain kind of XML element, or by
overriding a variable, self.elements,
which refers to a dictionary of tag-to-method mappings.
Overriding self.elements is very
helpful if you are writing a parser for a certain DTD or
XML document type definition, though it is not the way
to go for a generic XML structure viewer (such as the
one we are making now).
A parser for Designer ui files, for example, could define
self.elements as a dictionary with one entry for every tag
it is interested in.
The keys of this dictionary are the actual tag
strings. The value for each key is a tuple that consists of the
functions that should be called for the opening and the
closing tag. If you don't want a function to be called,
enter None. Of course, you must implement these
functions yourself, in the derived parser class.
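To make the idea of a tag-to-method mapping concrete, here is a small sketch in plain Python. It does not use xmllib at all: the UiDispatcher class, its dispatch_start and dispatch_end methods, and the handler names are hypothetical, and the choice of the widget and property tags is just an example of what a Designer file contains. I also assume that start functions receive the attributes and end functions receive nothing.

```python
class UiDispatcher:
    """Hypothetical stand-in for a parser class with an elements dictionary."""

    def __init__(self):
        self.log = []
        # tag string -> (function for opening tag, function for closing tag);
        # None means that nothing should be called.
        self.elements = {
            "widget": (self.start_widget, self.end_widget),
            "property": (self.start_property, None),
        }

    def start_widget(self, attributes):
        self.log.append("open widget")

    def end_widget(self):
        self.log.append("close widget")

    def start_property(self, attributes):
        self.log.append("open property")

    def dispatch_start(self, tag, attributes):
        # Tags that are not in the dictionary are silently ignored here.
        start, _end = self.elements.get(tag, (None, None))
        if start is not None:
            start(attributes)

    def dispatch_end(self, tag):
        _start, end = self.elements.get(tag, (None, None))
        if end is not None:
            end()


dispatcher = UiDispatcher()
dispatcher.dispatch_start("widget", {})
dispatcher.dispatch_start("property", {"name": "caption"})
dispatcher.dispatch_end("property")   # mapped to None, so nothing happens
dispatcher.dispatch_end("widget")
dispatcher.dispatch_start("layout", {})  # not in the dictionary
```

A dictionary like this works well when the set of tags is known in advance. A generic viewer cannot know the tags beforehand, which is why it needs a catch-all method instead.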
The first argument (after self, of course) to the
constructor is a QObject. Multiple inheritance isn't a
problem in Python, generally speaking, but you cannot
multiply inherit from PyQt classes. Sip gets hopelessly
confused if you do so. So we pass a
QObject to the constructor of the
Parser class. Later, we will have
this QObject object emit the
necessary signals.
The start
function takes a string as its parameter. This string
should contain the entire XML document. It is
also possible to rewrite this function to read a file line by
line; the default approach makes it difficult to work with
really large XML files. Reading a file line by line is a
lot easier on your computer's memory. You should call
close() after the last bit of text
has been passed to the parser.
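Reading line by line could look roughly like the sketch below. Since xmllib is no longer shipped with current Python versions, the sketch assumes the standard xml.sax module instead, whose incremental parser offers a comparable pair of feed() and close() methods. ElementCounter and parse_line_by_line are names invented for this illustration, and an in-memory file stands in for a real one.

```python
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Counts opening tags, just to show that the parser did its work."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def parse_line_by_line(lines, handler):
    """Feed an iterable of byte lines to the parser, one line at a time."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for line in lines:
        parser.feed(line)
    # Tell the parser that the last bit of text has been passed.
    parser.close()


# io.BytesIO behaves like a file opened in binary mode.
source = io.BytesIO(b"<ui>\n<widget>\n<property/>\n</widget>\n</ui>\n")
counter = ElementCounter()
parse_line_by_line(source, counter)
```

Only one line of the file is held in memory at any time, so the size of the document no longer matters much.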
The
xmllib.XMLParser class defines a
number of methods that should be overridden if you want
special behavior. Even though we will only use the
methods that are called when a document is started and
when a simple element is opened and closed, I've
implemented all possible functions here.
Every valid XML document should
start with a magic text that declares itself to be XML.
Note that the .ui Designer files don't
comply with this requirement. This method is called (and
thus the signal is emitted) when the parser encounters
this declaration. Normally, it looks like this:
<?xml version="1.0"
standalone="no"?>, with the minor
variation that standalone can also have the value "yes".
If an XML document has a
document type declaration, this method is called. A doctype
declaration looks like this:
<!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V3.1.4//EN"
"http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd">
and points to a DTD — a
description of what's allowed in this particular kind of
XML document.
There can be data in between the tags in an XML document,
just as with the text in an HTML document. This function is
called when the parser encounters such data.
In XML, you can use special
characters that are entered with &#, a number, and
closed with a semicolon. Python's xmllib will want to
translate this to an ASCII character. You cannot use
xmllib to parse documents that contain references to
Unicode characters.
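For comparison, the following sketch shows what happens to such character references in the standard xml.sax module of a current Python, which does not share this limitation. The TextCollector class and the sample text are assumptions made for the illustration; they are not part of qtparser.py.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Gathers the character data that the parser reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver the text in several pieces.
        self.chunks.append(content)


collector = TextCollector()
# &#233; is an e with an acute accent, &#8364; is the euro sign.
sample = "<word>caf&#233; &#8364;5</word>"
xml.sax.parseString(sample.encode("utf-8"), collector)
text = "".join(collector.chunks)
```

The references arrive in the handler already translated into the characters they stand for, including the euro sign, which lies well outside ASCII.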
XML has the same kind of comments as HTML. Most
parsers simply skip the comments, but if you want to show
them (for instance, in a structured view of an XML document)
or if you want to preserve the contents of the file exactly,
you can connect a slot to the signal emitted by this function.
CDATA is literal data enclosed
between <![CDATA[ and
]]>. A file containing
<![CDATA[surely you will be allowed to
starve to death in one of the royal parks.]]>
will present the quote
‘surely you will be allowed to starve to death in
one of the royal parks.’ to any slot that is connected
to sigCData.
This is called when the XML document
contains processing instructions. A processing
instruction begins with <?. All special cases, such
as the XML declaration itself, are handled by other
methods.
You can declare entities in XML
— references to something externally defined.
Those start with <!. The contents of the declaration
will be passed on in the data
argument.
XML is far less forgiving than HTML
(or at least, XML has both a stricter definition and
less easy-going parsers), and whenever an error is
encountered, such as forgetting to close a tag, this
method is called.
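How strict a parser is can be seen in a few lines. The sketch below assumes the standard xml.sax module rather than xmllib, and check_well_formed is a hypothetical helper: xml.sax reports the problem by raising an exception instead of calling a method on the parser, but the effect is the same, in that parsing stops at the first forgotten closing tag.

```python
import xml.sax


def check_well_formed(text):
    """Return None for a well-formed document, or a short error description."""
    try:
        xml.sax.parseString(text.encode("utf-8"), xml.sax.ContentHandler())
    except xml.sax.SAXParseException as error:
        return "line %d: %s" % (error.getLineNumber(), error.getMessage())
    return None


good = check_well_formed("<a><b></b></a>")
bad = check_well_formed("<a><b></a>")   # the b element is never closed
```

A slot connected to an error signal could show the returned description in a message box.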
unknown_starttag is the most
interesting method in the
xmllib.XMLParser class. This is
called whenever the xmllib parser encounters a plain tag
that is not present in its elements
dictionary. That is, it will be called for all elements
in our current implementation.
Unknown entities are forbidden in
XML — if you use an entity somewhere in your
document (which you can do by placing the name of the
entity between an ampersand and a semicolon), then it
must be declared. However, you might want to catch
occurrences of unknown entities and do something
special. That's why the function
unknown_entityref is implemented
here. By default unknown_entityref
calls the syntax_error() function
of xmllib.XMLParser.
The TreeView class will show the
contents of the XML file.
class TreeView(QListView):

    def __init__(self, *args):
        apply(QListView.__init__,(self, ) + args)
        self.stack=[]
        self.setRootIsDecorated(TRUE)
        self.addColumn("Element")

    def startDocument(self, tag, pubid, syslit, data):
        i=QListViewItem(self)
        if tag == None: tag = "None"
        i.setText(0, tag)
        self.stack.append(i)

    def startElement(self, tag, attributes):
        if tag == None: tag = "None"
        i=QListViewItem(self.stack[-1])
        i.setText(0, tag)
        self.stack.append(i)

    def endElement(self, tag):
        del(self.stack[-1])
The TreeView
class is a simple subclass of PyQt's versatile
QListView class.
Because XML is a hierarchical file
format, elements are neatly nested in each other. In
order to be able to create the right treeview, we should
keep a stack of the current element depth. The last
element of the stack will be the parent element of all
new elements.
This option sets the beginning of
the tree at the first element, making it clear to the
user that it's an expandable tree instead of a simple
list.
We present only one column in the
listview — if you want to show the attributes of
elements, too, you might add a few more columns.
The
startDocument function is called
when the XML document is opened. It also starts the element
stack by creating the first item. The first
QListViewItem object has the
listview as a parent; all others will have a
QListViewItem object as parent.
The constructor of QListViewItem
is so overloaded that sip tends to get confused, so I
create the item and set its text separately.
Whenever an element is opened, a
QListViewItem item is created and
pushed on the stack, where it becomes the parent for newly
opened elements.
Conversely, when the element is
closed, it is popped from the
stack.
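The stack technique does not depend on Qt. The sketch below repeats it with plain Python objects so that it can be tried without a GUI; Node, TreeBuilder and outline are invented names, with Node standing in for QListViewItem, and the startDocument method is simplified to take only the tag.

```python
class Node:
    """Plays the role of QListViewItem: an item that knows its children."""

    def __init__(self, text, parent=None):
        self.text = text
        self.children = []
        if parent is not None:
            parent.children.append(self)


class TreeBuilder:
    def __init__(self):
        self.root = None
        self.stack = []

    def startDocument(self, tag):
        # The first item has no item as parent and starts the stack.
        self.root = Node(tag or "None")
        self.stack.append(self.root)

    def startElement(self, tag, attributes):
        # The last item on the stack is the parent of the new item.
        item = Node(tag, self.stack[-1])
        self.stack.append(item)

    def endElement(self, tag):
        self.stack.pop()


def outline(node, depth=0):
    """Return the tree as a list of lines, indented by depth."""
    lines = ["  " * depth + node.text]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.startDocument("xml")
builder.startElement("ui", {})
builder.startElement("widget", {})
builder.endElement("widget")
builder.startElement("class", {})
builder.endElement("class")
builder.endElement("ui")
tree_lines = outline(builder.root)
```

Because widget is popped before class is opened, both end up as children of ui, exactly as they are nested in the document.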
Here we create a
QObject which is used to emit all
necessary signals, since we cannot inherit from more
than one PyQt class at the same time. Note that by using
this technique, you don't have to subclass from
QObject in order to be able to
emit signals. Sometimes delegation works just as
well.
A parser object is created, with the
QObject object as its argument.
Before feeding the parser the text, all connections
we want are made from the QObject
object (which we passed to the parser to make sure it
can emit signals) to the TreeView
object that forms the main window.
The file whose name was given on the command
line is read and passed on to the parser. I have included
a very small test file, test.xml,
but you can use any Designer UI design file.
This is a very simple and convenient way
of working with XML files and PyQt GUIs, but the technique is
generally useful, too. The standard way of working with XML
files and parsers allows for only one function to be called for
each tag. Using signals and slots, you can have as many slots
connected to each signal as you want. For instance, you can have
not only a GUI, but also an analyzer that produces statistics
listening in on the same parsing run.
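The sketch below illustrates several listeners on one parsing run without needing Qt. Everything in it is an assumption made for the illustration: Signal is a tiny stand-in for a real Qt signal, Emitter plays the part of the QObject the parser delegates to, the standard xml.sax module replaces xmllib, and the signal names merely follow the style of sigCData.

```python
import xml.sax


class Signal:
    """Tiny stand-in for a Qt signal: a list of connected callables."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def emit(self, *args):
        for slot in self._slots:
            slot(*args)


class Emitter:
    """Plays the role of the QObject that emits on behalf of the parser."""

    def __init__(self):
        self.sigStartElement = Signal()
        self.sigEndElement = Signal()


class SignallingHandler(xml.sax.ContentHandler):
    """Turns parser events into signals by delegating to the emitter."""

    def __init__(self, emitter):
        super().__init__()
        self.emitter = emitter

    def startElement(self, name, attrs):
        self.emitter.sigStartElement.emit(name, dict(attrs))

    def endElement(self, name):
        self.emitter.sigEndElement.emit(name)


class Statistics:
    """First listener: counts how often each tag occurs."""

    def __init__(self):
        self.counts = {}

    def countElement(self, tag, attributes):
        self.counts[tag] = self.counts.get(tag, 0) + 1


class DepthTracker:
    """Second listener: records the deepest level of nesting."""

    def __init__(self):
        self.depth = 0
        self.deepest = 0

    def startElement(self, tag, attributes):
        self.depth += 1
        self.deepest = max(self.deepest, self.depth)

    def endElement(self, tag):
        self.depth -= 1


emitter = Emitter()
statistics = Statistics()
tracker = DepthTracker()
# Two slots on the same signal: both hear about every opened element.
emitter.sigStartElement.connect(statistics.countElement)
emitter.sigStartElement.connect(tracker.startElement)
emitter.sigEndElement.connect(tracker.endElement)

document = b"<ui><widget><property/><property/></widget></ui>"
xml.sax.parseString(document, SignallingHandler(emitter))
```

Neither listener knows about the other, and the handler knows about neither. Adding a third listener, such as a tree view, is a matter of one more connect call.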