As noted previously, Version 6 JavaScript browsers seem to be coming together over the W3C DOM. Several key methods and properties in JavaScript can help in getting information from an XML file. In this section, a very simple XML file is used to demonstrate pulling data from XML into an HTML page, using JavaScript to parse (interpret) the XML file. Unfortunately, the examples are limited to IE5+ on Windows. (The same programs that worked fine in IE5+ on Windows failed in IE5+ on the Mac under both OS 9+ and OS X.)
However, the great majority of keywords used in the scripts are W3C DOM-compliant, and the only keywords required from the Microsoft-unique set are XMLDocument and document.all(). All of the other keywords are found in NN6+. Table 15.1 shows the W3C JavaScript keywords used with the XML file examples.
Table 15.1 Selected Element Keywords in JavaScript
Property or Method | Meaning
documentElement | Returns the root element of the document
firstChild | Is the first element within another element (the first child of the current node)
lastChild | Is the last element within another element (the last child of the current node)
nextSibling | Is the next element in the same nested level as the current one
previousSibling | Is the previous element in the same nested level as the current one
nodeValue | Is the value of a node (for a text node, its text)
getElementsByTagName | Method that returns all elements with the given tag name as a list object
Finding Children
To see how to pull data
from an XML file, all examples use the following XML file. The intentional
simplicity of the XML file is to help clarify using JavaScript with XML and
does not represent a sophisticated example of storing data in XML format.
writers.xml
<?xml version="1.0" ?>
<writers>
  <EnglishLanguage>
    <fiction>
      <pen>
        <name>Jane Austin</name>
        <name>Rex Stout</name>
        <name>Dashiell Hammett</name>
      </pen>
    </fiction>
  </EnglishLanguage>
</writers>
The XML file contains a
typical arrangement of data using a level of categories that you might find in
a bookstore or library arrangement. It is meant to be intuitively clear, as is
all XML.
The trick in all of the
following scripts is to understand how to find exactly what you want. The first
three scripts that follow use slightly different functions to find the first
child, last child, and sibling elements. The first script provides the entire
listing, and the second two just show the key JavaScript function within the
script. They all use the following common CSS file.
readXML.css
body {
  font-family: verdana;
  color: #ff4d00;
  font-size: 14pt;
  font-weight: bold;
  background-color: #678395;
}
div { background-color: #c1d4cc; }
#blueBack { background-color: #c1d4cc; }
To read the first child of an element, use that element's firstChild property. Given the simplicity of the sample XML file (writers.xml), the script just keeps adding .firstChild to each of the elements as it makes its way to the place in the XML file where the data can be found.
However, before even
going after the first child of the <name> element, the HTML page sets up a connection to the XML page using
an <xml> container understood by Internet Explorer 5+ in
a Windows context. (At the time of this writing, IE6 was available, and it
worked fine with the following scripts, but only on a Windows PC.) The ID writersXML is defined as the XML object first, and then it
becomes part of a document, myXML, in this line:
myXML = document.all("writersXML").XMLDocument
The document.all().XMLDocument reference is a Microsoft IE extension to JavaScript, not part of the W3C DOM. After this point, though, the JavaScript is pure W3C DOM and is consistent with NN6+.
With this line, writersNode is defined as the root element of the XML file
with the documentElement property:
writersNode = myXML.documentElement
Its first child is the <EnglishLanguage> node, so the variable languageNode is defined as writersNode.firstChild. Then the rest of the nodes in the XML document are defined in the same way until the first child of the <name> node is reached, and its node value is placed into a variable to be displayed in a text window. All of these steps are placed into the findWriter() user function.
readFirstChild.html
<html>
<head>
<link rel="stylesheet" href="readXML.css" type="text/css">
<title>Read First Child</title>
<xml ID="writersXML" SRC="writers.xml"></xml>
<script language="JavaScript">
function findWriter()
{
  // Declare every variable used below, including fictionNode.
  var myXML, writersNode, languageNode, fictionNode;
  var penNode, nameNode, display;
  // IE-only: get the XML document loaded by the <xml> element.
  myXML = document.all("writersXML").XMLDocument;
  // From here on, only W3C DOM properties are used.
  writersNode = myXML.documentElement;
  languageNode = writersNode.firstChild;
  fictionNode = languageNode.firstChild;
  penNode = fictionNode.firstChild;
  nameNode = penNode.firstChild;
  // The text of <name> is the value of its first (text) child.
  display = nameNode.firstChild.nodeValue;
  document.show.me.value = display;
}
</script>
</head>
<body>
<span ID="blueBack">Read firstChild</span>
<div>
<form name="show">
<input type="text" name="me">
<input type="button" value="Display Writer" onClick="findWriter()">
</form>
</div>
</body>
</html>
The first child of <pen> is displayed.
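The chain of .firstChild references works in IE because its XML parser drops the whitespace between tags by default. A W3C DOM parser that keeps that whitespace reports it as text nodes, so firstChild can return a blank text node rather than an element. The following is a minimal sketch of a helper that skips such nodes. The name firstElementOf is hypothetical (it is not part of the DOM), and the small hand-built tree is only a stand-in for the parsed writers.xml so that the sketch can run outside a browser.

```javascript
// Node type constants as defined by the W3C DOM.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper: return the first child that is an element,
// skipping text nodes such as the whitespace between tags.
// Returns null if the node has no element children.
function firstElementOf(node) {
  var child = node.firstChild;
  while (child && child.nodeType !== ELEMENT_NODE) {
    child = child.nextSibling;
  }
  return child;
}

// Stand-in tree (an assumption for illustration): a <pen> element whose
// first child is a whitespace text node, followed by one <name> element.
var text = { nodeType: TEXT_NODE, nodeValue: "Jane Austin", firstChild: null, nextSibling: null };
var nameNode = { nodeType: ELEMENT_NODE, nodeName: "name", firstChild: text, nextSibling: null };
var blank = { nodeType: TEXT_NODE, nodeValue: "\n  ", firstChild: null, nextSibling: nameNode };
var penNode = { nodeType: ELEMENT_NODE, nodeName: "pen", firstChild: blank, nextSibling: null };

var found = firstElementOf(penNode);
console.log(found.firstChild.nodeValue);
```

In a script like findWriter(), each .firstChild step on an element could be replaced by a call to such a helper, leaving the final nameNode.firstChild.nodeValue as it is because that step is meant to reach the text node.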
Reading the last child
uses an almost identical function. However, when the script comes to the parent
element <pen> of the <name> node, it asks for the last child, or simply the one at the end of
the list before the </pen> closing tag.
readLastChild.html (Function Only)
function findWriter()
{
  var myXML, writersNode, languageNode, fictionNode;
  var penNode, nameNode, display;
  myXML = document.all("writersXML").XMLDocument;
  writersNode = myXML.documentElement;
  languageNode = writersNode.firstChild;
  fictionNode = languageNode.firstChild;
  penNode = fictionNode.firstChild;
  nameNode = penNode.lastChild; // Here is the key line
  display = nameNode.firstChild.nodeValue;
  document.show.me.value = display;
}
Because the DOM contains
keywords for the first and last children, finding the beginning and end of an
XML file is pretty simple. What about all of the data in between? To display
the middle children, first you have to find the parent and start looking at the
next or previous sibling until you find what you want. This next function shows
how that is done using the nextSibling property.
readSibling.html (Function Only)
function findWriter()
{
  var myXML, writersNode, languageNode, fictionNode;
  var penNode, nameNode, nextName, display;
  myXML = document.all("writersXML").XMLDocument;
  writersNode = myXML.documentElement;
  languageNode = writersNode.firstChild;
  fictionNode = languageNode.firstChild;
  penNode = fictionNode.firstChild;
  nameNode = penNode.firstChild;
  nextName = nameNode.nextSibling; // Not the first but the next!
  // The first child is the only child in the next node.
  display = nextName.firstChild.nodeValue;
  document.show.me.value = display;
}
The three functions
differ little in what they do or how they do it. However, using this method to
find a single name in a big XML file could take a lot of work. As you might
have surmised, because the XML file is part of an object, you can extract it in
an array-like fashion.
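To make the sibling-walking idea concrete, here is a minimal sketch that starts at a parent's firstChild and follows nextSibling until the list runs out, collecting each child's text along the way. The helper names childTexts and makeName are hypothetical, and the hand-built objects only stand in for the <pen> node of writers.xml so that the sketch can run outside a browser.

```javascript
// Hypothetical helper: collect the text of every child of a parent
// by starting at firstChild and following nextSibling to the end.
function childTexts(parent) {
  var texts = [];
  for (var node = parent.firstChild; node; node = node.nextSibling) {
    // Only children that contain a text node contribute a value.
    if (node.firstChild) {
      texts.push(node.firstChild.nodeValue);
    }
  }
  return texts;
}

// Hypothetical stand-in builder for a <name> element with one text child.
function makeName(value) {
  return { nodeName: "name", firstChild: { nodeValue: value }, nextSibling: null };
}

// Stand-in for the <pen> node and its three <name> children.
var austin = makeName("Jane Austin");
var stout = makeName("Rex Stout");
var hammett = makeName("Dashiell Hammett");
austin.nextSibling = stout;
stout.nextSibling = hammett;
var pen = { nodeName: "pen", firstChild: austin, lastChild: hammett };

console.log(childTexts(pen).join("\n"));
```

The loop needs no count of the children in advance; it simply stops when nextSibling is empty.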
Reading Tag Names
Instead of tracing the XML tree through child and parent nodes, you can use the getElementsByTagName() method. By specifying the tag name that you're seeking, you can put all of the matching elements into a list object and pull them out one at a time using that object's item() method. The process is much easier than going after first and last children or siblings and, I believe, much more effective for setting up matching components. The following script is similar to the others and uses the same external Cascading Style Sheet. The form is slightly different at the bottom, so the whole program is listed rather than just the function.
readNode.html
<html>
<head>
<link rel="stylesheet" href="readXML.css" type="text/css">
<title>Read the whole list</title>
<xml ID="writersXML" SRC="writers.xml"></xml>
<script language="JavaScript">
function findWriters()
{
  var myXML, myNodes;
  var display = "";
  myXML = document.all("writersXML").XMLDocument;
  // Put the <name> elements into an object.
  myNodes = myXML.getElementsByTagName("name");
  // Extract the different values using a loop.
  for (var counter = 0; counter < myNodes.length; counter++) {
    display += myNodes.item(counter).firstChild.nodeValue + "\n";
  }
  document.show.me.value = display;
}
</script>
</head>
<body>
<span ID="blueBack">Read All Data</span>
<div>
<form name="show">
<textarea name="me" cols=30 rows=5></textarea><p>
<input type="button" value="Show all" onClick="findWriters()">
</form></div>
</body>
</html>
All of the data in the specified tag category
are brought to the screen.
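If you would rather work with the names individually than as one string, the same length and item() pair can fill an ordinary JavaScript array. The following is a minimal sketch under that assumption. The helper name namesFrom is hypothetical, and fakeList is only a stand-in for the object that getElementsByTagName("name") returns, so that the sketch can run outside a browser.

```javascript
// Hypothetical helper: copy the text of each node in a node list
// into an ordinary array, using the list's length and item().
function namesFrom(nodeList) {
  var names = [];
  for (var i = 0; i < nodeList.length; i++) {
    names.push(nodeList.item(i).firstChild.nodeValue);
  }
  return names;
}

// Stand-in (an assumption for illustration) for the list object that
// getElementsByTagName("name") would return for writers.xml.
var values = ["Jane Austin", "Rex Stout", "Dashiell Hammett"];
var fakeList = {
  length: values.length,
  item: function (index) {
    return { firstChild: { nodeValue: values[index] } };
  }
};

var writers = namesFrom(fakeList);
console.log(writers[1]);
```

Once the values are in an array, any single name can be reached by its index without walking the tree again.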
At this stage in browser development, the great majority of terms used in extracting data from an XML file are cross-browser compatible, especially when Version 6 of each browser is compared side by side. In large measure, this is because the browser manufacturers are beginning to comply with the W3C DOM recommendations. The Microsoft extensions to the W3C DOM could be adopted as part of the DOM (as some already have been), or the W3C DOM could develop functional equivalents. However, at the time of this writing, there might not actually be a W3C DOM-compliant method for the crucial first step of loading an XML document into an HTML page. So, in the meantime, which I hope is short, it is necessary to use the single-browser, single-platform techniques shown previously.
Well-Formed XML Pages
Strictly speaking, an XML page is well formed when it follows the XML syntax rules; to be valid as well, it requires either a DTD or a schema (exclusively Microsoft). The DTD tells the parser what kind of data is contained in the XML file. If XML pages were parsed only by JavaScript, no one would worry too much about the DTD. However, when a browser parses an XML file, it looks at the DTD to determine what kind of data is in the file and how it is ordered. XML validators scan XML files and determine whether they are valid, but browsers do not validate XML files. (A good validator can be found at Brown University's site.) If an XML file is not valid, problems are likely to crop up.
Validation takes a little extra work, but you will know that your XML file is valid as well as well formed, and it won't run into problems somewhere down the line. A DTD has been added to the example XML file used previously, producing the following file, writersWF.xml.
All document type
definitions begin with this line:
<!DOCTYPE rootName [
Because writers is the root element, it goes in as the root
name. Next, the first child of the root is declared—in this case, the child is <EnglishLanguage>, so the !ELEMENT declaration is as follows:
<!ELEMENT writers (EnglishLanguage)>
You continue with !ELEMENT declarations until all of them are made. If an element can occur more than once within another element's container, a plus sign (+), meaning one or more, is added to the end of the element name in the container's declaration. Because three nodes using <name> are within the <pen> element, the !ELEMENT declaration for <pen> lists name with a plus after it:
<!ELEMENT pen (name+)>
Finally, close up the !DOCTYPE declaration using this code:
]>
Your file is ready for
validation. The complete listing follows.
writersWF.xml
<?xml version="1.0" ?>
<!DOCTYPE writers [
<!ELEMENT writers (EnglishLanguage)>
<!ELEMENT EnglishLanguage (fiction)>
<!ELEMENT fiction (pen)>
<!ELEMENT pen (name+)>
<!ELEMENT name (#PCDATA)>
]>
<writers>
  <EnglishLanguage>
    <fiction>
      <pen>
        <name>Jane Austin</name>
        <name>Rex Stout</name>
        <name>Dashiell Hammett</name>
      </pen>
    </fiction>
  </EnglishLanguage>
</writers>
Will this new file with its DTD work with the example scripts provided previously? You bet! In all of the previous files showing how JavaScript parses XML files, substitute writersWF.xml for the original writers.xml in this line:
<xml ID="writersXML" SRC="writers.xml"></xml>
When you re-run the script in IE5+ on your Windows PC, you will see exactly the same results. The only difference is that now your XML file carries a DTD and can be validated.