Sunday

Using PHP 4's DOM XML functions to create XML files from SQL data

Intended Audience

This tutorial is intended for developers who wish to extract data from a database and insert it into XML files so that it can be processed in some way, usually by transforming it into HTML using an XSL file.

This method completely splits the presentation layer (i.e. the generation of HTML documents) from the business layer (the application of business rules using a language such as PHP) so that any one of these layers can be modified without affecting the other.

In the samples below the code is generic in that no column names are ever hard coded. Data is retrieved from the database as an associative array (a series of 'name=value' pairs), and every element of the array is extracted and transferred to the XML file. The contents of the associative array are therefore governed entirely by the SQL 'select' statement.

In the samples below I will show how to deal with data from a single table, and then data from two tables with a one-to-many relationship. Finally I will show how to insert additional data in the form of XML attributes.

Prerequisites

The sample code requires that you have the DOM XML extension available in your PHP installation. It is also assumed that you know what an XML file is and what it can be used for.

Please note that this extension has been removed from PHP 5 and moved to the PECL repository. For details on how to use the DOM extension instead please refer to Using PHP 5's DOM functions to create XML files from SQL data

Multiple occurrences of a single table

The following code will take the contents of $dbresult (any number of rows, each of which contains a series of name=value pairs) and write it to a variable as an XML string. This can subsequently be written to a disk file or transformed into an HTML document using the Sablotron XSLT processor which is built into PHP. For details on how to use this extension please refer to Using PHP 4's Sablotron extension to perform XSL Transformations

This first piece of code simply connects to the database and performs a query:

$dbconnect = mysql_connect('localhost', 'user', 'pass')) {

echo "Connection failed to the host 'localhost'.";
exit;
} // if
if (!mysql_select_db('test')) {
echo "Cannot connect to database 'test'";
exit;
} // if

$table_id = 'some_table';
$query = "SELECT * FROM $table_id";
$dbresult = mysql_query($query, $dbconnect);

Now that we have our data we transfer it to an XML document. We start by creating a new DOM document. The following command will set the XML version number to '1.0' and return the object reference for the new document:

// create a new XML document

$doc = domxml_new_doc('1.0');

The first element we create in the XML document is known as the root element. Each XML document must have 1, and only 1, root element. In this example I have called it 'root', but you can use whatever name you like (such as the name of the PHP script which is executing). Note that you have to create the element and insert it into the document with two functions.

// create root node

$root = $doc->create_element('root');
$root = $doc->append_child($root);

Now we are ready to start adding the data we have retrieved from the database. Note that I am returning each row as an associative array which provides me with a list of 'name=value' pairs. This makes all subsequent processing far easier.

// process one row at a time

while($row = mysql_fetch_assoc($dbresult)) {

The first task I must perform for each row is to add a new element to the XML document. Here I create a new element using the table name, then I insert it into the document as a child of the root element.

  // add node for each row

$occ = $doc->create_element($table_id);
$occ = $root->append_child($occ);

Now I loop through each column in the current row, and insert the fieldname and corresponding value. You will see how having an associative array makes life easy. I need not concern myself with how many columns have been returned from the database query, nor with the order in which they are presented, as every column in the array gets written out.

  // add a child node for each field

foreach ($row as $fieldname => $fieldvalue) {

Note that here I create a new element for the field and then insert it as a child to the current database row, as identified in $occ.

    $child = $doc->create_element($fieldname);

$child = $occ->append_child($child);

Now I must add the field value as a text node, then insert it as a child element to the current field node, as identified in $child.

    $value = $doc->create_text_node($fieldvalue);

$value = $child->append_child($value);

These loops do not terminate until they have processed every column of every row which has been retrieved from the database.

  } // foreach

} // while

This next function returns the completed XML document as a string.

// get completed xml document

$xml_string = $doc->dump_mem(true);

Here I am simply outputting the results to the client browser, but I could just as easily perform some additional processing such as passing it to an XSLT processor for transformation into another document, such as an HTML document, using the contents of a separate XSL file.

echo $xml_string;

?>

The contents of the XML file produced with this code will look something like the following, starting with the XML declaration, with the version number, and immediately followed by the root node.

"1.0"?>


For each database row there will be an element, as a child to the root node, which contains the table name Each row element will have a separate child element for each column within that row. Note that each column element contains a text node for its value, while the row element does not have a text node. This grouping will be repeated for each column within each row. After the last column, notice the closing tag for the current row element, after which the row/column group must be repeated for each additional row that was extracted from the database and transferred to the XML file.

  

value1
value2
............
valueX


............

The last line in an XML file is there to close the root node.


Note that each element within the XML document has an opening and a closing tag in the format .... This identifies the node name within the document tree. Everything between these two tags is a child node to that element. This child node may be a text node or another element.

You may sometimes see an element in an XML document shown as . This signifies that the element is empty. When an element is empty, XML allows the opening and closing tags to be merged into a single self-closing tag.

A One-to-Many relationship

In the following example the XML string will contain data from two tables arranged in a One-to-Many (or parent-to-child or outer-to-inner) relationship. In the following code two query results are produced: $resouter for the parent table and $resinner for the child table. I shall only comment this code where there are differences.

$dbconnect = mysql_connect('localhost', 'user', 'pass')) {

echo "Connection failed to the host 'localhost'.";
exit;
} // if
if (!mysql_select_db('test')) {
echo "Cannot connect to database 'test'";
exit;
} // if

Here, for example, are separate database queries for each of the two tables:

$outer_table = 'parent_table';

$query = "SELECT * FROM $outer_table WHERE column='value'";
$resouter = mysql_query($query, $dbconnect);

$inner_table = 'child_table';
$query = "SELECT * FROM $inner_table WHERE column='value'";
$resinner = mysql_query($query, $dbconnect);

Here we create a new DOM document and add the root node:

// create a new XML document

$doc = domxml_new_doc('1.0');

// add root node
$root = $doc->create_element('root');
$root = $doc->append_child($root);

Here we add a node for the single row obtained from the parent table:

// add node for parent/outer table

$outer = $doc->create_element($outer_table);
$outer = $root->append_child($outer);

We must not forget to add each column value as a child element to the $outer node.

// take only one row from parent/outer table

$row = mysql_fetch_assoc($resouter);

// add a child node for each parent field
foreach ($row as $fieldname => $fieldvalue) {
$child = $doc->create_element($fieldname);
$child = $outer->append_child($child);
$value = $doc->create_text_node($fieldvalue);
$value = $child->append_child($value);
} // foreach

Here we add a node for each row obtained from the child table. Note that each of these rows is inserted as a child node to the $outer node, not the $root node. Each $inner node will have its column values inserted as its children.

// process all rows of the inner/many/child table

while($row = mysql_fetch_assoc($resinner)) {
// add node for each record
$inner = $doc->create_element($inner_table);
$inner = $outer->append_child($inner);
// add a child node for each field
foreach ($row as $fieldname => $fieldvalue) {
$child = $doc->create_element($fieldname);
$child = $inner->append_child($child);
$value = $doc->create_text_node($fieldvalue);
$value = $child->append_child($value);
} // foreach
} // while

Finally, get the completed XML document and send it to the client's browser.

// get completed xml document

$xml_string = $doc->dump_mem(true);
echo $xml_string;
?>

The above code will produce an XML file with the following structure:

"1.0"?>



value1
value2
............
valueX

value1
value2
............
valueX


............



This has the structure to to . The has child nodes which are its column values as well as multiple occurrences of .

Adding optional attributes

It may sometimes be necessary to include additional information for an element with the XML data, and this can be done in the form of attributes. An attribute has a name and a value, and any number of attributes can be added to an element. This must be done by using the '->set_attribute' method immediately after the '->append_child' method and before any '->create_text_node' method, as shown in the following code snippet:

$child = $doc->create_element($fieldname);

$child = $outer->append_child($child);
$child->set_attribute('attr1', 'attrval1');
$child->set_attribute('attr2', 'attrval2');
$value = $doc->create_text_node($fieldvalue);
$value = $child->append_child($value);

These attribute values will then appear within the element's start tag, as follows:

"1.0"?>



"attrval1" attr2="attrval2">value1
"attrval1" attr2="attrval2">value2
"attrval1" attr2="attrval2">value3


Note that you can insert attributes for row elements as well as column elements.

In my own application I use attribute values to specify the size of each column, so that it does not have to be hard-coded within the XSL file. For multi-line columns I pass values for both 'rows' and 'cols'.

I also use attributes to include any error messages. All error messages get inserted to an array called $errors where the key is the fieldname and the value is the message. The code to insert the error message into the XML document as an attribute of the field which generated the error is as simple as this:

if (isset($errors[$fieldname])) {

$child->set_attribute("error", $errors[$fieldname]);
} // if

Using Multi-Byte Characters

I had a slight problem recently when the data I output to my XML file contained characters with accents (as in à, è, í, ö and û). These were coming out all garbled until I discovered the reason why. It turns out that the internal encoding for libxml when storing the document is UTF-8, so you need to convert non UTF-8 encoded strings into UTF-8 when setting content. This requires the following changes to the code samples:

(1) Convert from default character set (refer to default_charset in file php.ini) to UTF-8 by inserting a single line as follows:

  foreach ($row as $fieldname => $fieldvalue) {

$child = $doc->create_element($fieldname);
$child = $inner->append_child($child);
$value = mb_convert_encoding($value,'UTF-8','ISO-8859-1'); <<-- new line $value = $doc->create_text_node($fieldvalue);
$value = $child->append_child($value);
} // foreach

(2) To output the document correctly you must convert back to the default character set by amending the following line:

$xml_string = $doc->dump_mem(true, 'ISO-8859-1');

NOTE: In order for this to work you must enable the Multi-Byte String functions in PHP.

Conclusion

By using this method I have been able to develop a generic mechanism for creating XML files based on the relationship of the database tables concerned. All I need do is specify the table names(s) and the selection criteria, and whatever comes out of the database will be transferred to an XML file for subsequent transformation into HTML using an XSL file.
By Tony Marston

No comments:

Financial Engineering