The XMLReader class

Introduction

The XMLReader extension is an XML Pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way.
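
As a rough illustration of the cursor model described above, the following sketch walks a small document and records each node the reader stops on. The sample XML string and the `$seen` variable are assumptions made for this example only.

```php
<?php
// Minimal sketch: move the cursor forward node by node.
$reader = new XMLReader();
$reader->XML('<root><a>one</a><b/></root>'); // hypothetical sample input

$seen = [];
while ($reader->read()) {
    // nodeType is one of the XMLReader node type constants,
    // name is the qualified node name ("#text" for text nodes).
    $seen[] = $reader->nodeType . ':' . $reader->name;
}
$reader->close();

print_r($seen);
```

Note that the empty element `<b/>` is reported once as an ELEMENT and has no matching END_ELEMENT stop.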

Class synopsis

XMLReader {
/* Constants */
const int XMLReader::NONE = 0 ;
const int XMLReader::ELEMENT = 1 ;
const int XMLReader::ATTRIBUTE = 2 ;
const int XMLReader::TEXT = 3 ;
const int XMLReader::CDATA = 4 ;
const int XMLReader::ENTITY_REF = 5 ;
const int XMLReader::ENTITY = 6 ;
const int XMLReader::PI = 7 ;
const int XMLReader::COMMENT = 8 ;
const int XMLReader::DOC = 9 ;
const int XMLReader::DOC_TYPE = 10 ;
const int XMLReader::DOC_FRAGMENT = 11 ;
const int XMLReader::NOTATION = 12 ;
const int XMLReader::WHITESPACE = 13 ;
const int XMLReader::SIGNIFICANT_WHITESPACE = 14 ;
const int XMLReader::END_ELEMENT = 15 ;
const int XMLReader::END_ENTITY = 16 ;
const int XMLReader::XML_DECLARATION = 17 ;
const int XMLReader::LOADDTD = 1 ;
const int XMLReader::DEFAULTATTRS = 2 ;
const int XMLReader::VALIDATE = 3 ;
const int XMLReader::SUBST_ENTITIES = 4 ;
/* Properties */
public readonly int $attributeCount ;
public readonly string $baseURI ;
public readonly int $depth ;
public readonly bool $hasAttributes ;
public readonly bool $hasValue ;
public readonly bool $isDefault ;
public readonly bool $isEmptyElement ;
public readonly string $localName ;
public readonly string $name ;
public readonly string $namespaceURI ;
public readonly int $nodeType ;
public readonly string $prefix ;
public readonly string $value ;
public readonly string $xmlLang ;
/* Methods */
bool close ( void )
DOMNode expand ( void )
string getAttribute ( string $name )
string getAttributeNo ( int $index )
string getAttributeNs ( string $localName , string $namespaceURI )
bool getParserProperty ( int $property )
bool isValid ( void )
string lookupNamespace ( string $prefix )
bool moveToAttribute ( string $name )
bool moveToAttributeNo ( int $index )
bool moveToAttributeNs ( string $localName , string $namespaceURI )
bool moveToElement ( void )
bool moveToFirstAttribute ( void )
bool moveToNextAttribute ( void )
bool next ([ string $localname ] )
bool open ( string $URI [, string $encoding [, int $options = 0 ]] )
bool read ( void )
string readInnerXML ( void )
string readOuterXML ( void )
string readString ( void )
bool setParserProperty ( int $property , bool $value )
bool setRelaxNGSchema ( string $filename )
bool setRelaxNGSchemaSource ( string $source )
bool setSchema ( string $filename )
bool xml ( string $source [, string $encoding [, int $options = 0 ]] )
}

Properties

attributeCount

The number of attributes on the node

baseURI

The base URI of the node

depth

Depth of the node in the tree, starting at 0

hasAttributes

Indicates if node has attributes

hasValue

Indicates if node has a text value

isDefault

Indicates if attribute is defaulted from DTD

isEmptyElement

Indicates if node is an empty element tag

localName

The local name of the node

name

The qualified name of the node

namespaceURI

The URI of the namespace associated with the node

nodeType

The node type for the node

prefix

The prefix of the namespace associated with the node

value

The text value of the node

xmlLang

The xml:lang scope within which the node resides

Predefined Constants

XMLReader Node Types

XMLReader::NONE

No node type

XMLReader::ELEMENT

Start element

XMLReader::ATTRIBUTE

Attribute node

XMLReader::TEXT

Text node

XMLReader::CDATA

CDATA node

XMLReader::ENTITY_REF

Entity Reference node

XMLReader::ENTITY

Entity Declaration node

XMLReader::PI

Processing Instruction node

XMLReader::COMMENT

Comment node

XMLReader::DOC

Document node

XMLReader::DOC_TYPE

Document Type node

XMLReader::DOC_FRAGMENT

Document Fragment node

XMLReader::NOTATION

Notation node

XMLReader::WHITESPACE

Whitespace node

XMLReader::SIGNIFICANT_WHITESPACE

Significant Whitespace node

XMLReader::END_ELEMENT

End Element

XMLReader::END_ENTITY

End Entity

XMLReader::XML_DECLARATION

XML Declaration node

XMLReader Parser Options

XMLReader::LOADDTD

Load DTD but do not validate

XMLReader::DEFAULTATTRS

Load DTD and default attributes but do not validate

XMLReader::VALIDATE

Load DTD and validate while parsing

XMLReader::SUBST_ENTITIES

Substitute entities and expand references
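
A minimal sketch of how a parser option might be switched on and checked, assuming a small inline document with an internal DTD. The sample XML and the variable names are hypothetical; the property is set after XML() and before the first read().

```php
<?php
// Collect libxml messages instead of printing them (a choice made for this sketch).
libxml_use_internal_errors(true);

$xml = '<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>hi</note>'; // hypothetical input

$reader = new XMLReader();
$reader->XML($xml);

// Ask the parser to load the DTD and validate while parsing.
$reader->setParserProperty(XMLReader::VALIDATE, true);
$validateOn = $reader->getParserProperty(XMLReader::VALIDATE);

// Track validity while the cursor moves through the document.
$valid = true;
while ($reader->read()) {
    $valid = $valid && $reader->isValid();
}
$reader->close();

var_dump($validateOn, $valid);
```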

Table of Contents

    XMLReader::expand - Returns a copy of the current node as a DOM object
    XMLReader::getAttributeNo - Get the value of an attribute by index
    XMLReader::getParserProperty - Indicates if the specified property has been set
    XMLReader::lookupNamespace - Lookup namespace for a prefix
    XMLReader::moveToAttributeNo - Move cursor to an attribute by index
    XMLReader::moveToElement - Position cursor on the parent element of the current attribute
    XMLReader::moveToNextAttribute - Position cursor on the next attribute
    XMLReader::open - Set the URI containing the XML to parse
    XMLReader::readInnerXML - Retrieve XML from the current node
    XMLReader::readString - Reads the contents of the current node as a string
    XMLReader::setRelaxNGSchema - Set the filename or URI for a RelaxNG Schema
    XMLReader::setSchema - Validate the document against an XSD

    User Contributed Notes 20 notes

japos dot trash at googlemail dot com
<?php
$xml = new XMLReader();
$xml->XML('<tag attr="value" />');
$xml->read();
var_dump($xml->isEmptyElement);
$xml->moveToNextAttribute();
var_dump($xml->isEmptyElement);
?>

will output

bool(true)
bool(false)

So be sure to store $isEmptyElement before moving the cursor.
godseth at o2 dot pl
.$sXmlFilePath;
        }
    }

   
    /**
     * XML Parser
     *
     * @param XMLReader $oXml
     * @return array
     */
    protected function parseXml( XMLReader $oXml ) {

        $aAssocXML = null;
        $iDc = -1;

        while($oXml->read()){
            switch ($oXml->nodeType) {

                case XMLReader::END_ELEMENT:

                    if ($this->bOptimize) {
                        $this->optXml($aAssocXML);
                    }
                    return $aAssocXML;

                case XMLReader::ELEMENT:

                    if(!isset($aAssocXML[$oXml->name])) {
                        if($oXml->hasAttributes) {
                            $aAssocXML[$oXml->name][] = $oXml->isEmptyElement ? '' : $this->parseXML($oXml);
                        } else {
                            if($oXml->isEmptyElement) {
                                $aAssocXML[$oXml->name] = '';
                            } else {
                                $aAssocXML[$oXml->name] = $this->parseXML($oXml);
                            }
                        }
                    } elseif (is_array($aAssocXML[$oXml->name])) {
                        if (!isset($aAssocXML[$oXml->name][0]))
                        {
                            $temp = $aAssocXML[$oXml->name];
                            foreach ($temp as $sKey=>$sValue)
                                unset($aAssocXML[$oXml->name][$sKey]);
                            $aAssocXML[$oXml->name][] = $temp;
                        }

                        if($oXml->hasAttributes) {
                            $aAssocXML[$oXml->name][] = $oXml->isEmptyElement ? '' : $this->parseXML($oXml);
                        } else {
                            if($oXml->isEmptyElement) {
                                $aAssocXML[$oXml->name][] = '';
                            } else {
                                $aAssocXML[$oXml->name][] = $this->parseXML($oXml);
                            }
                        }
                    } else {
                        $mOldVar = $aAssocXML[$oXml->name];
                        $aAssocXML[$oXml->name] = array($mOldVar);
                        if($oXml->hasAttributes) {
                            $aAssocXML[$oXml->name][] = $oXml->isEmptyElement ? '' : $this->parseXML($oXml);
                        } else {
                            if($oXml->isEmptyElement) {
                                $aAssocXML[$oXml->name][] = '';
                            } else {
                                $aAssocXML[$oXml->name][] = $this->parseXML($oXml);
                            }
                        }
                    }

                    if($oXml->hasAttributes) {
                        $mElement =& $aAssocXML[$oXml->name][count($aAssocXML[$oXml->name]) - 1];
                        while($oXml->moveToNextAttribute()) {
                            $mElement[$oXml->name] = $oXml->value;
                        }
                    }
                    break;
                case XMLReader::TEXT:
                case XMLReader::CDATA:

                    $aAssocXML[++$iDc] = $oXml->value;

            }
        }

        return $aAssocXML;
    }

   
    /**
     * Method to optimize assoc tree.
     * ( Deleting 0 index when element
     *  have one attribute / value )
     *
     * @param array $mData
     */
    public function optXml(&$mData) {
        if (is_array($mData)) {
            if (isset($mData[0]) && count($mData) == 1 ) {
                $mData = $mData[0];
                if (is_array($mData)) {
                    foreach ($mData as &$aSub) {
                        $this->optXml($aSub);
                    }
                }
            } else {
                foreach ($mData as &$aSub) {
                    $this->optXml($aSub);
                }
            }
        }
    }

}

?>

[EDIT BY danbrown AT php DOT net:  Fixes were also provided by "Alex" and (qdog AT qview DOT org) in user notes on this page (since removed).]
dkrnl at ya.net dot ru
https://github.com/dkrnl/SimpleXMLReader

Usage example: http://github.com/dkrnl/SimpleXMLReader/blob/master/examples/example1.php

<?php

/**
* Simple XML Reader
*
* @license Public Domain
* @author Dmitry Pyatkov(aka dkrnl) <dkrnl@ya.net.ru>
* @url http://github.com/dkrnl/SimpleXMLReader
*/
class SimpleXMLReader extends XMLReader
{

   
    /**
     * Callbacks
     *
     * @var array
     */
    protected $callback = array();

    /**
     * Add node callback
     *
     * @param  string   $name
     * @param  callback $callback
     * @param  integer  $nodeType
     * @return SimpleXMLReader
     */
    public function registerCallback($name, $callback, $nodeType = XMLReader::ELEMENT)
    {
        if (isset($this->callback[$nodeType][$name])) {
            throw new Exception("Already exists callback $name($nodeType).");
        }
        if (!is_callable($callback)) {
            // The original message here repeated "Already exists", which did not match the check.
            throw new Exception("Callback $name($nodeType) is not callable.");
        }
        $this->callback[$nodeType][$name] = $callback;
        return $this;
    }

   
    /**
     * Remove node callback
     *
     * @param  string  $name
     * @param  integer $nodeType
     * @return SimpleXMLReader
     */
    public function unRegisterCallback($name, $nodeType = XMLReader::ELEMENT)
    {
        if (!isset($this->callback[$nodeType][$name])) {
            throw new Exception("Unknown parser callback $name($nodeType).");
        }
        unset($this->callback[$nodeType][$name]);
        return $this;
    }

   
    /**
     * Run parser
     *
     * @return void
     */
    public function parse()
    {
        if (empty($this->callback)) {
            throw new Exception("Empty parser callback.");
        }
        $continue = true;
        while ($continue && $this->read()) {
            if (isset($this->callback[$this->nodeType][$this->name])) {
                $continue = call_user_func($this->callback[$this->nodeType][$this->name], $this);
            }
        }
    }

   
    /**
     * Run XPath query on current node
     *
     * @param  string $path
     * @param  string $version
     * @param  string $encoding
     * @return array(SimpleXMLElement)
     */
    public function expandXpath($path, $version = "1.0", $encoding = "UTF-8")
    {
        return $this->expandSimpleXml($version, $encoding)->xpath($path);
    }

    /**
     * Expand current node to string
     *
     * @param  string $version
     * @param  string $encoding
     * @return string
     */
    public function expandString($version = "1.0", $encoding = "UTF-8")
    {
        return $this->expandSimpleXml($version, $encoding)->asXML();
    }

    /**
     * Expand current node to SimpleXMLElement
     *
     * @param  string $version
     * @param  string $encoding
     * @param  string $className
     * @return SimpleXMLElement
     */
    public function expandSimpleXml($version = "1.0", $encoding = "UTF-8", $className = null)
    {
        $element = $this->expand();
        $document = new DomDocument($version, $encoding);
        $node = $document->importNode($element, true);
        $document->appendChild($node);
        return simplexml_import_dom($node, $className);
    }

   
    /**
     * Expand current node to DomDocument
     *
     * @param  string $version
     * @param  string $encoding
     * @return DomDocument
     */
    public function expandDomDocument($version = "1.0", $encoding = "UTF-8")
    {
        $element = $this->expand();
        $document = new DomDocument($version, $encoding);
        $node = $document->importNode($element, true);
        $document->appendChild($node);
        return $document;
    }

}
?>
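
The expand-to-SimpleXML idea used in the class above can also be sketched with a plain XMLReader. This is only an illustration under assumptions: the sample document, the `item` and `name` element names, and the `$names` variable are all hypothetical. It uses expand() to copy the current element into a DOM node and next() to skip to the following sibling without walking the subtree.

```php
<?php
// Hypothetical sample input with repeated <item> records.
$xml = '<items><item id="1"><name>A</name></item><item id="2"><name>B</name></item></items>';

$reader = new XMLReader();
$reader->XML($xml);

// Advance the cursor to the first <item> element.
while ($reader->read() && $reader->name !== 'item') {
    // keep reading
}

$names = [];
while ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
    // Copy the current element (and its subtree) into its own small DOM document.
    $doc  = new DOMDocument('1.0', 'UTF-8');
    $node = $doc->importNode($reader->expand(), true);
    $doc->appendChild($node);

    // Work with the record through SimpleXML.
    $item    = simplexml_import_dom($node);
    $names[] = (string) $item->name;

    // Jump to the next <item>, skipping the subtree just processed.
    if (!$reader->next('item')) {
        break;
    }
}
$reader->close();

print_r($names);
```

Only one record is held as a DOM tree at a time, which is the usual reason for combining the streaming reader with expand().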
kula_shakerz
You should save the value of $isEmptyElement before processing attributes, or call moveToElement to make $isEmptyElement valid after processing attributes.

$isEmptyElement returns FALSE when XMLReader is positioned on an attribute node, even if attribute's parent element is empty.
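
A short sketch of both approaches mentioned in this note, assuming a hypothetical one-element document. The variable names are illustrative only.

```php
<?php
$reader = new XMLReader();
$reader->XML('<tag first="1" second="2" />'); // hypothetical sample input
$reader->read(); // the cursor is now on the <tag> element

// Approach 1: remember the flag before touching the attributes.
$wasEmpty = $reader->isEmptyElement;

$attributes = [];
while ($reader->moveToNextAttribute()) {
    $attributes[$reader->name] = $reader->value;
}

// While the cursor sits on an attribute, the flag reads false.
$onAttribute = $reader->isEmptyElement;

// Approach 2: move back to the owning element so the flag is meaningful again.
$reader->moveToElement();
$afterMoveBack = $reader->isEmptyElement;

$reader->close();

var_dump($wasEmpty, $onAttribute, $afterMoveBack);
```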
jart (at) mail.ru
. $name . "<br>");

    while($xml->read())
    {
        if($xml->nodeType == XMLReader::END_ELEMENT)
        {
            print "</ul>";
            return $tree;
        }

        else if($xml->nodeType == XMLReader::ELEMENT)
        {
            $node = array();

            print("Adding " . $xml->name ."<br>");

            $node['tag'] = $xml->name;

            // Store the flag now: it reads false once the cursor is on an attribute.
            $isEmpty = $xml->isEmptyElement;

            if($xml->hasAttributes)
            {
                $attributes = array();
                while($xml->moveToNextAttribute())
                {
                    print("Adding attr " . $xml->name ." = " . $xml->value . "<br>");
                    $attributes[$xml->name] = $xml->value;
                }
                $node['attr'] = $attributes;
            }

            if(!$isEmpty)
            {
                $childs = xml2assoc($xml, $node['tag']);
                $node['childs'] = $childs;
            }

            print($node['tag'] . " added <br>");
            $tree[] = $node;
        }

        else if($xml->nodeType == XMLReader::TEXT)
        {
            $node = array();
            $node['text'] = $xml->value;
            $tree[] = $node;
            print "text added = " . $node['text'] . "<br>";
        }
    }

    print "returning " . count($tree) . " childs<br>";
    print "</ul>";

    return $tree;
}

echo "<PRE>";

$xml = new XMLReader();
$xml->open('test.xml');
$assoc = xml2assoc($xml, "root");
$xml->close();

print_r($assoc);
echo "</PRE>";

?>

It reads this xml:

<test>
    <hallo volume="loud"> me <br/> lala </hallo>
    <hallo> me </hallo>
</test>
desk_ocean at msn dot com
<?php
function xml2assoc(&$xml){
    $assoc = NULL;
    $n = 0;
    while($xml->read()){
        if($xml->nodeType == XMLReader::END_ELEMENT) break;
        if($xml->nodeType == XMLReader::ELEMENT and !$xml->isEmptyElement){
            $assoc[$n]['name'] = $xml->name;
            if($xml->hasAttributes) while($xml->moveToNextAttribute()) $assoc[$n]['atr'][$xml->name] = $xml->value;
            $assoc[$n]['val'] = xml2assoc($xml);
            $n++;
        }
        else if($xml->isEmptyElement){
            $assoc[$n]['name'] = $xml->name;
            if($xml->hasAttributes) while($xml->moveToNextAttribute()) $assoc[$n]['atr'][$xml->name] = $xml->value;
            $assoc[$n]['val'] = "";
            $n++;
        }
        else if($xml->nodeType == XMLReader::TEXT) $assoc = $xml->value;
    }
    return $assoc;
}
?>

This version adds an else if($xml->isEmptyElement) branch, because some XML documents contain empty elements.
itari
.$node->depth;
      print '-'.$seq++;
      print '  '.$path.'/'.($node->nodeType==3?'text() = ':$node->name);
      print $node->value;
      if ($node->hasAttributes) {
        print ' [hasAttributes: ';
        while ($node->moveToNextAttribute()) print '@'.$node->name.' = '.$node->value.' ';
        print ']';
        }
      if ($node->nodeType == 1) {
        $oldpath=$path;
        $path.='/'.$node->name;
        }
      parseXML($node,$seq,$path);
      }
    else parseXML($node,$seq,$oldpath);
}

$source = "<tag1>this<tag2 id='4' name='foo'>is</tag2>a<tag2 id='5'>common</tag2>record</tag1>";
$xml = new XMLReader();
$xml->XML($source);
print htmlspecialchars($source).'<br/>';
parseXML($xml,0,'');
?>

Output:

<tag1>this<tag2 id='4' name='foo'>is</tag2>a<tag2 id='5'>common</tag2>record</tag1>

0-0 /tag1
1-1 /tag1/text() = this
1-2 /tag1/tag2 [hasAttributes: @id = 4 @name = foo ]
2-3 /tag1/text() = is
1-4 /text() = a
1-5 /tag2 [hasAttributes: @id = 5 ]
2-6 /text() = common
1-7 /text() = record
Sergey Aikinkulov
=> $xml->isEmptyElement ? '' : xml2assoc($xml));
            if($xml->hasAttributes){
              $el =& $assoc[$xml->name][count($assoc[$xml->name]) - 1];
              while($xml->moveToNextAttribute()) $el['attributes'][$xml->name] = $xml->value;
            }
            break;
          case XMLReader::TEXT:
          case XMLReader::CDATA: $assoc .= $xml->value;
        }
      }
      return $assoc;
    }
?>
lee8oi at gmail dot com
Unfortunately, SimpleXML and the XML DOM cannot process all XML strings. Some have error boxes appended to them (such as the Battlefield Heroes syndicated news). These boxes cause an end-of-file error that terminates the script. XMLReader reads data from these strings without error.
eef dot vreeland at gmail dot com
About (non-)self-closing tags:

A) <tag></tag>
    $xmlRdr->isEmptyElement => false
    $xmlRdr->hasValue       => false
    $xmlRdr->value          => ''
    $xmlRdr->hasAttributes  => false

B) <tag />
    $xmlRdr->isEmptyElement => true
    $xmlRdr->hasValue       => false
    $xmlRdr->value          => ''
    $xmlRdr->hasAttributes  => false

C) <tag attribute="value"></tag>
    $xmlRdr->isEmptyElement => false
    $xmlRdr->hasValue       => false
    $xmlRdr->value          => ''
    $xmlRdr->hasAttributes  => true

D) <tag attribute="value" />
    $xmlRdr->isEmptyElement => true
    $xmlRdr->hasValue       => false
    $xmlRdr->value          => ''
    $xmlRdr->hasAttributes  => true

... and always use the '===' operator when testing properties
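
The four cases above can be reproduced with a short loop. This is a sketch only; the `$variants` and `$rows` names are assumptions for illustration, and each row holds isEmptyElement, hasValue, value and hasAttributes in that order.

```php
<?php
// The four tag variants listed in the note above.
$variants = [
    '<tag></tag>',
    '<tag />',
    '<tag attribute="value"></tag>',
    '<tag attribute="value" />',
];

$rows = [];
foreach ($variants as $xml) {
    $xmlRdr = new XMLReader();
    $xmlRdr->XML($xml);
    $xmlRdr->read(); // position the cursor on the <tag> element

    $rows[$xml] = [
        $xmlRdr->isEmptyElement,
        $xmlRdr->hasValue,
        $xmlRdr->value,
        $xmlRdr->hasAttributes,
    ];
    $xmlRdr->close();
}

var_dump($rows);
```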
boukeversteegh at gmail dot com
<?php
function xml2assoc($xml) {
    $tree = null;
    while($xml->read())
        switch ($xml->nodeType) {
            case XMLReader::END_ELEMENT: return $tree;
            case XMLReader::ELEMENT:
                $node = array('tag' => $xml->name, 'value' => $xml->isEmptyElement ? '' : xml2assoc($xml));
                if($xml->hasAttributes)
                    while($xml->moveToNextAttribute())
                        $node['attributes'][$xml->name] = $xml->value;
                $tree[] = $node;
            break;
            case XMLReader::TEXT:
            case XMLReader::CDATA:
                $tree .= $xml->value;
        }
    return $tree;
}

?>

Usage:

myxml.xml:
------
<PERSON>
    <NAME>John</NAME>
    <PHONE type="home">555-555-555</PHONE>
</PERSON>
----

<?php
    $xml = new XMLReader();
    $xml->open('myxml.xml');
    $assoc = xml2assoc($xml);
    $xml->close();
    print_r($assoc);
?>

Outputs:
Array
(
    [0] => Array
        (
            [tag] => PERSON
            [value] => Array
                (
                    [0] => Array
                        (
                            [tag] => NAME
                            [value] => John
                        )

                    [1] => Array
                        (
                            [tag] => PHONE
                            [value] => 555-555-555
                            [attributes] => Array
                                (
                                    [type] => home
                                )

                        )

                )

        )

)

Because of the way the recursion works, it returns an array with the ROOT XML node as the first child node, rather than returning only the ROOT node.
jnettles at inccrra dot org
<?php
$foo = new XMLReader();
$foo->xml($STRING);
?>

.... where $STRING holds your XML. You cannot pass it like $foo = $STRING or $foo->xml = $STRING.
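
A slightly fuller sketch of the same idea, assuming a hypothetical one-element document in $STRING. After loading the string, the reader still has to be advanced with read() before any node can be inspected.

```php
<?php
$STRING = '<greeting lang="en">hello</greeting>'; // hypothetical sample input

$foo = new XMLReader();
$foo->xml($STRING);   // load the document from a string
$foo->read();         // move the cursor onto the <greeting> element

$lang = $foo->getAttribute('lang'); // attribute value by name
$text = $foo->readString();         // text content of the current node

$foo->close();

var_dump($lang, $text);
```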