The DOMXPath class

(PHP 5, PHP 7)

Introduction

Supports XPath 1.0

Class synopsis

DOMXPath {

/* Properties */
public DOMDocument $document ;

/* Methods */
public __construct ( DOMDocument $doc )
public evaluate ( string $expression [, DOMNode $contextnode [, bool $registerNodeNS = TRUE ]] ) : mixed
public query ( string $expression [, DOMNode $contextnode [, bool $registerNodeNS = TRUE ]] ) : DOMNodeList
public registerNamespace ( string $prefix , string $namespaceURI ) : bool
public registerPhpFunctions ([ mixed $restrict ] ) : void

}

Properties

document

Table of Contents

DOMXPath::__construct — Creates a new DOMXPath object
DOMXPath::evaluate — Evaluates the given XPath expression and returns a typed result if possible
DOMXPath::query — Evaluates the given XPath expression
DOMXPath::registerNamespace — Registers the namespace with the DOMXPath object
DOMXPath::registerPhpFunctions — Register PHP functions as XPath functions
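
The difference between query() and evaluate() summarized above can be sketched as follows. This is a minimal illustration, assuming a small inline XML document invented for the example (the `books`/`book` element names are not from the manual): query() hands back a DOMNodeList, while evaluate() can also return a scalar when the expression yields a number or a string.

```php
<?php
// Illustrative document; the element names are assumptions for this sketch.
$doc = new DOMDocument();
$doc->loadXML('<books><book id="1">A</book><book id="2">B</book></books>');

$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList of the matching nodes.
$nodes = $xpath->query('//book');

// evaluate() returns a typed result where possible:
// count() is an XPath number, which PHP receives as a float.
$count = $xpath->evaluate('count(//book)');

// string() is an XPath string, which PHP receives as a string.
$firstId = $xpath->evaluate('string(//book[1]/@id)');

var_dump($nodes->length, $count, $firstId);
```

For node-set expressions both methods give a DOMNodeList, so evaluate() is mainly useful for expressions such as count(), sum() or string().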

User Contributed Notes

peter at softcoded dot com 03-May-2017 05:51


You may not always know at runtime whether your file has a namespace or not. This can make it difficult to create XPath queries. Use the seriously underdocumented "namespaceURI" property of the documentElement of a DOMDocument to determine if there is a namespace.

Use code such as the following:



$doc = new DOMDocument();
$doc->load($file);
$xpath = new DOMXPath($doc);
$ns = $doc->documentElement->namespaceURI;
if ($ns) {
  $xpath->registerNamespace("ns", $ns);
  $nodes = $xpath->query("//ns:em[@class='glossterm']");
} else {
  $nodes = $xpath->query("//em[@class='glossterm']");
}
// look at nodes here
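
To try the approach above without a file on disk, it can be wrapped in a small function and run against inline XML. This is only a sketch: the function name `countGlossTerms`, the sample documents and the namespace URI `http://example.com/ns` are all made up for illustration.

```php
<?php
// Hypothetical helper: counts <em class="glossterm"> elements whether or
// not the document element declares a namespace.
function countGlossTerms(string $xml): int
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    // namespaceURI is null when the root element has no namespace.
    $ns = $doc->documentElement->namespaceURI;
    if ($ns) {
        $xpath->registerNamespace('ns', $ns);
        $nodes = $xpath->query("//ns:em[@class='glossterm']");
    } else {
        $nodes = $xpath->query("//em[@class='glossterm']");
    }

    return $nodes->length;
}

// Sample inputs (assumed for the example).
$plain      = '<doc><em class="glossterm">XPath</em></doc>';
$namespaced = '<doc xmlns="http://example.com/ns"><em class="glossterm">XPath</em></doc>';

echo countGlossTerms($plain), "\n";
echo countGlossTerms($namespaced), "\n";
```

Both calls should find the single element, since the second one goes through the registered "ns" prefix.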

peter at softcoded dot com 03-May-2017 04:46


Using XPath expressions can save a lot of programming and allow you to home in on only the nodes you want. Suppose you want to delete all empty <p> tags. If you create a query using the following XPath expression, you can find <p> tags that do not have any text (other than spaces), any attributes, any children or comments:



$expression = "//p[not(@*)
   and not(*)
   and not(./comment())
   and normalize-space(text())='']";

   

This expression will only find <p> tags that look like:

<p>[any number of spaces]</p>
<p></p>

Imagine the code you would have to add if you used DOMDocument::getElementsByTagName("p") instead.
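
The note stops at finding the empty <p> tags. A minimal sketch of the deletion step might look like this, assuming a small inline fragment made up for the example and loaded as XML:

```php
<?php
// Sample markup (an assumption for this sketch): two empty <p> elements
// and four that must be kept.
$xml = '<div><p></p><p>   </p><p class="x"></p><p>text</p>'
     . '<p><!-- note --></p><p><b></b></p></div>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

$expression = "//p[not(@*)
   and not(*)
   and not(./comment())
   and normalize-space(text())='']";

// Copy the matches into an array first so that removing nodes
// cannot interfere with the iteration.
$empty = iterator_to_array($xpath->query($expression));
foreach ($empty as $p) {
    $p->parentNode->removeChild($p);
}

$removed   = count($empty);
$remaining = $doc->getElementsByTagName('p')->length;

echo $doc->saveXML($doc->documentElement), "\n";
```

Only the first two <p> elements match; the ones carrying an attribute, text, a comment or a child element are left alone.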

archimedix32783262 at mailinator dot com 03-Sep-2014 12:03


Note that evaluate() will use the same encoding as the XML document.



So if you have a UTF-16 XML, you will have to query using UTF-16 strings.



You can use iconv() to convert from your code's encoding to the target encoding for better legibility.

dhz 05-Jan-2011 10:47


I just spent far too much time chasing this one...

When running an XPath query on a table, be careful about the table's internal nodes (i.e. <tr></tr> and <td></td>). If the enclosing <table> tag is missing, query() (and likely evaluate() too) will return unexpected results.

I had a DOMNode with a structure like this:



<td>
    <table></table>
    <table>
        <tr>
            <td></td>
        </tr>
        <tr>
            <td></td>
            <td></td>
        </tr>
    </table>
</td>



On this node I was trying to run a relative query (i.e. <?php $xpath_obj->query('my/x/path', $relative_node); ?>).

But because of the lone outer <td></td> tags, the inner tags were being invalidated, although the nodes were still recognized. This meant that the following query would work:

<?php $xpath_obj->query('*[2]/*[*[2]]', $relative_node); ?>

But when any of the "*" tokens was replaced with the corresponding (and valid) "table", "tr", or "td" token, the query would inexplicably break.
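
For comparison, here is a sketch of the same relative query run against the structure above when it is loaded as well-formed XML with loadXML(). It does not reproduce the problem described in this note (which presumably depends on how the fragment was parsed); it only shows how a context node is passed and that, in this setup, the wildcard and named forms select the same row.

```php
<?php
// The fragment from the note, loaded as XML (an assumption of this sketch;
// the note does not say how the original node was obtained).
$xml = '<td><table></table><table>'
     . '<tr><td></td></tr>'
     . '<tr><td></td><td></td></tr>'
     . '</table></td>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// The outer <td> is the context node for both relative queries.
$context = $doc->documentElement;

// Second child, then any child of it that itself has a second child.
$byWildcard = $xpath->query('*[2]/*[*[2]]', $context);

// The same path spelled out with element names.
$byName = $xpath->query('table[2]/tr[td[2]]', $context);

var_dump($byWildcard->length, $byName->length);
```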

david at lionhead dot nl 11-Aug-2009 06:33


When using DOMXPath on a document that has a default namespace, consider using an intermediate function to add the namespace prefix to all queries:

<?php
// The document's default namespace (xmlns="http://...") is registered under the prefix "x"
$path = "/Book/Title";

// Insert "x:" after every slash that is followed by a letter
$path = preg_replace('#/([a-zA-Z])#', '/x:$1', $path);

// Result: /x:Book/x:Title
?>
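
Put together with registerNamespace(), the idea can be sketched as below. The function name `prefixPath`, the sample document and the namespace URI `http://example.com/books` are assumptions made for the example, not part of the original note.

```php
<?php
// Hypothetical helper: adds a namespace prefix to every element step
// that follows a slash. It is a plain text substitution, so it would also
// rewrite node tests such as /text() and should only be fed simple paths.
function prefixPath(string $path, string $prefix = 'x'): string
{
    return preg_replace('#/([a-zA-Z])#', '/' . $prefix . ':$1', $path);
}

// Sample document with a default namespace (URI invented for this sketch).
$doc = new DOMDocument();
$doc->loadXML('<Book xmlns="http://example.com/books"><Title>PHP</Title></Book>');

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://example.com/books');

$path  = prefixPath('/Book/Title');
$title = $xpath->evaluate('string(' . $path . ')');

echo $path, "\n", $title, "\n";
```

Without the prefix, "/Book/Title" would match nothing here, because unprefixed names in XPath 1.0 refer to elements in no namespace.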

Mark Omohundro, ajamyajax dot com 14-Dec-2008 09:15


<?php
// To retrieve selected HTML data, try these DOMXPath examples:

$file = $_SERVER['DOCUMENT_ROOT'] . "/test.html";
$doc = new DOMDocument();
$doc->loadHTMLFile($file);

$xpath = new DOMXPath($doc);

// example 1: for everything with an id
//$elements = $xpath->query("//*[@id]");

// example 2: for node data in a selected id
//$elements = $xpath->query("/html/body/div[@id='yourTagIdHere']");

// example 3: same as above with wildcards
$elements = $xpath->query("/*/*/div[@id='yourTagIdHere']");

// query() returns false if the expression is malformed
if ($elements !== false) {
  foreach ($elements as $element) {
    echo "<br/>[" . $element->nodeName . "]";

    // print the value of each child node of the matched element
    $nodes = $element->childNodes;
    foreach ($nodes as $node) {
      echo $node->nodeValue . "\n";
    }
  }
}
?>
