Hi all,
I am parsing an XML document using the DOMDocument class.
What I need is a piece of code that will find duplicate entries in the XML.
A duplicate entry is defined as an entry which shares the same values in 3 specific elements in the XML entry with another entry.
I can think of code I could write from scratch, but is there something built into PHP to help me with that?
Many thanks
Find duplications in XML using DOM
yarons wrote: A duplicate entry is defined as an entry which shares the same values in 3 specific elements in the XML entry with another entry.
volka wrote: Are those entries identical in their string representation (i.e. all text nodes, including those of descendants, concatenated)?
Yes, they are. XPath?
It can probably be done via XPath.
http://www.w3.org/TR/xpath wrote: XPath is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer.
For example, selecting all b elements:
Code:
<?php
$xml = '<entries>
<entry>
<a>abc</a>
<b>def</b>
<c>ghi</c>
</entry>
<entry>
<a>zyx</a>
<b>wvu</b>
<c>tsr</c>
</entry>
<entry>
<a>abc</a>
<b>def</b>
<c>ghi</c>
</entry>
<entry>
<a>1</a>
<b>2</b>
<c>3</c>
</entry>
</entries>';
$dom = new DOMDocument('1.0', 'iso-8859-1');
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
// Select every <b> element anywhere in the document.
$nodelist = $xpath->query('//b');
foreach ($nodelist as $node) {
    echo $node->textContent, "<br />\n";
}
?>
So I can only provide you with an example here:
Code:
<?php
$data = '<entries>
<entry>
<a>abc</a>
<b>def</b>
<c>ghi</c>
</entry>
<entry>
<a>zyx</a>
<b>wvu</b>
<c>tsr</c>
</entry>
<entry>
<a>abc</a>
<b>def</b>
<c>ghi</c>
</entry>
<entry>
<a>1</a>
<b>2</b>
<c>3</c>
</entry>
<entry>
<a>abc</a>
<b>def</b>
<c>ghi</c>
</entry>
<entry>
<a>1</a>
<b>2</b>
<c>3</c>
</entry>
<entry>
<a>abc</a>
<b>def</b>
<c>ghi</c>
</entry>
</entries>';
// Identity transform that skips any <entry> whose string value equals that of an earlier <entry>.
$style = '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="iso-8859-1" indent="no"/>
    <xsl:template match="entry">
        <xsl:if test="not( .=preceding::entry )">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:if>
    </xsl:template>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>';
// Load the stylesheet and hand it to the XSLT processor.
$dom = new DOMDocument('1.0', 'iso-8859-1');
$dom->loadXML($style);
$xsl = new XSLTProcessor();
$xsl->importStylesheet($dom);
// Load the data and print the transformed document.
$dom = new DOMDocument('1.0', 'iso-8859-1');
$dom->loadXML($data);
echo $xsl->transformToXML($dom); // use transformToDoc() to get a new DOMDocument
?>
http://www.w3.org/ and Google can tell you much more about XML, XPath and XSL.
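The stylesheet removes repeated entries by comparing the whole string value of each entry. If the goal is to find the duplicates and to compare only three specific elements, a plain DOMXPath loop might be enough. The following is only a sketch under a few assumptions: the function name findDuplicateEntries is made up for this example, the three elements are taken to be a, b and c as in the sample data, and each entry is assumed to hold at most one of each. The values are joined with a NUL byte, which cannot occur in XML text, so that "ab" + "c" is not mistaken for "a" + "bc".

```php
<?php
/**
 * Hypothetical helper: returns one list of 1-based entry positions for
 * every combination of key element values that occurs more than once.
 */
function findDuplicateEntries(DOMDocument $dom, array $keyElements): array
{
    $xpath = new DOMXPath($dom);
    $groups = [];
    $position = 0;
    foreach ($xpath->query('//entry') as $entry) {
        $position++;
        $parts = [];
        foreach ($keyElements as $name) {
            // string(name) is the text of the first matching child, or '' if there is none.
            $parts[] = $xpath->evaluate('string(' . $name . ')', $entry);
        }
        // Join with NUL so adjacent values cannot run into each other.
        $key = implode("\0", $parts);
        $groups[$key][] = $position;
    }
    // Keep only the keys that were seen more than once.
    $duplicates = array_filter($groups, function ($positions) {
        return count($positions) > 1;
    });
    return array_values($duplicates);
}

$xml = '<entries>
<entry><a>abc</a><b>def</b><c>ghi</c></entry>
<entry><a>zyx</a><b>wvu</b><c>tsr</c></entry>
<entry><a>abc</a><b>def</b><c>ghi</c></entry>
<entry><a>1</a><b>2</b><c>3</c></entry>
</entries>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$duplicates = findDuplicateEntries($dom, ['a', 'b', 'c']);
foreach ($duplicates as $positions) {
    echo 'Entries ', implode(', ', $positions), " share the same a, b and c values\n";
}
// prints: Entries 1, 3 share the same a, b and c values
```

The positions count all entry elements in document order, so they can be fed back into an XPath such as (//entry)[3] if the duplicate nodes themselves are needed.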