selectSingleNode vs. selectNodes: which is faster?

Discussion:

Raoul Borges

2006-06-20 10:27:40 UTC

Ok, it's a strange question.

There is an article on the web:
http://www.devx.com/vb2themax/Tip/18823

======================================================
However, a quick look at the implementation of SelectSingleNode suggests
that using SelectNodes is often preferable. The following pseudocode shows
the internal working of SelectSingleNode:

Public Function SelectSingleNode(xpathExpr As String) As XmlNode
Dim nodes As XmlNodeList = SelectNodes(xpathExpr)
Return nodes(0)
End Function

The SelectSingleNode method internally calls SelectNodes and retrieves all the
nodes that match a given XPath expression. Next it simply returns the first
selected node to the caller.
======================================================
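To make the quoted claim concrete, here is a minimal Python sketch of the behaviour the pseudocode describes. It is a toy model, not MSXML or .NET code: the child list is just a list of element names, and the function names select_nodes, select_single_node and select_first_only are hypothetical. It only shows that, if SelectSingleNode really is built on top of an eager SelectNodes, the whole match list is materialised before the first item is returned, whereas a "first match only" request can stop early.

```python
from typing import List, Optional


def select_nodes(names: List[str], wanted: str) -> List[int]:
    """Eagerly collect the index of every child whose name matches."""
    return [i for i, name in enumerate(names) if name == wanted]


def select_single_node(names: List[str], wanted: str) -> Optional[int]:
    """Mirror of the quoted pseudocode: build the full list, keep item 0."""
    nodes = select_nodes(names, wanted)
    return nodes[0] if nodes else None


def select_first_only(names: List[str], wanted: str) -> Optional[int]:
    """What a [position() = 1] request needs: stop at the first match."""
    for i, name in enumerate(names):
        if name == wanted:
            return i
    return None


# Toy document: 1000 <Employee> children under one parent (an assumption).
children = ["Employee"] * 1000
```

Both select_single_node and select_first_only return the same node here; the difference is only in how much work is done to get there, which is exactly what an implementation is free to optimise away.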

And then the author, Dino Esposito, asserts that:

======================================================
doc.SelectSingleNode("NorthwindEmployees/Employee[position() = 1]")
======================================================

is faster than:
doc.SelectSingleNode("NorthwindEmployees/Employee")

My own take is that this is really... ahem... strange. My own tests show this to be
false: I compared selectSingleNode with selectNodes using the same XPath requests,
and then with selectSingleNode using the same XPath + [position() = 1].
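For anyone who wants to repeat this kind of three-way comparison, below is a rough timing harness. It is only a sketch and makes a big assumption: it uses Python's standard xml.etree.ElementTree (find, findall, and the "[1]" position predicate) as a stand-in, so the numbers say nothing about MSXML 4 or .NET. The document size (5000 Employee elements) and the helper names are made up for the example.

```python
import timeit
import xml.etree.ElementTree as ET

# Build a test document in memory: <NorthwindEmployees> with many <Employee>.
root = ET.Element("NorthwindEmployees")
for i in range(5000):
    ET.SubElement(root, "Employee", id=str(i))


def first_via_find():
    """Analogue of selectSingleNode: ask for one node."""
    return root.find("Employee")


def first_via_findall():
    """Analogue of selectNodes followed by taking item 0."""
    matches = root.findall("Employee")
    return matches[0] if matches else None


def first_via_predicate():
    """Analogue of adding [position() = 1] to the XPath."""
    return root.find("Employee[1]")


if __name__ == "__main__":
    for fn in (first_via_find, first_via_findall, first_via_predicate):
        seconds = timeit.timeit(fn, number=50)
        print(f"{fn.__name__}: {seconds:.4f} s for 50 calls")
```

The three functions must return the same element for the comparison to be fair; only the elapsed time should differ, and that difference depends entirely on the library being measured.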

I searched the net for confirmation, but found nothing except an MSDN article saying
the two were equivalent (without ever saying they share the same code).

Could I have some outside advice about all this?
(If it comes from someone familiar with the internal workings of MSXML, all the better!)

Thanks !

-- Raoul BORGES
P.S.: To email me, remove the Xs from my email address.

Bjoern Hoehrmann

2006-06-20 11:24:08 UTC

Post by Raoul Borges
http://www.devx.com/vb2themax/Tip/18823
Could I have some external advice about all this ?
(if from someone familiar with internal workings of the MSXML, all the better !)

--
Björn Höhrmann · mailto:***@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Raoul Borges

2006-06-20 15:20:13 UTC

You're right, but I'm still surprised:

I searched again, and found this URL:
http://msdn.microsoft.com/msdnmag/issues/03/07/XPathandXSLT/
where the same author explains it a little more:

===================================================
|
| The SelectSingleNode method works as a special case of SelectNodes
| in that it returns only the first element of the returned node-set.
| Unfortunately, up until now, the implementation of SelectSingleNode has
| not been particularly efficient. If you need to locate only the first matching
| node, then calling SelectSingleNode or SelectNodes is nearly identical.
| Moreover, if you need to squeeze out every little bit of performance,
| you're probably better off using SelectNodes.
|
===================================================

I would have disagreed with the following...

===================================================
|
| This is an XPath best practice in general and is not due to the .NET
| Framework implementation in particular.
|
===================================================

.... because my own tests with MSXML4 (not .NET) show that selectNodes
is at least 2 times slower than selectSingleNode, and that adding a
"[position() = 1]" postfix to an XPath string slows both selectSingleNode and
selectNodes by 15%.

But then I went to Apache, looked at Xalan's implementation of
selectSingleNode and selectNodeList, and found similar code:
XPathEvaluator.cpp (lines 93 and 177)

==================================================
|
| XalanNode* XPathEvaluator::selectSingleNode(...)
| {
| const XObjectPtr theResult(...);
|
| const NodeRefListBase& theNodeList = theResult->nodeset();
|
| return theNodeList.getLength() == 0 ? 0 : theNodeList.item(0);
| }
==================================================
|
| NodeRefList& XPathEvaluator::selectNodeList(...)
| {
| const XObjectPtr theResult(...);
|
| result = (theResult->nodeset());
|
| return result;
| }
|
==================================================

Which shows code similar to the pseudocode in the .NET example.

There must be a core difference between MSXML 4 and other DOM
implementations...

: /

Thanks, anyway...
: )

-- Raoul BORGES

Post by Bjoern Hoehrmann

Post by Raoul Borges
http://www.devx.com/vb2themax/Tip/18823
Could I have some external advice about all this ?
(if from someone familiar with internal workings of the MSXML, all the better !)

The article is apparently based on an implementation detail of the
methods in the .NET Framework 1.0; if SelectSingleNode is indeed
implemented as the article states, the conclusions are reasonable (if
you assume there is no lazy evaluation). It does not say anything
about other versions of the Framework or MSXML.
--
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
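
The "no lazy evaluation" caveat in the quoted reply can be illustrated with a small Python sketch. Again this is a toy model with hypothetical names (iter_matches, select_single_node_lazy), not the code of MSXML, .NET or Xalan: the match list is produced by a generator, so asking for a single node stops the scan at the first hit, while asking for every node still walks the whole child list. The visited list is there only to make the amount of work observable.

```python
from typing import Iterator, List, Optional


def iter_matches(names: List[str], wanted: str, visited: List[int]) -> Iterator[int]:
    """Lazily yield matching child indexes, recording every child inspected."""
    for i, name in enumerate(names):
        visited.append(i)
        if name == wanted:
            yield i


def select_single_node_lazy(names: List[str], wanted: str,
                            visited: List[int]) -> Optional[int]:
    """Pull only the first match; the generator is never advanced further."""
    return next(iter_matches(names, wanted, visited), None)


def select_nodes_lazy(names: List[str], wanted: str,
                      visited: List[int]) -> List[int]:
    """Drain the generator to get every match."""
    return list(iter_matches(names, wanted, visited))


# Toy child list (an assumption): 3 <Manager> nodes, then 1000 <Employee>.
children = ["Manager"] * 3 + ["Employee"] * 1000
```

With an engine that evaluates like this, a single-node query would be cheap without any [position() = 1] predicate, which would be consistent with a single-node call measuring faster than a full node-list call.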