Using TreeWalker to query non-element nodes
25 Feb 2021
I was recently working on a project where we needed to frequently extract and then perform computation on all the text nodes in a DOM tree.
If it had been normal HTML elements we'd needed, we'd have had no problem - querySelectorAll()
is great for that. But that can find only elements - not other types of node, e.g. comment nodes or text nodes.
What we needed was a TreeWalker.
TreeWalker? 🔗
As MDN puts it, a TreeWalker, which sounds like something out of A Game of Thrones:
...represents a subset of nodes and a current position within them.
Here's a simple example, to get all elements anywhere within body and add a class, "foo":
let walker = document.createTreeWalker(document.body, NodeFilter.SHOW_ELEMENT),
currNode;
while(currNode = walker.nextNode()) currNode.classList.add('foo');
There, we create a new TreeWalker, telling it the container node to look within, and what sort of nodes we're interested in (this latter takes the form of a static constant on the built-in NodeFilter object). To get HTML elements, we need SHOW_ELEMENT, but there are other options.
We can then iterate over walker to do whatever we want with the captured elements, each time moving on to the next node via nextNode().
We don't have to go to the next node; there are a bunch of traversal methods available to TreeWalkers.
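As a rough sketch of one of those other traversal methods, the hypothetical helper below (directElementChildren is a made-up name, not part of any API) uses firstChild() and nextSibling() to step sideways through only the direct element children of a node, rather than descending through the whole subtree with nextNode(). The doc parameter is an assumption added so the helper can be pointed at any Document.

```javascript
// Hypothetical helper: collect only the direct element children of `root`
// by stepping sideways with nextSibling() instead of descending with nextNode().
function directElementChildren(root, doc = document) {
  const walker = doc.createTreeWalker(root, NodeFilter.SHOW_ELEMENT);
  const children = [];
  // firstChild() moves the walker to the first visible child of the
  // current node (initially `root`) and returns it, or null if none exists.
  let node = walker.firstChild();
  while (node) {
    children.push(node);
    // nextSibling() moves across to the next visible sibling; it does not
    // descend into the children of `node`.
    node = walker.nextSibling();
  }
  return children;
}

// Example usage in a browser (not run here):
//   directElementChildren(document.body).forEach(el => el.classList.add('foo'));
```

For this particular job root.children would do just as well; the point is only to show the walker moving in a direction other than "next".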
Selecting non-element nodes 🔗
So far, we've used TreeWalker to do a job that querySelectorAll('*')
could have done, and could have done more succinctly - namely, to get all elements.
How would we adapt it to select, say, comment nodes, or text nodes? Simple - we just use a different constant for parameter 2:
NodeFilter.SHOW_COMMENT - for comment nodes
NodeFilter.SHOW_TEXT - for text nodes
...and there's a bunch of other possibilities, too, though some are deprecated.
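Putting that together, here is a minimal sketch of gathering text nodes with a TreeWalker. The helper name collectNodes and its doc parameter are assumptions made for illustration; only createTreeWalker(), nextNode() and the NodeFilter constants come from the DOM itself.

```javascript
// Hypothetical helper: return every node under `root` that matches the
// `whatToShow` bitmask, in document order.
function collectNodes(root, whatToShow, doc = document) {
  const walker = doc.createTreeWalker(root, whatToShow);
  const found = [];
  let node;
  // nextNode() returns null once the walker runs out of matching nodes.
  while ((node = walker.nextNode())) found.push(node);
  return found;
}

// Example usage in a browser (not run here):
//   const textNodes = collectNodes(document.body, NodeFilter.SHOW_TEXT);
//
// The SHOW_* constants are bit flags, so they can be combined with |
// to fetch more than one node type in a single pass:
//   const textAndComments = collectNodes(
//     document.body,
//     NodeFilter.SHOW_TEXT | NodeFilter.SHOW_COMMENT
//   );
```

Bear in mind that SHOW_TEXT also picks up the whitespace-only text nodes that sit between tags, so you may want to discard those before doing any computation.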
An (inelegant) alternative for getting, say, all text nodes, would be a multi-loop situation like this:
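let els = document.body.querySelectorAll('*'),
    tNodes = [];
els.forEach(el =>
    // keep only the text-node children (nodeType 3) of each element
    tNodes.push([...el.childNodes].filter(node => node.nodeType === Node.TEXT_NODE))
);
// each push added an array, so flatten into a single list of text nodes
tNodes = tNodes.flat();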
Applying filters 🔗
createTreeWalker() takes an optional third param, a node filter that narrows the results further. This is an object that must implement an acceptNode() method, where we do our filtering. So to accept only divs with the class "foo":
let walker = document.createTreeWalker(
document.body,
NodeFilter.SHOW_ELEMENT,
{acceptNode: el => el.matches('div.foo')}
);
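(Actually, to be completely correct, our acceptNode method is supposed to return another static constant, one of NodeFilter.FILTER_ACCEPT, NodeFilter.FILTER_REJECT or NodeFilter.FILTER_SKIP, but returning a boolean seems to work as well: true coerces to 1, which is the value of FILTER_ACCEPT, and false coerces to 0, which is not an accepting value, so the node is passed over.)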
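For reference, a sketch of the same div.foo filter returning the NodeFilter constants explicitly. The name divFooFilter is made up for illustration, and the use of FILTER_SKIP is an assumption about the intended behaviour: with a TreeWalker it passes over a non-matching element but still visits its descendants, whereas FILTER_REJECT would hide the element's whole subtree.

```javascript
// Hypothetical filter object: accept div.foo, skip everything else.
const divFooFilter = {
  acceptNode(el) {
    return el.matches('div.foo')
      ? NodeFilter.FILTER_ACCEPT
      // FILTER_SKIP ignores this element but keeps walking into its
      // children; FILTER_REJECT would drop the children as well.
      : NodeFilter.FILTER_SKIP;
  },
};

// Example usage in a browser (not run here):
//   const walker = document.createTreeWalker(
//     document.body,
//     NodeFilter.SHOW_ELEMENT,
//     divFooFilter
//   );
```

If nested matches don't matter, or you actively want to ignore everything inside a non-matching element, swapping in FILTER_REJECT prunes those subtrees.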
Performance 🔗
A quick note on performance. Various Stack Overflow answers suggest that TreeWalkers can, in some cases, be much faster than other node-retrieval/iteration approaches.
However, as with all questions of optimisation, results vary massively depending on a plethora of factors. I've done some basic benchmarking and found cases where TreeWalkers were slower than alternatives such as querySelectorAll(), and others where there was little difference.
It's something to bear in mind, though; if you've got expensive DOM traversal operations going on, a TreeWalker may offer some optimisation.
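If you want to measure this on your own pages, a very rough timing helper along these lines may be enough to get a feel for it. averageMs is a hypothetical name, and this is only a sketch: it relies on performance.now() and makes no attempt at warm-up runs or statistical rigour.

```javascript
// Hypothetical helper: average wall-clock time of `fn` over `runs` calls,
// in milliseconds.
function averageMs(fn, runs = 100) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}

// Example usage in a browser console (not run here):
//   averageMs(() => document.body.querySelectorAll('*'));
//   averageMs(() => {
//     const w = document.createTreeWalker(document.body, NodeFilter.SHOW_ELEMENT);
//     while (w.nextNode());
//   });
```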
---
That's it; I hope you found this mini-guide useful!