Querying and transforming JSON with J-Path

2 Jun 2020 set j-path json javascript

Questions asking how to transform a JSON structure into something completely different come up often on Stack Overflow. Over in XML land, we have XPath and XQuery for this. In JSON land, we have various tools, including my very own J-Path, which lets you query and transform JSON using XPath, the language normally used to query XML.

Input

Let's say we have a simple JSON structure, loaded into a JavaScript object, which is a list of films categorised by starring actor.

let input = [
  {
    actor: 'Leslie Nielson',
    films: [
      { genre: 'Comedy', title: 'Airplane' },
      { genre: 'Comedy', title: 'Naked Gun' }
    ]
  },
  {
    actor: 'Colin Firth',
    films: [
      { genre: 'Drama', title: 'Pride and Prejudice' },
      { genre: 'Drama', title: 'The King\'s Speech' }
    ]
  },
  {
    actor: 'John Heder',
    films: [
      { genre: 'Comedy', title: 'Napoleon Dynamite' }
    ]
  }
];

Output

Suppose we'd like to transform that document so the films are categorised by two levels, not just one (actors) - first genre, then actor.

[
  {
    genre: "Comedy",
    films: [
      { actor: "Leslie Nielson", films: [ "Airplane", "Naked Gun" ] },
      { actor: "John Heder", films: [ "Napoleon Dynamite" ] }
    ]
  },
  {
    genre: "Drama",
    films: [
      { actor: "Colin Firth", films: [ "Pride and Prejudice", "The King's Speech" ] }
    ]
  }
]

With J-Path this is a simple task.

The magic

It looks something like this. It's easy because the arduous work of filtering the JSON by various criteria is handled by J-Path.

let output = [...new Set(jpath(input, '//genre')).values()].map(g => {
  let actors = jpath(input, '//actor[parent::item/films//genre="'+g+'"]');
  return {
    genre: g,
    films: [...new Set(actors).values()].map(a => {
      let films = jpath(input, '//films[parent::item/actor="'+a+'"]/item[genre="'+g+'"]/title');
      return { actor: a, films: films };
    })
  };
});

So what's going on there? Remember we said we wanted our genres to be the outermost layer of the nest, so that's our starting point. We feed J-Path our input object and an XPath targeting the different genres.

One thing that's important to know about J-Path is it converts array elements into nodes named item. Since our input is an array, that's why we reference item in our XPath above.
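To picture that item convention, here is a minimal sketch in plain JavaScript. It is not J-Path's code: toXmlSketch is a hypothetical helper written only for this illustration, and the root node name is an assumption. It renders a value as XML-like text, turning every array element into a node named item.

```javascript
// Illustration only: this is NOT how J-Path is implemented.
// toXmlSketch is a hypothetical helper; the "root" name is an assumption.
function toXmlSketch(value, name = 'root') {
  if (Array.isArray(value)) {
    // Each array element becomes an <item> child of the array's node.
    const children = value.map(v => toXmlSketch(v, 'item')).join('');
    return `<${name}>${children}</${name}>`;
  }
  if (value !== null && typeof value === 'object') {
    // Each object property becomes a child node named after its key.
    const children = Object.entries(value)
      .map(([key, v]) => toXmlSketch(v, key))
      .join('');
    return `<${name}>${children}</${name}>`;
  }
  // Primitives become text content (no escaping in this sketch).
  return `<${name}>${String(value)}</${name}>`;
}

const sample = [
  { actor: 'Colin Firth', films: [{ genre: 'Drama', title: 'Pride and Prejudice' }] }
];

console.log(toXmlSketch(sample));
// <root><item><actor>Colin Firth</actor><films><item><genre>Drama</genre><title>Pride and Prejudice</title></item></films></item></root>
```

Seen this way, parent::item in the XPaths refers to the array element that holds both the actor and the films nodes.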

That XPath looks like //genre. In other words: "Give me all the genre nodes found anywhere in the document."

When writing XPath, using // is considered a bit lazy because it searches every level of the document and is therefore more computationally expensive. Really, we should tighten our path to be more targeted. But it's fine here, for this illustrative example.

Now, we want only unique genres, not repeated ones, and for this we utilise JavaScript's Sets API. Sets are sort of like arrays, except they allow only unique values. So:

let mySet = new Set(['foo', 'foo', 'bar']);
console.log([...mySet.values()]); //['foo', 'bar']

See how our duplicate "foo" was omitted? And so it is for our genres - only one "Comedy" and one "Drama" make it in, not the three "Comedy" and two "Drama" entries found in the input.

[...new Set(jpath(input, '//genre')).values()]; //["Comedy", "Drama"]
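For comparison, the same list of unique genres can be collected without J-Path. This is only a plain-JavaScript sketch to show what the //genre query plus the Set is expected to yield; the variable names allGenres and uniqueGenres are made up for this example.

```javascript
const input = [
  { actor: 'Leslie Nielson', films: [{ genre: 'Comedy', title: 'Airplane' }, { genre: 'Comedy', title: 'Naked Gun' }] },
  { actor: 'Colin Firth', films: [{ genre: 'Drama', title: 'Pride and Prejudice' }, { genre: 'Drama', title: 'The King\'s Speech' }] },
  { actor: 'John Heder', films: [{ genre: 'Comedy', title: 'Napoleon Dynamite' }] }
];

// Every genre value in document order, duplicates included.
const allGenres = input.flatMap(entry => entry.films.map(film => film.genre));

// A Set keeps only the first occurrence of each value, in insertion order.
// Spreading the Set directly is equivalent to spreading its .values().
const uniqueGenres = [...new Set(allGenres)];

console.log(allGenres);    // ['Comedy', 'Comedy', 'Drama', 'Drama', 'Comedy']
console.log(uniqueGenres); // ['Comedy', 'Drama']
```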

This is the key to our approach. We first grab the genres, then map them to a new array which will contain our second level (actors) and eventually our third level (films).

To get the second level, actors, we need an XPath which fetches actors, just like the first XPath fetched genres, but this time it has to take into account the current genre we're iterating over. We have this in our g variable, fed to us by the map() function.

let output = [...new Set(jpath(input, '//genre')).values()].map(g => {
  let actors = jpath(input, '//actor[parent::item/films//genre="'+g+'"]');
  //...

See how our second XPath, inside the genres iteration, is formed of static and dynamic input (our genre) this time? It says: "Give me all actors who have a record in input where the genre is {current genre}".
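To make the mix of static and dynamic parts concrete, here is a small sketch that builds the same XPath string with a template literal and prints it for one genre. The helper name actorsByGenrePath is hypothetical and not part of J-Path; it only assembles the string that would be passed to jpath().

```javascript
// Hypothetical helper (not part of J-Path): builds the actor XPath for a genre.
// Note: a genre containing a double quote would break the expression,
// because this sketch does no escaping.
const actorsByGenrePath = genre =>
  `//actor[parent::item/films//genre="${genre}"]`;

console.log(actorsByGenrePath('Comedy'));
// //actor[parent::item/films//genre="Comedy"]

console.log(actorsByGenrePath('Drama'));
// //actor[parent::item/films//genre="Drama"]
```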

Then we use the same technique as before - feed our actors to a set, to weed out the duplicates, and map them to a new array. From there, it's just a question of getting the films matching the current genre and the current actor.

let films = jpath(input, '//films[parent::item/actor="'+a+'"]/item[genre="'+g+'"]/title');

Et voilà! You can run this code and see the output over here.
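If you want to check the result without J-Path to hand, the same regrouping can be sketched in plain JavaScript. This is a cross-check rather than the J-Path approach, and it assumes each actor appears only once in the input, so it skips the second Set.

```javascript
const input = [
  { actor: 'Leslie Nielson', films: [{ genre: 'Comedy', title: 'Airplane' }, { genre: 'Comedy', title: 'Naked Gun' }] },
  { actor: 'Colin Firth', films: [{ genre: 'Drama', title: 'Pride and Prejudice' }, { genre: 'Drama', title: 'The King\'s Speech' }] },
  { actor: 'John Heder', films: [{ genre: 'Comedy', title: 'Napoleon Dynamite' }] }
];

// Level 1: unique genres, in order of first appearance.
const genres = [...new Set(input.flatMap(entry => entry.films.map(film => film.genre)))];

const output = genres.map(genre => ({
  genre,
  // Level 2: actors with at least one film in this genre.
  films: input
    .filter(entry => entry.films.some(film => film.genre === genre))
    .map(entry => ({
      actor: entry.actor,
      // Level 3: titles of this actor's films in this genre.
      films: entry.films.filter(film => film.genre === genre).map(film => film.title)
    }))
}));

console.log(JSON.stringify(output, null, 2));
```

The printed structure should match the target shown in the Output section: Comedy first with two actors, then Drama with one.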