Obtaining structured output from LLMs with LangChain and Node
28 Feb 2025
One of the key problems to solve in AI currently is how to get LLMs to generate output of a predictable structure, such that it can be consumed by a system, rather than merely spat out as conversational output.
To this end, you may have read my guide to AI function calls from a few months back. Since then, there's been an increasing number of models that support so-called "structured output", known in some models as "JSON mode".
There are a few ways to instruct such models. One of them, as we saw with functions, is via JSON schema. In this article I'll be looking at LangChain.
Meet LangChain 🔗
LangChain is a platform, owned by the eponymous organisation, for building applications that work with AI. The platform does a great many things, but one of them is to make obtaining structured output a breeze.
Specific to structured output, LangChain handles four key things:
- Ingestion of the JSON schema (you can also write schema via Zod, if you prefer, which compiles to JSON schema)
- Calling the LLM
- Enforcing adherence to the schema's rules
- Running retries if the LLM fails to respond as required
That's worth the (free) admission price alone.
Set up a Cloudflare Worker 🔗
Let's see LangChain's structured output in action via a Cloudflare Worker. Let's spin up a simple Worker called "langchain-demo".
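The usual way to do that is with Cloudflare's scaffolding tool; the exact command may vary with your package manager and its current version, but with npm it's along the lines of:

npm create cloudflare@latest langchain-demo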
This will guide you through a series of questions. Answer as follows:
- Start with = Hello World example
- Template = Worker only
- Language = JavaScript
- Version control = no
- Deploy = no
Now enter the newly-created directory.
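cd langchain-demo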
Now let's install a couple of dependencies, specifically the core LangChain package and also LangChain's Mistral extension package, since we'll be using the Mistral LLM.
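At the time of writing those are published as @langchain/core and @langchain/mistralai, so with npm:

npm install @langchain/core @langchain/mistralai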
Now let's run the Worker:
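The scaffolded project normally wires this up as an npm script, but calling Wrangler directly also works:

npx wrangler dev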
This will launch our Worker on http://localhost:8787. Head there, and you'll see the Hello World! response.
Lastly, go and sign up for a free API key from Mistral - we'll need that in a second.
If you'd rather use OpenAI, you can; it'll just mean modifying the code that follows to use LangChain's OpenAI package rather than their Mistral package.
Bootstrap the app 🔗
Time for action. We'll build a simple app that takes an incoming word (sent in via the query string) and outputs JSON saying what type of word it is (verb, noun etc.) and some information about its etymology.
First, open up src/index.js and import LangChain's Mistral package (this will implicitly load the core package too), and specify our Mistral API key. (In production we'd read this in from a secret, not write it in code, of course, but this is just a demo.)
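Something like the following at the top of src/index.js - the key value here is obviously a placeholder for your real one:

import { ChatMistralAI } from '@langchain/mistralai';

// Placeholder - in a real Worker, read this from a secret binding instead
const mistralKey = 'YOUR_MISTRAL_API_KEY';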
Now replace the existing fetch()
handler with the following:
async fetch(req) {
  const url = new URL(req.url);
  const word = url.searchParams.get('word');
  if (!word) return new Response('No word passed', {status: 400});
}
So far, so simple: we retrieve the word passed via the word query string parameter, and if it's not present we quit with a 400 (bad request) error.
Set up structured output 🔗
Next, let's prepare our prompt and our schema. Our schema will take the form of JSON schema, and it's our way to demand output of a particular structure. Append the following to our fetch()
handler.
const prompt = `Return the required information about the word "${word}"`;
const schema = {
  '$schema': 'http://json-schema.org/draft-07/schema#',
  title: 'schema for LLM to analyse a word',
  type: 'object',
  properties: {
    type: {
      type: 'string',
      description: 'The type of word - noun, verb etc.'
    },
    etymology: {
      type: 'string',
      description: 'Info about the etymology of the word',
      maxLength: 255
    }
  },
  required: ['type', 'etymology']
};
Our schema is simple: it stipulates two properties, and says that both are required in order for the response to be validated against the schema. If the LLM disobeys, LangChain will handle retries, as we'll see.
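Incidentally, if you'd rather write the schema with Zod (as mentioned earlier), a roughly equivalent version - a sketch, but it can be passed to LangChain in the same way as the JSON schema above - would look like this:

import { z } from 'zod';

// Same two required string properties as the JSON schema version
const schema = z.object({
  type: z.string().describe('The type of word - noun, verb etc.'),
  etymology: z.string().max(255).describe('Info about the etymology of the word'),
});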
Now let's instantiate Mistral-flavoured LangChain:
const model = new ChatMistralAI({
  model: 'mistral-small-latest',
  temperature: 0.2,
  apiKey: mistralKey,
  maxRetries: 5
});
We're using whatever Mistral deem their current flagship "small" model, as it has a good cost-to-performance balance. A temperature of 0.2 means output will be mostly, but not entirely, deterministic, and maxRetries: 5 is the number of retries LangChain will automatically make if the LLM doesn't respond as required.
Nearly there. Next, we need to inform LangChain that we want to use structured output. We do this via the withStructuredOutput()
method, which handles the binding of our schema and the validation of the output against it. This method is available for each LLM package (Mistral, OpenAI etc.) that supports structured output.
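That gives us a new model object, bound to our schema, which the code below assumes is assigned to a variable called structured:

const structured = model.withStructuredOutput(schema);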
Finally, let's fire the request, passing it our prompt, and output the response:
const json = JSON.stringify(await structured.invoke(prompt));
return new Response(json, {headers: {'content-type': 'application/json'}});
Now visit your worker in the browser, passing something like ?word=tomato
as the query string, and you should see the LLM's output just as we need it. The obvious next step would be to feed this response to something further downstream, e.g. save it to a database or whatever.
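The exact wording will differ from run to run, but the shape should always match our schema. For ?word=tomato, the response might look something like this (illustrative only):

{
  "type": "noun",
  "etymology": "From the Spanish 'tomate', itself borrowed from the Nahuatl 'tomatl'."
}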
And that's it!
Did I help you? Feel free to be amazing and buy me a coffee on Ko-fi!