(Tutorial) Directly/indirectly uploading to R2 (S3) buckets via Cloudflare Workers
30 Sep 2024
R2 is Cloudflare's flavour of S3 storage, as pioneered by AWS. Like S3, it's object-based storage for files, but within the excellent Cloudflare ecosystem.
In this tutorial I'll be showing you how to handle uploads to R2, via a Cloudflare Workers app, in two contexts:
- Indirectly, where the upload is sent to R2 via your Cloudflare Worker back-end
- Directly, where the upload is sent directly to R2 via a presigned link
We'll be using JSFiddle to mock up a front end.
Setting up a Worker 🔗
Let's set up our Cloudflare Worker, which will be our app (both front and back end, though it's more common to use a Worker just for the back end and host your front end as a SPA, e.g. on Cloudflare Pages).
First up, somewhere on your machine, run the following to trigger the Worker bootstrapper:
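Assuming npm (substitute pnpm or yarn if that's your flavour):
npm create cloudflare@latest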
This will guide you through a series of questions. Answer as follows:
- Start with = Hello World example
- Template = Hello World Worker
- Language = JavaScript
- Version control = no
- Deploy = no
Now enter the newly-created directory.
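Substituting whatever directory name you chose during the bootstrap:
cd your-worker-name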
Next up, let's install a router, to handle the different endpoints of our app. In this tutorial we'll use Itty Router; it's tiny, and well maintained.
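With npm, that's:
npm install itty-router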
Finally, let's add a couple of S3 dependencies that we'll need later for direct uploads.
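These are the two AWS SDK packages we'll import shortly:
npm install @aws-sdk/client-s3 @aws-sdk/s3-request-presigner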
Now let's set up our app's main file, in which we'll put a couple of routes (for direct and indirect uploads) and the middleware that powers them. Open up src/index.js and replace the entire contents with this:
//prep
import { AutoRouter, cors, withContent, text } from 'itty-router'
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
//create router, with CORS
const { preflight, corsify } = cors();
const router = AutoRouter({
before: [preflight],
finally: [corsify]
})
//indirect uploads route
//TODO
//direct uploads route
//TODO
//export router
export default router
Of course, in production we'd want to add some sort of auth, but for this tutorial we'll leave our app dangerously wide open to accept uploads from all and sundry, no questions asked.
Finally, we need to configure our worker to run on Cloudflare's live network, not our local machine. This is so we can use real R2 - without this, everything will still work fine, but Cloudflare (specifically, Wrangler) will simulate R2 (and any other bindings) on our file system.
Open package.json and change the dev script to include the --remote flag:
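The relevant line should end up looking something like this (other scripts omitted):
"scripts": {
  "dev": "wrangler dev --remote"
}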
Setting up an R2 bucket 🔗
Next is to set up our actual R2 bucket. For this, log in to Cloudflare and, in the left nav, navigate to "R2 Object Storage", then click "Create bucket".
Give it a name of "r2-tut", and leave the other options set to their defaults, then hit Create. You'll be taken to your new bucket; now go to its "Settings" tab.
By default, a bucket is "private", meaning only an application authorised to use the bucket can control the ingress and egress of data to and from it. If we attached a domain to the bucket, thus exposing it on that domain, it would become public. We don't want this; we'll keep it private.
Scroll down to the "CORS Policy" section and add a policy. This is necessary to allow direct uploads via presigned links. Add the following:
[
{
"AllowedOrigins": [
"http://localhost:8787",
"https://fiddle.jshell.net"
],
"AllowedMethods": ["PUT"],
"AllowedHeaders": ["Content-Type", "Authorization"],
"ExposeHeaders": ["ETag"],
"MaxAgeSeconds": 3000
}
]
There's more to CORS policies, but this is the minimum we need. A few notes:
- The allowed origins are the domains from which we'll be sending uploads to R2.
- For indirect uploads, this is our worker domain
- For direct uploads, this is the subdomain of JSFiddle that runs code
- When we come to direct uploads, we'll be PUT'ting our files into R2, hence we need to support the PUT method
Finally, let's bind our bucket to our Worker, by editing the wrangler.toml file that was created as part of the Worker setup process. In the file, search for "r2" and you'll see there's already a section for R2 buckets, but it's commented out. Uncomment the three lines (i.e. remove the leading #) below the explanation and change the value of bucket_name to "r2-tut":
Note, binding can be anything. It's just the name we'll use to reference the bucket within our code.
It's time to spin up our worker! This will launch our worker on http://localhost:8787.
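Using the dev script we edited earlier:
npm run dev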
Handling indirect uploads 🔗
As mentioned up top, indirect uploads means the upload will go to R2 via your Worker. This has the obvious advantage that we can do work to validate the file, its type, size etc., before committing it to R2. On the other hand, it means higher compute costs, as the file is processed and read into memory before being sent on to R2, and latency is higher as the file's total journey is longer.
We won't get into file validation in this tutorial, but let's see how we can receive the file and then save it to R2. Let's build out the indirect uploads route handler we prepared earlier in src/index.js. Its job is to forward the file to R2, then output a simple text response.
//indirect uploads route
router.post('/indirect', withContent, async (req, env) => {
await env.MY_BUCKET.put('some-file', req.content.get('file'));
return text('File uploaded!');
});
A few notes about what's happening there:
- Our route handler expects POST data, formatted as form data
- Specifically, form data with our upload contained in the property file
- We're saving it to R2 as simply "some-file", but in production you'd probably want to be a bit more verbose than that or use a naming convention, perhaps naming the file after the ID of the DB item it relates to, or whatever
- We're using the withContent middleware to automatically parse the request body
- We access our bucket via its binding, which is available (like all CF Worker bindings) on env, which is automatically passed in to Itty route handlers as the second argument
Now we need a front end. Let's set up a simple HTML form on JSFiddle and test it:
<form
method='post'
action='http://localhost:8787/indirect'
enctype='multipart/form-data'
>
<input type='file' name='file'>
<button>Upload</button>
</form>
Run the Fiddle, upload a file and you should find that it ends up in your R2 bucket! You can browse your bucket via Cloudflare's web interface to make sure.
Handling direct uploads 🔗
Direct uploads mean our upload goes straight to R2, not even touching our back end en route. The trade-offs are the mirror image of indirect uploads: lower compute costs and latency, but no opportunity to validate the file in our Worker before it reaches R2.
The first thing we'll need to do is set up a Cloudflare API token scoped to R2 usage. To do this, go to your R2 dashboard and click the "Manage R2 API tokens" link. Once there, create a new token, configured as follows:
- Token name: whatever
- Permissions: Object read and write
- Specify buckets: select your bucket
Once done, Cloudflare will show you your new token's credentials. Make a note of the Access Key ID and Secret Access Key (ignore Token Value), plus your CF account ID (displayed on the R2 dashboard) - we'll need all of these shortly.
Now let's build out the direct uploads route handler we prepared earlier in src/index.js. The job of this route is not to handle the upload, but to generate a presigned link that our front end can send the file to.
router.post('/direct', async req => {
//prep
const cfAccId = 'your-account-id';
const storageUrl = `https://${cfAccId}.r2.cloudflarestorage.com`;
const tokenKey = 'your-access-key-id';
const tokenSecret = 'your-secret-access-key';
//register S3 client
const S3 = new S3Client({
region: 'auto',
endpoint: storageUrl,
credentials: {
accessKeyId: tokenKey,
secretAccessKey: tokenSecret,
},
});
//generate and return presigned URL
const url = await getSignedUrl(
S3,
new PutObjectCommand({Bucket: 'r2-tut', Key: 'some-file2'}),
{expiresIn: 60},
);
return text(url); //return as plain text so the front end can use it directly
});
Be sure to edit the code to replace your-account-id, your-access-key-id and your-secret-access-key with their real values. And of course, in production you wouldn't want these credentials living in your code; you'd add them as secrets, but for this tutorial we're keeping it simple.
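For reference, the secrets approach would look roughly like this - the secret names here are arbitrary, and for local dev you'd mirror them in a .dev.vars file:
//store the credentials as Worker secrets rather than hard-coding them:
//  npx wrangler secret put CF_ACCOUNT_ID
//  npx wrangler secret put R2_ACCESS_KEY_ID
//  npx wrangler secret put R2_SECRET_ACCESS_KEY
//...then read them from env in the route handler:
router.post('/direct', async (req, env) => {
  const storageUrl = `https://${env.CF_ACCOUNT_ID}.r2.cloudflarestorage.com`;
  const S3 = new S3Client({
    region: 'auto',
    endpoint: storageUrl,
    credentials: {
      accessKeyId: env.R2_ACCESS_KEY_ID,
      secretAccessKey: env.R2_SECRET_ACCESS_KEY,
    },
  });
  //...generate and return the presigned URL as before
});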
So what's happening in that route handler?
- First we specify the credentials we'll need to create the link
- Next we register an S3 client (remember R2 is CF's flavour of S3 - it's all S3 under the hood)
- Finally we generate the link, specifying the name of our bucket and the name of the object we want to save the file to (we'll use "some-file2", since we used "some-file" earlier for indirect uploads)
There are many parameters you can specify when generating presigned links, such as an MD5 checksum of the file, so that the file the user uploads matches the one that we agreed to generate a link for. For more, see the S3 docs for PutObjectCommand().
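For example, constraining the content type and checksum might look roughly like this - a sketch with illustrative values; exactly which headers end up signed (and so enforced) depends on the presigner's options, so check the SDK docs:
//sketch: constrain what the presigned URL will accept
const command = new PutObjectCommand({
  Bucket: 'r2-tut',
  Key: 'some-file2',
  ContentType: 'image/png',              //illustrative: the client would then need to send a matching Content-Type
  ContentMD5: 'base64-encoded-md5-here', //illustrative: and a matching Content-MD5 header
});
return text(await getSignedUrl(S3, command, {expiresIn: 60}));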
The last thing to do is to modify our front end. The flow here will be slightly different, mostly handled by JavaScript. Let's set up another Fiddle. Here's our HTML:
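Something like this will do - just a bare file input and a button (no form element, since our JS will handle the request):
<input type='file'>
<button>Upload</button>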
And now for our JS:
const field = document.querySelector('input');
const btn = document.querySelector('button');
btn.onclick = async () => {
if (!field.files.length) return alert('Select a file first!');
const urlReq = await fetch('http://localhost:8787/direct', {
method: 'post'
});
const url = await urlReq.text();
//send the raw file as the request body - presigned PUT URLs store whatever
//body they receive, so wrapping the file in FormData would corrupt the object
const uploadReq = await fetch(url, {
method: 'PUT',
body: field.files[0]
});
alert(uploadReq.ok ? 'Yay!' : 'Nay...');
}
We're not using an HTML form here - we just need the field itself, to capture the file. The JS first fetches a presigned URL, i.e. the R2 URL to which we'll send the file, and then PUTs the file itself, as the raw request body, to that URL.
Give it a go, and once again you should find that the file ends up in your bucket!
Next steps 🔗
Where would you go next? There are a few obvious directions:
- Add authentication. This would be enforced by some middleware, withAuth, which might receive a JWT token (a minimal sketch follows this list).
- Expand the app to cover file viewing/retrieval. For this, our app would generate presigned URLs to view, rather than upload, a particular file. We'd also need to add the GET method to our CORS policy.
- With indirect uploads, validate the file. This would involve installing a package like magic-bytes and performing checks on the file before sending it to R2.
- On the front end, use XMLHttpRequest rather than the Fetch API, since the latter, unfathomably, doesn't expose upload progress, and so without a lot of jumping through hoops you can't show feedback on how the upload is going.
- With direct uploads, build some safeguards into the presigned URL, as touched on earlier. See the S3 docs for more.
Did I help you? Feel free to be amazing and buy me a coffee on Ko-fi!