Getting started with Docker part 2: Volumes

19 Mar 2022 docker mysql php

Welcome to the second part of my four-part guide to learning Docker, in the context of PHP and MySQL.

This time out we'll build a to-do app, and meet Docker "volumes" which, as we'll see, are key to developing and persisting data within Docker.

Setup

First, let's extend our codebase. You may recall from part 1 that we have an empty app directory. Let's add an index.php to it so our setup now looks like this:

learn-docker/
|-- docker/
|   |-- index.php
|-- app/
|   |-- index.php

app/index.php is our app. Let's put the following PHP/HTML in there:

<?php

//create data file if not exists
!is_dir('data') && mkdir('data');
!file_exists('data/data.json') && file_put_contents('data/data.json', '[]');

//get current todos as array
$todos = json_decode(file_get_contents('data/data.json'), true) ?: [];

//save any submitted to-do item
if (!empty($_POST['item'])) {
    $todos[] = $_POST['item'];
    file_put_contents('data/data.json', json_encode($todos));
    header('Location: ?');
    exit; //stop here so nothing else runs before the redirect
}

?>
<!DOCTYPE html>
<head>
    <title>To-do app</title>
</head>
<body>
    <h1>To-dos</h1>

    <!-- new item form -->
    <form method='post'>
        <input type='text' name='item' />
        <button>Add</button>
    </form>

    <!-- current items -->
    <h2>Current items</h2>
    <?php if (!$todos) { ?>
    <p><em>none found...</em></p>
    <?php } ?>
    <ul>
        <?php foreach($todos as $todo) { ?>
        <li><?= htmlspecialchars($todo) ?></li>
        <?php } ?>
    </ul>
</body>

Even if you're not a PHP dev, it should be fairly clear what's happening there.

Our app will save created to-do items to data/data.json. It checks whether the data directory exists and creates it if not, then does the same for the data file itself.

The script then checks if a new item was submitted. If so, it's added to the array and the array is encoded as JSON and saved to our file.

Finally we have some HTML comprising a form for new items, and an unordered list of current ones.

Now we're all set to meet volumes!

Volumes

Volumes are Docker's way of letting us persist data.

Remember from part 1 that containers run in isolation, with a virtualised filesystem and resources. That means when the container is destroyed, nothing of it remains. So if we create it again later, nothing our app did (e.g. write data to a file) will persist.

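We'll be meeting and using two types in this tutorial (strictly speaking, Docker's documentation treats bind mounts as a separate kind of mount from volumes, but both are set up the same way):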

Bind mounts - typically used to develop in Docker by "mounting" files from our device into the container
Named volumes - typically used to persist data written by the app

Both are created by passing the --volume (or -v for short) argument when running a container. What we pass to the argument depends on the type of volume we're using, so let's look at and use each of these in turn.

Bind mounts

In part 1 we successfully bundled a simple PHP script into our image, such that anyone running our container would get that script.

That's fine - for shipping images. But it's not fine for developing. After all, we don't want to constantly rebuild our image to see changes while we code.

And anyway, we may not want to bundle code into our image, or even ship our image at all. Perhaps it's just a local image that should provide a development runtime, not contain actual app code.

So how can we get our local code into a container without bundling it into, and constantly rebuilding, an image? Meet bind mounts!

Bind mounts allow you to "mount" local files (i.e. on your system) into a container, as though those files had shipped with the image.

We specify a bind mount by mapping a local filepath to a path within the container, like so:

docker run ... -v <local-path>:<container-path> ...

Let's try it! First let's stop and remove the container from part 1, if it's still running.

docker rm -f php-container

...where php-container is the name we assigned to our container. The above command tells Docker to force the container's removal - even if it's still running. Essentially this is a shortcut to first stopping (via docker stop) and then removing the container.
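For comparison, the two-step version of that shortcut might look like the sketch below. It assumes the container is still named php-container; use it instead of the command above, not as well as it. One difference worth knowing: docker stop gives the process a chance to shut down cleanly, whereas rm -f kills it straight away.

```shell
# Stop the running container (it still exists afterwards, it's just not running)
docker stop php-container

# Remove the stopped container
docker rm php-container
```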

Next, cd into the project directory:

cd c:\...\learn-docker

Now let's spin up a new container - like before, but this time bind mounting our app's index.php into the webroot:

docker run `
  --name php-container `
  -d `
  -p 80:80 `
  -v "$(pwd)/app/index.php:/var/www/html/index.php" `
  php-app

Our shell expands $(pwd) to the current working directory (i.e. our project directory), so Docker receives the full local path to index.php.

Now visit http://localhost and you should see our app - with the code being read from our machine, not the image! We can verify this by making a trivial change to the code. Change something - like the h1 text - and refresh, and you'll see the change.
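To confirm the bind mount from Docker's side as well, you can ask Docker which mounts it recorded for the container. A small sketch, assuming the container is named php-container as above:

```shell
# Print the container's mounts as JSON; the entry for index.php
# should have "Type":"bind" and a "Source" derived from your project path
docker inspect --format '{{ json .Mounts }}' php-container
```

The exact form of the source path depends on how Docker runs on your system, so don't worry if it isn't character-for-character what you typed.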

Named volumes

So far, so good. Go ahead and add a few to-do items to the list. Refresh the page, and they should still be there, proving they're successfully being saved to, and read from, our JSON file.
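If you're curious where those items actually live right now, you can read the file from inside the container. A quick sketch, assuming the container is named php-container and the webroot is /var/www/html as in the run command above:

```shell
# Print the JSON file PHP has been writing to, from inside the container
docker exec php-container cat /var/www/html/data/data.json
```

Note there's no data directory inside app/ on your machine: we only bind mounted index.php, so the data file exists solely in the container's own filesystem.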

Now let's stop and remove the container like before:

docker rm -f php-container

Then create it afresh, with the same command as before:

docker run `
  --name php-container `
  -d `
  -p 80:80 `
  -v "$(pwd)/app/index.php:/var/www/html/index.php" `
  php-app

When it's up, refresh the page. Oh no! Where did our saved to-do items go?

As mentioned above, nothing remains of a container once it's removed. Its content and resources are utterly fleeting - existing only while it exists. So how do we persist our saved to-do items between container runs?

This is where named volumes come in.

A named volume is a bucket of data - essentially a virtual path whose location is managed by Docker. Named volumes are ideal if we just want to store data, without caring where.

In this case, we want to make our data directory the subject of a named volume so that, when the container is removed and recreated later, the data file we wrote to it will persist.

Named volumes can be created separately, but for this tutorial we'll create one at the same time we run our container, as we did for the bind mount. The syntax for specifying a named volume is:
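For reference, creating the volume separately might look like the sketch below. It's entirely optional: the docker run command further down creates the todo-data volume automatically if it doesn't exist yet.

```shell
# Create an empty named volume ahead of time
docker volume create todo-data

# List the volumes Docker knows about; todo-data should appear
docker volume ls
```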

docker run ... -v <volume-name>:<container-path> ...

Let's stop and remove our container once more.

docker rm -f php-container

Then let's spin it up again - this time with a named volume we'll name "todo-data":

docker run `
  --name php-container `
  -d `
  -p 80:80 `
  -v "$(pwd)/app/index.php:/var/www/html/index.php" `
  -v todo-data:/var/www/html/data `
  php-app

Notice we now have two volume (-v) arguments: one for our bind mount (to get our local code into the container) and one for our named volume, i.e. our data bucket.

One more important step before we head back to the browser. Because we've told Docker to manage the data directory, it will do so under the root user. That's no good for us, because PHP is running under the www-data user and so currently won't be able to write to it.

Let's shell into our container to change the owner. We need only do this once, not each time the container starts. (We could also write a command in our Dockerfile to automate this, if we wished.)

docker exec php-container /bin/sh -c "chown -R www-data data"
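To check the change took effect, you could list the directory from inside the container. A sketch using the same container name and relative path as the command above:

```shell
# Show the data directory's details; the owner column should now read www-data
docker exec php-container ls -ld data
```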

Now let's head back to the browser and refresh. Once again we'll see our empty list. Add a few items, then remove the container like before, then start it up again using the same command we just ran above.

Et voila - our items are still there!

This demonstrates that volumes are totally independent - they're not tied to containers. Even though we destroyed our container, the volume remained and we were able to reattach it to our newly-created container just by referencing todo-data, i.e. its name!
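You can see this independence from the command line too. A sketch, assuming the volume is named todo-data as above:

```shell
# Show the volume's details, including the Mountpoint where Docker keeps its files
docker volume inspect todo-data

# When you no longer need the data, delete the volume explicitly.
# Docker refuses while any container (even a stopped one) still uses it.
# docker volume rm todo-data
```

The removal line is left commented out on purpose, since deleting the volume also deletes the saved to-do items.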

Summary

We've seen how Docker uses volumes to persist data within (and across) containers.

Bind mounts are used to mount code from our local machine into the container, as though the code had been bundled with the image.

Named volumes are buckets of data. We don't know or care where Docker stores them; we just need to remember their name to reattach them to containers later.

Without a volume, our saved data wouldn't survive from one container to the next. When a container is destroyed, everything written to its own filesystem is destroyed with it.

In part three, we'll add MySQL into the mix and learn about multi-container apps. See you there!
