How to host Apache Superset on Azure via Docker container

25 Jun 2025 azure container docker superset

I'm no sys-admin. I cannot emphasise that enough. Nonetheless, I recently had to figure out how to host Apache Superset, an open source BI platform, on our Azure cloud.

This is the guide I wished I'd found when I needed it, so I hope it helps you out.

Essentially, we'll be running Superset inside a Docker container on Azure Container Instances, via a custom image based on the official Superset image.

Setup 🔗

First, fulfill the following prerequisites:

  • An Azure subscription you can create resources in
  • The Azure CLI installed
  • Docker installed and running

Finally, log in to the Azure CLI and let's go:

az login

This guide will use the Azure CLI for some steps and Azure Portal (i.e. via a browser) for others. You could in theory do the whole thing over CLI.

1. Create a resource group 🔗

Skip this step if you already have a resource group within Azure you want to use.

In Azure, everything lives in a resource group, a metadata container for the resources within it. Let's set one up.

  • In Azure Portal, go to resource groups > create
  • Select your subscription and give the group a name
  • Select the region it should be located in. This is NOT the region where the resources within it will (necessarily) run, merely the location of the meta data about them.
  • Save, and make a note of the resource group name for later

2. Create a container registry 🔗

I mentioned up top that we'd be building our own custom image based on the official Superset image. For this, we'll need somewhere to put that image, and that somewhere is an Azure container registry (ACR).

  • In Azure Portal, go to Container registries > create
  • Select your subscription and the resource group we just made
  • Give the ACR a name and choose a location (again, this isn't where our resources will run, merely where our custom images will be stored)
  • Leave domain name label scope on "unsecure" for the purposes of this tutorial. Later, we might want to re-do this and tighten this up (it basically governs how the addresses our containers run on are generated)
  • Choose a pricing plan - go with basic if you're just playing around, or standard if you're building for production
  • Leave everything else, including the other tabs, as-is
  • Save, and make a note of the ACR name for later

3. Create a meta database 🔗

Superset needs a database to store its metadata in. When you run Superset in dev mode (a different process entirely, via Docker Compose), it creates one for you within the container, but in production mode we need to provide our own.

You can use any SQL DB, hosted anywhere, so long as the DB is ultimately publicly accessible. We'll set up a Postgres DB within Azure.

  • In Azure Portal, go to PostgreSQL flexible servers > Create
  • Select your subscription and the resource group we made
  • Specify a server name and make a note of this for later
  • Select whichever region makes sense for you
  • Superset supports Postgres up to version 16 for its meta DB - I went with PG 15
  • Choose a compute + storage that makes sense for you
  • Leave other settings as-is unless you specifically want/need to change them
  • For authentication, select "PostgreSQL authentication only" for simplicity (again, change this if you know what you're doing)
  • Set an admin username and password, and make a note of these for later
  • In the Networking tab, make sure public access is enabled and whitelist the IP range 0.0.0.0 - 255.255.255.255 (you can limit this later once we know our container's IP)
  • Create

4. Create a Redis cache 🔗

Superset also wants a Redis instance, for caching. Azure offers no fewer than three Redis products, which differ by feature set and by who manages them (Azure or Redis themselves). We'll use the original (and simplest) Azure Cache for Redis product.

  • In Azure Portal, go to Azure cache for Redis > create (a dropdown may again ask you which Redis product to use - make sure you choose Azure Cache for Redis again)
  • Select your subscription and the resource group we made
  • Give it a name and region that makes sense for you
  • Choose an SKU and cache size that suits your needs - the cheapest is Basic with a 250 MB cache
  • Under the Networking tab, for Connectivity method choose Public endpoint (or, if you know about this stuff, use a private endpoint if you want)
  • Create
  • After creation, go to the resource > Settings > Authentication > Access keys tab, and copy and make a note of the primary secret

5. Set up Dockerfile 🔗

Time to build our image. First, create a directory somewhere on your machine and cd into it.

mkdir superset
cd superset

Now create a file called Dockerfile (exactly that; no file extension) in that directory, and give it the following content:

# fetch the latest Superset image - we'll base our image on that
FROM apache/superset:latest

# switch to root user
USER root

# install Postgres driver
RUN pip install psycopg2-binary

# write Superset config file
RUN cat <<EOF > /app/superset_config.py
from os import environ
from superset.config import FEATURE_FLAGS as DEFAULT_FEATURE_FLAGS

db_user = environ.get("DB_USER")
db_pass = environ.get("DB_PASS")
db_host = environ.get("DB_HOST")
db_uri = f"postgresql+psycopg2://{db_user}:{db_pass}@{db_host}:5432/postgres"
SQLALCHEMY_DATABASE_URI = db_uri

FEATURE_FLAGS = {
    **DEFAULT_FEATURE_FLAGS,
    "ENABLE_TEMPLATE_PROCESSING": True,
}

REDIS_HOST = environ.get("REDIS_HOST", "localhost")
REDIS_PORT = environ.get("REDIS_PORT", "6379")
REDIS_PASSWORD = environ.get("REDIS_PASSWORD", None)
REDIS_SSL = environ.get("REDIS_SSL", "false").lower() == "true"

redis_url = "rediss://"
if REDIS_PASSWORD:
    redis_url += f":{REDIS_PASSWORD}@"
redis_url += f"{REDIS_HOST}:{REDIS_PORT}/0"

BROKER_URL = redis_url
CELERY_RESULT_BACKEND = redis_url
CELERY_BROKER_TRANSPORT_OPTIONS = {
    "visibility_timeout": 3600,
}
if REDIS_SSL:
    CELERY_BROKER_TRANSPORT_OPTIONS['ssl'] = {'ssl_cert_reqs': 0}
EOF

# Set config path
ENV SUPERSET_CONFIG_PATH=/app/superset_config.py

# Install envsubst to help with env var substitution
RUN apt-get update && apt-get install -y gettext-base && rm -rf /var/lib/apt/lists/*

# Copy and chmod script AS ROOT
COPY entry.sh /app/entry.sh
RUN chmod +x /app/entry.sh

# Now switch to non-root user
USER superset

CMD ["/app/entry.sh"]

There's a lot going on there, some of which is beyond the scope of this article, but the comments should help.

In short, we're fetching the Superset image then making our own tweaks. In particular, we're creating and copying across a custom config file, which will be merged with the default config file. This is all done in Python, since that's what Superset is written in. It references environment variables which we'll pass in via our command (step 8.)

Within that config file, we set up access to our meta DB and the Redis cache, and set a few feature flags. In particular, ENABLE_TEMPLATE_PROCESSING allows us to use Jinja placeholders in our SQL queries, to feed parameters to them. At least, I think that's what it's for.
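To see what that URL assembly produces, here's the same logic replayed in plain shell, with hypothetical values standing in for the environment variables we'll pass in at step 8:

```shell
# hypothetical values standing in for the container's env vars
REDIS_HOST="myredis.redis.cache.windows.net"
REDIS_PORT="6380"
REDIS_PASSWORD="s3cret"

# same assembly the Python config performs: rediss://:password@host:port/0
redis_url="rediss://"
if [ -n "$REDIS_PASSWORD" ]; then
  redis_url="${redis_url}:${REDIS_PASSWORD}@"
fi
redis_url="${redis_url}${REDIS_HOST}:${REDIS_PORT}/0"

echo "$redis_url"
# → rediss://:s3cret@myredis.redis.cache.windows.net:6380/0
```

Note the rediss:// scheme (double "s"): that's Redis over SSL, which is what Azure's port 6380 speaks.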

6. Set up entry script 🔗

You may have noticed that our Dockerfile references a file, entry.sh. This bash file will be responsible for starting up the container and generally kicking things off. Let's create it (in the same directory).

#!/bin/bash
set -e

echo "Starting Superset setup..."

# Upgrade DB & init Superset metadata
superset db upgrade
superset init

# Create admin user if not exists
if ! superset fab list-users | grep -q "${ADMIN_USERNAME:-admin}"; then
  echo "Creating admin user..."
  superset fab create-admin \
    --username "${ADMIN_USERNAME:-admin}" \
    --firstname "${ADMIN_FIRST_NAME:-Admin}" \
    --lastname "${ADMIN_LAST_NAME:-User}" \
    --email "${ADMIN_EMAIL:[email protected]}" \
    --password "${ADMIN_PASSWORD:-admin}"
else
  echo "Admin user already exists, skipping creation."
fi

# Run the Superset server with Gunicorn for production
exec gunicorn --workers 3 --timeout 120 --bind 0.0.0.0:8088 "superset.app:create_app()"

Again, reference the comments for an overview of what's happening, but in short, we:

  • Initiate DB migrations - this tells Superset to build the meta database (unless it did so a previous time we built the container, in which case this step will be skipped)
  • Create an admin user (ditto)
  • Run Superset for production by running Gunicorn (a Python server)

As with Dockerfile, this script references environment variables which we'll pass in via our command (step 8.)
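Those ${VAR:-default} references are plain bash default expansion: if the variable is unset (or empty), the fallback after :- is used instead. A quick way to convince yourself:

```shell
# unset → the fallback after ":-" is used
unset ADMIN_USERNAME
echo "${ADMIN_USERNAME:-admin}"   # prints: admin

# set → the real value wins
ADMIN_USERNAME="alice"
echo "${ADMIN_USERNAME:-admin}"   # prints: alice
```

This is why the container still boots with a working (if insecure) admin/admin login even if you forget to pass the ADMIN_* variables.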

7. Build, tag and push the image 🔗

We're now ready to build our custom image, tag it, and push it to our Azure container registry (ACR), so that, when we build our container, Azure will be able to pull our custom image from the ACR.

In all the commands below, replace {acr-name} with your ACR name.

In Powershell, first log in to the ACR, via Docker, with:

az acr login --name {acr-name}

Now let's build our custom image via Docker. We'll tag it as "my-superset" in the process.

docker build -t my-superset .

Now let's tag it with a reference to our ACR:

docker tag my-superset {acr-name}.azurecr.io/my-superset

Finally, let's push it to our ACR

docker push {acr-name}.azurecr.io/my-superset
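To confirm the push landed, you can list the repositories now stored in your registry (again, swap in your ACR name); "my-superset" should appear in the output:

```shell
# list the repositories stored in the ACR
az acr repository list --name {acr-name} --output table
```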

8. Create container 🔗

And now let's bring it all together. Let's build our Powershell command, which will make a call to az container create, the Azure CLI endpoint for creating a container instance.

If you're not using Powershell (e.g. you're in bash), replace each backtick (`) line continuation below with a backslash (\).

az container create `
  --resource-group {resource-group} `
  --name {container-name} `
  --image {acr-name}.azurecr.io/my-superset:latest `
  --os-type Linux `
  --cpu 2 `
  --memory 2 `
  --ports 8088 `
  --dns-name-label {sub-domain} `
  --environment-variables `
    SUPERSET_SECRET_KEY="{secret-key}" `
    SUPERSET_LOAD_EXAMPLES="yes" `
    DB_USER="{db-user}" `
    DB_PASS="{db-pass}" `
    DB_HOST="{db-server}.postgres.database.azure.com" `
    REDIS_HOST="{redis-name}.redis.cache.windows.net" `
    REDIS_PORT="6380" `
    REDIS_PASSWORD="{redis-secret}" `
    REDIS_SSL="true" `
    ADMIN_USERNAME="{admin-user}" `
    ADMIN_PASSWORD="{admin-pass}" `
    ADMIN_FIRST_NAME="{admin-first-name}" `
    ADMIN_LAST_NAME="{admin-last-name}" `
    ADMIN_EMAIL="{admin-email}"

That's quite a command. Let's go through the bits you need to swap out for real details:

  • Replace {resource-group} with the name of your resource group
  • Replace {container-name} with whatever you want to name the container (e.g. "Superset")
  • Replace {acr-name} with the name of your ACR (Azure container registry)
  • Replace {sub-domain} with whatever subdomain you want to use when accessing Superset in a browser (omitting this parameter will mean the container isn't publicly accessible)
  • Replace {secret-key} with any secret key - just invent one (a long random string, which you should keep safe)
  • Replace the {db-*} bits with the meta DB details you copied earlier
  • Replace {redis-name} and {redis-secret} with the Redis details you copied earlier (it'll almost certainly be port 6380; this is what Azure uses for Redis over SSL.)
  • Replace the {admin-*} bits with details of the admin user you want to create
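For the secret key, one common way to generate a suitably random value (assuming openssl is installed, which it is on most systems):

```shell
# 42 random bytes, base64-encoded - a strong value for SUPERSET_SECRET_KEY
# (keep it safe and reuse it on any rebuild, or Superset's DB encryption
# check will fail)
openssl rand -base64 42
```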

A few other notes:

  • Omit the SUPERSET_LOAD_EXAMPLES env var if you don't want Superset to be initialised with some out-of-the-box charts and dashboards
  • We're omitting the --location argument; this means the container will be created in the same region as the resource group. Include the argument if you want it in a different location.

Before running the command, go into Azure Portal > your container registry > settings > access keys, and show then copy the password.

Now run the command in Powershell (paste it in then hit enter). Azure will prompt you to authenticate with your ACR.

  • For the ACR username, this is nearly always the ACR name itself, so enter that
  • For the password, grab this from Azure Portal > your ACR > settings > access keys > password (show then copy it) and paste it into Powershell, then hit enter

Having done all that, Azure should build your container. When it's finished, you'll see a dump of JSON. You should now be able to navigate to:

http://{sub-domain}.{container-region}.azurecontainer.io:8088

Note: Superset runs by default on port 8088. We can change that via the entry script (the Gunicorn --bind address) and the --ports argument if we really want.
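Concretely, the port comes from the Gunicorn bind address at the end of entry.sh. To serve on, say, port 80 instead, you'd change that final line (and pass --ports 80 in the create command):

```shell
# entry.sh, final line: bind Gunicorn to port 80 instead of the default 8088
exec gunicorn --workers 3 --timeout 120 --bind 0.0.0.0:80 "superset.app:create_app()"
```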

Check you can login as the admin user you set up, and that's it!

Future considerations 🔗

Some final considerations:

  • Now that everything's built, remember to go back to your DB config and limit public access to the IP the container runs on (assuming this is static; if it's non-static, we can't do this.)
  • If for any reason you re-do the process (e.g. to make changes to Superset config), but want to retain the meta DB structure and content that Superset has already initialised, make sure you use the same secret key as before, otherwise the DB encryption check will error (silently)
  • Azure provides two (non-Kubernetes) container products: Container Instances and Container Apps. We're using the former, which is by far the simpler of the two. Container Apps adds automatic scaling, orchestration, and more, so it may be the better choice for heavyweight production apps, but for many, Container Instances will be fine.
  • We could instead host via Kubernetes, but that would drastically change our setup process. I know Docker well, but not Kubernetes, hence I went the Docker route, but if you want to host on Kubernetes, Azure has its Kubernetes Service product.

Did I help you? Feel free to be amazing and buy me a coffee on Ko-fi!