(Tutorial) Hosting Apache Superset part 1: getting Superset running

25 Jun 2025 azure container docker superset

I'm no sys-admin. I cannot emphasise that enough. Nonetheless, I recently had to figure out how to host Apache Superset, an open-source BI platform, on our Azure cloud.

This is the guide I wish I'd found when I needed it. It's in two parts: in part two we'll look at configuring Superset for embedded use, which you may or may not need depending on whether you plan to use Superset directly in the browser or via embeds.

We'll be running Superset inside a Docker container, on Azure container instances, via a custom image based on the official Superset image.

If you want to host it on a different container host, it's really just step 9 you'll need to modify. Everything else should apply everywhere.

This guide uses Superset 4.1.2, the latest release of Superset 4. There is a Superset 5, but the Dockerfile needs some tweaks to work with that, to accommodate this issue.

Setup 🔗

First, fulfill the following prerequisites:

  • An Azure subscription you can create resources in
  • The Azure CLI installed
  • Docker installed and running (we'll be building and pushing an image locally)
  • A terminal - the commands in this guide are written for Powershell

Then log in to the Azure CLI and let's go:

az login

This guide will use the Azure CLI for some steps and the Azure Portal (i.e. via a browser) for others. You could in theory do the whole thing over the CLI.

1. Create a resource group 🔗

Skip this step if you already have a resource group within Azure you want to use.

In Azure, everything lives in a resource group, a metadata container for the resources within it. Let's set one up.

  • In Azure Portal, go to resource groups > create
  • Select your subscription and give the group a name
  • Select the region it should be located in. This is NOT the region where the resources within it will (necessarily) run, merely the location of the meta data about them.
  • Save, and make a note of the resource group name for later
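
If you'd rather do this step over the CLI, the equivalent is roughly the following (the location here is just an example - use whichever region suits you):

az group create --name {res-group} --location uksouth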

2. Create a container registry 🔗

I mentioned up top that we'd be building our own custom image based on the official Superset image. For this, we'll need somewhere to put that image, and that somewhere is an Azure container registry (ACR).

  • In Azure Portal, go to Container registries > create
  • Select your subscription and the resource group we just made
  • Give the ACR a name and choose a location (again, this isn't where our resources will run, merely where our custom images will be stored)
  • Leave domain name label scope on "unsecure" for the purposes of this tutorial. Later, we might want to re-do this and tighten this up (it basically governs how the addresses our containers run on are generated)
  • Choose a pricing plan - go with basic if you're just playing around, or standard if you're building for production
  • Leave everything else, including the other tabs, as-is
  • Save, and make a note of the ACR for later
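
Again, if you prefer the CLI, something like this does the same job (Basic SKU shown - use Standard for production):

az acr create --resource-group {res-group} --name {acr-name} --sku Basic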

3. Create a meta database 🔗

Superset needs a database to store its metadata in. When you run Superset in dev mode (a different process entirely, via Docker Compose), it creates one for you within the container, but in production mode we need to provide our own.

You can use any SQL DB, hosted anywhere, so long as the DB is ultimately publicly accessible. We'll set up a Postgres DB within Azure.

  • In Azure Portal, go to PostgreSQL flexible servers > Create
  • Select your subscription and the resource group we made
  • Specify a server name and make a note of this for later
  • Select whichever region makes sense for you
  • Superset supports Postgres up to version 16 for its meta DB - I went with PG 15
  • Choose a compute + storage that makes sense for you
  • Leave other settings as-is unless you specifically want/need to change them
  • For authentication, select "PostgreSQL authentication only" for simplicity (again, change this if you know what you're doing)
  • Set an admin username and password, and make a note of these for later
  • In the Networking tab, make sure public access is enabled and whitelist the IP range 0.0.0.0 - 255.255.255.255 (you can limit this later once we know our container's IP)
  • Create
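
If you want to script this step instead, something along these lines should work - I set mine up in the Portal, so treat the tier/SKU/storage values below as placeholders rather than recommendations:

az postgres flexible-server create `
    --resource-group {res-group} `
    --name {db-server} `
    --version 15 `
    --admin-user {db-user} `
    --admin-password {db-pass} `
    --tier Burstable `
    --sku-name Standard_B1ms `
    --storage-size 32 `
    --public-access 0.0.0.0-255.255.255.255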

4. Create a Redis cache 🔗

Skip this if you can't be arsed with Redis and are happy for Superset to use in-memory caching instead. However, be aware that you'll lose this cache when you rebuild the container later (e.g. to make config changes, restart it etc.)

Superset also wants a Redis instance, for caching (and API rate-limiting, which we're not interested in here). Azure offers no fewer than three Redis products, which differ by featureset and by who manages them (Azure or Redis themselves). We'll use the original (and simplest) Azure Cache for Redis product.

  • In Azure Portal, go to Azure Cache for Redis > create (a dropdown may ask you which Redis product to use - make sure you choose Azure Cache for Redis)
  • Select your subscription and the resource group we made
  • Give it a name and region that makes sense for you
  • Choose an SKU and cache size that suits your needs - the cheapest is Basic + 250 MB
  • Under the Networking tab, for Connectivity method choose Public endpoint (or, if you know what you're doing, use a private endpoint if you want)
  • Create
  • We'll be connecting to Redis via access keys rather than any other auth means. After creation, go to the Redis resource > Settings > Authentication > Access keys tab, and:
    • Copy and make a note of the primary secret
    • Enable access keys authentication by UN-checking the box (Superset will 500 if you omit this as it won't be able to connect to Redis)
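
As before, here's a rough CLI equivalent if you'd rather skip the Portal (Basic c0 is the smallest/cheapest cache - adjust to taste):

az redis create `
    --resource-group {res-group} `
    --name {redis-name} `
    --location uksouth `
    --sku Basic `
    --vm-size c0

az redis list-keys --resource-group {res-group} --name {redis-name}

The second command prints the access keys - the primary key is the {redis-secret} we'll need later.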

5. Set up config file 🔗

Now we'll set up a basic Superset config file, written in Python. Don't worry if you don't know Python - it's mostly just settings.

Create config.py and give it the following:

from os import environ
from superset.config import FEATURE_FLAGS as DEFAULT_FEATURE_FLAGS

# SS meta DB
db_user = environ["DB_USER"]
db_pass = environ["DB_PASS"]
db_host = environ["DB_HOST"]
SQLALCHEMY_DATABASE_URI = f"postgresql+psycopg2://{db_user}:{db_pass}@{db_host}:5432/postgres"

# SS feature flags
FEATURE_FLAGS = {
    **DEFAULT_FEATURE_FLAGS,
    "ENABLE_TEMPLATE_PROCESSING": True
}

# CSRF protection
WTF_CSRF_ENABLED = True

# Redis
REDIS_HOST = environ["REDIS_HOST"]
REDIS_PASSWORD = environ["REDIS_PASSWORD"]
redis_url = f"rediss://:{REDIS_PASSWORD}@{REDIS_HOST}:6380/0"

# Celery (for background tasks e.g. caching with Redis)
BROKER_URL = redis_url
CELERY_RESULT_BACKEND = redis_url
CELERY_BROKER_TRANSPORT_OPTIONS = {
    "visibility_timeout": 3600,
}

# disable validation of Redis cert - it's managed by Azure so it's robust
CELERY_BROKER_TRANSPORT_OPTIONS['ssl'] = {'ssl_cert_reqs': 0}

Our config file references environment variables, which we'll pass in when we create the container (step 9). The whole thing will be merged into Superset's default config file.

A few points of note:

  • ENABLE_TEMPLATE_PROCESSING allows us to use Jinja placeholders in our SQL queries, to feed parameters to them. At least, I think that's what it's for.
  • rediss:// (as opposed to redis://), and port 6380, both assume our Redis is running over SSL, which, if it's hosted on Azure, it is by default.
  • If you plan to use map charts, you'll want a MapBox API key. Add another setting, MAPBOX_API_KEY = environ["MAPBOX_KEY"]

If you skipped the Redis setup in step 4, remove the Redis and Celery settings above.

6. Set up Dockerfile 🔗

Time to build our image. First, create a directory somewhere on your machine and cd into it.

mkdir superset
cd superset

Now create a file called Dockerfile (exactly that; no file extension) in that directory, and give it the following content:

# fetch the latest Superset image - we'll base our image on that
FROM apache/superset:4.1.2

# switch to root user
USER root

# install Postgres driver and flask-cors explicitly
RUN pip install psycopg2-binary flask-cors

# copy Superset config file and tell Superset where it is
COPY config.py /app/superset_config.py
ENV SUPERSET_CONFIG_PATH=/app/superset_config.py

# Install envsubst to help with env var substitution
RUN apt-get update && apt-get install -y gettext-base && rm -rf /var/lib/apt/lists/*

# copy and chmod script AS ROOT
COPY entry.sh /app/entry.sh
RUN chmod +x /app/entry.sh

# Now switch to non-root user
USER superset

# run entry script
CMD ["/app/entry.sh"]

If you're new to Dockerfile or Docker generally, check out my in-depth guide.

In short, we're fetching the Superset image then making our own tweaks. In particular, we copy across our config file and the entry file which we'll create in step 7.

7. Set up entry script 🔗

You may have noticed that our Dockerfile references a file, entry.sh. This bash file will be responsible for starting up the container and generally kicking things off. Let's create it (in the same directory).

#!/bin/bash
set -e

echo "Starting Superset setup..."

# Upgrade DB & init Superset metadata
superset db upgrade
superset init

# Create admin user if not exists
if ! superset fab list-users | grep -q "${ADMIN_USERNAME:-admin}"; then
    echo "Creating admin user..."
    superset fab create-admin \
        --username "${ADMIN_USERNAME:-admin}" \
        --firstname "${ADMIN_FIRST_NAME:-Admin}" \
        --lastname "${ADMIN_LAST_NAME:-User}" \
        --email "${ADMIN_EMAIL:[email protected]}" \
        --password "${ADMIN_PASSWORD:-admin}"
else
    echo "Admin user already exists, skipping creation."
fi

# Run the Superset server with Gunicorn for production
exec gunicorn --workers 3 --timeout 120 --bind 0.0.0.0:8088 "superset.app:create_app()"

Again, reference the comments for an overview of what's happening, but in short, we:

  • Initiate DB migrations - this tells Superset to build the metadatabase (unless it did so a previous time we built the container, in which case this step will be skipped)
  • Create an admin user (ditto)
  • Run Superset for production by running Gunicorn (a Python server)

As with the Dockerfile, this script references environment variables, which we'll pass in via our container create command (step 9).
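
One bit of syntax worth calling out: the ${VAR:-default} pattern used throughout the script is standard bash parameter expansion - it uses the environment variable if it's set, and falls back to the default otherwise. A quick illustration:

# bash parameter expansion with a fallback value
ADMIN_USERNAME="superset-admin"
echo "${ADMIN_USERNAME:-admin}"   # prints: superset-admin

unset ADMIN_USERNAME
echo "${ADMIN_USERNAME:-admin}"   # prints: admin (the fallback)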

8. Build, tag and push the image 🔗

We're now ready to build our custom image, tag it, and push it to our Azure container registry (ACR), so that, when we build our container, Azure will be able to pull our custom image from the ACR.

In all the below commands, replace {acr-name} with your ACR name.

In Powershell, first login to the ACR, via Docker, with:

az acr login --name {acr-name}

Now let's build our custom image via Docker. We'll tag it as "my-superset" in the process.

docker build -t my-superset . --no-cache

--no-cache won't be necessary the first time you build the image (though it won't do any harm), but it may be useful for subsequent builds where you've made minor changes to the Dockerfile. For me at least, Docker didn't always seem to pick these up, and used cached versions. Could have just been me...

Now let's tag it with a reference to our ACR so Docker knows where we're pushing it to. (Tags, you see, aren't just conveniences; they're used to dictate where, remotely, we want to send the image or retrieve it from.)

docker tag my-superset {acr-name}.azurecr.io/my-superset

Finally, let's push it to our ACR.

docker push {acr-name}.azurecr.io/my-superset
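
An aside: if you'd rather not build locally at all (or your machine builds for a different architecture by default, e.g. an ARM Mac), you can have Azure build and push the image for you in one step:

az acr build --registry {acr-name} --image my-superset:latest .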

9. Create container 🔗

And now let's bring it all together. Let's build our Powershell command, which will call az container create, the Azure CLI command for creating a container instance.

If you're re-building the container, be sure to delete it first.

If you're not in Powershell, replace each trailing ` with \. And make sure there's neither at the end of the last line.

az container create `
    --resource-group {res-group} `
    --name {container-name} `
    --image {acr-name}.azurecr.io/my-superset:latest `
    --os-type Linux `
    --cpu 2 `
    --memory 2 `
    --ports 8088 `
    --dns-name-label {sub-domain} `
    --environment-variables `
        SUPERSET_SECRET_KEY="{secret-key}" `
        SUPERSET_LOAD_EXAMPLES="yes" `
        DB_USER="{db-user}" `
        DB_PASS="{db-pass}" `
        DB_HOST="{db-server}.postgres.database.azure.com" `
        REDIS_HOST="{redis-name}.redis.cache.windows.net" `
        REDIS_PASSWORD="{redis-secret}" `
        ADMIN_USERNAME="{admin-user}" `
        ADMIN_PASSWORD="{admin-pass}" `
        ADMIN_FIRST_NAME="{admin-first-name}" `
        ADMIN_LAST_NAME="{admin-last-name}" `
        ADMIN_EMAIL="{admin-email}"

That's quite a command. Let's go through the bits you need to swap out for real details:

  • Replace {res-group} with the name of your resource group
  • Replace {container-name} with whatever you want to name the container (e.g. "Superset")
  • Replace {acr-name} with the name of your ACR (Azure container registry)
  • Replace {sub-domain} with whatever subdomain you want to use when accessing Superset in a browser (omitting this parameter will mean the container isn't publicly accessible)
  • Replace {secret-key} with a secret key of your choice (see the note after this list for generating one)
  • Replace the {db-*} bits with the meta DB details you copied earlier
  • Replace {redis-name} and {redis-secret} with the Redis details you copied earlier (it'll almost certainly be port 6380; this is what Azure uses for Redis over SSL.)
  • Replace the {admin-*} bits with details of the admin user you want to create
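
On that secret key: Superset uses it to encrypt sensitive metadata, so make it long and random rather than a memorable phrase. The Superset docs suggest generating one like this:

openssl rand -base64 42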

A few other notes:

  • Omit the SUPERSET_LOAD_EXAMPLES env var if you don't want Superset to be initialised with some out-of-the-box charts and dashboards
  • We're omitting the --location argument; this means the container will be created in the same region as the resource group. Include the argument if you want it in a different location.
  • If you want to use map charts, you'll need to pass in the Mapbox API key. Add another env var, MAPBOX_KEY="{your-mapbox-key}" `

Before running the command, go into Azure Portal > your container registry > settings > access keys and show then copy the password - you'll need it in a moment.

Now run the command in Powershell (paste it in then hit enter). Azure will prompt you to authenticate with your ACR.

  • For the ACR username, this is nearly always the ACR name itself, so enter that
  • For the password, grab this from Azure Portal > your ACR > settings > access keys > password (show then copy it) and paste it into Powershell, then hit enter
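
Alternatively, you can pull these credentials straight from the CLI (this requires the ACR's admin user to be enabled, which the first command takes care of):

az acr update --name {acr-name} --admin-enabled true
az acr credential show --name {acr-name}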

Having done all that, Azure should build your container. When it's finished, you'll see a dump of JSON. You should now be able to navigate to:

http://{sub-domain}.{container-region}.azurecontainer.io:8088
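
If you're not sure of the exact address, you can ask Azure for the container's fully qualified domain name:

az container show --resource-group {res-group} --name {container-name} --query ipAddress.fqdn --output tsv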

Note: Superset runs by default on port 8088. We can change that via the Dockerfile if we really want.

Azure container instances default to HTTP. You'll want to add SSL to it later.

Check you can login as the admin user you set up, and that's it!

If you make changes to the config and need to rebuild the container, just repeat steps 8 and 9 - but first delete the container instance from Azure. Do this via Portal or via az container delete --name {container-name} --resource-group {res-group}.

Troubleshooting 🔗

If you're finding the container doesn't show in a browser after you've made it, something's obviously gone wrong.

First, check the container logs.

az container logs --resource-group {res-group} --name {container}

If that gives you "None", then the problem happened before the container even started running (i.e. during build). In which case, let's run the container locally and open a shell inside it so we can start Superset manually.

docker run -it --rm {acr-name}.azurecr.io/my-superset /bin/bash

That'll put you inside the container. Then run:

superset run -p 8088 --with-threads --reload --debugger

Bear in mind that, when running it locally like this, we're not passing all those env vars that we do in the live Powershell command, so expect errors about those being missing, failure to import the config file, etc.
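
If you want a more faithful local reproduction, pass the env vars in with -e flags (use the same values as in your az container create command):

docker run -it --rm `
    -e SUPERSET_SECRET_KEY="{secret-key}" `
    -e DB_USER="{db-user}" `
    -e DB_PASS="{db-pass}" `
    -e DB_HOST="{db-server}.postgres.database.azure.com" `
    -e REDIS_HOST="{redis-name}.redis.cache.windows.net" `
    -e REDIS_PASSWORD="{redis-secret}" `
    {acr-name}.azurecr.io/my-superset /bin/bash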

If that doesn't yield any joy, let's do the same thing but on Azure. Override the command with an instruction to keep the container alive for long enough that we can shell into it, start Superset manually, and see what error we get.

Edit the long Powershell command to include a new line:

--command-line "/bin/bash -c 'while true; do sleep 30; done'"

Delete the container (via Azure Portal), then run the command. Wait until the container is in "Running" state (again, you'll see this in Azure Portal), then run this (first replacing the {...} placeholders with your details):

az container exec `
    --resource-group {res-group} `
    --name {container-name} `
    --exec-command "/bin/bash"

That gives us shell access to our container. Now let's launch our entry script manually, and monitor output:

bash /app/entry.sh

You'll get all sorts of logs. Have a look for errors in there (or dump it into an AI to get the answer quicker.)

Future considerations 🔗

Some final considerations:

  • Now that everything's built, remember to go back to your DB config and limit public access to the IP the container runs on (assuming this is static; if it's not, we can't do this) - see the sketch after this list
  • If for any reason you re-do the process (e.g. to make changes to Superset config), but want to retain the meta DB structure and content that Superset has already initialised, make sure you use the same secret key as before, otherwise the DB encryption check will error (silently)
  • Azure provides two (non-Kubernetes) container products: container instances and container apps. We're using the former, which is by far the simpler of the two; the latter adds automatic scaling, orchestration, and other features, and may be the better choice for heavier production apps. For many, though, container instances will be fine.
  • We could instead host via Kubernetes, but that would drastically change our setup process. I know Docker well, but not Kubernetes, hence I went the Docker route, but if you want to host on Kubernetes, Azure has its Kubernetes Service product.
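
On the first point, here's roughly how you'd tighten the meta DB's firewall once you know the container's IP (replace {container-ip} with whatever the first command returns, then delete the wide-open 0.0.0.0 - 255.255.255.255 rule we added in step 3):

az container show --resource-group {res-group} --name {container-name} --query ipAddress.ip --output tsv

az postgres flexible-server firewall-rule create `
    --resource-group {res-group} `
    --name {db-server} `
    --rule-name allow-superset-container `
    --start-ip-address {container-ip} `
    --end-ip-address {container-ip}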

---

Now head over to part two if you want to use Superset embeds!

Did I help you? Feel free to be amazing and buy me a coffee on Ko-fi!