Apache Superset is free and open-source software from the Apache Software Foundation for advanced data visualization and exploration. Unlike proprietary data visualization tools, it comes with no license fees, which makes it an attractive option for an organization.

Once installed, you can build dashboards from a wide variety of charts and maps, and connect multiple data sources, including different databases, Excel and CSV files, and even APIs (with some workarounds). The result is a far more capable data exploration tool than a simple Excel spreadsheet. When a connected data source is updated, your Superset dashboards pick up the new data the next time they refresh.

Apache Superset dashboard. Source: GitHub

This June, Apache Superset announced version 2, which adds charts and features that were not available in version 1. In this post, I cover how to install Apache Superset version 2 on a computer running Ubuntu Linux. In a later post, I will show how to connect an external database to Apache Superset.

If you came here because you could not install Superset by following the official documentation and ran into errors, that is because the documentation has not been updated for a while, and some of its steps are now outdated.

Before getting started

Before getting started, you will need a server or computer running the latest Ubuntu LTS release, or Windows Subsystem for Linux, with Docker installed.

See the Docker documentation for how to install Docker on your computer. Some basic knowledge of how to use Docker will also help.
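
Before going further, it can help to confirm that Docker is reachable from your shell. The following is a minimal sketch, not part of any official instructions; the function name check_docker is my own.

```shell
# check_docker is a hypothetical helper name, not part of Docker or Superset.
check_docker() {
    if command -v docker >/dev/null 2>&1; then
        # docker --version prints the client version without contacting the daemon
        echo "docker found: $(docker --version)"
    else
        echo "docker not found: install it before continuing" >&2
        return 1
    fi
}
```

After defining the function, run check_docker. If it reports that Docker was found but later commands still fail, running docker info is a reasonable next check, since it also needs to reach the Docker daemon.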

Installing Superset with Docker

Currently, the easiest way to install Superset is with Docker. The officially documented way of installing Superset with pip does not seem to work and produces dependency incompatibility errors. I’m sure the developers will resolve these issues in future updates, but at the time of writing, Docker is the more reliable route.

The official Docker image published by Apache will not get you the latest version either: at the time of writing, it installs an older version of Superset. Follow the steps below instead to install Superset version 2.

Instead of the official image, we will use a community-maintained Docker image published by the user amancevice, which has more than 5 million pulls at the time of writing. It also keeps pace with the latest updates in Superset’s development.

Step 1 – Create a Dockerfile

Rather than pulling and running the Docker image directly, we will write a Dockerfile based on it and build our own image from that Dockerfile.

We are doing this because Superset needs the openpyxl module to read the Excel files that you upload to it.

We also need to make sure the Docker container’s USER is set to root. Otherwise, you won’t be able to upload Excel files to Superset, and Superset seems to throw an internal server error (500).

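    FROM amancevice/superset

    # Run as root; user names are case-sensitive, so this must be lowercase.
    # Switching before the install lets pip install openpyxl system-wide.
    USER root

    # Install openpyxl to read and write Excel files;
    # you won't be able to upload Excel files without it.
    RUN pip install openpyxl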

Save the Dockerfile, navigate to its directory in the terminal, and build the Docker image with the build command. We will tag the image as superset.

    docker build -t superset .

Now you can run the image and create your container. To do that, run the following command, and make sure to name your container superset as shown here, or else the steps below won’t work.

    docker run -p 8088:8088 -d --name superset superset
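
To confirm that the container actually started, and that openpyxl made it into the image, you can query Docker from the host. This is a rough sketch: the helper names are mine, and it assumes the container is named superset as above and that the image provides a python executable on its PATH.

```shell
# Hypothetical helpers, not part of Docker or Superset.

# Print the name and status of every container whose name contains the given string.
container_status() {
    docker ps --all --filter "name=$1" --format '{{.Names}}: {{.Status}}'
}

# Succeeds only if openpyxl can be imported inside the running container.
has_openpyxl() {
    docker exec "$1" python -c "import openpyxl; print(openpyxl.__version__)"
}
```

Call them as container_status superset and has_openpyxl superset. A healthy container should report a status beginning with Up; if it shows Exited instead, docker logs superset should show why.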

Now you need to create an administrator account so that you can log in and create other users, dashboards, and so on. Run the following command to create an admin user account within the Superset instance, with the username admin and the password admin. The email address shown is a placeholder, so substitute your own.

    docker exec -it superset superset fab create-admin \
                  --username admin \
                  --firstname Superset \
                  --lastname Admin \
                  --email admin@example.com \
                  --password admin

The final step is to initialize the Superset database by running the following command. You might be prompted to create a new administrator account; if so, you can enter the details to create a second administrator account during initialization.

    docker exec -it superset superset-init

You can now access the Superset installation at http://127.0.0.1:8088 and log in with the username admin and the password admin.

Initial dashboard after a successful installation
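
Superset can take a little while to start answering requests after the container comes up. As a sketch, the loop below polls the instance until it responds. It assumes curl is installed on the host and that your Superset version serves a /health endpoint; the function name and the retry defaults are arbitrary choices of mine.

```shell
# wait_for_superset URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL until it returns a successful HTTP status.
wait_for_superset() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-2}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP errors (4xx/5xx)
        if curl --silent --fail --output /dev/null "$url"; then
            echo "Superset is responding at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "No response from $url after $attempts attempts" >&2
    return 1
}
```

For example, wait_for_superset http://127.0.0.1:8088/health waits for up to about a minute with the defaults before giving up.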

