DataFusion Server Usage Guide

Using Docker

Pre-built Docker images

DataFusion Server offers two container options: a full-featured container with the Python plugin enabled and a compact container without the Python plugin.

Pull container image from GitHub container registry

The images on the GitHub container registry are provided for the amd64 architecture only.

Full-featured built container:

docker pull ghcr.io/sal-openlab/datafusion-server/datafusion-server:latest

or the container built without the Python plugin:

docker pull ghcr.io/sal-openlab/datafusion-server/datafusion-server-without-plugin:latest

Executing container

Full-featured built container:

docker run -d --rm \
    -p 4000:4000 \
    -v ./data:/var/datafusion-server/data \
    --name datafusion-server \
    ghcr.io/sal-openlab/datafusion-server/datafusion-server:latest

or the container without the Python plugin:

docker run -d --rm \
    -p 4000:4000 \
    -v ./data:/var/datafusion-server/data \
    --name datafusion-server \
    ghcr.io/sal-openlab/datafusion-server/datafusion-server-without-plugin:latest

If you only use the sample data bundled in the container, omit the -v ./data:/var/datafusion-server/data option.

docker run -d --rm \
    -p 4000:4000 \
    --name datafusion-server \
    ghcr.io/sal-openlab/datafusion-server/datafusion-server:latest
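Because docker run -d returns as soon as the container has started, the server may not be accepting requests yet. The following is a minimal sketch of a retry helper that can be used to wait for it. The function name wait_until and the retry count and delay in the example are illustrative assumptions, not part of DataFusion Server; the probe uses the /sysinfo endpoint described in the next section.

```shell
# wait_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES
# times, sleeping DELAY seconds between attempts.
# Hypothetical helper, not part of DataFusion Server.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Discard the probe's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt will follow.
        if [ "$i" -lt "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: probe the server up to 30 times, one second apart.
#   wait_until 30 1 curl -fsS http://localhost:4000/sysinfo
```

The helper returns 0 on the first successful probe and 1 if every attempt fails, so it can be chained with && in a startup script.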

Checking running logs and server statistics

Inspect the container logs:

docker logs datafusion-server

Call the statistics endpoint with cURL and format the output with jq:

curl http://localhost:4000/sysinfo | jq

The result looks like the following:

{
  "name": "datafusion-server",
  "version": "0.9.1",
  "plugin": {
    "pythonInterpreter": "3.11.7 (main, Jan  9 2024, 06:52:32) [GCC 12.2.0]",
    "connectors": [
      {
        "module": "example",
        "version": "1.1.0"
      },
      {
        "module": "excel",
        "version": "1.0.0"
      }
    ],
    "processors": [
      {
        "module": "pivot-table",
        "version": "1.0.0"
      }
    ]
  },
  "statistics": {
    "runningTime": 1277
  }
}
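
When only a few fields of this response are needed, jq can extract them directly. The sketch below assumes the response has the shape shown above and that jq is installed; the function name sysinfo_summary is hypothetical. It also tolerates a response without a plugin section.

```shell
# sysinfo_summary: read a /sysinfo JSON document on stdin and print the
# server version, then one "kind module version" line per plugin module.
# Hypothetical helper; field names are taken from the sample response.
sysinfo_summary() {
    jq -r '
        "version \(.version)",
        (.plugin.connectors // [] | .[] | "connector \(.module) \(.version)"),
        (.plugin.processors // [] | .[] | "processor \(.module) \(.version)")
    '
}

# Example:
#   curl -s http://localhost:4000/sysinfo | sysinfo_summary
```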

Stopping container

docker stop datafusion-server

Building containers yourself

Clone the source code from GitHub

git clone https://github.com/sal-openlab/datafusion-server.git
cd datafusion-server

Execute the bundled shell script

./make-containers.sh

This builds two containers: a full-featured container and a compact container without the Python plugin. Additionally, Docker image files are generated, one for each container:

  • datafusion-server-x.y.z.tar.gz
  • datafusion-server-without-plugin-x.y.z.tar.gz

If the --no-export option is passed to the make-containers.sh script, the Docker image files are not generated.
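
The generated image files can be copied to another host and imported there. The sketch below assumes the archives were written by docker save and gzip-compressed, which docker load reads directly; this guide does not state how the script produces them. The function name load_image_archive is hypothetical.

```shell
# load_image_archive FILE: import an exported image archive into the local
# Docker image store. Assumes the archive was produced by `docker save`
# (optionally gzip-compressed). Hypothetical helper.
load_image_archive() {
    archive=$1
    # Fail early with a clear message instead of a Docker error.
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    docker load -i "$archive"
}

# Example (x.y.z stands for the built version, as in the file names above):
#   load_image_archive datafusion-server-x.y.z.tar.gz
```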

Executing container

Full-featured built container:

docker run -d --rm \
    -p 4000:4000 \
    -p 50051:50051 \
    -p 9100:9100 \
    -v ./data:/var/datafusion-server/data \
    --name datafusion-server \
    datafusion-server:x.y.z

or the container without the Python plugin:

docker run -d --rm \
    -p 4000:4000 \
    -p 50051:50051 \
    -p 9100:9100 \
    -v ./data:/var/datafusion-server/data \
    --name datafusion-server \
    datafusion-server-without-plugin:x.y.z

If you only use the sample data bundled in the container, omit the -v ./data:/var/datafusion-server/data option.

docker run -d --rm \
    -p 4000:4000 \
    -p 50051:50051 \
    -p 9100:9100 \
    --name datafusion-server \
    datafusion-server:x.y.z

The metrics port (default 9100) should not be publicly accessible. It is published here only as an example; in practice, keep it reachable only within the Docker network shared with the Prometheus server.
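
One way to keep the metrics port off the host is to attach the container to a user-defined Docker network and leave out -p 9100:9100. The following is a sketch under assumptions: the network name monitoring and the function name run_with_private_metrics are illustrative, and the Prometheus container is assumed to join the same network, where it could reach the server by container name at datafusion-server:9100.

```shell
# Set DOCKER=echo for a dry run that prints the commands instead of running them.
DOCKER=${DOCKER:-docker}

# run_with_private_metrics NETWORK IMAGE: start the server on NETWORK,
# publishing ports 4000 and 50051 but leaving the metrics port (9100)
# unpublished. Hypothetical helper, not part of DataFusion Server.
run_with_private_metrics() {
    network=$1
    image=$2
    # Create the user-defined network only if it does not exist yet.
    "$DOCKER" network inspect "$network" >/dev/null 2>&1 ||
        "$DOCKER" network create "$network"
    "$DOCKER" run -d --rm \
        -p 4000:4000 \
        -p 50051:50051 \
        --network "$network" \
        --name datafusion-server \
        "$image"
}

# Example:
#   run_with_private_metrics monitoring datafusion-server:x.y.z
```

Containers on the same user-defined network can reach each other's ports without -p, so port 9100 stays invisible from outside the Docker host.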