DataFusion Server Usage Guide

Delta Lake

Data Source Definition

{
  "format": "deltalake",
  "name": "example",
  "location": "s3://my-bucket/delta-table"
}

Set format to deltalake to indicate a Delta Lake table, and set location to the root directory of that table. For the URI schemes that location accepts, refer to the Supported Format and Location Matrix.

Options

Specified Table Version (Delta Table Time Travel Feature)

To read a specific version of a Delta table, set the version key in the options block to that version number. The initial version is 0. If version is omitted, the latest version is used.

{
  "format": "deltalake",
  "name": "example",
  "location": "file:///delta-tables/delta-table",
  "options": {
    "version": 0
  }
}

For details, see the official Delta Lake documentation on Time Travel: Delta Lake Time Travel
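As a minimal sketch of how the version option might be combined with a query request, the shell helpers below build a request body that pins a table version and post it to the server. The function names deltalake_query_payload and query_delta_version are hypothetical helpers introduced for illustration, not part of DataFusion Server. The sketch also assumes that a data source inside dataSources accepts the same options block as the definition shown above, and that the server listens on 127.0.0.1:4000.

```shell
#!/bin/sh
# Hypothetical helpers (not part of DataFusion Server) for querying a
# specific version of a Delta table.

# deltalake_query_payload NAME LOCATION VERSION SQL
# Prints a JSON request body for /dataframe/query.
# No JSON escaping is performed, so NAME, LOCATION and SQL must not
# contain double quotes or backslashes.
deltalake_query_payload() {
  printf '{"dataSources":[{"format":"deltalake","name":"%s","location":"%s","options":{"version":%d}}],"query":{"sql":"%s"}}\n' \
    "$1" "$2" "$3" "$4"
}

# query_delta_version NAME LOCATION VERSION SQL
# Builds the payload and posts it to the server.
# Assumption: the server address matches the one used in this guide.
query_delta_version() {
  deltalake_query_payload "$@" |
    curl -sS -X POST http://127.0.0.1:4000/dataframe/query \
         -H 'Content-Type: application/json' \
         -d @-
}
```

Under these assumptions, calling query_delta_version example file:///delta-tables/delta-table 0 'SELECT * FROM example' would request the table as it existed at its initial version.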

Usage Example

$ curl -X POST http://127.0.0.1:4000/dataframe/query \
     -H 'Content-Type: application/json' \
     -d $'
{
  "dataSources": [
    {
      "format": "deltalake",
      "name": "example",
      "location": "file:///delta-table"
    }
  ],
  "query": {
    "sql": "SELECT * FROM example"
  }
}'

Footnote

Delta Lake tables are accessed through the delta-kernel crate. I am grateful to the members of the delta-kernel-rs project, who have delivered high-quality results so early on.

At present, only reading Delta Lake tables is implemented. Functionality such as write operations and vacuuming is anticipated in the near future, and DataFusion Server plans to expand its capabilities accordingly.

Please refer to the blog post for more information on the Delta Kernel.