The response to the introduction of SQL at data.world has been really amazing. In a short time, SQL has become one of the primary ways that users interact with data, whether through the query UI, or through our API integrations (Python, R, Java).

In response, we continue to invest in SQL as a core technology in our data catalog. While we remain focused on supporting the SQL standard, we’re also extending the standard to support the unique needs of our users. Additionally, we continue to improve our query editor to make writing queries easier.

Multi-table Syntax

A very common pattern that we’ve observed is a single conceptual table being split into different files or sheets (often based on time). Here’s a simple example which splits purchase data by month into separate sheets of a Google Spreadsheet.

Purchases by Month

The same basic data has been split across a number of sheets with each containing the same columns, but different data. We’ve implemented a custom extension that allows you to treat these tables as one.

As an added bonus, you can future-safe your queries with a regular expression.

Aggregations, filtering, ordering and other standard SQL features will continue to work with this multi-table syntax.

Federated Queries

Oftentimes users want to join the data in their local dataset against another dataset in data.world. You can do this using what’s called a federated query.

Each of the following datasets contain a column containing a two-letter state abbreviation. I would like to join these datasets together so that I can do some further analysis on the relationship between them.

The fully qualified path to each table can be specified with a ‘dotted’ notation. Tables may then be joined using standard SQL JOIN syntax.

Note: tables and columns with non-word characters must be escaped with back ticks.

Union, Intersect, Minus

In addition to custom SQL extensions, we also continue to support the SQL standard. UNION, INTERSECT and MINUS have been added recently and are useful when working with multiple tables. This is similar to the multi-table syntax, but more flexible.

Dataset Table Information

There are a couple of special schema tables for each dataset: Tables and TableColumns.

This is useful information, though somewhat hidden. We have therefore included the schema in a new pane that’s displayed alongside the query editor. Just click to copy fully qualified column and table names.


This article is just scratching the surface of what’s possible today with SQL at data.world (filtered aggregations… anyone?). For more information, please check out the dwSQL docs.

We’re constantly adding more functionality to our cloud data catalog so please follow the Advanced SQL dataset and we’ll be sure to update when we have new things to show!

Want to make your data projects easier/faster/better? Streamline your data teamwork with our Modern Data Project Checklist!