DB Pilot's Local Analytics Workspace
Query local files, S3, PostgreSQL, SQLite & more - all at once, from your laptop.
October 27, 2023
So far DB Pilot has mainly been focused on providing a modern database GUI. Now it is becoming more than that - its new analytics workspace provides a full-on data toolbox for your local machine.
The analytics workspace is powered by an embedded ClickHouse instance. You can attach a variety of sources to the workspace, and - thanks to ClickHouse - query them all at once. For example, you can easily join data from a local CSV file with data from your PostgreSQL database.
And with DB Pilot’s AI assistant, writing the appropriate SQL for that is easy as never before.
Embedded ClickHouse
Initially the plan was to build DB Pilot’s analytics workspace around DuckDB. It’s fast, lightweight and straightforward to integrate. But earlier this year an alternative contender was announced: chDB - an embedded SQL OLAP Engine powered by ClickHouse.
ClickHouse’s wider ranging support of data sources - postgresql, mysql, mongodb, s3, local files & more - made this an obviously better fit for DB Pilot.
DB Pilot uses a fork of chDB that adds session support, and improves query execution time. Execution time is improved by moving a cleanup procedure that used to run after every query to a dedicated function that runs only once when disconnecting the client. If you are interested in learning more, see here.
Attaching Sources
You can attach a variety of sources to the analytics workspace. As of now, local files, files on S3, PostgreSQL and SQLite databases can be attached, with support for other sources following soon.
All sources can be queried at once, making it possible to connect all your data.
DB Pilot creates named collections in the embedded ClickHouse database for all source types where ClickHouse supports it. This means you can query your sources under a short alias, and don’t have to specify all connection details in each query. For example, you can query:
SELECT * FROM s3(my_bucket)
Whereas without named collections the query would be more verbose:
SELECT * FROM s3(
https://my-bucket.s3.amazonaws.com/my-file.csv,
'my_aws_access_key_id',
'my_aws_secret_access_key',
'CSVWithNames'
)
By default, there would be one problem though. ClickHouse stores named collection data to a file in plain text, including secrets. This is likely alright when running on a secure server. But it’s less of an option when running on a user’s personal machine, potentially alongside lots of other untrusted software.
To solve this, DB Pilot’s fork of chDB comes with additional functions to add named collections in memory only (see here and here). DB Pilot then stores all secrets securely in the MacOS Keychain, and loads them into memory as required.
Wrap up
If this sounds useful to you, go ahead and give DB Pilot a try. Or in case you want to try chDB on its own, head over to its repository or docs.