
PipelineDB 0.8.5

PipelineDB 0.8.5 is here. Download it now!

Database-level Control for Managing Continuous Queries

Previously, continuous_query_num_combiners + continuous_query_num_workers background workers would automatically start for every database at system startup. If you only needed continuous queries on a single database, this left a lot of idle processes running. Now you can use the ACTIVATE and DEACTIVATE commands to enable and disable continuous query execution, respectively. This state is persisted across restarts, so you only need to run ACTIVATE once.

The continuous_queries_enabled parameter controls whether newly created databases have continuous queries enabled.
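As a rough sketch of how this might look in a pipeline session: the argument-less form of the commands and their scope (the database you are currently connected to) are assumptions drawn from the description above, not confirmed syntax.

```sql
-- Run while connected to the database that needs continuous queries.
-- Assumption: ACTIVATE and DEACTIVATE take no arguments in this release.

-- Start the combiner and worker processes for this database.
-- The state is persisted, so this only needs to be run once.
ACTIVATE;

-- Later, stop continuous query execution for this database.
DEACTIVATE;

-- Inspect the default applied to newly created databases.
SHOW continuous_queries_enabled;
```

SHOW is the standard PostgreSQL command for reading a configuration parameter, so it should work for any parameter the server exposes.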

Throttling Write I/O

Instead of committing to disk after every batch, combiners now honor a new continuous_query_commit_interval configuration parameter, which sets the minimum time a combiner process waits before committing changes to disk. Any data that arrives during the commit interval is merged with on-disk tuples but kept in memory. This gives users better control over write throttling. If you're seeing high disk activity, increasing the commit interval should improve performance, at the cost of potentially losing more data in a crash.
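One way to experiment with the commit interval is sketched below. It assumes PipelineDB inherits ALTER SYSTEM and pg_reload_conf() from the PostgreSQL version it is built on; the value 500 and its unit (assumed to be milliseconds) are illustrative only. The same parameter can also be set in the server configuration file.

```sql
-- Check the current commit interval.
SHOW continuous_query_commit_interval;

-- Raise it to reduce write I/O (illustrative value, unit assumed to be ms).
-- A longer interval means more uncommitted data can be lost in a crash.
ALTER SYSTEM SET continuous_query_commit_interval = 500;

-- Ask the server to re-read its configuration.
-- Assumption: the parameter is reloadable; if not, a restart is required.
SELECT pg_reload_conf();
```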

set_agg Improvements

We added set_agg in the previous release as a way to do exact distinct counting, and this release makes it easier to use. Instead of a bytea field, set_agg now produces an array type that can be manipulated with PostgreSQL's Array Functions and Operators. We've also aliased array_agg(DISTINCT), which was previously unsupported, to use set_agg under the hood, because the two aggregates are semantically equivalent.


INSERT INTO stream (x) VALUES (1), (2), (3);
INSERT INTO stream (x) VALUES (1), (2), (3);
INSERT INTO stream (x) VALUES (3), (4), (5);

(1 row)
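For context, here is a minimal sketch of a continuous view that such inserts could feed, and of querying its array result. The view name distinct_x and the column alias xs are hypothetical, and the x::integer cast assumes that stream columns need an explicit type in this release.

```sql
-- Hypothetical continuous view collecting the distinct values of x.
CREATE CONTINUOUS VIEW distinct_x AS
  SELECT array_agg(DISTINCT x::integer) AS xs FROM stream;

INSERT INTO stream (x) VALUES (1), (2), (3);
INSERT INTO stream (x) VALUES (3), (4), (5);

-- The result is a regular array, so standard PostgreSQL
-- array functions and operators can be applied to it.
SELECT array_length(xs, 1) AS distinct_count,  -- number of distinct values seen
       xs @> ARRAY[4]      AS contains_4       -- array containment test
FROM distinct_x;
```

array_length and the @> containment operator are standard PostgreSQL, so nothing here is specific to set_agg beyond the aggregate itself.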

Client Improvements

We fixed a few bugs in our pipeline command-line client: \d now lists continuous views and streams, and \d$ can be used to list all streams.

pipeline-dump, pipeline-dumpall and pipeline-restore have been fixed to properly dump and restore continuous views, so they can be used to take backups of PipelineDB and restore them later. For now, this is also the standard way to upgrade PipelineDB. The behaviour and API of these binaries are identical to those of their PostgreSQL equivalents: pg_dump, pg_dumpall and pg_restore.

Breaking Changes

We've added primary keys to materialization tables, so this release is not compatible with previous versions.

continuous_query_combiner_cache_mem is deprecated in this release.