
PipelineDB 0.9.7 - Delta Streams and Towards a PostgreSQL Extension

PipelineDB 0.9.7 has shipped! Download it here.

This release contains some minor but necessary catalog improvements, so to migrate your existing installation to PipelineDB 0.9.7 you'll want to use the binary upgrade tool.

Without further ado, here's what PipelineDB 0.9.7 gives you:

Delta Streams

Delta streams are the most interesting new feature in this release. Output streams, released a couple of versions ago, give you a stream of the incremental changes made to a continuous view in the form of old and new tuples. Delta streams build on this abstraction by introducing a third tuple into the output stream: the delta tuple.

Delta tuples contain the "difference" between the new and old tuples. For something like a sum aggregate, (delta).sum is simply (new).sum - (old).sum. Taking this idea further, delta tuples are generalized to work with any aggregate supported by PipelineDB, and the delta values can be combined downstream into new groupings.
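To make the arithmetic concrete, here's a minimal Python sketch of the idea. It's purely illustrative: the `delta` helper and the field names are hypothetical stand-ins, not PipelineDB's implementation, and it only models simple associative aggregates like count and sum.

```python
# Conceptual sketch (not PipelineDB's implementation): old/new/delta tuples
# for simple associative aggregates, where the delta is the arithmetic
# difference and deltas can be re-combined under a coarser grouping downstream.

def delta(old, new):
    """Compute the delta tuple for simple associative aggregates."""
    return {k: new[k] - old.get(k, 0) for k in new}

# Two incremental updates to a per-minute group in an upstream view:
update1 = delta({"count": 10, "sum": 100}, {"count": 12, "sum": 130})
update2 = delta({"count": 12, "sum": 130}, {"count": 15, "sum": 160})

# A downstream consumer can combine the deltas into its own (e.g. hourly) group:
hourly = {"count": 0, "sum": 0}
for d in (update1, update2):
    for k, v in d.items():
        hourly[k] += v

print(hourly)  # {'count': 5, 'sum': 60}
```

The key property is that a consumer never needs to see the raw events: applying the stream of deltas to its own grouping yields exactly the same result.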

Perhaps the most powerful capability this enables is the sharing of work between continuous views for greater efficiency and simpler, more elegant continuous topologies.

Let's briefly look at an example. Suppose we have a continuous view that maintains a few metrics requiring non-trivial state over time (i.e., not aggregates such as count and sum, which need no intermediate state), aggregated at a relatively high dimensionality:

CREATE CONTINUOUS VIEW metrics_1m AS
  SELECT minute(ts), browser, url,
   count(distinct user_id) AS uniques,
   percentile_cont(ARRAY[0.75, 0.90, 0.99]) WITHIN GROUP (ORDER BY latency) AS latencies
  FROM raw_page_views
GROUP BY minute, browser, url;

A very common pattern involves maintaining similar sets of metrics, but at different aggregation levels and dimensionalities, for performance reasons, long-term archiving, etc. For example, we might want an hourly, global rollup of the above metrics that is very fast to query at a glance while remaining compact for archival. If we only care about these metrics across all browsers and urls, the naive solution would be something like:

CREATE CONTINUOUS VIEW global_hourly_metrics_naive AS
  SELECT hour(ts), 
   count(distinct user_id) AS uniques,
   percentile_cont(ARRAY[0.75, 0.90, 0.99]) WITHIN GROUP (ORDER BY latency) AS latencies
  FROM raw_page_views
GROUP BY hour;

While its on-disk footprint is much smaller, global_hourly_metrics_naive must still read the same number of input events from raw_page_views as metrics_1m does, which is obviously sub-optimal.

But we can do better now. Delta streams in conjunction with the combine aggregate make this kind of re-grouping very easy and efficient:

CREATE CONTINUOUS VIEW global_hourly_metrics AS
  SELECT hour((new).minute), 
   combine((delta).uniques) AS uniques,
   combine((delta).latencies) AS latencies
  FROM output_of('metrics_1m')
GROUP BY hour;

These "down aggregated" combined metrics only require reading the stream of updates made to metrics_1m (which is much less intensive than reading from the granular raw_page_views stream), and lose no information in the process. That is, the output of global_hourly_metrics is precisely equivalent to global_hourly_metrics_naive.

Check out the delta stream docs for additional information.

Reaper Procs

The previous release introduced time-to-live (TTL) support for continuous views. TTL expiration requires deleting expired rows in the background, which was previously handled by the autovacuumer. This approach proved problematic, mainly because it wasn't conducive to aggressive enough deletions. By design, the autovacuumer is fairly considerate about the load it puts on tables, and under certain conditions its work can be canceled entirely. Ultimately this meant that TTL deletions could be delayed for quite a while, leading to continuous views growing beyond their intended bounds and to increasingly long-running delete transactions when the deletions did eventually execute.

Unfortunately we learned this the hard way. One of our customers began to observe TTL'd continuous views becoming unexpectedly large, happily maintaining many rows that were far outside the TTL window. This cascaded into a number of other unexpected problems, so an interim solution was devised: running deletions against these large continuous views manually from cron. While not pretty, the approach worked very well, and it ultimately led us to break TTL expiration out into its own process: the "reaper".

The reaper's only job is to execute deletions on TTL-expired rows as smartly as possible. It also limits the number of rows that can be deleted in a single transaction, preventing indefinitely long-running transactions.
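As a rough illustration of the batching idea, here's a hedged Python sketch. The `reap` function and its parameters are hypothetical; the real reaper operates on table rows, with each batch running in its own short SQL transaction.

```python
# Illustrative sketch of the reaper's batching idea (not PipelineDB's actual
# implementation): delete TTL-expired rows in bounded batches so that no
# single delete transaction runs indefinitely.

def reap(rows, ttl_cutoff, batch_size):
    """Remove rows older than ttl_cutoff, at most batch_size per 'transaction'.

    Returns the surviving rows and the number of delete batches executed.
    """
    batches = 0
    while True:
        expired = [r for r in rows if r["ts"] < ttl_cutoff][:batch_size]
        if not expired:
            break
        # In the real system, each iteration would be its own short transaction.
        rows = [r for r in rows if r not in expired]
        batches += 1
    return rows, batches

rows = [{"ts": t} for t in range(10)]  # rows with timestamps 0..9
survivors, batches = reap(rows, ttl_cutoff=6, batch_size=2)
print(len(survivors), batches)  # 4 survivors (ts 6..9), in 3 delete batches
```

Capping each batch keeps individual transactions short and cheap, at the cost of several smaller transactions instead of one large one.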

Check out the docs for more information, and take a look at some of the configuration options reaper processes expose.

HyperLogLog Performance Improvement

HyperLogLog is a fairly integral part of any PipelineDB deployment. It's implicitly used by COUNT(DISTINCT), and generally very useful for any kind of analytics workload involving counting unique things in different ways. PipelineDB exposes a few different ways to build and manipulate HyperLogLog objects.

One very common use of HyperLogLog objects with PipelineDB is to combine many of them together, ultimately allowing users to count uniques across different dimensions while not losing any information. PipelineDB internally combines them as well in order to incrementally update continuous views backed by HLLs.

These combinations are done by unioning the HLLs, and this release adds an internal optimization for these unions. Without getting into the sorcery of how HyperLogLog works, you basically end up with a bunch of registers containing small integers jammed into an array of bytes. A union of two HLLs works by iterating over all of their registers and taking the max of each pair. For compactness these registers can span word boundaries, and therefore require quite a lot of bit packing/unpacking (CPU) to manipulate.

So what this optimization does is "unpack" the HLLs being unioned into a representation in which the registers are directly addressable as elements of an array, reducing the cycles required for unions by quite a bit. The tradeoff is obviously more memory usage at union time, which is more than worth it because the union result is presumably an aggregate result anyway, so it's already "small".
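Here's a toy Python sketch of the pack/unpack/union mechanics. The register width, helper names, and encoding are assumptions for illustration only; PipelineDB's actual C implementation differs in the details.

```python
# Toy model of bit-packed HLL registers (illustrative only, not PipelineDB's
# C implementation). Registers are small integers packed back-to-back, so a
# register can straddle a byte boundary. Unpacking them into a plain list
# makes each register directly addressable, so a union is a simple max loop.

REG_BITS = 6  # assume 6-bit registers; values 0..63

def pack(regs):
    """Bit-pack a list of small integers into a byte string."""
    val = 0
    for i, r in enumerate(regs):
        val |= r << (i * REG_BITS)
    nbytes = (len(regs) * REG_BITS + 7) // 8
    return val.to_bytes(nbytes, "little")

def unpack(packed, nregs):
    """Expand a packed byte string back into a list of registers."""
    val = int.from_bytes(packed, "little")
    mask = (1 << REG_BITS) - 1
    return [(val >> (i * REG_BITS)) & mask for i in range(nregs)]

def union(a, b, nregs):
    """Union two packed HLLs: unpack, take element-wise max, repack."""
    return pack([max(x, y) for x, y in zip(unpack(a, nregs), unpack(b, nregs))])

a = pack([3, 5, 0, 63])
b = pack([1, 7, 2, 10])
print(unpack(union(a, b, 4), 4))  # [3, 7, 2, 63]
```

The optimization in this release amounts to doing the `unpack` step once up front, so the inner max loop works on directly addressable registers instead of repeatedly shifting and masking packed bytes.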

Improved Docker Image

Thanks to the thoughtful work of ankoh, the official PipelineDB Docker image got a facelift in this release. A few of the highlights:

  • Smaller image footprint
  • It's now easier to bind a custom pipelinedb.conf to PipelineDB containers
  • If you want to build your own PipelineDB images, the new Dockerfile now exposes quite a few switches that enable you to only include what you want

Check out the PR for more details.

Odds and Ends

PipelineDB Will Become a Standard PostgreSQL Extension

Last but not least, we've committed to making PipelineDB a standard PostgreSQL extension by version 1.0. The majority of PipelineDB's code is actually in its own modules at this point, but our plan is to incrementally roll out the remaining refactoring work over the upcoming 0.9.8, 0.9.9 and 1.0.0 releases.

In terms of timeline, we're tentatively aiming to release PipelineDB 1.0 simultaneously with PostgreSQL 10, which is currently slated for some time during the third quarter of 2017 (July - September). But if we can ship it earlier than that, we will. And if it takes a bit longer than that to get it right, then that's okay too. We'll keep you posted!

But first, we need your help! The internal refactoring work is relatively straightforward and spec'd out, but we want to be as careful and thoughtful as possible about getting the higher-level PostgreSQL <-> PipelineDB interface just right. And the best way for us to do that is to consider as much input from our amazing userbase as possible. You all have given us countless great ideas, and we'd appreciate a few more as PipelineDB nears the big 1.0.

We've created a generic issue here that we hope will serve as a sounding board for some of these fundamental design considerations, and we sincerely welcome your thoughtful input as we figure out just the right way to map existing PipelineDB functionality onto PostgreSQL's extension interface.
Stay tuned for more! :D