PipelineDB is joining Confluent.

High-performance time-series aggregation for PostgreSQL

An open-source PostgreSQL extension that runs SQL queries continuously on streams, incrementally storing results in tables.

No Application Code

PipelineDB enables realtime data processing using only SQL. Our continuous query planner and execution engine handle all of the complexity of running realtime computation on streams.

Runs on PostgreSQL

PipelineDB is a standard PostgreSQL extension, so it inherits the stability of an extremely mature and reliable database, enhanced by a vibrant ecosystem.

Eliminate ETL

PipelineDB eliminates the ETL layer. Stream granular data directly into the database and continuously distill it with the SQL queries you’ve declared.

Efficient and Sustainable

PipelineDB stores only the output of continuous queries, which are incrementally updated as data is ingested. The database’s size is independent of the amount of data ingested over time.

What It Can Do

PipelineDB can do everything PostgreSQL can do, with some powerful additions designed for high-throughput, streaming workloads:

Continuous Aggregations

Continuously aggregate, filter, and distill streaming data into summary data in realtime with continuous SQL queries and store the results in PipelineDB.
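To make the idea concrete, here is a minimal Python sketch of a continuously maintained aggregate. It is a conceptual model only, not PipelineDB's implementation or API; the class name `ContinuousCount` and the sample events are hypothetical. It mimics a continuous `count(*)` grouped by key with an optional filter, keeping only the aggregate rows and never the raw events.

```python
from collections import defaultdict


class ContinuousCount:
    """Toy stand-in for a continuous view computing count(*) GROUP BY key."""

    def __init__(self, predicate=lambda event: True):
        self.predicate = predicate    # plays the role of a WHERE clause
        self.rows = defaultdict(int)  # only aggregate output is stored

    def ingest(self, events):
        """Fold a batch of (key, status) events into the stored counts."""
        for event in events:
            if self.predicate(event):
                key, _status = event
                self.rows[key] += 1
        # The batch is discarded here; raw events are never retained.


# Count only successful requests per URL (hypothetical data).
view = ContinuousCount(predicate=lambda event: event[1] == 200)
view.ingest([("/home", 200), ("/about", 200), ("/home", 500)])
view.ingest([("/home", 200)])
print(dict(view.rows))  # {'/home': 2, '/about': 1}
```

The stored state grows with the number of distinct keys, not with the number of events ingested.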

Sliding Window Queries

Run continuous queries over custom time windows (1 second, 1 minute, 1 day, 30 days, etc.). The windowed query results are stored in PipelineDB, and raw data can be discarded once the window time has passed.
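One way to picture a sliding-window count is as a set of small time buckets that expire as the window moves forward. The Python sketch below illustrates that general technique; whether PipelineDB uses this exact scheme internally is an assumption, and `SlidingWindowCount` is a hypothetical name. The window boundary is accurate only to the bucket size.

```python
class SlidingWindowCount:
    """Count events in a trailing time window using per-bucket partial counts."""

    def __init__(self, window_seconds, bucket_seconds=1):
        self.window = window_seconds
        self.bucket = bucket_seconds
        self.buckets = {}  # bucket start time -> count of events in the bucket

    def ingest(self, timestamp):
        """Add one event to the bucket that covers its timestamp."""
        start = timestamp - (timestamp % self.bucket)
        self.buckets[start] = self.buckets.get(start, 0) + 1

    def count(self, now):
        """Drop buckets entirely older than the window, then sum the rest."""
        cutoff = now - self.window
        expired = [s for s in self.buckets if s + self.bucket <= cutoff]
        for start in expired:
            del self.buckets[start]
        return sum(self.buckets.values())


sw = SlidingWindowCount(window_seconds=60)
for t in (0, 10, 59):
    sw.ingest(t)
print(sw.count(now=59))   # 3: every event is at most 60 seconds old
print(sw.count(now=70))   # 2: the event at t=0 has left the window
print(sw.count(now=200))  # 0: all buckets have expired
```

Because expired buckets are deleted, memory use is bounded by the window length divided by the bucket size.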

Joining Streams on Tables

Streaming analytic data oftentimes needs context. PipelineDB’s integrated relational storage engine enables you to join streaming data on historical data for comparison and analysis in realtime.
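As a rough illustration of joining a stream on a table, the Python sketch below enriches each incoming event with a row from a lookup table and aggregates by the joined column. The `users` table, the event shape, and the function name `join_and_count` are hypothetical, and a plain dictionary stands in for relational storage.

```python
from collections import Counter

# Hypothetical "historical" table: user_id -> subscription plan.
users = {1: "free", 2: "pro", 3: "pro"}


def join_and_count(events, table, counts):
    """Join each (user_id, action) event on the table and count by plan."""
    for user_id, _action in events:
        plan = table.get(user_id)
        if plan is None:
            continue  # inner-join semantics: unmatched events are dropped
        counts[plan] += 1
    return counts


counts = Counter()
join_and_count([(1, "click"), (2, "click"), (9, "click")], users, counts)
join_and_count([(3, "view")], users, counts)
print(dict(counts))  # {'free': 1, 'pro': 2}
```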

Probabilistic Data Structures

With realtime analytics at scale, trading a small amount of accuracy for speed is often acceptable. PipelineDB supports data structures and algorithms such as Bloom filters, count-min sketch, Filtered-Space-Saving top-k, HyperLogLog, and t-digest for very accurate approximations on high-volume streams.
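To show what such a tradeoff looks like, here is a small textbook Bloom filter in Python. It is a generic sketch, not PipelineDB's implementation; the sizes and the SHA-256-based hashing are arbitrary choices made for clarity. A Bloom filter answers "have I seen this item?" in fixed memory: it never gives a false negative, but it can occasionally give a false positive.

```python
import hashlib


class BloomFilter:
    """Fixed-size set membership with possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        """Derive num_hashes bit positions from seeded SHA-256 digests."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        """True means 'possibly seen'; False means 'definitely not seen'."""
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
for user_id in range(100):
    seen.add(user_id)

# Every inserted item is reported as present (no false negatives).
print(all(seen.might_contain(user_id) for user_id in range(100)))  # True
```

The filter's memory footprint stays at `num_bits` no matter how many items are added; only the false-positive rate rises as it fills up.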

How It Works

Continuous views are the core abstraction of PipelineDB. You can think about them as very high-throughput, realtime, incrementally updated materialized views. The most important property of a continuous view is that only its output is actually stored in the database.
Stream Buffer: Tuples inserted into a stream are stored in a concurrent, shared-memory circular buffer.
Worker: Worker processes read micro-batches from the stream buffer, aggregate them into partial results, and send them to combiner processes.
Combiner: Combiner processes take partial results from workers and continuously merge them with on-disk tuples.
[Figure: data flow from stream buffer to worker to combiner, labeled "SELECT count(*) FROM stream", "SELECT sum(count) FROM view, worker", and "10 + 5".]
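The stream buffer, worker, and combiner stages can be sketched in a few lines of Python. This is a single-process conceptual model under stated assumptions: a `deque` stands in for the shared-memory circular buffer, a dictionary stands in for on-disk tuples, and the function names are hypothetical. The numbers mirror the "10 + 5" label, read here as a stored count of 10 merged with a partial count of 5.

```python
from collections import Counter, deque

stream_buffer = deque()  # stand-in for the shared-memory circular buffer


def worker_read_batch(buffer, batch_size):
    """Drain up to batch_size tuples and reduce them to partial counts."""
    partial = Counter()
    for _ in range(min(batch_size, len(buffer))):
        key = buffer.popleft()
        partial[key] += 1
    return partial


def combine(on_disk, partial):
    """Merge a worker's partial counts into the stored view rows."""
    for key, count in partial.items():
        on_disk[key] = on_disk.get(key, 0) + count
    return on_disk


on_disk = {"clicks": 10}              # result already stored in the view
stream_buffer.extend(["clicks"] * 5)  # five new tuples arrive on the stream

partial = worker_read_batch(stream_buffer, batch_size=100)
combine(on_disk, partial)
print(on_disk)  # {'clicks': 15}
```

Because counts are merged by addition, the combiner only needs the partial result, not the individual tuples, which is why the raw stream data does not have to be stored.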