NEW: Scale AI Case Study — ~1,900 data requests per week across 4 business units Read now →
Apache Druid

Apache Druid is a real-time analytics database created at Metamarkets in 2011 by Eric Tschetter, Fangjin Yang, Gian Merlino, and Vadim Ogievetsky. It defined the real-time OLAP category and once ran much of the world's ad-tech analytics, but by 2026 it has been losing ground to ClickHouse.

Apache Druid is a real-time analytics database that, for most of the 2010s, defined what "real-time OLAP" meant. It was the engine that proved you could ingest events from Kafka and serve sub-second analytical queries against billions of rows — a combination that warehouses could not do and that, before Druid, mostly required custom engineering at companies like LinkedIn, Yahoo, and Facebook.

Druid is a foundational technology in this category. It also, in 2026, has the unmistakable feel of a system whose moment has passed. New deployments are rare. The center of gravity in real-time OLAP has decisively moved to ClickHouse. Existing Druid deployments at large adopters like Netflix and Airbnb continue to run, but the conversation in this category is no longer "Druid vs Pinot" — it is "ClickHouse, and what about migrating off Druid?"

The Metamarkets Origin

Druid was created in 2011 at Metamarkets, a programmatic ad-tech analytics startup based in San Francisco. Metamarkets' customers were ad networks and ad exchanges that needed to slice billions of bid events by dimensions like publisher, advertiser, geography, device, and creative — in real time — to serve interactive dashboards. The four engineers who built it were Eric Tschetter (the original architect, who wrote a famous blog post titled "Introducing Druid: Real-Time Analytics at a Billion Rows Per Second"), Fangjin Yang, Gian Merlino, and Vadim Ogievetsky.

Metamarkets had tried existing tools — Hadoop was too slow for interactive queries, traditional databases couldn't ingest fast enough, and column stores like Vertica were too expensive at the scale Metamarkets needed. So they built their own, with a specific design goal: interactive (sub-second) queries on streaming event data with petabyte-scale storage and high concurrency.

Druid was open-sourced in October 2012 under the Apache license. By 2013-2014 it had been adopted at Netflix, Yahoo, eBay, Twitter, and Airbnb. In 2015, Yang, Merlino, and Ogievetsky left Metamarkets to found Imply, the commercial company stewarding Druid (analogous to Confluent for Kafka or Databricks for Spark). Druid became an Apache top-level project in early 2019. Metamarkets itself was acquired by Snap in 2017.

How Druid Actually Works

Druid's architectural recipe was, for its time, novel and influential:

Time-partitioned segments. Druid stores data in time-chunked segments (typically hourly or daily). Queries that filter by time only scan relevant segments — a critical optimization for time-series-shaped event data.
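A toy sketch of that pruning logic, with plain Python dicts and hourly buckets standing in for Druid segments (this is an illustration of the idea, not Druid's implementation):

```python
from datetime import datetime, timedelta

# Illustrative sketch: events are bucketed into hourly "segments", and a
# time-filtered query only scans the buckets that overlap the query interval.

def segment_key(ts: datetime) -> datetime:
    """Truncate a timestamp to its hourly segment boundary."""
    return ts.replace(minute=0, second=0, microsecond=0)

def build_segments(events):
    """Group (timestamp, payload) events into hourly segments."""
    segments = {}
    for ts, payload in events:
        segments.setdefault(segment_key(ts), []).append((ts, payload))
    return segments

def query(segments, start: datetime, end: datetime):
    """Scan only segments whose hour overlaps [start, end)."""
    hits = []
    for key, rows in segments.items():
        if key + timedelta(hours=1) <= start or key >= end:
            continue  # segment pruned: no overlap with the query interval
        hits.extend(r for r in rows if start <= r[0] < end)
    return hits
```

The payoff is that a query over one hour of data never touches the other segments at all, regardless of total data volume.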

Pre-aggregation at ingest (rollups). Druid is famous for its rollup feature: at ingest time, you specify a set of dimensions and metrics, and Druid pre-aggregates incoming events into compact summary rows. If you ingest 10 billion raw events with rollup, you might end up with 100 million rolled-up rows — a 100x reduction. Queries then run over the rollups, not the raw events. This was a brilliant trade for ad-tech, where raw event detail was often unnecessary and the speedup was dramatic.
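The rollup idea can be sketched in a few lines. This is an illustrative Python reduction, not Druid's ingest code, and the field names (`__time`, `publisher`, `clicks`) are hypothetical:

```python
from collections import defaultdict

# Illustrative rollup: raw events that share an hour bucket and the same
# dimension values collapse into one summary row with summed metrics.

def rollup(events, dims, metrics):
    """events: dicts with an ISO '__time' plus dimension and metric fields.
    Returns one row per unique (hour, *dims) key with metrics summed."""
    table = defaultdict(lambda: defaultdict(int))
    for e in events:
        hour = e["__time"][:13]  # truncate 'YYYY-MM-DDTHH:MM' to the hour
        key = (hour,) + tuple(e[d] for d in dims)
        for m in metrics:
            table[key][m] += e[m]
    return [{**dict(zip(("__time", *dims), key)), **vals}
            for key, vals in table.items()]
```

The trade is visible in the signature: once events are collapsed, queries can only group by the dimensions chosen at ingest time, and the raw events are gone.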

Columnar storage with bitmap indexes. Like ClickHouse, Druid stores columns separately. It also builds bitmap indexes on string columns, which makes filter operations on high-cardinality categorical dimensions (publisher, advertiser, country) extremely fast.
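The mechanics can be illustrated with plain Python integers standing in as bitsets (Druid itself uses compressed Roaring bitmaps, not this):

```python
# Toy bitmap index on a string column: each distinct value maps to a bitmap
# with one bit per row, and an AND-of-filters becomes a bitwise AND.

def build_bitmap_index(column):
    """Map each distinct value to an int whose set bits are its row ids."""
    index = {}
    for row_id, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index

def filter_rows(n_rows, *bitmaps):
    """AND the bitmaps together and return the matching row ids."""
    acc = (1 << n_rows) - 1  # start with every row set
    for bm in bitmaps:
        acc &= bm
    return [i for i in range(n_rows) if acc >> i & 1]
```

A filter like `country = 'US' AND device = 'mobile'` is then two index lookups and one AND, with no row-by-row string comparison.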

A multi-process architecture. This is where Druid's complexity becomes visible. A Druid cluster runs historicals (serve queries on older immutable segments), MiddleManagers (handle real-time ingestion), brokers (route queries and merge results), coordinators (manage segment placement), and overlords (manage ingestion task scheduling), and it depends on ZooKeeper (cluster coordination), deep storage such as S3 or HDFS (segment archive), and a metadata store such as MySQL or PostgreSQL (segment catalog). To run Druid, you operate all of this.

The architectural elegance of separating these concerns is real — it lets you scale read and ingest paths independently, store cold data in S3 cheaply, and replicate hot segments across historicals for query parallelism. The operational cost of running a cluster of half a dozen distinct process types is also real, and is the single biggest reason Druid lost ground to ClickHouse.

Why Druid Is Fading

The honest reasons Druid lost the real-time OLAP race:

1. Operational complexity. Compared to ClickHouse's "single C++ binary," Druid's six-process architecture plus ZooKeeper plus deep storage plus metadata store is enormously more work to operate. New users would try Druid, struggle to get a cluster running, try ClickHouse, have it running in an hour, and never look back. Imply's managed offerings (Imply Cloud, and later the Polaris SaaS) address this for their customers, but the open-source operational tax is unavoidable.

2. SQL surface area. For years, Druid was primarily queried through a custom JSON-based query language. SQL support was added later (via Apache Calcite) and improved over time, but the SQL surface remained more limited than ClickHouse's — particularly around joins, which Druid historically did not support well at all.

3. Updates and corrections. Druid's segment-based architecture makes it hard to update or delete individual records. The standard pattern is to re-ingest entire segments, which is slow and operationally heavy.

4. The rollup tradeoff. Druid's rollup feature was a strength for ad-tech but a weakness for use cases that need raw event detail. As real-time OLAP expanded beyond ad-tech into in-product analytics and observability — where you often want individual event records — ClickHouse's "no rollup required" approach proved more flexible.

5. ClickHouse was just faster on most benchmarks. Independent benchmarks consistently showed ClickHouse outperforming Druid on common analytical query patterns at lower hardware cost. Speed matters in this category.

6. Java vs C++. Druid is a Java application. It is subject to JVM tuning, garbage collection pauses, and the operational overhead that comes with the JVM. ClickHouse's C++ implementation avoids these.
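To make point 2 concrete, here is the same hypothetical question expressed both ways: as a Druid native JSON query (shown as a Python dict) and as Druid SQL. The datasource and field names (`ad_events`, `country`, `clicks`) are invented for illustration:

```python
# The same "hourly US clicks" question in Druid's native JSON query language
# and in Druid SQL. Datasource and field names are hypothetical.

native_query = {
    "queryType": "timeseries",
    "dataSource": "ad_events",
    "intervals": ["2026-01-01/2026-01-02"],
    "granularity": "hour",
    "filter": {"type": "selector", "dimension": "country", "value": "US"},
    "aggregations": [
        {"type": "longSum", "name": "clicks", "fieldName": "clicks"}
    ],
}

sql_query = """
SELECT TIME_FLOOR(__time, 'PT1H') AS hour, SUM(clicks) AS clicks
FROM ad_events
WHERE country = 'US'
  AND __time >= TIMESTAMP '2026-01-01'
  AND __time < TIMESTAMP '2026-01-02'
GROUP BY 1
"""
```

The native form is precise but verbose and Druid-specific; for years it was the only option, which kept every BI tool integration bespoke.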

Where Druid Still Makes Sense

Despite its declining momentum, Druid remains a legitimate choice in narrow scenarios:

  • You have an existing Druid deployment that works. Migrating off is real work, and the legacy deployments at Netflix, Airbnb, and Lyft continue to run. Don't migrate without a reason.
  • You need real-time rollups at very high ingest rates. Druid's rollup feature, if your workload fits its assumptions, can dramatically reduce storage and query costs.
  • Imply is selling you a managed Druid service and the operational complexity is fully abstracted away. Imply Polaris is genuinely simpler to use than self-hosted Druid.
  • You need bitmap-index-based filtering on high-cardinality categorical dimensions at extreme performance. Druid's bitmap index implementation is mature and well-tuned for this exact pattern.

For most new projects in 2026, the honest recommendation is to start with ClickHouse and evaluate Druid only if you have a specific reason ClickHouse won't work.

Where Druid Sits in the Stack

Druid sits downstream of event streaming (Kafka and Kinesis are the standard ingest paths) and serves queries to BI tools, application backends, and end users via SQL or JSON queries. It is a specialized analytical serving layer, not a transformation engine — you typically pair it with a stream processor like Flink for any non-trivial enrichment.

How TextQL Works with Druid

TextQL Ana connects to Druid via Druid's SQL interface (HTTP-based) and queries it the same way it queries other SQL backends. The interesting use case is querying a Druid deployment that already powers real-time dashboards inside an organization — TextQL becomes the natural-language interface to data that engineers and analysts already trust for live operational reporting.
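A minimal sketch of what that interface looks like from the client side, assuming a Druid router or broker reachable at localhost (the URL, port, datasource, and query are illustrative, and actually sending the request requires a running cluster):

```python
import json

# Hedged sketch of Druid's SQL-over-HTTP interface: a JSON document POSTed
# to the /druid/v2/sql/ path on a router or broker. Port varies by setup.

BROKER_SQL_URL = "http://localhost:8888/druid/v2/sql/"  # illustrative

def sql_payload(query: str, **context) -> str:
    """Build the JSON body Druid's SQL endpoint expects."""
    body = {"query": query, "resultFormat": "object"}
    if context:
        body["context"] = context  # e.g. per-query timeout overrides
    return json.dumps(body)

payload = sql_payload("SELECT COUNT(*) AS n FROM ad_events", timeout=30000)
# POST `payload` to BROKER_SQL_URL with Content-Type: application/json
```

Because the endpoint speaks plain SQL over HTTP, any client that can build this payload can treat Druid like one more SQL backend.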

See TextQL in action

Apache Druid
Created 2011 at Metamarkets
Open-sourced October 2012
Apache TLP 2019
Original creators Eric Tschetter, Fangjin Yang, Gian Merlino, Vadim Ogievetsky
Commercial sponsor Imply (founded 2015)
License Apache 2.0
Written in Java
Notable users Netflix, Airbnb, Lyft, Walmart, Salesforce, Pinterest
Category Real-time Analytics
Monthly mindshare ~30K · ~13K GitHub stars; mature but losing mindshare to ClickHouse