Vendors | Data Ecosystem Wiki

Vendors

Vendors are the companies that build and sell data tools. Some bundle their data products into a broader cloud platform (AWS, Google, Microsoft); others are pure-plays focused on a single category (Snowflake, Databricks, dbt Labs).

The wiki has a page for almost every important product in the modern data stack: Snowflake, BigQuery, Power BI, Trino, dbt, Tableau, and so on. This section is about the companies behind those products: who founded them and when, what else they own, and what their commercial strategy actually looks like.

The reason this distinction matters: most of the famous "products" in data are not standalone companies. BigQuery is a feature of Google Cloud. Power BI is a Microsoft product, bundled into Microsoft 365 licensing. Redshift is a feature of AWS. You cannot understand why those products behave the way they do (pricing, packaging, integration, roadmap) without understanding the parent company's broader strategy.

The Two Kinds of Data Vendors

There are basically two business models, and almost every vendor fits one of them.

1. The hyperscaler bundle. AWS, Google Cloud, and Microsoft Azure all sell hundreds of cloud services. Data products are one slice of a much larger menu, and the strategic logic is to make sure no customer ever has a reason to leave the bundle. Each hyperscaler builds (or buys) at least one entry in every data category — a warehouse, an object store, an ETL tool, a BI tool, a streaming platform, an ML platform — not because each individual product is best in class, but because the bundle is the product. The moat is the AWS account, the Azure tenant, the GCP project, and the existing committed spend.

2. The category pure-play. Snowflake only sells a data warehouse (and adjacent things). Databricks only sells a lakehouse platform. dbt Labs only sells transformation tooling. Starburst only sells federated SQL. Confluent only sells Kafka. These companies live or die on whether their single product is best in class, and they have to fight the hyperscalers' "good enough and already in your AWS bill" alternative every quarter. The moat is product excellence and developer love, not the bundle.

Then there's a third, weirder category: the roll-up. Salesforce has acquired its way into being a major data vendor (Tableau in 2019 for $15.7B, MuleSoft in 2018 for $6.5B, Informatica in 2025). IBM did it earlier (Cognos, SPSS, Red Hat). Oracle has been doing it for thirty years. The strategy here isn't "best product" or "best bundle" — it's "buy the leader in each category and extract rent through the existing enterprise relationship."

Vendors and Their Products in This Wiki

Vendor	Type	Key data products in the wiki
---	---	---
AWS	Hyperscaler	Redshift, S3, Kinesis, QuickSight, SageMaker
Google Cloud	Hyperscaler	BigQuery, GCS, Looker
Microsoft	Hyperscaler	Power BI, Azure Blob Storage
Snowflake	Pure-play	Snowflake
Databricks	Pure-play	Databricks, Photon, Databricks ML
Starburst	Pure-play	Trino, Starburst Galaxy
dbt Labs	Pure-play	dbt Core, dbt Cloud, dbt Semantic Layer
Confluent	Pure-play	Confluent Kafka
Salesforce	Roll-up	Tableau, Informatica, Data Cloud, MuleSoft

How to Read a Vendor Page

Each vendor page in this section follows the same shape:

Origin story — who founded the company, what year, and what problem they were originally solving. This usually explains a lot of the product's quirks.
What they actually sell — the data products in their portfolio, with links to the per-product pages elsewhere in the wiki.
Strategy — how the company makes money, who they compete with, and what their bet on the future is.
Honest market take — where the vendor is strong, where they're weak, and where the marketing diverges from the reality.

This is the opposite of how vendor websites work. Vendor websites are designed to present each product in the most favorable possible light against an idealized "before" state. The vendor pages here try to give you the same picture a sober analyst would: real history, real numbers, real tradeoffs.

Why Vendor Strategy Matters Even If You Just Want to Pick a Tool

You might reasonably ask: "I just want to know whether to use BigQuery or Snowflake. Why do I care about Google's broader cloud strategy?"

Because the strategy is the product. BigQuery's on-demand pricing model (pay per byte scanned), its serverless architecture, and its tight coupling to Google Cloud Storage are all consequences of Google running it as a feature inside GCP rather than as a standalone business. Snowflake's multi-cloud portability, virtual warehouse model, and consumption-based credits are consequences of Snowflake being a pure-play (venture-backed in its early years, publicly traded since 2020) that has to win against three hyperscalers simultaneously. The architectural decisions follow from the business model.
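To make the pricing contrast concrete, here is a minimal sketch of the two metering styles mentioned above: one bills on bytes scanned, the other on compute time converted to credits. Every rate and name in it (`HYPOTHETICAL_PRICE_PER_TIB`, `HYPOTHETICAL_CREDITS_PER_HOUR`, `HYPOTHETICAL_PRICE_PER_CREDIT`, and both functions) is an illustrative assumption, not a vendor list price or API.

```python
# Sketch of two metering models. All rates are hypothetical placeholders,
# chosen only to show which quantity each model bills on.

TIB = 1024 ** 4  # bytes in one tebibyte

HYPOTHETICAL_PRICE_PER_TIB = 5.00      # dollars per TiB scanned (assumed)
HYPOTHETICAL_CREDITS_PER_HOUR = 2.0    # credits a running warehouse burns per hour (assumed)
HYPOTHETICAL_PRICE_PER_CREDIT = 3.00   # dollars per credit (assumed)


def scan_based_cost(bytes_scanned: int,
                    price_per_tib: float = HYPOTHETICAL_PRICE_PER_TIB) -> float:
    """Cost when the meter is data volume: runtime does not appear at all."""
    return bytes_scanned / TIB * price_per_tib


def credit_based_cost(runtime_seconds: float,
                      credits_per_hour: float = HYPOTHETICAL_CREDITS_PER_HOUR,
                      price_per_credit: float = HYPOTHETICAL_PRICE_PER_CREDIT) -> float:
    """Cost when the meter is compute time: bytes scanned do not appear at all."""
    return runtime_seconds / 3600 * credits_per_hour * price_per_credit


if __name__ == "__main__":
    # One hypothetical workload: scans 2 TiB and keeps compute busy for 30 minutes.
    print(f"scan-based:   ${scan_based_cost(2 * TIB):.2f}")
    print(f"credit-based: ${credit_based_cost(30 * 60):.2f}")
```

The numbers themselves mean nothing. The point is that the same workload is priced on different inputs, so the optimization pressure differs: under a scan-based meter you reduce the data touched, under a credit-based meter you reduce the time compute stays running.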

Same thing with Power BI. The reason Power BI dominates BI today is not that it has better visualizations than Tableau (it doesn't, mostly). It's that Microsoft bundles it into Microsoft 365 E5 licensing, where it shows up at near-zero marginal cost to the buyer. Once you understand that, every decision Microsoft makes about Power BI (the integration with Excel, the lukewarm Mac story, the heavy push into Fabric) becomes obvious.

Vendor pages try to make those forces visible.

How TextQL Thinks About Vendors

TextQL Ana is a layer that sits on top of whatever data stack a customer already has. We connect to the warehouse, the BI tool, the catalog, and the semantic layer regardless of vendor — Snowflake or BigQuery or Databricks or Redshift, Tableau or Power BI or Looker. Our job is to make the existing stack more useful, not to replace any of it. So we end up with strong opinions about every major vendor, and the pages in this section are how we share them.

Vendors

What it is: Companies that build & sell data tools

Two main types: Hyperscaler bundles vs. category pure-plays

Hyperscalers: AWS, Google Cloud, Microsoft

Pure-plays: Snowflake, Databricks, dbt Labs, Starburst

Roll-ups: Salesforce (Tableau, MuleSoft, Informatica)