Scale AI
How Scale AI answers ~1,900 data requests per week across Ops, Finance, Growth & HR
One AI agent serving four business units — every answer governed through a single semantic layer owned by the Data team.
>50%
of Scale’s dbt code reviewed by Ana
1.9T
rows across tables handled in production
+74.9%
organic monthly adoption
"I suggest you try TextQL on your messiest datasets, hook it up to your worst codebase and documents, and ask the most complicated question that actually drives your business."
Heqing Huang, Director of Analytics, Scale AI
About Scale AI
Scale AI is a software company providing data labeling and development tools for artificial intelligence, operating a marketplace of hundreds of thousands of expert contributors across 73 countries.
Industry
Software services
Company Size
~1,200 employees
Headquarters
San Francisco, CA
Pain Point
Four business units with distinct data stacks, all bottlenecked on a shared Data team and a growing ticket queue
Products Used
Data Sources
Snowflake, dbt, Tableau, Redash
Every team self-served. Every query governed.
Request Demo →
Scale AI produces billions of data points generated by hundreds of thousands of expert contributors across 73 countries. Running a business that complex means every department (Finance, Growth, Ops, HR) depends on data to operate. Under the constant stream of requests, the analytics backlog grew. Turnaround slowed. And the team that should have been building foundational data infrastructure was stuck fielding ad hoc queries instead.
Rather than hire another ten analysts, Director of Analytics Heqing Huang deployed TextQL’s Ana. Pre-loaded with Scale’s own data sources and governed through a single centralized semantic layer, Ana now resolves roughly 1,900 data requests per week and runs 42 active playbooks in production. Stakeholders across the company get self-serve answers in minutes — not days.
Queries answered per week
~350
~1,900
Leading fintech company's internal data agent usage
Scale AI's internal Ana usage
5.4× throughput
Source: a leading fintech company recently reported ~1.4K queries answered over 4 weeks (~350/wk).
[ THE FOUNDATION ]
How the Data team kept control while opening access
Every Ana deployment at Scale runs on the same data engineering foundation: dbt models, Snowflake warehouse, Tableau dashboards. They're governed through a single semantic layer owned by the Data team. Teams get self-serve access; the Data team keeps ownership of what every metric actually means.
CONTROL
Every query governed. Nothing gets past the semantic layer.
BUSINESS UNITS
FINANCE
420 req/wk
GROWTH
510 req/wk
OPS
380 req/wk
HR
290 req/wk
CENTRAL SEMANTIC LAYER — OWNED BY THE DATA TEAM
ONTOLOGY
Metric definitions
CERTIFIED
dbt lineage
ACCESS
Row/col scoping
AUDIT
Query logs
CONNECTED SYSTEMS
Snowflake
dbt
Tableau
Redash
CRM
Procurement
+7 more
Ana reads dbt model lineage directly, so every answer traces back through the same transformations that power Scale's certified dashboards in Tableau. When a team asks "what was revenue last quarter," Ana resolves it against the same model that finance uses in the board deck. No drift, no second source of truth.
[ TEAM CONFIGURATIONS ]
Same agent. Four different worlds.
The deployment wasn't one-size-fits-all. Each business unit got an Ana pre-loaded with the data sources they actually use — Growth sees CRM and pipeline telemetry, Finance sees procurement and cloud billing, Ops sees the contributor marketplace, HR sees HRIS and performance management data. Underneath, every query resolves against the same governed metric definitions.
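The scoping model above — shared metric definitions, per-unit data sources — can be sketched as a simple access check. Source and metric names here are assumptions for illustration, not Scale's actual configuration:

```python
# Metric definitions stay shared and central (owned by the Data team);
# each business unit is scoped only to the sources it actually uses.
SHARED_METRICS = {"spend_efficiency", "revenue", "retention"}

UNIT_SOURCES = {
    "finance": {"snowflake_billing", "procurement", "cloud_billing"},
    "growth":  {"snowflake_etl", "crm", "funnel_snapshots"},
    "ops":     {"snowflake_etl", "task_delivery", "demand_forecasts"},
    "hr":      {"hris", "performance_mgmt", "workforce_admin"},
}

def can_query(unit: str, source: str, metric: str) -> bool:
    """A query is allowed only when the unit is scoped to the source
    AND the metric exists in the shared governed registry."""
    return source in UNIT_SOURCES.get(unit, set()) and metric in SHARED_METRICS
```

The design choice is that access varies by team, but meaning never does: HR and Finance see different tables, yet "retention" is defined exactly once.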
Finance
~420 req/wk
Finance pulls from Snowflake billing tables, procurement software for BPO invoices, and AWS/GCP billing for infrastructure costs. These three systems rarely agree with each other, and reconciling them was a recurring time sink.
A question like “what’s our spend efficiency across campaigns?” required joining billing actuals against invoice line items against cloud compute, then defending the number in front of leadership.
Ana handles the cross-system join in a single query. Finance now runs campaign-level spend tracking, budget variance analysis, and cost allocation across programs without filing a ticket. The questions that matter most to the CFO’s office get answered in the meeting where they come up, not a week later.
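The cross-system reconciliation Finance relies on amounts to joining three spend systems on a shared campaign key before dividing into revenue. A minimal sketch, with invented data and field names rather than Scale's real schemas:

```python
# Three systems that "rarely agree", keyed by campaign.
billing  = {"camp_a": 120_000, "camp_b": 80_000}   # Snowflake billing actuals
invoices = {"camp_a": 45_000,  "camp_b": 30_000}   # BPO invoice line items
compute  = {"camp_a": 15_000,  "camp_b": 25_000}   # AWS/GCP cloud billing

def spend_efficiency(revenue_by_campaign: dict[str, float]) -> dict[str, float]:
    """Revenue per dollar of total spend, joined across all three systems."""
    out = {}
    for camp, rev in revenue_by_campaign.items():
        total = billing.get(camp, 0) + invoices.get(camp, 0) + compute.get(camp, 0)
        out[camp] = round(rev / total, 2) if total else 0.0
    return out

print(spend_efficiency({"camp_a": 360_000, "camp_b": 270_000}))
```

The point of doing this in one governed query rather than three spreadsheets is that the denominator is assembled the same way every time the CFO's office asks.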
Growth
~510 req/wk
Growth tracks delivery metrics, free-trial conversion, and marketplace performance across Snowflake ETL pipelines, CRM data, and snapshot views that capture point-in-time funnel states.
The hard questions were never about any one system. They were about correlations across all three: “how do week-over-week delivery metrics relate to pipeline movement in the CRM, and where are free-trial conversions stalling?”
Ana reads the ETL schemas, understands the CRM object model, and reconciles snapshot timing differences automatically. Growth built playbooks that refresh weekly and deliver results into Slack channels. The team went from requesting reports to owning their analytics workflow entirely.
Ops
~380 req/wk
Supply ops manages contributor availability across a 73-country marketplace. Their data lives in Snowflake ETL tables, task and delivery tables tracking contributor output, and demand forecasting models.
The question is always some version of “do we have enough contributors with the right skills in the right regions for what’s coming in three months?” Answering it means joining availability data with task completion rates with forward-looking demand signals.
Ana runs those joins and produces capacity projections by region, skill type, and program. Ops uses it to flag constraints before they become delivery problems. The dashboards refresh automatically instead of requiring a manual pull every time leadership asks for an update.
HR
~290 req/wk
HR operates across their HRIS for headcount and contributor data, a performance management platform for reviews, and a workforce administration system for HR operations. Each system has its own ID schema and its own definition of “active.”
A question like “contributor retention by region” used to mean pulling from all three, reconciling the mismatches, and hoping the numbers held up in review.
Ana maps across the systems and handles the reconciliation. HR now runs retention breakdowns, workforce allocation analysis, and planning queries that inform hiring decisions on demand.
"The traditional BI players — Tableau, Snowflake — are not innovating as fast as these new tools that add value to our business."
Heqing Huang, Director of Analytics, Scale AI
[ THE RESULTS ]
Data scientists build models now, not reports.
Within nine months, Ana went from a pilot to the default analytics surface at Scale. The numbers tell the story: 28,000+ total messages sent, 11,500+ threads started, and a peak week of ~1,900 messages. Early adoption averaged ~400 messages over the first 30 days; the most recent 30-day window hit ~7,000 — a 17× increase, entirely pull-driven. The Data team’s inbound queue dropped dramatically, and they shifted from fulfilling requests to owning the semantic layer underneath all of it.
QUARTERLY IMPACT
From pilot to ~1,900 messages a week
Weekly Messages to Ana
*Average weekly messages per month. 28,000+ total messages across 11,500+ threads.
Automated weekly reporting
Ana delivers breakdowns on supply capacity, demand forecasting, and contributor performance directly into Slack channels, including project health metrics, contributor retention analysis, and spend efficiency.
Dynamic dashboards in minutes
Any user can build a fully functioning, complex dashboard from scratch in under 45 minutes and schedule it to refresh automatically. What once required days of back-and-forth with the Data team now happens in a single sitting.
On-demand business questions
Ana provides instant responses to exploratory analysis, scenario planning, and root-cause investigations, including ad-hoc requests from Scale’s executives on project timelines and marketplace dynamics.
Code-level transparency
Ana traces through dbt models and SQL logic to explain how metrics are constructed, enabling stakeholders to understand the analysis without pinging the Data team for a walkthrough.
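Code-level transparency of this kind boils down to walking a dbt-style dependency graph from a metric's model back to its raw sources. A hypothetical sketch — the model names and graph are invented for illustration:

```python
# Upstream dependencies per model, as dbt's lineage graph records them.
LINEAGE = {
    "fct_revenue":   ["stg_invoices", "stg_contracts"],
    "stg_invoices":  ["raw.billing.invoices"],
    "stg_contracts": ["raw.crm.contracts"],
}

def trace(model: str, depth: int = 0) -> list[str]:
    """Depth-first walk of upstream models, indented one level per hop."""
    lines = ["  " * depth + model]
    for parent in LINEAGE.get(model, []):
        lines.extend(trace(parent, depth + 1))
    return lines

print("\n".join(trace("fct_revenue")))
```

A stakeholder reading that trace can see exactly which staging models and raw tables feed the number, without needing an analyst to reconstruct it.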
The expansion from analytics into the rest of the company happened the way the best platform rollouts do: someone used it, their colleague saw what it could do, and the next Slack message was “can I get access?” Finance saw what analytics was doing with Ana and started using it for spend efficiency. Growth picked it up for campaign analysis. Supply ops adopted it for contributor forecasting. HR used it for workforce planning. The 74.9% month-over-month user adoption growth wasn’t a launch metric. It was a trailing indicator of a tool that was already indispensable by the time anyone thought to measure it.
Scale stopped treating analyst bandwidth as the bottleneck for every data question in the company and gave every team the ability to answer their own questions at the speed the business actually moves.
"With TextQL, our analysts can now focus on high leverage tasks and the most challenging problems."
Heqing Huang, Director of Analytics, Scale AI
Read more from TextQL
Engineering
Introducing API Connectors
Your warehouse has your historical data. Your SaaS tools have everything else. Now Ana can reach both in the same conversation, securely, with no tokens exposed.
Announcements
From Conversation to Application: Introducing Persistent Dashboards
Your entire team can now use live, interactive dashboards that refresh on schedule — all built through conversation with Ana.
Announcements
Ana Just Got 25% Cheaper
For the second time in a month, we're making Ana even more cost-effective.
Go from question to conviction in minutes, not weeks.