We’re pumped. After months of building in stealth, we’re finally ready to lift the curtain and share what we’ve been working on: typedef, a new kind of AI data processing engine. Serverless, inference-first, and built from the ground up to run production AI workloads at scale.
What does that mean in practice? AI workloads today are increasingly complex and data-intensive. Teams are classifying millions of documents, summarizing call transcripts, automating customer support, moderating content, curating data for training and evals, and building LLM-powered analytics and agents. These aren’t toy use cases; they’re the foundation of real products and systems.
We’re also extremely proud to announce our $5.5M seed round led by Pear VC, with support from Verissimo Ventures, Monochrome, Tokyo Black, and an exceptional group of angels. We’re incredibly fortunate to have such a sharp and supportive group of backers who share our belief that AI infrastructure needs to be reimagined from the ground up.
How it started
When we first set out, we knew there had to be a better way. The status quo wasn’t cutting it. We’ve all spent too many hours wrangling brittle Spark jobs, provisioning and scaling clusters by hand, and debugging cryptic OOM errors at 2am. These systems weren’t designed for today’s needs; they were built for yesterday’s data.
We wanted to build something better. So we stepped back and asked a simple question:
"If the creators of Spark were starting from scratch today, what kind of engine would they build?"
The answer begins with the biggest shift since Spark’s early days: AI.
AI has completely reshaped data infrastructure. Traditional pipelines that were built to move and transform structured data now need to process chat logs, documents, support tickets, transcriptions, source code, and more. Workflows aren’t just ETL anymore; they involve tokenization, chunking, prompt formatting, inference orchestration, and multi-model routing. These operations don’t fit cleanly into SQL or MapReduce. They demand new primitives, new execution models, and tighter feedback loops.
It was clear: the infrastructure needed to change.
So we started from first principles and built a new kind of engine, purpose-built for the shape of modern AI workloads.
AI is hitting a wall
Over the last couple of years, we’ve watched GenAI go from curiosity to board-level initiatives. Now, more than ever, companies are under pressure to turn prototypes into real business value.
But the truth is that most AI projects still don’t see the light of day in production.
They’re brittle and break under scale. They’re built with duct tape and glue. This isn’t because teams lack talent, but because the infrastructure wasn’t designed for these modern workloads. Traditional data engines were built for rows and columns, not the messy, multi-modal inputs of modern AI.
And now, the stakes are even higher. AI is evolving at breakneck speed, and most companies are still figuring out where the value is. In a world of unknowns, teams need to move fast, test, iterate and learn just to keep up. The winners will be the ones who reach business impact first.
We’ve spoken with dozens of teams and the pattern is clear: LLMs are powerful. Operationalizing them is the bottleneck.
Inference is the new data processing primitive
We’ve spent our careers building data systems at places like Salesforce, Starburst, and Tecton. We’ve built real-time ML platforms, scaled OLAP infrastructure, and seen the pain that emerges when AI gets bolted onto systems that were never meant to support it.
Inference is no longer the last mile of a workflow; it’s a first-class operation that needs to be composable, scalable, and deterministic. Inference is unlocking a new transform for data and AI teams to analyze and draw insight from unstructured data. We need to harness the power of inference and extend it into familiar tooling and interfaces that data teams already know and love.
Typedef is our answer: an inference-first engine that handles the messy realities of AI workloads. Token limits, context windows, chunking, retries, and rate limits are handled as first-class citizens. Typedef gives you a unified engine for building AI-native data pipelines and agentic applications that span structured tables, messy text, embeddings, and LLM outputs.
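To make "chunking, retries, and rate limits" concrete, here is a minimal sketch in plain Python of what teams often hand-roll today. It is not the typedef or Fenic API. The names `chunk_words`, `call_with_retries`, `run_over_chunks`, and `RateLimitError` are hypothetical, and whitespace-separated words stand in for real model tokens.

```python
import time
from typing import Callable, List


class RateLimitError(Exception):
    """Hypothetical error a model client might raise when throttled."""


def chunk_words(text: str, max_tokens: int) -> List[str]:
    """Split text into chunks of at most max_tokens words.

    Assumption: words are a crude stand-in for model tokens.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def call_with_retries(
    model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call model(prompt), backing off exponentially on RateLimitError."""
    for attempt in range(max_attempts):
        try:
            return model(prompt)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise ValueError("max_attempts must be at least 1")


def run_over_chunks(
    text: str,
    model: Callable[[str], str],
    max_tokens: int = 200,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Chunk the input, then run the model over each chunk with retries."""
    return [
        call_with_retries(model, chunk, sleep=sleep)
        for chunk in chunk_words(text, max_tokens)
    ]
```

Every pipeline that calls an LLM ends up needing some version of this glue, which is the argument for pushing it down into the engine instead of rewriting it per project.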
A serverless engine that turns AI into a reliable part of your data infrastructure.
Built for data & AI teams
typedef is:
- Serverless: We are infra folks to the core. No clusters, no complex configurations. Just import our SDK and start building. Then build some more, and scale seamlessly.
- AI-native: Handles structured and unstructured data, LLM-specific constraints, and inference complexity out of the box.
- Composable: Build with our semantic operators in Python. Chain them together alongside relational operators and productionize your AI workflows. Build confidence with lineage, observability and cost tracking.
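As a rough illustration of what chaining semantic and relational operators can look like, here is a toy sketch in plain Python. It is not the typedef SDK or the Fenic API: `Pipeline`, `filter`, `semantic_map`, and `toy_sentiment` are hypothetical names, and the "model" is a keyword check standing in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass(frozen=True)
class Pipeline:
    """A tiny immutable chain of row-level steps (hypothetical, for illustration)."""
    steps: tuple = ()

    def filter(self, predicate: Callable[[Row], bool]) -> "Pipeline":
        # Relational-style step: keep only rows matching the predicate.
        def step(rows: List[Row]) -> List[Row]:
            return [r for r in rows if predicate(r)]
        return Pipeline(self.steps + (step,))

    def semantic_map(
        self, column: str, out: str, model: Callable[[str], str]
    ) -> "Pipeline":
        # Semantic-style step: add a column produced by a model call.
        def step(rows: List[Row]) -> List[Row]:
            return [{**r, out: model(str(r[column]))} for r in rows]
        return Pipeline(self.steps + (step,))

    def run(self, rows: Iterable[Row]) -> List[Row]:
        result = list(rows)
        for step in self.steps:
            result = step(result)
        return result


def toy_sentiment(text: str) -> str:
    """Stand-in for an LLM call; a real pipeline would invoke a model here."""
    return "negative" if "refund" in text.lower() else "neutral"


tickets = [
    {"id": 1, "lang": "en", "body": "I want a refund"},
    {"id": 2, "lang": "de", "body": "Danke"},
    {"id": 3, "lang": "en", "body": "How do I log in?"},
]

# Filter first so the (expensive) model step only sees the rows it needs.
pipeline = (
    Pipeline()
    .filter(lambda r: r["lang"] == "en")
    .semantic_map("body", "sentiment", toy_sentiment)
)
labeled = pipeline.run(tickets)
```

Placing the relational filter ahead of the model step is the kind of ordering decision that matters once each row costs an inference call.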
Our goal is to give data and AI teams the reliability and rigor that they’ve come to expect from traditional pipelines, with the power of LLMs under the hood.
Already in use - and open sourced
Matic, an insurtech company working with 70+ carriers, is already using typedef to analyze policy documents and call transcripts. With us, they’ve reduced human error, slashed costs, and improved compliance.
We’re also releasing a big part of our platform as open source: Fenic, an opinionated, PySpark-inspired DataFrame framework for building AI workflows and agentic applications. It’s our way of pushing the ecosystem forward and inviting others to build with us. We’re big believers in working in the open to help advance innovation and adoption of AI. Working together and building communities around it is the fastest way to harness the power of AI.
What's next?
We’re just getting started. Over the coming months we’ll be rolling out support for more operators, data sources, and agentic pipelines. In the coming weeks we’ll be onboarding select teams onto the alpha version of our cloud data engine.
If you’re building production AI systems, or want to, we’d love to hear from you. Join our community and team, or just shoot us a note at hello@typedef.ai.
Let’s make AI actually work.
-Yoni + Kostas & the typedef squad
