# Baseten

**Source:** https://geo.sig.ai/brands/baseten  
**Vertical:** AI Infrastructure  
**Subcategory:** AI Inference Platform  
**Tier:** Challenger  
**Website:** baseten.co  
**Last Updated:** 2026-04-14

## Summary

In January 2026, Baseten raised a $300M Series E at a $5B valuation, led by IVP, CapitalG, and NVIDIA (which invested $150M), bringing total funding to $585M. The platform powers inference for Cursor, Mercor, Clay, and Lovable.

## Company Overview

Baseten is an AI infrastructure company that makes it easy for teams to deploy and run machine learning models in production. Founded in 2019, Baseten builds the full systems software stack for AI applications -- from GPUs and autoscaling to observability, billing, and developer tools -- enabling companies to serve AI models at scale without managing underlying infrastructure.

The company raised $300 million in Series E funding in January 2026, led by IVP, CapitalG, and NVIDIA (which invested $150M), bringing total funding to $585 million and its valuation to $5 billion. This marked the company's third fundraise in the past year, reflecting explosive growth in AI inference demand.

Baseten's customers include leading AI companies such as Cursor, Mercor, Clay, OpenEvidence, Lovable, and Abridge, as well as enterprises building specialized models for their domains. The platform powers a multi-model future where companies combine multiple AI models for different tasks, requiring reliable, low-latency inference infrastructure at scale.

## Frequently Asked Questions

### What does Baseten do?
AI inference infrastructure platform for deploying and running ML models in production at scale.

### How much has Baseten raised?
$300M Series E at $5B valuation (January 2026). $585M total. NVIDIA invested $150M.

### Who uses Baseten?
Cursor, Mercor, Clay, OpenEvidence, Lovable, Abridge, and other AI-first companies.

### What problem does Baseten solve?
Full-stack inference infrastructure: GPUs, autoscaling, observability, billing -- so teams focus on models, not ops.

### How does Baseten handle GPU autoscaling for inference?
Baseten's autoscaling evaluates queue depth, active request count, and per-request token counts in real time to scale GPU capacity up and down within seconds. It supports scale-to-zero for low-traffic periods, warm pool strategies to minimize cold-start latency for high-traffic models, and multi-region deployment for latency optimization across geographies.
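The scaling logic described above can be illustrated with a simple heuristic. This is a hypothetical sketch of a demand-based replica calculation, not Baseten's actual algorithm; the function name, target concurrency, and replica bounds are all assumptions for illustration.

```python
import math

def desired_replicas(
    queue_depth: int,
    active_requests: int,
    target_concurrency_per_replica: int = 8,  # assumed tuning knob
    min_replicas: int = 0,   # 0 enables scale-to-zero for idle models
    max_replicas: int = 16,
) -> int:
    """Illustrative replica-count heuristic (not Baseten's real algorithm).

    Counts queued plus in-flight requests as demand and sizes the GPU
    replica pool so each replica handles roughly the target concurrency.
    """
    demand = queue_depth + active_requests
    if demand == 0:
        return min_replicas  # scale-to-zero during low-traffic periods
    needed = math.ceil(demand / target_concurrency_per_replica)
    return max(min_replicas, min(needed, max_replicas))
```

A warm-pool strategy would effectively raise `min_replicas` above zero for high-traffic models, trading idle GPU cost for lower cold-start latency.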

### What models and frameworks does Baseten support?
Baseten supports any Python-based ML model through its Truss open-source framework, with pre-built templates for popular model architectures including vLLM for LLM serving, Whisper for transcription, Stable Diffusion and FLUX for image generation, and custom transformers models. Any containerizable ML workload can be deployed as a Baseten model.
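A Truss deployment centers on a plain Python `Model` class with `load` and `predict` methods, as documented in the open-source Truss project. The sketch below shows that shape with a trivial placeholder in place of real model weights; exact constructor kwargs and config details may vary by Truss version.

```python
# model/model.py -- minimal sketch of a Truss model class.
# The load/predict interface follows Truss's documented pattern;
# the placeholder "model" here is purely illustrative.

class Model:
    def __init__(self, **kwargs):
        # Truss passes configuration and secrets via kwargs at startup.
        self._model = None

    def load(self):
        # Called once when the deployment starts: load weights here.
        # A real model would load e.g. a transformers pipeline; this
        # stub substitutes a trivial callable.
        self._model = lambda text: text.upper()

    def predict(self, model_input: dict) -> dict:
        # Called per request with the parsed JSON request body.
        return {"output": self._model(model_input["text"])}
```

With the Truss CLI, `truss init` scaffolds this layout and `truss push` deploys it to Baseten.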

### How does Baseten's pricing compare to managed inference alternatives?
Baseten charges for actual GPU compute time consumed, with no markup on model weights or per-token fees for open-source models. This makes Baseten more cost-effective than closed API providers for high-throughput workloads and competitive with self-hosted GPU clusters for teams that factor in engineering time savings. Dedicated GPU capacity can be reserved for lower hourly rates.
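The cost trade-off above can be made concrete with some back-of-envelope arithmetic. Every rate and throughput figure below is a hypothetical assumption for illustration, not Baseten's or any provider's actual pricing.

```python
# Hypothetical comparison: per-token API pricing vs. GPU-hour billing.
# All numbers are illustrative assumptions, not real prices.

def api_cost(tokens: int, price_per_million: float) -> float:
    """Cost under per-token API pricing."""
    return tokens / 1_000_000 * price_per_million

def gpu_cost(hours: float, rate_per_hour: float) -> float:
    """Cost under GPU compute-time billing."""
    return hours * rate_per_hour

# Assume 2B tokens/day and a single GPU sustaining ~25k tokens/sec
# (hypothetical throughput for a batched open-source LLM).
tokens_per_day = 2_000_000_000
throughput_tok_per_sec = 25_000            # assumed
gpu_hours = tokens_per_day / throughput_tok_per_sec / 3600

api = api_cost(tokens_per_day, price_per_million=1.00)  # $1/M tokens, assumed
gpus = gpu_cost(gpu_hours, rate_per_hour=4.00)          # $4/GPU-hour, assumed
```

At these assumed rates the compute-time bill (about $89/day) is well under the per-token bill ($2,000/day), which is the general reason compute-time billing favors high-throughput workloads.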

### Why did NVIDIA invest $150M in Baseten?
NVIDIA's $150M investment in Baseten's Series E reflects strategic alignment -- Baseten is a major consumer of NVIDIA H100 and H200 GPU capacity, and its growth directly drives NVIDIA hardware demand. The investment gives NVIDIA visibility into inference platform trends and aligns Baseten's roadmap with NVIDIA's Blackwell GPU rollout, ensuring Baseten optimizes for next-generation NVIDIA hardware.

## Tags

ai-powered, b2b, infrastructure, platform, saas

---
*Data from geo.sig.ai Brand Intelligence Database. Updated 2026-04-14.*