The Landscape
Who's building what. Opinionated. Updated weekly.
This is a living directory of the agent ecosystem. Every entry includes two things: what the company or tool does, and why i think it matters (or doesn't). These are not generic descriptions. They're takes, informed by the deep dives, interviews, and research i do every week for the newsletter.
You can filter by category or maturity stage. I will continue to update this landscape on a weekly basis. If something's missing or i got something wrong, let me know.
65 entries
Foundation Models
Anthropic
Growth
The company behind Claude. Building frontier models with a focus on safety and long-context reasoning. The models in the Claude 4 family (Opus, Sonnet, Haiku) are among the strongest for agentic tasks, with massive context windows and reliable tool use.
Why it matters
Claude is quietly becoming the default model for serious agent builders. The combination of massive context windows, strong tool use, and reliable instruction following makes it the backbone of most production agent systems i see shipping today. Claude Code is also pushing the boundary on what a coding agent can actually do in practice.
OpenAI
Mature
The company that kicked off the current AI era with ChatGPT. GPT-4o and the o-series reasoning models power a huge chunk of the agent ecosystem. The Agents SDK (evolved from Swarm) and function calling are the entry point for most builders.
Why it matters
Still the default for most developers. The Assistants API and function calling are how most people build their first agent. But the gap with competitors is narrowing fast, and i'm seeing more teams diversify away from OpenAI-only stacks. The o-series reasoning models are genuinely impressive for complex multi-step tasks though.
Google DeepMind
Growth
Google's AI research lab and the team behind Gemini. Gemini 2.0 brought native tool use, and the Gemini CLI gives developers a free agentic coding assistant. Google's moat is distribution through Search, Android, and Workspace.
Why it matters
Google has the distribution advantage nobody else can match. A billion people already use their products. If Gemini gets good enough (and it's getting there), the default agent for most consumers will just be... Google. The Gemini CLI launch was a smart move, giving developers a free on-ramp.
Meta (Llama)
Growth
Meta's open-source LLM family. Llama 3 and its successors have become the backbone of the open-source agent ecosystem. Free to use, modify, and deploy, which makes it the go-to for teams that need full control over their stack.
Why it matters
Llama changed the game for anyone building agents who can't or won't depend on API providers. The open-source ecosystem around it is massive. But here's the honest take: for most production agent use cases, the hosted APIs (Claude, GPT) still outperform on reliability. Llama is great for cost-sensitive deployments and edge cases where you need full control.
DeepSeek
Growth
Chinese AI lab that demonstrated a 46x reduction in training costs and 90% lower inference costs. Their models proved that frontier-level performance doesn't require frontier-level budgets, sending shockwaves through the industry.
Why it matters
DeepSeek is the most important thing that happened to AI economics in 2025. They proved intelligence is deflationary. This has massive implications for agent builders because cheaper inference means agents can reason longer, run more checks, handle messier tasks. The cost floor keeps dropping.
Mistral
Growth
French AI company building efficient open-weight models. Known for strong performance-per-parameter and function calling capabilities. Mistral Large and the smaller Mixtral models offer competitive alternatives to the US-based providers.
Why it matters
Mistral matters for the agent ecosystem because they're proving that European AI can compete on model quality. For teams building agents that need to comply with EU regulations, having a strong European model provider is genuinely useful. Their function calling support is solid.
xAI (Grok)
Growth
Elon Musk's AI company building Grok models with a focus on real-time information access through X (Twitter) data. Grok 3 showed competitive performance on coding and reasoning benchmarks.
Why it matters
The interesting angle here isn't the model quality (it's competitive but not leading). It's the data advantage. Having real-time access to X's firehose of social data gives Grok-powered agents a unique edge for anything involving current events, sentiment, or social signals.
Cohere
Growth
Enterprise-focused AI company building models optimized for business use cases. Command R+ is designed for RAG and tool use in enterprise environments, with strong multilingual support.
Why it matters
Cohere is playing a different game than the consumer-facing model providers. They're going deep on enterprise, and that focus shows in their RAG capabilities and data privacy features. If you're building agents for regulated industries, they're worth a look.
Agent Frameworks
LangChain / LangGraph
Growth
The most widely adopted agent framework ecosystem. LangChain provides the building blocks (chains, tools, memory). LangGraph adds stateful, graph-based orchestration for complex multi-step agent workflows with branching logic and human-in-the-loop.
Why it matters
Love it or hate it, LangChain is where most people start. The ecosystem is massive, the docs are good, and LangGraph genuinely solves hard orchestration problems. The learning curve is steeper than alternatives but it handles complex workflows better than anything else i've used. If you need state machines and branching logic, start here.
CrewAI
Growth
Multi-agent orchestration framework that lets you define teams of AI agents with roles, goals, and tools. Great for prototyping multi-agent systems quickly with a role-based mental model.
Why it matters
CrewAI is the fastest path from idea to working multi-agent prototype. You can get something running in a day. The role-based metaphor (agents as team members with specializations) is intuitive and maps well to how people think about delegation. It starts to creak under production load, but for learning and prototyping, nothing's faster.
AutoGen
Growth
Microsoft's open-source multi-agent framework. Enables conversational agents that can collaborate, debate, and solve problems together. Strong integration with Azure services and the Microsoft ecosystem.
Why it matters
AutoGen's conversation-based approach to multi-agent systems is genuinely different from the alternatives. Agents talk to each other to solve problems, which creates interesting emergent behaviors. The Microsoft backing means solid enterprise integration, but the developer experience still lags behind LangGraph and CrewAI.
OpenAI Agents SDK
Early
OpenAI's official framework for building agents, evolved from the Swarm research project. Provides primitives for agent orchestration, handoffs between agents, and guardrails, all tightly integrated with OpenAI's models.
Why it matters
This is OpenAI saying the framework layer matters enough to own it themselves. The Agents SDK is clean and well-designed, but the obvious limitation is vendor lock-in. You're building on OpenAI's stack, with OpenAI's models. For teams already committed to the OpenAI ecosystem, it's the smoothest path. For everyone else, it's a question of how much lock-in you're comfortable with.
Pydantic AI
Early
Agent framework from the creators of Pydantic, Python's most popular data validation library. Brings type-safe, model-agnostic agent building with structured outputs, dependency injection, and strong testing support.
Why it matters
If you're a Python developer who cares about type safety and clean code, this is probably the framework that'll feel most natural. The Pydantic team knows how to build developer tools. It's newer and less battle-tested than LangChain but the DX is noticeably better for structured agent outputs.
LlamaIndex
Growth
Data framework for LLM applications, strong in RAG (retrieval-augmented generation) and data-connected agents. Specializes in helping agents access and reason over private data sources like documents, databases, and APIs.
Why it matters
LlamaIndex carved out a clear niche: if your agent needs to work with your data, this is where to start. The RAG pipeline tooling is best-in-class. It's less of a general-purpose agent framework and more of a data layer that agents can plug into. The two aren't mutually exclusive: many teams use LlamaIndex for data and LangGraph for orchestration.
Semantic Kernel
Growth
Microsoft's SDK for integrating AI into applications. Supports C#, Python, and Java. Designed for enterprise developers who need agent capabilities within existing Microsoft/.NET ecosystems.
Why it matters
If you're a .NET shop, this is your only real option for agent building, and it's actually quite good. The enterprise focus means solid patterns for things like authentication, logging, and observability that other frameworks treat as afterthoughts. Not the sexiest framework, but production-ready.
Platforms & Products
Manus AI
Growth
General-purpose AI agent that can handle multi-step tasks end-to-end: booking flights, compiling research reports, building playable games. Launched March 2025 and hit $90M ARR in 5 months. Uses a sandboxed virtual computer approach.
Why it matters
Manus felt like the ChatGPT moment for agents. The frenzy was real: invite codes selling for thousands of dollars, Discord swelling past 180K members in days. i did a deep dive on them. The honest assessment: the underlying architecture (Claude + Qwen running in sandboxed VMs) isn't revolutionary, but the product execution is. The question is whether reliability can scale. Early users reported looping errors and lazy behavior on longer tasks. But $90M ARR in 5 months doesn't lie.
Perplexity
Growth
AI-powered answer engine that combines search with LLM reasoning. Perplexity Pro offers deep research capabilities that can autonomously research complex questions, browse the web, and synthesize findings into structured reports.
Why it matters
Perplexity is the clearest example of an agent that regular people actually use. The deep research feature is genuinely impressive for complex questions. Give it a real research problem and compare the result to an hour of Googling. The quality gap is significant. This is what consumer-facing agents should feel like.
ElevenLabs
Growth
Voice AI platform offering text-to-speech, voice cloning, and conversational AI agents. 11ai is their voice-first agent that integrates with Notion, Slack, Perplexity, and other tools via MCP. Over 5,000 voices available.
Why it matters
ElevenLabs is making voice the command layer for agents, and the math is compelling: we speak at 150 words per minute, type at 40. 11ai turns research, execution, and multi-step tasks conversational. With MCP integration, it plugs into tools people already use. i covered their launch in Agent Angle #1. Voice-first agents feel like a genuinely different interaction paradigm.
HeyGen
Growth
AI video generation platform that creates realistic video avatars and translations. Their agents can produce personalized video content at scale, from marketing to customer communication.
Why it matters
HeyGen is pushing video from a creative tool into an agent workflow. Instead of hiring actors and studios, you describe what you want and an agent generates it. The quality is getting eerily good. The interesting question is what happens when video agents get as good as text agents, because most of the internet is still consumed as video.
Decart
Growth
AI company generating interactive worlds in real-time. Founded by two Israeli intelligence veterans and built to a $3.1 billion valuation. Their world generation models create infinite interactive realities at a fraction of traditional simulation costs.
Why it matters
i did a deep dive on Decart. An investor told them it was impossible, but they built a system that generates interactive worlds 10x cheaper than existing methods. The implications for gaming, simulation, and training environments are massive. If agents need to practice in simulated worlds before acting in real ones (and they will), Decart's approach could be foundational.
Coding Agents
Cursor
Growth
AI-native code editor built as a fork of VS Code. Integrates AI deeply into the editing experience with inline completions, chat, and an agent mode that can plan and execute multi-file changes autonomously.
Why it matters
Cursor is the coding agent most developers actually use daily. It's not the most autonomous (Claude Code goes further) but the integration into the editing workflow is seamless. The agent mode is genuinely useful for medium-complexity refactoring tasks. i think the IDE-native approach is the right bet for most developers.
Devin (Cognition)
Growth
Billed as the first AI software engineer, built by Cognition Labs. Given a task, Devin can plan, write code, debug, deploy, and iterate autonomously using its own browser, code editor, and terminal.
Why it matters
Devin's launch in early 2024 was a watershed moment. It showed people what a fully autonomous coding agent could look like. The reality has been more nuanced. Production reliability is still a challenge, and most teams find they get better results from copilot-style tools (Cursor, Claude Code) than fully autonomous agents. But Devin pushed the whole space forward.
Claude Code
Growth
Anthropic's agentic coding tool that runs in your terminal. Can read your codebase, write and edit files, run commands, search the web, and manage git workflows. Operates directly on your filesystem with human-in-the-loop approval.
Why it matters
i've been using Claude Code heavily and it's shifted how i think about what coding agents can do. The terminal-first approach means it has real access to your tools and environment, not just a sandboxed editor. The human-in-the-loop design (you approve each action) trades autonomy for reliability, and for production code, that's the right tradeoff.
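To make the autonomy-for-reliability tradeoff concrete, here is a minimal sketch of an approval gate, where every proposed action has to be approved before it runs. This is not how Claude Code is implemented; `Action`, `execute_with_approval`, and `ask_in_terminal` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    """A step the agent wants to take (hypothetical structure)."""
    description: str
    run: Callable[[], str]


def ask_in_terminal(action: Action) -> bool:
    """Ask a human on the terminal whether the action may run."""
    answer = input(f"Allow: {action.description}? [y/N] ")
    return answer.strip().lower() == "y"


def execute_with_approval(
    actions: List[Action],
    approve: Callable[[Action], bool],
) -> List[Tuple[str, str]]:
    """Run each action only if `approve` says yes; record skipped ones."""
    results = []
    for action in actions:
        if approve(action):
            results.append((action.description, action.run()))
        else:
            # Nothing is executed for a rejected action.
            results.append((action.description, "skipped"))
    return results
```

Passing `ask_in_terminal` as `approve` gives the approve-each-action flow described above. Swapping in a policy function (for example, auto-approve reads, always ask for deletes) is one way to buy back some autonomy.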
Replit Agent
Growth
AI agent built into Replit's cloud IDE that can build full applications from natural language descriptions. Creates projects, writes code, configures environments, and deploys to production all within Replit's platform.
Why it matters
Replit Agent is interesting because it controls the full stack, from code to deployment. That closed-loop advantage (sound familiar?) means it can actually ship working apps, not just write code. The tradeoff is you're locked into Replit's platform. i covered a case where it deleted live data, which highlights the reliability gap that all coding agents face.
GitHub Copilot
Mature
GitHub's AI pair programmer, now with agent mode. Started as inline code completions and has evolved into a multi-model agent that can plan, execute multi-file edits, and run terminal commands within VS Code.
Why it matters
Copilot has the distribution advantage. Tens of millions of developers already use GitHub. The agent mode is catching up to Cursor but the real moat is the GitHub integration: issues, PRs, code review, Actions. If Copilot agents can operate across the full GitHub workflow, that's hard to compete with.
Windsurf
Growth
AI-native IDE from Codeium (rebranded). Competes directly with Cursor with features like Cascade (multi-step agent flows) and deep codebase understanding. OpenAI reportedly agreed to acquire it in 2025, but the deal fell through; Google hired its CEO and part of the team, and Cognition (the company behind Devin) acquired the rest.
Why it matters
The scramble over Windsurf tells you everything about how important the coding agent space has become. OpenAI wanted an IDE and tried to buy one. When that collapsed, Google and Cognition split the pieces within days. For users, the open question is what Windsurf becomes under Cognition. The competitive dynamics between Cursor/Windsurf/Copilot are worth watching closely.
v0 (Vercel)
Growth
Vercel's AI-powered frontend generation tool. Describe a UI and v0 generates production-ready React code. Part of the wave of tools turning natural language descriptions into working interfaces.
Why it matters
v0 is interesting because it's targeting a very specific workflow (frontend generation) and doing it well. The code quality is surprisingly good for a generative tool. It's a sign that the best agent products will be narrow and excellent rather than broad and mediocre.
bolt.new (StackBlitz)
Growth
Browser-based AI development environment that can build and deploy full-stack applications from prompts. Runs entirely in the browser using WebContainers, which means no local setup required.
Why it matters
bolt.new and Lovable represent the zero-to-deployed agent experience. No terminal, no IDE, just describe what you want. The target audience isn't developers; it's everyone else. The quality ceiling is lower than Cursor/Claude Code for complex apps, but for getting something live fast, the friction reduction is real.
Lovable
Early
AI-powered app builder that generates full-stack applications from natural language. Handles frontend, backend, database, and deployment. Targets non-technical users and rapid prototyping.
Why it matters
Same thesis as bolt.new: the next wave of builders won't write code. They'll describe what they want and an agent builds it. Lovable is one of the better executions of this idea, with solid design defaults and Supabase integration for the backend.
Customer Support
Sierra
Growth
Enterprise AI agent platform for customer experience. Founded by Bret Taylor (ex-Salesforce co-CEO) and Clay Bavor (ex-Google). Building outcome-priced agents that handle customer service conversations end-to-end for Fortune 500 companies.
Why it matters
i did a full deep dive on Sierra. They're building an outcome-priced agent OS for customer service, betting that reliability engineering and supervision can make probabilistic systems dependable enough for Fortune 500 scale. The founding team is absurdly credentialed. The key insight is they're not replacing customer service reps; they're replacing the vending machine experience of bad support. Early results from their enterprise deployments are genuinely impressive.
Intercom Fin
Growth
Intercom's AI agent for customer support. Resolves customer questions using your help center, documentation, and conversation history. Integrated directly into Intercom's existing customer messaging platform.
Why it matters
Fin has a distribution advantage that pure-play agent companies don't: Intercom already sits in the customer support workflow for thousands of companies. Adding an agent layer on top of existing conversations, knowledge bases, and routing is a natural extension. The resolution rates are competitive with specialized agent companies.
Ada
Growth
AI-powered customer service automation platform. Uses reasoning agents that can resolve complex customer issues across chat, email, voice, and social channels. Processes over a billion customer interactions.
Why it matters
Ada has been in the automated support game longer than most, and the shift from scripted chatbots to reasoning agents has been good to them. They already had the integrations, the customer relationships, and the conversation data. Adding LLM-powered reasoning on top of that existing foundation is exactly the closed-loop advantage i keep talking about.
ServiceNow
Growth
Enterprise workflow platform that's added AI agents for IT service management, HR, and customer service. Agents automate ticket routing, resolution, and cross-department workflows within ServiceNow's platform.
Why it matters
Same thesis as Salesforce: the companies that already own the workflow have the best shot at adding agents on top. ServiceNow's IT service management agents are one of the most proven enterprise agent use cases. When an employee reports a laptop issue, an agent that can check inventory, create a ticket, and schedule a replacement is genuinely useful.
Legal
Harvey
Growth
AI assistant for legal professionals. Trained on legal data to help with contract analysis, due diligence, litigation support, and legal research. Used by major professional services firms including Allen & Overy and PwC.
Why it matters
Legal is one of the strongest verticals for AI agents because the work is high-value, document-heavy, and follows patterns. Harvey's moat is the legal training data and the relationships with major firms. A coding agent can be swapped easily. A legal agent that understands specific jurisdictional requirements and has been validated against real case law? That's sticky.
Research Agents
Glean
Growth
Enterprise AI search and knowledge platform. Connects to all your company's apps (Slack, Confluence, Google Drive, etc.) and lets employees ask questions that get answered from internal knowledge. The agent mode can take actions across connected apps.
Why it matters
Glean is solving the internal knowledge base Q&A problem, which is one of the most proven agent use cases for enterprises today. If you have good source material and fragmented knowledge across tools, an agent that can search everything and synthesize answers is genuinely useful. Their connectors to enterprise apps are the moat.
Hebbia
Growth
AI-powered document analysis platform. Processes millions of pages of complex documents (financial filings, legal contracts, research papers) and lets analysts ask questions across entire document sets.
Why it matters
Hebbia is going after the analyst workflow in finance and legal, where people spend hours reading documents. Their Matrix product lets you run structured analysis across thousands of documents simultaneously. The bet is that the best AI companies will be the ones that own specific high-value workflows, not general-purpose chat.
Finance & Trading
Klarna AI
Mature
Klarna's internal AI assistant handling customer service. Replaced the equivalent of 700 customer service agents within months of deployment. Handles two-thirds of all customer service chats.
Why it matters
Klarna is the poster child for the agent economics argument. They cut their workforce and claimed the AI handles conversations better, faster, and cheaper. Whether you find that inspiring or unsettling, the numbers are hard to argue with. It's also a cautionary tale: Klarna owns the full stack (customer data, transaction history, support workflows), which is why it works. Most companies trying to replicate this don't have that advantage.
Infrastructure
TinyFish
Early
AI-powered web data extraction platform. Turns the internet's messiest, most unstructured pages into clean, structured data. Built specifically for the kind of web content that traditional scrapers can't handle.
Why it matters
i did a deep dive on TinyFish. Most of the web is a mess: tables that don't render, dynamic content, pages built for humans not machines. TinyFish uses AI to read the unreadable web and turn it into the world's most valuable database. Data extraction infrastructure like this is the picks-and-shovels play for the agent economy. Every agent that needs to understand the web needs something like this.
Nvidia
Mature
The company that makes the GPUs powering virtually all AI training and inference. The H100, H200, and Blackwell architecture GPUs are the foundation of every major AI model and agent system.
Why it matters
Nvidia is winning right now, but 90% gross margins never last forever. Major customers (Amazon, Google, etc.) are building their own chips. At the same time, more efficient models like DeepSeek mean you need fewer GPUs to get the same work done. This challenges the infinite demand thesis. But for now, if you're running agents at scale, you're running on Nvidia hardware.
Browserbase
Early
Cloud browser infrastructure for AI agents. Provides headless browsers that agents can control to interact with websites, fill forms, click buttons, and extract data. Purpose-built for agentic web automation.
Why it matters
Every agent that needs to interact with the web needs a browser. Browserbase is the infrastructure layer that makes this work at scale. Reliable browser automation is one of those unsexy problems that turns out to be critical for production agents. The picks-and-shovels play for the agent economy.
Firecrawl
Early
Web scraping API designed for LLMs and agents. Crawls websites and returns clean, structured markdown that's ready for AI consumption. Handles JavaScript rendering, anti-bot protection, and content extraction.
Why it matters
The web wasn't built for AI agents to read. Firecrawl translates messy web pages into clean data that agents can actually work with. It's a simple tool that solves a real problem. If you're building any agent that needs to pull information from the web, you'll probably end up using something like this.
Tavily
Early
Search API built specifically for AI agents. Returns structured, relevant results optimized for LLM consumption rather than human browsing. Used as the default search tool in many agent frameworks.
Why it matters
Regular search APIs return results designed for humans to browse. Tavily returns results designed for agents to consume. It's a small but important distinction that matters a lot when you're building agents that need to research and gather information. The fact that it's become the default search tool in LangChain says something.
LangSmith
Growth
Observability and evaluation platform for LLM applications from the LangChain team. Lets you trace agent runs, debug failures, evaluate outputs, and monitor production deployments.
Why it matters
If you don't log everything your agent does, you're flying blind. i keep saying this: log every tool call, every model response, every decision point. LangSmith makes this practical. The evaluation features for comparing agent performance over time are also genuinely useful for closing the feedback loop.
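As a rough illustration of what "log every tool call, every decision point" can look like without any platform, here is a standard-library sketch. It is not LangSmith's API: the `traced` decorator, the in-memory `TRACE` list, and the two example functions are all hypothetical.

```python
import functools
import json
import time

TRACE = []  # in-memory trace; a real system would ship records to a store


def traced(kind):
    """Record name, inputs, output or error, and duration for each call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            record = {
                "kind": kind,
                "name": fn.__name__,
                "args": repr(args),
                "kwargs": repr(kwargs),
            }
            try:
                result = fn(*args, **kwargs)
                record["output"] = repr(result)
                return result
            except Exception as exc:
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so nothing goes unlogged.
                record["seconds"] = round(time.perf_counter() - start, 6)
                TRACE.append(record)
        return wrapper
    return decorator


@traced("tool")
def lookup_order(order_id):
    # Hypothetical tool: stands in for a real API or database call.
    return {"order_id": order_id, "status": "shipped"}


@traced("decision")
def choose_reply(order):
    # Hypothetical decision point based on the tool result.
    if order["status"] == "shipped":
        return "Your order has shipped."
    return "Still processing."


if __name__ == "__main__":
    choose_reply(lookup_order("A-1001"))
    print(json.dumps(TRACE, indent=2))
```

The point of the design is that failures get recorded too: the `finally` branch appends the record even when the wrapped call raises, which is usually the run you most want to inspect.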
Braintrust
Early
Enterprise platform for evaluating, monitoring, and improving AI applications. Provides tools for running evals, A/B testing prompts, and tracking AI quality metrics in production.
Why it matters
Braintrust tackles the evaluation problem head-on. Most agent benchmarks are broken (i wrote about this). What you need is evaluation tied to your specific use case, with your actual data, measuring metrics that matter to your users. Braintrust provides the infrastructure for that kind of real-world evaluation.
Weights & Biases
Mature
ML experiment tracking and model monitoring platform. Originally built for training ML models, now expanded to cover LLM evaluation, prompt engineering, and agent observability.
Why it matters
W&B has been the standard for ML experiment tracking for years, and they've adapted well to the LLM era. The Weave product for LLM application monitoring is solid. For teams that are already using W&B for model training, adding agent monitoring is a natural extension.
Helicone
Early
Open-source LLM observability platform. One-line proxy integration that logs all LLM calls with latency, cost, and quality metrics. Supports all major model providers.
Why it matters
Helicone's value proposition is simplicity. One line of code to start logging all your LLM calls. For teams that don't need the full weight of LangSmith or W&B, Helicone is the fastest path to visibility into what your agents are actually doing and costing.
Apify
Growth
Web scraping and automation platform with a marketplace of pre-built scrapers (Actors). Now integrates with LLMs to enable AI-powered web agents that can navigate, extract, and process web data.
Why it matters
Apify has been doing web scraping longer than the AI agent wave, and that experience matters. Their marketplace of pre-built scrapers means agents can plug into thousands of data sources without building custom integrations. For data-hungry agents, Apify is often the fastest path to getting the data they need.
E2B
Early
Cloud sandboxes for AI agents and AI-generated code. Provides secure, isolated environments where agents can execute code, run tools, and interact with files without risking the host system.
Why it matters
When agents write and execute code, you need sandboxing. E2B provides the secure execution environments that let agents run code without nuking your production database. After the stories i've covered of agents deleting live data and firing root commands, sandbox infrastructure feels less like nice-to-have and more like essential.
Composio
Early
Tool integration platform for AI agents. Provides 250+ pre-built integrations (GitHub, Slack, Salesforce, etc.) that agents can use without custom API work. Handles authentication, rate limiting, and error handling.
Why it matters
Every agent needs tools, and building tool integrations from scratch is tedious, repetitive work. Composio pre-builds the integrations so you can focus on the agent logic. For prototyping especially, being able to plug in 250+ tools immediately is a huge time saver.
Scale AI
Mature
Data labeling and AI infrastructure company. Originally built for training data, now provides evaluation infrastructure, RLHF pipelines, and government AI contracts. Works with most major model providers.
Why it matters
Scale AI is the invisible infrastructure behind many of the models powering agents. Their data labeling and RLHF work feeds directly into model quality. The government contracts are also interesting: if agents are going to operate in government contexts, the evaluation and safety infrastructure Scale provides becomes critical.
Healthcare
Chai Discovery
Early
AI-powered biotech R&D platform. Uses AI agents to accelerate drug discovery, molecular design, and biological research. Rewrites the traditional R&D playbook with AI-first approaches.
Why it matters
Healthcare and biotech are the verticals where AI agents could have the most impact. Chai-2 is rewriting the biotech R&D playbook by letting agents do the kind of molecular analysis that used to take teams of PhDs months. The compliance requirements in healthcare create a natural moat. Regulated industries are where vertical agents win.
Sales & Outbound
Salesforce Agentforce
Growth
Salesforce's platform for building and deploying AI agents across sales, service, marketing, and commerce. Agents operate within the Salesforce ecosystem with access to CRM data, workflows, and customer history.
Why it matters
Salesforce has the enterprise distribution that startups can only dream of. Agentforce agents sit inside existing Salesforce workflows with access to customer data that's been accumulating for years. That's a massive closed-loop advantage. The question is whether Salesforce can execute on the AI product fast enough, or if nimble startups eat their lunch.
Robotics & Embodied
Tesla Optimus
Early
Tesla's humanoid robot program. Designed for repetitive, dangerous, or boring tasks. Already deployed in limited capacity in Tesla factories. Leverages Tesla's experience with self-driving AI, manufacturing scale, and vertical integration.
Why it matters
Tesla's advantage in robotics is the same as their advantage in EVs: vertical integration and manufacturing scale. If anyone can bring the cost of humanoid robots down to mass-market levels, it's Tesla. The factory deployments are early but real. i covered K-Scale's attempt to compete with Tesla on open-source robots... the economics are brutal. Tesla's scale is nearly impossible to match.
Figure
Early
Building general-purpose humanoid robots with a focus on labor tasks. Figure 02 integrates conversational AI (via OpenAI partnership) with physical dexterity. Targeting warehouse, manufacturing, and logistics applications.
Why it matters
Figure has raised more money than almost any robotics startup in history, and the OpenAI partnership gives them access to frontier AI capabilities. The Figure 02 demos showing natural conversation while performing physical tasks are impressive. But robotics is hard. Really hard. The gap between demo and production is even wider than in software agents.
K-Scale Labs
Early
Open-source humanoid robot company that tried to build an affordable alternative to Tesla Optimus. Founded on the premise that robotics needs its Linux moment. Built and shipped faster than major competitors but faced brutal economics.
Why it matters
i wrote a deep dive on K-Scale. They built an open humanoid faster than Tesla, which is genuinely remarkable. But their story also reveals the fragile economics of open-source robots. Hardware margins are thin, iteration is slow, and the capital requirements are enormous. What their failure tells us is that open-source might work for software agents but the economics don't translate to hardware.
Unitree
Early
Chinese robotics company building affordable humanoid and quadruped robots. The G1 humanoid is priced starting at $16,000, dramatically undercutting Western competitors. Also makes the Go2 quadruped robot.
Why it matters
Unitree is doing to robots what Chinese manufacturers did to EVs and drones: making them dramatically cheaper. A $16K humanoid robot changes the math on what deployments are economically viable. The quality question remains, but if Unitree can deliver reliable robots at these price points, it changes the entire competitive landscape.
1X Technologies
Early
Norwegian robotics company building humanoid robots for everyday tasks. Their NEO robot is designed for home and work environments. Focused on safety and natural human-robot interaction.
Why it matters
1X is taking a different approach to humanoid robots: safety-first design for human environments. While Tesla and Figure target industrial settings first, 1X is designing for homes and offices from the start. The safety-first approach could be a genuine differentiator as robots move beyond controlled factory floors.
Agility Robotics
Early
Builds Digit, a bipedal robot designed for warehouse and logistics work. One of the few humanoid robots with actual commercial deployments, partnering with Amazon for warehouse operations.
Why it matters
Agility is one of the few robotics companies that can point to real commercial deployments, not just demos. The Amazon partnership for warehouse operations is a meaningful validation. Digit isn't trying to do everything; it's focused on the specific physical tasks that matter in logistics. That vertical focus is the same pattern i see winning in software agents.
Boston Dynamics
Mature
The OG robotics company (now owned by Hyundai). Atlas and Spot are the most recognizable robots in the world. Transitioned Atlas to electric and pivoted from research showcases to commercial deployments.
Why it matters
Boston Dynamics proved that extreme physical capability is possible, but struggled for years to turn it into a business. The Hyundai acquisition and commercial pivot are changing that. Spot has real commercial deployments in inspection and surveying. The lesson here: in robotics, the gap between "can we build it" and "can we sell it profitably" is enormous.
Physical Intelligence
Research
AI company building foundation models for robotics. Their pi-zero model is a generalist robot policy that can control different robot hardware for various tasks, similar to how language models generalize across text tasks.
Why it matters
Physical Intelligence is trying to do for robot control what foundation models did for language: build one model that generalizes across many tasks and hardware platforms. If they succeed, it changes the economics of robotics completely. Instead of programming each robot for each task, you fine-tune a foundation model. Early results are promising but it's still very much research-stage.
Apptronik
Early
Austin-based robotics company building the Apollo humanoid robot. Partnered with NASA and Mercedes-Benz. Focused on modular design that can be customized for different industrial applications.
Why it matters
Apptronik's modular approach is smart. Instead of one robot for everything, they build a platform that can be configured for specific tasks. The NASA and Mercedes partnerships validate the hardware quality. Whether that's enough to compete with Tesla's scale and Unitree's pricing remains the question.
Sanctuary AI
Early
Canadian company building humanoid robots with a focus on human-like dexterity and cognition. Their Phoenix robot is designed for general-purpose work in manufacturing, logistics, and retail.
Why it matters
Sanctuary AI's emphasis on hand dexterity is key. As i wrote in the Dexterity Stack deep dive, intelligence lives at the point of contact. A robot that can fold towels, handle irregular objects, and adapt to unstructured environments solves the hardest problems in physical AI. Sanctuary is one of the few teams seriously tackling this.
On-Chain Agents
Virtuals Protocol
Early
Crypto protocol for creating and co-owning AI agents. Lets communities launch tokenized agents that can interact across social platforms, games, and DeFi. Built on Base (Coinbase's L2).
Why it matters
Virtuals is the most interesting experiment in on-chain agents right now. The idea of tokenized, community-owned agents is genuinely novel. Whether it's a durable business model or a speculative vehicle remains to be seen. The token economics create incentives for agent development that don't exist in traditional software. But the space is still.. very early and very wild.
ai16z / ELIZA
Early
Open-source framework for building AI agents with crypto-native capabilities. ELIZA started as a meme project on Solana and evolved into the most widely used framework for building autonomous social and trading agents.
Why it matters
ELIZA's journey from meme to legitimate framework is one of the stranger stories in the agent space. The open-source community around it is genuinely active, and the agent capabilities (social posting, trading, cross-platform interaction) are real. But the crypto-native context means everything is wrapped in speculation and token dynamics that make it hard to evaluate on pure fundamentals.
Fetch.ai
Early
Decentralized platform for building and deploying autonomous AI agents on blockchain. Merged with SingularityNET and Ocean Protocol to form the Artificial Superintelligence Alliance (ASI).
Why it matters
Fetch.ai has been in the AI-meets-crypto space longer than most. The merger with SingularityNET and Ocean Protocol creates the largest decentralized AI ecosystem. The thesis is that autonomous agents need decentralized infrastructure for identity, payments, and coordination. Whether that thesis is correct is still an open question.