Mira Murati, former OpenAI CTO, has unveiled groundbreaking AI through her startup, Thinking Machines Lab. The new “interaction models” enable seamless real-time conversations, processing audio, video, and text simultaneously. Unlike traditional turn-based AI, these models mimic natural human dialogue with minimal delays.
The flagship model, TML-Interaction-Small, is a 276-billion-parameter Mixture-of-Experts (MoE) system with 12 billion active parameters. It uses a “full-duplex” architecture that breaks interactions into 200-millisecond micro-turns, allowing the AI to listen, observe, and respond continuously, even while it is speaking. On the FD-bench benchmark, it achieves response latency under 0.4 seconds, faster than rivals from OpenAI and Google.
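To make the micro-turn idea concrete, here is a minimal sketch of a full-duplex loop. This is an illustration only, not Thinking Machines' actual design: the `FullDuplexAgent` class and its reply-on-question policy are hypothetical; the only detail taken from the article is the 200-millisecond micro-turn window.

```python
from dataclasses import dataclass, field

TICK_MS = 200  # micro-turn length reported for TML-Interaction-Small

@dataclass
class FullDuplexAgent:
    """Toy full-duplex loop: on every micro-turn the agent both ingests the
    latest input chunk and may emit output, instead of waiting for the
    user's whole utterance to end (classic half-duplex, turn-based chat)."""
    context: list = field(default_factory=list)

    def step(self, incoming):
        # Ingest whatever arrived during this 200 ms window (may be silence).
        if incoming is not None:
            self.context.append(incoming)
        # Hypothetical policy: respond the instant a question mark appears,
        # without waiting for the speaker to finish their whole turn.
        if incoming and incoming.endswith("?"):
            return "answering: " + " ".join(self.context)
        return None

def run(stream):
    """Feed 200 ms chunks to the agent; record (elapsed_ms, output) per tick."""
    agent = FullDuplexAgent()
    return [(tick * TICK_MS, agent.step(chunk))
            for tick, chunk in enumerate(stream)]

# User keeps talking past the question; the agent reacts on the very
# micro-turn the question lands (400 ms in), then keeps listening.
timeline = run(["so", "what is", "full duplex?", "and also..."])
```

In a turn-based system, the reply could only begin after the final chunk; here it fires mid-stream, which is the behavior the sub-0.4-second FD-bench latency figure is measuring.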
Thinking Machines, founded in early 2025 and backed by $2 billion in funding, aims to transform human-AI collaboration. Early demos show the AI handling live streams for tasks like drawing, searching, or tool use mid-conversation. A limited research preview launches soon, with wider access later in 2026.
This innovation addresses a key AI flaw: long pauses in responses. By enabling instant reactions, it boosts applications in education, customer service, and virtual assistants. Murati’s team trained these models from scratch, avoiding patchwork fixes on existing systems.
India’s AI ecosystem, growing at 40% annually, could benefit as startups adopt such tech for multilingual real-time tools. With global AI investment hitting $200 billion last year, Thinking Machines positions itself as a leader in multimodal AI. Experts predict it could redefine productivity, making AI a true partner. As adoption rises, ethical concerns, such as the privacy risks of always-on audio and video capture, must be addressed. Murati’s venture signals the next era of intuitive AI.
FAQs (Frequently Asked Questions)
1. What are interaction models by Thinking Machines?
New AI systems for real-time audio-video-text processing, using full-duplex tech for continuous human-like conversations without turn-based delays.
2. What is TML-Interaction-Small’s key spec?
A 276-billion-parameter MoE model with 12 billion active parameters and sub-0.4-second response latency on FD-bench, enabling micro-turn interactions every 200 milliseconds.
3. When can users access these models?
A limited research preview in the coming months, with a broader release later in 2026; for now there are demos only, no public product.