Descript is an AI-powered video and podcast editing platform that treats audio and video like a text document. Users edit media by editing a transcript: cutting words from the text deletes the corresponding footage. Underlord, Descript's built-in AI layer, automates more complex production tasks. Descript is used by podcasters, video creators, marketers, and production teams.
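To make the transcript-as-timeline idea concrete, here is a minimal sketch of how deleting words could translate into media ranges to keep. It is an illustration only, not Descript's implementation: the `Word` type, its fields, and `keep_ranges` are hypothetical names, and the sketch assumes each transcript word carries start and end timestamps.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcript word with its position in the media (hypothetical model)."""
    text: str
    start: float  # seconds
    end: float    # seconds
    deleted: bool = False


def keep_ranges(words):
    """Merge runs of consecutive non-deleted words into (start, end) ranges to keep."""
    ranges = []
    prev_kept = False
    for w in words:
        if w.deleted:
            prev_kept = False
            continue
        if prev_kept:
            # Extend the current range to cover this word.
            ranges[-1] = (ranges[-1][0], w.end)
        else:
            # A deleted word (or the start) precedes this one: open a new range.
            ranges.append((w.start, w.end))
        prev_kept = True
    return ranges


words = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.8, deleted=True),  # the user deleted this word in the transcript
    Word("welcome", 0.8, 1.4),
    Word("back", 1.4, 1.8),
]
print(keep_ranges(words))  # [(0.0, 0.3), (0.8, 1.8)]
```

Deleting "um" in the text removes the 0.3 to 0.8 second span from the output, which is the behavior the description above refers to.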
Underlord now runs on reasoning models, with Gemini 3 among the selectable options, enabling it to handle multi-step edit instructions that previously required manual execution. Users can describe a complex sequence (cut all pauses over one second, remove filler words, add a chapter break before each topic shift) and Underlord executes it as a coordinated chain rather than as a series of individual actions.
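As a rough sketch of what a "coordinated chain" of edits might look like, the example below runs two of the steps named above (long pauses and filler words) over the same transcript and combines their results into one cut list. Everything here is an assumption for illustration: the function names, the filler list, and the `(text, start, end)` tuple format are hypothetical and say nothing about how Underlord works internally.

```python
FILLERS = {"um", "uh", "er"}  # assumed filler list, for illustration only


def find_long_pauses(words, min_gap=1.0):
    """Return (start, end) gaps between consecutive words longer than min_gap seconds."""
    return [
        (a[2], b[1])
        for a, b in zip(words, words[1:])
        if b[1] - a[2] > min_gap
    ]


def find_fillers(words):
    """Return the (start, end) span of every word that is in the filler list."""
    return [(start, end) for text, start, end in words if text.lower() in FILLERS]


def plan_cuts(words, steps):
    """Run each step over the same transcript and return one sorted cut list."""
    cuts = []
    for step in steps:
        cuts.extend(step(words))
    return sorted(cuts)


# Each word is (text, start_seconds, end_seconds).
transcript = [("Hello", 0.0, 0.5), ("um", 0.6, 0.9), ("today", 2.5, 3.0)]
print(plan_cuts(transcript, [find_long_pauses, find_fillers]))
# [(0.6, 0.9), (0.9, 2.5)]
```

Collecting every cut before applying any of them means later steps see the original timestamps, which is one simple way to keep a multi-step instruction consistent.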
Video generation from text prompts is now available via integrated Veo 3.1 and Sora 2, allowing creators to generate B-roll or scene footage directly within Descript without switching to an external tool. Lip sync for dubbed and translated video was added alongside the generation features, improving realism for multilingual content.
Caption translation and dubbing expanded significantly: 39 additional languages are now supported for captions, and 6 new languages gained full dubbing support including voice synthesis. Descript also added 21 new stock voices for AI voiceover, bringing the total library to over 1,000.
MCP (Model Context Protocol) integration allows Claude and other MCP-capable AI agents to control Descript via natural language prompts. This enables automated editing workflows in which an external agent issues editing commands, runs exports, or manages projects programmatically.
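For readers unfamiliar with MCP, the sketch below shows the general shape of a tool invocation an agent sends to an MCP server: a JSON-RPC 2.0 request with the `tools/call` method. The tool name `remove_filler_words` and its `project_id` argument are hypothetical placeholders; the tools Descript actually exposes are not listed here, and an agent would normally discover them first with a `tools/list` request.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tool invocation as a JSON-RPC 2.0 request object."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, used only to show the message shape.
request = build_tool_call(1, "remove_filler_words", {"project_id": "demo-project"})
print(json.dumps(request, indent=2))
```

In practice the agent's MCP client library builds and sends these messages, so the user only writes the natural language prompt.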
Descript is best for video and podcast creators who want AI-assisted editing at the transcript level, and for teams producing multilingual or dubbed content who need integrated lip sync and voice synthesis.