Guide

How to isolate audio from video

This guide explains a practical workflow for extracting voices, instruments, and ambience from mixed video audio. The goal is speed and usable results, not a perfect lab-grade separation in every case.

Step-by-step workflow

Start with the cleanest source clip you have: If you have multiple takes, choose the one with the closest mic placement and the least clipping. Better inputs produce better separations.
Write a specific target prompt: Name the sound source you want, as in "main speaker voice", "acoustic guitar strumming", or "soft crowd ambience". Avoid broad prompts like "good audio", which do not describe a target sound source.
Process and preview both tracks: Listen to the isolated and background tracks separately before exporting. This confirms the extraction quality and catches artifacts early.
Balance levels for your target platform: Raise the isolated track for clarity. In social clips or interviews, keep a small amount of background when the result needs a more natural tone.
Export and continue your edit: Send the results into your broader workflow for captions, transitions, and mastering. AudioPrompt is strongest as a fast isolation step at the front of that workflow.
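
Two of the steps above, choosing the take with the least clipping and balancing the isolated track against the background, can be sketched in a few lines of Python. This is only an illustration and not part of AudioPrompt: it assumes each track is already decoded into a list of float samples in the range -1.0 to 1.0, and the function names, the 0.999 clipping threshold, and the -18 dB default background level are all hypothetical choices.

```python
from typing import Dict, List, Sequence


def clipped_fraction(samples: Sequence[float], threshold: float = 0.999) -> float:
    """Share of samples at or beyond the threshold (a rough clipping indicator).

    The threshold is an assumed value, not a standard.
    """
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return clipped / len(samples)


def pick_cleanest_take(takes: Dict[str, Sequence[float]]) -> str:
    """Return the name of the take with the lowest clipped fraction."""
    return min(takes, key=lambda name: clipped_fraction(takes[name]))


def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def mix_tracks(
    isolated: Sequence[float],
    background: Sequence[float],
    isolated_db: float = 0.0,
    background_db: float = -18.0,  # assumed default: background kept low
) -> List[float]:
    """Mix two equal-length tracks with per-track gain, limited to [-1.0, 1.0]."""
    if len(isolated) != len(background):
        raise ValueError("tracks must have the same number of samples")
    gain_iso = db_to_gain(isolated_db)
    gain_bg = db_to_gain(background_db)
    return [
        max(-1.0, min(1.0, gain_iso * a + gain_bg * b))
        for a, b in zip(isolated, background)
    ]


if __name__ == "__main__":
    takes = {
        "take_a": [0.2, 1.0, -1.0, 0.5],  # two samples at full scale
        "take_b": [0.2, 0.6, -0.7, 0.5],  # no clipped samples
    }
    print(pick_cleanest_take(takes))
    print(mix_tracks([0.5, -0.5], [0.5, 0.5], background_db=-20.0))
```

Counting full-scale samples is only a rough proxy for audible clipping, so treat the ranking as a starting point and still listen to each take before committing to one.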

Prompt examples that usually work better than generic ones

"Primary host voice at center"
"Lead vocal with minimal reverb"
"Hi-hat and snare only"
"Street ambience with passing cars"
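
As a rough illustration of what separates these prompts from a broad one like "good audio", the sketch below flags a prompt as vague when it contains nothing except generic words. The word list and the function name are hypothetical and chosen only for this example; AudioPrompt may interpret prompts quite differently.

```python
import re

# Hypothetical list of words that do not name a sound source on their own.
GENERIC_WORDS = {
    "good", "better", "best", "nice", "clean", "clear",
    "audio", "sound", "sounds", "track", "quality",
    "the", "a", "an", "and", "with", "only", "at",
}


def is_vague_prompt(prompt: str) -> bool:
    """Return True when every word in the prompt is on the generic list."""
    words = re.findall(r"[a-z]+", prompt.lower())
    specific = [w for w in words if w not in GENERIC_WORDS]
    return len(specific) == 0


if __name__ == "__main__":
    for prompt in ["good audio", "Primary host voice at center", "Hi-hat and snare only"]:
        label = "vague" if is_vague_prompt(prompt) else "specific"
        print(f"{prompt!r}: {label}")
```

A check like this only catches the most obvious cases. A prompt can pass it and still be ambiguous, so previewing the result remains the real test.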

Common mistakes to avoid

Using vague prompts that do not describe the target sound source
Expecting perfect isolation from clipped or severely distorted recordings
Skipping preview checks and discovering artifacts only after export
Treating one prompt result as final instead of iterating quickly