Step-by-step workflow
1
Write script with Claude
Generate a punchy voiceover script.
Write a 60-second YouTube voiceover script about [TOPIC]. Hook in first 5 seconds (surprising fact or question). Short sentences (max 15 words). Conversational tone. End with CTA. Approx 150 words. Audience: [YOUR AUDIENCE].
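Before sending the script to TTS, it can help to confirm that Claude's output actually meets the constraints in the prompt. The sketch below is one way to do that. The 15-word sentence limit and 150-word target come from the prompt, while the 20% tolerance and the function name check_script are assumptions made for illustration, and the sentence splitter is a rough heuristic.

```python
import re

MAX_WORDS_PER_SENTENCE = 15  # from the prompt
TARGET_WORDS = 150           # from the prompt
TOLERANCE = 0.2              # assumption: accept roughly 120-180 words

def check_script(text):
    """Report word count and any sentences longer than the prompt allows."""
    # Split after ., ! or ? followed by whitespace (a heuristic, not a parser).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    word_count = len(text.split())
    too_long = [s for s in sentences if len(s.split()) > MAX_WORDS_PER_SENTENCE]
    length_ok = abs(word_count - TARGET_WORDS) <= TARGET_WORDS * TOLERANCE
    return {"word_count": word_count, "length_ok": length_ok, "too_long": too_long}
```

If too_long is not empty, paste those sentences back to Claude and ask for shorter versions.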
2
Generate voiceover with Kokoro TTS
Save the script from step 1 as script.txt, then run this Python code to get a .wav file.
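from kokoro import KPipeline
import numpy as np
import soundfile as sf

# lang_code='a' selects American English
pipeline = KPipeline(lang_code='a')
with open('script.txt', 'r', encoding='utf-8') as f:
    script = f.read()
# The pipeline yields (graphemes, phonemes, audio) for each chunk of text
audio_parts = [audio for _, _, audio in pipeline(script, voice='af_heart', speed=0.95)]
# Join the chunks and write them at Kokoro's 24 kHz sample rate
sf.write('voiceover.wav', np.concatenate(audio_parts), 24000)
print("voiceover.wav saved!")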
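Since the prompt targets a 60-second video, it is worth checking how long the rendered narration actually runs. A minimal sketch using only the standard library, assuming voiceover.wav is a standard PCM WAV file (the helper name wav_duration_seconds is made up for this example):

```python
import wave

def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of audio frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()
```

Calling wav_duration_seconds('voiceover.wav') after the TTS step shows whether the script needs trimming or a different speed value.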
3
Import into CapCut or DaVinci Resolve
Drag voiceover.wav onto an audio track, then time your visuals to the narration.
4
Add auto-captions in CapCut
Text → Auto Captions → select language. The captions are timed automatically to the Kokoro audio.
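If you would rather build a caption file yourself than rely on auto-captions, the chunk lengths from the TTS step are enough for rough timing. The sketch below is not part of the CapCut workflow. It assumes you collected (text, number_of_samples) pairs while generating audio, for example from the graphemes and audio that the Kokoro pipeline yields, and it turns them into SRT text. The function names are hypothetical.

```python
SAMPLE_RATE = 24000  # same rate passed to sf.write in step 2

def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def build_srt(chunks, sample_rate=SAMPLE_RATE):
    """Build SRT text from (text, num_samples) pairs in playback order."""
    entries = []
    start = 0.0
    for index, (text, num_samples) in enumerate(chunks, start=1):
        # Each caption lasts as long as the audio generated for its chunk.
        end = start + num_samples / sample_rate
        entries.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
        start = end
    return "\n".join(entries)
```

Each caption covers a whole TTS chunk, so the result is coarser than word-level auto-captions. It works best when each chunk is a single short sentence.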
Pro tips
→ CapCut auto-captions sync well with Kokoro TTS
→ For Reels/Shorts: use speed=1.1 for slightly faster pacing
→ DaVinci Resolve has a free, professional-grade version; use it for longer YouTube videos
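To pick a speed value for a fixed-length Reel or Short, you can estimate from one test render. This sketch assumes narration length scales inversely with the speed parameter, which is only an approximation, so re-render and measure to confirm. Both function names are hypothetical.

```python
def estimated_duration(base_seconds, base_speed, new_speed):
    """Estimate narration length at new_speed from a render made at base_speed."""
    # Assumption: duration is inversely proportional to the speed setting.
    return base_seconds * base_speed / new_speed

def speed_for_target(base_seconds, base_speed, target_seconds):
    """Estimate the speed setting needed to hit target_seconds."""
    return base_speed * base_seconds / target_seconds
```

For example, a render that runs 60 seconds at speed=0.95 would be estimated at about 52 seconds at speed=1.1.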
Why this matters for India
Create Hindi YouTube content without hiring a voice artist. Kokoro's Hindi voices (lang_code='h' instead of 'a') are surprisingly natural.