AI Captions for Viral Shorts
Automatic viral-style 4-word captions for short-form video
Captions are critical for vertical shorts: 80% of Reels/TikTok/Shorts are watched muted. MarbSocial transcribes speech with Google Gemini STT, which is highly accurate on Russian (it catches slang, IT terms, and mixed RU/EN speech). It then chunks the text TikTok-style (4 words per line, at most 2.5 s on screen) and burns the captions into the final MP4. The position editor offers a vertical 0-100% slider, a font scale from 0.5× to 2×, and white or yellow text.
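The chunking rule above (4 words per line, at most 2.5 s on screen) can be sketched in a few lines of Python. This is a minimal illustration, not MarbSocial's actual code: the `Word` and `Caption` types and the `chunk_words` function are hypothetical names, and it assumes the transcription step returns a start and end time for every word.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Caption:
    text: str
    start: float
    end: float


def chunk_words(words, max_words=4, max_duration=2.5):
    """Group timed words into captions limited by word count and duration."""
    captions = []
    current = []
    for word in words:
        if current:
            too_many = len(current) >= max_words
            too_long = word.end - current[0].start > max_duration
            if too_many or too_long:
                captions.append(_flush(current, max_duration))
                current = []
        current.append(word)
    if current:
        captions.append(_flush(current, max_duration))
    return captions


def _flush(group, max_duration):
    # Clamp the end time so a single very long word still respects the limit.
    end = min(group[-1].end, group[0].start + max_duration)
    return Caption(" ".join(w.text for w in group), group[0].start, end)
```

A new caption starts as soon as either limit would be exceeded, so fast speech is split by word count and slow speech by duration.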
What it does
Gemini STT transcription: Russian, English, auto-detect
4-word chunking with transitions (TikTok/Reels style)
Manual Y-offset (0-100%) and font scale (0.5×–2×)
Top / Center / Bottom presets for quick setup
Aspect-aware font size: 9:16 and 1:1 get different base sizes
Burn-in to the final MP4: works in any player
Automatic adaptation to local CIS names and slang
Retry on failed transcription chunks, so fewer gaps in the timeline
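To show how the position and size settings from the list above could combine, here is a small Python sketch. The preset percentages and base font sizes are assumed values chosen for illustration (the page does not state them), and `caption_layout` is a hypothetical helper, not a MarbSocial API.

```python
# Assumed preset positions, as a percentage of frame height.
PRESETS = {"top": 10.0, "center": 50.0, "bottom": 85.0}

# Assumed base font sizes in pixels for each aspect ratio.
BASE_FONT_PX = {"9:16": 64, "1:1": 48}


def caption_layout(frame_height, aspect, y_percent, font_scale):
    """Turn slider values into a pixel position and a font size."""
    if not 0 <= y_percent <= 100:
        raise ValueError("y_percent must be between 0 and 100")
    if not 0.5 <= font_scale <= 2.0:
        raise ValueError("font_scale must be between 0.5 and 2.0")
    return {
        "font_px": round(BASE_FONT_PX[aspect] * font_scale),
        "y_px": round(frame_height * y_percent / 100),
    }


# Example: the Bottom preset on a 1080x1920 vertical clip at default scale.
layout = caption_layout(1920, "9:16", PRESETS["bottom"], 1.0)
```

Keeping a separate base size per aspect ratio means the same font scale setting looks comparable on 9:16 and 1:1 output.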
FAQ
How accurate is Russian transcription?
Clean speech: 95-97%. Noisy audio, strong accents, or a poor microphone: 85-92%. Podcasts, lectures, and interviews are usually recognized well. MarbSocial uses Gemini 1.5 Flash and retries failed chunks, working in 2.5-minute slices instead of 5-minute ones.
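The slice-and-retry idea can be sketched as follows. This is an illustration under assumptions: `slice_ranges` and `transcribe_with_retry` are hypothetical names, the attempt count and backoff are made-up defaults, and `transcribe` stands in for whatever function actually calls the speech-to-text service.

```python
import time


def slice_ranges(total_seconds, slice_seconds=150.0):
    """Split a recording into consecutive (start, end) slices of 2.5 minutes."""
    ranges = []
    start = 0.0
    while start < total_seconds:
        end = min(start + slice_seconds, total_seconds)
        ranges.append((start, end))
        start = end
    return ranges


def transcribe_with_retry(transcribe, ranges, max_attempts=3, backoff_seconds=1.0):
    """Call transcribe(start, end) per slice, retrying slices that raise."""
    results = []
    for start, end in ranges:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append((start, end, transcribe(start, end)))
                break
            except Exception:
                if attempt == max_attempts:
                    # Give up on this slice; None marks a gap in the captions.
                    results.append((start, end, None))
                else:
                    time.sleep(backoff_seconds * attempt)
    return results
```

Shorter slices limit the damage of a failure: a slice that never succeeds leaves a gap of at most 2.5 minutes, and each retry re-sends less audio.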
Can I edit recognized text?
Yes. In the Transcript panel, click any segment to edit it. Changes are saved and applied the next time the clip is rendered.
Do captions work with letterbox mode (black bars)?
Yes. In "fit with bars" mode, captions render on the video itself, not in the black bars. The position is adjustable with the Y-slider as usual.
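One way to keep captions out of the bars is to compute where the scaled video sits on the canvas and map the slider onto that rectangle only. The sketch below assumes exactly that mapping (0% is the top edge of the video, 100% the bottom edge); the page does not confirm how MarbSocial does it, and `fit_rect` and `caption_y_in_video` are hypothetical names.

```python
def fit_rect(canvas_w, canvas_h, video_w, video_h):
    """Return (x, y, w, h) of a video scaled to fit inside the canvas."""
    scale = min(canvas_w / video_w, canvas_h / video_h)
    w = round(video_w * scale)
    h = round(video_h * scale)
    # Center the video; the leftover space becomes the black bars.
    x = (canvas_w - w) // 2
    y = (canvas_h - h) // 2
    return x, y, w, h


def caption_y_in_video(canvas_w, canvas_h, video_w, video_h, y_percent):
    """Map a 0-100% slider value to a canvas Y coordinate inside the video."""
    _, top, _, height = fit_rect(canvas_w, canvas_h, video_w, video_h)
    return top + round(height * y_percent / 100)


# Example: a square 1080x1080 video on a 1080x1920 vertical canvas.
rect = fit_rect(1080, 1920, 1080, 1080)
```

With this mapping the slider can never push a caption into the bars, because every value resolves to a coordinate between the top and bottom edge of the video.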
Start for free
Standard plan — 3 accounts, 100 posts/month. No credit card required.
Try AI captions on your video