AI Captions for Viral Shorts

Automatic viral-style 4-word captions for short-form video

Captions are critical for vertical shorts: 80% of Reels, TikTok videos, and Shorts are watched muted. MarbSocial transcribes speech with Google Gemini speech-to-text (STT), which is highly accurate for Russian (it catches slang, IT terms, and mixed RU/EN speech), splits the transcript into TikTok-style chunks (4 words per line, at most 2.5 s on screen), and burns the captions into the final MP4. The position editor offers a vertical 0-100% slider, a font scale from 0.5× to 2×, and white or yellow text.
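The chunking rule above (4 words per line, at most 2.5 s on screen) can be sketched in a few lines of Python. This is an illustrative sketch, not MarbSocial's actual implementation: it assumes the transcription step yields word-level timestamps, and the names `Word`, `Caption`, and `chunk_words` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float


@dataclass
class Caption:
    text: str
    start: float
    end: float


def chunk_words(words: List[Word], max_words: int = 4,
                max_duration: float = 2.5) -> List[Caption]:
    """Group words into captions of at most max_words and max_duration seconds."""
    captions: List[Caption] = []
    current: List[Word] = []
    for word in words:
        too_many = len(current) >= max_words
        too_long = bool(current) and word.end - current[0].start > max_duration
        if too_many or too_long:
            # Close the current chunk before it breaks either limit.
            captions.append(_to_caption(current, max_duration))
            current = []
        current.append(word)
    if current:
        captions.append(_to_caption(current, max_duration))
    return captions


def _to_caption(chunk: List[Word], max_duration: float) -> Caption:
    start = chunk[0].start
    # Clamp the end so a single very long word still respects the time limit.
    end = min(chunk[-1].end, start + max_duration)
    return Caption(" ".join(w.text for w in chunk), start, end)
```

With six quarter-second words, this yields one 4-word caption followed by one 2-word caption; slower speech closes a chunk earlier because the 2.5 s limit is reached before the word limit.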

What it does

Gemini STT transcription: Russian, English, auto-detect

4-word chunking with transitions (TikTok/Reels style)

Manual Y-offset (0-100%) and font scale (0.5×–2×)

Top / Center / Bottom presets for quick setup

Aspect-aware font size: 9:16 and 1:1 get different base sizes

Burn-in to the final MP4, so captions work in any player

Automatic adaptation to local names and slang from CIS countries

Retry on failed transcription chunks, which leaves fewer gaps in the timeline
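The positioning options in this list (Y-offset, font scale, presets, aspect-aware base size) reduce to simple arithmetic on the frame height. The sketch below shows one way that could work. The preset percentages and base font fractions are assumed values chosen for illustration, not MarbSocial's real settings, and `caption_layout` is a hypothetical name.

```python
# Assumed preset positions, as a percentage of frame height (illustrative only).
PRESETS = {"top": 15.0, "center": 50.0, "bottom": 85.0}

# Assumed base font size as a fraction of frame height, per aspect ratio.
BASE_FONT_FRACTION = {"9:16": 0.045, "1:1": 0.06}


def caption_layout(frame_height: int, aspect: str,
                   y_percent: float, font_scale: float):
    """Return (y_px, font_px) for a caption line on a frame of the given height."""
    if not 0.0 <= y_percent <= 100.0:
        raise ValueError("y_percent must be between 0 and 100")
    if not 0.5 <= font_scale <= 2.0:
        raise ValueError("font_scale must be between 0.5 and 2.0")
    if aspect not in BASE_FONT_FRACTION:
        raise ValueError(f"unsupported aspect ratio: {aspect}")
    y_px = round(frame_height * y_percent / 100.0)
    font_px = round(frame_height * BASE_FONT_FRACTION[aspect] * font_scale)
    return y_px, font_px
```

For example, `caption_layout(1920, "9:16", PRESETS["center"], 1.0)` places the caption at the vertical midpoint of a 1080×1920 frame. Scaling the font from the frame height rather than using a fixed pixel size keeps captions proportionate across output resolutions.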

FAQ

How accurate is Russian transcription?

Clean speech: 95-97%. Noisy audio, strong accents, or a poor microphone: 85-92%. Podcasts, lectures, and interviews are usually recognized well. MarbSocial uses Gemini 1.5 Flash and retries failed chunks, splitting audio into 2.5-minute slices instead of 5-minute ones.
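The slicing and retry behavior can be sketched as follows. This is a minimal illustration under stated assumptions: `transcribe` stands in for whatever function sends one audio slice to the speech-to-text service (no real Gemini API call is shown), and `slice_ranges`, `transcribe_with_retry`, and the limit of three attempts are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def slice_ranges(duration_s: float, slice_s: float = 150.0) -> List[Tuple[float, float]]:
    """Split [0, duration_s] into consecutive slices of at most slice_s seconds."""
    ranges = []
    start = 0.0
    while start < duration_s:
        end = min(start + slice_s, duration_s)
        ranges.append((start, end))
        start = end
    return ranges


def transcribe_with_retry(
    ranges: List[Tuple[float, float]],
    transcribe: Callable[[float, float], str],  # hypothetical STT call for one slice
    max_attempts: int = 3,
) -> List[Tuple[float, float, Optional[str]]]:
    """Transcribe each slice, retrying failures; text is None if all attempts fail."""
    results = []
    for start, end in ranges:
        text = None
        for _ in range(max_attempts):
            try:
                text = transcribe(start, end)
                break
            except Exception:
                # Broad catch for brevity; real code would catch specific errors.
                continue
        results.append((start, end, text))
    return results
```

Shorter slices limit the damage of a failure: if one slice is lost after all retries, the gap in the captions is at most 2.5 minutes rather than 5.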

Can I edit recognized text?

Yes. In the Transcript panel, click any segment to edit it. Changes are saved and applied the next time the clip is rendered.

Do captions work with letterbox mode (black bars)?

Yes. In "fit with bars" mode, captions render on the video itself, not in the black area. The position remains adjustable with the Y-slider as usual.
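One way to keep captions off the black bars is to map the Y-slider onto the fitted video area instead of the full frame. The sketch below assumes that mapping and a vertically centered video. It is an illustration of the idea, not a description of how MarbSocial implements it, and `letterbox_caption_y` is a hypothetical name.

```python
def letterbox_caption_y(frame_w: int, frame_h: int,
                        video_w: int, video_h: int,
                        y_percent: float) -> int:
    """Map a 0-100% slider value to a pixel row inside the fitted video area."""
    if not 0.0 <= y_percent <= 100.0:
        raise ValueError("y_percent must be between 0 and 100")
    # Scale the source video to fit entirely inside the frame.
    scale = min(frame_w / video_w, frame_h / video_h)
    fitted_h = video_h * scale
    # Height of the top black bar, assuming the video is vertically centered.
    bar_top = (frame_h - fitted_h) / 2
    return round(bar_top + fitted_h * y_percent / 100.0)
```

For a 1920×1080 source placed in a 1080×1920 frame, a slider value of 50% lands at row 960, the middle of both the frame and the video, while 0% and 100% land on the top and bottom edges of the video rather than of the frame.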

Start for free

Standard plan — 3 accounts, 100 posts/month. No credit card required.

Try AI captions on your video