A VS Code extension · by OkeyLanders

Make video where you make everything else.

AI video generation built into VS Code. Generate reference frames in the image tab, push them straight into a video, or hand a whole song to the Music Video Orchestrator and let it storyboard, generate, and render a multi-segment cut — with your API keys staying on your machine.

Veo · Sora · Kling · Wan · Seedance · Hailuo
Music Video Orchestrator
Frame Minion composition timeline inside VS Code — audio waveform over a row of 23 generated video clips, with seam-smoothing and render controls.
Powered by the frontier

Frame Minion spans OpenRouter's full image and video catalog — and reaches fal.ai for Wan lip-sync. Switch models from a dropdown; bring your own keys.

Video: Veo 3.1 · Sora 2 · Kling 3 · Wan 2.7 · Seedance 2 · Hailuo 2.3
Image: Gemini · GPT Image · FLUX
Lip-sync: Wan (fal.ai)

Powered by OpenRouter  +  fal.ai — bring your own keys.

What it is

A frame factory and a film studio, both inside your editor.

01

Generate frames.

A chat-style image generator for crafting exactly the start, end, and reference frames your shot needs.

02

Turn frames into video.

Push any image straight into a video as a start frame, end frame, or reference — then generate with the model of your choice.

03

Orchestrate a whole video.

Hand a song to the Music Video Orchestrator and let it plan, generate, and render a multi-segment cut.

image · gallery
Frame Minion image tab: chat-style image generation with a result and its action menu — Use as start/end/reference frame, Send to project segment, Send to project style or character.
The image tab

The frame factory.

Chat-style, iterative image generation across OpenRouter's image models — Gemini, GPT Image, FLUX. Upload references, control aspect ratio, get multi-image responses.

  • One-click on every result: use as start / end / reference frame.
  • Send to a project segment, or to project style and character/location refs.
  • Reference-image upload, aspect-ratio control, multi-image responses.

Your image gallery becomes a frame factory feeding the video form.

video · one-shot
Frame Minion video tab: a single-shot form with prompt, model selector (Wan 2.7 I2V via fal.ai), start frame, end frame, driving audio, and duration/resolution pickers.
The video tab

One shot, deliberate.

Video is expensive per render, so there's no chat thread here — just a focused single-shot form. Prompt, model, reference image, start frame, end frame, audio.

  • Duration, resolution, and aspect-ratio pickers populate from the selected model's declared capabilities.
  • Drop in a start and end frame to pin the motion between two exact images.
  • Lip-sync mode drives the shot from a sliced audio window — handles included.
Account balances
OpenRouter
$9.64 / $50.00
Remaining
fal.ai
$4.90
Balance

↻ auto-refreshes after every generation

Provider balance strip

Always know your spend.

Live OpenRouter and fal.ai account balances sit in the sidebar and auto-refresh after every generation — so you're never surprised by a render bill. You pay the providers directly; the extension is free.

The centerpiece

The Music Video Orchestrator.

Give it a song and a concept. It does the rest of the pre-production — segmenting the track, storyboarding frames, generating a clip per segment, and rendering the final cut with ffmpeg.

22 segments + fade-out · 2:57 track · 16 transitions · ffmpeg render
1

Plan from the song.

Set the look, then hand over the track. Add style references for mood, palette and lighting; add named characters and locations so your subjects stay consistent; upload your song. The planner reads the song's structure and lays out a shot per segment — start-frame prompt, end-frame prompt, camera/action prompt, and timing — ready to refine.

+ style refs · + characters & locations · ↑ upload song
plan · song sections
Plan view: the planner model and the read-only song-sections table mapping intro, verse, pre-chorus, chorus and outro to timecodes, mood, and lyric cues, above the generated segments.
project · style & character refs
Project setup: title and concept for 'Wrong Song', the uploaded audio track, and grids of style refs and named character/location references.
2

Storyboard — and direct every segment.

Generate start and end frames per segment, with continuity built in — each shot's end frame normally matches the next shot's start. Direct any segment: tweak the start/end frame prompts, or swap exactly which style and character/location references that segment uses. Promote a great frame into project-wide refs.

tweak prompts · per-segment refs · ↳ matches next start
segments 01–02 · frames
Two consecutive segment cards, each with a start frame, end frame, video prompt and generated clip; an end frame is annotated 'normally matches segment 02 start' to show frame continuity.
references for segment 01
References modal for a single segment: a list of style references and a list of named characters/locations, each individually toggled on or off for that segment, marked 'overridden'.
3

Generate a clip per segment — with effortless audio sync.

Per-segment video generation runs in parallel. For lip-sync, Frame Minion automatically slices the song to exactly that segment's window and hands the model just that slice — so the mouth matches the real lyrics at that moment. And a complete history of every frame and video generation persists — step back through takes and revert in a click.

0:40 – 0:48 (8.0s) · parallel × 3 · per-slot history
segment 05 · clip
A segment card with start frame, end frame, camera/action video prompt and character tags on the left, and its finished 8-second clip with a ready badge on the right.
history & gallery
History and gallery modal: four stored versions of one segment's slot on the left with seeds, and the full project video gallery of 19 generated clips on the right.
A segment's video slot showing a 4/4 take selector and a 'History 4' button.
History · 4 takes
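To make the lip-sync slicing concrete, here is a minimal sketch of how a segment's window could be turned into an ffmpeg invocation. The `SegmentWindow` shape and the `sliceArgs` helper are hypothetical names for illustration, not Frame Minion's actual internals; only the ffmpeg flags (`-ss`, `-t`, `-i`, `-vn`, `-y`) are real.

```typescript
// Hypothetical shape for a planned segment; the field names are assumptions.
interface SegmentWindow {
  startSec: number; // where the segment begins in the song
  endSec: number;   // where the segment ends in the song
}

// Build ffmpeg arguments that cut just this segment's window out of the song.
// -ss seeks to the window start, -t limits how much audio is read,
// -vn drops any embedded cover-art video stream, -y overwrites the output.
function sliceArgs(songPath: string, seg: SegmentWindow, outPath: string): string[] {
  const duration = seg.endSec - seg.startSec;
  if (duration <= 0) {
    throw new Error("segment window must have positive length");
  }
  return [
    "-y",
    "-ss", seg.startSec.toFixed(3),
    "-t", duration.toFixed(3),
    "-i", songPath,
    "-vn",
    outPath,
  ];
}

// The 0:40 – 0:48 segment shown above becomes an 8-second slice.
const args = sliceArgs("song.mp3", { startSec: 40, endSec: 48 }, "segment-05.wav");
console.log(["ffmpeg", ...args].join(" "));
```

The resulting array can be handed to any process spawner; passing arguments as an array rather than a single string avoids shell-quoting problems with file names.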
4

Soften the cuts.

Place one of 16 transition presets across six families (fades, dissolves, wipes, slides, geometric, and effects like flash, shake, impact-zoom and glitch) on any boundary. Transitions use an overlap model that never changes your runtime. Optional seam smoothing blends hard cuts automatically.

slide · circle-open · glitch · +13 more
transition picker
Transition picker modal: Slide, Geometric and Effects families with preview tiles and a transition-duration stepper.
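One way a runtime-preserving overlap model could work, sketched under a loud assumption: each transition is centered on its boundary and borrows half its length from each neighboring clip, so boundaries stay at the same timestamps. This is an illustration of the idea, not a description of Frame Minion's implementation; `planTransitions` and `TransitionWindow` are hypothetical names.

```typescript
// Hypothetical description of where one transition plays on the timeline.
interface TransitionWindow {
  boundarySec: number; // the cut point between two segments
  fromSec: number;     // transition starts half its length before the cut
  toSec: number;       // and ends half its length after it
}

// durationsSec[i] is segment i's length; transitionSec[i] is the transition
// placed on the boundary after segment i (null means a hard cut).
// Assumption: transitions are centered on boundaries, so they never shift them.
function planTransitions(
  durationsSec: number[],
  transitionSec: (number | null)[],
): { runtimeSec: number; windows: TransitionWindow[] } {
  const windows: TransitionWindow[] = [];
  let cursor = 0;
  durationsSec.forEach((duration, i) => {
    cursor += duration;
    const isLast = i === durationsSec.length - 1;
    const t = isLast ? null : transitionSec[i];
    if (t != null && t > 0) {
      windows.push({ boundarySec: cursor, fromSec: cursor - t / 2, toSec: cursor + t / 2 });
    }
  });
  // The runtime is the plain sum of segment lengths, with or without transitions.
  return { runtimeSec: cursor, windows };
}

const plan = planTransitions([8, 8, 4], [0.5, null]);
console.log(plan.runtimeSec, plan.windows);
```

Under this assumption, adding or removing a transition only changes which frames are blended around a cut, never where the cut sits against the song.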
5

Stay in sync — through every regen and retime.

Regenerate a clip, swap in a different take, or change a segment's duration, and every segment still respects its own stored timing. Frame Minion flags the exact moment a clip drifts out of sync with the track — so the cut never silently slips as you iterate.

⚠ video timing out of sync · 1:45 – 1:53 (8.0s)
out-of-sync warning
A segment card flagged 'Video timing out of sync' with its 1:45–1:53 (8.0s) timing badge highlighted in red.
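The out-of-sync flag boils down to comparing a clip's real length against the segment's stored window. A minimal sketch, assuming a hypothetical `TimedSegment` shape and an arbitrary 50 ms tolerance (neither is taken from Frame Minion itself):

```typescript
// Hypothetical segment record; field names are assumptions for illustration.
interface TimedSegment {
  startSec: number;         // stored window start in the song
  endSec: number;           // stored window end in the song
  clipDurationSec?: number; // length of the generated clip, if one exists
}

// A clip is out of sync when its length differs from the stored window
// by more than the tolerance. Segments without a clip are never flagged.
function isOutOfSync(seg: TimedSegment, toleranceSec = 0.05): boolean {
  if (seg.clipDurationSec === undefined) {
    return false;
  }
  const expectedSec = seg.endSec - seg.startSec;
  return Math.abs(seg.clipDurationSec - expectedSec) > toleranceSec;
}

// The 1:45 – 1:53 window expects an 8-second clip; a 6-second take drifts.
console.log(isOutOfSync({ startSec: 105, endSec: 113, clipDurationSec: 6 }));
```

Running the check after every regeneration, take swap, or retime is what would keep a drift from slipping through unnoticed.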
6

Render — reassembled in order, in sync.

Preview or final-render the whole thing with ffmpeg. The per-segment clips concatenate back in order and the full song lays over the top, so the finished video stays locked to the track end to end — with seam smoothing optional on every boundary.

22 / 23 clips ready · all seams · ▶ final
composition · render
Composition timeline: the song waveform over a row of 23 generated video clips with per-segment timecodes, seam-smoothing and render controls, and the assembled preview playing below.
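For a sense of what "concatenate in order, lay the song over the top" can look like in ffmpeg terms, here is a hedged sketch using ffmpeg's concat demuxer. It covers hard cuts only (transitions would need a filter graph and a re-encode), and the helper names and file names are hypothetical; this is not necessarily how Frame Minion builds its render.

```typescript
// Build the text of a concat-demuxer list file: one `file '...'` line per clip,
// in playback order. A single quote inside a path is escaped as '\''.
function concatList(clipPaths: string[]): string {
  return clipPaths
    .map((p) => `file '${p.replace(/'/g, "'\\''")}'`)
    .join("\n") + "\n";
}

// Build ffmpeg arguments that join the listed clips and mux the full song on top.
// Video is stream-copied, which assumes every clip shares codec and resolution.
function renderArgs(listPath: string, songPath: string, outPath: string): string[] {
  return [
    "-y",
    "-f", "concat", "-safe", "0", "-i", listPath, // input 0: clips in order
    "-i", songPath,                                // input 1: the whole track
    "-map", "0:v:0",                               // picture from the clips
    "-map", "1:a:0",                               // sound from the song
    "-c:v", "copy",
    "-c:a", "aac",
    "-shortest",                                   // stop at the shorter stream
    outPath,
  ];
}

console.log(concatList(["seg-01.mp4", "seg-02.mp4"]));
console.log(["ffmpeg", ...renderArgs("clips.txt", "song.mp3", "final.mp4")].join(" "));
```

Because the audio comes from the original song rather than from the clips, the soundtrack stays continuous across every cut.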

Per-slot media history

Every frame and video slot keeps a version history. Step back and forth through takes, or browse the project gallery.

Prompt wizards

Regenerate a segment's video prompt from its frames + audio analysis; analyze a segment's audio for lyrics and mood.

Save / load projects

Projects persist to a workspace folder. Reopen and everything — frames, clips, history, transitions, timing — comes back.

See it render

From a song to a finished cut.

Three music videos, planned, generated, and rendered start-to-finish inside VS Code. No timeline app, no render farm.

Featuring songs by Drew's Song from the album Drew's Song, available on Apple Music and Spotify.

Wrong Song · 9:16 · ffmpeg render
Different Pages · 22 segments
God You Are · full track muxed
The details that add up

Small things, done right.

Built right

Not a flimsy wrapper. A real tool.

Your keys never leave your machine.

API keys live in VS Code's encrypted SecretStorage — not in settings, not on a server.
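For readers curious what SecretStorage usage looks like, here is a minimal sketch. The `SecretStore` interface mirrors the `get` / `store` / `delete` methods of VS Code's `SecretStorage` so the example runs outside the editor; the key name and helper functions are hypothetical, and inside a real extension you would pass `context.secrets` instead of the in-memory stand-in.

```typescript
// Structural subset of vscode.SecretStorage (get / store / delete).
interface SecretStore {
  get(key: string): PromiseLike<string | undefined>;
  store(key: string, value: string): PromiseLike<void>;
  delete(key: string): PromiseLike<void>;
}

// Hypothetical key name; the real extension may use a different one.
const OPENROUTER_KEY = "frameMinion.openRouterApiKey";

// Save a key, or remove it when the user clears the field.
async function saveApiKey(secrets: SecretStore, name: string, value: string): Promise<void> {
  const trimmed = value.trim();
  if (trimmed === "") {
    await secrets.delete(name);
    return;
  }
  await secrets.store(name, trimmed);
}

// Read a key back; undefined means the user has not provided one yet.
async function loadApiKey(secrets: SecretStore, name: string): Promise<string | undefined> {
  return secrets.get(name);
}

// In-memory stand-in for demos and tests only. It is NOT encrypted.
class MemorySecrets implements SecretStore {
  private readonly values = new Map<string, string>();
  async get(key: string): Promise<string | undefined> {
    return this.values.get(key);
  }
  async store(key: string, value: string): Promise<void> {
    this.values.set(key, value);
  }
  async delete(key: string): Promise<void> {
    this.values.delete(key);
  }
}
```

Keeping the helpers dependent on a small interface rather than on the editor API directly also makes them easy to unit-test.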

No backend, no middleman.

Calls go straight from your editor to OpenRouter and fal.ai. Your prompts and media are yours.

Clean architecture.

Layered, typed, tested — built to last. Forked from the pixel-minion framework.

Get started

A short setup, then you're rendering.

1

Install Frame Minion

Search Frame Minion in the VS Code Extensions view, or grab it from the web Marketplace.

2

Install ffmpeg

Run brew install ffmpeg (or your platform's package manager) — Frame Minion uses it to preview and render the final cut.

3

Add your API keys

Open the Frame Minion view and paste your OpenRouter and fal.ai keys. Both live in encrypted SecretStorage.

4

Pick a model, write a prompt, generate

Generate a frame, push it to video, or hand a song to the Orchestrator.

Add to VS Code · View on GitHub (soon)

Bring your own OpenRouter and fal.ai keys — fal.ai signs the short-lived storage URLs that deliver your frames and audio to the video models, so it's required. ffmpeg installs separately via Homebrew.

frame-minion
# 1 — install the extension
ext install okeylanders.frame-minion
# 2 — install ffmpeg (preview + render)
brew install ffmpeg
# 3 — add your keys
OpenRouter key stored in SecretStorage
fal.ai key stored (required)
# 4 — generate
frame → video → orchestrate 🜂
Questions

Good to know.

Your editor just learned to direct

From prompt to music video, without leaving VS Code.

Add to VS Code · View on GitHub (soon)