Frame Minion — Providers & Setup

Frame Minion stands on four external pieces: OpenRouter (the brain and the image/video engine), fal.ai (a specialist video host and the default file uploader), a storage provider (how your reference media reaches the video models), and ffmpeg (the local render engine).

You don't need all four to start — in fact, generating images needs only one. Knowing what each piece does makes the setup choices obvious.

The mental model

Three remote providers do different jobs, and one local tool assembles the result. OpenRouter thinks, draws, and (mostly) animates; fal.ai handles the video work OpenRouter can't and is the default file host for video references; S3 is the opt-out if you'd rather host those files yourself; ffmpeg stitches everything together locally and is the only piece you explicitly install — one click, fully removable.

OpenRouter — the brain & the image/video engine

A single API in front of many AI labs. Frame Minion routes almost everything through it with one key.

Planning & language — the music-video planner, per-segment audio analysis (lyrics + mood), and the prompt-enhancement wizards all call a text/multimodal model here. Default: openai/gpt-5.1.
Image generation — every start frame, end frame, and reference image. Default: google/gemini-3.1-flash-image-preview ("Nano Banana 2"); GPT and FLUX image models are also available.
Video generation — single-shot and per-segment clips run through OpenRouter's /videos passthrough (Veo, Sora, Wan, etc.). Default: google/veo-3.1-fast.

What you need: one OpenRouter API key, pasted into Settings once (stored in VS Code's secret storage, never in plain settings). Your balance shows in the sidebar and refreshes after each generation. This is the only effectively required provider — without it there's no planning, no images, and no default video path.

fal.ai — the specialist video host + default uploader

A generative-media inference platform. It plays two distinct roles — worth keeping separate in your head.

Wan-family video, especially lip-sync. Some Wan models, particularly those used for lip-synced singing shots, aren't reliably available through OpenRouter, so Frame Minion calls fal.ai directly for those. (Wan lip-sync accepts clips roughly 2–15 seconds long; the app automatically slices each segment's audio to fit.)
The default file uploader. Its zero-setup CDN is how reference frames and audio get a URL the video models can fetch — see Storage below.
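
To make the 2–15 second lip-sync window concrete, here is a minimal sketch of fitting one segment's audio slice into that range. The helper name, the `AudioSlice` type, and the strategy (truncate long slices, extend short ones, stay inside the track) are assumptions for illustration; the source does not describe how Frame Minion actually slices.

```typescript
// Bounds taken from the text ("roughly 2–15 seconds"); treat them as approximate.
const MIN_CLIP_SECONDS = 2;
const MAX_CLIP_SECONDS = 15;

// Start and end of a slice, in seconds from the beginning of the track.
interface AudioSlice {
  start: number;
  end: number;
}

// Hypothetical helper: fit one segment's audio window into the allowed range.
function fitSliceToLipSyncLimits(segment: AudioSlice, trackDuration: number): AudioSlice {
  if (!(segment.end > segment.start)) {
    throw new Error("segment must have positive length");
  }
  const wanted = segment.end - segment.start;
  // Clamp the slice length into [MIN, MAX].
  const length = Math.min(MAX_CLIP_SECONDS, Math.max(MIN_CLIP_SECONDS, wanted));
  // Keep the original start where possible; pull it back if the slice
  // would otherwise run past the end of the track.
  const start = Math.max(0, Math.min(segment.start, trackDuration - length));
  const end = Math.min(trackDuration, start + length);
  return { start, end };
}
```

A 30-second segment starting at 10 s is cut to 10–25 s, while a 1-second segment at the very end of a 60-second track is widened to 58–60 s.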

What you need: one fal.ai API key — the same key covers both roles, and your fal balance also appears in the sidebar. If you only ever use OpenRouter image + Veo video and never touch Wan or fal storage, you can skip it — but the default storage provider is fal, so most users who generate video will want a fal key too.

Storage — how your reference media reaches the video models

Video providers fetch reference frames and audio by URL, not as inline bytes. Storage picks where those files get uploaded first.

Whenever a clip uses a start/end frame, a reference image, or a lip-sync audio slice, Frame Minion has to upload that file somewhere reachable first. The storageProvider setting picks where.

Provider	Setup	Best for
fal.ai CDN `fal` · default	Zero config — just needs your fal.ai key.	Almost everyone. Nothing to provision.
Amazon S3 `s3`	Bring your own bucket: AWS credentials + region + bucket name.	Teams who want media in infrastructure they own and control.

A few things worth knowing

The setting is read live on every upload, so you can switch providers without restarting.
fal-default is not zero-config for OpenRouter-only users. If you generate video, have no fal key, and haven't set up S3, uploads fail with a clear "add a fal key or switch to S3" message. Pick one.
Existing S3 users are protected. If you previously configured AWS credentials, Frame Minion keeps you on S3 rather than silently switching you to the fal default.
The S3 box (keys, region, bucket, check/create) is a self-contained block in Settings, dimmed when fal is the active provider.
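
The rules above can be sketched as a small decision function evaluated on every upload. This is an illustration only: the type names, the `configured` field, and the way "existing S3 users" are detected (AWS credentials plus a bucket name) are assumptions, not Frame Minion's actual logic.

```typescript
type StorageProvider = "fal" | "s3";

// Snapshot of what is known at upload time (all field names are hypothetical).
interface StorageState {
  configured?: StorageProvider; // explicit storageProvider choice, if the user made one
  hasFalKey: boolean;
  hasAwsCredentials: boolean;
  s3BucketName: string;
}

interface UploadTarget {
  provider: StorageProvider | null; // null when the upload cannot proceed
  error: string | null;
}

function resolveUploadTarget(state: StorageState): UploadTarget {
  const s3Ready = state.hasAwsCredentials && state.s3BucketName.trim() !== "";
  // No explicit choice: keep previously configured S3 users on S3,
  // otherwise fall back to the fal default.
  const provider: StorageProvider = state.configured ?? (s3Ready ? "s3" : "fal");
  if (provider === "fal" && !state.hasFalKey) {
    return { provider: null, error: "Add a fal key or switch to S3." };
  }
  if (provider === "s3" && !s3Ready) {
    return { provider: null, error: "S3 is selected but credentials or a bucket name are missing." };
  }
  return { provider, error: null };
}
```

Because the function takes a fresh snapshot each time, switching providers takes effect on the next upload without a restart.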

ffmpeg — the local render engine

Everything in the Composition / render stage runs through ffmpeg on your own machine.

Frame Minion uses it to:

Sequence the per-segment clips into one continuous video,
Generate transitions and effects at clip boundaries (dip-to-black, wipes, flash, shake, impact-zoom, glitch, and the rest),
Smooth seams between adjacent clips,
Extract start/end frames from any existing video you attach,
Encode both the in-editor preview and the universal final MP4.
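
As an illustration of the sequencing and transition steps, the sketch below builds an ffmpeg `filter_complex` string that chains clips with the `xfade` filter. The function and type names are hypothetical, and mapping "dip-to-black" to xfade's `fadeblack` transition is an assumption; the source does not show the filter graphs Frame Minion actually emits.

```typescript
interface Clip {
  durationSeconds: number;
}

// Build a filter graph that joins N clips with N-1 xfade transitions.
// Input i is expected to be passed to ffmpeg as the i-th `-i` argument.
function buildXfadeGraph(clips: Clip[], transition: string, fadeSeconds: number): string {
  if (clips.length < 2) {
    throw new Error("need at least two clips");
  }
  const parts: string[] = [];
  let previousLabel = "0:v";
  let timeline = clips[0].durationSeconds; // length of the joined video so far
  for (let i = 1; i < clips.length; i++) {
    // xfade's offset is measured from the start of its first input,
    // so each transition begins fadeSeconds before the joined video ends.
    const offset = timeline - fadeSeconds;
    const outLabel = i === clips.length - 1 ? "vout" : `x${i}`;
    parts.push(
      `[${previousLabel}][${i}:v]xfade=transition=${transition}:duration=${fadeSeconds}:offset=${offset}[${outLabel}]`
    );
    previousLabel = outLabel;
    // Each transition overlaps the clips, shortening the total by fadeSeconds.
    timeline = offset + clips[i].durationSeconds;
  }
  return parts.join(";");
}
```

For clips of 5, 4, and 6 seconds with a 1-second fade, the offsets come out as 4 and 7. Note that `xfade` handles video only and expects its inputs to share resolution, frame rate, and pixel format, so a real pipeline would normalize the clips first and handle audio separately.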

Why ffmpeg is also needed for the in-editor preview. VS Code's built-in video player doesn't play AAC (mp4a) audio inside an MP4 — so a raw render would preview silently. Frame Minion uses ffmpeg to add an MP3 audio track to the preview specifically so you can hear it inside VS Code. (The universal final MP4 doesn't need this — it's encoded for broad/Apple compatibility and plays with sound anywhere.)
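
A minimal sketch of what such a preview step could look like: an argument list for an ffmpeg call that leaves the video stream untouched and re-encodes the audio as MP3. The helper name, the choice to copy the video stream, and the bitrate are assumptions; the flags themselves are standard ffmpeg options, and `libmp3lame` must be included in the build.

```typescript
// Hypothetical helper: arguments for producing a preview file whose
// audio track is MP3 instead of AAC.
function buildPreviewAudioArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",                 // overwrite the output file if it exists
    "-i", inputPath,      // the rendered MP4 with AAC audio
    "-c:v", "copy",       // keep the video stream as-is (assumption)
    "-c:a", "libmp3lame", // re-encode audio with the MP3 encoder
    "-b:a", "192k",       // assumed bitrate; adjust to taste
    outputPath,
  ];
}
```

The resulting array can be handed to whatever process-spawning API the host uses, with the resolved ffmpeg binary as the executable.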

ffmpeg is not downloaded automatically

Frame Minion resolves a binary in this order, and never fetches anything on its own:

1. frameMinion.ffmpegPath setting: an explicit path you provide.
2. System ffmpeg on your PATH. Version 6+ recommended; the filters Frame Minion emits (xfade, minterpolate, mpdecimate) need at least 4.3.
3. A copy you previously installed through Frame Minion.

If none resolve, Frame Minion tells you ffmpeg is missing — a banner where rendering is gated, and a status indicator in Settings — and offers a one-click Download & Install. That fetches a pinned, SHA-256-verified static build for your platform and caches it for all future renders. You can Uninstall it later from Settings; that removes only Frame Minion's downloaded copy and never touches a system ffmpeg.
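
The resolution order and the 4.3 minimum can be sketched as two pure functions. Everything here is illustrative: the type and function names are hypothetical, the candidates are assumed to be looked up by the caller, and the sketch does not check whether a path actually exists on disk.

```typescript
type FfmpegSource = "setting" | "system" | "installed" | "missing";

interface FfmpegCandidates {
  settingPath: string;          // frameMinion.ffmpegPath, "" when unset
  systemPath: string | null;    // result of a PATH lookup, null if not found
  installedPath: string | null; // copy installed through Frame Minion, null if absent
}

interface FfmpegResolution {
  source: FfmpegSource;
  path: string | null;
}

// Walk the candidates in priority order; never download anything here.
function resolveFfmpeg(c: FfmpegCandidates): FfmpegResolution {
  const explicit = c.settingPath.trim();
  if (explicit !== "") return { source: "setting", path: explicit };
  if (c.systemPath) return { source: "system", path: c.systemPath };
  if (c.installedPath) return { source: "installed", path: c.installedPath };
  // Caller shows the "ffmpeg is missing" banner and the install offer.
  return { source: "missing", path: null };
}

// Check the first line of `ffmpeg -version` against a minimum version.
// Returns null when no numeric version is present (e.g. git snapshot builds).
function meetsMinimumVersion(versionLine: string, minMajor = 4, minMinor = 3): boolean | null {
  const match = /ffmpeg version n?(\d+)\.(\d+)/.exec(versionLine);
  if (!match) return null;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major > minMajor || (major === minMajor && minor >= minMinor);
}
```

Returning `null` for unparseable version strings lets the caller decide whether to warn or proceed, rather than rejecting a build that may be perfectly recent.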

Settings defaults at a glance

Setting	Default	What it controls
openRouterModel	openai/gpt-5.1	Planner / analysis / prompt-enhance text model.
imageModel	google/gemini-3.1-flash-image-preview	Frame & reference image generation.
videoModel	google/veo-3.1-fast	Per-segment / single-shot video generation.
storageProvider	fal	Where video reference media is uploaded (`fal` or `s3`).
maxConcurrentVideos	1 (1–5)	How many segment videos generate in parallel.
outputDirectory	frame-minion	Folder (under your workspace) where projects and media are written.
awsRegion	us-east-1	S3 region (only used when `storageProvider` is `s3`).
s3BucketName	""	Your S3 bucket (only used when `storageProvider` is `s3`).
maxConversationTurns	10	Conversation history depth for chat-style flows.
ffmpegPath	""	Optional explicit path; empty = auto-resolve (system → installed → prompt).

API keys (OpenRouter, fal.ai, AWS) are never stored in settings — they live in VS Code's encrypted secret storage and are entered through the Settings panel.
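
For readers who think in types, the non-model settings from the table can be written as a typed defaults object with a small normalizer. This is a sketch: the interface and function names are hypothetical, the model settings are omitted for brevity, and clamping out-of-range values (rather than rejecting them) is an assumption about behavior.

```typescript
interface FrameMinionSettings {
  storageProvider: "fal" | "s3";
  maxConcurrentVideos: number; // allowed range 1–5
  outputDirectory: string;
  awsRegion: string;
  s3BucketName: string;
  maxConversationTurns: number;
  ffmpegPath: string;          // "" means auto-resolve
}

// Defaults copied from the table above.
const DEFAULT_SETTINGS: FrameMinionSettings = {
  storageProvider: "fal",
  maxConcurrentVideos: 1,
  outputDirectory: "frame-minion",
  awsRegion: "us-east-1",
  s3BucketName: "",
  maxConversationTurns: 10,
  ffmpegPath: "",
};

// Merge user overrides onto the defaults and clamp the concurrency range.
function normalizeSettings(
  overrides: Partial<FrameMinionSettings>
): { settings: FrameMinionSettings; warnings: string[] } {
  const settings: FrameMinionSettings = { ...DEFAULT_SETTINGS, ...overrides };
  const warnings: string[] = [];
  const requested = isFinite(settings.maxConcurrentVideos)
    ? settings.maxConcurrentVideos
    : DEFAULT_SETTINGS.maxConcurrentVideos;
  const clamped = Math.min(5, Math.max(1, Math.round(requested)));
  if (clamped !== settings.maxConcurrentVideos) {
    warnings.push("maxConcurrentVideos adjusted to " + clamped + " (allowed range is 1 to 5)");
    settings.maxConcurrentVideos = clamped;
  }
  return { settings, warnings };
}
```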

What needs a key vs. what's automatic

Piece	Status	Notes
OpenRouter	Required	Planning, images, default video. Image generation needs only this.
fal.ai	Recommended	Default uploader for video references; required for Wan / lip-sync. Skip it if you only generate images.
Amazon S3	Optional	Opt-out alternative to fal for hosting video reference media.
ffmpeg	One-click install	Used locally for rendering. Auto-detected if present; otherwise installed (and removable) from Settings.
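
The table can also be read as a lookup from "what do I want to do" to "what is still missing". The sketch below encodes that reading; the task names, the `Available` shape, and the split between video with and without reference media are assumptions drawn from the descriptions in this document, not from Frame Minion's code.

```typescript
type Task = "plan" | "image" | "video" | "videoWithReferences" | "lipSync" | "render";

// What the user currently has set up (field names are hypothetical).
interface Available {
  openRouterKey: boolean;
  falKey: boolean;
  s3Configured: boolean;
  ffmpeg: boolean;
}

// List what is still missing before a given task can run.
function missingFor(task: Task, have: Available): string[] {
  const missing: string[] = [];
  const needsOpenRouter =
    task === "plan" || task === "image" || task === "video" || task === "videoWithReferences";
  if (needsOpenRouter && !have.openRouterKey) missing.push("OpenRouter key");
  // Reference media must be uploaded somewhere: fal (default) or S3.
  if (task === "videoWithReferences" && !have.falKey && !have.s3Configured) {
    missing.push("fal key or S3 storage");
  }
  // Wan lip-sync is called on fal.ai directly.
  if (task === "lipSync" && !have.falKey) missing.push("fal key");
  if (task === "render" && !have.ffmpeg) missing.push("ffmpeg");
  return missing;
}
```

With only an OpenRouter key, `missingFor("image", ...)` returns an empty list, while `missingFor("videoWithReferences", ...)` reports that a fal key or S3 storage is still needed.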

The shortest path: one OpenRouter key gets you planning and images. Add a fal key the moment you want video (or lip-sync). Reach for S3 only if you'd rather host reference files yourself. And ffmpeg is the single thing you install — one click, fully removable, only ever at your request.