Providers & Setup
External services & the local render engine

Frame Minion stands on four external pieces: OpenRouter (the brain and the image/video engine), fal.ai (a specialist video host and the default file uploader), a storage provider (how your reference media reaches the video models), and ffmpeg (the local render engine).

You don't need all four to start — in fact, generating images needs only one. Knowing what each piece does makes the setup choices obvious.

The mental model
Three remote providers do different jobs, and one local tool assembles the result. OpenRouter thinks, draws, and (mostly) animates; fal.ai handles the video work OpenRouter can't and is the default file host for video references; S3 is the opt-out if you'd rather host those files yourself; ffmpeg stitches everything together locally and is the only piece you explicitly install — one click, fully removable.
Myth-buster: image generation needs only an OpenRouter key
References go inline as base64 — no upload, no fal key, no S3, and no ffmpeg. If you only generate images, one key is the entire setup.
OpenRouter — the brain & the image/video engine
A single API in front of many AI labs. Frame Minion routes almost everything through it with one key.
  • Planning & language — the music-video planner, per-segment audio analysis (lyrics + mood), and the prompt-enhancement wizards all call a text/multimodal model here. Default: openai/gpt-5.1.
  • Image generation — every start frame, end frame, and reference image. Default: google/gemini-3.1-flash-image-preview ("Nano Banana 2"); GPT and FLUX image models are also available.
  • Video generation — single-shot and per-segment clips run through OpenRouter's /videos passthrough (Veo, Sora, Wan, etc.). Default: google/veo-3.1-fast.
What you need: one OpenRouter API key, pasted into Settings once (stored in VS Code's secret storage, never in plain settings). Your balance shows in the sidebar and refreshes after each generation. This is the only effectively required provider — without it there's no planning, no images, and no default video path.
fal.ai — the specialist video host + default uploader
A generative-media inference platform. It plays two distinct roles — worth keeping separate in your head.
  • Wan-family video, especially lip-sync. Some Wan models — particularly lip-synced singing shots — aren't reliably available through OpenRouter, so Frame Minion calls fal.ai directly for those. (Wan lip-sync accepts clips roughly 2–15 seconds long; the app slices each segment's audio to fit automatically.)
  • The default file uploader. Its zero-setup CDN is how reference frames and audio get a URL the video models can fetch — see Storage below.
What you need: one fal.ai API key — the same key covers both roles, and your fal balance also appears in the sidebar. If you only ever use OpenRouter image + Veo video and never touch Wan or fal storage, you can skip it — but the default storage provider is fal, so most users who generate video will want a fal key too.
Storage — how your reference media reaches the video models
Video providers fetch reference frames and audio by URL, not as inline bytes. Storage picks where those files get uploaded first.

Whenever a clip uses a start/end frame, a reference image, or a lip-sync audio slice, Frame Minion has to upload that file somewhere reachable first. The storageProvider setting picks where.

| Provider | Setup | Best for |
|---|---|---|
| fal.ai CDN (fal, default) | Zero config: just needs your fal.ai key. | Almost everyone. Nothing to provision. |
| Amazon S3 (s3) | Bring your own bucket: AWS credentials + region + bucket name. | Teams who want media in infrastructure they own and control. |

A few things worth knowing

  • The setting is read live on every upload, so you can switch providers without restarting.
  • The fal default is not zero-config for OpenRouter-only users. If you generate video with no fal key and no S3 setup, uploads fail with a clear "add a fal key or switch to S3" message. Pick one before your first video.
  • Existing S3 users are protected. If you previously configured AWS credentials, Frame Minion keeps you on S3 rather than silently switching you to the fal default.
  • The S3 box (keys, region, bucket, check/create) is a self-contained block in Settings, dimmed when fal is the active provider.
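
The upload-time check described in the list above can be sketched as a small pure function. This is a minimal illustration, not Frame Minion's actual code: the type and function names are hypothetical, and the S3-side message is an assumption (only the fal-side wording is quoted in the docs).

```typescript
// Hypothetical names throughout; only the "fal" / "s3" values and the
// fal-side failure message come from the documentation.
type StorageProvider = "fal" | "s3";

interface UploadContext {
  storageProvider: StorageProvider; // read fresh on every upload
  hasFalKey: boolean;
  hasS3Config: boolean; // credentials + region + bucket all present
}

type UploadDecision =
  | { ok: true; provider: StorageProvider }
  | { ok: false; message: string };

function chooseUploader(ctx: UploadContext): UploadDecision {
  if (ctx.storageProvider === "fal") {
    return ctx.hasFalKey
      ? { ok: true, provider: "fal" }
      : { ok: false, message: "Add a fal key or switch to S3." };
  }
  // Assumed behaviour: an incomplete S3 setup also fails with a clear message.
  return ctx.hasS3Config
    ? { ok: true, provider: "s3" }
    : { ok: false, message: "Finish the S3 setup or switch to fal." };
}
```

Because the context is passed in on each call, switching providers takes effect on the next upload with no restart, which matches the "read live" behaviour noted above.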
Image generation never needs an uploader
Storage providers exist only so video models can fetch reference media by URL. The Image Generation tab sends all its references inline as base64 directly to OpenRouter — nothing is uploaded anywhere. Image-only? One OpenRouter key, no storage setup at all.
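
To make "inline as base64" concrete, here is a minimal sketch of turning image bytes into a data URL and wrapping it as a message part. The helper names are hypothetical, and the `image_url` part shape is the OpenAI-style chat convention; whether Frame Minion builds its requests exactly this way is an assumption.

```typescript
// Sketch only: helper names are illustrative, not Frame Minion's API.

// Encode raw image bytes as a data URL, so no upload step is needed.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Wrap the data URL as an OpenAI-style image content part (assumed shape).
function imageReferencePart(bytes: Uint8Array, mimeType: string) {
  return {
    type: "image_url" as const,
    image_url: { url: toDataUrl(bytes, mimeType) },
  };
}
```

The reference travels inside the request body itself, which is why no storage provider is involved on the image path.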
ffmpeg — the local render engine
Everything in the Composition / render stage runs through ffmpeg on your own machine.

Frame Minion uses it to:

  • Sequence the per-segment clips into one continuous video,
  • Generate transitions and effects at clip boundaries (dip-to-black, wipes, flash, shake, impact-zoom, glitch, and the rest),
  • Smooth seams between adjacent clips,
  • Extract start/end frames from any existing video you attach,
  • Encode both the in-editor preview and the universal final MP4.
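
As an illustration of the transition step in the list above, the sketch below builds an ffmpeg `xfade` filter expression for two adjacent clips. `xfade` and its `fadeblack` transition are real ffmpeg features; the function name, the stream labels, and the choice to start the fade at "first clip length minus fade length" are assumptions for this example, not Frame Minion's actual filtergraph.

```typescript
// Build an xfade filter for two inputs ([0:v] and [1:v]).
// The transition starts `fadeSeconds` before the first clip ends.
function xfadeFilter(
  firstClipSeconds: number,
  fadeSeconds: number,
  transition: string, // e.g. "fadeblack" for a dip-to-black
): string {
  if (fadeSeconds <= 0 || fadeSeconds >= firstClipSeconds) {
    throw new RangeError("fade must be positive and shorter than the first clip");
  }
  const offset = firstClipSeconds - fadeSeconds;
  return `[0:v][1:v]xfade=transition=${transition}:duration=${fadeSeconds}:offset=${offset}[v]`;
}
```

The resulting string would be passed to ffmpeg via `-filter_complex`, with `-map "[v]"` selecting the blended output.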
About the audio
The generated clips are not silent — they're produced with sound. The reason the song gets laid back over the final render is quality, not necessity: re-muxing one continuous audio track across the whole video prevents the small distortions and gaps that would otherwise appear at every clip seam.

Why ffmpeg is also needed for the in-editor preview. VS Code's built-in video player doesn't play AAC (mp4a) audio inside an MP4 — so a raw render would preview silently. Frame Minion uses ffmpeg to add an MP3 audio track to the preview specifically so you can hear it inside VS Code. (The universal final MP4 doesn't need this — it's encoded for broad/Apple compatibility and plays with sound anywhere.)
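
The two audio steps above map onto fairly ordinary ffmpeg invocations. The sketch below builds the argument lists only; the function names and file paths are hypothetical, and the exact flags Frame Minion uses are not documented here. It also assumes the preview simply re-encodes the audio to MP3 while copying the video stream, and that the ffmpeg build includes `libmp3lame`.

```typescript
// Lay one continuous song over the stitched video:
// video stream from input 0, audio stream from input 1.
function remuxSongArgs(video: string, song: string, out: string): string[] {
  return [
    "-y",
    "-i", video,
    "-i", song,
    "-map", "0:v:0", // keep the rendered picture
    "-map", "1:a:0", // replace clip audio with the full song
    "-c:v", "copy",  // no video re-encode
    "-c:a", "aac",
    "-shortest",     // stop at the shorter of the two inputs
    out,
  ];
}

// Make a preview whose audio is MP3, so VS Code's player can play it.
function previewArgs(render: string, out: string): string[] {
  return ["-y", "-i", render, "-c:v", "copy", "-c:a", "libmp3lame", out];
}
```

Either list can be handed to `child_process.spawn` along with the resolved ffmpeg path.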

ffmpeg is not downloaded automatically

Frame Minion resolves a binary in this order, and never fetches anything on its own:

  1. frameMinion.ffmpegPath setting — an explicit path you provide.
  2. System ffmpeg on your PATH. Version 6+ recommended; the filters Frame Minion emits (xfade, minterpolate, mpdecimate) need at least 4.3.
  3. A copy you previously installed through Frame Minion.
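
The lookup order above can be sketched as a first-match search. All names here are hypothetical, and the existence check is injected so the logic stays testable; the real extension may validate candidates differently (for example by running them).

```typescript
// Candidate locations, in the documented priority order.
interface FfmpegCandidates {
  settingPath: string;    // frameMinion.ffmpegPath; "" means "not set"
  systemPath?: string;    // ffmpeg found on PATH, if any
  installedPath?: string; // copy previously installed through Frame Minion
}

// Return the first candidate that exists, or undefined if none do.
// Nothing is downloaded here: a miss is reported to the caller.
function resolveFfmpeg(
  c: FfmpegCandidates,
  exists: (path: string) => boolean,
): string | undefined {
  const ordered = [c.settingPath, c.systemPath, c.installedPath];
  return ordered.find((p): p is string => !!p && exists(p));
}
```

An `undefined` result is the point at which the UI would show the "ffmpeg is missing" state rather than fetching anything.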

If none resolve, Frame Minion tells you ffmpeg is missing — a banner where rendering is gated, and a status indicator in Settings — and offers a one-click Download & Install. That fetches a pinned, SHA-256-verified static build for your platform and caches it for all future renders. You can Uninstall it later from Settings; that removes only Frame Minion's downloaded copy and never touches a system ffmpeg.
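
"SHA-256-verified" means the downloaded bytes are hashed and compared against a digest pinned ahead of time. A minimal sketch using Node's built-in `crypto` module follows; the function names are hypothetical and the pinned digests themselves are not shown in this document.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a downloaded payload.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Accept the download only if it matches the pinned digest.
function verifyDownload(data: Uint8Array, pinnedHex: string): boolean {
  return sha256Hex(data) === pinnedHex.toLowerCase();
}
```

A mismatch should cause the download to be discarded rather than cached, so a corrupted or tampered build never reaches the render stage.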

Nothing installs behind your back
Either you already have ffmpeg (used automatically), or you opt in with one click. The only time rendering touches the network is that single, user-initiated install.

Settings defaults at a glance

| Setting | Default | What it controls |
|---|---|---|
| openRouterModel | openai/gpt-5.1 | Planner / analysis / prompt-enhance text model. |
| imageModel | gemini-3.1-flash-image | Frame & reference image generation. |
| videoModel | google/veo-3.1-fast | Per-segment / single-shot video generation. |
| storageProvider | fal | Where video reference media is uploaded (fal or s3). |
| maxConcurrentVideos | 1 (range 1–5) | How many segment videos generate in parallel. |
| outputDirectory | frame-minion | Folder (under your workspace) where projects and media are written. |
| awsRegion | us-east-1 | S3 region (only used when storageProvider is s3). |
| s3BucketName | "" | Your S3 bucket (only used when storageProvider is s3). |
| maxConversationTurns | 10 | Conversation history depth for chat-style flows. |
| ffmpegPath | "" | Optional explicit path; empty = auto-resolve (system → installed → prompt). |

API keys (OpenRouter, fal.ai, AWS) are never stored in settings — they live in VS Code's encrypted secret storage and are entered through the Settings panel.
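
For a feel of what maxConcurrentVideos governs, here is a minimal sketch of a bounded worker pool. The 1–5 range comes from the table above; the function names and the pooling strategy are assumptions for illustration, not Frame Minion's scheduler.

```typescript
// Keep the setting inside its documented 1–5 range.
function clampConcurrency(value: number): number {
  if (!Number.isFinite(value)) return 1;
  return Math.min(5, Math.max(1, Math.floor(value)));
}

// Run tasks with at most `limit` in flight; results keep task order.
async function runLimited<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < tasks.length) {
      const i = next++; // claim the next unstarted task
      results[i] = await tasks[i]();
    }
  };
  const poolSize = Math.min(clampConcurrency(limit), tasks.length);
  await Promise.all(Array.from({ length: poolSize }, () => worker()));
  return results;
}
```

With the default of 1, segments generate strictly one after another; raising the value trades provider load for wall-clock time.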

What needs a key vs. what's automatic

| Piece | Status | Notes |
|---|---|---|
| OpenRouter | Required | Planning, images, default video. Image generation needs only this. |
| fal.ai | Recommended | Default uploader for video references; required for Wan / lip-sync. Skip it if you only generate images. |
| Amazon S3 | Optional | Opt-out alternative to fal for hosting video reference media. |
| ffmpeg | One-click install | Used locally for rendering. Auto-detected if present; otherwise installed (and removable) from Settings. |

The shortest path: one OpenRouter key gets you planning and images. Add a fal key the moment you want video (or lip-sync). Reach for S3 only if you'd rather host reference files yourself. And ffmpeg is the single thing you install — one click, fully removable, only ever at your request.