WAN 2.2 produces some of the best AI video quality available — but most setups cap you at around 5 seconds before memory runs out. The SVI (Stable Video Infinity) 2.0 Pro LoRAs completely change that by chaining clips together seamlessly so you can generate long-form video that technically runs as long as you want. And with quantized GGUF models, this works on cards with as little as 6 GB VRAM.
How SVI Chaining Works
Instead of generating a 1-minute video in one shot (which would require enormous VRAM), SVI uses a smarter approach:
- The workflow generates the video in 5-second segments.
- After each segment, the SVI LoRA reads the last few frames — motion, lighting, character position — and uses them as the starting point for the next segment.
- The transitions are seamless: the result looks like one continuous shot, not separate clips glued together.
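The chaining idea can be sketched in a few lines of Python. This is a conceptual illustration, not the actual ComfyUI node code: `generate_segment` is a hypothetical placeholder for one 5-second WAN 2.2 sampling pass, frames are stand-in integers, and the overlap count is an assumption.

```python
OVERLAP = 4  # trailing frames carried into the next segment (assumed value)

def generate_segment(prompt, seed_frames):
    # Placeholder: a real pass would run the diffusion sampler conditioned
    # on the prompt and on seed_frames (motion, lighting, position).
    # Here we just number frames so the continuity is visible.
    start = seed_frames[-1] + 1 if seed_frames else 0
    return list(range(start, start + 81))  # ~5 s of frames per segment

def chain_segments(prompts):
    video, context = [], []
    for prompt in prompts:
        segment = generate_segment(prompt, context)
        video.extend(segment)
        context = segment[-OVERLAP:]  # SVI-style: tail frames seed the next clip
    return video

frames = chain_segments(["walks in", "turns around", "sits down", "waves"])
```

Because each segment starts exactly where the previous one's tail frames left off, concatenating the segments yields one unbroken frame sequence rather than four separate clips.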
Step 1 — Install ComfyUI (Portable Windows)
- Download the ComfyUI Portable ZIP from the ComfyUI releases page and extract it with 7-Zip.
- Navigate into the `custom_nodes` folder, click the address bar, type `cmd`, and press Enter.
- Run `git clone` with the ComfyUI Manager repository URL.
- Navigate back to the main ComfyUI folder (where the embedded Python folder is) and run the command from the written guide to install the Manager's dependencies inside the portable Python environment.
Step 2 — Download Models
You'll need to gather several files before launching the workflow:
| File | Source | Destination |
|---|---|---|
| WAN 2.2 14B GGUF (high-noise) | QuantStack HuggingFace — Image-to-Video repo → high-noise folder | models/unet/ |
| WAN 2.2 14B GGUF (low-noise) | QuantStack HuggingFace — Image-to-Video repo → low-noise folder | models/unet/ |
| SVI V2 Pro LoRA (high + low noise) | Kijai HuggingFace → loras/stable_video_infinity/v2.0/ | models/loras/ |
| LightX2V LoRA (high + low noise) | Kijai HuggingFace → loras/lightx2v/ or LightX2V WAN 2.2 HuggingFace repo | models/loras/ |
| UMT5 XXL CLIP model | city96 HuggingFace — umt5 repository | models/clip/ |
| WAN 2.1 VAE | Comfy-Org WAN repackaged → files/VAE folder | models/vae/ |
| Upscale model | Channel HuggingFace repo (link in guide) | models/upscale_models/ |
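If you're setting up a fresh install, a small script can pre-create the destination folders from the table above so each download has an obvious home. A minimal sketch, assuming `COMFY_ROOT` points at your ComfyUI install:

```python
from pathlib import Path

COMFY_ROOT = Path("ComfyUI")  # assumption: adjust to your install location

# Destination folders from the model table above.
SUBDIRS = [
    "models/unet",            # WAN 2.2 14B GGUF (high- and low-noise)
    "models/loras",           # SVI V2 Pro + LightX2V LoRAs
    "models/clip",            # UMT5 XXL CLIP model
    "models/vae",             # WAN 2.1 VAE
    "models/upscale_models",  # upscale model
]

for sub in SUBDIRS:
    (COMFY_ROOT / sub).mkdir(parents=True, exist_ok=True)
```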
Step 3 — Set Up the Workflow in ComfyUI
- Launch ComfyUI and load the SVI long-video workflow (download link on CivitAI — linked in video description).
- If any nodes appear red, go to Manager → Install Missing Nodes. Install each missing package, then restart ComfyUI and refresh your browser.
- Check all model loader nodes — verify the arrows point to the files you actually downloaded.
GGUF vs. Full Diffusion Model
The workflow defaults to the GGUF model loader for low-VRAM operation. If you have a high-end GPU and want to use the full-precision diffusion model instead, there's a fast group bypasser switch in the workflow — enable the diffusion model loader and disable the GGUF option.
Step 4 — Generate a Long Video
- Load your starting image in the Load Image node.
- Set resolution in the Resize Image node — this controls both the input resize and the output video dimensions.
- The default workflow generates a 20-second video split into four 5-second segments. Write a separate prompt for each segment describing what happens in that 5-second window.
- Check the seed settings — default is "fixed" (same output every run). Change to "randomize" if you want variations.
- Click Run. The SVI LoRA automatically passes the final frames of each segment into the next, creating seamless continuity.
Extending Beyond 20 Seconds (Infinite Chaining)
Want more than 20 seconds? Adding more segments is straightforward:
- Select all nodes in one of the existing 5-second subsection groups.
- Right-click → Clone.
- Drag the clone into position and connect the extended image output from the previous segment into the previous image input of the new clone.
- Connect the new clone's extended image output to the upscale image connector at the end of the workflow.
Each clone adds 5 seconds. Repeat as many times as you want.
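The length arithmetic is simple to sanity-check. Assuming the common WAN 2.2 defaults of 16 fps and 81 frames per segment (values not stated in this guide), each segment is a hair over 5 seconds:

```python
# FPS and FRAMES_PER_SEGMENT are assumptions (typical WAN 2.2 defaults),
# not values taken from this guide.
FPS = 16
FRAMES_PER_SEGMENT = 81  # ~5 seconds per segment at 16 fps

def total_seconds(num_segments: int) -> float:
    """Each cloned subsection group adds one more segment."""
    return num_segments * FRAMES_PER_SEGMENT / FPS

print(total_seconds(4))   # default workflow (~20 s)
print(total_seconds(12))  # twelve segments (~60 s)
```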
Tips for Best Results
- Write per-segment prompts. Each 5-second block needs its own prompt describing that specific portion — account for what happened in the previous segment to maintain narrative flow.
- Start with Q3_K_M GGUF if you're on limited VRAM. Upgrade the quantization level as your hardware allows.
- Use fixed seeds when you find a good generation so you can reproduce it exactly. Switch to randomize for exploration.
- SmoothMix WAN 2.2 is an alternative — faster motion, fewer steps, no LoRAs needed — but results can be less consistent than the standard setup with SVI LoRAs.
- RunPod (rented cloud GPUs) is useful for testing prompts and settings quickly before committing to a long generation on lower-end local hardware.
📦 Want to skip the setup?
The Local Lab offers pre-configured AI installer packages so you can get running in minutes, not hours.
Get the Installer →