JSON to Video: The Complete Developer Reference Guide
JSON to video means defining a video as structured data -- scenes, elements, animations, transitions -- and sending that data to a rendering engine that produces an MP4. No timeline editors. No drag-and-drop. Just a JSON object that describes exactly what you want, and a video file that matches.
If you've ever tried to generate videos programmatically, you know the pain. FFmpeg commands get unwieldy after 3 overlays. MoviePy requires a Python environment and breaks on complex compositions. Browser-based rendering with Puppeteer is slow and unreliable. JSON to video solves this: you describe the video declaratively, and the API handles encoding, compositing, transitions, and output.
This guide is the complete reference. Every element type, every animation option, every configuration parameter -- with code examples in Python, JavaScript, and cURL.
Why Developers Choose JSON Over Visual Editors
Visual editors (CapCut, Canva, Premiere) are designed for humans making one video at a time. When you need to generate 100 product videos from a database, or create personalized content for each user, or automate daily social posts -- visual editors don't scale.
JSON to video gives you:
Programmability. Loop through a dataset and generate a video for each row. Merge customer data into templates. Version control your video designs in Git.
Consistency. Every video follows the exact same structure. No human error in placement, timing, or styling. A template renders identically whether it's the 1st or 10,000th video.
Integration. Trigger video generation from any system that can make HTTP requests: n8n, Zapier, your backend, a cron job, a Telegram bot. The JSON to Video API accepts standard REST calls.
Speed. Rendering happens on dedicated GPU infrastructure. A 60-second video in 1080p typically renders in 30-90 seconds, depending on complexity.
The JSON Video Schema: Top-Level Structure
Every video starts with a top-level object that defines global settings and an array of scenes:
```json
{
  "resolution": "1080x1920",
  "fps": 30,
  "quality": "high",
  "scenes": [],
  "audio": {}
}
```
Resolution
Standard presets:
| Value | Aspect Ratio | Use Case |
|---|---|---|
| "1080x1920" | 9:16 | TikTok, Reels, Shorts |
| "1920x1080" | 16:9 | YouTube, presentations |
| "1080x1080" | 1:1 | Instagram feed, LinkedIn |
| "1080x1350" | 4:5 | Instagram portrait |
| "3840x2160" | 16:9 | 4K YouTube |
You can also specify custom resolutions as "widthxheight". Both values must be even numbers.
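Since both dimensions must be even, it's worth validating custom resolution strings before submitting a render. A minimal sketch (the `widthxheight` format is the one described above; the helper name is illustrative):

```python
def parse_resolution(value: str) -> tuple[int, int]:
    """Parse a "widthxheight" string and verify both dimensions are even."""
    try:
        width_str, height_str = value.lower().split("x")
        width, height = int(width_str), int(height_str)
    except ValueError:
        raise ValueError(f"Expected 'widthxheight', got {value!r}")
    if width % 2 or height % 2:
        raise ValueError(f"Both dimensions must be even, got {width}x{height}")
    return width, height

print(parse_resolution("1080x1920"))  # (1080, 1920)
```

Catching a bad resolution locally is cheaper than waiting for a 400 from the API.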
FPS
Standard values: 24 (cinematic), 30 (web standard), 60 (smooth motion). Higher FPS increases render time proportionally. For most social media content, 30 is the sweet spot.
Quality
"low" (fast renders, smaller files), "medium" (balanced), "high" (best quality, larger files). The difference between medium and high is roughly 20% render time for 15% better visual quality. Use high for final output, low for previews and testing.
Scenes
Scenes are the building blocks of your video. Each scene has a duration, optional background, an array of elements, and an optional transition to the next scene.
```json
{
  "duration": 5,
  "background": {
    "color": "#1a1a2e"
  },
  "elements": [],
  "transition": {
    "type": "fade",
    "duration": 0.5
  }
}
```
Duration
In seconds. Supports decimals: 3.5 for three and a half seconds. Minimum is 0.5, maximum is 300 (5 minutes per scene).
Background
Three options:
```json
// Solid color
{ "color": "#1a1a2e" }

// Gradient
{ "gradient": { "from": "#1a1a2e", "to": "#16213e", "direction": "to bottom" } }

// Image
{ "image": "https://example.com/background.jpg", "fit": "cover" }
```
The fit property on image backgrounds accepts "cover" (fill and crop), "contain" (fit within bounds), and "stretch".
Element Types: Complete Reference
Text Elements
Text is the most commonly used element. Here's every available property:
```json
{
  "type": "text",
  "text": "Your headline here",
  "style": {
    "fontSize": 56,
    "fontFamily": "Inter",
    "fontWeight": "bold",
    "fontStyle": "italic",
    "color": "#FFFFFF",
    "backgroundColor": "rgba(0,0,0,0.5)",
    "textAlign": "center",
    "lineHeight": 1.4,
    "letterSpacing": 2,
    "textTransform": "uppercase",
    "maxWidth": 900,
    "padding": 20,
    "borderRadius": 8,
    "textShadow": "2px 2px 4px rgba(0,0,0,0.5)"
  },
  "position": { "x": "50%", "y": "40%" },
  "animation": {
    "type": "fadeInUp",
    "duration": 0.8,
    "delay": 0.2,
    "easing": "easeOutCubic"
  }
}
```
Positioning: Values can be percentages ("50%") or pixels (540). Percentages are relative to the video resolution. "50%" centers the element.
Font families: The API ships with 50+ fonts including Inter, Roboto, Montserrat, Playfair Display, Oswald, and Source Code Pro. Pass any Google Font name and it's loaded automatically.
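Mixing percentage and pixel coordinates is easy to get wrong when generating positions from data. A small helper that resolves either form to absolute pixels (a sketch; it assumes percentages are relative to the full frame, as described above):

```python
def resolve_coord(value, frame_size: int) -> int:
    """Resolve a position value ("50%" or 540) to absolute pixels."""
    if isinstance(value, str) and value.endswith("%"):
        return round(float(value[:-1]) / 100 * frame_size)
    return int(value)

# On a 1080x1920 frame, x = "50%" centers the element horizontally.
print(resolve_coord("50%", 1080))  # 540
print(resolve_coord(960, 1920))    # 960
```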
Available text animations:
| Animation | Description |
|---|---|
| fadeIn | Simple opacity fade |
| fadeInUp | Fade in while sliding up |
| fadeInDown | Fade in while sliding down |
| fadeInLeft | Fade in from left |
| fadeInRight | Fade in from right |
| typewriter | Characters appear one by one |
| bounceIn | Bounce entrance |
| zoomIn | Scale from small to full size |
| slideUp | Slide in from bottom |
| slideDown | Slide in from top |
| blur | Blur to sharp transition |
Image Elements
Display images from URLs with positioning, sizing, and effects:
```json
{
  "type": "image",
  "src": "https://example.com/product.png",
  "position": { "x": "50%", "y": "50%" },
  "size": { "width": 600, "height": 400 },
  "style": {
    "borderRadius": 16,
    "border": "3px solid #FFFFFF",
    "opacity": 0.9,
    "objectFit": "cover",
    "shadow": "0 4px 20px rgba(0,0,0,0.3)"
  },
  "animation": {
    "type": "kenBurns",
    "duration": 5,
    "direction": "zoomIn"
  },
  "crop": {
    "x": 0,
    "y": 0,
    "width": 800,
    "height": 600
  }
}
```
Ken Burns effect: The kenBurns animation slowly pans and zooms across an image. Directions: "zoomIn", "zoomOut", "panLeft", "panRight". Duration should match or exceed the scene duration.
Supported formats: JPEG, PNG, WebP, GIF (first frame only -- use video elements for animated content). Images are fetched and cached at render time, so use stable URLs.
Size: Omit to use natural image dimensions. Specify only width or height to scale proportionally. Specify both for exact sizing.
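Proportional scaling follows standard aspect-ratio math. If you want to precompute the rendered size yourself, for example to position neighboring elements, a sketch of the rules described above (this mirrors the documented behavior; it is not the API's own code):

```python
def scaled_size(natural_w, natural_h, width=None, height=None):
    """Compute display size: exact when both given, proportional when one given."""
    if width is not None and height is not None:
        return width, height                              # exact sizing
    if width is not None:
        return width, round(natural_h * width / natural_w)  # scale by width
    if height is not None:
        return round(natural_w * height / natural_h), height  # scale by height
    return natural_w, natural_h                           # natural dimensions

print(scaled_size(4000, 3000, width=600))  # (600, 450)
```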
Video Elements
Embed video clips within scenes:
```json
{
  "type": "video",
  "src": "https://example.com/clip.mp4",
  "position": { "x": "50%", "y": "50%" },
  "size": { "width": "100%", "height": "100%" },
  "trim": {
    "start": 2.5,
    "end": 10.0
  },
  "playbackRate": 1.0,
  "volume": 0.5,
  "loop": true,
  "style": {
    "objectFit": "cover",
    "borderRadius": 0
  }
}
```
Trimming: start and end are in seconds. Only the trimmed portion plays. If the trimmed clip is shorter than the scene duration and loop is true, it repeats.
Playback rate: 0.5 for slow motion, 1.0 for normal, 2.0 for double speed. Range: 0.25 to 4.0.
Supported formats: MP4 (H.264), WebM. MP4 is recommended for compatibility.
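Trim and playback rate interact: the screen time a clip occupies is the trimmed length divided by the playback rate. A quick sketch for checking whether `loop` is needed to fill a scene:

```python
def clip_screen_time(trim_start, trim_end, playback_rate=1.0):
    """Seconds of screen time the trimmed clip occupies at the given rate."""
    return (trim_end - trim_start) / playback_rate

scene_duration = 10
screen_time = clip_screen_time(2.5, 10.0)      # 7.5s of footage at normal speed
needs_loop = screen_time < scene_duration      # True: enable loop or shorten the scene
print(screen_time, needs_loop)  # 7.5 True
```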
Shape Elements
Create backgrounds, dividers, overlays, and decorative elements:
```json
{
  "type": "shape",
  "shape": "rectangle",
  "position": { "x": "50%", "y": "50%" },
  "style": {
    "width": 800,
    "height": 200,
    "backgroundColor": "#e94560",
    "borderRadius": 12,
    "border": "2px solid #FFFFFF",
    "opacity": 0.8,
    "shadow": "0 2px 10px rgba(0,0,0,0.3)"
  },
  "animation": {
    "type": "fadeIn",
    "duration": 0.5
  }
}
```
Shape types: "rectangle", "circle", "line". For circles, set equal width and height and borderRadius: "50%".
Z-order follows array order: the first element in a scene's elements array renders bottommost. List shapes before the text and image elements that should appear on top of them.
Audio Elements
The top-level audio object adds background music to the entire video:
```json
{
  "audio": {
    "src": "https://example.com/background-music.mp3",
    "volume": 0.3,
    "fadeIn": 2,
    "fadeOut": 3,
    "loop": true,
    "trim": {
      "start": 0,
      "end": 30
    }
  }
}
```
For per-scene audio (voice-overs, sound effects), add an audio property to individual scenes:
```json
{
  "duration": 5,
  "audio": {
    "src": "https://example.com/voiceover-scene1.mp3",
    "volume": 1.0
  },
  "elements": []
}
```
Scene audio plays alongside the global background audio. Adjust volumes so they don't compete -- typically 0.2-0.3 for background music when voice-over is present.
Supported formats: MP3, WAV, AAC. MP3 recommended for smaller file sizes.
Transitions Between Scenes
Transitions define how one scene flows into the next:
```json
{
  "transition": {
    "type": "fade",
    "duration": 0.5
  }
}
```
Available transitions:
| Type | Description |
|---|---|
| fade | Cross-fade between scenes |
| slideLeft | Next scene slides in from right |
| slideRight | Next scene slides in from left |
| slideUp | Next scene slides in from bottom |
| slideDown | Next scene slides in from top |
| wipe | Horizontal wipe |
| zoom | Zoom into next scene |
| blur | Blur transition |
| none | Hard cut (no transition) |
Transition duration is in seconds. Keep it between 0.3 and 1.0 for professional-looking results. Anything longer feels sluggish.
Code Examples
Python
```python
import requests
import time

API_KEY = "your_api_key"
BASE_URL = "https://api.json2video.com/v2"

video_json = {
    "resolution": "1080x1920",
    "fps": 30,
    "quality": "high",
    "scenes": [
        {
            "duration": 5,
            "background": {"color": "#0a0a1a"},
            "elements": [
                {
                    "type": "text",
                    "text": "Product Launch 2026",
                    "style": {
                        "fontSize": 64,
                        "fontWeight": "bold",
                        "color": "#FFFFFF",
                        "textAlign": "center"
                    },
                    "position": {"x": "50%", "y": "40%"},
                    "animation": {"type": "fadeInUp", "duration": 0.8}
                }
            ],
            "transition": {"type": "slideLeft", "duration": 0.4}
        },
        {
            "duration": 6,
            "background": {"color": "#0a0a1a"},
            "elements": [
                {
                    "type": "image",
                    "src": "https://example.com/product-hero.jpg",
                    "position": {"x": "50%", "y": "40%"},
                    "size": {"width": 800, "height": 600},
                    "style": {"borderRadius": 12},
                    "animation": {"type": "zoomIn", "duration": 1.0}
                },
                {
                    "type": "text",
                    "text": "$49/month",
                    "style": {
                        "fontSize": 48,
                        "fontWeight": "bold",
                        "color": "#e94560"
                    },
                    "position": {"x": "50%", "y": "80%"},
                    "animation": {"type": "bounceIn", "duration": 0.5, "delay": 0.5}
                }
            ]
        }
    ],
    "audio": {
        "src": "https://assets.json2video.com/audio/corporate-upbeat.mp3",
        "volume": 0.3,
        "fadeOut": 2
    }
}

# Submit render
headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
response = requests.post(f"{BASE_URL}/renders", json=video_json, headers=headers)
render = response.json()
render_id = render["id"]
print(f"Render submitted: {render_id}")

# Poll for completion
while True:
    status_response = requests.get(f"{BASE_URL}/renders/{render_id}", headers=headers)
    status = status_response.json()
    if status["status"] == "completed":
        print(f"Video ready: {status['output_url']}")
        print(f"Render time: {status['render_time']}s")
        break
    elif status["status"] == "failed":
        print(f"Render failed: {status.get('error', 'Unknown error')}")
        break
    else:
        print(f"Status: {status['status']} ({status.get('progress', 0)}%)")
        time.sleep(5)
```
JavaScript (Node.js)
```javascript
const API_KEY = 'your_api_key';
const BASE_URL = 'https://api.json2video.com/v2';

const videoJson = {
  resolution: '1080x1920',
  fps: 30,
  quality: 'high',
  scenes: [
    {
      duration: 5,
      background: { color: '#0a0a1a' },
      elements: [
        {
          type: 'text',
          text: 'Product Launch 2026',
          style: {
            fontSize: 64,
            fontWeight: 'bold',
            color: '#FFFFFF',
            textAlign: 'center'
          },
          position: { x: '50%', y: '40%' },
          animation: { type: 'fadeInUp', duration: 0.8 }
        }
      ],
      transition: { type: 'slideLeft', duration: 0.4 }
    }
  ],
  audio: {
    src: 'https://assets.json2video.com/audio/corporate-upbeat.mp3',
    volume: 0.3,
    fadeOut: 2
  }
};

async function renderVideo() {
  // Submit render
  const renderResponse = await fetch(`${BASE_URL}/renders`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(videoJson)
  });
  const render = await renderResponse.json();
  console.log(`Render submitted: ${render.id}`);

  // Poll for completion
  while (true) {
    const statusResponse = await fetch(`${BASE_URL}/renders/${render.id}`, {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    });
    const status = await statusResponse.json();
    if (status.status === 'completed') {
      console.log(`Video ready: ${status.output_url}`);
      console.log(`Render time: ${status.render_time}s`);
      return status;
    }
    if (status.status === 'failed') {
      throw new Error(`Render failed: ${status.error || 'Unknown'}`);
    }
    console.log(`Status: ${status.status} (${status.progress || 0}%)`);
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}

renderVideo().catch(console.error);
```
cURL
```bash
# Submit render
curl -X POST https://api.json2video.com/v2/renders \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resolution": "1080x1920",
    "fps": 30,
    "scenes": [
      {
        "duration": 5,
        "background": {"color": "#0a0a1a"},
        "elements": [
          {
            "type": "text",
            "text": "Hello from cURL",
            "style": {"fontSize": 56, "color": "#FFFFFF"},
            "position": {"x": "50%", "y": "50%"},
            "animation": {"type": "fadeIn", "duration": 0.8}
          }
        ]
      }
    ]
  }'
# Response: {"id": "render_abc123", "status": "processing"}

# Check status
curl https://api.json2video.com/v2/renders/render_abc123 \
  -H "Authorization: Bearer YOUR_API_KEY"
# Response: {"id": "render_abc123", "status": "completed", "output_url": "https://..."}
```
Real-World Template: Product Showcase
Here's a complete 4-scene product showcase template you can adapt. This is the kind of template available in the Template Marketplace:
```json
{
  "resolution": "1080x1920",
  "fps": 30,
  "quality": "high",
  "scenes": [
    {
      "duration": 3,
      "background": { "gradient": { "from": "#667eea", "to": "#764ba2", "direction": "to bottom right" } },
      "elements": [
        {
          "type": "text",
          "text": "{{brand_name}}",
          "style": { "fontSize": 32, "color": "rgba(255,255,255,0.7)", "textTransform": "uppercase", "letterSpacing": 4 },
          "position": { "x": "50%", "y": "25%" },
          "animation": { "type": "fadeIn", "duration": 0.5 }
        },
        {
          "type": "text",
          "text": "{{product_name}}",
          "style": { "fontSize": 64, "fontWeight": "bold", "color": "#FFFFFF", "textAlign": "center", "maxWidth": 900 },
          "position": { "x": "50%", "y": "45%" },
          "animation": { "type": "fadeInUp", "duration": 0.8, "delay": 0.3 }
        },
        {
          "type": "text",
          "text": "{{tagline}}",
          "style": { "fontSize": 28, "color": "rgba(255,255,255,0.8)" },
          "position": { "x": "50%", "y": "65%" },
          "animation": { "type": "fadeIn", "duration": 0.6, "delay": 0.8 }
        }
      ],
      "transition": { "type": "slideLeft", "duration": 0.5 }
    },
    {
      "duration": 5,
      "background": { "color": "#FFFFFF" },
      "elements": [
        {
          "type": "image",
          "src": "{{product_image_url}}",
          "position": { "x": "50%", "y": "40%" },
          "size": { "width": 700, "height": 700 },
          "style": { "borderRadius": 20, "shadow": "0 8px 30px rgba(0,0,0,0.12)" },
          "animation": { "type": "zoomIn", "duration": 1.0 }
        },
        {
          "type": "text",
          "text": "{{feature_1}}",
          "style": { "fontSize": 28, "color": "#333333" },
          "position": { "x": "30%", "y": "82%" },
          "animation": { "type": "fadeInLeft", "duration": 0.5, "delay": 0.5 }
        },
        {
          "type": "text",
          "text": "{{feature_2}}",
          "style": { "fontSize": 28, "color": "#333333" },
          "position": { "x": "70%", "y": "82%" },
          "animation": { "type": "fadeInRight", "duration": 0.5, "delay": 0.7 }
        }
      ],
      "transition": { "type": "fade", "duration": 0.4 }
    },
    {
      "duration": 4,
      "background": { "color": "#f8f9fa" },
      "elements": [
        {
          "type": "text",
          "text": "{{benefit_headline}}",
          "style": { "fontSize": 44, "fontWeight": "bold", "color": "#1a1a2e", "textAlign": "center", "maxWidth": 800 },
          "position": { "x": "50%", "y": "30%" },
          "animation": { "type": "fadeInUp", "duration": 0.6 }
        },
        {
          "type": "text",
          "text": "{{benefit_description}}",
          "style": { "fontSize": 28, "color": "#555555", "textAlign": "center", "maxWidth": 750, "lineHeight": 1.6 },
          "position": { "x": "50%", "y": "55%" },
          "animation": { "type": "fadeIn", "duration": 0.5, "delay": 0.4 }
        }
      ],
      "transition": { "type": "slideUp", "duration": 0.4 }
    },
    {
      "duration": 4,
      "background": { "gradient": { "from": "#764ba2", "to": "#667eea", "direction": "to bottom right" } },
      "elements": [
        {
          "type": "text",
          "text": "{{price}}",
          "style": { "fontSize": 72, "fontWeight": "bold", "color": "#FFFFFF" },
          "position": { "x": "50%", "y": "35%" },
          "animation": { "type": "bounceIn", "duration": 0.6 }
        },
        {
          "type": "text",
          "text": "{{cta_text}}",
          "style": {
            "fontSize": 36,
            "color": "#764ba2",
            "backgroundColor": "#FFFFFF",
            "padding": 20,
            "borderRadius": 12,
            "fontWeight": "bold"
          },
          "position": { "x": "50%", "y": "60%" },
          "animation": { "type": "fadeInUp", "duration": 0.5, "delay": 0.5 }
        },
        {
          "type": "text",
          "text": "{{website_url}}",
          "style": { "fontSize": 24, "color": "rgba(255,255,255,0.7)" },
          "position": { "x": "50%", "y": "80%" },
          "animation": { "type": "fadeIn", "duration": 0.5, "delay": 1.0 }
        }
      ]
    }
  ],
  "audio": {
    "src": "https://assets.json2video.com/audio/upbeat-corporate.mp3",
    "volume": 0.25,
    "fadeIn": 1,
    "fadeOut": 2
  }
}
```
Replace the {{placeholder}} values with real data. In a programmatic pipeline, use string replacement or a template engine. The AI docs cover AI-assisted template filling for dynamic content.
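The simplest pipeline is plain string replacement on the serialized template. A sketch, assuming placeholder keys match the `{{...}}` tokens above and values are plain strings:

```python
import json

def fill_template(template: dict, data: dict) -> dict:
    """Replace {{placeholder}} tokens anywhere in the template with values from data."""
    text = json.dumps(template)
    for key, value in data.items():
        text = text.replace("{{" + key + "}}", str(value))
    return json.loads(text)

template = {"scenes": [{"elements": [{"type": "text", "text": "{{product_name}}"}]}]}
filled = fill_template(template, {"product_name": "Acme Widget"})
print(filled["scenes"][0]["elements"][0]["text"])  # Acme Widget
```

Values containing quotes or backslashes would break the serialized JSON with this naive approach; a real template engine (or escaping each value with json.dumps) handles that more robustly.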
Performance Tips
Optimize image sizes. Don't send a 4000x3000 JPEG when the element displays at 600x400. Pre-resize images to match their display size. This alone can cut render times by 30-40%.
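Pre-resizing is a few lines with Pillow (an illustrative sketch; assumes `pip install Pillow`, and the file paths are hypothetical):

```python
from PIL import Image

def resize_for_display(src_path: str, dst_path: str, display_size=(600, 400)):
    """Downscale an image to (at most) its on-screen display size before uploading."""
    with Image.open(src_path) as img:
        img.thumbnail(display_size)     # fits within the box, preserves aspect ratio
        img.save(dst_path, quality=85)  # 85% JPEG quality is usually enough for photos

# resize_for_display("product-4000x3000.jpg", "product-small.jpg")
```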
Minimize scene count. Each scene adds rendering overhead. A 10-scene video with 3-second scenes renders slower than a 5-scene video with 6-second scenes, even though total duration is similar.
Use JPEG over PNG for photos. PNG is better for graphics with transparency. For everything else, JPEG at 85% quality is smaller and faster to download during rendering.
Batch similar renders. If you're generating 100 videos with the same template, submit them in parallel rather than sequentially. The API processes concurrent renders on separate workers. Check your plan limits for concurrent render capacity.
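Parallel submission is straightforward with a thread pool. A sketch, where `submit_render` is a hypothetical stand-in for the POST request shown in the Python example earlier:

```python
from concurrent.futures import ThreadPoolExecutor

def submit_render(video_json: dict) -> str:
    """Placeholder for the POST /renders call; returns a render ID."""
    # In real code: requests.post(f"{BASE_URL}/renders", json=video_json, headers=headers)
    return f"render_{id(video_json)}"

def submit_batch(video_jsons, max_workers=10):
    """Submit many renders concurrently; cap workers at your plan's concurrency limit."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(submit_render, video_jsons))

ids = submit_batch([{"scenes": [], "n": i} for i in range(5)])
print(len(ids))  # 5
```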
Test with low quality first. Set "quality": "low" during development. Renders finish in half the time. Switch to "high" for production output.
Error Handling
Common error responses and how to handle them:
| Status Code | Meaning | Action |
|---|---|---|
| 400 | Invalid JSON schema | Check your JSON structure against this reference |
| 401 | Invalid API key | Verify your Bearer token |
| 402 | Plan limit reached | Upgrade your plan or wait for reset |
| 404 | Render not found | Check the render ID |
| 413 | Payload too large | Reduce scene count or use URL references for assets |
| 429 | Rate limited | Back off and retry after the Retry-After header value |
| 500 | Server error | Retry after 30 seconds, up to 3 times |
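The 429 and 500 rows translate naturally into a retry wrapper. A sketch with exponential backoff that honors Retry-After when present (the `do_request` callable is a stand-in for your HTTP call; it is assumed to return a status code, headers, and body):

```python
import time

def with_retries(do_request, max_retries=3):
    """Retry on 429/500: honor Retry-After for 429, otherwise back off exponentially."""
    for attempt in range(max_retries + 1):
        status_code, headers, body = do_request()
        if status_code not in (429, 500):
            return status_code, body
        if attempt == max_retries:
            break
        if status_code == 429 and "Retry-After" in headers:
            time.sleep(float(headers["Retry-After"]))  # server-suggested wait
        else:
            time.sleep(2 ** attempt)                   # 1s, 2s, 4s, ...
    return status_code, body
```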
Always validate your JSON before submitting. A missing comma or unclosed bracket returns a 400 with a parsing error that can be hard to debug in complex templates.
```python
import json

def validate_video_json(video_json):
    """Validate video JSON before submitting to the API."""
    required_fields = ['resolution', 'scenes']
    for field in required_fields:
        if field not in video_json:
            raise ValueError(f"Missing required field: {field}")
    if not video_json['scenes']:
        raise ValueError("At least one scene is required")
    for i, scene in enumerate(video_json['scenes']):
        if 'duration' not in scene:
            raise ValueError(f"Scene {i} missing duration")
        if scene['duration'] < 0.5 or scene['duration'] > 300:
            raise ValueError(f"Scene {i} duration must be 0.5-300 seconds")
        if 'elements' not in scene or not scene['elements']:
            raise ValueError(f"Scene {i} has no elements")
    # Confirm the payload is JSON-serializable
    try:
        json.dumps(video_json)
    except (TypeError, ValueError) as e:
        raise ValueError(f"JSON serialization error: {e}")
    return True
```
Beyond the Basics
This reference covers the core schema. For more advanced features:
- AutoCaptions -- Automatically generate and burn subtitles into your videos. See the AutoCaptions guide.
- AI-assisted generation -- Use natural language to generate video JSON. See the AI API docs.
- Template marketplace -- Pre-built templates for common use cases. Browse the Template Marketplace.
- CapCut API alternative -- How our API compares to CapCut for programmatic video. See the CapCut API comparison.
The JSON to video approach fundamentally changes how you think about video production. It stops being a creative bottleneck and becomes a data transformation problem. Define the template once, pipe in the data, get videos out. That's the entire model, and it scales to any volume you need.
Related Articles
- How to Build a Telegram Video Bot with n8n and a Video API
- n8n Video Automation: The Complete Guide to No-Code Video Workflows
- Text-to-Video API: Build Your First AI Video in Under 5 Minutes