JSON to Video: The Complete Developer Reference Guide

Mar 10, 2026 By smrht@icloud.com


JSON to video means defining a video as structured data -- scenes, elements, animations, transitions -- and sending that data to a rendering engine that produces an MP4. No timeline editors. No drag-and-drop. Just a JSON object that describes exactly what you want, and a video file that matches.

If you've ever tried to generate videos programmatically, you know the pain. FFmpeg commands get unwieldy after 3 overlays. MoviePy requires a Python environment and breaks on complex compositions. Browser-based rendering with Puppeteer is slow and unreliable. JSON to video solves this: you describe the video declaratively, and the API handles encoding, compositing, transitions, and output.

This guide is the complete reference. Every element type, every animation option, every configuration parameter -- with code examples in Python, JavaScript, and cURL.

Why Developers Choose JSON Over Visual Editors

Visual editors (CapCut, Canva, Premiere) are designed for humans making one video at a time. When you need to generate 100 product videos from a database, or create personalized content for each user, or automate daily social posts -- visual editors don't scale.

JSON to video gives you:

Programmability. Loop through a dataset and generate a video for each row. Merge customer data into templates. Version control your video designs in Git.

Consistency. Every video follows the exact same structure. No human error in placement, timing, or styling. A template renders identically whether it's the 1st or 10,000th video.

Integration. Trigger video generation from any system that can make HTTP requests: n8n, Zapier, your backend, a cron job, a Telegram bot. The JSON to Video API accepts standard REST calls.

Speed. Rendering happens on dedicated GPU infrastructure. A 60-second video in 1080p typically renders in 30-90 seconds, depending on complexity.

The JSON Video Schema: Top-Level Structure

Every video starts with a top-level object that defines global settings and an array of scenes:

{
  "resolution": "1080x1920",
  "fps": 30,
  "quality": "high",
  "scenes": [],
  "audio": {}
}

Resolution

Standard presets:

| Value | Aspect Ratio | Use Case |
|---|---|---|
| "1080x1920" | 9:16 | TikTok, Reels, Shorts |
| "1920x1080" | 16:9 | YouTube, presentations |
| "1080x1080" | 1:1 | Instagram feed, LinkedIn |
| "1080x1350" | 4:5 | Instagram portrait |
| "3840x2160" | 16:9 | 4K YouTube |

You can also specify custom resolutions as "widthxheight". Both values must be even numbers.
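A quick local sanity check saves a round trip to the API when generating custom resolutions programmatically. This is a sketch; the only rules it enforces are the documented "widthxheight" format and the even-number requirement:

```python
def validate_resolution(resolution: str) -> tuple[int, int]:
    """Parse a "widthxheight" string and enforce even dimensions."""
    try:
        width, height = (int(part) for part in resolution.split("x"))
    except ValueError:
        raise ValueError(f"Resolution must be 'widthxheight', got {resolution!r}")
    if width % 2 or height % 2:
        raise ValueError(f"Both dimensions must be even, got {width}x{height}")
    return width, height

print(validate_resolution("1080x1920"))  # (1080, 1920)
```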

FPS

Standard values: 24 (cinematic), 30 (web standard), 60 (smooth motion). Higher FPS increases render time proportionally. For most social media content, 30 is the sweet spot.

Quality

"low" (fast renders, smaller files), "medium" (balanced), "high" (best quality, larger files). The difference between medium and high is roughly 20% render time for 15% better visual quality. Use high for final output, low for previews and testing.

Scenes

Scenes are the building blocks of your video. Each scene has a duration, optional background, an array of elements, and an optional transition to the next scene.

{
  "duration": 5,
  "background": {
    "color": "#1a1a2e"
  },
  "elements": [],
  "transition": {
    "type": "fade",
    "duration": 0.5
  }
}

Duration

In seconds. Supports decimals: 3.5 for three and a half seconds. Minimum is 0.5, maximum is 300 (5 minutes per scene).

Background

Three options:

// Solid color
{ "color": "#1a1a2e" }

// Gradient
{ "gradient": { "from": "#1a1a2e", "to": "#16213e", "direction": "to bottom" } }

// Image
{ "image": "https://example.com/background.jpg", "fit": "cover" }

The fit property on image backgrounds accepts "cover" (fill and crop), "contain" (fit within bounds), and "stretch".

Element Types: Complete Reference

Text Elements

Text is the most commonly used element. Here's every available property:

{
  "type": "text",
  "text": "Your headline here",
  "style": {
    "fontSize": 56,
    "fontFamily": "Inter",
    "fontWeight": "bold",
    "fontStyle": "italic",
    "color": "#FFFFFF",
    "backgroundColor": "rgba(0,0,0,0.5)",
    "textAlign": "center",
    "lineHeight": 1.4,
    "letterSpacing": 2,
    "textTransform": "uppercase",
    "maxWidth": 900,
    "padding": 20,
    "borderRadius": 8,
    "textShadow": "2px 2px 4px rgba(0,0,0,0.5)"
  },
  "position": { "x": "50%", "y": "40%" },
  "animation": {
    "type": "fadeInUp",
    "duration": 0.8,
    "delay": 0.2,
    "easing": "easeOutCubic"
  }
}

Positioning: Values can be percentages ("50%") or pixels (540). Percentages are relative to the video resolution. "50%" centers the element.
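When debugging layouts that mix percentage and pixel positions, it helps to resolve everything to absolute pixels locally. A small helper, assuming the documented convention that percentages are relative to the video resolution:

```python
def resolve_position(value, dimension: int) -> int:
    """Convert a position value ("50%" or 540) to absolute pixels."""
    if isinstance(value, str) and value.endswith("%"):
        return round(float(value[:-1]) / 100 * dimension)
    return int(value)

# "50%" of a 1080-wide frame is the horizontal center:
print(resolve_position("50%", 1080))  # 540
print(resolve_position(270, 1080))    # 270
```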

Font families: The API ships with 50+ fonts including Inter, Roboto, Montserrat, Playfair Display, Oswald, and Source Code Pro. Pass any Google Font name and it's loaded automatically.

Available text animations:

| Animation | Description |
|---|---|
| fadeIn | Simple opacity fade |
| fadeInUp | Fade in while sliding up |
| fadeInDown | Fade in while sliding down |
| fadeInLeft | Fade in from left |
| fadeInRight | Fade in from right |
| typewriter | Characters appear one by one |
| bounceIn | Bounce entrance |
| zoomIn | Scale from small to full size |
| slideUp | Slide in from bottom |
| slideDown | Slide in from top |
| blur | Blur-to-sharp transition |

Image Elements

Display images from URLs with positioning, sizing, and effects:

{
  "type": "image",
  "src": "https://example.com/product.png",
  "position": { "x": "50%", "y": "50%" },
  "size": { "width": 600, "height": 400 },
  "style": {
    "borderRadius": 16,
    "border": "3px solid #FFFFFF",
    "opacity": 0.9,
    "objectFit": "cover",
    "shadow": "0 4px 20px rgba(0,0,0,0.3)"
  },
  "animation": {
    "type": "kenBurns",
    "duration": 5,
    "direction": "zoomIn"
  },
  "crop": {
    "x": 0,
    "y": 0,
    "width": 800,
    "height": 600
  }
}

Ken Burns effect: The kenBurns animation slowly pans and zooms across an image. Directions: "zoomIn", "zoomOut", "panLeft", "panRight". Duration should match or exceed the scene duration.

Supported formats: JPEG, PNG, WebP, GIF (first frame only -- use video elements for animated content). Images are fetched and cached at render time, so use stable URLs.

Size: Omit to use natural image dimensions. Specify only width or height to scale proportionally. Specify both for exact sizing.
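The proportional-scaling rule is easy to reproduce locally when pre-sizing assets. A sketch of the same logic; the natural dimensions in the example are hypothetical:

```python
def scaled_size(natural_w: int, natural_h: int, width=None, height=None):
    """Mimic the size rules: both omitted -> natural dimensions;
    one given -> scale proportionally; both given -> exact sizing."""
    if width is None and height is None:
        return natural_w, natural_h
    if height is None:
        return width, round(natural_h * width / natural_w)
    if width is None:
        return round(natural_w * height / natural_h), height
    return width, height

print(scaled_size(800, 600, width=400))  # (400, 300)
```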

Video Elements

Embed video clips within scenes:

{
  "type": "video",
  "src": "https://example.com/clip.mp4",
  "position": { "x": "50%", "y": "50%" },
  "size": { "width": "100%", "height": "100%" },
  "trim": {
    "start": 2.5,
    "end": 10.0
  },
  "playbackRate": 1.0,
  "volume": 0.5,
  "loop": true,
  "style": {
    "objectFit": "cover",
    "borderRadius": 0
  }
}

Trimming: start and end are in seconds. Only the trimmed portion plays. If the trimmed clip is shorter than the scene duration and loop is true, it repeats.

Playback rate: 0.5 for slow motion, 1.0 for normal, 2.0 for double speed. Range: 0.25 to 4.0.
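Trim and playback rate interact: one pass of a trimmed clip occupies (end - start) / playbackRate seconds of screen time. A helper for checking whether a clip covers its scene, using the looping behavior described above:

```python
def clip_screen_time(start: float, end: float, playback_rate: float = 1.0) -> float:
    """Seconds of screen time one pass of a trimmed clip occupies."""
    if not 0.25 <= playback_rate <= 4.0:
        raise ValueError("playbackRate must be between 0.25 and 4.0")
    return (end - start) / playback_rate

# A 7.5s trim at 2x speed plays for 3.75s; with loop: true it
# repeats to fill a longer scene.
print(clip_screen_time(2.5, 10.0, 2.0))  # 3.75
```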

Supported formats: MP4 (H.264), WebM. MP4 is recommended for compatibility.

Shape Elements

Create backgrounds, dividers, overlays, and decorative elements:

{
  "type": "shape",
  "shape": "rectangle",
  "position": { "x": "50%", "y": "50%" },
  "style": {
    "width": 800,
    "height": 200,
    "backgroundColor": "#e94560",
    "borderRadius": 12,
    "border": "2px solid #FFFFFF",
    "opacity": 0.8,
    "shadow": "0 2px 10px rgba(0,0,0,0.3)"
  },
  "animation": {
    "type": "fadeIn",
    "duration": 0.5
  }
}

Shape types: "rectangle", "circle", "line". For circles, set equal width and height and borderRadius: "50%".

Shapes render behind text and image elements in the same scene (z-order follows array order -- first element is bottommost).

Audio Elements

The top-level audio object adds background music to the entire video:

{
  "audio": {
    "src": "https://example.com/background-music.mp3",
    "volume": 0.3,
    "fadeIn": 2,
    "fadeOut": 3,
    "loop": true,
    "trim": {
      "start": 0,
      "end": 30
    }
  }
}

For per-scene audio (voice-overs, sound effects), add an audio property to individual scenes:

{
  "duration": 5,
  "audio": {
    "src": "https://example.com/voiceover-scene1.mp3",
    "volume": 1.0
  },
  "elements": []
}

Scene audio plays alongside the global background audio. Adjust volumes so they don't compete -- typically 0.2-0.3 for background music when voice-over is present.

Supported formats: MP3, WAV, AAC. MP3 recommended for smaller file sizes.

Transitions Between Scenes

Transitions define how one scene flows into the next:

{
  "transition": {
    "type": "fade",
    "duration": 0.5
  }
}

Available transitions:

| Type | Description |
|---|---|
| fade | Cross-fade between scenes |
| slideLeft | Next scene slides in from right |
| slideRight | Next scene slides in from left |
| slideUp | Next scene slides in from bottom |
| slideDown | Next scene slides in from top |
| wipe | Horizontal wipe |
| zoom | Zoom into next scene |
| blur | Blur transition |
| none | Hard cut (no transition) |

Transition duration is in seconds. Keep it between 0.3 and 1.0 for professional-looking results. Anything longer feels sluggish.
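Whether transitions overlap scene time or extend the total runtime depends on the renderer; assuming they overlap (the common convention in video compositing), total runtime is simply the sum of scene durations:

```python
def total_duration(scenes: list[dict]) -> float:
    """Total runtime, assuming transitions overlap scene time
    rather than adding to it (an assumption, not confirmed by the schema)."""
    return sum(scene["duration"] for scene in scenes)

scenes = [
    {"duration": 5, "transition": {"type": "fade", "duration": 0.5}},
    {"duration": 6},
]
print(total_duration(scenes))  # 11
```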

Code Examples

Python

import requests
import time

API_KEY = "your_api_key"
BASE_URL = "https://api.json2video.com/v2"

video_json = {
    "resolution": "1080x1920",
    "fps": 30,
    "quality": "high",
    "scenes": [
        {
            "duration": 5,
            "background": {"color": "#0a0a1a"},
            "elements": [
                {
                    "type": "text",
                    "text": "Product Launch 2026",
                    "style": {
                        "fontSize": 64,
                        "fontWeight": "bold",
                        "color": "#FFFFFF",
                        "textAlign": "center"
                    },
                    "position": {"x": "50%", "y": "40%"},
                    "animation": {"type": "fadeInUp", "duration": 0.8}
                }
            ],
            "transition": {"type": "slideLeft", "duration": 0.4}
        },
        {
            "duration": 6,
            "background": {"color": "#0a0a1a"},
            "elements": [
                {
                    "type": "image",
                    "src": "https://example.com/product-hero.jpg",
                    "position": {"x": "50%", "y": "40%"},
                    "size": {"width": 800, "height": 600},
                    "style": {"borderRadius": 12},
                    "animation": {"type": "zoomIn", "duration": 1.0}
                },
                {
                    "type": "text",
                    "text": "$49/month",
                    "style": {
                        "fontSize": 48,
                        "fontWeight": "bold",
                        "color": "#e94560"
                    },
                    "position": {"x": "50%", "y": "80%"},
                    "animation": {"type": "bounceIn", "duration": 0.5, "delay": 0.5}
                }
            ]
        }
    ],
    "audio": {
        "src": "https://assets.json2video.com/audio/corporate-upbeat.mp3",
        "volume": 0.3,
        "fadeOut": 2
    }
}

# Submit render
headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
response = requests.post(f"{BASE_URL}/renders", json=video_json, headers=headers)
render = response.json()
render_id = render["id"]
print(f"Render submitted: {render_id}")

# Poll for completion
while True:
    status_response = requests.get(f"{BASE_URL}/renders/{render_id}", headers=headers)
    status = status_response.json()

    if status["status"] == "completed":
        print(f"Video ready: {status['output_url']}")
        print(f"Render time: {status['render_time']}s")
        break
    elif status["status"] == "failed":
        print(f"Render failed: {status.get('error', 'Unknown error')}")
        break
    else:
        print(f"Status: {status['status']} ({status.get('progress', 0)}%)")
        time.sleep(5)
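The polling loop above runs forever if a render stalls. In production, a bounded wait with gradually increasing intervals is safer. A sketch reusing the same endpoints; max_wait and the backoff parameters are arbitrary choices, not API requirements:

```python
import time
import requests

def backoff_intervals(start: float = 2.0, factor: float = 1.5, cap: float = 15.0):
    """Yield poll intervals that grow gradually up to a cap."""
    interval = start
    while True:
        yield interval
        interval = min(interval * factor, cap)

def wait_for_render(render_id: str, headers: dict, base_url: str,
                    max_wait: float = 600) -> dict:
    """Poll a render until it completes or fails; raise after max_wait seconds."""
    deadline = time.monotonic() + max_wait
    for interval in backoff_intervals():
        status = requests.get(f"{base_url}/renders/{render_id}",
                              headers=headers).json()
        if status["status"] in ("completed", "failed"):
            return status
        if time.monotonic() + interval > deadline:
            break
        time.sleep(interval)
    raise TimeoutError(f"Render {render_id} did not finish within {max_wait}s")
```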

JavaScript (Node.js)

const API_KEY = 'your_api_key';
const BASE_URL = 'https://api.json2video.com/v2';

const videoJson = {
  resolution: '1080x1920',
  fps: 30,
  quality: 'high',
  scenes: [
    {
      duration: 5,
      background: { color: '#0a0a1a' },
      elements: [
        {
          type: 'text',
          text: 'Product Launch 2026',
          style: {
            fontSize: 64,
            fontWeight: 'bold',
            color: '#FFFFFF',
            textAlign: 'center'
          },
          position: { x: '50%', y: '40%' },
          animation: { type: 'fadeInUp', duration: 0.8 }
        }
      ],
      transition: { type: 'slideLeft', duration: 0.4 }
    }
  ],
  audio: {
    src: 'https://assets.json2video.com/audio/corporate-upbeat.mp3',
    volume: 0.3,
    fadeOut: 2
  }
};

async function renderVideo() {
  // Submit render
  const renderResponse = await fetch(`${BASE_URL}/renders`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(videoJson)
  });

  const render = await renderResponse.json();
  console.log(`Render submitted: ${render.id}`);

  // Poll for completion
  while (true) {
    const statusResponse = await fetch(`${BASE_URL}/renders/${render.id}`, {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    });

    const status = await statusResponse.json();

    if (status.status === 'completed') {
      console.log(`Video ready: ${status.output_url}`);
      console.log(`Render time: ${status.render_time}s`);
      return status;
    }

    if (status.status === 'failed') {
      throw new Error(`Render failed: ${status.error || 'Unknown'}`);
    }

    console.log(`Status: ${status.status} (${status.progress || 0}%)`);
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}

renderVideo().catch(console.error);

cURL

# Submit render
curl -X POST https://api.json2video.com/v2/renders \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resolution": "1080x1920",
    "fps": 30,
    "scenes": [
      {
        "duration": 5,
        "background": {"color": "#0a0a1a"},
        "elements": [
          {
            "type": "text",
            "text": "Hello from cURL",
            "style": {"fontSize": 56, "color": "#FFFFFF"},
            "position": {"x": "50%", "y": "50%"},
            "animation": {"type": "fadeIn", "duration": 0.8}
          }
        ]
      }
    ]
  }'

# Response: {"id": "render_abc123", "status": "processing"}

# Check status
curl https://api.json2video.com/v2/renders/render_abc123 \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response: {"id": "render_abc123", "status": "completed", "output_url": "https://..."}

Real-World Template: Product Showcase

Here's a complete 4-scene product showcase template you can adapt. This is the kind of template available in the Template Marketplace:

{
  "resolution": "1080x1920",
  "fps": 30,
  "quality": "high",
  "scenes": [
    {
      "duration": 3,
      "background": { "gradient": { "from": "#667eea", "to": "#764ba2", "direction": "to bottom right" } },
      "elements": [
        {
          "type": "text",
          "text": "{{brand_name}}",
          "style": { "fontSize": 32, "color": "rgba(255,255,255,0.7)", "textTransform": "uppercase", "letterSpacing": 4 },
          "position": { "x": "50%", "y": "25%" },
          "animation": { "type": "fadeIn", "duration": 0.5 }
        },
        {
          "type": "text",
          "text": "{{product_name}}",
          "style": { "fontSize": 64, "fontWeight": "bold", "color": "#FFFFFF", "textAlign": "center", "maxWidth": 900 },
          "position": { "x": "50%", "y": "45%" },
          "animation": { "type": "fadeInUp", "duration": 0.8, "delay": 0.3 }
        },
        {
          "type": "text",
          "text": "{{tagline}}",
          "style": { "fontSize": 28, "color": "rgba(255,255,255,0.8)" },
          "position": { "x": "50%", "y": "65%" },
          "animation": { "type": "fadeIn", "duration": 0.6, "delay": 0.8 }
        }
      ],
      "transition": { "type": "slideLeft", "duration": 0.5 }
    },
    {
      "duration": 5,
      "background": { "color": "#FFFFFF" },
      "elements": [
        {
          "type": "image",
          "src": "{{product_image_url}}",
          "position": { "x": "50%", "y": "40%" },
          "size": { "width": 700, "height": 700 },
          "style": { "borderRadius": 20, "shadow": "0 8px 30px rgba(0,0,0,0.12)" },
          "animation": { "type": "zoomIn", "duration": 1.0 }
        },
        {
          "type": "text",
          "text": "{{feature_1}}",
          "style": { "fontSize": 28, "color": "#333333" },
          "position": { "x": "30%", "y": "82%" },
          "animation": { "type": "fadeInLeft", "duration": 0.5, "delay": 0.5 }
        },
        {
          "type": "text",
          "text": "{{feature_2}}",
          "style": { "fontSize": 28, "color": "#333333" },
          "position": { "x": "70%", "y": "82%" },
          "animation": { "type": "fadeInRight", "duration": 0.5, "delay": 0.7 }
        }
      ],
      "transition": { "type": "fade", "duration": 0.4 }
    },
    {
      "duration": 4,
      "background": { "color": "#f8f9fa" },
      "elements": [
        {
          "type": "text",
          "text": "{{benefit_headline}}",
          "style": { "fontSize": 44, "fontWeight": "bold", "color": "#1a1a2e", "textAlign": "center", "maxWidth": 800 },
          "position": { "x": "50%", "y": "30%" },
          "animation": { "type": "fadeInUp", "duration": 0.6 }
        },
        {
          "type": "text",
          "text": "{{benefit_description}}",
          "style": { "fontSize": 28, "color": "#555555", "textAlign": "center", "maxWidth": 750, "lineHeight": 1.6 },
          "position": { "x": "50%", "y": "55%" },
          "animation": { "type": "fadeIn", "duration": 0.5, "delay": 0.4 }
        }
      ],
      "transition": { "type": "slideUp", "duration": 0.4 }
    },
    {
      "duration": 4,
      "background": { "gradient": { "from": "#764ba2", "to": "#667eea", "direction": "to bottom right" } },
      "elements": [
        {
          "type": "text",
          "text": "{{price}}",
          "style": { "fontSize": 72, "fontWeight": "bold", "color": "#FFFFFF" },
          "position": { "x": "50%", "y": "35%" },
          "animation": { "type": "bounceIn", "duration": 0.6 }
        },
        {
          "type": "text",
          "text": "{{cta_text}}",
          "style": {
            "fontSize": 36,
            "color": "#764ba2",
            "backgroundColor": "#FFFFFF",
            "padding": 20,
            "borderRadius": 12,
            "fontWeight": "bold"
          },
          "position": { "x": "50%", "y": "60%" },
          "animation": { "type": "fadeInUp", "duration": 0.5, "delay": 0.5 }
        },
        {
          "type": "text",
          "text": "{{website_url}}",
          "style": { "fontSize": 24, "color": "rgba(255,255,255,0.7)" },
          "position": { "x": "50%", "y": "80%" },
          "animation": { "type": "fadeIn", "duration": 0.5, "delay": 1.0 }
        }
      ]
    }
  ],
  "audio": {
    "src": "https://assets.json2video.com/audio/upbeat-corporate.mp3",
    "volume": 0.25,
    "fadeIn": 1,
    "fadeOut": 2
  }
}

Replace the {{placeholder}} values with real data. In a programmatic pipeline, use string replacement or a template engine. The AI docs cover AI-assisted template filling for dynamic content.
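A minimal substitution pass for the {{placeholder}} values might look like this. It walks the template recursively and replaces placeholders only inside string values, which avoids corrupting the JSON structure; a real pipeline might use a template engine such as Jinja2 instead:

```python
import re

def fill_template(node, data: dict):
    """Recursively replace {{key}} placeholders in every string value."""
    if isinstance(node, dict):
        return {k: fill_template(v, data) for k, v in node.items()}
    if isinstance(node, list):
        return [fill_template(item, data) for item in node]
    if isinstance(node, str):
        return re.sub(r"\{\{(\w+)\}\}", lambda m: str(data[m.group(1)]), node)
    return node

filled = fill_template(
    {"scenes": [{"elements": [{"type": "text", "text": "{{product_name}}"}]}]},
    {"product_name": "Acme Widget"},
)
print(filled["scenes"][0]["elements"][0]["text"])  # Acme Widget
```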

Performance Tips

Optimize image sizes. Don't send a 4000x3000 JPEG when the element displays at 600x400. Pre-resize images to match their display size. This alone can cut render times by 30-40%.

Minimize scene count. Each scene adds rendering overhead. A 10-scene video with 3-second scenes renders slower than a 5-scene video with 6-second scenes, even though total duration is similar.

Use JPEG over PNG for photos. PNG is better for graphics with transparency. For everything else, JPEG at 85% quality is smaller and faster to download during rendering.

Batch similar renders. If you're generating 100 videos with the same template, submit them in parallel rather than sequentially. The API processes concurrent renders on separate workers. Check your plan limits for concurrent render capacity.

Test with low quality first. Set "quality": "low" during development. Renders finish in half the time. Switch to "high" for production output.

Error Handling

Common error responses and how to handle them:

| Status Code | Meaning | Action |
|---|---|---|
| 400 | Invalid JSON schema | Check your JSON structure against this reference |
| 401 | Invalid API key | Verify your Bearer token |
| 402 | Plan limit reached | Upgrade your plan or wait for reset |
| 404 | Render not found | Check the render ID |
| 413 | Payload too large | Reduce scene count or use URL references for assets |
| 429 | Rate limited | Back off and retry after the Retry-After header value |
| 500 | Server error | Retry after 30 seconds, up to 3 times |

Always validate your JSON before submitting. A missing comma or unclosed bracket returns a 400 with a parsing error that can be hard to debug in complex templates.

import json

def validate_video_json(video_json):
    """Validate video JSON before submitting to API."""
    required_fields = ['resolution', 'scenes']
    for field in required_fields:
        if field not in video_json:
            raise ValueError(f"Missing required field: {field}")

    if not video_json['scenes']:
        raise ValueError("At least one scene is required")

    for i, scene in enumerate(video_json['scenes']):
        if 'duration' not in scene:
            raise ValueError(f"Scene {i} missing duration")
        if scene['duration'] < 0.5 or scene['duration'] > 300:
            raise ValueError(f"Scene {i} duration must be 0.5-300 seconds")
        if 'elements' not in scene or not scene['elements']:
            raise ValueError(f"Scene {i} has no elements")

    # Validate JSON is serializable
    try:
        json.dumps(video_json)
    except (TypeError, ValueError) as e:
        raise ValueError(f"JSON serialization error: {e}")

    return True

Beyond the Basics

This reference covers the core schema. For more advanced features:

  • AutoCaptions -- Automatically generate and burn subtitles into your videos. See the AutoCaptions guide.
  • AI-assisted generation -- Use natural language to generate video JSON. See the AI API docs.
  • Template marketplace -- Pre-built templates for common use cases. Browse the Template Marketplace.
  • CapCut API alternative -- How our API compares to CapCut for programmatic video. See the CapCut API comparison.

The JSON to video approach fundamentally changes how you think about video production. It stops being a creative bottleneck and becomes a data transformation problem. Define the template once, pipe in the data, get videos out. That's the entire model, and it scales to any volume you need.
