Text-to-Video API: Build Your First AI Video in Under 5 Minutes
You type a sentence. An API returns a video. That's text-to-video in 2026 — and it's no longer experimental. Production teams use it daily for social content, ads, and product demos.
This guide gets you from zero to rendered video in under 5 minutes. No video editing experience needed.
What a Text-to-Video API Actually Does
A text-to-video API accepts a text prompt (like "a drone shot over a tropical island at golden hour") and returns a rendered video clip — typically 3-10 seconds at 720p or 1080p.
Under the hood, a generative AI model (like Kling 3.0, Sora 2, or Veo 3.1) interprets your prompt and generates frames that are stitched into a video. The API handles all the compute — you don't need GPUs, model weights, or machine learning expertise.
Through the SamAutomation AI API, you access 29+ video generation models from a single endpoint. Same authentication, same response format, different models.
Step 1: Get Your API Key (1 minute)
Sign up at samautomation.work. The free tier includes enough credits to generate your first videos — no credit card required.
Navigate to your dashboard and copy your API key from the settings page. You'll need this for every API call.
Step 2: Choose Your Model (30 seconds)
For your first video, use Kling 3.0. It offers the best balance of quality, speed, and cost:
- Generation time: 15-30 seconds
- Quality: 8.5/10
- Cost: ~50 credits per 5-second clip
For a full comparison of all available models, check our AI video model comparison.
Step 3: Write Your Prompt (1 minute)
The prompt determines everything. Here's the structure that produces consistent results:
[Subject] + [Action] + [Setting] + [Style] + [Camera]
Example prompts that work:
- "A coffee cup steaming on a wooden desk, morning sunlight through a window, cinematic, shallow depth of field"
- "A woman jogging through a city park at sunrise, slow motion, golden hour lighting, shot on 35mm film"
- "Product packaging rotating 360 degrees on a white background, studio lighting, commercial quality"
- "Abstract liquid metal flowing and morphing, iridescent colors, macro lens, 4K quality"
Prompts that don't work well:
- "Make a cool video" (too vague)
- "A video of everything happening in a busy city street with 50 people and cars and bikes and..." (too complex)
- "Text saying 'SALE 50% OFF' appearing on screen" (AI models struggle with text rendering)
Pro tip: Keep prompts under 100 words. Be specific about the subject and style, but don't over-describe every detail. Let the model fill in the gaps — it's often more creative than prescriptive prompts.
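The [Subject] + [Action] + [Setting] + [Style] + [Camera] structure can be sketched as a small helper. This is just an illustrative convenience function, not part of the SamAutomation API; the field names are my own.

```python
def build_prompt(subject, action="", setting="", style="", camera=""):
    """Join the non-empty prompt parts into a comma-separated string."""
    parts = [subject, action, setting, style, camera]
    return ", ".join(p for p in parts if p)

# Recreates the first example prompt from the list above
prompt = build_prompt(
    subject="A coffee cup steaming on a wooden desk",
    setting="morning sunlight through a window",
    style="cinematic",
    camera="shallow depth of field",
)
```

Keeping the parts separate makes it easy to swap one slot (say, the camera direction) while holding the rest of the prompt constant.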
Step 4: Make the API Call (1 minute)
Here's a simple Python example:
```python
import requests

response = requests.post(
    "https://samautomation.work/api/ai/video/generate/",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "kling-3.0",
        "prompt": "A coffee cup steaming on a wooden desk, morning sunlight through a window, cinematic",
        "duration": 5,
        "aspect_ratio": "16:9"
    }
)

result = response.json()
print(f"Video URL: {result['video_url']}")
```
Or use cURL if you prefer the command line:
```bash
curl -X POST https://samautomation.work/api/ai/video/generate/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-3.0",
    "prompt": "A coffee cup steaming on a wooden desk, cinematic",
    "duration": 5,
    "aspect_ratio": "16:9"
  }'
```
The API returns a job ID immediately. Poll for the result, or set up a webhook to get notified when rendering completes.
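If you poll rather than use a webhook, the loop can look like the sketch below. The `status`/`video_url` field names and the idea of a status-fetching callable are assumptions for illustration; check the API docs for the exact response shape of your account. With `requests`, `fetch_status` could be something like `lambda: requests.get(status_url, headers=headers).json()`.

```python
import time

def wait_for_video(fetch_status, poll_interval=5, timeout=180):
    """Call fetch_status() until the job reports 'completed' or we time out."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        job = fetch_status()
        if job.get("status") == "completed":
            return job["video_url"]
        if job.get("status") == "failed":
            raise RuntimeError(job.get("error", "generation failed"))
        time.sleep(poll_interval)
    raise TimeoutError("video not ready within timeout")
```

A generous timeout matters here: renders usually finish in under two minutes, but server load can stretch that.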
Step 5: Get Your Video (1-2 minutes wait)
Depending on the model and server load, your video renders in 15-90 seconds. The response includes:
- `video_url` — Direct download link for the rendered MP4
- `thumbnail_url` — Preview frame from the video
- `credits_used` — How many credits this generation consumed
- `generation_time` — How long it took
Download the video, review it, and iterate on your prompt if needed.
Making It Production-Ready
Your first AI-generated clip is a building block. To turn it into publishable content, add these layers:
Add Captions
Pass the video through AutoCaptions to add burned-in subtitles. Essential for social media where 85% of video is watched without sound.
Compose with JSON Templates
Combine your AI clip with text overlays, logos, music, and transitions using the JSON Video API. This gives you a polished final product instead of a raw AI clip.
Build a Workflow
Connect the entire pipeline in n8n:
1. Schedule trigger → runs daily
2. AI generates a script from your content calendar
3. Text-to-video API creates the visual
4. JSON Video API adds branding and captions
5. Automated posting to your social channels
Optimizing Your Results
Image-to-Video: More Control
If you need the video to look like a specific image (your product, your logo, a specific style), use image-to-video instead of text-to-video:
```json
{
  "model": "kling-3.0",
  "image_url": "https://your-site.com/product-photo.jpg",
  "prompt": "gentle zoom in with subtle camera movement",
  "duration": 5
}
```
This animates your existing image, giving you much more predictable output than generating from text alone.
Negative Prompts
Some models support negative prompts — telling the AI what to avoid:
```json
{
  "prompt": "a modern kitchen, bright and clean",
  "negative_prompt": "people, text, watermarks, blurry"
}
```
Seed Values
For reproducible results, pass a seed value. Same prompt + same seed = same (or very similar) output. Useful for iterating: keep the seed, adjust the prompt.
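In practice that means pinning the seed in a base payload and only varying the prompt. A minimal sketch, assuming the model accepts a `seed` parameter (not every model does):

```python
# Fixed seed keeps successive generations visually comparable while you
# refine the wording of the prompt.
BASE = {"model": "kling-3.0", "duration": 5, "seed": 42}

def payload_for(prompt):
    """Merge a prompt into the shared base request payload."""
    return {**BASE, "prompt": prompt}

variants = [payload_for(p) for p in (
    "a modern kitchen, bright and clean",
    "a modern kitchen, bright and clean, morning light",
)]
```

Each entry in `variants` can be sent as the `json=` body of the generation request shown earlier.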
Cost Breakdown
| Model | 5-second clip | 10-second clip | Credits/month (Basic) |
|---|---|---|---|
| Kling 3.0 | ~50 credits | ~90 credits | 1,450 included |
| Veo 3.1 | ~80 credits | ~150 credits | 1,450 included |
| Sora 2 | ~120 credits | ~220 credits | 1,450 included |
| Pixverse V4.5 | ~30 credits | ~55 credits | 1,450 included |
On the Basic plan (€29.95/month), you get 1,450 AI credits — enough for roughly 29 Kling clips or 18 Veo clips per month. The Pro plan (€49.95/month) gives you 2,450 credits.
For high-volume needs, the BYOK (Bring Your Own Key) option lets you connect your own model API keys and pay the model providers directly, which is often cheaper at scale.
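A quick back-of-the-envelope check on those numbers, using the approximate per-clip credit figures from the table above (actual billing may vary):

```python
# Approximate credits per ~5-second clip, per the cost table
CREDITS_PER_5S = {"kling-3.0": 50, "veo-3.1": 80, "sora-2": 120, "pixverse-v4.5": 30}

def clips_per_month(model, monthly_credits=1450):
    """How many ~5-second clips a monthly credit allowance covers."""
    return monthly_credits // CREDITS_PER_5S[model]

clips_per_month("kling-3.0")  # 29 on the Basic plan
clips_per_month("veo-3.1")    # 18
```

Pass `monthly_credits=2450` to see the Pro-plan figures instead.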
What's Next
Now that you've generated your first AI video, explore these next steps:
- Try different models — Each model has a distinct style. Compare them all.
- Build a pipeline — Automate daily content creation with n8n workflows.
- Add personalization — Use JSON templates to create data-driven video at scale.
- Repurpose content — Turn your blog posts into 50+ videos automatically.
The API documentation has everything you need: samautomation.work/api/ai/docs. Start experimenting — the best way to learn what these models can do is to use them.
Related Articles
JSON to Video: The Complete Developer Reference Guide
Complete developer guide for JSON to video APIs. Schema reference, code examples in Python, JavaScr…

How to Build a Telegram Video Bot with n8n and a Video API
Build a Telegram bot that generates videos on demand using n8n and a JSON Video API. Complete tutor…

n8n Video Automation: The Complete Guide to No-Code Video Workflows
Build automated video workflows with n8n. Step-by-step guide with templates, webhook triggers, and …