The Secret to Great AI Video Prompts

Discover why JSON Prompts are the key to producing cinematic videos with Veo 3 and Flora — with full control, no surprises, and at a fraction of traditional production costs.

Apr 27, 2026 · 8 min read

AI video generation took a massive leap in 2026. Tools like Veo 3 from Google DeepMind or Kling AI can produce 4K cinematic clips with native audio in seconds. But there's a problem nobody tells you about: most results are mediocre because the prompts are vague.

The secret isn't in the model. It's in how you talk to it. And the most powerful way to communicate with a video model isn't free text — it's structured JSON. Here's why and how.

The Complete Pipeline: From Text to 4K Video

How the production flow works with flora.ai, Veo 3 and Flux Pro

01

Text Prompt

flora.ai

Describes the scene in natural language: style, duration, mood, camera movement. The starting point of everything.

"cinematic spot at dawn, drone shot, golden hour, 30s"
02

JSON Builder

Structured control

The text prompt is converted into a structured JSON with scenes, audio, output format and style parameters.

{ scenes, audio, output: "4K" }
03

Veo 3 · Video

Google DeepMind

The video model receives the JSON and generates the cinematic sequence with native audio, motion and lighting.

model: veo-3 · output: mp4 · 4K
04

Flux Pro · Image

Style reference

Flux Pro generates cinematic reference frames that guide the visual style and color palette of the final video.

style reference · cinematic frame
05

Output · mp4

Final export

The final result: a 4K video with synchronized audio, ready to publish on any platform.

✓ 4K · 30s · audio · flora.ai export
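Steps 01 and 02 can be sketched in a few lines of Python. This is a minimal illustration, not flora.ai's actual builder: the function name `build_prompt` and its parameters are hypothetical, and the field names simply mirror the JSON fragments shown in this article. How the resulting JSON is submitted to flora.ai or Veo 3 is not covered here, so the sketch stops at serialization.

```python
import json


def build_prompt(style, duration_s, camera, lighting, music, output="4K"):
    """Turn the pieces of a free-text brief into a structured prompt dict.

    Hypothetical helper: the keys mirror the article's examples and are
    not a documented schema for any specific model.
    """
    return {
        "project": {"style": style, "duration": duration_s},
        "scenes": [{"camera": camera, "lighting": lighting}],
        "audio": {"music": music},
        "output": output,
    }


# The brief from step 01, split into explicit fields by hand.
prompt = build_prompt(
    style="cinematic",
    duration_s=30,
    camera="drone",
    lighting="golden hour",
    music="orchestral",
)

# Serialize to the text that would be passed on to the next step.
prompt_json = json.dumps(prompt, indent=2)
print(prompt_json)
```

The point of the helper is that every value the free-text brief left implicit now has a named slot that can be reviewed and changed on its own.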

Why the JSON Prompt Changes Everything

A free-text prompt leaves too much to the model's interpretation. A JSON prompt gives you total control over every parameter.

When you write "a cinematic video at dawn", the model makes hundreds of decisions for you: duration, camera movement, audio type, scene pacing, color palette. The result might be good — or completely different from what you imagined.

With a JSON prompt, every one of those decisions is yours. You define the scenes, camera, lighting, audio and output format, so the model has far less left to guess.

Result: reproducible and predictable

The same JSON produces far more consistent results than a loose text prompt. You can iterate by adjusting one parameter and seeing what changes.

prompt.json
{
  "project": {
    "style": "cinematic",
    "duration": 30
  },
  "scenes": [
    {
      "camera": "drone",
      "lighting": "golden hour"
    }
  ],
  "audio": {
    "music": "orchestral"
  }
}
Free-text prompt
  • Unpredictable results
  • Hard to iterate precisely
  • Model makes decisions for you
  • Generic audio by default
Structured JSON prompt
  • Full control over every parameter
  • Reproducible results
  • Precise and efficient iteration
  • Audio and scenes defined by you
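Precise iteration is easier when you can see exactly which field changed between two versions of a prompt. The following is a small sketch using only the Python standard library; the helper names `flatten` and `diff_prompts` are made up for this example, and the prompt is the prompt.json shown above.

```python
import copy


def flatten(node, prefix=""):
    """Flatten nested dicts and lists into {"path.to.key": value} pairs."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        return {prefix: node}
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


def diff_prompts(old, new):
    """Return {path: (old_value, new_value)} for every field that differs."""
    a, b = flatten(old), flatten(new)
    return {
        path: (a.get(path), b.get(path))
        for path in sorted(a.keys() | b.keys())
        if a.get(path) != b.get(path)
    }


base = {
    "project": {"style": "cinematic", "duration": 30},
    "scenes": [{"camera": "drone", "lighting": "golden hour"}],
    "audio": {"music": "orchestral"},
}

# Change exactly one parameter for the next iteration.
variant = copy.deepcopy(base)
variant["scenes"][0]["lighting"] = "blue hour"

print(diff_prompts(base, variant))
# {'scenes.0.lighting': ('golden hour', 'blue hour')}
```

Keeping this diff next to each generated clip gives you a record of which parameter produced which visual change.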

5 Keys to Video Prompts That Actually Work

The most common mistakes and how to avoid them

Define the camera movement

Always specify the shot type: drone shot, tracking shot, close-up, wide angle. Without this, the model defaults to generic choices and results look flat.

Correct: "drone shot circling the subject at golden hour"
Avoid: "a shot of the subject"

Specify the lighting

Lighting defines the entire mood of the video. Golden hour, blue hour, studio lighting, overcast — each creates a radically different atmosphere.

Correct: "golden hour backlight, warm tones, lens flare"
Avoid: "good lighting"

Include audio in the JSON

Veo 3 generates native audio. If you don't specify it in the JSON, the model adds generic ambient sound. Define the music genre, tempo and sound effects.

Correct: { "audio": { "music": "orchestral", "sfx": "wind" } }
Avoid: omitting the audio field

Use style references with Flux Pro

Before generating the video, create a reference frame with Flux Pro. This anchors the visual style and prevents Veo 3 from interpreting the prompt unexpectedly.

Correct: generate frame → use as style_reference in JSON
Avoid: relying only on text to define the style

Control duration per scene

Don't put the total duration in a single field. Split the JSON into scenes with individual durations for full control over pacing and narrative.

Correct: { "scenes": [{ "duration": 8 }, { "duration": 12 }] }
Avoid: { "duration": 30 } // no scenes
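The five keys above can be turned into a quick pre-flight check that runs before you spend credits on a generation. This is a sketch under assumptions: `lint_prompt` and the `VAGUE_VALUES` list are invented for illustration, the field names follow the examples in this article, and the reference path is a placeholder.

```python
# Values treated as "too vague to be useful" (an illustrative list).
VAGUE_VALUES = {"", "good", "good lighting", "nice", "default"}


def lint_prompt(prompt):
    """Return a list of warnings for the five common mistakes."""
    warnings = []
    scenes = prompt.get("scenes", [])
    if not scenes:
        warnings.append("no scenes: split the total duration into scenes")
    for i, scene in enumerate(scenes):
        # Keys 1 and 2: camera movement and lighting must be specific.
        for field in ("camera", "lighting"):
            value = str(scene.get(field, "")).strip().lower()
            if value in VAGUE_VALUES:
                warnings.append(f"scene {i}: '{field}' is missing or vague")
        # Key 5: each scene carries its own duration.
        if "duration" not in scene:
            warnings.append(f"scene {i}: no per-scene duration")
    # Key 3: audio is declared explicitly.
    if "audio" not in prompt:
        warnings.append("no audio field: the model will pick the sound")
    # Key 4: a reference frame anchors the visual style.
    if "style_reference" not in prompt:
        warnings.append("no style_reference: style relies on text alone")
    return warnings


vague = {"duration": 30}
specific = {
    "style_reference": "frames/reference.png",  # hypothetical path
    "audio": {"music": "orchestral", "sfx": "wind"},
    "scenes": [
        {"camera": "drone shot circling the subject",
         "lighting": "golden hour backlight", "duration": 8},
        {"camera": "close-up", "lighting": "studio lighting", "duration": 12},
    ],
}

for warning in lint_prompt(vague):
    print(warning)
print(lint_prompt(specific))  # []
```

The check only catches missing or obviously vague fields. It cannot judge whether a specific value is a good creative choice.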

Real Example: Cinematic Ad Spot

A 30-second spot for a luxury watch brand, produced entirely with AI

The client needed a 30-second spot to launch a new watch line. Traditional budget: €15,000–€25,000 (film crew, locations, post-production). With the AI pipeline: €180 in model credits and 4 hours of work.

The key was structuring the JSON with 4 distinct scenes: a drone opening, a watch detail shot, a lifestyle scene and a closing logo shot. Each scene has its own lighting, camera movement and duration.
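A four-scene structure like this one could be laid out as follows. Only the four scene names and the 30-second total come from the project described here; the camera, lighting and per-scene durations below are illustrative guesses, not the values actually used for the client.

```python
import json

TOTAL_SECONDS = 30

# Scene names follow the article; every other value is an illustrative guess.
scenes = [
    {"name": "drone opening", "camera": "drone shot",
     "lighting": "golden hour", "duration": 8},
    {"name": "watch detail", "camera": "macro close-up",
     "lighting": "studio lighting", "duration": 8},
    {"name": "lifestyle", "camera": "tracking shot",
     "lighting": "blue hour", "duration": 10},
    {"name": "logo close", "camera": "static wide angle",
     "lighting": "studio lighting", "duration": 4},
]

# Guard against pacing drift: scene durations must add up to the spot length.
total = sum(scene["duration"] for scene in scenes)
if total != TOTAL_SECONDS:
    raise ValueError(f"scenes add up to {total}s, expected {TOTAL_SECONDS}s")

spot = {
    "project": {"style": "cinematic", "duration": total},
    "scenes": scenes,
}
print(json.dumps(spot, indent=2))
```

Checking the sum before generating keeps a change to one scene from silently stretching or shortening the whole spot.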

Savings: about 99% of traditional cost

From roughly €20,000 (the midpoint of the traditional budget range) to €180 in model credits. No film crew, no locations, no production days.
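The savings figure is simple arithmetic on the numbers quoted above. The sketch below runs it across the low end, midpoint and high end of the traditional budget range. It counts only the €180 in model credits and leaves out the cost of the 4 hours of work, which the article does not price.

```python
AI_COST_EUR = 180  # model credits only, as quoted in the article


def savings_pct(traditional_eur, ai_eur=AI_COST_EUR):
    """Percentage saved relative to a traditional production budget."""
    return (1 - ai_eur / traditional_eur) * 100


for budget in (15_000, 20_000, 25_000):
    print(f"€{budget:,}: {savings_pct(budget):.1f}% saved")
# €15,000: 98.8% saved
# €20,000: 99.1% saved
# €25,000: 99.3% saved
```

Across the whole quoted range the saving stays between roughly 98.8% and 99.3%, which is where the rounded 99% comes from.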

The 2026 Stack

The complete ecosystem for AI video production

flora.ai

Main orchestrator

Platform that connects all models and manages the complete AI video production pipeline.

Veo 3

Video generation

Google DeepMind's model for cinematic video generation with integrated native audio.

Flux Pro

Visual reference

High-quality image generator for creating reference frames that guide the video's visual style.

Kling AI

Video alternative

Alternative to Veo 3 with excellent camera movement control and temporal coherence.

Runway Gen-4

Editing & refinement

Ideal for editing generated clips, adding effects and refining details in the final video.

99%

Cost reduction vs traditional production

4h

Average production time for a 30s spot

4K

Native output resolution with Veo 3

Possible iterations with no extra filming cost

Conclusion

AI video generation isn't magic — it's prompt engineering. The difference between a mediocre result and a professional-quality cinematic spot lies in how you structure the instructions.

JSON prompts give you the control that free-text prompts simply can't offer. Combined with a well-defined pipeline — flora.ai as orchestrator, Veo 3 for video, Flux Pro for visual references — you can produce cinematic content at a fraction of traditional costs. The future of video production is already here.

Want to implement AI video production in your business?

At AFENIX we help brands and agencies integrate AI video pipelines, reducing production costs by up to 99% without sacrificing cinematic quality.
