What is OmniAgent?

OmniAgent is your all-in-one creative AI agent. Instead of juggling separate tools for different media types, OmniAgent lets you generate images, videos, and music from a single, unified interface — just describe what you want, and it handles the rest.

How It Works

When you submit a request, OmniAgent spins up a dedicated sandbox environment — a fully isolated virtual machine equipped with specialized skills and the compute resources needed to bring your idea to life. This means it can handle complex, multi-step creative tasks end to end, without you having to manage any of the process.

  • Images: Generate original artwork, illustrations, mockups, concept visuals, and more

  • Videos: Produce short clips, animated sequences, and motion content from a text description

  • Music: Compose original tracks, soundscapes, or audio snippets tailored to your brief

You guide everything through plain language. Describe a style, upload a reference, or start with a simple idea — OmniAgent figures out the steps.

Choosing a Mode

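OmniAgent offers three modes depending on the complexity of your task and the quality of output you need. The modes differ in their lead and subagent models, concurrency and turn limits, thinking budget, and summarization settings. Thinking and Pro share the same models and thinking budget; Ultra upgrades both models and doubles the thinking budget.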

🟡 Thinking

Best for quick, everyday creative tasks where speed matters.

Specification                 Value
Lead Agent Model              gemini-3.0-flash
Subagent Model                gemini-2.5-flash
Max Concurrent Subagents      5
Max LangGraph Turns           40
Thinking Budget               5,000 tokens
Summarization Trigger         15,500 tokens
Summarization Keep Messages   8

Thinking mode is the lightest and fastest option. It's ideal when you need a quick image render, a short audio clip, or a simple video without a lot of iterative refinement. It runs up to 5 subagents concurrently, allows up to 40 LangGraph turns, and begins summarizing once the context reaches 15,500 tokens. These tight limits keep context lean and make it the most responsive mode for straightforward prompts.

Use Thinking when: You want fast results, your request is relatively simple, or you're experimenting with ideas.
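The concurrency cap described above can be pictured as a semaphore wrapped around subagent calls. This is only an illustrative sketch: `run_subagent`, `run_all`, and the short sleep standing in for real work are hypothetical names and behavior, and OmniAgent's actual scheduler is not documented here. The only value taken from this page is the limit of 5 concurrent subagents.

```python
import asyncio

MAX_CONCURRENT_SUBAGENTS = 5  # Thinking mode's documented cap


async def run_subagent(task: str, limiter: asyncio.Semaphore, stats: dict) -> str:
    """Hypothetical subagent call; the sleep stands in for real work."""
    async with limiter:  # wait here until one of the slots is free
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)
        stats["running"] -= 1
    return f"done: {task}"


async def run_all(tasks: list[str], cap: int = MAX_CONCURRENT_SUBAGENTS) -> tuple[list[str], int]:
    """Run every task, never letting more than `cap` run at the same time."""
    limiter = asyncio.Semaphore(cap)
    stats = {"running": 0, "peak": 0}
    results = await asyncio.gather(*(run_subagent(t, limiter, stats) for t in tasks))
    return list(results), stats["peak"]


results, peak = asyncio.run(run_all([f"task-{i}" for i in range(12)]))
print(len(results), "tasks finished; peak concurrency:", peak)
```

With 12 tasks queued, at most 5 run at once and the rest wait for a free slot. Passing `cap=10` would model the higher limit of the other modes.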


🔵 Pro

Best for more detailed work that needs higher accuracy and more room to reason.

Specification                 Value
Lead Agent Model              gemini-3.0-flash
Subagent Model                gemini-2.5-flash
Max Concurrent Subagents      10
Max LangGraph Turns           80
Thinking Budget               5,000 tokens
Summarization Trigger         25,000 tokens
Summarization Keep Messages   10

Pro mode doubles the concurrent subagents (5 to 10) and LangGraph turns (40 to 80) compared to Thinking, giving the agent more room to plan, execute, and refine. It also raises the summarization trigger from 15,500 to 25,000 tokens and Summarization Keep Messages from 8 to 10, so it holds more context before compressing, which helps with longer or multi-part creative tasks. The lead model remains gemini-3.0-flash and the thinking budget stays at 5,000 tokens, so the gains come from added capacity rather than a stronger model.

Use Pro when: Your task involves multiple creative elements, requires back-and-forth refinement, or you need noticeably better output quality than Thinking provides.
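To make the summarization settings more concrete, here is a minimal sketch of how a trigger and a keep-messages count could interact. It rests on assumptions that this page does not confirm: that "Summarization Keep Messages" means the most recent messages are retained verbatim, that older messages collapse into a single summary, and that tokens can be estimated at roughly 4 characters each. `estimate_tokens` and `compact_history` are hypothetical helpers, not OmniAgent functions.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: assume about 4 characters per token."""
    return max(1, len(text) // 4)


def compact_history(messages: list[str], trigger_tokens: int, keep_messages: int) -> list[str]:
    """Collapse older messages into one summary once the trigger is exceeded."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= trigger_tokens or len(messages) <= keep_messages:
        return messages  # still under the trigger: nothing is compressed
    older = messages[:-keep_messages]
    recent = messages[-keep_messages:]
    summary = f"[summary of {len(older)} earlier messages]"  # placeholder text
    return [summary] + recent


# 20 messages of about 1,000 tokens each, roughly 20,000 tokens in total.
history = ["x" * 4000] * 20

thinking_view = compact_history(history, trigger_tokens=15_500, keep_messages=8)
pro_view = compact_history(history, trigger_tokens=25_000, keep_messages=10)

print(len(thinking_view), len(pro_view))
```

Under these assumptions, the same 20,000-token session is compressed with Thinking's settings (one summary plus 8 messages) but left intact with Pro's, which is the sense in which Pro holds more context before compressing.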


🟣 Ultra

Best for complex, high-fidelity creative projects that demand maximum reasoning power.

Specification                 Value
Lead Agent Model              claude-sonnet-4-6
Subagent Model                gemini-3.0-flash
Max Concurrent Subagents      10
Max LangGraph Turns           150
Thinking Budget               10,000 tokens
Summarization Trigger         50,000 tokens
Summarization Keep Messages   15

Ultra is OmniAgent at full power. It's the only mode that uses Claude Sonnet 4.6 as the lead agent, bringing deeper reasoning, better instruction-following, and more nuanced creative judgment to your request. It also upgrades the subagents from gemini-2.5-flash to gemini-3.0-flash. Compared to Pro, the thinking budget doubles to 10,000 tokens, LangGraph turns go from 80 to 150, and the summarization trigger doubles to 50,000 tokens with 15 messages kept, allowing the agent to sustain long, detailed creative sessions without losing important context. The concurrent subagent limit stays at 10.

Use Ultra when: You're working on a high-stakes or complex creative project, need the most polished output possible, or your task involves many interdependent steps.
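The "Use ... when" guidance for the three modes can be condensed into a small decision helper. This is a rough sketch of one way to encode that guidance, not an OmniAgent feature: the function name `suggest_mode` and its three yes/no inputs are hypothetical simplifications of the criteria described above.

```python
def suggest_mode(high_stakes_or_interdependent: bool,
                 multi_part_or_iterative: bool) -> str:
    """Map the documented guidance onto a mode name (illustrative only)."""
    if high_stakes_or_interdependent:
        # Complex, high-stakes work, or many interdependent steps.
        return "ultra"
    if multi_part_or_iterative:
        # Multiple creative elements or back-and-forth refinement.
        return "pro"
    # Simple requests, fast results, or early experimentation.
    return "thinking"


print(suggest_mode(False, False))  # quick experiment
print(suggest_mode(False, True))   # multi-part task
print(suggest_mode(True, True))    # final, polished deliverable
```

The checks run from the most demanding case down, so a task that is both multi-part and high-stakes lands on Ultra.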

Mode Comparison at a Glance

Specification                    Thinking           Pro                Ultra
Lead Model                       gemini-3.0-flash   gemini-3.0-flash   claude-sonnet-4-6
Subagent Model                   gemini-2.5-flash   gemini-2.5-flash   gemini-3.0-flash
Max Subagents                    5                  10                 10
Max Turns                        40                 80                 150
Thinking Budget (tokens)         5,000              5,000              10,000
Summarization Trigger (tokens)   15,500             25,000             50,000
Keep Messages                    8                  10                 15
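For readers who prefer to see the comparison as data, the values above can be written as a small configuration structure. The values come from this page, but the `ModeSpec` class, its field names, and the `changed_fields` helper are hypothetical; they do not describe how OmniAgent stores its settings.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ModeSpec:
    lead_model: str
    subagent_model: str
    max_subagents: int
    max_turns: int
    thinking_budget: int        # tokens
    summarization_trigger: int  # tokens
    keep_messages: int


MODES = {
    "thinking": ModeSpec("gemini-3.0-flash", "gemini-2.5-flash", 5, 40, 5_000, 15_500, 8),
    "pro": ModeSpec("gemini-3.0-flash", "gemini-2.5-flash", 10, 80, 5_000, 25_000, 10),
    "ultra": ModeSpec("claude-sonnet-4-6", "gemini-3.0-flash", 10, 150, 10_000, 50_000, 15),
}


def changed_fields(a: ModeSpec, b: ModeSpec) -> dict[str, tuple]:
    """Return only the settings that differ between two modes."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


for name, (old, new) in changed_fields(MODES["thinking"], MODES["pro"]).items():
    print(f"thinking -> pro: {name} {old} -> {new}")
```

Comparing Thinking with Pro reports only the limits (subagents, turns, summarization trigger, keep messages), while comparing Pro with Ultra also reports both models and the thinking budget.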

Tips for Getting the Best Results

  • Be specific in your prompt. The more detail you give — style, mood, length, format — the closer the first output will be to what you're imagining.

  • Start with Thinking to explore, then switch to Ultra to finalize. Use faster modes to iterate on the concept, then run Ultra for the polished version.

  • Upload a reference when you have one. OmniAgent can use images, audio, or documents as creative anchors.

  • Iterate in conversation. You don't need to get the prompt perfect the first time — just describe what to adjust and OmniAgent will refine it.
