

Language Model

What it is: The AI engine that powers your agent’s thinking and communication. How to choose:
  • Default model: Best for most use cases - balanced performance and cost
  • Advanced models: May offer improved reasoning but at higher cost
  • Specialized models: Optimized for specific tasks like coding or creative writing
If you’re just getting started, stick with the default model. You can always upgrade later as your needs evolve.

Maximum output tokens

What it is: Override how much content an agent’s model can generate as part of its reasoning, decision making, and text generation. Explicitly setting a larger limit (8,000 or higher) may be required for agents performing complex tasks with many tools and large inputs. How to configure:
  • Lower limits: More concise responses, faster performance, lower cost
  • Higher limits: More detailed responses, but may increase processing time
Adjust based on whether you need brief updates or comprehensive explanations from your agent.
Higher limits may incur more cost, and setting a value that exceeds the model’s limit may cause the agent to fail.
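The guard described above can be sketched as a small validation step. The model names and per-model limits here are illustrative assumptions, not actual Relevance AI values:

```python
# Illustrative per-model output limits (assumed values, not real platform data).
MODEL_OUTPUT_LIMITS = {
    "default-model": 4_096,
    "large-context-model": 16_384,
}

def validate_max_tokens(model: str, requested: int) -> int:
    """Return a safe max-output-tokens override, rejecting values the
    model cannot actually generate."""
    limit = MODEL_OUTPUT_LIMITS.get(model)
    if limit is None:
        raise ValueError(f"Unknown model: {model}")
    if requested > limit:
        # Exceeding the model's hard limit can cause the agent task to fail.
        raise ValueError(f"{requested} exceeds {model}'s output limit of {limit}")
    return requested
```

Checking the override before saving it avoids the failure mode mentioned above, where a value beyond the model's limit causes the agent to fail at run time.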

Temperature

What it is: A slider that controls how creative versus predictable your agent’s responses will be. How to set it:
  • Low (0-0.3): More focused, consistent responses - ideal for factual tasks, customer support, or data analysis
  • Medium (0.4-0.7): Balanced creativity and precision - good for general conversation
  • High (0.8-1.0): More varied, creative responses - better for brainstorming, storytelling, or generating diverse ideas
Example: A sales agent might use lower temperature for explaining product specs, but higher temperature when brainstorming marketing ideas.
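The bands above can be captured as a simple task-to-temperature mapping. The task categories and exact values are illustrative assumptions chosen from the ranges described:

```python
def pick_temperature(task: str) -> float:
    """Map a task type to a temperature band per the guidance above.
    Categories and values are illustrative, not platform defaults."""
    bands = {
        "support": 0.2,        # low: focused, consistent answers
        "data_analysis": 0.1,  # low: factual, repeatable output
        "conversation": 0.5,   # medium: balanced creativity and precision
        "brainstorming": 0.9,  # high: varied, diverse ideas
    }
    if task not in bands:
        raise ValueError(f"Unknown task type: {task}")
    return bands[task]
```

In the sales-agent example, the same agent would use `pick_temperature("data_analysis")` when explaining product specs and `pick_temperature("brainstorming")` when generating marketing ideas.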

Reasoning / Thinking

What it is: For supported models (OpenAI, Google, Anthropic), you can configure your agent’s model to spend output tokens on ‘reasoning’. Decisions will be slower and cost more, but reasoning improves performance on complex problems. This setting is ignored unless your selected agent model supports reasoning and the configuration matches the provider, e.g. OpenAI reasoning effort applies only to OpenAI o-series models. How to set it: Using the ‘Provider’ dropdown, set this to match the model you’ve chosen for your Agent (OpenAI, Google, Anthropic).
  • For OpenAI, you can then select ‘Reasoning Effort’. The default is Medium, and you can also select Low or High.
  • For Google, you can enter a ‘Thinking Budget’, which can be set to any positive value to enable thinking.
  • For Anthropic, you can enter a ‘Thinking Budget’, which can be set to any value above 1024 to enable thinking.

Parallel Tool Calls (Beta)

What it is: Allows an agent to execute multiple tools or hand off multiple tasks to other agents at the same time, rather than one at a time. When enabled, the agent can issue parallel tool calls in a single step, which can significantly reduce overall execution time for workflows with independent operations. When to use it:
  • Running multiple independent operations at once (for example, querying several data sources in a single step)
  • Gathering information from multiple sources simultaneously
  • Reducing overall execution time when individual tool calls don’t depend on each other’s results
  • Parallel agent workflows in Workforces where sub-agents can operate independently
How to configure:
  1. Open your agent and click Advanced in the bottom-left of the agent builder
  2. In Advanced settings, go to the Language Model tab
  3. Toggle on Parallel Tool Calls
  4. For agent-to-agent handovers in the Workforce Builder: select the connection, open Edge Settings, and set Task Behavior to Create new task
If the agent is part of a Workforce, set all of its outbound connections to use Create new task as their Task Behavior to get the parallel speedup. The Workforce Builder highlights conflicting connections in yellow as a reminder. See Edge Settings for details on configuring Task Behavior for Workforce connections.
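The latency benefit above comes from dispatching independent calls together rather than sequentially. A minimal sketch of the idea, using stand-in tool functions rather than real agent tools:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tools_parallel(calls):
    """Execute independent (callable, args) pairs concurrently and return
    results in submission order. Assumes no call depends on another's result."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, *args) for fn, args in calls]
        return [f.result() for f in futures]

# Example: three independent "data source" lookups issued in a single step.
results = run_tools_parallel([
    (lambda q: f"crm:{q}", ("acme",)),
    (lambda q: f"docs:{q}", ("acme,")[:0] or ("acme",)[0]) if False else (lambda q: f"docs:{q}", ("acme",)),
    (lambda q: f"web:{q}", ("acme",)),
])
```

With three independent lookups, total wall-clock time approaches the slowest single call instead of the sum of all three, which is why sequential task behavior on any connection removes the speedup.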

Fallback model

What it is: A backup language model that automatically retries failed tasks when your primary model encounters errors or fails to respond. If your agent’s primary model fails to complete a task, the fallback model will automatically retry the task one time. If the fallback model also fails, the task will fail as usual and follow your configured error handling behavior. When to use it:
  • Using models that occasionally have reliability issues (e.g., Gemini)
  • Running critical workflows where you need higher reliability
  • Reducing task failures caused by temporary LLM provider issues
How to configure it:
  1. Open your agent
  2. Click Advanced in the bottom-left of the agent builder
  3. In Advanced settings, go to the Language Model tab
  4. Under Fallback model, select a model to rerun failed tasks on
Choose a fallback model from a different provider than your primary model. For example, if you’re using Gemini as your primary model, select an OpenAI model (like GPT-5) as your fallback. This prevents a single provider-wide outage from affecting both models and maximizes reliability.
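The retry behavior described above can be sketched as a single-retry wrapper. The model callables here are stand-ins for real LLM invocations:

```python
def run_with_fallback(task, primary, fallback):
    """Run a task on the primary model; on failure, retry exactly once on
    the fallback model. If the fallback also fails, the error propagates
    to the configured error handling, as described above."""
    try:
        return primary(task)
    except Exception:
        # Primary failed (e.g. a temporary provider issue): one retry.
        return fallback(task)
```

Passing models from different providers as `primary` and `fallback` is what makes the retry useful during a provider outage, since both calls would otherwise hit the same failing backend.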