
ModelSettings

type ModelSettings = object;

Settings to use when calling an LLM.

This type holds optional model configuration parameters such as temperature, topP, penalties, and truncation.

Not all models/providers support all of these parameters, so please check the API documentation for the specific model and provider you are using.
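
A quick sketch of typical usage. The property names come from this page; the `Agent` import and the `modelSettings` option are assumptions about where the settings are consumed, not part of this type's definition:

```ts
import { Agent } from '@openai/agents';

// A ModelSettings value is a plain object; every property is optional.
const settings = {
  temperature: 0.2,
  topP: 0.9,
  maxTokens: 1024,
  frequencyPenalty: 0.1,
};

// Assumption: settings are typically supplied through an agent's
// `modelSettings` option (they may also be overridable per run).
const agent = new Agent({
  name: 'Assistant',
  instructions: 'Be concise.',
  modelSettings: settings,
});
```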

optional contextManagement?: ModelSettingsContextManagement;

Context-management strategies to apply when calling the model. This setting is currently supported on OpenAI Responses requests, where it enables features such as server-side compaction. See https://developers.openai.com/api/docs/guides/compaction.


optional frequencyPenalty?: number;

The frequency penalty to use when calling the model.


optional maxTokens?: number;

The maximum number of output tokens to generate.


optional parallelToolCalls?: boolean;

Whether to use parallel tool calls when calling the model. Defaults to false if not provided.
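
Because the default is false, opting in is explicit; a minimal sketch:

```ts
// Allow the model to emit multiple tool calls in a single turn.
// The documented default is false.
const settings = { parallelToolCalls: true };
```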


optional presencePenalty?: number;

The presence penalty to use when calling the model.


optional promptCacheRetention?: "in-memory" | "24h" | null;

Enables prompt caching and controls how long cached content should be retained by the model provider. See https://platform.openai.com/docs/guides/prompt-caching#prompt-cache-retention for the available options.
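
A minimal sketch; the literal values are exactly those from the type above:

```ts
// Cache prompt content on the provider side for 24 hours;
// 'in-memory' and null are the other documented options.
const settings = { promptCacheRetention: '24h' };
```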


optional providerData?: Record<string, any>;

Additional provider-specific settings passed directly to the model request.
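
Because providerData is an untyped record, valid keys depend entirely on the provider. The `user` field below is only illustrative, not a key defined by this type:

```ts
// Keys are forwarded verbatim to the underlying model request.
// Assumption: the target provider accepts a `user` field, as the
// OpenAI APIs do; substitute your provider's own parameters.
const settings = {
  providerData: { user: 'user-1234' },
};
```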


optional reasoning?: ModelSettingsReasoning;

The reasoning settings to use when calling the model.
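
The shape of ModelSettingsReasoning is not expanded on this page. Assuming it mirrors the OpenAI `reasoning` request parameter, a sketch might look like:

```ts
// Assumption: the reasoning settings follow the OpenAI API's
// `reasoning` parameter, e.g. an `effort` of 'low' | 'medium' | 'high'.
const settings = {
  reasoning: { effort: 'medium' },
};
```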


optional retry?: ModelRetrySettings;

Runtime-only retry configuration for the model request.


optional store?: boolean;

Whether to store the generated model response for later retrieval. Defaults to true if not provided.


optional temperature?: number;

The temperature to use when calling the model.


optional text?: ModelSettingsText;

The text settings to use when calling the model.


optional toolChoice?: ModelSettingsToolChoice;

The tool choice to use when calling the model.
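
The ModelSettingsToolChoice union is not expanded here. Assuming it follows the common tool-choice convention, a sketch combining it with parallelToolCalls:

```ts
// Assumption: 'auto' lets the model decide, 'required' forces a
// tool call, 'none' forbids one, and a tool name targets that tool.
const settings = {
  toolChoice: 'required',
  parallelToolCalls: true,
};
```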


optional topP?: number;

The top-p (nucleus sampling) value to use when calling the model.


optional truncation?: "auto" | "disabled";

The truncation strategy to use when calling the model.
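
Finally, a combined sketch; the literal values come from this page, and the comment on 'disabled' reflects typical OpenAI Responses behavior:

```ts
// 'auto' lets the provider drop context when the conversation
// outgrows the model's window; 'disabled' typically surfaces an
// error instead of truncating.
const settings = {
  truncation: 'auto',
  store: false, // do not retain the generated response for later retrieval
};
```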