Roadmap

What we've shipped, what's in progress, and what we plan to build next.

Last Shipped

Jinja2 Template Support in the Playground
11/17/2025
Playground
Use Jinja2 templating in prompts to add conditional logic, filters, and template blocks. The template format is stored in the configuration schema, and the SDK handles rendering automatically.
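The stored template is ordinary Jinja2, so a plain jinja2 render shows the idea. The prompt text and variable names below are made up for illustration; this is not the Agenta SDK API, which handles the rendering for you.

```python
from jinja2 import Template

# Hypothetical prompt template: the content and variable names are illustrative only.
prompt_template = Template(
    "You are a support assistant for {{ product_name }}.\n"
    "{% if tone == 'formal' %}Respond formally and avoid slang.\n"
    "{% else %}Keep the tone friendly and casual.\n"
    "{% endif %}"
    "Known issues: {{ issues | join(', ') }}.\n"
    "Answer the user's question: {{ question }}"
)

rendered = prompt_template.render(
    product_name="Acme Cloud",
    tone="formal",
    issues=["slow uploads", "login timeouts"],
    question="Why is my upload stuck?",
)
print(rendered)
```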
Programmatic Evaluation through the SDK
11/11/2025
Evaluation
Run evaluations programmatically from code with full control over test data and evaluation logic. Evaluate agents built with any framework and view results in the Agenta dashboard.
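As a rough sketch of the idea (not the actual SDK calls), a programmatic evaluation boils down to looping over your test data, invoking your agent, and applying your own scoring logic. The agent function and exact-match scorer below are hypothetical placeholders; the Agenta SDK calls that record the run for the dashboard are not shown.

```python
# Hypothetical sketch of a programmatic evaluation loop.
# `call_my_agent` stands in for an agent built with any framework;
# `exact_match` stands in for your own evaluation logic.

test_cases = [
    {"input": "What is the capital of France?", "expected": "Paris"},
    {"input": "2 + 2 = ?", "expected": "4"},
]

def call_my_agent(question: str) -> str:
    # Placeholder: invoke your real agent/LLM here.
    return "Paris" if "France" in question else "4"

def exact_match(output: str, expected: str) -> float:
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

results = []
for case in test_cases:
    output = call_my_agent(case["input"])
    results.append({
        "input": case["input"],
        "output": output,
        "score": exact_match(output, case["expected"]),
    })

print(f"mean score: {sum(r['score'] for r in results) / len(results):.2f}")
```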
Online Evaluation
11/11/2025
Evaluation
Automatically evaluate every request to your LLM application in production. Catch hallucinations and off-brand responses as they happen instead of discovering them through user complaints.
Customize LLM-as-a-Judge Output Schemas
11/10/2025
Evaluation
Configure LLM-as-a-Judge evaluators with custom output schemas. Use binary, multiclass, or custom JSON formats. Enable reasoning for better evaluation quality.
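To make the three formats concrete, here are illustrative JSON Schemas for binary, multiclass, and custom judge outputs. The field names are assumptions for the example, not Agenta's exact configuration format.

```python
import json

# Illustrative output schemas for an LLM-as-a-Judge evaluator.
# Field names and structure are examples only.

binary_schema = {
    "type": "object",
    "properties": {
        "verdict": {"type": "boolean"},
        "reasoning": {"type": "string"},  # optional reasoning for better judgments
    },
    "required": ["verdict"],
}

multiclass_schema = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["correct", "partially_correct", "incorrect"]},
        "reasoning": {"type": "string"},
    },
    "required": ["label"],
}

custom_schema = {
    "type": "object",
    "properties": {
        "faithfulness": {"type": "number", "minimum": 0, "maximum": 1},
        "tone_ok": {"type": "boolean"},
    },
    "required": ["faithfulness", "tone_ok"],
}

print(json.dumps(multiclass_schema, indent=2))
```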
Structured Output Support in the Playground
4/15/2025
Playground
Define and validate structured output formats in the playground. Save structured output schemas as part of your prompt configuration.
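As an illustration of defining and validating a structured output schema outside the playground, the sketch below checks a model response against a made-up ticket schema with the jsonschema library. The field names and validation flow are assumptions, not the playground's configuration format.

```python
import json
from jsonschema import validate, ValidationError

# Hypothetical structured-output schema for a support-ticket classifier.
ticket_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "bug", "feature_request"]},
        "summary": {"type": "string"},
        "urgent": {"type": "boolean"},
    },
    "required": ["category", "summary", "urgent"],
    "additionalProperties": False,
}

# Pretend this string came back from the model.
model_output = '{"category": "bug", "summary": "Upload stalls at 99%", "urgent": true}'

try:
    validate(instance=json.loads(model_output), schema=ticket_schema)
    print("output matches the schema")
except ValidationError as err:
    print(f"schema violation: {err.message}")
```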
Vertex AI Provider Support
10/24/2025
Integration, Playground
Use Google Cloud's Vertex AI models including Gemini and partner models in the playground, Model Hub, and through Gateway endpoints.
Filtering Traces by Annotation
10/14/2025
Observability
Filter and search for traces based on their annotations. Find traces with low scores or feedback quickly using the rebuilt filtering system.

In Progress

Chat Session View in Observability
Observability
Display entire chat sessions in one consolidated view. Currently, each trace in a chat session appears in a separate tab. This feature will group traces by session ID and show the complete conversation in a single view.
Navigation Links from Traces to App/Environment/Variant
Observability
Add clickable links in the observability trace and drawer views to navigate to the application, variant, version, and environment used in each trace. This makes it easy to jump directly to the configuration that generated a specific trace.
Support for built-in LLM Tools (e.g. web search) in the Playground
Playground
Use built-in LLM tools such as web search directly from the playground.
Folders for Prompt Organization
Playground
Create folders and subfolders to organize prompts in the playground. Move prompts between folders and search within specific folders to structure prompt libraries.
Projects and Workspaces
Misc
Improve organization structure by adding projects. Create projects for different products and scope resources to specific projects.
PDF Support in the Playground
Playground
Add PDF inputs for models that accept them (OpenAI, Gemini, etc.) through base64 encoding, URLs, or file IDs. This extends to human evaluation, so reviewers can assess model responses on PDF inputs.
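Of the three options, base64 encoding is easy to sketch. The snippet below shows only the encoding step with a hypothetical file path; the exact request field the resulting data URL goes into varies by provider, so it is not shown here.

```python
import base64
from pathlib import Path

# Hypothetical file path; only the encoding step is shown because the request
# field the encoded PDF goes into depends on the provider's API.
pdf_path = Path("contract.pdf")
pdf_b64 = base64.b64encode(pdf_path.read_bytes()).decode("utf-8")

# Many APIs accept the document as a data URL alongside the text prompt.
pdf_data_url = f"data:application/pdf;base64,{pdf_b64}"
print(pdf_data_url[:60] + "...")
```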
Prompt Snippets
Playground
Create reusable prompt snippets that can be referenced across multiple prompts. Reference specific versions or always use the latest version to maintain consistency across prompt variants.
Date Range Filtering in Metrics Dashboard
Observability
Filter traces by date range in the metrics dashboard to focus on a specific time period.

Planned

Feature Requests

Upvote or comment on the features you care about, or request a new feature.