neondatabase/neon-ai-gateway

One API and one credential for frontier and open-source LLMs, built into your Neon branch and powered by Databricks. Use when a user wants to call an LLM, add AI/chat/an agent to their app, route between model providers (OpenAI, Anthropic, Google/Gemini, Meta, Alibaba, DeepSeek), or avoid juggling separate provider API keys and accounts — especially when they already use Neon and want AI requests to branch with their project. Works with the OpenAI SDK, Anthropic SDK, google-genai, the Vercel AI SDK, and Mastra by changing only the base URL. Triggers include "call an LLM", "add AI to my app", "chat completion", "model routing", "LLM proxy/gateway", "one API for all models", "use Claude/GPT/Gemini", "AI SDK", "Mastra agent", "Neon AI Gateway", and "log/rate-limit AI calls".

v1.0 · Saved Jun 12, 2026

Neon AI Gateway

Preview feature: currently available only in us-east-2.

The Neon AI Gateway is the LLM inference layer built into your Neon branch. One API and one Neon credential give you access to frontier and open-source models from Anthropic, OpenAI, Google, Meta, Alibaba, DeepSeek, and Databricks, all served on Databricks infrastructure. Your existing OpenAI, Anthropic, or Gemini SDK works by changing only the base URL.

Use this skill to help the user send model calls through the gateway, wire it into the AI SDK or Mastra, and switch providers without rewiring code. Deliver a working inference request, a configured agent, or a precise answer from the official Neon docs.

When to Use

Reach for the AI Gateway whenever an app or agent needs to call an LLM and the user would rather not manage model providers themselves:

  • One credential instead of many provider accounts. A single Neon credential reaches the entire model catalog across seven providers. No separate OpenAI / Anthropic / Google billing, keys, or signups to provision and rotate.
  • Switch models without rewiring. The unified endpoint is OpenAI-compatible and works with every model in the catalog — change one model field to move between Claude, GPT, and Gemini. Standard SDKs (OpenAI, Anthropic, google-genai) work with just a base-URL change.
  • AI follows your branches. Each branch has its own gateway endpoint, scoped with the same lineage as your database. AI requests from a preview/feature branch are isolated to that branch — the same isolation your data already gets — which makes preview, CI, and agent environments self-contained.
  • No extra infrastructure, and it's already next to your data. The gateway lives inside your Neon project (and is injected into Neon Functions automatically), runs on the same Databricks infrastructure that serves trillions of tokens a month, and supports streaming (SSE) out of the box.

If the user already has a deep, single-provider integration and no interest in Neon branching or multi-model routing, a direct provider SDK is fine. Once they want one credential, model portability, or branch-scoped AI, the gateway is the better fit.

What It Does

  • One API for all models — Frontier and open-source models behind a single endpoint, addressed by their catalog ID (e.g. claude-sonnet-4-6, gpt-5-mini, gemini-2-5-flash).
  • Standard SDKs, one URL change — OpenAI SDK and AI SDK (OpenAI-compatible MLflow/Responses routes), Anthropic SDK (native Messages), google-genai (native Gemini).
  • Branch-scoped — Each branch gets its own gateway host; the Neon credential authorizes requests for that branch and its descendants.
  • Streaming — Server-sent events work on all endpoints with no extra configuration.
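As a minimal sketch of what consuming a stream involves, the helper below pulls text deltas out of a raw SSE payload. It assumes the standard OpenAI Chat Completions chunk shape (`choices[0].delta.content`, terminated by `data: [DONE]`), which the OpenAI-compatible route is expected to follow; verify the exact payloads against the Neon docs. The name `extractDeltas` is hypothetical and not part of any SDK.

```typescript
// Hypothetical helper, not part of any Neon or OpenAI package.
// Assumes OpenAI Chat Completions streaming chunks; verify against the gateway docs.
interface ChatChunk {
  choices?: { delta?: { content?: string | null } }[];
}

function extractDeltas(ssePayload: string): string[] {
  const deltas: string[] = [];
  for (const line of ssePayload.split(/\r?\n/)) {
    // Only "data:" lines carry payloads; skip blank separators and other fields.
    if (!line.startsWith("data:")) continue;
    const data = line.slice("data:".length).trim();
    if (data === "") continue;
    if (data === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(data) as ChatChunk;
    const text = chunk.choices?.[0]?.delta?.content;
    if (typeof text === "string" && text.length > 0) deltas.push(text);
  }
  return deltas;
}
```

This sketch expects complete lines. A real network stream can split a line across reads, so buffer until a newline before parsing, or let the SDK's own streaming helpers handle it.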

Setup

The gateway is part of neon.ts (see the neon skill for the branch-first workflow and neon.ts basics). Enable it under preview.aiGateway:

// neon.ts
import { defineConfig } from "@neondatabase/config/v1";

export default defineConfig({
  preview: {
    aiGateway: true,
  },
});

neonctl deploy   # provisions the gateway on the linked branch

Environment variables

When preview.aiGateway is enabled, Neon injects the gateway credentials as OpenAI-standard env vars (so the OpenAI SDK and AI SDK work from the environment with no config), plus NEON_-branded aliases. Inside a deployed Neon Function these are injected automatically; locally, neonctl env pull writes them to .env/.env.local (or use neon-env run -- <cmd> to inject at runtime without a file):

Variable | Meaning
--- | ---
OPENAI_API_KEY | Gateway bearer token (a Neon credential, nt_live_...)
OPENAI_BASE_URL | Full OpenAI-dialect route, including /ai-gateway/openai/v1: https://<branch-id>-api.ai.<region>.aws.neon.tech/ai-gateway/openai/v1
NEON_AI_GATEWAY_TOKEN | Same bearer as OPENAI_API_KEY (survives a user overriding OPENAI_* with their own keys)
NEON_AI_GATEWAY_BASE_URL | Bare branch gateway host (scheme://host, no path; no /ai-gateway): https://<branch-id>-api.ai.<region>.aws.neon.tech

The two base URLs are different: OPENAI_BASE_URL already includes the full /ai-gateway/openai/v1 (Responses) route, while NEON_AI_GATEWAY_BASE_URL is just the bare host, so you append /ai-gateway/<dialect> yourself (this is also what the @neondatabase/ai-sdk-provider does for you). The routes under the host are:

  • /ai-gateway/mlflow/v1 — unified, OpenAI Chat Completions-compatible; recommended default, works with every provider.
  • /ai-gateway/openai/v1 — OpenAI Responses API (required for gpt-5-…-codex variants and gpt-5-5-pro). This is the route OPENAI_BASE_URL already points at, because the @ai-sdk/openai provider uses the Responses API by default.
  • /ai-gateway/anthropic/v1 — native Anthropic Messages (extended thinking, prompt caching).
  • /ai-gateway/gemini/v1beta/... — native Gemini generateContent.

So ${NEON_AI_GATEWAY_BASE_URL}/ai-gateway/mlflow/v1 is the chat-completions endpoint, ${NEON_AI_GATEWAY_BASE_URL}/ai-gateway/openai/v1 equals OPENAI_BASE_URL, and so on. If you only have OPENAI_BASE_URL and need chat completions, swap the dialect: baseUrl.replace("/openai/v1", "/mlflow/v1") (this is what the Mastra example does).
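The URL rules above can be sketched as two small helpers: one builds a dialect route from the bare host, the other derives the chat-completions route from OPENAI_BASE_URL. The function and type names are illustrative assumptions, not part of any Neon package; only the route paths come from the list above.

```typescript
// Hypothetical helpers; only the route paths are taken from the documented list.
type Dialect = "mlflow" | "openai" | "anthropic" | "gemini";

const ROUTE_SUFFIX: Record<Dialect, string> = {
  mlflow: "/ai-gateway/mlflow/v1",
  openai: "/ai-gateway/openai/v1",
  anthropic: "/ai-gateway/anthropic/v1",
  gemini: "/ai-gateway/gemini/v1beta",
};

// Build a dialect route from the bare host (the NEON_AI_GATEWAY_BASE_URL value).
function routeFromHost(host: string, dialect: Dialect): string {
  return host.replace(/\/+$/, "") + ROUTE_SUFFIX[dialect];
}

// Derive the chat-completions route from an OPENAI_BASE_URL value by swapping the dialect.
// Throws instead of silently returning an unchanged URL when the suffix is missing.
function chatCompletionsFromOpenAI(openaiBaseUrl: string): string {
  const trimmed = openaiBaseUrl.replace(/\/+$/, "");
  const suffix = ROUTE_SUFFIX.openai;
  if (!trimmed.endsWith(suffix)) {
    throw new Error(`Expected a URL ending in ${suffix}, got: ${openaiBaseUrl}`);
  }
  return trimmed.slice(0, -suffix.length) + ROUTE_SUFFIX.mlflow;
}
```

Compared with a bare string replace, the explicit suffix check fails loudly if OPENAI_BASE_URL has been overridden to point at a different provider.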

For typed access, parseEnv (from @neondatabase/env) returns env.aiGateway (apiKey, baseUrl) derived from your neon.ts.
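If you are not using @neondatabase/env, a hand-rolled loader along these lines gives a similar typed result. This is a sketch under stated assumptions, not the parseEnv implementation: the name `readGatewayEnv` is hypothetical, and the choice to prefer the NEON_-branded variables (so a user's own OPENAI_* overrides do not leak in) is a design decision made here.

```typescript
// Hypothetical loader; mirrors the apiKey/baseUrl shape of env.aiGateway but is not parseEnv.
interface GatewayEnv {
  apiKey: string;
  baseUrl: string; // OpenAI-dialect route, ending in /ai-gateway/openai/v1
}

function readGatewayEnv(env: Record<string, string | undefined>): GatewayEnv {
  // Prefer the NEON_-branded aliases: they survive a user overriding OPENAI_*.
  const apiKey = env.NEON_AI_GATEWAY_TOKEN ?? env.OPENAI_API_KEY;
  const host = env.NEON_AI_GATEWAY_BASE_URL;
  const baseUrl = host
    ? host.replace(/\/+$/, "") + "/ai-gateway/openai/v1"
    : env.OPENAI_BASE_URL;
  if (!apiKey || !baseUrl) {
    throw new Error("AI Gateway variables are missing; enable preview.aiGateway and pull the env.");
  }
  return { apiKey, baseUrl };
}
```

In a Node process this would be called as `readGatewayEnv(process.env)`.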

Use with the Vercel AI SDK

The with-ai-sdk example deploys an agent as a Neon Function that streams text and generates images. The @ai-sdk/openai provider reads OPENAI_API_KEY and OPENAI_BASE_URL from the injected env automatically — no client config needed; just pick a catalog model:

import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";

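// `messages` is the chat history parsed from the incoming request.
const result = streamText({
  model: openai("gpt-5-mini"),
  messages,
  tools: {
    // Provider-defined image generation tool, served through the gateway.
    image_generation: openai.tools.imageGeneration({
      outputFormat: "jpeg",
      size: "1024x1024",
    }),
  },
});
// Return the stream from the Function's request handler.
return result.toUIMessageStreamResponse();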

For multi-provider routing from a single call, the dedicated @neondatabase/ai-sdk-provider reads NEON_AI_GATEWAY_BASE_URL + NEON_AI_GATEWAY_TOKEN and routes each model to the best endpoint (Anthropic → Messages, OpenAI/Codex → Responses, everything else → MLflow):

import { neon } from "@neondatabase/ai-sdk-provider/v1";
import { generateText } from "ai";

const { text } = await generateText({
  model: neon("claude-haiku-4-5"), // or gpt-5-3-codex, gemini-2-5-flash, ...
  prompt: "Summarize Postgres for me.",
});
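The routing rule described above (Anthropic to Messages, OpenAI/Codex to Responses, everything else to MLflow) can be pictured as a small function. This is only a guess at the shape of the rule: matching on model-ID prefixes is an assumption made for illustration, and `pickRoute` is a hypothetical name, not the provider's actual code.

```typescript
// Illustrative only: the real @neondatabase/ai-sdk-provider may decide differently.
type Route = "anthropic" | "openai" | "mlflow";

function pickRoute(modelId: string): Route {
  if (modelId.startsWith("claude-")) return "anthropic"; // native Messages route
  if (modelId.startsWith("gpt-")) return "openai"; // Responses route, including codex variants
  return "mlflow"; // unified chat-completions route for everything else
}
```

With the catalog IDs used in this document, `pickRoute("gemini-2-5-flash")` falls through to the unified route.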

Use with Mastra

The with-mastra example runs a memory-backed agent (threads/messages in Postgres via @mastra/pg) as a Neon Function, with its model pointed at the gateway. It reads env.aiGateway from parseEnv and uses the chat-completions (MLflow) dialect:

import { Agent } from "@mastra/core/agent";
import { parseEnv } from "@neondatabase/env/v1";
import config from "../neon";

const env = parseEnv(config);
const gatewayUrl = env.aiGateway.baseUrl.replace("/openai/v1", "/mlflow/v1");

export const personalAssistant = new Agent({
  id: "personal-assistant",
  name: "personal-assistant",
  instructions:
    "You are a warm, concise personal assistant with long-term memory.",
  model: {
    // Mastra-style "<provider>/<model>" ID; the catalog ID itself is claude-haiku-4-5.
    id: "neon/claude-haiku-4-5",
    url: gatewayUrl,
    apiKey: env.aiGateway.apiKey,
  },
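  memory, // Memory instance defined elsewhere (the example stores threads/messages in Postgres via @mastra/pg)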
});

Use with plain SDKs

The injected OPENAI_API_KEY and OPENAI_BASE_URL are OpenAI-standard, so new OpenAI() picks them up with zero config. Since OPENAI_BASE_URL is the OpenAI Responses dialect (/openai/v1), call the Responses API:

import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY + OPENAI_BASE_URL from the env

const res = await client.responses.create({
  model: "gpt-5-mini", // swap to claude-sonnet-4-6, gemini-2-5-flash, ...
  input: "What is Neon?",
});

For the unified chat-completions dialect (/mlflow/v1) instead, point the client at it. The ergonomic way is to swap the dialect on the injected base URL rather than rebuild it (same move the Mastra example makes):

const client = new OpenAI({
  baseURL: process.env.OPENAI_BASE_URL!.replace("/openai/v1", "/mlflow/v1"),
});

const res = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "What is Neon?" }],
});

The Anthropic SDK and google-genai work the same way for native provider features: point them at the /anthropic and /gemini routes on the bare gateway host (${NEON_AI_GATEWAY_BASE_URL}/ai-gateway/anthropic, ${NEON_AI_GATEWAY_BASE_URL}/ai-gateway/gemini). These base URLs omit the version segment because, by default, the SDKs append /v1 and /v1beta themselves.
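To show how the native route relates to a raw HTTP call, here is a minimal sketch that builds an Anthropic Messages request against the bare host without any SDK. Several details are assumptions and should be checked against the Neon docs: that the gateway accepts the Neon credential as an Authorization bearer header on this route, and that it expects the standard anthropic-version header. `buildMessagesRequest` is a hypothetical name.

```typescript
// Hypothetical request builder; auth and version headers are assumptions (see lead-in).
interface MessagesRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildMessagesRequest(
  host: string, // NEON_AI_GATEWAY_BASE_URL (bare host)
  token: string, // NEON_AI_GATEWAY_TOKEN
  model: string, // catalog ID, e.g. "claude-sonnet-4-6"
  prompt: string,
): MessagesRequest {
  return {
    // Route (/ai-gateway/anthropic/v1) plus the Messages resource path.
    url: host.replace(/\/+$/, "") + "/ai-gateway/anthropic/v1/messages",
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${token}`, // assumption: bearer auth on the native route
      "anthropic-version": "2023-06-01", // assumption: standard Anthropic API version header
    },
    // The Messages API requires max_tokens; 256 is an arbitrary illustrative cap.
    body: JSON.stringify({
      model,
      max_tokens: 256,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}
```

The result can be passed to fetch as `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. In practice the Anthropic SDK does the same work once its base URL points at the /ai-gateway/anthropic route.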

Model identifiers

Use a model's catalog ID directly in the model field, e.g. claude-sonnet-4-6, gpt-5-mini, gemini-2-5-flash. No provider prefix is needed. To look up the exact identifiers the gateway serves, which underlying model each maps to, and their context windows, pricing, and capabilities, consult the model catalog on models.dev or the official Neon docs.

Availability

The AI Gateway is a preview (early access) feature available only on new projects in the us-east-2 region; it can't be enabled on existing projects. Foundation model access requires a paid Neon plan. Confirm the user's project is a new project in us-east-2. If the user does not yet have access, point them to the private beta sign-up: https://neon.com/blog/were-building-backends#access

Neon Documentation

The Neon documentation is the source of truth and the AI Gateway is evolving rapidly, so always verify against the official docs. Any doc page can be fetched as markdown by appending .md to the URL or by requesting Accept: text/markdown. Find the right page from the docs index (https://neon.com/docs/llms.txt) and the changelog announcements.
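The ".md" convention can be sketched as a small URL helper. The function name `toMarkdownUrl` is hypothetical and the page path in the usage note is a placeholder, not a real docs page.

```typescript
// Hypothetical helper: turn a docs page URL into its markdown variant by appending ".md".
function toMarkdownUrl(docUrl: string): string {
  const [withoutHash] = docUrl.split("#"); // drop any #fragment
  const [path, query] = withoutHash.split("?"); // keep the query string, if any
  const trimmed = path.replace(/\/+$/, ""); // drop trailing slashes before appending
  const markdownPath = trimmed.endsWith(".md") ? trimmed : trimmed + ".md";
  return query ? `${markdownPath}?${query}` : markdownPath;
}
```

For example, `toMarkdownUrl("https://neon.com/docs/some-page/#section")` yields `https://neon.com/docs/some-page.md`. The alternative is to keep the original URL and send the header `Accept: text/markdown` with the request.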

Further reading


Overall Score

87/100 (Grade A, Excellent)

  • Safety: 88
  • Quality: 88
  • Clarity: 87
  • Completeness: 84

Summary

The Neon AI Gateway skill guides users to integrate a unified LLM inference API built into Neon branches, supporting multiple providers (OpenAI, Anthropic, Google, Meta, Alibaba, DeepSeek) through a single credential and endpoint. The skill teaches configuration, SDK integration patterns (Vercel AI SDK, Mastra, plain SDKs), and model routing without requiring agent code generation or filesystem writes — it is purely educational documentation.

Static Analysis Findings

1 finding, detected by deterministic static analysis before AI scoring.

Credential Exposure

  • SEC-020, Direct .env File Access: 3 occurrences in 1 file (SKILL.md)

Detected Capabilities

  • Read environment variables (OPENAI_API_KEY, OPENAI_BASE_URL, NEON_AI_GATEWAY_*)
  • Configuration file modification (neon.ts)
  • External documentation link navigation (neon.com, models.dev)

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

  • call an llm
  • add ai to app
  • model routing
  • neon ai gateway
  • claude gpt gemini
  • one api multiple models
  • llm proxy
  • chat completions

Risk Signals

  • INFO: References to .env file access for reading injected credentials (OPENAI_API_KEY, OPENAI_BASE_URL, NEON_AI_GATEWAY_TOKEN, NEON_AI_GATEWAY_BASE_URL). Location: SKILL.md, Environment variables section.
  • INFO: Instruction to use `neonctl env pull` to write gateway credentials to .env/.env.local. Location: SKILL.md, Environment variables section.
  • INFO: External domain references: models.dev (model catalog), neon.com (official docs), www.apache.org (license only). Location: SKILL.md, throughout and in the Further reading section.

Referenced Domains

External domains referenced in skill content, detected by static analysis.

  • models.dev
  • neon.com
  • www.apache.org

Use Cases

  • Set up AI Gateway in a Neon project
  • Route between multiple LLM providers with one API
  • Integrate with Vercel AI SDK or Mastra
  • Switch LLM models without rewiring code
  • Call Claude, GPT, or Gemini via gateway endpoint
  • Branch-scoped AI request isolation in preview environments
  • Avoid juggling separate provider API keys

Quality Notes

  • Excellent scope definition: clearly states when to use (one credential, model portability, branch isolation) and when not to use (single-provider deep integrations)
  • Comprehensive SDK integration patterns: covers Vercel AI SDK, Mastra, plain OpenAI/Anthropic/google-genai SDKs with concrete code examples
  • Well-structured environment variable documentation: distinguishes between OpenAI-standard vars (OPENAI_API_KEY/OPENAI_BASE_URL) and Neon-branded aliases (NEON_AI_GATEWAY_*), and explains the two different base URL formats (one with /ai-gateway/openai/v1, one bare host)
  • Clear dialect routing guidance: explains /mlflow/v1 (chat-completions default), /openai/v1 (Responses API), /anthropic/v1, and /gemini/v1beta routes with practical URL construction
  • Strong availability and limitations section: explicitly states preview status, us-east-2 region requirement, new projects only, paid plan requirement
  • Authoritative documentation references: directs users to official Neon docs, models.dev canonical catalog, and provides markdown fetch pattern
  • Practical prerequisite: assumes user has Neon account and understands neon.ts config
  • Code examples are copy-paste ready and show both direct SDK usage and multi-provider routing patterns
  • Minor: SEC-020 findings are expected and appropriate (credential reading from injected env vars is the intended behavior, not a leak)
Model: claude-haiku-4-5-20251001 · Analyzed: Jun 12, 2026
