Gemini Interactions API Skill

This skill provides instructions for authenticating, connecting to, and utilizing the stateful, server-managed Gemini Interactions API on Gemini Enterprise Agent Platform.

The Interactions API is the modern, recommended way to execute Generative AI agent conversations, background research tasks, multi-turn chats, and structured, multi-step workflows.

[!IMPORTANT] CRITICAL: Unified SDK & Latest Models

Unified SDK: Use the Google Gen AI SDK (google-genai >= 2.0.0 for Python, @google/genai >= 2.0.0 for JS/TS). Legacy SDKs like google-cloud-aiplatform, @google-cloud/vertexai, and google-generativeai are strictly unsupported for Interactions.

Latest Models Only: Use gemini-3.1-pro-preview, gemini-3.1-flash-lite, gemini-3-flash-preview, gemini-2.5-pro, or gemini-2.5-flash. Check the model versions documentation for newer releases. Legacy models (gemini-2.0-*, gemini-1.5-*) are deprecated and do not support interactions.

Turn-Scoped Parameters: tools, system_instruction, and generation_config are turn-scoped. They apply only to the request that carries them, so they MUST be passed with each interaction request.
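Because these parameters are turn-scoped, one way to avoid dropping them is to keep them in a single dictionary and merge it into every request. The sketch below is only an illustration: `SHARED` and `turn_kwargs` are hypothetical names, and the configuration values are placeholders, not recommended settings.

```python
from typing import Any

# Hypothetical shared settings. The keys mirror the turn-scoped parameters
# named above; the values are placeholders chosen for illustration.
SHARED: dict[str, Any] = {
    "system_instruction": "You are a concise assistant.",
    "generation_config": {"temperature": 0.2},
}


def turn_kwargs(model: str, text: str, **overrides: Any) -> dict[str, Any]:
    """Build keyword arguments for one request, re-applying shared settings."""
    kwargs: dict[str, Any] = {"model": model, "input": text, **SHARED}
    kwargs.update(overrides)  # per-turn values win over the shared defaults
    return kwargs


first = turn_kwargs("gemini-3-flash-preview", "Hello")
second = turn_kwargs(
    "gemini-3-flash-preview", "And again", previous_interaction_id="abc"
)
```

Each dictionary could then be unpacked into a request, for example `client.interactions.create(**first)`.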

1. Authentication

Before running any code, ensure you are authenticated with Application Default Credentials (ADC) and have the necessary API enabled.

Log in:
```
gcloud auth application-default login
```

Enable the API (if not already enabled):

gcloud services enable aiplatform.googleapis.com

2. Client Initialization

You can initialize the client using environment variables (recommended) or by passing explicit configuration parameters.

Option A: Environment Variables (Recommended)

Configure environment variables to let the SDK automatically resolve settings:

export GOOGLE_GENAI_USE_ENTERPRISE=true
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="global"

Python

from google import genai

# The SDK automatically picks up the environment variables
client = genai.Client()

TypeScript/JavaScript

import { GoogleGenAI } from "@google/genai";

// The SDK automatically picks up the environment variables
const ai = new GoogleGenAI();
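Since Option A relies entirely on the environment, it can help to check the three variables before constructing the client. The following is a minimal stdlib-only sketch; `REQUIRED_VARS` and `missing_env_vars` are hypothetical helper names, not part of the SDK.

```python
import os
from collections.abc import Mapping

# The three variables exported in Option A.
REQUIRED_VARS = (
    "GOOGLE_GENAI_USE_ENTERPRISE",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Failing early with the list of missing names is usually easier to debug than an authentication or routing error raised later by the first request.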

Option B: Explicit Inline Parameters

Alternatively, pass configuration values directly inside your code:

Python

from google import genai
import google.auth

_, project_id = google.auth.default()
client = genai.Client(enterprise=True, project=project_id, location="global")

TypeScript/JavaScript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({
    enterprise: {
        project: "your-project-id",
        location: "global"
    }
});

3. Core Interactions API Usage

Quick Start (Single-Turn)

Submit a single prompt and read the final text response. Output content is returned in the steps list; the last step holds the model's final response.

Python

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Explain serverless computing in one sentence."
)
# Output text is located under steps
print(interaction.steps[-1].content[0].text)

TypeScript/JavaScript

const interaction = await ai.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Explain serverless computing in one sentence."
});
console.log(interaction.steps[interaction.steps.length - 1].content[0].text);

Stateful Conversation (Multi-Turn)

Interactions are stateful by default: the service stores the conversation state, and the next turn references it with previous_interaction_id.

Python

# Turn 1: Introduce ourselves
turn1 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Hi! My name is John. I am working on AI agents.",
    store=True
)
print(f"Turn 1: {turn1.steps[-1].content[0].text}")

# Turn 2: Refer back to the stored turn state
turn2 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What is my name?",
    previous_interaction_id=turn1.id
)
print(f"Turn 2: {turn2.steps[-1].content[0].text}")

TypeScript/JavaScript

// Turn 1
const turn1 = await ai.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Hi! My name is John. I am working on AI agents.",
    store: true
});

// Turn 2
const turn2 = await ai.interactions.create({
    model: "gemini-3-flash-preview",
    input: "What is my name?",
    previousInteractionId: turn1.id
});
console.log(turn2.steps[turn2.steps.length - 1].content[0].text);
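For longer conversations, the ID chaining shown above can be wrapped in a small helper that remembers the latest interaction ID. This is a sketch under assumptions: `Conversation` is a hypothetical class, and `fake_create` is a stand-in for `client.interactions.create` so the example runs offline.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


class Conversation:
    """Tracks the latest interaction ID so each turn links to the previous one."""

    def __init__(self, create: Callable[..., Any], model: str) -> None:
        self._create = create  # in real use: client.interactions.create
        self._model = model
        self.last_id: Optional[str] = None

    def send(self, text: str) -> Any:
        kwargs: dict[str, Any] = {"model": self._model, "input": text, "store": True}
        if self.last_id is not None:
            kwargs["previous_interaction_id"] = self.last_id
        interaction = self._create(**kwargs)
        self.last_id = interaction.id  # the next turn will reference this one
        return interaction


# Stand-in for the SDK call: records its arguments and returns an object
# that only has an `id` attribute.
calls: list[dict[str, Any]] = []


def fake_create(**kwargs: Any) -> SimpleNamespace:
    calls.append(kwargs)
    return SimpleNamespace(id=f"id-{len(calls)}")


chat = Conversation(fake_create, "gemini-3-flash-preview")
chat.send("Hi! My name is John.")
chat.send("What is my name?")
```

In real use the first line would become `Conversation(client.interactions.create, "gemini-3-flash-preview")`; any turn-scoped parameters would still need to be added to each call.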

Real-Time Streaming

Stream responses in real time. Passing stream=True (stream: true in TypeScript) returns an iterator of chunks rather than a single completed interaction.

Python

response = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Write a short poem about debugging.",
    stream=True
)

for chunk in response:
    if chunk.steps:
        step = chunk.steps[-1]
        if step.content and step.content[0].text:
            print(step.content[0].text, end="", flush=True)
print()

TypeScript/JavaScript

const responseStream = await ai.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Write a short poem about debugging.",
    stream: true
});

for await (const chunk of responseStream) {
    if (chunk.steps) {
        const step = chunk.steps[chunk.steps.length - 1];
        if (step.content && step.content[0].text) {
            process.stdout.write(step.content[0].text);
        }
    }
}
console.log();

Structured Output (Pydantic / Polymorphic `response_format`)

Retrieve structured, type-safe JSON matching a schema. The response_format argument is polymorphic: it accepts the target schema directly, such as a Pydantic model class in Python or a schema object in TypeScript.

Python

from pydantic import BaseModel, Field

class Book(BaseModel):
    title: str = Field(description="The title of the book")
    author: str = Field(description="The book's author")
    year_published: int

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Recommend one famous sci-fi book.",
    response_format=Book
)

# The text is valid JSON matching the Book schema
print(interaction.steps[-1].content[0].text)

TypeScript/JavaScript

import { Type } from "@google/genai";

const BookSchema = {
    type: Type.OBJECT,
    properties: {
        title: { type: Type.STRING, description: "The title of the book" },
        author: { type: Type.STRING, description: "The book's author" },
        yearPublished: { type: Type.INTEGER }
    },
    required: ["title", "author", "yearPublished"]
};

const interaction = await ai.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Recommend one famous sci-fi book.",
    responseFormat: BookSchema
});

console.log(interaction.steps[interaction.steps.length - 1].content[0].text);
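The examples above print the JSON text as-is; in practice it is usually parsed back into a typed object. The sketch below uses only the standard library so it runs without extra dependencies. `parse_book` is a hypothetical helper, the dataclass mirrors the Python field names, and the sample text is written by hand rather than taken from a real response.

```python
import json
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    author: str
    year_published: int


def parse_book(text: str) -> Book:
    """Parse JSON text into a Book, raising ValueError on a bad shape."""
    data = json.loads(text)  # JSONDecodeError is a subclass of ValueError
    try:
        book = Book(
            title=data["title"],
            author=data["author"],
            year_published=data["year_published"],
        )
    except (KeyError, TypeError) as exc:
        raise ValueError(f"response does not match the Book schema: {exc}") from exc
    if not isinstance(book.year_published, int):
        raise ValueError("year_published must be an integer")
    return book


# Hand-written sample standing in for interaction.steps[-1].content[0].text.
sample = '{"title": "Dune", "author": "Frank Herbert", "year_published": 1965}'
book = parse_book(sample)
print(book.title, book.year_published)
```

With Pydantic v2, `Book.model_validate_json(text)` on the Pydantic model defined earlier performs the equivalent validation in one call.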

Function Calling (Agent Tool Use)

Define local tools (functions) and submit execution results to the stateful interaction history.

Python

def get_stock_price(ticker: str) -> float:
    """Gets the stock price for a given ticker symbol."""
    if ticker.upper() == "GOOG":
        return 175.50
    return 100.0

# Turn 1: Pass tools to the model
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What is the stock price of GOOG?",
    tools=[get_stock_price]
)

last_step = interaction.steps[-1]
# Check if the model requested a function call
if last_step.tool_calls:
    for call in last_step.tool_calls:
        if call.name == "get_stock_price":
            ticker_arg = call.args.get("ticker")
            price = get_stock_price(ticker_arg)

            # Turn 2: Submit function execution result statefully
            final_turn = client.interactions.create(
                model="gemini-3-flash-preview",
                input=f"The stock price for {ticker_arg} is ${price}.",
                previous_interaction_id=interaction.id
            )
            print(final_turn.steps[-1].content[0].text)

TypeScript/JavaScript

import { Type } from "@google/genai";

// Define local tool
function getStockPrice({ ticker }: { ticker: string }): number {
    if (ticker.toUpperCase() === "GOOG") {
        return 175.50;
    }
    return 100.00;
}

// Turn 1: Pass tools to the model
const interaction = await ai.interactions.create({
    model: "gemini-3-flash-preview",
    input: "What is the stock price of GOOG?",
    tools: [{
        functionDeclarations: [{
            name: "getStockPrice",
            description: "Gets the stock price for a given ticker symbol.",
            parameters: {
                type: Type.OBJECT,
                properties: {
                    ticker: { type: Type.STRING, description: "The stock ticker symbol" }
                },
                required: ["ticker"]
            }
        }]
    }]
});

const lastStep = interaction.steps[interaction.steps.length - 1];
// Check if the model requested a function call
if (lastStep.toolCalls) {
    for (const call of lastStep.toolCalls) {
        if (call.name === "getStockPrice") {
            const tickerArg = call.args.ticker as string;
            const price = getStockPrice({ ticker: tickerArg });

            // Turn 2: Submit function execution result statefully
            const finalTurn = await ai.interactions.create({
                model: "gemini-3-flash-preview",
                input: `The stock price for ${tickerArg} is $${price}.`,
                previousInteractionId: interaction.id
            });
            console.log(finalTurn.steps[finalTurn.steps.length - 1].content[0].text);
        }
    }
}
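Both examples match the tool name with an `if` statement, which becomes unwieldy once several tools are registered. One possible alternative is a name-to-function registry, sketched below. `TOOLS` and `run_tool` are hypothetical names, and the call is represented by a plain name and argument dictionary, assuming the shape `call.name` / `call.args` used above.

```python
from typing import Any, Callable


def get_stock_price(ticker: str) -> float:
    """Same stub as the Python example: a fixed price per ticker."""
    return 175.50 if ticker.upper() == "GOOG" else 100.0


# Registry mapping tool names to local callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_stock_price": get_stock_price}


def run_tool(name: str, args: dict[str, Any]) -> Any:
    """Look up a requested tool by name and call it with the model's arguments."""
    try:
        func = TOOLS[name]
    except KeyError:
        raise ValueError(f"model requested unknown tool: {name!r}") from None
    return func(**args)


print(run_tool("get_stock_price", {"ticker": "goog"}))
```

Rejecting unknown names explicitly keeps a malformed or unexpected tool request from being silently ignored.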

4. Accessing the Interactions API via REST

For shell-based scripts, debugging, or non-Python/JS environments, you can communicate with the stateful Interactions API directly using raw HTTP/REST requests via curl.

1. REST Endpoint

The REST API endpoint for interactions is:

POST https://aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/interactions

LOCATION: Use global (or custom region if required).
PROJECT_ID: Your Google Cloud Project ID.

2. Set up Variables & Authentication Header

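Set your project ID, your target agent ID (for example, a model or a custom agent path), and an access token generated from Application Default Credentials. The curl examples below read all three variables:

PROJECT_ID="your-project-id"
AGENT_ID="your-agent-id"
ACCESS_TOKEN=$(gcloud auth application-default print-access-token)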
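For scripts that are not written in shell, the same URL and headers can be assembled programmatically. This is a minimal sketch that only fills in the endpoint template from step 1; `interactions_url` and `auth_headers` are hypothetical helper names, and the token value is a placeholder.

```python
def interactions_url(project_id: str, location: str = "global") -> str:
    """Fill in the endpoint template from step 1."""
    return (
        "https://aiplatform.googleapis.com/v1beta1"
        f"/projects/{project_id}/locations/{location}/interactions"
    )


def auth_headers(access_token: str) -> dict[str, str]:
    """Headers used by the curl examples: bearer token plus JSON content type."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }


url = interactions_url("your-project-id")
headers = auth_headers("placeholder-token")  # not a real token
print(url)
```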

3. Single-Turn Interaction Payload

Send a request to start an interaction using the agent variable:

curl -X POST "https://aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/global/interactions" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "'"${AGENT_ID}"'",
    "input": [{
      "role": "user",
      "content": [{
        "type": "text",
        "text": "Explain serverless computing in one sentence."
      }]
    }]
  }'

Response Example

A synchronous POST request returns a JSON object containing the conversation step details and unique identifiers:

{
  "id": "your-interaction-id",
  "status": "completed",
  "steps": [
    {
      "role": "model",
      "content": [
        {
          "type": "text",
          "text": "Serverless computing is a cloud execution model where the cloud provider dynamically manages the allocation and provisioning of servers, charging customers based on actual usage rather than pre-purchased capacity."
        }
      ]
    }
  ],
  "usage": {
    "total_tokens": 24751,
    "total_input_tokens": 23894,
    "total_output_tokens": 857
  },
  "created": "2026-05-08T10:44:43Z",
  "updated": "2026-05-08T10:44:43Z",
  "environment_id": "your-environment-id",
  "object": "interaction"
}
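A response in this shape can be reduced to its final text with a few lines of parsing. The sketch below works on a trimmed copy of the example above (the text is shortened); `final_text` is a hypothetical helper that tolerates a missing or empty steps list.

```python
import json

# Trimmed copy of the response example above, with the text shortened.
raw = """
{
  "id": "your-interaction-id",
  "status": "completed",
  "steps": [
    {"role": "model",
     "content": [{"type": "text", "text": "Serverless computing is ..."}]}
  ],
  "usage": {
    "total_tokens": 24751,
    "total_input_tokens": 23894,
    "total_output_tokens": 857
  }
}
"""


def final_text(response: dict) -> str:
    """Join the text parts of the last step; return an empty string if none."""
    steps = response.get("steps") or []
    if not steps:
        return ""
    parts = steps[-1].get("content") or []
    return "".join(p.get("text", "") for p in parts if p.get("type") == "text")


response = json.loads(raw)
print(response["status"], final_text(response))
```

In the example values, total_input_tokens and total_output_tokens add up to total_tokens (23894 + 857 = 24751).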

4. Multi-Turn Stateful Interaction Payload

To continue an existing conversation statefully, specify the previous_interaction_id in the JSON payload:

curl -X POST "https://aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/global/interactions" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "'"${AGENT_ID}"'",
    "store": true,
    "previous_interaction_id": "YOUR_PREVIOUS_INTERACTION_ID",
    "input": [{
      "role": "user",
      "content": [{
        "type": "text",
        "text": "Can you elaborate on that?"
      }]
    }]
  }'
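Building the request body in code avoids the nested shell quoting that the curl examples need. The sketch below assembles the same JSON shape as the payloads above; `build_payload` is a hypothetical helper, and it includes only the fields that appear in this section.

```python
import json
from typing import Any, Optional


def build_payload(
    agent: str,
    text: str,
    previous_interaction_id: Optional[str] = None,
    store: bool = False,
    stream: bool = False,
) -> dict[str, Any]:
    """Assemble a request body in the shape used by the curl examples."""
    payload: dict[str, Any] = {
        "agent": agent,
        "input": [{"role": "user", "content": [{"type": "text", "text": text}]}],
    }
    # Optional fields are added only when requested, matching the examples.
    if previous_interaction_id is not None:
        payload["previous_interaction_id"] = previous_interaction_id
    if store:
        payload["store"] = True
    if stream:
        payload["stream"] = True
    return payload


body = build_payload(
    "your-agent-id",
    "Can you elaborate on that?",
    previous_interaction_id="YOUR_PREVIOUS_INTERACTION_ID",
    store=True,
)
print(json.dumps(body, indent=2))
```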

5. Streaming Output Payload

To stream updates in real time (Server-Sent Events format), pass "stream": true in the payload:

curl -X POST "https://aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/global/interactions" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "'"${AGENT_ID}"'",
    "stream": true,
    "input": [{
      "role": "user",
      "content": [{
        "type": "text",
        "text": "Write a long story about space travel."
      }]
    }]
  }'

The endpoint will return a chunked stream where each event begins with data: containing JSON updates with the event_type and step contents.

How curl handles streaming: When "stream": true is passed, the server responds with Transfer-Encoding: chunked and Content-Type: text/event-stream (Server-Sent Events). curl keeps the connection open and writes incoming chunks to stdout as the server pushes them, so there is no need to poll; events stream until the interaction completes. If output appears in delayed bursts, add curl's -N (--no-buffer) option to disable output buffering.
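When the stream is consumed by a program rather than read in a terminal, each data: line has to be decoded. The following sketch assumes one complete JSON object per data: line, which is a simplification of the Server-Sent Events format. `iter_sse_events` is a hypothetical helper, and the sample lines, including their event_type values and the top-level text field, are invented for illustration rather than copied from a real response.

```python
import json
from collections.abc import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of every `data:` line in an SSE stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Hypothetical stream lines; the field names and values are made up.
sample = [
    'data: {"event_type": "example.delta", "text": "Once upon "}',
    "",
    'data: {"event_type": "example.delta", "text": "a time"}',
    "",
]
events = list(iter_sse_events(sample))
print("".join(e["text"] for e in events))
```

In real use, `lines` would be the decoded lines of the HTTP response body; how the text is nested inside each event depends on the actual event schema.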


Overall Score: 82/100
Grade: B (Good)
Safety: 80
Quality: 85
Clarity: 88
Completeness: 78

Summary

This skill provides comprehensive guidance for using the Gemini Interactions API on Google Cloud's Enterprise Agent Platform. It covers authentication, client initialization, core API usage patterns (single-turn, multi-turn stateful conversations, streaming, structured output, and function calling), and REST-based interactions for shell environments. The skill includes practical code examples in Python and TypeScript/JavaScript.

Detected Capabilities

API authentication (gcloud ADC)
Google Cloud API calls (aiplatform.googleapis.com)
Environment variable configuration
SDK client initialization (Python, TypeScript)
REST/HTTP requests via cURL
Bearer token authentication
Streaming response handling

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

gemini interactions api
multi-turn conversation
stateful agent
function calling
structured output
gemini streaming
agent platform

Risk Signals

INFO: Bearer token in curl requests (gcloud auth print-access-token). Section 4.2 and curl examples.
INFO: Environment variables for project ID and location. Section 2.1.
INFO: Network requests to aiplatform.googleapis.com. Section 4.1 REST endpoint.

Referenced Domains

External domains referenced in skill content, detected by static analysis.

aiplatform.googleapis.com
docs.cloud.google.com
www.apache.org

Use Cases

Build multi-turn AI agent conversations with state management
Execute structured AI workflows with function calling and tool use
Stream real-time responses from Gemini models
Generate type-safe JSON responses using Pydantic schemas
Debug and test Gemini API interactions via REST/cURL
Implement background AI research tasks on the Agent Platform

Quality Notes

Excellent: Clear separation of client initialization methods (environment variables vs. explicit parameters)
Excellent: Comprehensive code examples in both Python and TypeScript with consistent patterns
Excellent: Well-structured sections covering authentication, initialization, single-turn, multi-turn, streaming, structured output, and function calling
Excellent: Clear callouts for SDK version requirements and model selection constraints
Excellent: REST API section provides complete cURL examples for developers without SDK support
Good: Error handling patterns demonstrated implicitly through tool_calls checks (Python and TypeScript)
Minor: No explicit error handling guidance for common failure modes (auth failures, rate limiting, token expiration)
Minor: No deprecation timeline or migration path from legacy SDKs mentioned (only that they're unsupported)

Model: claude-haiku-4-5-20251001
Analyzed: May 19, 2026
