Catalog
google/agent-platform-rag-engine-management

google

agent-platform-rag-engine-management

Manage and query Agent Platform RAG Engine Corpora and retrieve grounded contexts using the Google GenAI SDK. Use when listing RAG corpora or files, inspecting a corpus, retrieving contexts, or generating content grounded in a RAG corpus. Do not use for standard database queries (use SQL/Spanner skills), Google Workspace RAG, or other RAG products like gRAG.

global
0installs0uses~1.7k
v1.0Saved May 27, 2026

Agent Platform RAG Engine Management

This skill provides instructions on how to interact with Agent Platform RAG Engine using the Agent Platform Python SDK. You MUST use the vertexai Python SDK to perform RAG Engine operations, rather than raw REST calls or MCP tools, because this code is intended to be run by external clients.

Phase 0: Environment Setup

CRITICAL: Before running any of the Python snippets below, you MUST ensure the environment is correctly initialized by following these steps:

  1. Google Cloud Authentication: Authenticate with your Google Cloud credentials and configure active Application Default Credentials (ADC) for Agent Platform access:
    gcloud auth login
    gcloud auth application-default login
    
  2. Virtual Environment: Create and activate a dedicated virtual environment:
    python3 -m venv ~/rag_agent_venv
    source ~/rag_agent_venv/bin/activate
    
  3. Install Dependencies: Install the required Agent Platform SDKs:
    pip install google-cloud-aiplatform google-genai
    
  4. Execution: Advise the user that every time they execute a Python snippet, they must ensure this virtual environment is activated first.

Workflow Decision Tree

  1. Information Gathering: Has the user provided the Project ID, Region, and Corpus ID?

    • No -> Proceed to [1. Listing Corpora and Files] to discover the necessary Resource Names and IDs. Only ask the user if discovery fails.
    • Yes -> Proceed.
  2. Task Type: What does the user want to do?

    • List Corpora and Files -> Proceed to [1. Listing Corpora and Files].
    • Inspect a Corpus -> Proceed to [2. Getting / Inspecting a RAG Engine Corpus].
    • Search for Contexts -> Proceed to [3. Retrieving Contexts].
    • Answer questions using RAG Engine -> Proceed to [4. Answering the User with Retrieved Context].

[!TIP] Placeholder Parameter Replacement: The Python scripts below use bracketed string placeholders (like "{project_id}", "{region}", and "{corpus_id}"). You MUST dynamically replace these placeholders with the actual Project ID, Region, and Corpus ID values provided in the user's prompt (or active context) before generating, providing, or executing the scripts.

1. Listing Corpora and Files (Discovery)

If you do not know the Resource Name of the corpus or file, you MUST list them first to discover them. The SDK handles pagination automatically when converted to a list, but you can also use manual pagination for large sets.

1.1 Listing and Discovering Corpora

import vertexai
from vertexai.preview import rag

vertexai.init(project="{project_id}", location="{region}")

# Approach A: List ALL (Automatic Pagination)
# The SDK's Pager iterates through all pages for you.
all_corpora = list(rag.list_corpora())
print(f"Found {len(all_corpora)} corpora in total.")
for c in all_corpora:
    print(f"Corpus Name: {c.name} | Display Name: {c.display_name}")

# Approach B: Manual Pagination (for very large projects)
pager = rag.list_corpora(page_size=10)
# Process first page
for c in pager:
    print(f"Corpus: {c.display_name}")

# Get next page if needed
if pager.next_page_token:
    second_page = rag.list_corpora(
        page_size=10, page_token=pager.next_page_token
    )

1.2 Listing and Discovering Files

To understand what files (and types) are in a corpus, list them and inspect the display_name (usually includes the extension).

import vertexai
from vertexai.preview import rag

vertexai.init(project="{project_id}", location="{region}")
corpus_name = (
    "projects/{project_id}/locations/{region}/ragCorpora/{corpus_id}"
)

# List files with automatic pagination
files = list(rag.list_files(corpus_name=corpus_name))
print(f"Found {len(files)} files.")

for f in files:
    # High-level SDK RagFile objects usually have name, display_name,
    # description
    print(f"File: {f.display_name} | Resource: {f.name}")
    # Tip: Check extension to understand file type (PDF, TXT, etc.)
    if f.display_name.lower().endswith(".pdf"):
        print("  Type: PDF")
    elif f.display_name.lower().endswith(".txt"):
        print("  Type: Plain Text")

2. Getting / Inspecting an Agent Platform RAG Engine Corpus

To retrieve details about an existing Agent Platform RAG Engine corpus:

import vertexai
from vertexai.preview import rag

vertexai.init(project="{project_id}", location="{region}")

# To get details of a specific corpus
corpus_name = (
    "projects/{project_id}/locations/{region}/ragCorpora/{corpus_id}"
)
corpus = rag.get_corpus(name=corpus_name)
print(f"Corpus Name: {corpus.name}")
print(f"Display Name: {corpus.display_name}")

3. Retrieving Contexts

To retrieve relevant contexts from a RAG Engine corpus based on a query:

import vertexai
from vertexai.preview import rag

vertexai.init(project="{project_id}", location="{region}")

corpus_name = (
    "projects/{project_id}/locations/{region}/ragCorpora/{corpus_id}"
)
query = "What is the speed of light?"

# Retrieve contexts
response = rag.retrieval_query(
    rag_corpora=[corpus_name],
    text=query,
    similarity_top_k=3
)

for context in response.contexts.contexts:
    print(f"Context text: {context.text}")
    print(f"Source: {context.source_uri}")

4. Answering the User with Retrieved Context

To use the retrieved context alongside an Agent Platform model to generate a grounded response:

from google import genai
from google.genai import types

client = genai.Client(enterprise=True, project="{project_id}", location="{region}")
corpus_name = (
    "projects/{project_id}/locations/{region}/ragCorpora/{corpus_id}"
)

# Define the Agent Platform RAG Engine tool pointing to the corpus
rag_tool = types.Tool(
    retrieval=types.Retrieval(
        vertex_rag_store=types.VertexRagStore(
            rag_resources=[types.VertexRagStoreRagResource(rag_corpus=corpus_name)],
            rag_retrieval_config=types.RagRetrievalConfig(
                top_k=3,
                filter=types.RagRetrievalConfigFilter(
                    vector_similarity_threshold=0.5,
                ),
            ),
        )
    )
)

# Generate content using the RAG Engine tool
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What is the speed of light?",
    config=types.GenerateContentConfig(
        tools=[rag_tool]
    )
)
print(response.text)
Files1
1 files · 11.1 KB

Select a file to preview

Overall Score

82/100

Grade

B

Good

Safety

85

Quality

82

Clarity

83

Completeness

76

Summary

This skill provides structured guidance for interacting with Google Cloud's Agent Platform RAG Engine using the Python SDK. It covers discovering corpora and files, inspecting corpus details, retrieving contexts via semantic search, and generating grounded responses using retrieved context with the Gemini model.

Detected Capabilities

google cloud authenticationpython sdk executionapi queries to vertex aicontext retrieval from rag corporacontent generation with language models

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

query rag corpusretrieve context groundedlist rag filessemantic search documentsagent platform raggemini with raginspect corpus details

Risk Signals

INFO

Google Cloud Authentication via gcloud CLI

Phase 0: Environment Setup, step 1
INFO

Virtual environment activation required for dependency isolation

Phase 0: Environment Setup, step 2
INFO

Project ID, region, and corpus ID parameters required in scripts

Phase 0 and throughout sections 1-4
INFO

SDK-based API calls (not raw REST) to Vertex AI RAG Engine

Section 1-4, all code blocks
INFO

Client initialization with enterprise flag and project/location context

Section 4: Answering the User with Retrieved Context

Referenced Domains

External domains referenced in skill content, detected by static analysis.

www.apache.org

Use Cases

  • Query RAG corpora to find relevant documents for semantic search
  • List and inspect Agent Platform RAG Engine corpora and their contents
  • Retrieve contextual information from a RAG corpus using natural language queries
  • Generate AI responses grounded in RAG corpus data using Gemini models
  • Discover corpus and file resource names for Agent Platform RAG operations

Quality Notes

  • Excellent decision tree workflow (Workflow Decision Tree section) that guides agents through information gathering and task routing
  • Clear placeholder parameter replacement guidance ensures agents correctly substitute project IDs, regions, and corpus IDs before execution
  • Two pagination approaches provided (automatic and manual) with appropriate guidance on when to use each
  • Environment setup instructions are comprehensive and prioritized as CRITICAL, reducing execution failures
  • SDK-first design (emphasizing vertexai over REST) is well-justified and appropriate for external client execution
  • Code examples are well-commented and show both high-level and low-level API usage patterns
  • File type detection logic in section 1.2 provides practical guidance for handling different document types
  • Explicit guidance on what NOT to use this skill for (SQL/Spanner, Google Workspace RAG, gRAG) prevents scope creep
Model: claude-haiku-4-5-20251001Analyzed: May 27, 2026

Reviews

Add this skill to your library to leave a review.

No reviews yet

Be the first to share your experience.

Add google/agent-platform-rag-engine-management to your library

Command Palette

Search for a command to run...