GKE Cluster Creation
This reference describes how to create GKE clusters. The golden path Autopilot configuration is the default for all new clusters.
MCP Tools:
list_clusters, create_cluster, get_cluster, list_operations, get_operation
Workflow
- Discover context: Use list_clusters to see existing clusters. Use gcloud config get-value project if the project is unknown.
- Gather inputs: project_id, region, cluster_name, environment type
- Select mode: Autopilot (default) vs Standard
- Configure networking: auto-create subnet (default) or bring-your-own
- Review golden path settings: present the config and confirm with the user
- Create: Use the MCP create_cluster tool. Fall back to the gcloud CLI only if MCP is unavailable.
- Track: Use get_operation to monitor creation progress
- Verify: Use get_cluster with readMask="*" to confirm that the golden path settings were applied
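When MCP is unavailable, the discover, track, and verify steps can be approximated with the gcloud CLI. The following is a minimal sketch, assuming gcloud is installed and authenticated. The function names (resolve_project, list_existing_clusters, wait_for_operation, verify_cluster) are hypothetical helpers introduced here for illustration, and the fields shown by verify_cluster are only a sample of what the golden path sets.

```shell
# Hypothetical helpers approximating the workflow steps with gcloud.
# Function names are illustrative; they are not part of any MCP tool.

# Discover: print the active project, or fail so the caller can ask the user.
resolve_project() {
  project="$(gcloud config get-value project 2>/dev/null)"
  if [ -z "$project" ] || [ "$project" = "(unset)" ]; then
    echo "no project configured; ask for project_id" >&2
    return 1
  fi
  printf '%s\n' "$project"
}

# Discover: list existing clusters in project $1.
list_existing_clusters() {
  gcloud container clusters list --project "$1" \
    --format="value(name,location,status)"
}

# Track: block until operation $1 in region $2 of project $3 finishes.
wait_for_operation() {
  gcloud container operations wait "$1" --region "$2" --project "$3"
}

# Verify: show a sample of the fields the golden path is expected to set
# for cluster $1 in region $2 of project $3.
verify_cluster() {
  gcloud container clusters describe "$1" --region "$2" --project "$3" \
    --format="yaml(autopilot,privateClusterConfig,releaseChannel,status)"
}
```

A caller would typically run resolve_project first and prompt the user for project_id only when it returns a non-zero status.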
Mode Selection
| Criteria | Autopilot (Golden Path) | Standard |
|---|---|---|
| Node management | Google-managed | Self-managed |
| Pricing | Pay per pod resource request | Pay per node (VM) |
| Node customization | Via ComputeClasses | Full control |
| DaemonSets | Allowed (with restrictions) | Full control |
| GPU/TPU | Supported via ComputeClasses | Supported via node pools |
| Best for | Most production workloads | Kernel tuning, custom OS, privileged workloads |
Rule: Default to Autopilot unless the customer has a specific requirement that Autopilot cannot satisfy.
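The rule can be expressed as a small decision helper. This is a sketch only: select_mode and the requirement keywords (kernel-tuning, custom-os, privileged-workloads) are hypothetical labels taken from the "Best for" row of the table, not an official taxonomy, and a real decision would still need a conversation with the customer.

```shell
# Hypothetical helper: pick a cluster mode from requirement keywords.
# Any requirement listed under Standard's "Best for" forces Standard;
# everything else falls through to the Autopilot default.
select_mode() {
  for req in "$@"; do
    case "$req" in
      kernel-tuning|custom-os|privileged-workloads)
        echo "standard"
        return 0
        ;;
    esac
  done
  echo "autopilot"
}

# Example calls:
#   select_mode gpu daemonsets
#   select_mode gpu custom-os
```

Requirements such as GPUs or DaemonSets do not trigger Standard here because the table lists both as supported on Autopilot.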
Templates
1. Golden Path Autopilot (Production)
This is the default. All settings match ../gke-golden-path/assets/golden-path-autopilot.yaml.
Via gcloud:
gcloud container clusters create-auto <CLUSTER_NAME> \
--region <REGION> \
--project <PROJECT_ID> \
--release-channel regular \
--enable-private-nodes \
--enable-master-authorized-networks \
--enable-dns-access \
--enable-secret-manager \
--secret-manager-rotation-interval=120s \
--scoped-rbs-bindings \
--monitoring=SYSTEM,API_SERVER,SCHEDULER,CONTROLLER_MANAGER,STORAGE,POD,DEPLOYMENT,STATEFULSET,DAEMONSET,HPA,CADVISOR,KUBELET,DCGM \
--quiet
Via MCP (create_cluster):
{
"parent": "projects/<PROJECT_ID>/locations/<REGION>",
"cluster": {
"name": "<CLUSTER_NAME>",
"autopilot": { "enabled": true },
"privateClusterConfig": { "enablePrivateNodes": true },
"masterAuthorizedNetworksConfig": {
"privateEndpointEnforcementEnabled": true
},
"releaseChannel": { "channel": "REGULAR" },
"secretManagerConfig": {
"enabled": true,
"rotationConfig": { "enabled": true, "rotationInterval": "120s" }
},
"rbacBindingConfig": {
"enableInsecureBindingSystemAuthenticated": false,
"enableInsecureBindingSystemUnauthenticated": false
}
}
}
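To fill the placeholders in the request shown above, one option is to render it from shell variables. The sketch below assumes a hypothetical helper named build_create_cluster_request. It copies the fields from the request above without adding any, and it does not escape special characters in its arguments.

```shell
# Hypothetical helper: render the create_cluster request body.
# $1 = project_id, $2 = region, $3 = cluster_name (short name only)
build_create_cluster_request() {
  cat <<EOF
{
  "parent": "projects/$1/locations/$2",
  "cluster": {
    "name": "$3",
    "autopilot": { "enabled": true },
    "privateClusterConfig": { "enablePrivateNodes": true },
    "masterAuthorizedNetworksConfig": {
      "privateEndpointEnforcementEnabled": true
    },
    "releaseChannel": { "channel": "REGULAR" },
    "secretManagerConfig": {
      "enabled": true,
      "rotationConfig": { "enabled": true, "rotationInterval": "120s" }
    },
    "rbacBindingConfig": {
      "enableInsecureBindingSystemAuthenticated": false,
      "enableInsecureBindingSystemUnauthenticated": false
    }
  }
}
EOF
}

# Example: build_create_cluster_request my-project us-central1 my-cluster
```

Rendering the body in one place keeps the parent path and the short cluster name consistent with the inputs gathered earlier in the workflow.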
2. Autopilot Dev/Test
Relaxes some golden path defaults for cost savings and easier access in non-production.
gcloud container clusters create-auto <CLUSTER_NAME> \
--region <REGION> \
--project <PROJECT_ID> \
--release-channel rapid \
--quiet
Warning: This does not apply golden path security hardening. Suitable for dev/test only.
3. Standard Regional (When Autopilot is Not an Option)
gcloud container clusters create <CLUSTER_NAME> \
--region <REGION> \
--project <PROJECT_ID> \
--num-nodes 3 \
--machine-type e2-standard-4 \
--disk-type pd-balanced \
--enable-autoscaling --min-nodes 1 --max-nodes 10 \
--enable-shielded-nodes --enable-secure-boot \
--workload-pool=<PROJECT_ID>.svc.id.goog \
--enable-private-nodes \
--enable-master-authorized-networks \
--enable-vertical-pod-autoscaling \
--enable-dataplane-v2 \
--release-channel regular \
--quiet
4. GPU/AI Workloads (Autopilot with ComputeClass)
Create a golden path Autopilot cluster, then apply a ComputeClass for GPU workloads:
# 1. Create golden path cluster (abbreviated; see template 1 for the full flag set)
gcloud container clusters create-auto <CLUSTER_NAME> \
--region <REGION> --project <PROJECT_ID> \
--enable-private-nodes --enable-master-authorized-networks \
--enable-dns-access --enable-secret-manager --scoped-rbs-bindings \
--quiet
# 2. Apply GPU ComputeClass (see gke-compute-classes.md)
kubectl apply -f gpu-compute-class.yaml
# 3. Or use GIQ for inference (see gke-inference.md)
gcloud container ai profiles manifests create \
--model=gemma-2-9b-it --model-server=vllm --accelerator-type=nvidia-l4 --quiet > inference.yaml
kubectl apply -f inference.yaml
Instructions
- ALWAYS ask for project_id if not in context
- ALWAYS ask for region
- ALWAYS ask for a unique cluster_name
- DEFAULT to golden path Autopilot unless the customer specifies otherwise
- WARN about Day-0 decisions (networking, private nodes) that are hard to change later
- WARN about cost for GPU or multi-region clusters
- When using MCP create_cluster, the cluster.name should be the short name (e.g., my-cluster), not the full resource path
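The input checks above can be sketched as two small shell helpers. Both names (short_cluster_name, require_inputs) are hypothetical and exist only for illustration. The sketch checks only that the inputs are present, not that the cluster name is unique or that the region is valid.

```shell
# Hypothetical helpers for the input checks listed above.

# Reduce a full resource path (projects/P/locations/R/clusters/NAME)
# to the short name; a name without slashes is returned unchanged.
short_cluster_name() {
  printf '%s\n' "${1##*/}"
}

# Fail with a message naming the first missing required input.
# $1 = project_id, $2 = region, $3 = cluster_name
require_inputs() {
  [ -n "$1" ] || { echo "missing project_id" >&2; return 1; }
  [ -n "$2" ] || { echo "missing region" >&2; return 1; }
  [ -n "$3" ] || { echo "missing cluster_name" >&2; return 1; }
}

# Example:
#   name="$(short_cluster_name "projects/p/locations/us-central1/clusters/my-cluster")"
#   require_inputs "my-project" "us-central1" "$name"
```

Normalizing the name before building a create_cluster request avoids passing a full resource path in cluster.name.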