google/gke-cost

Optimizes GKE costs, rightsizes workloads, and configures Spot VMs and CUDs. Use when optimizing GKE costs, rightsizing GKE workloads, or configuring GKE Spot VMs. Don't use for general compute class provisioning or GPU selection (use gke-compute-classes instead).

v1.0, saved Jun 24, 2026

GKE Cost Optimization

This reference covers strategies for reducing GKE costs while maintaining the golden path security and reliability posture.

MCP Tools: get_k8s_resource, describe_k8s_resource, apply_k8s_manifest, patch_k8s_resource, get_cluster

Golden Path Cost Features

The golden path already includes cost-optimizing settings:

| Setting | Value | Impact |
| ---------------------- | -------------------- | ----------------------------------------------- |
| autoscalingProfile | OPTIMIZE_UTILIZATION | Aggressive node scale-down reduces idle compute |
| verticalPodAutoscaling | enabled | VPA recommendations prevent over-provisioning |
| Autopilot pricing | Pay per pod request | No charge for unused node capacity |
| Node Auto Provisioning | enabled | Right-sized node pools created automatically |

Cost Optimization Strategies

1. Spot VMs via ComputeClasses

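Use Spot VMs for fault-tolerant workloads (roughly 60-90% cheaper than on-demand VMs). The ComputeClass below tries Spot capacity first and falls back to on-demand capacity when Spot is unavailable.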

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: spot-with-fallback
spec:
  activeMigration:
    optimizeRulePriority: true
  priorities:
  - machineFamily: n4
    spot: true
  - machineFamily: n4
    spot: false

Spot-suitable workloads:

| Workload | Spot-Suitable? |
| --------------------------------- | --------------- |
| Batch / data processing | Yes |
| Dev / test environments | Yes |
| Stateless web/API (replicas >= 2) | Yes (with PDBs) |
| Jobs with checkpointing | Yes |
| Stateful workloads (databases) | No |
| Single-replica critical services | No |
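
The "Yes (with PDBs)" row could be backed by a PodDisruptionBudget along these lines. This is a minimal sketch, not a definitive configuration: the PDB name and the app label are assumptions and must match the labels on your own Pods. A PDB limits voluntary disruptions such as node drains and upgrades; it does not prevent a Spot preemption, so running at least 2 replicas still matters.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: stateless-spot-app-pdb    # hypothetical name
spec:
  minAvailable: 1                 # keep at least one replica up during voluntary disruptions
  selector:
    matchLabels:
      app: stateless-spot-app     # assumed Pod label; adjust to your Deployment
```

With replicas: 2 and minAvailable: 1, a node drain can evict only one replica at a time.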

Handling eviction:

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 25  # Must be < 30s to allow graceful shutdown before preemption
      containers:
      - name: app
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 5"]

1.1 Direct Workload Spot Selection (NodeSelector)

For stateless or batch workloads in GKE Autopilot, you do not need to create a custom ComputeClass to use Spot VMs. You can target the built-in Spot capacity directly by adding a nodeSelector to your Pod or Deployment spec.

Warning: Spot VMs are interruptible and can be preempted at any time with a 30-second notice. Workloads must be fault-tolerant and run with at least 2 replicas for high availability. Always explicitly warn users about this preemption risk when recommending Spot VMs.

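apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateless-spot-app
spec:
  replicas: 2
  selector:                        # required for apps/v1 Deployments
    matchLabels:
      app: stateless-spot-app
  template:
    metadata:
      labels:
        app: stateless-spot-app    # must match spec.selector
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"  # documented Spot Pod selector in Autopilot
      terminationGracePeriodSeconds: 25  # Must be < 30s to allow graceful shutdown before preemption
      containers:
      - name: app
        image: <IMAGE>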

2. Pod Rightsizing

Use VPA recommendations to reduce over-provisioned requests.

# 1. Deploy VPA in recommendation mode
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: <DEPLOYMENT>-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <DEPLOYMENT>
  updatePolicy:
    updateMode: "Off"
EOF

# 2. Wait 24+ hours for data collection

# 3. Read recommendations
kubectl get vpa <DEPLOYMENT>-vpa -o jsonpath='{.status.recommendation}'

Optimization rules:

| Condition | Action | Savings |
| ----------------------------- | ---------------------------------- | ------- |
| CPU request >5x P95 actual | Reduce to P95 * 1.2 | High |
| Memory request >3x P95 actual | Reduce to P95 * 1.2 | High |
| CPU request >2x P95 actual | Reduce to P95 * 1.2 | Medium |
| No resource requests set | Add requests (enables bin-packing) | Medium |
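
As a worked illustration of the rules above, consider a hypothetical container whose observed P95 usage is 100m CPU and 200Mi memory while its requests are 600m CPU and 800Mi memory. All of these numbers are assumptions chosen for the arithmetic, not measurements. The CPU request is 6x P95 and the memory request is 4x P95, so both match a "High" row and would be reduced to P95 * 1.2:

```yaml
# Fragment of a container spec after rightsizing (hypothetical values).
# Before: cpu 600m (6x P95 of 100m), memory 800Mi (4x P95 of 200Mi)
resources:
  requests:
    cpu: 120m       # 100m * 1.2
    memory: 240Mi   # 200Mi * 1.2
```

The 20% headroom over P95 leaves room for short spikes; latency-sensitive services may warrant a larger margin.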

3. Machine Type Selection

| Family | Use Case | Relative Cost |
| ------------- | ---------------------------------------------------- | ------------- |
| e2 | General purpose, burstable | Lowest |
| t2a / t2d | Scale-out (Arm/AMD), price-performance optimized | Low |
| n4a | Axion Arm-based, general-purpose price-performance | Low |
| n4 / n4d | General purpose (Intel/AMD), flexible shapes | Low-Medium |
| c4a | Compute-optimized (Arm), high efficiency | Medium-High |
| c3 / c4 | Compute-optimized (Intel) | Medium-High |
| c3d / c4d | Compute-optimized (AMD), high-performance throughput | Medium-High |
| ek-standard | Autopilot enhanced (golden path) | Medium |
| m3 / x4 | Memory-optimized, SAP HANA, large databases | High |
| g2 (L4 GPU) | AI inference | High |
| a3 (H100 GPU) | AI training | Highest |
| a4 / a4x | Ultra-scale AI (Blackwell GPUs) | Highest |

In Autopilot, machine type is managed. Use ComputeClasses to influence selection.
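
One way a workload might opt into the spot-with-fallback ComputeClass from section 1 is a nodeSelector on the cloud.google.com/compute-class label. The sketch below assumes that ComputeClass already exists in the cluster; the Deployment name and app label are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker               # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        cloud.google.com/compute-class: spot-with-fallback  # ComputeClass name from section 1
      containers:
      - name: worker
        image: <IMAGE>
```

Pods scheduled this way follow the ComputeClass priority list, so they land on n4 Spot capacity when it is available and on on-demand n4 otherwise.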

4. Committed Use Discounts (CUDs)

For steady-state workloads, purchase 1-year or 3-year CUDs:

  • 1-year: ~20-30% discount
  • 3-year: ~50-55% discount
  • Applied automatically to matching usage in the region
  • Purchase via Google Cloud Console > Billing > Committed use discounts

5. Cluster Management

  • Tear down idle dev clusters: An idle dev cluster costs money even with no workloads (control plane fee), so delete clusters that are not in use and recreate them when needed.
  • Right-size node pools (Standard): Use Cluster Autoscaler with appropriate min/max.
  • Multi-tenant clusters: Share a single cluster across teams instead of per-team clusters (see the gke-multitenancy skill).

Cost Monitoring

# List budgets on the billing account (requires the Cloud Billing Budget API).
# This shows configured budgets, not a per-cluster cost breakdown.
gcloud billing budgets list --billing-account=<BILLING_ACCOUNT> --quiet

# View node utilization
kubectl top nodes

# View per-container resource usage (compare against the requests in each pod spec)
kubectl top pods --all-namespaces --containers

Dev/Test Cost Savings

For non-production environments, these golden path deviations are acceptable:

| Setting | Production (Golden Path) | Dev/Test |
| ----------------------- | ------------------------ | ----------------------------------- |
| Cluster mode | Autopilot | Autopilot (cheaper with fewer pods) |
| Release channel | Regular | Rapid (get fixes faster) |
| Private nodes | Required | Optional (simpler access) |
| Monitoring components | Full suite | SYSTEM_COMPONENTS only |
| Secret Manager rotation | 120s | Disabled |
| Maintenance windows | Configured | Not needed |


Overall Score: 82/100 (Grade B, Good)

  • Safety: 85
  • Quality: 82
  • Clarity: 85
  • Completeness: 78

Summary

This skill provides GKE cost optimization guidance, covering Spot VM configuration, pod rightsizing via VPA, machine type selection, and commitment discounts. It teaches agents to read cluster metrics, analyze utilization patterns, and recommend cost-reduction strategies within the golden path security model.

Detected Capabilities

  • kubernetes resource querying
  • kubernetes manifest analysis
  • kubernetes metrics reading
  • gcloud billing command execution
  • cost analysis and recommendation
  • cluster utilization analysis

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

  • optimize gke costs
  • spot vms
  • pod rightsizing
  • committed use discounts
  • machine type selection
  • gke cost analysis
  • cluster utilization
  • vpa recommendations

Risk Signals

  • INFO: Use of gcloud billing commands (gcloud billing budgets list). Found in: Cost Monitoring section
  • INFO: Kubernetes resource read operations (kubectl get, kubectl top). Found in: Cost Monitoring section, Pod Rightsizing section
  • INFO: Kubernetes manifest application (kubectl apply). Found in: Pod Rightsizing section
  • INFO: Read access to cluster state via get_k8s_resource, describe_k8s_resource. Found in: MCP Tools declaration

Referenced Domains

External domains referenced in skill content, detected by static analysis.

www.apache.org

Use Cases

  • Reduce GKE compute costs by 60-90% using Spot VMs for fault-tolerant workloads
  • Right-size over-provisioned pods based on VPA recommendations and P95 utilization metrics
  • Compare machine families and select cost-optimal instances for different workload types
  • Plan and purchase Committed Use Discounts to achieve 20-55% sustained savings
  • Analyze idle dev clusters and recommend shutdown or consolidation strategies

Quality Notes

  • Comprehensive coverage of GKE cost strategies with clear, actionable guidance
  • Well-structured sections with decision tables comparing options (Spot suitability, machine families, optimization rules)
  • Explicit warnings about Spot VM preemption risk and minimum replica requirements
  • Golden path security context provided as reference baseline
  • Clear distinction between production and dev/test acceptable deviations
  • Good use of YAML examples for VPA and ComputeClass configurations
  • VPA workflow documented step-by-step with wait time guidance
  • Cost savings quantified (e.g., Spot VMs 60-90%, CUDs 20-55%)
  • Scope limitations clearly stated: excludes GPU selection and general compute class provisioning
  • Machine type table includes relative cost positioning and use case mapping

Model: claude-haiku-4-5-20251001
Analyzed: Jun 24, 2026
