GKE Cost Optimization

This reference covers strategies for reducing GKE costs while maintaining the golden path security and reliability posture.

MCP Tools: get_k8s_resource, describe_k8s_resource, apply_k8s_manifest, patch_k8s_resource, get_cluster

Golden Path Cost Features

The golden path already includes cost-optimizing settings:

Setting	Value	Impact
`autoscalingProfile`	`OPTIMIZE_UTILIZATION`	Aggressive node scale-down reduces idle compute
`verticalPodAutoscaling`	`enabled`	VPA recommendations prevent over-provisioning
Autopilot pricing	Pay per pod request	No charge for unused node capacity
Node Auto Provisioning	enabled	Right-sized node pools created automatically

Cost Optimization Strategies

1. Spot VMs via ComputeClasses

Use Spot VMs for fault-tolerant workloads (60-90% cost reduction).

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: spot-with-fallback
spec:
  activeMigration:
    optimizeRulePriority: true
  priorities:
  - machineFamily: n4
    spot: true
  - machineFamily: n4
    spot: false

Spot-suitable workloads:

Workload	Spot-Suitable?
Batch / data processing	Yes
Dev / test environments	Yes
Stateless web/API (replicas >= 2)	Yes (with PDBs)
Jobs with checkpointing	Yes
Stateful workloads (databases)	No
Single-replica critical services	No

Handling eviction:

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 25  # Must be < 30s for Spot
      containers:
      - name: app
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 5"]
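
The manifest only buys time; the application still has to react to SIGTERM within it. Kubernetes starts the grace-period countdown before the preStop hook runs, so with a 5-second sleep and a 25-second grace period the process has roughly 20 seconds after SIGTERM arrives. The following is a minimal Python sketch of that pattern, not a prescribed implementation. The names `install_handler`, `run`, and `process_one_item` are hypothetical.

```python
import signal
import threading

# Set when Kubernetes sends SIGTERM (after the preStop hook has finished).
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; the work loop exits after the current item."""
    shutdown_requested.set()


def install_handler():
    """Register the handler. Call once from the main thread at startup."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run(process_one_item):
    """Call process_one_item repeatedly until shutdown is requested.

    Returns the number of items completed, so the caller can log or checkpoint it.
    """
    completed = 0
    while not shutdown_requested.is_set():
        process_one_item()
        completed += 1
    return completed
```

Each work item should be short enough to finish well inside the remaining grace period. Longer jobs would need to checkpoint partway through an item instead.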

1.1 Direct Workload Spot Selection (NodeSelector)

For stateless or batch workloads in GKE Autopilot, you do not need to create a custom ComputeClass to use Spot VMs. You can target the built-in Spot capacity directly by adding a nodeSelector to your Pod or Deployment spec.

[!WARNING] Preemption Warning: Spot VMs are interruptible and can be preempted at any time with a 30-second notice. Workloads must be fault-tolerant and run with at least 2 replicas for high availability. Always explicitly warn users about this preemption risk when recommending Spot VMs.

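apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateless-spot-app
spec:
  replicas: 2
  selector:  # Required by apps/v1; must match the Pod template labels
    matchLabels:
      app: stateless-spot-app
  template:
    metadata:
      labels:
        app: stateless-spot-app
    spec:
      nodeSelector:
        cloud.google.com/gke-provisioning: Spot
      terminationGracePeriodSeconds: 25  # Must be < 30s to allow graceful shutdown before preemption
      containers:
      - name: app
        image: <IMAGE>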

2. Pod Rightsizing

Use VPA recommendations to reduce over-provisioned requests.

# 1. Deploy VPA in recommendation mode
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: <DEPLOYMENT>-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <DEPLOYMENT>
  updatePolicy:
    updateMode: "Off"
EOF

# 2. Wait 24+ hours for data collection

# 3. Read recommendations
kubectl get vpa <DEPLOYMENT>-vpa -o jsonpath='{.status.recommendation}'
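
The recommendation read in step 3 can be reduced to the per-container target values. The sketch below assumes `kubectl` prints the object as JSON (older clients may print a Go-style map instead) and that it follows the VPA status layout with a `containerRecommendations` list. The function name `target_requests` and the sample values are illustrative only.

```python
import json


def target_requests(recommendation_json: str) -> dict:
    """Map container name -> target resources from a VPA .status.recommendation.

    Returns an empty dict when the VPA has not produced a recommendation yet.
    """
    if not recommendation_json.strip():
        return {}
    recommendation = json.loads(recommendation_json)
    return {
        item["containerName"]: dict(item.get("target", {}))
        for item in recommendation.get("containerRecommendations", [])
    }


# Illustrative input, shaped like the output of the kubectl command above.
SAMPLE = json.dumps({
    "containerRecommendations": [{
        "containerName": "app",
        "lowerBound": {"cpu": "25m", "memory": "128Mi"},
        "target": {"cpu": "50m", "memory": "256Mi"},
        "upperBound": {"cpu": "200m", "memory": "512Mi"},
    }]
})

print(target_requests(SAMPLE))
```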

Optimization rules:

Condition	Action	Savings
CPU request >5x P95 actual	Reduce to `P95 * 1.2`	High
Memory request >3x P95 actual	Reduce to `P95 * 1.2`	High
CPU request >2x P95 actual	Reduce to `P95 * 1.2`	Medium
No resource requests set	Add requests (enables bin-packing)	Medium
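
The first three rules in the table can be expressed as a small function. This is a sketch under stated assumptions: CPU is given in millicores and memory in MiB, both as integers; the `P95 * 1.2` target is rounded up; and the names `rightsize` and `Recommendation` are hypothetical. The "no requests set" row is not covered because it has no ratio to evaluate.

```python
from dataclasses import dataclass
from typing import Optional

# P95 * 1.2 expressed as an integer ratio to avoid floating-point rounding.
HEADROOM_NUM, HEADROOM_DEN = 6, 5


@dataclass(frozen=True)
class Recommendation:
    new_request: int  # same unit as the input (millicores or MiB)
    savings: str      # "High" or "Medium", as in the table


def rightsize(resource: str, request: int, p95: int) -> Optional[Recommendation]:
    """Apply the optimization rules; return None when no rule matches."""
    if request <= 0 or p95 <= 0:
        raise ValueError("request and p95 must be positive")
    if resource == "cpu":
        if request > 5 * p95:
            savings = "High"
        elif request > 2 * p95:
            savings = "Medium"
        else:
            return None
    elif resource == "memory":
        if request > 3 * p95:
            savings = "High"
        else:
            return None
    else:
        raise ValueError("resource must be 'cpu' or 'memory'")
    # Ceiling division: round the new request up, never down.
    new_request = -((-p95 * HEADROOM_NUM) // HEADROOM_DEN)
    return Recommendation(new_request, savings)
```

For example, a container requesting 1000m CPU with a P95 of 100m matches the >5x rule and would be reduced to 120m.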

3. Machine Type Selection

Family	Use Case	Relative Cost
e2	General purpose, burstable	Lowest
t2a / t2d	Scale-out (Arm/AMD), price-performance optimized	Low
n4a	Axion Arm-based, general-purpose price-performance	Low
n4 / n4d	General purpose (Intel/AMD), flexible shapes	Low-Medium
c4a	Compute-optimized (Arm), high efficiency	Medium-High
c3 / c4	Compute-optimized (Intel)	Medium-High
c3d / c4d	Compute-optimized (AMD), high-performance throughput	Medium-High
ek-standard	Autopilot enhanced (golden path)	Medium
m3 / x4	Memory-optimized, SAP HANA, large databases	High
g2 (L4 GPU)	AI inference	High
a3 (H100 GPU)	AI training	Highest
a4 / a4x	Ultra-scale AI (Blackwell GPUs)	Highest

In Autopilot, machine type is managed. Use ComputeClasses to influence selection.

4. Committed Use Discounts (CUDs)

For steady-state workloads, purchase 1-year or 3-year CUDs:

1-year: ~20-30% discount
3-year: ~50-55% discount
Applied automatically to matching usage in the region
Purchase via Google Cloud Console > Billing > Committed use discounts
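
Because a commitment is billed whether or not it is used, a CUD only pays off when usage stays above a break-even share of the committed amount. The sketch below illustrates that arithmetic. It assumes usage and commitment are both expressed in on-demand-equivalent cost per month, and the function names and discount values are illustrative, not actual pricing.

```python
def monthly_cost_with_cud(usage: float, commitment: float, discount: float) -> float:
    """Monthly cost when `commitment` is covered by a CUD at `discount` (0 to 1).

    The committed amount is billed at the discounted rate even if unused;
    usage above the commitment is billed at the on-demand rate.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    committed_charge = commitment * (1 - discount)
    overflow = max(0.0, usage - commitment)
    return committed_charge + overflow


def break_even_utilization(discount: float) -> float:
    """Fraction of the commitment that must be used for the CUD to break even."""
    return 1 - discount


# Steady usage above the commitment: 800 * 0.5 + 200 = 600, versus 1000 on demand.
print(monthly_cost_with_cud(usage=1000, commitment=800, discount=0.5))
# Usage far below the commitment: 400, versus 300 on demand, so the CUD loses money.
print(monthly_cost_with_cud(usage=300, commitment=800, discount=0.5))
```

This is why CUDs suit steady-state workloads: size the commitment to the usage floor and leave bursts on demand or on Spot.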

5. Cluster Management

Stop/start dev clusters: Idle dev clusters cost money even with no workloads (control plane fee).
Right-size node pools (Standard): Use Cluster Autoscaler with appropriate min/max.
Multi-tenant clusters: Share a single cluster across teams instead of per-team clusters (see the gke-multitenancy skill).

Cost Monitoring

# List budgets configured on the billing account (requires the Cloud Billing Budget API)
gcloud billing budgets list --billing-account=<BILLING_ACCOUNT> --quiet

# View node utilization
kubectl top nodes

# View per-container resource usage (compare with the requests in each Pod spec)
kubectl top pods --all-namespaces --containers
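
`kubectl top` reports usage only, so comparing it with requests takes a second step. The sketch below parses the command's text output and computes a request-to-usage ratio for CPU. It assumes the five-column layout shown in `SAMPLE` (namespace, pod, container, CPU, memory) with memory reported in `Mi`. The requested value is supplied by the caller, and all names are hypothetical. A single snapshot is not a P95, so treat the ratio as a rough signal only.

```python
from typing import List, NamedTuple


class ContainerUsage(NamedTuple):
    namespace: str
    pod: str
    container: str
    cpu_millicores: int
    memory_mib: int


def parse_cpu(quantity: str) -> int:
    """Convert a CPU quantity such as '250m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def parse_memory_mib(quantity: str) -> int:
    """Convert a memory quantity such as '120Mi' to MiB (other units unsupported)."""
    if not quantity.endswith("Mi"):
        raise ValueError(f"unsupported memory unit: {quantity}")
    return int(quantity[:-2])


def parse_top_output(text: str) -> List[ContainerUsage]:
    """Parse `kubectl top pods --all-namespaces --containers` output, skipping the header."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] == "NAMESPACE":
            continue
        namespace, pod, container, cpu, memory = fields
        rows.append(ContainerUsage(namespace, pod, container,
                                   parse_cpu(cpu), parse_memory_mib(memory)))
    return rows


def cpu_request_ratio(usage: ContainerUsage, requested_millicores: int) -> float:
    """Ratio of requested to used CPU; usage is floored at 1m to avoid dividing by zero."""
    return requested_millicores / max(usage.cpu_millicores, 1)


# Illustrative output; real values come from the cluster.
SAMPLE = """NAMESPACE   POD       NAME   CPU(cores)   MEMORY(bytes)
default     web-abc   app    50m          120Mi
"""
```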

Dev/Test Cost Savings

For non-production environments, these golden path deviations are acceptable:

| Setting | Production (Golden Path) | Dev/Test |
| ----------------------- | ------------------------ | ----------------------------------- |
| Cluster mode | Autopilot | Autopilot (cheaper with fewer pods) |
| Release channel | Regular | Rapid (get fixes faster) |
| Private nodes | Required | Optional (simpler access) |
| Monitoring components | Full suite | SYSTEM_COMPONENTS only |
| Secret Manager rotation | 120s | Disabled |
| Maintenance windows | Configured | Not needed |
