
Domain 3: Deploying and Implementing a Cloud Solution (~25%)

Domain 3 is the largest domain on the Associate Cloud Engineer exam, accounting for approximately 25% of the total score -- roughly 13 to 15 questions. This domain is purely practical: you must know how to deploy compute, containers, serverless, data, and networking resources using both the Cloud Console and gcloud CLI. The exam covers six sub-domains spanning Compute Engine, GKE, Cloud Run, Cloud Functions, data products, networking, and infrastructure as code.


3.1 Deploying and Implementing Compute Engine Resources

Creating a Compute Instance

A Compute Engine VM instance is the fundamental compute resource on Google Cloud. You must know both Console and CLI creation paths.

Required parameters for every instance:

Parameter Description Default
Name Unique within the project + zone None (required)
Zone The zone where the VM runs Depends on project/config defaults
Machine type CPU/memory combination (e.g., e2-medium, n2-standard-4) e2-medium
Boot disk image OS image (Debian, Ubuntu, CentOS, Windows, etc.) Debian
Network/Subnet VPC and subnet for the primary NIC default network

gcloud CLI -- create a basic instance:

gcloud compute instances create my-vm \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --image-family=debian-12 \
  --image-project=debian-cloud \
  --boot-disk-size=20GB \
  --boot-disk-type=pd-balanced

Key configuration categories the exam tests:

  • Disks: Boot disk (image, size, type: pd-standard, pd-balanced, pd-ssd, pd-extreme), additional persistent disks, local SSDs
  • Availability policy: --maintenance-policy=MIGRATE|TERMINATE (on-host maintenance), --restart-on-failure (automatic restart), provisioning model (--provisioning-model=STANDARD|SPOT)
  • SSH keys: Project-wide metadata keys vs. instance-specific keys via --metadata=ssh-keys=USERNAME:KEY
  • Service account: --service-account=SA_EMAIL with --scopes= to limit API access
  • Labels: --labels=env=prod,team=backend for resource organization
  • Startup script: --metadata-from-file=startup-script=script.sh
  • Preemptible/Spot VMs: Use --provisioning-model=SPOT for fault-tolerant workloads at up to 91% discount; instances can be terminated with 30-second notice

The required IAM role is roles/compute.instanceAdmin.v1, which bundles compute.instances.create and related permissions.

Exam trap: --provisioning-model=SPOT replaces the deprecated --preemptible flag. Spot VMs have no maximum runtime (unlike legacy preemptible VMs, which had a 24-hour limit). Both can be reclaimed at any time.

Managed Instance Groups (MIGs) and Autoscaling

A managed instance group manages a fleet of identical VMs based on an instance template. The exam tests creation, autoscaling configuration, and update strategies.

Instance templates define the VM blueprint (machine type, image, disks, network). Templates are immutable -- to change configuration, create a new template and update the MIG.

Zonal vs. Regional MIGs:

Feature Zonal MIG Regional MIG
Scope Single zone Multiple zones in one region
Default max VMs 1,000 2,000
Use case Dev/test, single-zone workloads Production HA workloads
Zone selection Fixed at creation Zones selected at creation, locked after

Creating a MIG with autoscaling:

# Create instance template
gcloud compute instance-templates create my-template \
  --machine-type=e2-medium \
  --image-family=debian-12 \
  --image-project=debian-cloud

# Create managed instance group
gcloud compute instance-groups managed create my-mig \
  --template=my-template \
  --size=2 \
  --zone=us-central1-a

# Configure autoscaling
gcloud compute instance-groups managed set-autoscaling my-mig \
  --zone=us-central1-a \
  --min-num-replicas=2 \
  --max-num-replicas=10 \
  --target-cpu-utilization=0.6

Autoscaling metrics:

Metric Flag Description
CPU utilization --target-cpu-utilization=0.6 Scale when average CPU exceeds target
Load balancing capacity --target-load-balancing-utilization Scale based on backend serving capacity
Cloud Monitoring metric --custom-metric-utilization Scale on any custom metric
Schedule-based Console/API only Scale at predetermined times
Predictive Console/API only Forecasts load based on historical patterns

Autohealing recreates unhealthy VMs automatically using application-level health checks (not just system health). Configure with --health-check= flag.
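
A minimal sketch of wiring up autohealing (health check and group names are hypothetical): create an application-level health check, then attach it to the MIG.

```shell
# Create an HTTP health check that probes the application endpoint
gcloud compute health-checks create http my-health-check \
  --port=80 \
  --request-path=/healthz

# Attach it to the MIG; --initial-delay gives new VMs time to boot
# before autohealing judges them unhealthy
gcloud compute instance-groups managed update my-mig \
  --zone=us-central1-a \
  --health-check=my-health-check \
  --initial-delay=300
```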

Update policies:

  • Rolling update: Gradually replaces instances with new template. Control with --update-policy= (proactive or opportunistic) and --max-surge= / --max-unavailable=.
  • Canary update: Apply new template to a subset by specifying --target-size= on the new template version.
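
The two update styles above might be invoked as follows (template and group names are hypothetical):

```shell
# Rolling update: replace every instance with the new template
gcloud compute instance-groups managed rolling-action start-update my-mig \
  --zone=us-central1-a \
  --version=template=my-template-v2 \
  --max-surge=3 \
  --max-unavailable=0

# Canary update: apply the new template to roughly 10% of the group,
# leaving the remainder on the current template
gcloud compute instance-groups managed rolling-action start-update my-mig \
  --zone=us-central1-a \
  --version=template=my-template \
  --canary-version=template=my-template-v2,target-size=10%
```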

Exam trap: Stateful MIGs (those preserving specific disks, IPs, or metadata per instance) cannot use autoscaling. If a question describes persistent data on individual MIG instances with autoscaling, that is an invalid configuration.

OS Login

OS Login replaces traditional SSH key metadata with IAM-based access control.

Enabling OS Login:

# Project-level
gcloud compute project-info add-metadata --metadata enable-oslogin=TRUE

# Instance-level (overrides project setting)
gcloud compute instances add-metadata my-vm --metadata enable-oslogin=TRUE

When OS Login is enabled, Compute Engine deletes the VM's authorized_keys file and no longer accepts connections from SSH keys stored in project or instance metadata.

Required IAM roles:

Role Access Level
roles/compute.osLogin Standard (non-root) SSH access
roles/compute.osAdminLogin Root/sudo SSH access
roles/iam.serviceAccountUser Required if VM uses a service account
roles/compute.osLoginExternalUser Cross-organization access (granted at org level)

OS Login 2FA: Enable with metadata key enable-oslogin-2fa=TRUE. Requires users to have 2-step verification on their Google Account. Does not apply to service account connections.
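
Putting the pieces together, a sketch of granting SSH access under OS Login (project, user, and VM names are hypothetical):

```shell
# Grant standard (non-root) SSH access at the project level
gcloud projects add-iam-policy-binding my-project \
  --member=user:alice@example.com \
  --role=roles/compute.osLogin

# The user then connects as usual; no metadata SSH keys are involved
gcloud compute ssh my-vm --zone=us-central1-a
```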

Exam trap: OS Login is not supported on Windows Server VMs. If a question mentions Windows and OS Login together, that is not a valid combination.

VM Manager and Ops Agent

VM Manager provides OS patch management and compliance monitoring for Compute Engine fleets. Key capabilities: OS patch deployment (on-demand and scheduled), OS inventory management, and OS policy compliance reporting.

Ops Agent is the unified agent for Cloud Monitoring and Cloud Logging. It replaces the legacy Monitoring Agent and Logging Agent.

# Install Ops Agent on a running VM
gcloud compute ssh my-vm -- \
  'curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh && \
   sudo bash add-google-cloud-ops-agent-repo.sh --also-install'

The Ops Agent collects system metrics (CPU, memory, disk, network) and logs (syslog, application logs) and sends them to Cloud Monitoring and Cloud Logging respectively.


3.2 Deploying and Implementing GKE Resources

kubectl CLI Configuration

After creating a GKE cluster, configure kubectl to communicate with it:

# Install kubectl (if not bundled with gcloud)
gcloud components install kubectl

# Get cluster credentials -- writes to ~/.kube/config
gcloud container clusters get-credentials CLUSTER_NAME --location=LOCATION

This command configures kubectl to use the specified cluster as the current context. You can manage multiple cluster contexts in ~/.kube/config.
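
For example, switching between clusters uses the standard kubectl context commands (the context name shown is illustrative of GKE's naming pattern):

```shell
# List all configured contexts; the current one is marked with *
kubectl config get-contexts

# Switch the active context to another cluster
kubectl config use-context gke_my-project_us-central1_my-cluster

# Show which context kubectl is currently using
kubectl config current-context
```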

GKE Cluster Modes

The exam tests three cluster deployment decisions: mode (Autopilot vs. Standard), scope (zonal vs. regional), and access (public vs. private).

Autopilot vs. Standard:

Feature Autopilot Standard
Node management Google-managed User-managed
Node pools No user-managed node pools Full node pool control
Pricing Per-pod resource requests Per-node (full VM cost)
Scaling Automatic Manual or cluster autoscaler
Security Hardened by default (Shielded nodes, Workload Identity) User-configured
Use case Most production workloads (recommended) Specialized hardware, GPUs, custom node configs

Creating clusters:

# Autopilot cluster (recommended for most use cases)
gcloud container clusters create-auto my-cluster \
  --location=us-central1

# Standard cluster
gcloud container clusters create my-cluster \
  --location=us-central1 \
  --num-nodes=3 \
  --machine-type=e2-standard-4

Exam trap: The create-auto subcommand is for Autopilot. The create subcommand is for Standard. Do not confuse them. If a question says "recommended mode," the answer is Autopilot.

Zonal vs. Regional clusters:

Feature Zonal Regional
Control plane Single zone Three zones (automatic)
Node distribution Single zone Across multiple zones
HA guarantee No control plane redundancy Control plane survives zone failure
Use case Dev/test Production
Flag --zone=us-central1-a --location=us-central1 (region, not zone)

Private clusters restrict nodes to internal IP addresses only. The control plane has a private endpoint, and you can optionally enable authorized networks to allow specific external IPs to reach the control plane.

gcloud container clusters create my-private-cluster \
  --location=us-central1 \
  --enable-private-nodes \
  --enable-private-endpoint \
  --master-ipv4-cidr=172.16.0.0/28

GKE Enterprise adds fleet management, multi-cluster services, Config Sync, Policy Controller, and service mesh capabilities across multiple GKE clusters, including on-premises (via GKE on-prem).

Deploying a Containerized Application

The standard workflow: push a container image to Artifact Registry, create a Deployment, expose it with a Service.

# Deploy an application
kubectl create deployment hello-server \
  --image=us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0

# Expose via LoadBalancer (creates a Cloud Load Balancer)
kubectl expose deployment hello-server \
  --type=LoadBalancer \
  --port=80 \
  --target-port=8080

# Verify
kubectl get pods
kubectl get service hello-server

Kubernetes Service types on the exam:

Type Description External Access
ClusterIP Internal-only; default No
NodePort Exposes on each node's IP at a static port Yes (node IP:port)
LoadBalancer Provisions a Cloud Load Balancer Yes (external IP)

Exam trap: LoadBalancer service type creates a Compute Engine load balancer, which incurs additional cost. If a question asks about exposing a GKE service externally with minimal effort, LoadBalancer is the answer. If it asks about internal-only communication between microservices, ClusterIP is correct.

GKE Monitoring and Logging

GKE clusters have Cloud Monitoring and Cloud Logging enabled by default. System and workload logs are sent to Cloud Logging; metrics are sent to Cloud Monitoring. You can configure this at cluster creation or update:

gcloud container clusters update my-cluster \
  --location=us-central1 \
  --logging=SYSTEM,WORKLOAD \
  --monitoring=SYSTEM,WORKLOAD

3.3 Deploying Cloud Run and Cloud Functions Resources

Cloud Run Deployment

Cloud Run is a fully managed serverless platform that runs stateless containers. It supports two deployment methods: from a container image or directly from source code.

From a container image:

gcloud run deploy my-service \
  --image=us-docker.pkg.dev/PROJECT/REPO/IMAGE:TAG \
  --region=us-central1 \
  --allow-unauthenticated

From source code (builds automatically using Cloud Build):

gcloud run deploy my-service \
  --source=. \
  --region=us-central1

Key configuration parameters:

Parameter Flag Default
Authentication --allow-unauthenticated or --no-allow-unauthenticated Requires auth
Min instances --min-instances=N 0 (scales to zero)
Max instances --max-instances=N 100
Memory --memory=512Mi 512 MiB
CPU --cpu=1 1 vCPU
Concurrency --concurrency=N 80
Timeout --timeout=300 300 seconds
Port --port=8080 8080
CPU allocation --cpu-throttling / --no-cpu-throttling Throttled (CPU only during requests)

Revision and traffic management: Each deployment creates an immutable revision. You can split traffic between revisions for canary deployments:

gcloud run services update-traffic my-service \
  --to-revisions=my-service-v2=10,my-service-v1=90

Sidecar containers: Cloud Run supports up to 10 containers per instance (one ingress container plus sidecars). Sidecars share the network namespace and can communicate via localhost.

Cloud Run is a regional service -- infrastructure runs in a specific region and is redundantly available across all zones within that region.

Required IAM roles: roles/run.developer (deploy services), roles/iam.serviceAccountUser (use service identity), roles/artifactregistry.reader (pull images).

Exam trap: --allow-unauthenticated grants the IAM Invoker role to allUsers, making the service public. If the question describes a public-facing API, this is the flag. If it describes an internal microservice, omit it.
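
For the internal case, a sketch of granting the Invoker role to a single caller instead of allUsers (the service account name is hypothetical):

```shell
# Deploy without --allow-unauthenticated, then permit only one
# service account to invoke the service
gcloud run services add-iam-policy-binding my-service \
  --region=us-central1 \
  --member=serviceAccount:caller@my-project.iam.gserviceaccount.com \
  --role=roles/run.invoker
```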

Cloud Functions Deployment

Cloud Functions (now officially "Cloud Run functions") provide event-driven serverless compute. The exam tests both 1st gen and 2nd gen.

1st Gen vs. 2nd Gen:

Feature 1st Gen 2nd Gen (Cloud Run functions)
Platform Dedicated Functions infrastructure Built on Cloud Run
Concurrency 1 request per instance Up to 1,000 concurrent requests per instance
Max timeout 9 minutes (HTTP), 9 minutes (event) 60 minutes (HTTP), 9 minutes (event)
Max memory 8 GB 32 GB
Min instances Not supported Supported (avoid cold starts)
Traffic splitting Not supported Supported (via Cloud Run)
Trigger mechanism Native triggers Eventarc

Deploying a function (2nd gen):

gcloud functions deploy my-function \
  --gen2 \
  --runtime=python312 \
  --trigger-http \
  --entry-point=hello_world \
  --region=us-central1 \
  --allow-unauthenticated

Function signature types:

  • HTTP functions: Triggered by HTTP requests; return HTTP responses. Ideal for webhooks and REST endpoints.
  • CloudEvent functions: Triggered by events (Pub/Sub, Cloud Storage, Firestore, Eventarc). Use the CloudEvents specification.

Supported runtimes: Node.js, Python, Go, Java, .NET (C#), Ruby, PHP.

Event-Driven Deployments

The exam expects you to know how to wire functions to event sources:

# Pub/Sub trigger
gcloud functions deploy my-function --gen2 \
  --trigger-topic=my-topic --runtime=python312 --entry-point=process_message

# Cloud Storage trigger
gcloud functions deploy my-function --gen2 \
  --trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
  --trigger-event-filters="bucket=my-bucket" \
  --runtime=python312 --entry-point=process_file

Eventarc is the unified eventing system for 2nd gen functions and Cloud Run. It routes events from Google Cloud services, Pub/Sub, and third-party sources.

Choosing Between Cloud Run and Cloud Functions

Decision Factor Cloud Run Cloud Functions
Input Container image or source Source code only
Language Any (containerized) Supported runtimes only
Complexity Full application, multiple routes Single-purpose function
Concurrency High (hundreds per instance) 1st gen: 1; 2nd gen: configurable
Long-running Up to 60 min (or always-on) Up to 60 min (HTTP, 2nd gen)
Event-driven Via Eventarc or Pub/Sub push Native trigger integration

Exam trap: If a question describes a containerized application with multiple endpoints, Cloud Run is correct. If it describes a lightweight, single-purpose event handler (e.g., "process image uploads"), Cloud Functions is the simpler answer. 2nd gen Cloud Functions are built on Cloud Run, so the lines are blurring -- but the exam still treats them as distinct products.


3.4 Deploying and Implementing Data Solutions

Data Product Overview

The exam tests basic deployment (initialization, configuration) of the following services. You do not need deep DBA-level knowledge, but you must know how to create each and load data.

Service Type Key Characteristic
Cloud SQL Managed relational DB MySQL, PostgreSQL, SQL Server; HA with REGIONAL availability
Cloud Spanner Globally distributed relational DB Horizontal scaling, strong consistency, 99.999% SLA (multi-region)
Firestore Managed NoSQL document DB Serverless, real-time sync, offline support; Native mode or Datastore mode
BigQuery Serverless data warehouse Columnar storage, SQL analytics, separate compute/storage billing
Pub/Sub Managed messaging service At-least-once delivery, push/pull subscriptions, global
Dataflow Managed stream/batch processing Apache Beam-based, autoscaling, exactly-once processing
Cloud Storage Object storage Buckets with storage classes (Standard, Nearline, Coldline, Archive)
AlloyDB Managed PostgreSQL-compatible DB AI-optimized, columnar engine; Google cites up to 4x faster than standard PostgreSQL for transactional workloads and up to 100x for analytical queries

Cloud SQL Instance Creation

# Create a MySQL instance with HA
gcloud sql instances create my-instance \
  --database-version=MYSQL_8_0 \
  --tier=db-n1-standard-4 \
  --region=us-central1 \
  --availability-type=REGIONAL \
  --backup-start-time=02:00 \
  --enable-bin-log

# Create a database
gcloud sql databases create mydb --instance=my-instance

# Create a user
gcloud sql users create myuser --instance=my-instance --password=SECRET

Cloud SQL editions: Enterprise (default for older MySQL) and Enterprise Plus (default for MySQL 8.4+; adds data cache, near-zero-downtime maintenance).

Networking: Public IP (requires authorized networks) or Private IP (requires VPC private services access). Private IP is recommended for production.
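
A sketch of the recommended private-IP setup, assuming VPC private services access is already configured (instance, network, and project names are hypothetical):

```shell
# Create an instance reachable only via private IP on my-vpc
gcloud sql instances create my-private-instance \
  --database-version=POSTGRES_15 \
  --tier=db-custom-2-8192 \
  --region=us-central1 \
  --network=projects/my-project/global/networks/my-vpc \
  --no-assign-ip
```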

Exam trap: Cloud SQL REGIONAL availability type means HA with a standby in a different zone. ZONAL means single-zone with no automatic failover. If a question asks about database HA, the answer involves --availability-type=REGIONAL.

AlloyDB

AlloyDB is a fully managed, PostgreSQL-compatible database designed for demanding workloads. Key exam-relevant facts:

  • PostgreSQL-compatible: Supports PostgreSQL wire protocol and most extensions
  • HA: Primary instance plus read pool instances; automatic failover
  • Columnar engine: Accelerates analytical queries without separate data warehouse
  • AI integration: Built-in vector search and ML model serving
  • Pricing: Separate compute and storage billing, like Spanner

Loading Data

The exam tests multiple data-loading methods:

Method Use Case Example
gcloud sql import Load SQL dump or CSV into Cloud SQL gcloud sql import sql my-instance gs://bucket/dump.sql --database=mydb
bq load Load data into BigQuery bq load --source_format=CSV dataset.table gs://bucket/data.csv
gsutil cp Upload files to Cloud Storage gsutil cp local-file.csv gs://my-bucket/
Storage Transfer Service Scheduled/recurring transfers from S3, HTTP, other buckets Console or API
Transfer Appliance Petabyte-scale offline transfer Physical device shipped to your location
BigQuery streaming Real-time row inserts bq insert or Pub/Sub -> Dataflow -> BigQuery
Dataflow ETL pipeline (batch or streaming) Apache Beam pipeline deployed to Dataflow

Exam trap: For importing a SQL dump into Cloud SQL, the file must be in Cloud Storage first -- you cannot import directly from a local machine via gcloud. The pattern is: upload to GCS, then gcloud sql import.
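
The two-step pattern looks like this (bucket and file names are hypothetical):

```shell
# Step 1: upload the dump to Cloud Storage
gcloud storage cp dump.sql gs://my-bucket/dump.sql

# Step 2: import from the bucket into Cloud SQL
gcloud sql import sql my-instance gs://my-bucket/dump.sql \
  --database=mydb
```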


3.5 Deploying and Implementing Networking Resources

VPC Networks

A VPC network is a global resource that provides networking for Compute Engine, GKE, and other services. VPC networks contain subnets (regional resources) and are isolated by default.

Auto Mode vs. Custom Mode:

Feature Auto Mode Custom Mode
Subnet creation One per region, automatic Manual only
IP range Predefined within 10.128.0.0/9 User-defined
New regions Subnet auto-created Must add manually
IPv6 support No Yes
Conversion Can convert to custom (one-way) Cannot revert to auto
Use case Quick prototyping, simple setups Production (recommended)

# Create a custom-mode VPC
gcloud compute networks create my-vpc --subnet-mode=custom

# Create a subnet
gcloud compute networks subnets create my-subnet \
  --network=my-vpc \
  --region=us-central1 \
  --range=10.0.1.0/24

Default network: Every new project gets an auto-mode VPC named default with pre-populated firewall rules. Disable with organization policy compute.skipDefaultNetworkCreation.

Implied firewall rules (cannot be deleted):

Rule Direction Action Priority
Implied deny ingress Ingress Deny all 65535 (lowest)
Implied allow egress Egress Allow all 65535 (lowest)

These are always present. The default network adds permissive rules on top (e.g., default-allow-internal, default-allow-ssh, default-allow-rdp, default-allow-icmp).

Key VPC concepts:

  • Subnets are regional (not zonal), but VPC networks and firewall rules are global
  • Four reserved IPs per primary subnet range (network address, default gateway, second-to-last address, and broadcast address)
  • Private Google Access: Allows VMs without external IPs to reach Google APIs. Enable per-subnet.
  • VPC Flow Logs: Capture network flow data for monitoring and forensics. Enable per-subnet.

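Both of the per-subnet settings above can be enabled with a single update command (subnet name hypothetical):

```shell
# Enable Private Google Access and VPC Flow Logs on an existing subnet
gcloud compute networks subnets update my-subnet \
  --region=us-central1 \
  --enable-private-ip-google-access \
  --enable-flow-logs
```
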
Exam trap: Auto-mode VPCs cannot support IPv6 subnets. If a question requires dual-stack or IPv6 networking, the answer must involve a custom-mode VPC.

Shared VPC

Shared VPC allows an organization to centralize network management in a host project while allowing service projects to use its subnets.

Key terminology:

Term Definition
Host project Contains the Shared VPC network and subnets
Service project Attached to a host project; creates resources in shared subnets
Shared VPC Admin Enables host projects and attaches service projects (requires compute.xpnAdmin and resourcemanager.projectIamAdmin)
Service Project Admin Creates resources in shared subnets (requires compute.networkUser on host project or specific subnets)

Constraints:

  • A project cannot be both host and service project simultaneously
  • Each service project attaches to exactly one host project
  • Multiple host projects can exist within an organization
  • Permission can be granted at project level (all subnets) or subnet level (specific subnets only)
  • Billing goes to the service project where resources reside
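
A sketch of the Shared VPC Admin workflow (project IDs, subnet, and user are hypothetical):

```shell
# Enable the host project (requires compute.xpnAdmin)
gcloud compute shared-vpc enable host-project-id

# Attach a service project to the host
gcloud compute shared-vpc associated-projects add service-project-id \
  --host-project=host-project-id

# Subnet-level sharing: let a user create resources in one subnet only
gcloud compute networks subnets add-iam-policy-binding my-subnet \
  --region=us-central1 \
  --project=host-project-id \
  --member=user:dev@example.com \
  --role=roles/compute.networkUser
```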

Shared VPC vs. VPC Peering:

Feature Shared VPC VPC Peering
Model Centralized (host/service hierarchy) Peer-to-peer (no hierarchy)
Organization Requires same organization Can cross organizations
Administration Central network admin team Each project manages its own network
Subnet sharing Service projects use host subnets Each VPC keeps its own subnets
Use case Enterprise multi-team centralized networking Cross-organization or independent team connectivity

Exam trap: Shared VPC requires an organization. If the question describes projects without an organization, Shared VPC is not possible -- use VPC Peering instead.

Firewall Rules and Policies

Firewall rules control ingress and egress traffic to VMs. They can target instances by network tags, service accounts, or all instances in the network.

# Allow HTTP ingress from any source to instances tagged "web-server"
gcloud compute firewall-rules create allow-http \
  --network=my-vpc \
  --allow=tcp:80 \
  --target-tags=web-server \
  --source-ranges=0.0.0.0/0

# Allow internal communication within a subnet
gcloud compute firewall-rules create allow-internal \
  --network=my-vpc \
  --allow=tcp,udp,icmp \
  --source-ranges=10.0.1.0/24

Rule priority: 0 (highest) to 65535 (lowest). Lower numbers take precedence. The implied deny/allow rules are at priority 65535.

Targeting methods:

Method Flag Scope
Network tags --target-tags=TAG Instances with matching tag
Service accounts --target-service-accounts=SA Instances using matching SA
All instances (no target flag) Every instance in the network

Firewall policies (hierarchical): Applied at the organization or folder level, they override or supplement VPC firewall rules. Policies use a goto_next action to delegate to the next level.

Exam trap: Service account-based targeting is more secure than network tags because tags can be modified by anyone with instance edit permissions, while attaching a service account to an instance requires the roles/iam.serviceAccountUser role.
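
A sketch of service-account-based targeting (account and network names are hypothetical):

```shell
# Allow only VMs running as the app SA to reach VMs running as the db SA
gcloud compute firewall-rules create allow-app-to-db \
  --network=my-vpc \
  --allow=tcp:5432 \
  --target-service-accounts=db-sa@my-project.iam.gserviceaccount.com \
  --source-service-accounts=app-sa@my-project.iam.gserviceaccount.com
```

Note that a single firewall rule cannot mix network tags and service accounts; pick one targeting method per rule.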

Cloud VPN

Cloud VPN provides encrypted site-to-site connectivity between your on-premises network and Google Cloud VPCs. It does not support client-to-gateway VPN.

HA VPN vs. Classic VPN:

Feature HA VPN Classic VPN
SLA 99.99% 99.9%
Routing Dynamic only (BGP required) Static or dynamic
Interfaces Two (automatic external IPs) One (manual IP + forwarding rules)
IPv6 Supported Not supported
Recommended Yes Deprecated for new deployments

HA VPN requires a Cloud Router with BGP for dynamic route exchange. Each HA VPN gateway has two interfaces for redundancy.

# Create HA VPN gateway
gcloud compute vpn-gateways create my-ha-vpn \
  --network=my-vpc \
  --region=us-central1

# Create Cloud Router
gcloud compute routers create my-router \
  --network=my-vpc \
  --region=us-central1 \
  --asn=65001

# Create VPN tunnels (one per interface for 99.99% SLA)
gcloud compute vpn-tunnels create tunnel-0 \
  --vpn-gateway=my-ha-vpn \
  --peer-gcp-gateway=peer-gateway \
  --region=us-central1 \
  --ike-version=2 \
  --shared-secret=SECRET \
  --router=my-router \
  --vpn-gateway-region=us-central1 \
  --interface=0

Key facts:

  • Each tunnel supports up to 250,000 packets/second (~3 Gbps)
  • HA VPN gateway stack type (IPv4-only, dual-stack, IPv6-only) cannot be changed after creation
  • Supports IKEv1 and IKEv2 (IKEv2 recommended for cipher customization)
  • Cloud VPN is a regional resource
  • Replay detection uses a 4,096-packet window (not configurable)

Exam trap: HA VPN requires two tunnels (one per interface) to achieve the 99.99% SLA. A single tunnel provides no SLA guarantee. If a question mentions "highest availability" for VPN, the answer requires both interfaces configured.

VPC Network Peering

VPC Network Peering connects two VPC networks for internal IP communication, even across different projects or organizations. Key properties:

  • Non-transitive: If VPC-A peers with VPC-B, and VPC-B peers with VPC-C, VPC-A cannot reach VPC-C through VPC-B
  • No overlapping CIDR ranges: Peered networks cannot have overlapping subnet ranges
  • Decentralized: Each side manages its own network independently
  • No external IP needed: Traffic stays on Google's internal network
  • Both sides must configure: Peering requires configuration from both VPC networks

# From project-a: peer with project-b's VPC
gcloud compute networks peerings create peer-ab \
  --network=vpc-a \
  --peer-project=project-b \
  --peer-network=vpc-b

# From project-b: peer with project-a's VPC (both sides required)
gcloud compute networks peerings create peer-ba \
  --network=vpc-b \
  --peer-project=project-a \
  --peer-network=vpc-a

Exam trap: VPC Peering is non-transitive. If the exam describes a hub-and-spoke topology where spoke networks need to communicate through a central hub, VPC Peering alone will not work -- you would need full mesh peering or a different approach (e.g., Cloud VPN with a hub).


3.6 Implementing Resources Through Infrastructure as Code (IaC)

Terraform for Google Cloud

Terraform is the primary IaC tool tested on the exam for Google Cloud deployments.

Google Cloud provider configuration:

provider "google" {
  project = "my-project-id"
  region  = "us-central1"
}

resource "google_compute_instance" "web" {
  name         = "web-server"
  machine_type = "e2-medium"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }

  network_interface {
    network    = "default"
    access_config {} # Assigns external IP
  }
}

State management:

Location Use Case Configuration
Local (default) Individual developer terraform.tfstate in working directory
GCS backend (recommended) Team collaboration backend "gcs" { bucket = "my-tf-state" }

Remote state in GCS provides state locking, team sharing, and a single source of truth; enable object versioning on the bucket so earlier state files can be recovered. Always use remote state for production.
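
Creating the state bucket might look like this (bucket name hypothetical):

```shell
# Create a bucket for Terraform state with uniform access control
gcloud storage buckets create gs://my-tf-state \
  --location=us-central1 \
  --uniform-bucket-level-access

# Turn on object versioning so earlier state files can be recovered
gcloud storage buckets update gs://my-tf-state --versioning
```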

Core workflow:

terraform init      # Initialize provider plugins and backend
terraform plan      # Preview changes (dry run)
terraform apply     # Apply changes (creates/modifies resources)
terraform destroy   # Destroy all managed resources

Best practices from Google Cloud documentation:

  • Use main.tf, variables.tf, outputs.tf file structure
  • Group related resources logically (e.g., network.tf, instances.tf)
  • Name single resources of a type main for simplified references
  • Use modules for reusable infrastructure patterns
  • Store state remotely in GCS with object versioning
  • Use prevent_destroy lifecycle rule for stateful resources

Cloud Foundation Toolkit (CFT)

The Cloud Foundation Toolkit provides opinionated, production-ready Terraform modules for Google Cloud. These are pre-built, tested blueprints for common infrastructure patterns:

  • Project Factory: Standardized project creation with billing, IAM, and APIs
  • Network modules: VPC, subnets, firewall rules, Cloud NAT
  • GKE module: Cluster creation with security best practices
  • Service account module: SA creation with least-privilege roles

CFT modules are published to the Terraform Registry and follow Google's recommended architecture patterns.

Config Connector

Config Connector lets you manage Google Cloud resources using Kubernetes-native YAML manifests and kubectl. Resources are defined as Kubernetes custom resources.

apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeNetwork
metadata:
  name: my-vpc
spec:
  autoCreateSubnetworks: false

Key exam facts:

  • Runs as an add-on in a GKE cluster
  • Manages Google Cloud resources through Kubernetes CRDs (Custom Resource Definitions)
  • Reconciles desired state continuously (like all Kubernetes controllers)
  • Best for teams already invested in Kubernetes and GitOps workflows

Helm Charts

Helm is the package manager for Kubernetes. It bundles Kubernetes manifests into reusable, versioned charts.

# Install a Helm chart
helm install my-release my-chart/ --namespace my-ns

# Upgrade a release
helm upgrade my-release my-chart/ --set image.tag=v2.0

# List releases
helm list

Exam-relevant facts:

  • Helm charts contain templates, values files, and metadata
  • Charts can be stored in Artifact Registry or Helm repositories
  • values.yaml provides default configuration; override with --set or -f custom-values.yaml
  • Helm manages release history for rollback support

IaC Tool Comparison

Tool Language Scope Best For
Terraform HCL Multi-cloud, any GCP resource General infrastructure provisioning
Cloud Foundation Toolkit HCL (Terraform modules) Google Cloud best-practice patterns Enterprise landing zones, standardized projects
Config Connector YAML (Kubernetes CRDs) Google Cloud resources via kubectl Kubernetes-native teams, GitOps
Helm YAML (Go templates) Kubernetes application packaging Deploying apps to GKE clusters

Exam trap: Terraform manages infrastructure resources (VMs, networks, databases). Helm manages Kubernetes application deployments. Config Connector bridges both by managing GCP resources through Kubernetes. If a question asks about "deploying infrastructure," think Terraform. If it asks about "deploying an application to GKE," think Helm or kubectl. If it asks about "managing GCP resources from Kubernetes," think Config Connector.


Exam Strategy for Domain 3

  1. Know the CLI commands: This domain is heavily gcloud-focused. Memorize the core subcommands: gcloud compute instances create, gcloud container clusters create-auto, gcloud run deploy, gcloud functions deploy, gcloud compute networks create, gcloud sql instances create.

  2. Understand defaults: The exam tests what happens when you do not specify a flag. Know that Cloud Run defaults to requiring authentication, VPCs default to auto mode, Cloud SQL defaults to zonal availability, and GKE Autopilot is the recommended mode.

  3. Match service to scenario: Many questions present a scenario and ask which service to use. Use this decision tree:

    • Need a VM with full OS control? Compute Engine
    • Need container orchestration at scale? GKE
    • Need a single stateless container with HTTPS? Cloud Run
    • Need a lightweight event handler? Cloud Functions
    • Need managed relational DB? Cloud SQL (single-region) or Spanner (global)
    • Need NoSQL document store? Firestore
    • Need analytics/data warehouse? BigQuery
  4. HA patterns: Expect questions about making services highly available. The answers always involve: regional MIGs (not zonal), regional GKE clusters (not zonal), Cloud SQL REGIONAL availability type, HA VPN with two tunnels, and custom-mode VPCs with subnets in multiple regions.

  5. Networking fundamentals: At least 3-4 questions will test VPC concepts. Know the difference between auto and custom mode, how firewall rules use tags vs. service accounts, Shared VPC vs. VPC Peering, and that VPC Peering is non-transitive.

