Real Exam Questions
Question 1
An online retailer at example.com runs its production workload on Cloud SQL for PostgreSQL. During a weekend promotion the volume of read queries tripled and users saw slow pages and some timeouts. You want a rapid way to detect this condition as it starts so that engineers can respond before it worsens. What should you do?
-
❏ A. Create a Cloud Function that polls the database every minute and writes high read latency values to logs
-
❏ B. Enable Cloud SQL Insights and rely on its dashboards to spot spikes in read latency
-
❏ C. Configure a Cloud Monitoring alert on read latency with a defined threshold
-
❏ D. Increase the database instance size to handle additional read operations
Question 2
Which Cloud Spanner design aligns with Google best practices to provide global high availability and simplify operations for expansion into North America and Asia Pacific?
-
❏ A. Switch to Cloud Bigtable with multi cluster routing and place clusters in europe-west3 us-east1 and asia-southeast1 targeting 35 percent CPU per cluster
-
❏ B. Create a single Cloud Spanner multi region instance using nam-eur-asia1 and keep average CPU near 50 percent so clients read from nearest replicas
-
❏ C. Use Cloud SQL with cross region read replicas in us-central1 europe-west1 and asia-northeast1
-
❏ D. Run separate Cloud Spanner regional instances in europe-west2 us-east4 and asia-southeast2 and route tenants in the application
Question 3
A streaming analytics startup at mcnz.com runs a regional Cloud Spanner instance that currently has 4 nodes. During evening spikes in demand you see elevated read and write latency. You need to enhance throughput quickly with the least disruption to the application. What should you do?
-
❏ A. Set up a read replica for the Cloud Spanner instance in another region
-
❏ B. Scale the Cloud Spanner instance by adding more nodes
-
❏ C. Increase the number of vCPUs allocated to each Cloud Spanner node
-
❏ D. Migrate the database to Cloud SQL to gain higher performance
Question 4
Which Google Cloud service enables a lift and shift of Microsoft SQL Server workloads, including SSIS, SSRS, and SSAS, with minimal changes?
-
❏ A. Google Kubernetes Engine
-
❏ B. Cloud SQL for SQL Server
-
❏ C. Compute Engine
Question 5
You are the lead database engineer at Lumina Metrics, a retail analytics provider that runs services on scrumtuous.com, and you operate a three region analytics platform on Google Cloud that serves executive dashboards and nightly models. Executives require quarterly disaster recovery drills that must not disrupt live requests and the results must closely reflect real recovery behavior. What is the best way to execute these drills so that production impact stays minimal and test fidelity remains high?
-
❏ A. Create a temporary environment from a recent database snapshot and rehearse a recovery scenario
-
❏ B. Run chaos experiments in the production stack by injecting random process and network failures
-
❏ C. Stand up a parallel environment that matches production and supply it with near real time database replication to conduct DR drills
-
❏ D. Skip dedicated tests and use the presence of backups and configured replication as evidence of DR readiness
Question 6
Which Google Cloud database deployment provides globally strong transactional consistency for real time OLTP and remains available during a regional outage?
-
❏ A. Firestore in Datastore mode multi region
-
❏ B. Cloud Bigtable multi cluster replication
-
❏ C. Cloud Spanner multi region with read write replicas
-
❏ D. AlloyDB for PostgreSQL single region with cross region read replicas
Question 7
An analytics team at a retail startup named NorthPeak Commerce manages a self-hosted PostgreSQL database on a Compute Engine VM. Customers intermittently experience slow responses and the team cannot find a consistent trigger. You need to capture detailed query information to identify which SQL statements are slow while keeping overhead low and without changing the application code. What should you do?
-
❏ A. Migrate the database to Cloud SQL using Database Migration Service and analyze performance with Query Insights
-
❏ B. Add tracing logic to the application so each database call logs its elapsed time
-
❏ C. Enable the log_min_duration_statement parameter in postgresql.conf with a 300 ms threshold and analyze the resulting server logs
-
❏ D. Install Prometheus and a PostgreSQL exporter on the VM to monitor performance metrics for the database
Question 8
Which Firestore mode and storage configuration supports 800 reads per second, 1000 writes per second, and 2 TB of capacity?
-
❏ A. Firestore Datastore mode with 2 TB SSD
-
❏ B. Firestore Native with 2 TB SSD
-
❏ C. Firestore Native with 2 TB HDD
Question 9
You are the cloud database engineer at Riverton Labs and you need to load about 320 million rows from CSV files stored in a Cloud Storage bucket into a production Cloud Bigtable instance. The job must handle roughly 750 GB of data, support parallel processing, and be resilient to worker restarts and partial failures so that the import can complete reliably. Which approach should you choose to perform this import?
-
❏ A. Run a Dataproc Spark job that reads the CSV files and performs row by row writes directly into Cloud Bigtable
-
❏ B. Use the gcloud bigtable tables import command with a csv source flag
-
❏ C. Build a Cloud Dataflow Apache Beam pipeline that reads CSV from Cloud Storage and upserts into Cloud Bigtable with autoscaling and pipeline checkpointing turned on
-
❏ D. Use gsutil cp to copy the CSV objects from Cloud Storage directly into the Cloud Bigtable table
Question 10
Which approach provides secure, scalable, and repeatable provisioning of Cloud SQL instances with comprehensive networking, storage, and encryption configurations across four environments and approximately 20 projects?
-
❏ A. Google Cloud Workflows with gcloud
-
❏ B. Cloud Deployment Manager with parameterized templates
-
❏ C. Config Connector
-
❏ D. Cloud SQL Admin API

Question 11
A media startup named NovaStream has provisioned a new Cloud SQL for PostgreSQL instance and a Compute Engine VM connects to it through the Cloud SQL Auth proxy. During testing you review the proxy logs and repeatedly see “Error 401 Unauthorized” when it attempts to establish the connection. What should you do to resolve this authentication issue?
-
❏ A. Enable the Cloud SQL Admin API for the project
-
❏ B. Reset the database user password and update the application connection parameters
-
❏ C. Grant the Cloud SQL Client IAM role to the service account used by the Cloud SQL Auth proxy or verify it already has it
-
❏ D. Reconfigure the proxy to use a private IP address instead of a public endpoint
Question 12
Which type of database is best suited to continuously ingest data from 40,000 devices each sending one record every 2 seconds while supporting real time queries on the last 10 minutes of data?
-
❏ A. Key value NoSQL Database
-
❏ B. Time series NoSQL Database
-
❏ C. Relational SQL Database
-
❏ D. Document NoSQL Database
Question 13
NovaCart Retail runs a self managed PostgreSQL deployment on Compute Engine for its checkout service and leadership demands high availability with synchronous replication across zones in europe-west1. You need to design an architecture on Google Cloud that meets these requirements and minimizes operational burden. What should you implement?
-
❏ A. Configure two Compute Engine virtual machines running PostgreSQL with synchronous replication across europe-west1-b and europe-west1-c and use a Regional Persistent Disk for storage redundancy
-
❏ B. Run PostgreSQL on Google Kubernetes Engine with a StatefulSet and PersistentVolumeClaims and configure synchronous replication between pods
-
❏ C. Use Google Cloud SQL for PostgreSQL and enable high availability with a cross zone failover replica and automatic failover
-
❏ D. Migrate to Cloud Spanner with the PostgreSQL interface and deploy a multi region configuration
Question 14
For a workload that must support roughly 300 concurrent ad hoc analytical queries, which managed service should you choose and which feature should you enable to speed interactive execution?
-
❏ A. Cloud Spanner with horizontal scaling
-
❏ B. BigQuery and enable BI Engine
-
❏ C. Cloud SQL with read replicas
-
❏ D. Bigtable with wide rows and GC tuning
Question 15
You are a data platform engineer at a retail analytics firm that uses Cloud Bigtable for time series inventory data, and a read query against the inventory_events table is experiencing high latency and inconsistent performance during peak usage. What is the best approach to diagnose and address the underlying bottleneck?
-
❏ A. Monitor the cluster’s CPU and memory metrics in Cloud Monitoring
-
❏ B. Add a secondary index on frequently filtered columns
-
❏ C. Use the Bigtable Key Visualizer to inspect row key access patterns and hotspots
-
❏ D. Migrate the Bigtable cluster to SSD storage
Question 16
Which Cloud SQL configuration disables public internet access and allows connections only from Compute Engine virtual machines within the same VPC network?
-
❏ A. Authorized networks for subnet CIDR
-
❏ B. IAP TCP forwarding to public IP
-
❏ C. Cloud SQL private IP only
-
❏ D. Cloud SQL Auth Proxy with public IP
Question 17
mcnz.com runs a reservation platform on Cloud Spanner and daily active users have tripled over the last 90 days. You expect event day spikes with sustained growth in both reads and writes and you must keep latency predictable while preserving strong consistency. Which actions will help you scale throughput effectively? (Choose 2)
-
❏ A. Consolidate all entities into one wide table to reduce joins
-
❏ B. Increase the instance’s node count to add serving capacity
-
❏ C. Disable synchronous replication to favor writes
-
❏ D. Design transactions to touch fewer rows and commit quickly to minimize lock contention
-
❏ E. Manually shard application tables across multiple databases
Question 18
Which Google Cloud service provides a fully managed scheduled transfer from Amazon S3 to Cloud Storage with minimal operational effort?
-
❏ A. Dataflow with Apache Beam
-
❏ B. Storage Transfer Service scheduled transfer
-
❏ C. BigQuery Data Transfer Service
-
❏ D. Cloud Scheduler with gsutil
Question 19
A retail marketplace operated by example.com needs to analyze very large volumes of clickstream time series events using ad hoc SQL and must sustain high velocity inserts from many microservices without managing servers. Which Google Cloud service should they choose to optimize for performance and seamless scalability?
-
❏ A. Cloud Spanner with secondary indexes
-
❏ B. BigQuery with date partitioned and clustered tables
-
❏ C. Cloud SQL for PostgreSQL with time-based partitioning
-
❏ D. Cloud Bigtable with a time-oriented row key design
Question 20
Which Firestore Native mode location stores data in a single EU region, tolerates a zone outage, and maintains low latency?
-
❏ A. Firestore multi-region EU
-
❏ B. Firestore regional EU with multi-zone replication
-
❏ C. Firestore global multi-region
-
❏ D. Firestore Datastore mode regional EU
Real Google Cloud Certification Questions Answered

Question 1
Riverton Retail runs several Cloud SQL for PostgreSQL databases in Google Cloud to support checkout and inventory systems. Demand is quiet most of the day but spikes for about 2 to 3 hours on weekdays. You want to control spending while still meeting performance needs during busy periods. What should you do?
-
❏ A. Enable automatic storage increase on the Cloud SQL instances
-
❏ B. Use the Cloud SQL idle instance recommender to rightsize machine tiers
-
❏ C. Add Cloud SQL read replicas and place them behind a TCP proxy load balancer to scale connections
-
❏ D. Configure metric based autoscaling on Compute Engine instances that run your application to handle database load
Question 2
Which Cloud Bigtable actions provide regional failover and also enable data restoration if an entire region is lost? (Choose 2)
-
❏ A. Enable Bigtable change streams to BigQuery
-
❏ B. Create a second cluster in another region with replication
-
❏ C. Export tables regularly to a dual region Cloud Storage bucket
-
❏ D. Create a second cluster in the same region
Question 3
An engineering team at mcnz.com is launching a worldwide ticketing service on Cloud Spanner. Compliance requires an RPO under 90 seconds and an RTO under 4 minutes. The design must tolerate planned maintenance and unexpected regional outages and must keep latency low for users across several continents. Which deployment approach should you choose in Cloud Spanner to meet these goals?
-
❏ A. Run two independent regional Cloud Spanner instances and replicate changes with Cloud Dataflow
-
❏ B. Use a regional Cloud Spanner instance and turn on automated backups and point in time recovery
-
❏ C. Provision Cloud Spanner in a multi region configuration that commits through a read write quorum
-
❏ D. Create a zonal Cloud Spanner instance and add read replicas in other regions for disaster recovery
Question 4
Which Google Cloud service should you use to build a resilient and efficient pipeline that transfers about 60 million rows each week from PostgreSQL on Compute Engine to Cloud SQL for PostgreSQL while supporting filtering and enrichment during the transfer?
-
❏ A. Cloud Data Fusion
-
❏ B. Cloud Dataflow with Apache Beam
-
❏ C. Dataproc Serverless Spark
-
❏ D. Database Migration Service
Question 5
Rivertown Analytics operates an on premises facility and needs a private connection to Google Cloud for a data platform. Current throughput is near 20 Gbps and capacity is expected to reach about 80 Gbps within 12 months as ingestion grows. The team wants private connectivity rather than sending traffic over the public internet and they must be able to expand bandwidth as demand increases. Which connectivity approach should they choose?
-
❏ A. Configure Cloud VPN with dynamic routing using Cloud Router
-
❏ B. Provision Dedicated Interconnect for a private physical link to Google Cloud with room to scale
-
❏ C. Use Direct Peering to connect your network to Google at an edge location
-
❏ D. Adopt Partner Interconnect through a supported provider when you do not need your own physical cross connect
Exam Questions Answered
Question 1
An online retailer at example.com runs its production workload on Cloud SQL for PostgreSQL. During a weekend promotion the volume of read queries tripled and users saw slow pages and some timeouts. You want a rapid way to detect this condition as it starts so that engineers can respond before it worsens. What should you do?
-
✓ C. Configure a Cloud Monitoring alert on read latency with a defined threshold
The correct answer is Configure a Cloud Monitoring alert on read latency with a defined threshold.
This approach gives you proactive and fast detection. It evaluates the Cloud SQL read latency metric against a threshold and triggers notifications when the condition starts to breach. Engineers are alerted quickly and can act before slow pages and timeouts become widespread.
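As a rough sketch only, such an alert can be defined in a policy file and created from the alpha command track of gcloud. The metric type, threshold, duration and names below are placeholders that you would replace with the read latency metric you actually monitor, and you would attach the policy to your notification channels.

    # read-latency-policy.yaml  (illustrative values only)
    displayName: Cloud SQL read latency breach
    combiner: OR
    conditions:
    - displayName: Read latency above threshold
      conditionThreshold:
        # substitute the latency metric you monitor for this instance
        filter: metric.type="METRIC_TYPE_FOR_READ_LATENCY" AND resource.type="cloudsql_database"
        comparison: COMPARISON_GT
        thresholdValue: 250
        duration: 120s

    gcloud alpha monitoring policies create --policy-from-file=read-latency-policy.yaml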
Enable Cloud SQL Insights and rely on its dashboards to spot spikes in read latency is not sufficient for rapid detection. Insights helps with query performance analysis and dashboards but it requires someone to be watching and it does not notify on its own.
Create a Cloud Function that polls the database every minute and writes high read latency values to logs is a custom and fragile solution. It adds overhead to the database, introduces detection delay, and logs alone do not alert anyone unless you also build and maintain separate alerting, which duplicates what managed monitoring already provides.
Increase the database instance size to handle additional read operations attempts to mitigate load rather than detect it early. It does not provide proactive notification and it may be unnecessary if other strategies like read replicas or caching are used. The question asks for rapid detection so this does not meet the requirement.
When a question asks for rapid detection prefer a native alert on the most relevant metric with a clear threshold. Dashboards and custom polling help analysis but are not proactive.
Question 2
Which Cloud Spanner design aligns with Google best practices to provide global high availability and simplify operations for expansion into North America and Asia Pacific?
-
✓ B. Create a single Cloud Spanner multi region instance using nam-eur-asia1 and keep average CPU near 50 percent so clients read from nearest replicas
The correct option is Create a single Cloud Spanner multi region instance using nam-eur-asia1 and keep average CPU near 50 percent so clients read from nearest replicas.
A single multi region instance that uses the nam-eur-asia1 configuration places replicas across North America, Europe and Asia. This delivers strong consistency with synchronous replication and automatic failover, which meets the goal of global high availability. It also keeps operations simple because there is one global database and schema with unified access control, and you scale capacity by adjusting nodes rather than managing multiple systems. Clients in each geography can achieve low latency reads from nearby read only replicas when the workload can use bounded staleness reads.
Keeping average CPU near fifty percent provides headroom for traffic spikes and regional events so latency remains stable during leader movement or failover. Monitoring guidance for Spanner advises scaling up before CPU saturates, so operating around fifty percent is a prudent target that aligns with best practices.
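For illustration, provisioning such an instance is a single operation, where the instance name, description and node count shown here are assumptions that you would size for your own workload.

    gcloud spanner instances create global-orders \
        --config=nam-eur-asia1 \
        --description="Global orders database" \
        --nodes=9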
Switch to Cloud Bigtable with multi cluster routing and place clusters in europe-west3 us-east1 and asia-southeast1 targeting 35 percent CPU per cluster is not appropriate because Bigtable is a NoSQL wide column store and does not provide relational schema, SQL, or cross region strong consistency. Multi cluster routing improves availability and locality, but replication between clusters is eventually consistent with last write wins conflict resolution, which does not satisfy global transactional requirements.
Use Cloud SQL with cross region read replicas in us-central1 europe-west1 and asia-northeast1 does not meet the goal because cross region replicas are asynchronous and promotion during a regional outage is manual. You do not get automatic multi region write availability or global strong consistency, and Cloud SQL high availability is regional which adds operational complexity for a global footprint.
Run separate Cloud Spanner regional instances in europe-west2 us-east4 and asia-southeast2 and route tenants in the application increases operational burden and removes global database semantics. There are no cross region transactions across instances, consistent reads across all tenants are not guaranteed, failover is not automatic across instances, and the application must implement complex routing and data management that a single multi region instance provides natively.
When a question asks for global high availability with simple operations, favor a single Cloud Spanner multi region instance and look for a named multi region like nam-eur-asia1 with capacity headroom near fifty percent to absorb spikes and failover.
Question 3
A streaming analytics startup at mcnz.com runs a regional Cloud Spanner instance that currently has 4 nodes. During evening spikes in demand you see elevated read and write latency. You need to enhance throughput quickly with the least disruption to the application. What should you do?
-
✓ B. Scale the Cloud Spanner instance by adding more nodes
The correct option is Scale the Cloud Spanner instance by adding more nodes. This increases throughput for both reads and writes and can be applied online with no downtime, which keeps application disruption low during evening spikes.
Cloud Spanner capacity grows predictably as you add more nodes or processing units and the service automatically rebalances data and load within the region. By increasing capacity you reduce queuing and commit contention, which lowers observed latency when traffic bursts.
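As a minimal sketch, the capacity change is one online command, with the instance name and target node count here as assumptions.

    gcloud spanner instances update analytics-prod --nodes=8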
Set up a read replica for the Cloud Spanner instance in another region is not appropriate because read replicas serve only read traffic and do not process writes, and placing one in another region would not improve local read latency during spikes and can add distance and replication lag.
Increase the number of vCPUs allocated to each Cloud Spanner node is not possible because Spanner does not expose per node CPU sizing. You scale by increasing nodes or processing units instead.
Migrate the database to Cloud SQL to gain higher performance would be slow and disruptive to execute and it sacrifices horizontal scale. It does not meet the requirement to enhance throughput quickly with minimal disruption.
When you see a need for quick capacity gains with minimal disruption in Cloud Spanner, choose scaling nodes or processing units online. Be careful with read replicas since they help only reads and do not reduce write latency.
Question 4
Which Google Cloud service enables a lift and shift of Microsoft SQL Server workloads, including SSIS, SSRS, and SSAS, with minimal changes?
-
✓ C. Compute Engine
The correct option is Compute Engine because it allows you to run full Microsoft SQL Server on Windows virtual machines so you can bring SSIS, SSRS and SSAS with minimal or no changes.
With Compute Engine you lift and shift the entire Windows and SQL Server stack and keep full operating system and instance level control. You can install and manage Integration Services, Reporting Services and Analysis Services just as you would on premises which makes it the most direct path when these components are required.
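As an illustrative sketch, you can create the virtual machine from one of the preinstalled SQL Server images and then install or configure SSIS, SSRS and SSAS on it just as you would on premises. The machine type and image family below are assumptions, so check the windows-sql-cloud image project for the families currently offered.

    gcloud compute instances create sqlserver-lift-shift \
        --zone=us-central1-a \
        --machine-type=n2-standard-8 \
        --image-project=windows-sql-cloud \
        --image-family=sql-ent-2019-win-2019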
Cloud SQL for SQL Server is a managed service that provides database engine functionality but it does not support SSIS, SSRS or SSAS and it restricts server level access. This makes it unsuitable for a true lift and shift that depends on those features.
Google Kubernetes Engine is designed for running containerized applications and not for hosting a full Windows Server and SQL Server stack. It does not provide the native environment needed for SSIS, SSRS and SSAS.
When you see lift and shift of SQL Server with SSIS, SSRS or SSAS choose the service that gives you full VM control rather than a managed database platform.
Question 5
You are the lead database engineer at Lumina Metrics, a retail analytics provider that runs services on scrumtuous.com, and you operate a three region analytics platform on Google Cloud that serves executive dashboards and nightly models. Executives require quarterly disaster recovery drills that must not disrupt live requests and the results must closely reflect real recovery behavior. What is the best way to execute these drills so that production impact stays minimal and test fidelity remains high?
-
✓ C. Stand up a parallel environment that matches production and supply it with near real time database replication to conduct DR drills
The correct option is Stand up a parallel environment that matches production and supply it with near real time database replication to conduct DR drills.
This approach lets you rehearse end to end recovery without touching the live stack so production traffic remains unaffected. Near real time replication keeps the dataset closely aligned with current writes which makes RPO and RTO measurements realistic and gives you high fidelity results. You can validate runbooks, networking and routing, IAM, automation, and regional failover behavior against a production-like system and you can safely practice cutover and rollback.
Create a temporary environment from a recent database snapshot and rehearse a recovery scenario is weaker because a snapshot is a static point in time copy which makes data stale and reduces test fidelity. It does not validate continuous replication, lag behavior, or real failover and cutover steps that you would perform during an actual event.
Run chaos experiments in the production stack by injecting random process and network failures can degrade live requests and violates the requirement to avoid disruption. Chaos testing is useful for resilience but it does not provide a realistic measure of disaster recovery objectives or full data restoration flows.
Skip dedicated tests and use the presence of backups and configured replication as evidence of DR readiness provides false confidence and misses the requirement for quarterly drills. Backups and replication must be proven by executing recovery procedures to confirm that objectives and runbooks work as intended.
When a question demands non disruptive and high fidelity DR tests, prefer an isolated production-like environment with near real time replication so you can practice full failover without touching live traffic.
Question 6
Which Google Cloud database deployment provides globally strong transactional consistency for real time OLTP and remains available during a regional outage?
-
✓ C. Cloud Spanner multi region with read write replicas
The correct option is Cloud Spanner multi region with read write replicas.
Cloud Spanner multi region with read write replicas provides externally consistent global transactions and is built for real time OLTP. Multi region configurations place read write replicas in at least two regions with synchronous replication and a witness region, so the database keeps serving both reads and writes even if a region goes offline. Spanner therefore delivers globally strong transactions and continues operating during a regional outage.
Firestore in Datastore mode multi region offers high availability, yet it does not provide globally strong transactional guarantees for relational style OLTP. In this service, transactions and ancestor queries can be strongly consistent but many queries are eventually consistent, and it lacks multi row ACID guarantees across collections. This makes it unsuitable when the requirement is globally strong transactions for real time OLTP.
Cloud Bigtable multi cluster replication is designed for low latency wide column workloads and does not offer ACID transactions. Replication across clusters is eventual, which means it does not deliver globally strong transactional behavior.
AlloyDB for PostgreSQL single region with cross region read replicas keeps writes in one primary region. Cross region replicas are read only and promotion is a disaster recovery action rather than continuous multi region writes, so the deployment cannot continue write operations automatically during a regional outage and it does not provide globally strong multi region transactions.
Match keywords to capabilities. If you see globally strong transactions, real time OLTP, and continued writes during a regional outage, look for multi region read write architectures that offer external consistency.
Question 7
An analytics team at a retail startup named NorthPeak Commerce manages a self-hosted PostgreSQL database on a Compute Engine VM. Customers intermittently experience slow responses and the team cannot find a consistent trigger. You need to capture detailed query information to identify which SQL statements are slow while keeping overhead low and without changing the application code. What should you do?
-
✓ C. Enable the log_min_duration_statement parameter in postgresql.conf with a 300 ms threshold and analyze the resulting server logs
The correct option is Enable the log_min_duration_statement parameter in postgresql.conf with a 300 ms threshold and analyze the resulting server logs.
The log_min_duration_statement setting instructs PostgreSQL to write any statement that takes longer than the threshold to the server log along with its execution time. This directly reveals which queries are slow without requiring any change to the application. It has relatively low overhead because it logs only statements that exceed the threshold rather than every query. It can be enabled with a configuration change and a reload of the database process and the resulting logs can be reviewed locally or exported to a central logging solution for analysis.
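A minimal sketch of enabling this without touching the application is shown below, either by editing postgresql.conf directly or from a superuser session, with the 300 ms value taken from the scenario.

    -- run as a superuser, then reload the configuration
    ALTER SYSTEM SET log_min_duration_statement = '300ms';
    SELECT pg_reload_conf();

Statements that run longer than the threshold then appear in the server log with their duration and full SQL text.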
Migrate the database to Cloud SQL using Database Migration Service and analyze performance with Query Insights is not appropriate for immediate troubleshooting. This introduces a platform migration and operational risk when you only need to capture slow queries on an existing VM. Query Insights is a Cloud SQL feature and it does not apply to a self hosted PostgreSQL instance on Compute Engine.
Add tracing logic to the application so each database call logs its elapsed time requires code changes and redeployments. The requirement is to avoid changing the application. Application level timing may also miss the exact SQL text or transformations applied by an ORM while server logs capture the executed statement and duration.
Install Prometheus and a PostgreSQL exporter on the VM to monitor performance metrics for the database provides useful metrics such as CPU and query rates but it does not capture the full SQL text or per statement durations needed to pinpoint which queries are slow. It adds components to manage without delivering the detailed query logging that the scenario requires.
When the question emphasizes no code changes and low overhead on a self managed PostgreSQL instance, look first to database server parameters like log_min_duration_statement or built in extensions rather than migrating platforms or modifying the application.
Question 8
Which Firestore mode and storage configuration supports 800 reads per second, 1000 writes per second, and 2 TB of capacity?
-
✓ B. Firestore Native with 2 TB SSD
The correct option is Firestore Native with 2 TB SSD.
This choice is designed for high scalability and low latency which makes the stated 800 reads per second and 1000 writes per second straightforward to achieve. It also supports multi‑terabyte datasets while maintaining predictable performance and its documented quotas comfortably exceed the workload described.
Firestore Datastore mode with 2 TB SSD is not the preferred option for new applications and it lacks some of the capabilities that make Native mode the recommended choice for high throughput and real‑time workloads. While capacity might be sufficient, the mode selection is misaligned with current best practices.
Firestore Native with 2 TB HDD is not appropriate because hard disk drives provide lower IOPS and higher latency compared to solid state drives. That latency profile makes it harder to consistently meet the required read and write rates.
When a question mixes throughput and capacity, first check product quotas and default to the recommended mode for new designs. For Firestore, that is typically Native mode, and for performance sensitive workloads favor options that provide consistently low latency.
Question 9
You are the cloud database engineer at Riverton Labs and you need to load about 320 million rows from CSV files stored in a Cloud Storage bucket into a production Cloud Bigtable instance. The job must handle roughly 750 GB of data, support parallel processing, and be resilient to worker restarts and partial failures so that the import can complete reliably. Which approach should you choose to perform this import?
-
✓ C. Build a Cloud Dataflow Apache Beam pipeline that reads CSV from Cloud Storage and upserts into Cloud Bigtable with autoscaling and pipeline checkpointing turned on
The correct approach is Build a Cloud Dataflow Apache Beam pipeline that reads CSV from Cloud Storage and upserts into Cloud Bigtable with autoscaling and pipeline checkpointing turned on.
This approach is designed for large scale ingestion and provides massive parallelism so it can comfortably handle hundreds of millions of rows. Autoscaling adjusts worker count to meet throughput needs and manage 750 GB efficiently. Checkpointing and resilient execution let workers restart without losing progress so partial failures do not require restarting the entire job. Upserts through the Bigtable connector are idempotent and support batching which improves throughput and reliability for production imports.
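A condensed sketch of such a pipeline appears below. It assumes a simple CSV layout where the first field is the row key, and the project, bucket, instance, table and column family names are placeholders rather than values from the scenario.

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.io.gcp.bigtableio import WriteToBigTable
    from google.cloud.bigtable import row as bigtable_row

    def to_mutation(line):
        # Assumed layout: first field is the row key, the rest is the payload.
        fields = line.split(",")
        direct_row = bigtable_row.DirectRow(row_key=fields[0].encode("utf-8"))
        direct_row.set_cell("cf1", b"payload", ",".join(fields[1:]).encode("utf-8"))
        return direct_row

    options = PipelineOptions(
        runner="DataflowRunner",
        project="riverton-data",                   # placeholder project id
        region="us-central1",
        temp_location="gs://riverton-import/tmp",  # placeholder bucket
        autoscaling_algorithm="THROUGHPUT_BASED",  # let Dataflow scale workers
    )

    with beam.Pipeline(options=options) as pipeline:
        (pipeline
         | "ReadCSV" >> beam.io.ReadFromText("gs://riverton-import/csv/*.csv")
         | "BuildMutations" >> beam.Map(to_mutation)
         | "WriteToBigtable" >> WriteToBigTable(
               project_id="riverton-data",
               instance_id="inventory-bt",
               table_id="inventory_events"))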
Run a Dataproc Spark job that reads the CSV files and performs row by row writes directly into Cloud Bigtable is not ideal because writing row by row is inefficient for Bigtable and sustaining high throughput requires careful batching and retry logic. You would also need to implement your own fault tolerance and idempotency to safely recover from worker restarts which makes it less reliable for a production scale import.
Use the gcloud bigtable tables import command with a csv source flag is incorrect because there is no CSV source option for this command. The import command launches a Dataflow based template that expects data in the supported export format rather than raw CSV files.
Use gsutil cp to copy the CSV objects from Cloud Storage directly into the Cloud Bigtable table is not possible because gsutil copies objects between file based storage locations and does not write into database tables. It cannot import data into Bigtable.
When a scenario requires large scale ingestion with autoscaling and restart resilience, favor Dataflow or Beam pipelines with the appropriate connectors. Verify whether a command actually supports the file format mentioned by the option and be wary of options that treat databases like file systems.
Question 10
Which approach provides secure, scalable, and repeatable provisioning of Cloud SQL instances with comprehensive networking, storage, and encryption configurations across four environments and approximately 20 projects?
-
✓ B. Cloud Deployment Manager with parameterized templates
The correct option is Cloud Deployment Manager with parameterized templates.
Deployment Manager lets you define Cloud SQL instances and all required settings in declarative templates and then pass parameters for each environment and project. You can standardize private IP or authorized networks, storage size and autoscaling, automated backups and maintenance windows, high availability, and customer managed encryption keys so every deployment follows the same security and compliance baseline. The same template can be promoted through four environments and reused across about 20 projects which delivers repeatability and scale with strong governance through version control and reviews. Although Deployment Manager is in maintenance mode and many teams now prefer Terraform, among the given options it remains the best fit for secure, scalable, and repeatable provisioning.
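As a rough sketch, one parameterized template instantiated per environment and project might resemble the following, where the project, network and key names are placeholders and the properties mirror the Cloud SQL Admin API fields.

    # cloudsql-instance.yaml  (illustrative values only)
    resources:
    - name: checkout-db
      type: sqladmin.v1beta4.instance
      properties:
        region: europe-west1
        databaseVersion: POSTGRES_14
        settings:
          tier: db-custom-2-7680
          availabilityType: REGIONAL
          dataDiskType: PD_SSD
          dataDiskSizeGb: 100
          ipConfiguration:
            ipv4Enabled: false
            privateNetwork: projects/example-project/global/networks/shared-vpc
          backupConfiguration:
            enabled: true
        diskEncryptionConfiguration:
          kmsKeyName: projects/example-project/locations/europe-west1/keyRings/sql/cryptoKeys/checkout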
Google Cloud Workflows with gcloud orchestrates API calls and command sequences rather than defining infrastructure state. Using imperative gcloud steps to build Cloud SQL at scale is brittle and hard to standardize across many projects and environments and it lacks native templating and drift management that you get from declarative infrastructure as code.
Config Connector can provision Cloud SQL through Kubernetes custom resources but it requires running a GKE cluster or Config Controller and a supporting GitOps pipeline. That adds operational overhead for this use case and the option does not describe that architecture, so it is not the most straightforward answer for broad multi project rollout compared to Deployment Manager.
Cloud SQL Admin API exposes low level methods to create and manage instances but it is not an infrastructure as code solution by itself. You would need to build your own templating, promotion workflows, and governance to achieve repeatable and secure provisioning across environments and many projects.
When a question stresses many projects and multiple environments with consistent security and settings, prefer a declarative infrastructure as code approach that uses parameterized templates for truly repeatable deployments.
Question 11
A media startup named NovaStream has provisioned a new Cloud SQL for PostgreSQL instance and a Compute Engine VM connects to it through the Cloud SQL Auth proxy. During testing you review the proxy logs and repeatedly see “Error 401 Unauthorized” when it attempts to establish the connection. What should you do to resolve this authentication issue?
-
✓ C. Grant the Cloud SQL Client IAM role to the service account used by the Cloud SQL Auth proxy or verify it already has it
The correct option is Grant the Cloud SQL Client IAM role to the service account used by the Cloud SQL Auth proxy or verify it already has it.
A 401 Unauthorized from the Cloud SQL Auth proxy indicates that the identity used by the proxy does not have the required permission to connect to the instance. The service account must have the Cloud SQL Client role so it can obtain an ephemeral certificate and call the Cloud SQL Admin API to authorize the connection. Assign the role to the service account that the proxy uses and then retry the connection.
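A minimal sketch of the grant is shown below, with the project and service account names as placeholders.

    gcloud projects add-iam-policy-binding novastream-prod \
        --member="serviceAccount:sql-proxy@novastream-prod.iam.gserviceaccount.com" \
        --role="roles/cloudsql.client"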
Enable the Cloud SQL Admin API for the project is necessary for Cloud SQL operations but a 401 Unauthorized points to missing or invalid credentials or permissions. If the API were disabled you would see an error about the API not being enabled rather than an authentication failure.
Reset the database user password and update the application connection parameters does not address a proxy 401 error because the proxy fails before database authentication occurs. Database user credentials are evaluated only after the proxy establishes an authorized connection to the instance.
Reconfigure the proxy to use a private IP address instead of a public endpoint does not resolve an authorization error. Whether using public or private IP the proxy still requires the correct IAM permission and a missing role will continue to produce 401 errors.
When you see proxy errors first decide whether the failure is at the IAM layer or the database layer. A 401 usually means the service account lacks the Cloud SQL Client role while authentication errors about users or passwords indicate database credential issues.
Question 12
Which type of database is best suited to continuously ingest data from 40,000 devices each sending one record every 2 seconds while supporting real time queries on the last 10 minutes of data?
-
✓ B. Time series NoSQL Database
The correct option is Time series NoSQL Database because it is built to handle high volume time stamped writes from many devices and to execute efficient time window queries such as the last 10 minutes.
A Time series NoSQL Database organizes data by entity and time which makes appends sequential and range scans over recent intervals fast. It scales horizontally to absorb about 20,000 writes per second from 40,000 devices that each send one record every two seconds. Features such as time based partitioning, compaction tuned for append heavy workloads and retention controls keep real time queries responsive while controlling storage growth.
Key value NoSQL Database is optimized for single key lookups and simple puts which makes querying a recent time window across many devices inefficient without custom indexing or scatter gather patterns.
Relational SQL Database can model timestamps but sustaining very high ingest while maintaining the indexes required for fast time range queries is difficult and costly. Sharding and tuning add complexity and it may still miss strict real time latency at this scale.
Document NoSQL Database favors hierarchical documents and flexible schemas rather than sustained time ordered writes at this rate. Time window queries typically rely on secondary indexes that slow writes and can become a bottleneck.
Match the data access pattern to the store. High write rates with queries over a recent time window point to a time series database. Look for clues like continuous ingestion and analytics on the last few minutes.
Question 13
NovaCart Retail runs a self managed PostgreSQL deployment on Compute Engine for its checkout service and leadership demands high availability with synchronous replication across zones in europe-west1. You need to design an architecture on Google Cloud that meets these requirements and minimizes operational burden. What should you implement?
-
✓ C. Use Google Cloud SQL for PostgreSQL and enable high availability with a cross zone failover replica and automatic failover
The correct option is Use Google Cloud SQL for PostgreSQL and enable high availability with a cross zone failover replica and automatic failover.
Cloud SQL for PostgreSQL provides a managed high availability configuration that keeps a standby in a different zone within the same region and performs synchronous replication. This meets the requirement for cross zone durability in europe-west1 and it delivers automatic failover that reduces recovery time. Because Cloud SQL is fully managed, it minimizes operational burden for patching, backups, monitoring, and failover orchestration, which aligns with the directive to reduce operational effort.
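For illustration, high availability is requested with a single flag at creation time, and the instance name and machine tier here are assumptions.

    gcloud sql instances create checkout-pg \
        --database-version=POSTGRES_14 \
        --region=europe-west1 \
        --availability-type=REGIONAL \
        --tier=db-custom-4-16384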
Configure two Compute Engine virtual machines running PostgreSQL with synchronous replication across europe-west1-b and europe-west1-c and use a Regional Persistent Disk for storage redundancy is not ideal because it leaves you responsible for configuring and operating PostgreSQL replication, fencing, failover orchestration, and maintenance. A regional persistent disk protects storage across zones for a single instance but it does not simplify or replace database level replication and it does not safely provide shared block storage for two active database servers.
Run PostgreSQL on Google Kubernetes Engine with a StatefulSet and PersistentVolumeClaims and configure synchronous replication between pods increases operational complexity since you must operate the database stack yourself and add components such as Patroni or similar for reliable leader election and failover. This approach can be made to work but it does not minimize operational burden compared to a managed database service.
Migrate to Cloud Spanner with the PostgreSQL interface and deploy a multi region configuration changes the product and its compatibility guarantees and it introduces a different architecture that is not a drop in replacement for PostgreSQL. A multi region deployment goes beyond the requirement of cross zone availability within a single region and it would add cost and migration effort without directly addressing the stated need.
When a question asks for high availability and to minimize operational burden, prefer a managed service. If the workload is PostgreSQL and the scope is a single region with cross zone replication, consider Cloud SQL HA first.
Question 14
For a workload that must support roughly 300 concurrent ad hoc analytical queries, which managed service should you choose and which feature should you enable to speed interactive execution?
-
✓ B. BigQuery and enable BI Engine
The correct option is BigQuery and enable BI Engine. This managed analytics warehouse supports around 300 concurrent ad hoc SQL queries and enabling BI Engine adds in memory acceleration that speeds interactive execution.
With BigQuery you get serverless scaling and high concurrency for analytical workloads so you do not need to manage instances. BI Engine caches frequently accessed data in memory and optimizes execution to reduce scan time and latency which makes interactive analysis and dashboarding noticeably faster.
Cloud Spanner with horizontal scaling targets globally consistent transactional workloads and although it scales horizontally it is not designed for ad hoc analytics at this concurrency level and it lacks the in memory SQL acceleration that BI Engine provides.
Cloud SQL with read replicas is an OLTP service where replicas help scale reads for transactional use but it does not offer distributed columnar analytics or the level of parallelism needed for hundreds of concurrent analytical queries and replicas do not speed complex scans or aggregations.
Bigtable with wide rows and GC tuning is a low latency NoSQL key value store that is excellent for single digit millisecond lookups and time series patterns but it does not support ad hoc SQL analytics or interactive query acceleration.
Map the workload to the right engine. If you see ad hoc analytics with high concurrency think BigQuery and consider enabling BI Engine to speed interactive queries.
Question 15
You are a data platform engineer at a retail analytics firm that uses Cloud Bigtable for time series inventory data, and a read query against the inventory_events table is experiencing high latency and inconsistent performance during peak usage. What is the best approach to diagnose and address the underlying bottleneck?
-
✓ C. Use the Bigtable Key Visualizer to inspect row key access patterns and hotspots
The correct option is Use the Bigtable Key Visualizer to inspect row key access patterns and hotspots.
Key Visualizer is built to diagnose uneven access and storage patterns that cause latency spikes in Cloud Bigtable. It provides a heatmap of key ranges over time so you can see read pressure, write pressure, and hotspots that often arise with time series schemas where keys grow monotonically. With Key Visualizer you can confirm whether a small set of tablets or key ranges are overloaded during peak usage and then address the issue by redesigning row keys through techniques like hashing or bucketing timestamps, by increasing parallelism, or by adjusting splits.
Monitor the cluster’s CPU and memory metrics in Cloud Monitoring is helpful for general health but it does not reveal skewed row key distribution or pinpoint hot key ranges. You might see elevated utilization yet still not understand which keys are causing contention, so this does not directly address the root cause of inconsistent read latency.
Add a secondary index on frequently filtered columns is not feasible with Cloud Bigtable because it does not support secondary indexes. You would need to design additional tables or adjust the row key to support your access patterns.
Migrate the Bigtable cluster to SSD storage is not the right first step because storage media changes do not resolve hotspotting or poor key distribution. Many production deployments already use SSD and even if not, the primary issue for spiky latency in read queries is often uneven access across key ranges, which Key Visualizer helps you detect and fix.
When a Bigtable question mentions hotspots or inconsistent latency, think about row key distribution and reach for the Key Visualizer to confirm before changing capacity or storage.

Question 16
Which Cloud SQL configuration disables public internet access and allows connections only from Compute Engine virtual machines within the same VPC network?
-
✓ C. Cloud SQL private IP only
The correct option is Cloud SQL private IP only. This configuration removes the public endpoint and places the instance on your VPC so only resources in that network or peered networks can connect.
When you use a private address the instance receives an internal RFC 1918 IP in your VPC through VPC peering and no public IP is assigned. Connectivity is limited to Compute Engine VMs that can reach that VPC and you control access with firewall rules. This prevents any connection from traversing the public internet and satisfies the requirement to allow only same VPC access.
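As a hedged sketch, a private only instance can be requested at creation time by attaching it to the VPC and withholding a public address. The project, network, instance name and tier are placeholders, private services access must already be configured for the network, and depending on your gcloud version these flags may require the beta track.

    gcloud sql instances create internal-pg \
        --database-version=POSTGRES_14 \
        --region=us-central1 \
        --tier=db-custom-2-7680 \
        --network=projects/example-project/global/networks/app-vpc \
        --no-assign-ip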
Authorized networks for subnet CIDR requires a public IP and uses a list of allowed source public addresses. It does not remove internet exposure and a VPC subnet range is not a valid public source on the internet, so it does not meet the requirement.
IAP TCP forwarding to public IP is intended for VM and GKE endpoints and not for Cloud SQL instances. It also still relies on a public IP path, so it does not prevent public internet access.
Cloud SQL Auth Proxy with public IP improves authentication and encryption but the connection still uses the public endpoint. It does not restrict access to only Compute Engine VMs in the same VPC.
When a question requires blocking the public internet, look for the option that removes the public IP entirely. Words like private IP, VPC peering, and internal usually indicate the right direction, while features that whitelist sources or use proxies still imply a public endpoint.
Question 17
mcnz.com runs a reservation platform on Cloud Spanner and daily active users have tripled over the last 90 days. You expect event day spikes with sustained growth in both reads and writes and you must keep latency predictable while preserving strong consistency. Which actions will help you scale throughput effectively? (Choose 2)
-
✓ B. Increase the instance’s node count to add serving capacity
-
✓ D. Design transactions to touch fewer rows and commit quickly to minimize lock contention
The correct options are Increase the instance’s node count to add serving capacity and Design transactions to touch fewer rows and commit quickly to minimize lock contention.
Increase the instance’s node count to add serving capacity is the most direct way to scale Cloud Spanner throughput because each node adds compute and IOPS for serving reads and writes. As you add nodes Spanner automatically splits and rebalances data which helps keep latency predictable during spikes while preserving strong consistency across replicas.
Design transactions to touch fewer rows and commit quickly to minimize lock contention improves concurrency under heavy write and mixed workloads. Shorter transactions that update fewer rows hold locks for less time which reduces conflicts and queuing. This keeps tail latency lower as demand grows and it aligns with Spanner’s transaction model while maintaining strong consistency.
Consolidate all entities into one wide table to reduce joins is not a scaling strategy in Spanner and can worsen performance. Very wide rows increase I/O and locking scope and can create hotspots. Spanner performance depends more on good primary key and schema design than on eliminating joins.
Disable synchronous replication to favor writes is not possible in Cloud Spanner and it would violate the requirement for strong consistency. Spanner uses synchronous replication to provide strongly consistent reads and writes which cannot be turned off.
Manually shard application tables across multiple databases adds complexity and can break transactional guarantees because Spanner does not support cross database transactions. Spanner already shards data automatically within an instance so scaling the instance and optimizing transactions is the correct approach.
When a question stresses predictable latency and strong consistency with rising reads and writes, think of scaling Spanner compute with more nodes or processing units and keeping transactions small to reduce lock contention. Be cautious of answers that change replication mode or suggest manual sharding.
Question 18
Which Google Cloud service provides a fully managed scheduled transfer from Amazon S3 to Cloud Storage with minimal operational effort?
-
✓ B. Storage Transfer Service scheduled transfer
The correct option is Storage Transfer Service scheduled transfer.
It is a fully managed service that natively supports transferring data from Amazon S3 to Cloud Storage and it lets you configure recurring schedules with minimal operations. It takes care of scale, retries, checksums, bandwidth control, and incremental synchronization so you do not need to build or maintain pipelines or scripts.
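A minimal sketch with the gcloud transfer commands is shown below, assuming AWS credentials in a local JSON file and a daily repeat. The bucket names are placeholders and the exact scheduling flags should be checked against the current gcloud reference.

    gcloud transfer jobs create s3://source-bucket gs://destination-bucket \
        --source-creds-file=aws-creds.json \
        --schedule-repeats-every=1d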
Dataflow with Apache Beam can move data but it requires you to design, deploy, and operate a pipeline. This is not a purpose built managed scheduled transfer for S3 to Cloud Storage and it increases operational overhead.
BigQuery Data Transfer Service is designed to load data into BigQuery datasets rather than into Cloud Storage, so it does not meet the requirement of transferring from S3 to Cloud Storage.
Cloud Scheduler with gsutil relies on custom scripting and a scheduler you must operate. It lacks the reliability features and turnkey management that a dedicated transfer service provides.
When you see keywords like minimal operations and scheduled transfers between storage systems, prefer the dedicated managed transfer service. If the target is Cloud Storage then Storage Transfer Service is usually the best fit while BigQuery Data Transfer Service targets BigQuery only.
Question 19
A retail marketplace operated by example.com needs to analyze very large volumes of clickstream time series events using ad hoc SQL and must sustain high velocity inserts from many microservices without managing servers. Which Google Cloud service should they choose to optimize for performance and seamless scalability?
-
✓ B. BigQuery with date partitioned and clustered tables
The correct option is BigQuery with date partitioned and clustered tables because it is a serverless analytics warehouse that delivers high performance for ad hoc SQL at very large scale and it can sustain rapid streaming inserts from many producers while automatically scaling without server management.
This choice is built for analytics at petabyte scale and it optimizes query performance by scanning only relevant partitions and by clustering to reduce data scanned within partitions. It supports high velocity ingestion from many microservices through native streaming and batch write paths and it maintains strong concurrency and elasticity so you get both performance and seamless scalability.
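For example, a clickstream table that is partitioned by event date and clustered on common filter columns could be declared as follows, where the dataset, table and column names are assumptions.

    CREATE TABLE analytics.clickstream_events (
      event_ts  TIMESTAMP NOT NULL,
      user_id   STRING,
      page_path STRING,
      payload   JSON
    )
    PARTITION BY DATE(event_ts)
    CLUSTER BY user_id, page_path;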
Cloud Spanner with secondary indexes targets globally consistent transactional workloads rather than large scale analytical queries. You must size capacity and it is not a serverless analytics engine so even with indexes it is not optimized for broad ad hoc scans over massive time series data.
Cloud SQL for PostgreSQL with time-based partitioning is a managed relational service that scales primarily by increasing instance size and it is not serverless. It typically struggles to sustain very high insert rates from many microservices at extreme scale and ad hoc analytics over very large datasets is less efficient than a columnar analytics service.
Cloud Bigtable with a time-oriented row key design excels at low latency operational access for time series data but it does not provide native SQL for ad hoc analytics. You would need an additional analytics layer which adds complexity and does not align with the requirement to avoid managing servers.
When a question pairs ad hoc SQL with serverless analytics over very large datasets and streaming ingestion, favor BigQuery. Watch for clues like partitioning and clustering which point to columnar analytics rather than transactional or key value stores.
Question 20
Which Firestore Native mode location stores data in a single EU region, tolerates a zone outage, and maintains low latency?
-
✓ B. Firestore regional EU with multi-zone replication
The correct option is Firestore regional EU with multi-zone replication.
A regional Firestore location keeps data in a single EU region, which matches the requirement, and it replicates data across multiple zones inside that region. This design tolerates the loss of a zone and continues serving traffic while keeping latency low because reads and writes stay within one regional boundary.
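For illustration, a Native mode database pinned to a single EU region can be created with gcloud, where the region shown is an assumption.

    gcloud firestore databases create \
        --location=europe-west1 \
        --type=firestore-native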
Firestore multi-region EU is not correct because multi-region locations store data in multiple EU regions. This improves resilience to a regional outage but typically adds inter region latency and it does not keep data confined to one EU region.
Firestore global multi-region is not correct because there is no global multi-region location for Firestore Native mode. Firestore locations are regional or multi-region with defined geographic groupings.
Firestore Datastore mode regional EU is not correct because the question asks specifically about Firestore Native mode. Datastore mode uses a different API and feature set even though it can also be regional.
When you see a requirement to keep data in one region with low latency yet survive a zonal failure, choose a regional location with multi-zone replication. If the question emphasizes surviving a regional outage or highest durability across regions, think multi-region instead.
Question 1
Riverton Retail runs several Cloud SQL for PostgreSQL databases in Google Cloud to support checkout and inventory systems. Demand is quiet most of the day but spikes for about 2 to 3 hours on weekdays. You want to control spending while still meeting performance needs during busy periods. What should you do?
-
✓ B. Use the Cloud SQL idle instance recommender to rightsize machine tiers
The correct option is Use the Cloud SQL idle instance recommender to rightsize machine tiers because it guides you to adjust CPU and memory to match real demand which helps you save money during quiet hours while maintaining the capacity you need during short weekday spikes.
The Cloud SQL Recommender analyzes utilization and identifies idle or overprovisioned instances. It provides rightsizing suggestions so you can select smaller machine tiers for most of the day and ensure sufficient capacity for peak windows. Cloud SQL does not autoscale compute automatically, so acting on these insights is the most effective way to balance cost and performance for PostgreSQL instances with predictable short spikes.
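If you prefer to pull these suggestions programmatically instead of reading them in the console, a sketch along these lines with the Recommender client library could work. The project, location, and recommender ID shown are assumptions, so confirm the exact Cloud SQL recommender IDs in the current Recommender documentation.

```python
from google.cloud import recommender_v1

client = recommender_v1.RecommenderClient()

# Placeholder project and region. The recommender ID below is assumed to be the
# Cloud SQL idle instance recommender; verify the ID in the documentation.
parent = (
    "projects/my-project/locations/us-central1/"
    "recommenders/google.cloudsql.instance.IdleRecommender"
)

for recommendation in client.list_recommendations(parent=parent):
    # Each recommendation names an instance and describes the suggested action.
    print(recommendation.name)
    print(recommendation.description)
```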
Enable automatic storage increase on the Cloud SQL instances does not address the problem because storage auto increase prevents running out of disk space but it does not improve CPU or memory performance during peaks. It can also lead to higher storage costs without helping the bursty load.
Add Cloud SQL read replicas and place them behind a TCP proxy load balancer to scale connections is not appropriate because Cloud SQL does not provide load balancing across replicas and you must direct read traffic at the application or proxy layer. Read replicas are read only, they add cost and complexity, and they are not a simple way to handle short lived spikes in mixed read and write workloads.
Configure metric based autoscaling on Compute Engine instances that run your application to handle database load does not increase database capacity. It only scales the application tier and can increase connection pressure on the database while leaving the Cloud SQL compute tier unchanged.
When a question focuses on Cloud SQL cost control with predictable peaks, look for options that use native Recommender insights and rightsizing rather than adding replicas or load balancers that increase complexity and spend.
Question 2
Which Cloud Bigtable actions provide regional failover and also enable data restoration if an entire region is lost? (Choose 2)
-
✓ B. Create a second cluster in another region with replication
-
✓ C. Export tables regularly to a dual region Cloud Storage bucket
The correct options are Create a second cluster in another region with replication and Export tables regularly to a dual region Cloud Storage bucket.
Creating a second Bigtable cluster in a different region with replication provides regional failover. In a regional outage the service can route traffic to the healthy cluster and continue serving reads and writes. Replication keeps data in sync across regions which satisfies the requirement for continuity if one region becomes unavailable.
Exporting tables to a dual region Cloud Storage bucket gives you a durable copy of the data outside the impacted region. If an entire region is lost you can import the exported data into a new Bigtable cluster in another region and restore service. Dual region storage keeps redundant copies in separate locations which preserves the backup even during a regional loss.
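As a rough sketch, adding a replicated cluster in another region with the Bigtable Python admin client might look like the following. The project, instance, cluster ID, and zone are hypothetical.

```python
from google.cloud import bigtable
from google.cloud.bigtable import enums

# Placeholder project, instance, cluster ID, and zone.
client = bigtable.Client(project="my-project", admin=True)
instance = client.instance("events-instance")

# Add a second cluster in a different region. Bigtable keeps the clusters in
# sync through replication, so the instance can keep serving if one region fails.
new_cluster = instance.cluster(
    "events-cluster-useast1",
    location_id="us-east1-b",
    serve_nodes=3,
    default_storage_type=enums.StorageType.SSD,
)
new_cluster.create()
```

Once the second cluster exists, replication between clusters is automatic, and an app profile with multi cluster routing lets traffic fail over without client changes.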
Enable Bigtable change streams to BigQuery is intended for downstream analytics and change capture. It does not provide automatic failover and it is not a backup that supports full restoration after a regional loss.
Create a second cluster in the same region can improve availability within the region but it does not protect against the loss of the whole region, so it does not meet the requirement for regional failover or off region restoration.
Map each requirement to the capability that fulfills it. For failover look for cross region replication. For data restoration look for durable exports or backups stored in multi region or dual region locations.
Question 3
An engineering team at mcnz.com is launching a worldwide ticketing service on Cloud Spanner. Compliance requires an RPO under 90 seconds and an RTO under 4 minutes. The design must tolerate planned maintenance and unexpected regional outages and must keep latency low for users across several continents. Which deployment approach should you choose in Cloud Spanner to meet these goals?
-
✓ C. Provision Cloud Spanner in a multi region configuration that commits through a read write quorum
The correct choice is Provision Cloud Spanner in a multi region configuration that commits through a read write quorum.
This approach uses synchronous replication with quorum commits so a transaction is acknowledged only after a majority of read write replicas persist it. That prevents data loss for committed transactions and gives an effective RPO of zero, which is well within the 90 second requirement. Automatic leader re-election and failover complete quickly during planned maintenance or an unexpected regional outage, which fits the 4 minute RTO target. Multi region configurations also place read only replicas in additional regions so read traffic can stay close to users, which helps keep global latency low while writes remain strongly consistent.
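A minimal sketch of provisioning such an instance with the Spanner Python client follows, assuming the nam-eur-asia1 multi region configuration and placeholder instance details.

```python
from google.cloud import spanner

client = spanner.Client(project="my-project")

# nam-eur-asia1 is a multi region configuration that spans North America,
# Europe, and Asia. The instance ID and node count are placeholders.
config_name = f"projects/{client.project}/instanceConfigs/nam-eur-asia1"

instance = client.instance(
    "ticketing-instance",
    configuration_name=config_name,
    display_name="Global ticketing",
    node_count=3,
)

operation = instance.create()
operation.result(timeout=300)  # Block until the long running create completes.
```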
Run two independent regional Cloud Spanner instances and replicate changes with Cloud Dataflow is incorrect because it relies on asynchronous application level replication. Pipeline lag and lack of synchronous quorum commits can push RPO beyond 90 seconds and there is no transparent failover or cross instance transactional consistency.
Use a regional Cloud Spanner instance and turn on automated backups and point in time recovery is incorrect because backups and point in time recovery address data recovery from corruption or mistakes rather than continuous high availability across regions. Restores typically take longer than a few minutes and a single region cannot tolerate a regional outage or provide consistently low latency for globally distributed users.
Create a zonal Cloud Spanner instance and add read replicas in other regions for disaster recovery is incorrect because Cloud Spanner does not offer zonal instances and read replicas cannot accept writes during a failure. This means there is no write failover so the strict RPO and RTO targets would not be met.
When you see strict RPO and RTO targets together with tolerance of regional outages and global users, look for multi region Spanner with quorum commits. Backups or batch replication usually signal recovery rather than high availability.
Question 4
Which Google Cloud service should you use to build a resilient and efficient pipeline that transfers about 60 million rows each week from PostgreSQL on Compute Engine to Cloud SQL for PostgreSQL while supporting filtering and enrichment during the transfer?
-
✓ B. Cloud Dataflow with Apache Beam
The correct option is Cloud Dataflow with Apache Beam.
Cloud Dataflow with Apache Beam is designed for scalable and reliable batch pipelines and it offers rich transformation capabilities during data movement. You can read from PostgreSQL using JDBC, apply filtering and enrichment in your pipeline, and write to Cloud SQL for PostgreSQL while benefiting from autoscaling, fault tolerance, and managed execution. This makes it well suited for a weekly workload of about 60 million rows where you want both efficiency and resilience.
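A minimal Beam sketch of this kind of pipeline is shown below. The connection strings, credentials, table names, and the status column used in the filter are all hypothetical, and the JDBC transforms are cross language, so a Java runtime must be available when the pipeline is constructed.

```python
import apache_beam as beam
from apache_beam.io.jdbc import ReadFromJdbc, WriteToJdbc
from apache_beam.options.pipeline_options import PipelineOptions

# Placeholder connection details, credentials, and table names.
SOURCE_URL = "jdbc:postgresql://10.0.0.5:5432/appdb"
TARGET_URL = "jdbc:postgresql://10.0.0.9:5432/analyticsdb"


def enrich(row):
    # Placeholder enrichment step, for example deriving or looking up a field.
    return row


options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadSource" >> ReadFromJdbc(
            table_name="orders",
            driver_class_name="org.postgresql.Driver",
            jdbc_url=SOURCE_URL,
            username="etl_user",
            password="etl_password",
        )
        | "FilterRows" >> beam.Filter(lambda row: row.status == "COMPLETED")
        | "EnrichRows" >> beam.Map(enrich)
        | "WriteTarget" >> WriteToJdbc(
            table_name="orders_curated",
            driver_class_name="org.postgresql.Driver",
            jdbc_url=TARGET_URL,
            username="etl_user",
            password="etl_password",
        )
    )
```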
Cloud Data Fusion provides a low code UI for building integrations and it typically executes on Dataflow under the hood. While it can move data and perform transformations, the scenario calls for a resilient and efficient pipeline with custom filtering and enrichment, and building the pipeline directly on Dataflow gives more control and efficiency without the additional abstraction layer.
Dataproc Serverless Spark can run Spark based batch jobs, yet this approach would require you to manage connectors and job logic for reading from and writing to PostgreSQL systems and to engineer reliability patterns that are natively addressed by Dataflow and Beam. It is better suited to Spark based analytics workloads rather than these relational ETL transfers.
Database Migration Service focuses on database migration and replication into Cloud SQL and it is not intended for pipelines that perform filtering and enrichment during transfer. It is best for minimal transformation migrations rather than custom ETL.
When a question emphasizes transformations during transfer and ongoing batch or streaming reliability, lean toward Dataflow with Beam. Choose Database Migration Service for minimal change migrations and consider Cloud Data Fusion when a low code UI driven integration is explicitly desired.
Question 5
Rivertown Analytics operates an on premises facility and needs a private connection to Google Cloud for a data platform. Current throughput is near 20 Gbps and capacity is expected to reach about 80 Gbps within 12 months as ingestion grows. The team wants private connectivity rather than sending traffic over the public internet and they must be able to expand bandwidth as demand increases. Which connectivity approach should they choose?
-
✓ B. Provision Dedicated Interconnect for a private physical link to Google Cloud with room to scale
The correct choice is Provision Dedicated Interconnect for a private physical link to Google Cloud with room to scale.
Dedicated Interconnect provides a private physical connection that does not traverse the public internet and it supports dynamic routing with Cloud Router. You can deploy multiple 10 Gbps or 100 Gbps circuits and aggregate capacity so it comfortably meets the current 20 Gbps requirement and can scale toward 80 Gbps and beyond. Dedicated Interconnect is designed for high throughput and predictable performance which aligns with the need for private connectivity and growth.
Configure Cloud VPN with dynamic routing using Cloud Router is not suitable because Cloud VPN sends traffic over the public internet and its bandwidth is limited per tunnel. Even with multiple tunnels it does not provide the same private path or the high and scalable bandwidth needed here.
Use Direct Peering to connect your network to Google at an edge location does not connect to VPC networks. It is intended for access to Google public services and it does not provide private connectivity to your Google Cloud VPC resources, so it does not meet the requirement.
Adopt Partner Interconnect through a supported provider when you do not need your own physical cross connect can offer private connectivity but it relies on a service provider and is best when you cannot establish your own cross connect. In this scenario the need for very high and scalable bandwidth from an on premises facility makes Dedicated Interconnect the more appropriate choice.
Match requirements to the product and watch for words like private, tens of Gbps, and scale. Choose Dedicated Interconnect for private high throughput to VPC, consider Partner Interconnect when you cannot host a cross connect, use Cloud VPN only when the internet path is acceptable, and remember Direct Peering reaches Google public services rather than your VPC.