GCP Professional ML Engineer Certification Exam Dump and Braindump

Free Google Cloud Machine Learning Engineer Exam Topics Test

Despite the title of this article, this is not a Professional GCP Machine Learning Engineer Certification Braindump in the traditional sense. I do not believe in cheating. Traditionally, the term “braindump” referred to someone taking an exam, memorizing the questions, and sharing them online for others to use. That practice is unethical and violates the certification agreement, and it brings no integrity, no real learning, and no professional growth.

This is not a braindump. All of these questions come from my Google Cloud certification prep materials and from the certificationexams.pro website, which offers hundreds of free GCP Professional Machine Learning Engineer Practice Questions.

Google Certified Machine Learning Engineer Exam Simulator

Each question has been carefully written to align with the official Google Cloud Professional Machine Learning Engineer exam objectives. They mirror the tone, logic, and technical depth of real Google Cloud exam scenarios, but none are copied from the actual test. Every question is designed to help you learn, reason, and master ML concepts such as data preparation, model optimization, pipeline automation, and model governance.

If you can answer these questions and understand why the incorrect options are wrong, you will not only pass the real exam but also gain a solid understanding of how to design, train, and deploy ML models effectively in production.

About GCP Exam Dumps

So if you want to call this your Google Machine Learning Engineer Certification Exam Dump, that is fine, but remember that every question here is built to teach, not to cheat. Each item includes detailed explanations, realistic examples, and insights that help you think like a professional ML engineer working on Google Cloud.

Study with focus, practice consistently, and approach your certification with integrity. Success as a Google Cloud ML Engineer comes not from memorizing answers but from understanding how machine learning, data engineering, and MLOps work together to deliver impactful solutions.

Use the Google Certified Machine Learning Engineer Exam Simulator and the Google Certified Professional ML Engineer Practice Test to prepare effectively and move closer to earning your certification.

Google Cloud ML Professional Certification Exam Dump

Question 1

A customer insights team at scrumtuous.com needs to quickly build and train a sentiment model on review text with their own labels such as “delight” and “frustration”. They have about 3,000 annotated examples, they want strong predictive accuracy, and they prefer a solution that does not require writing code. Which Google Cloud service should they use?

  • ❏ A. BigQuery ML

  • ❏ B. Cloud Natural Language API

  • ❏ C. Vertex AI AutoML for Text

  • ❏ D. Vertex AI Training

Question 2

You are training an object detection model on Vertex AI using a single worker with one GPU and epochs are slow. What should you do to reduce total training time without degrading model quality?

  • ❏ A. Vertex AI Vizier hyperparameter tuning

  • ❏ B. Use MultiWorkerMirroredStrategy on Vertex AI for distributed training

  • ❏ C. Increase machine memory to 512 GB and raise batch size

  • ❏ D. Move the job to a single Cloud TPU v5e

Question 3

You lead the machine learning platform team at a digital media analytics startup, and you need a single place to track experiment lineage, parameters, executions, and generated artifacts so that results remain reproducible across projects. Which management solution should your team adopt?

  • ❏ A. Store training run logs and metrics in BigQuery

  • ❏ B. Use Vertex ML Metadata to track lineage, artifacts, and executions

  • ❏ C. Google Cloud operations suite

  • ❏ D. Vertex TensorBoard

Question 4

Which approach ensures that the same preprocessing is used in the Dataflow Apache Beam training pipeline and in low latency Vertex AI predictions to prevent input skew?

  • ❏ A. Use Dataflow streaming to preprocess each request before the endpoint

  • ❏ B. Add request schema checks

  • ❏ C. Refactor Beam transforms into a shared library and run the same code in Vertex AI prediction

  • ❏ D. Use Vertex AI Feature Store

Question 5

At a digital newspaper publisher you trained a TensorFlow model on Vertex AI using multiple years of subscriber history to predict who will renew in the next 18 months, and the model is now serving online predictions; stakeholders want to see which single customer feature most influenced each individual prediction at request time, so what should you do?

  • ❏ A. Train a Logistic Regression model in BigQuery ML on the same data and use coefficient magnitudes to infer which features matter most

  • ❏ B. Use the What If Tool in TensorBoard to remove features one at a time and compare the change in overall model performance

  • ❏ C. Enable Vertex AI Explainable AI on the deployed endpoint and request predictions with explanations to return sampled Shapley attributions for each instance

  • ❏ D. Write predictions and features to BigQuery and calculate Pearson correlations with the label using the CORR function

Question 6

You need to predict same-day purchases using tabular data stored in BigQuery and provide explanations for each prediction. Which Google Cloud approach provides instance-level feature attributions?

  • ❏ A. BigQuery ML logistic regression using coefficients

  • ❏ B. Vertex AI custom model without explanations

  • ❏ C. Vertex AI AutoML Tabular with attributions enabled

  • ❏ D. BigQuery ML boosted trees with ML.FEATURE_IMPORTANCE

Question 7

A commerce team at mcnz.com is building a BigQuery ML linear regression model to estimate the likelihood that a visitor will buy a product, and the customer_city field is one of the most predictive features with about 180 unique values. You want the training table to be fully columnar for the model and you want to keep this feature while doing as little coding as possible. What should you do?

  • ❏ A. Use BigQuery to create a view that removes the customer_city column

  • ❏ B. Use TensorFlow to build a categorical vocabulary for customer_city and upload the vocabulary along with the model to BigQuery ML

  • ❏ C. Use Dataprep to apply one hot encoding to the customer_city field so that each city becomes its own binary indicator column

  • ❏ D. Use Cloud Data Fusion to convert each city to a numeric region code such as 1 to 5 and train on that single code

Question 8

Which Google Cloud solution provides a fully managed and scalable way to automate notebook-based steps for data validation, training, and evaluation on a three-week schedule as the dataset grows from 250 GB to 25 TB?

  • ❏ A. BigQuery ML with scheduled queries

  • ❏ B. Vertex AI Pipelines with a TFX pipeline

  • ❏ C. Cloud Composer with BashOperator

  • ❏ D. Kubeflow Pipelines on GKE

Question 9

You work for a meal delivery platform. A model chooses which promo to display on the checkout page based on the basket contents and the customer’s profile. The prediction service on Google Cloud merges the live cart with a row from a BigQuery table named cust_purchase_log that holds 180 days of transaction history and then sends those features to the model. The web team reports that the promo widget times out because predictions are not fast enough to render with the rest of the page. What change should you make to reduce total latency?

  • ❏ A. Attach an NVIDIA T4 GPU to the model server

  • ❏ B. Serve the customers’ historical features from a low latency database

  • ❏ C. Create a BigQuery materialized view that precomputes the join to each customer’s last 180 days of purchases

  • ❏ D. Increase the number of model serving replicas behind a load balancer

Question 10

A service running on Google Kubernetes Engine collects confirmed rider check-ins for shuttle stops at least 18 hours in advance. Based on this confirmed demand, which approach should you use to plan routes and determine bus size?

  • ❏ A. Reinforcement learning for routing

  • ❏ B. Vertex AI Forecast

  • ❏ C. Capacitated route on confirmed stops

  • ❏ D. Tree based regression for demand

Question 11

At AuroraFinTech you are running a synchronous training job on two GPUs and the profiler shows the GPUs often wait on input. The dataset is split across 1,500 TFRecord files in Cloud Storage and step time is dominated by file reads. You want to reduce input latency and accelerate training without changing the model architecture. What should you do?

  • ❏ A. Provision a machine with more vCPUs to increase data preparation throughput

  • ❏ B. Enable parallel interleave in the input pipeline to read from multiple files simultaneously

  • ❏ C. Insert a cache transformation in the pipeline after parsing so later epochs reuse data

  • ❏ D. Move parsing and shuffling to Cloud Dataflow and write data to a single large file before training

Question 12

Which validation method best evaluates a time series demand model using the most recent behavior before moving to production?

  • ❏ A. Apply k fold cross validation across the entire history

  • ❏ B. Hold out the most recent 21 days for time based validation to reflect current behavior

  • ❏ C. Vertex AI Model Monitoring

  • ❏ D. Create a random 25 percent holdout across all records regardless of date

Question 13

A midmarket logistics analytics team at Northwind Mobility runs PySpark-based data science pipelines on its own servers and now wants to test moving a single PySpark job to Google Cloud with the least setup and cost. What should you do first to start this proof of concept?

  • ❏ A. Use Cloud Dataflow by rewriting the job in Apache Beam and run it on a small Dataflow setup

  • ❏ B. Create a Dataproc Standard cluster with 1 master and 2 workers and open a Vertex AI Workbench notebook that uses the cluster for PySpark

  • ❏ C. Provision an e2-standard-4 Compute Engine VM and manually install Java Scala and Apache Spark

  • ❏ D. Launch a Vertex AI Workbench notebook on an e2-standard-4 machine without attaching any Spark cluster

Question 14

Which GCP approach delivers low latency streaming predictions by reading from a Pub/Sub request topic, automatically reloading the model from Cloud Storage about every 30 minutes, and publishing results to BigQuery and to a Pub/Sub response topic?

  • ❏ A. Cloud Functions with Pub/Sub

  • ❏ B. Dataflow RunInference with WatchFilePattern

  • ❏ C. Cloud Run with Pub/Sub push

  • ❏ D. Vertex AI endpoint called from Dataflow

Question 15

At BrightWave Insights you trained a model in a Vertex AI Workbench notebook that reports strong validation RMSE. You plan to search across 24 hyperparameters with defined ranges. You want a tuning approach that finishes in the least wall clock time, and you also want to keep cost, reproducibility, model quality, and scalability in mind as long as they do not slow the job. What should you do?

  • ❏ A. Set up a hyperparameter study in the notebook using Vertex AI Vizier and specify validation_rmse as the study metric

  • ❏ B. Use Hyperopt or Optuna in the notebook to run Bayesian optimization locally

  • ❏ C. Containerize a parameterized Python training entrypoint and push the image to Artifact Registry then create a Vertex AI hyperparameter tuning job with gcloud using Random Search and set the maximum trial count equal to the parallel trial count

  • ❏ D. Containerize a parameterized Python training script and push the image to Artifact Registry then create a Vertex AI hyperparameter tuning job in the Google Cloud console and choose Grid Search

Question 16

In Vertex AI Workbench, how should you track experiment runs with parameters, metrics, datasets, and models and then promote the winning approach to production?

  • ❏ A. Vertex AI Pipeline first with Kubeflow artifacts

  • ❏ B. Vertex AI Experiments in SDK then build a pipeline

  • ❏ C. Vertex ML Metadata only

Question 17

An ad tech startup named LumaDSP trained a scikit-learn model for click prediction and plans to deploy it on Vertex AI for both real time and batch predictions. The model expects 32 input features and several fields must be normalized during inference, and you want to package the solution with as little extra code as possible. What should you do?

  • ❏ A. Use the prebuilt scikit-learn prediction container, upload the model to Vertex AI Model Registry, deploy to Vertex AI Endpoints, and configure batch prediction to transform input data by setting instanceConfig.instanceType

  • ❏ B. Wrap the model with a Custom Prediction Routine that performs preprocessing and prediction, build a container image from the CPR artifacts, register it in Vertex AI Model Registry, deploy to Vertex AI Endpoints, and run Vertex AI batch prediction jobs

  • ❏ C. Build a custom container for the scikit-learn model and implement a custom serving function that handles preprocessing and prediction, register the image in Vertex AI Model Registry, deploy to Endpoints, and create a batch prediction job

  • ❏ D. Store features in Vertex AI Feature Store and schedule Dataflow to compute transformations, then deploy the scikit-learn model with the prebuilt prediction container to Vertex AI Endpoints and use Vertex AI batch prediction for offline scoring

Question 18

Which event-driven method in Google Cloud triggers a Vertex AI Pipelines run only when a new Cloud Storage object is created while keeping compute costs low?

  • ❏ A. Cloud Composer with GCS sensor

  • ❏ B. Cloud Functions on GCS finalize to start Vertex AI Pipelines run

  • ❏ C. Vertex AI Pipelines via Cloud Scheduler

  • ❏ D. Dataflow streaming on GCS notifications

Question 19

Solstice Analytics has built a Vertex AI Pipelines workflow that trains custom models with about 50 executions each week. The team wants the easiest way to collaborate when comparing metrics across executions through both programmatic access and interactive visualizations in Google Cloud. What should you add to the pipeline and which tools should the team use?

  • ❏ A. Add a pipeline component that writes metrics to a BigQuery table named ml_metrics_ds.train_metrics_v3, then use SQL to compare runs and visualize through Looker Studio

  • ❏ B. Add a pipeline component that records metrics to Vertex ML Metadata, then compare runs with Vertex AI Experiments and visualize with Vertex AI TensorBoard

  • ❏ C. Add a pipeline component that records metrics to Vertex ML Metadata, then export them into a pandas DataFrame to compare runs and render plots with Matplotlib

  • ❏ D. Add a pipeline component that pushes metrics to Vertex AI Model Registry, then monitor metrics with Cloud Monitoring dashboards

Question 20

In a photo policy classifier, which Vertex AI AutoML objective should you choose to minimize false negatives for the noncompliant class?

  • ❏ A. Vertex AI AutoML maximize F1 score

  • ❏ B. Vertex AI AutoML high recall on noncompliant

  • ❏ C. Vertex AI AutoML high precision on noncompliant

Question 21

You are designing a product recommendation system for a midmarket home furnishings retailer. Three years of purchase history is stored in BigQuery and roughly 90 GB of clickstream logs are saved as CSV files in Cloud Storage. You need to run exploratory analysis, clean data, and train models repeatedly while trying different algorithms, and you want to keep costs and setup effort low. How should you set up your working environment?

  • ❏ A. Start a Vertex AI Workbench managed notebook and use the BigQuery integration in JupyterLab to browse datasets and run queries

  • ❏ B. Use BigQuery Studio with BigQuery ML to explore data and build models directly in the BigQuery console

  • ❏ C. Provision a Vertex AI Workbench user managed notebook on the default machine type and use the %%bigquery magic in Jupyter to query BigQuery tables

  • ❏ D. Attach a Vertex AI Workbench managed notebook to a Dataproc cluster and access BigQuery through the Spark BigQuery connector

Question 22

What is the simplest way to use a BigQuery ML logistic regression model for real time predictions within a Dataflow streaming pipeline while keeping latency under 30 milliseconds per event?

  • ❏ A. Vertex AI online prediction

  • ❏ B. Per event ML.PREDICT in BigQuery

  • ❏ C. In pipeline TensorFlow RunInference in Dataflow

  • ❏ D. Cloud Run with TensorFlow Serving

Question 23

At Northstar Retail’s analytics group the machine learning team needs to run many quick trials using different feature sets, model variants, and tuning parameters. They want a low maintenance approach that automatically records accuracy metrics for every run and allows engineers to retrieve those metrics over time through an API for comparisons and dashboards. Which approach should they adopt?

  • ❏ A. Use Vertex AI Training to run jobs, write accuracy metrics to BigQuery, and access them with the BigQuery API

  • ❏ B. Use Vertex AI Training to execute runs and send accuracy values to Cloud Monitoring, then read them with the Monitoring API

  • ❏ C. Use Kubeflow Pipelines to orchestrate experiments, export metrics from steps, and retrieve run metrics with the Kubeflow Pipelines API

  • ❏ D. Use Vertex AI Workbench notebooks to run tests and log results in a shared Google Sheets workbook, then fetch values with the Google Sheets API

Question 24

Only 0.2% of labels are failures and a standard training approach predicts the majority class. How should you address this extreme class imbalance so the model learns to detect the rare failure class?

  • ❏ A. Vertex AI Vizier

  • ❏ B. Downsample negatives and apply class weights for about 20% failures per batch

  • ❏ C. Lower the classification threshold during inference

Question 25

At the streaming service mcnz.com your team has deployed a model on a Vertex AI endpoint and a Vertex AI Pipeline retrains the model when a Cloud Function is invoked. You want to keep the model current while keeping training spend predictable and low. How should you trigger retraining to balance freshness and cost?

  • ❏ A. Enable model monitoring on the Vertex AI endpoint for anomaly detection and have Pub/Sub notify the Cloud Function when anomalies are found

  • ❏ B. Configure a Cloud Scheduler job to invoke the Cloud Function on a fixed cadence that aligns with the budget

  • ❏ C. Enable Vertex AI model monitoring for input feature drift and publish notifications to Pub/Sub that invoke the Cloud Function when drift thresholds are exceeded

  • ❏ D. Create a Cloud Monitoring alert on endpoint latency anomalies and call the Cloud Function through a webhook when the alert fires

Certified GCP Machine Learning Professional Braindump Answers

Question 1

A customer insights team at scrumtuous.com needs to quickly build and train a sentiment model on review text with their own labels such as “delight” and “frustration”. They have about 3,000 annotated examples, they want strong predictive accuracy, and they prefer a solution that does not require writing code. Which Google Cloud service should they use?

  • ✓ C. Vertex AI AutoML for Text

The correct option is Vertex AI AutoML for Text. It lets the team train a custom sentiment classifier on their own labels without writing code and it works well with a few thousand annotated reviews, which fits the 3,000 labeled examples they have.

This service provides a guided console workflow to import labeled text, automatically extracts features, and selects and tunes models to deliver strong predictive accuracy. It also provides built in evaluation and simple deployment so the team can move quickly from data to predictions without managing infrastructure or writing code.

BigQuery ML is not a match because it requires SQL to build and evaluate models and it typically needs manual text preprocessing and feature engineering, which does not meet the no code preference and is better suited to tabular use cases.

Cloud Natural Language API offers pretrained sentiment scores and magnitudes but it cannot learn custom categories such as delight and frustration, so it does not satisfy the need to train on the team’s own labels.

Vertex AI Training targets custom training with user provided code and containers, which adds complexity and does not meet the no code requirement for this scenario.

When you see custom labels for text, a relatively small labeled dataset, and a no code requirement, choose AutoML on Vertex AI rather than pretrained APIs or code based training.

Question 2

You are training an object detection model on Vertex AI using a single worker with one GPU and epochs are slow. What should you do to reduce total training time without degrading model quality?

  • ✓ B. Use MultiWorkerMirroredStrategy on Vertex AI for distributed training

The correct option is Use MultiWorkerMirroredStrategy on Vertex AI for distributed training.

Use MultiWorkerMirroredStrategy on Vertex AI for distributed training scales training across multiple workers with synchronous data parallelism so you increase throughput and reduce wall clock time per epoch. On Vertex AI you can run custom training with multiple workers that coordinate gradients, and if you keep the effective global batch size and learning rate consistent then model quality is preserved while time to train decreases.
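
As a rough sketch of what the change looks like in training code, assuming a Keras style trainer where build_model and make_dataset stand in for your existing object detection code, the strategy only wraps model creation and compilation while Vertex AI supplies TF_CONFIG to each worker in a multi worker custom job.

```python
import tensorflow as tf

# Vertex AI sets TF_CONFIG for every replica in a multi-worker custom job,
# so the strategy discovers its peers automatically.
strategy = tf.distribute.MultiWorkerMirroredStrategy()


def build_model():
    # Placeholder for your existing object detection model definition.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1),
    ])


with strategy.scope():
    model = build_model()  # variables are created and mirrored inside the scope
    model.compile(optimizer="adam", loss="mse")

per_replica_batch = 32
global_batch = per_replica_batch * strategy.num_replicas_in_sync
# dataset = make_dataset(global_batch)  # your existing tf.data input pipeline
# model.fit(dataset, epochs=10)
```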

Vertex AI Vizier hyperparameter tuning explores parameter configurations across multiple trials and can improve model quality, yet it does not make a single training run faster. It often increases total compute time because it runs many jobs.

Increase machine memory to 512 GB and raise batch size targets host memory rather than GPU memory, which is usually not the bottleneck for GPU training. Raising batch size can change optimization dynamics and can harm convergence, and it does not guarantee shorter epoch times.

Move the job to a single Cloud TPU v5e does not provide distributed scaling and may require code changes for TPU compatibility. Performance gains are workload dependent, and moving to a single different accelerator does not reliably reduce training time without risking changes to training behavior.

When a question asks to reduce training time without harming quality, prefer synchronous data parallel scaling on managed training services. Be cautious with options that change batch size or rely on hyperparameter tuning because they do not make a single run faster and can affect convergence.

Question 3

You lead the machine learning platform team at a digital media analytics startup, and you need a single place to track experiment lineage, parameters, executions, and generated artifacts so that results remain reproducible across projects. Which management solution should your team adopt?

  • ✓ B. Use Vertex ML Metadata to track lineage, artifacts, and executions

The correct option is Use Vertex ML Metadata to track lineage, artifacts, and executions.

This service provides a managed metadata store that captures experiment lineage, parameters, executions, and generated artifacts so teams can reproduce results across projects. It integrates with Vertex AI workflows and builds lineage graphs that connect datasets, code versions, runs, and models, which enables auditability and repeatability.
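
To make this concrete, here is a minimal, hedged sketch with the Vertex AI SDK in which the project, bucket paths, and display names are placeholders. It shows how an execution can be linked to the dataset it consumed and the model it produced.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

# Record the input dataset as an artifact in the metadata store.
dataset = aiplatform.Artifact.create(
    schema_title="system.Dataset",
    display_name="subscriber-history",
    uri="gs://my-bucket/training-data/",
)

# Wrap the training step in an execution so lineage is captured end to end.
with aiplatform.start_execution(
    schema_title="system.ContainerExecution",
    display_name="train-churn-model",
) as execution:
    execution.assign_input_artifacts([dataset])
    # ... run training here, then register the produced model as an output ...
    model_artifact = aiplatform.Artifact.create(
        schema_title="system.Model",
        display_name="churn-model",
        uri="gs://my-bucket/models/churn/",
    )
    execution.assign_output_artifacts([model_artifact])
```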

Store training run logs and metrics in BigQuery is not sufficient because BigQuery is a data warehouse that can store numbers and logs but it does not model ML specific entities like executions and artifacts or their lineage relationships, and it does not automatically connect pipeline steps for reproducibility.

Google Cloud operations suite focuses on infrastructure and application monitoring and logging, so it does not capture experiment lineage, parameters, and artifact relationships, and it is not an ML metadata system.

Vertex TensorBoard visualizes training curves and metrics and helps compare runs, but it is not designed to track end to end lineage, artifacts, and executions across projects.

When you see keywords like lineage, artifacts, and executions, prefer an ML metadata solution rather than general logging, monitoring, or visualization tools.

Question 4

Which approach ensures that the same preprocessing is used in the Dataflow Apache Beam training pipeline and in low latency Vertex AI predictions to prevent input skew?

  • ✓ C. Refactor Beam transforms into a shared library and run the same code in Vertex AI prediction

The correct option is Refactor Beam transforms into a shared library and run the same code in Vertex AI prediction.

The shared library approach removes duplicate logic and guarantees that the exact same transformation code is used during both Dataflow training and online prediction. You package your Beam transform logic as reusable utilities or composite transforms and include that library in your Dataflow pipeline for training while also installing it in the Vertex AI prediction container. This keeps preprocessing consistent which prevents training serving skew and it also keeps latency low because preprocessing runs inside the prediction container rather than through an external service.

The shared library approach also enables stronger testing. You can unit test the transform functions once and rely on the same tests to validate both training and serving behavior. This makes the system easier to maintain and reduces the chance of silent divergence in preprocessing.
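
A minimal sketch of the pattern follows, assuming a hypothetical shared package named featurelib that both the Dataflow job and the prediction container install, so the identical preprocess function runs in each place.

```python
import json
import math

import apache_beam as beam


# featurelib.transforms (hypothetical shared module): the single source of
# truth for feature engineering, installed in both the Dataflow job and the
# Vertex AI prediction container.
def preprocess(record: dict) -> dict:
    out = dict(record)
    out["amount_log"] = math.log1p(float(record["amount"]))
    out["country"] = record.get("country", "unknown").lower()
    return out


def run_training_pipeline():
    # Training side: the same function runs inside a Beam Map step.
    with beam.Pipeline() as p:
        (p
         | "Read" >> beam.io.ReadFromText("gs://my-bucket/raw.jsonl")  # placeholder path
         | "Parse" >> beam.Map(json.loads)
         | "Preprocess" >> beam.Map(preprocess)
         | "Write" >> beam.io.WriteToText("gs://my-bucket/train/part"))


def handle_prediction_request(instance: dict, model) -> float:
    # Serving side: the identical function runs inside the prediction container
    # before the model is called, so the two code paths cannot drift apart.
    features = preprocess(instance)
    return model.predict([list(features.values())])[0]
```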

Use Dataflow streaming to preprocess each request before the endpoint is not suitable because it adds an extra network hop and queueing which increases latency and creates an unnecessary dependency during serving. It also does not ensure that the code path is identical to training and it complicates scaling and cost.

Add request schema checks only validates structure and types so it cannot guarantee that the same feature engineering and normalization steps are applied. Schema checks reduce bad inputs but they do not remove skew caused by different preprocessing logic.

Use Vertex AI Feature Store helps with managing and serving precomputed features and can reduce some kinds of skew when you materialize features consistently. However it does not ensure that your Dataflow Apache Beam transforms are executed identically at prediction time and it is unnecessary if your preprocessing logic needs to run per request inside the prediction container.

When you see training serving skew and low latency requirements, favor answers that reuse the same code path for training and serving. Packaging transforms into a library and running them in the prediction container usually beats options that add external services to the serving path.

Question 5

At a digital newspaper publisher you trained a TensorFlow model on Vertex AI using multiple years of subscriber history to predict who will renew in the next 18 months, and the model is now serving online predictions; stakeholders want to see which single customer feature most influenced each individual prediction at request time, so what should you do?

  • ✓ C. Enable Vertex AI Explainable AI on the deployed endpoint and request predictions with explanations to return sampled Shapley attributions for each instance

The correct option is Enable Vertex AI Explainable AI on the deployed endpoint and request predictions with explanations to return sampled Shapley attributions for each instance.

With Explainable AI enabled on the endpoint, each online prediction includes feature attribution values that quantify how much each input feature contributed to the prediction for that specific instance. You can surface the top attribution by magnitude to identify the single most influential customer feature at request time. This works with your TensorFlow model and keeps explanations aligned with the model you are actually serving.
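
A hedged sketch of the request side with the Vertex AI SDK is shown below, where the project, endpoint ID, and feature values are placeholders. The response carries sampled Shapley attributions per feature, so the caller can surface the one with the largest absolute value.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders
endpoint = aiplatform.Endpoint("1234567890")  # placeholder endpoint ID

instance = {"tenure_months": 26, "articles_read_90d": 114, "plan": "digital"}  # example features
response = endpoint.explain(instances=[instance])

for attribution in response.explanations[0].attributions:
    # feature_attributions maps each input feature to its attribution score;
    # the feature with the largest absolute value influenced this prediction most.
    print(attribution.feature_attributions)
```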

Train a Logistic Regression model in BigQuery ML on the same data and use coefficient magnitudes to infer which features matter most provides global importance for a different model and not local explanations for each served prediction. Coefficients from a separate linear model do not explain individual outputs of your deployed TensorFlow model.

Use the What If Tool in TensorBoard to remove features one at a time and compare the change in overall model performance supports exploratory analysis but it is about dataset level or scenario comparisons and not about returning per instance attributions with each online prediction. It does not integrate explanations directly into the endpoint response.

Write predictions and features to BigQuery and calculate Pearson correlations with the label using the CORR function yields global associations across the dataset and not local influence on a single prediction. Correlation does not provide per request feature attributions and can be misleading for nonlinear models or interacting features.

When you see a need for per instance explanations at request time, look for endpoint explanations that return feature attributions with each prediction rather than global metrics or offline analyses.

Question 6

You need to predict same-day purchases using tabular data stored in BigQuery and provide explanations for each prediction. Which Google Cloud approach provides instance-level feature attributions?

  • ✓ C. Vertex AI AutoML Tabular with attributions enabled

The correct option is Vertex AI AutoML Tabular with attributions enabled.

Vertex AI AutoML Tabular with attributions enabled uses Vertex Explainable AI to return per prediction feature attributions for tabular models. When explanations are enabled, each prediction includes scores that show how much each input feature pushed the prediction toward or away from the outcome, which satisfies the need for instance level attributions. You can train on BigQuery data and request explanations for both online and batch predictions.

BigQuery ML logistic regression using coefficients is not correct because coefficients provide global weights learned by the model and they describe overall influence rather than how features affected a specific row’s prediction, so they are not per example attributions.

Vertex AI custom model without explanations is not correct because if explanations are not enabled and configured, predictions do not include feature attributions, therefore there is no instance level explanation.

BigQuery ML boosted trees with ML.FEATURE_IMPORTANCE is not correct because ML.FEATURE_IMPORTANCE reports aggregate or global importance across the training data and it does not provide per row attributions for individual predictions.

When a question asks for per example explanations, look for options that explicitly mention explanations or feature attributions and avoid answers that only mention coefficients or feature importance because those are usually global.

Question 7

A commerce team at mcnz.com is building a BigQuery ML linear regression model to estimate the likelihood that a visitor will buy a product, and the customer_city field is one of the most predictive features with about 180 unique values. You want the training table to be fully columnar for the model and you want to keep this feature while doing as little coding as possible. What should you do?

  • ✓ C. Use Dataprep to apply one hot encoding to the customer_city field so that each city becomes its own binary indicator column

The correct option is Use Dataprep to apply one hot encoding to the customer_city field so that each city becomes its own binary indicator column.

One hot encoding turns a single categorical field into multiple numeric indicator columns so the training data becomes fully columnar and remains expressive across about 180 cities. Using Dataprep provides a largely no code path to generate those columns and write the result to BigQuery, which satisfies the requirement to keep the feature with minimal coding. Note that Dataprep has been retired on Google Cloud, so on newer exams this workflow may appear as an equivalent preparation step using other managed options or native SQL transformations, yet among the given choices it best matches the requirement.
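
For intuition only, and not the Dataprep recipe itself, here is a tiny pandas illustration of what one hot encoding does to customer_city before the widened table is written back to BigQuery. The visitor and city values are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "visitor_id": [1, 2, 3],
    "customer_city": ["Austin", "Lyon", "Austin"],
    "bought": [1, 0, 1],
})

# Each distinct city becomes its own 0/1 indicator column, so the training
# table stays fully numeric and columnar for the linear regression model.
encoded = pd.get_dummies(df, columns=["customer_city"], prefix="city", dtype=int)
print(list(encoded.columns))
# ['visitor_id', 'bought', 'city_Austin', 'city_Lyon']
```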

Use BigQuery to create a view that removes the customer_city column is incorrect because it discards one of the most predictive features and does not meet the requirement to keep it.

Use TensorFlow to build a categorical vocabulary for customer_city and upload the vocabulary along with the model to BigQuery ML is incorrect because it introduces unnecessary custom modeling and operational complexity and does not provide a minimal coding solution inside BigQuery. BigQuery ML does not take a separately uploaded vocabulary for native models in this workflow.

Use Cloud Data Fusion to convert each city to a numeric region code such as 1 to 5 and train on that single code is incorrect because it collapses many distinct categories into a few labels and imposes an artificial numeric order, which can mislead a linear model and reduce predictive power.

When a categorical feature has many distinct values and the requirement is numeric and columnar inputs, think of one hot encoding. If the prompt asks for minimal coding, prefer a managed or no code transformation path that outputs a BigQuery table you can train on directly.

Question 8

Which Google Cloud solution provides a fully managed and scalable way to automate notebook-based steps for data validation, training, and evaluation on a three-week schedule as the dataset grows from 250 GB to 25 TB?

  • ✓ B. Vertex AI Pipelines with a TFX pipeline

The correct option is Vertex AI Pipelines with a TFX pipeline. This choice provides a fully managed and scalable way to orchestrate data validation, training and evaluation on a fixed schedule as data volume grows from hundreds of gigabytes to tens of terabytes.

Vertex AI Pipelines with a TFX pipeline turns the steps you prototype in notebooks into reusable pipeline components and stages. TFX provides built in components for data validation, model training and evaluation, and Vertex AI Pipelines manages execution, tracking and artifact lineage without you managing infrastructure. You can schedule recurring runs every three weeks and the service elastically scales underlying resources to handle large datasets.

This approach integrates with distributed training and data processing services on Google Cloud, which lets you process 250 GB today and scale to 25 TB as the pipeline reruns. You also gain reproducibility and auditability through managed metadata and versioning, which are important for production ML automation.

BigQuery ML with scheduled queries is not a good fit because it is limited to SQL driven model training and prediction inside BigQuery and it does not orchestrate multi step ML workflows such as separate data validation and model evaluation, nor does it automate notebook code.

Cloud Composer with BashOperator can schedule tasks but it leaves you to script ML steps yourself and manage operator reliability. It lacks the ML specific components, metadata tracking and integrated model lifecycle that a managed ML pipeline service provides.

Kubeflow Pipelines on GKE can run pipelines but it is self managed on your own GKE clusters. The question calls for a fully managed approach, so running and maintaining clusters yourself makes this a less suitable choice.

Look for phrases like fully managed, ML specific components and recurring automation. These often point to Vertex AI Pipelines rather than general workflow tools or self managed stacks.

Question 9

You work for a meal delivery platform. A model chooses which promo to display on the checkout page based on the basket contents and the customer’s profile. The prediction service on Google Cloud merges the live cart with a row from a BigQuery table named cust_purchase_log that holds 180 days of transaction history and then sends those features to the model. The web team reports that the promo widget times out because predictions are not fast enough to render with the rest of the page. What change should you make to reduce total latency?

  • ✓ B. Serve the customers’ historical features from a low latency database

The correct option is Serve the customers’ historical features from a low latency database.

The current bottleneck is fetching and joining 180 days of history from BigQuery during a synchronous web request. BigQuery is optimized for analytical scans and is not designed for per-request millisecond lookups. Moving the precomputed customer features into an online store lets the prediction path perform a single key-based read and then run inference, which cuts end-to-end latency and removes variability. On Google Cloud you can store online features in Vertex AI Feature Store's online store or in a low latency database such as Cloud Bigtable, then fetch them at inference time in a few milliseconds while the live cart data remains in memory.
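
A hedged sketch of the reworked serving path follows, assuming the 180 day aggregates have already been materialized into a Cloud Bigtable table keyed by customer id. The instance, table, and column family names are placeholders, and Vertex AI Feature Store online serving would look similar.

```python
from google.cloud import bigtable

# Assumed resources: a Bigtable instance "features" with a table
# "cust_features" keyed by customer id and a column family "hist".
client = bigtable.Client(project="my-project")
table = client.instance("features").table("cust_features")


def fetch_historical_features(customer_id: str) -> dict:
    # Single key-based read, typically a few milliseconds, instead of an
    # analytical BigQuery query in the synchronous request path.
    row = table.read_row(customer_id.encode("utf-8"))
    if row is None:
        return {}
    return {
        col.decode(): cells[0].value.decode()
        for col, cells in row.cells["hist"].items()
    }


def predict_promo(live_cart: dict, customer_id: str, model):
    # Merge the live cart with the precomputed history and call the model.
    features = {**live_cart, **fetch_historical_features(customer_id)}
    return model.predict([features])
```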

Attach an NVIDIA T4 GPU to the model server is not addressing the root cause. The slow step is retrieving and joining historical features at request time, not the model compute. A GPU can speed up heavy model inference but it will not materially reduce the time spent waiting on BigQuery.

Create a BigQuery materialized view that precomputes the join to each customer’s last 180 days of purchases still leaves you issuing a BigQuery query during page rendering. Materialized views can accelerate analytics but BigQuery remains an analytical warehouse with second-level interactive latencies and is not intended for sub-100 millisecond per-request lookups in a web path. Freshness lag also makes it less suitable for real-time personalization.

Increase the number of model serving replicas behind a load balancer can improve throughput when there is queueing, yet it does not make a single prediction faster when the latency is dominated by the upstream BigQuery fetch. Without moving features to an online store, requests will continue to time out.

Trace the request path and find the slowest step. If feature retrieval is dominating latency, move features to an online store and fetch by key rather than querying an analytical warehouse during inference.

Question 10

A service running on Google Kubernetes Engine collects confirmed rider check-ins for shuttle stops at least 18 hours in advance. Based on this confirmed demand, which approach should you use to plan routes and determine bus size?

  • ✓ C. Capacitated route on confirmed stops

The correct option is Capacitated route on confirmed stops.

Because rider check-ins are confirmed well in advance, this is a deterministic planning problem. The right approach is to formulate a capacitated vehicle routing problem that assigns stops to buses and selects appropriate bus sizes while respecting capacity and time constraints. This formulation can be solved efficiently with mature optimization tools and directly produces feasible routes and fleet sizing decisions from the known demand.
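
As an illustration of such a formulation, the sketch below uses Google's OR-Tools routing solver with toy numbers for the distance matrix, confirmed demand per stop, and candidate bus capacities. It is a minimal example rather than a production planner.

```python
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Toy inputs: node 0 is the depot, demands are confirmed rider counts per stop,
# capacities describe the candidate bus sizes in the fleet.
distance = [
    [0, 9, 7, 6],
    [9, 0, 4, 8],
    [7, 4, 0, 5],
    [6, 8, 5, 0],
]
demands = [0, 12, 18, 9]
capacities = [20, 30]  # two buses of different sizes

manager = pywrapcp.RoutingIndexManager(len(distance), len(capacities), 0)
routing = pywrapcp.RoutingModel(manager)


def distance_cb(i, j):
    return distance[manager.IndexToNode(i)][manager.IndexToNode(j)]


routing.SetArcCostEvaluatorOfAllVehicles(routing.RegisterTransitCallback(distance_cb))


def demand_cb(i):
    return demands[manager.IndexToNode(i)]


routing.AddDimensionWithVehicleCapacity(
    routing.RegisterUnaryTransitCallback(demand_cb),
    0,            # no slack
    capacities,   # per-vehicle capacity
    True,         # start cumulative load at zero
    "Capacity",
)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
solution = routing.SolveWithParameters(params)
print("Total distance:", solution.ObjectiveValue() if solution else "no solution")
```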

Reinforcement learning for routing is not appropriate here because there is no need to learn a policy when demand is fixed and constraints must be strictly satisfied. Classical optimization will deliver high quality solutions more reliably and with far less tuning for this kind of deterministic problem.

Vertex AI Forecast targets time series prediction problems, yet there is nothing to predict when riders have already confirmed check-ins. Forecasting would introduce avoidable error and would still not create the actual routes or vehicle assignments that are required.

Tree based regression for demand would build a model to estimate demand even though demand is already known. It would add complexity without value and you would still need a routing optimization step to produce implementable plans.

When a scenario gives you confirmed demand and a planning window, think about optimization such as a vehicle routing formulation rather than forecasting or learning based methods.

Question 11

At AuroraFinTech you are running a synchronous training job on two GPUs and the profiler shows the GPUs often wait on input. The dataset is split across 1,500 TFRecord files in Cloud Storage and step time is dominated by file reads. You want to reduce input latency and accelerate training without changing the model architecture. What should you do?

  • ✓ B. Enable parallel interleave in the input pipeline to read from multiple files simultaneously

The correct option is Enable parallel interleave in the input pipeline to read from multiple files simultaneously. This approach issues concurrent reads across many TFRecord shards which hides per file latency from Cloud Storage and increases input throughput so the two GPUs are kept busy without any change to the model.

Using interleave with parallel calls in tf.data lets the pipeline open multiple TFRecord files at once and mix their records as they are produced. With an appropriate cycle_length and num_parallel_calls setting the input stage can overlap file I/O and parsing so that step time is no longer dominated by single file reads. This technique is a standard remedy when you have many shards and the profiler shows the accelerators waiting on input because it converts a largely serial file access pattern into a parallel one while remaining within the training job and not adding external systems.
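
A minimal tf.data sketch of the change is shown below, where the Cloud Storage pattern and the feature spec inside parse_example are placeholders for your existing dataset.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE


def parse_example(serialized):
    # Placeholder feature spec; replace with your existing TFRecord schema.
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    return tf.io.parse_single_example(serialized, features)


files = tf.data.Dataset.list_files("gs://my-bucket/train/*.tfrecord", shuffle=True)  # placeholder

dataset = (
    files
    # Open many shards at once so reads overlap instead of running serially.
    .interleave(
        tf.data.TFRecordDataset,
        cycle_length=16,
        num_parallel_calls=AUTOTUNE,
        deterministic=False,
    )
    .map(parse_example, num_parallel_calls=AUTOTUNE)
    .shuffle(10_000)
    .batch(64)
    .prefetch(AUTOTUNE)
)
```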

Provision a machine with more vCPUs to increase data preparation throughput is not the best fix when the bottleneck is file I/O from Cloud Storage. Adding CPUs can help parsing but if reads are serialized across files then the GPUs will continue to stall because input latency has not been reduced.

Insert a cache transformation in the pipeline after parsing so later epochs reuse data does not help the first epoch which is when the GPUs are already waiting. It can also be impractical if the dataset does not fit in memory or local disk and it still does not parallelize the initial file reads that dominate step time.

Move parsing and shuffling to Cloud Dataflow and write data to a single large file before training adds complexity and can reduce read parallelism during training because a single large object is typically consumed sequentially. The training job would still wait on I/O and you would lose the benefit of many shards that can be read concurrently within the tf.data pipeline.

When accelerators wait on input and you have many shards, think about increasing parallelism inside the tf.data pipeline. Look for knobs such as interleave and map with num_parallel_calls along with prefetch, and verify improvements with the profiler rather than adding new systems.

Question 12

Which validation method best evaluates a time series demand model using the most recent behavior before moving to production?

  • ✓ B. Hold out the most recent 21 days for time based validation to reflect current behavior

The correct option is Hold out the most recent 21 days for time based validation to reflect current behavior. This focuses evaluation on the latest patterns and preserves temporal order, which makes it the strongest indicator of how the model will perform immediately after deployment.

This approach mirrors real world usage because the model will forecast the near future and should therefore be validated on the most recent window. It avoids information leakage from the future into the past and it captures any recent shifts in demand that older data may not reflect. Evaluating on a contiguous and recent time slice provides a faithful estimate of generalization under current conditions.
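
A minimal pandas sketch of the split, assuming a demand_history.csv file with a date column, looks like this.

```python
import pandas as pd

# One row per (date, item) with the demand label; the newest 21 days become
# the validation window and everything earlier is used for training.
df = pd.read_csv("demand_history.csv", parse_dates=["date"])  # placeholder file

cutoff = df["date"].max() - pd.Timedelta(days=21)
train = df[df["date"] <= cutoff]
valid = df[df["date"] > cutoff]

print(f"train rows: {len(train)}, validation rows: {len(valid)}")
# Fit on `train`, evaluate on `valid`, and never let validation dates leak
# into the training set.
```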

Apply k fold cross validation across the entire history is not appropriate for time series because it breaks temporal ordering or mixes future and past when creating folds. Even time series aware folds do not directly evaluate on the most recent contiguous period, so they are not the best choice when you want to measure current behavior before production.

Vertex AI Model Monitoring is for monitoring models after deployment and detecting drift or anomalies in production. It does not validate a model offline before release and therefore does not answer the question about pre production evaluation on recent behavior.

Create a random 25 percent holdout across all records regardless of date introduces leakage from future records into the training process and fails to reflect the sequential nature of time series. It also does not focus evaluation on the latest period, so it underestimates shifts in recent demand.

For time series questions prefer chronological splits that preserve temporal order. When the goal is to judge readiness for deployment favor a holdout from the most recent period and avoid random or k fold approaches. Remember that monitoring is for post deployment and not for offline validation.

Question 13

A midmarket logistics analytics team at Northwind Mobility runs PySpark-based data science pipelines on its own servers and now wants to test moving a single PySpark job to Google Cloud with the least setup and cost. What should you do first to start this proof of concept?

  • ✓ B. Create a Dataproc Standard cluster with 1 master and 2 workers and open a Vertex AI Workbench notebook that uses the cluster for PySpark

The correct option is Create a Dataproc Standard cluster with 1 master and 2 workers and open a Vertex AI Workbench notebook that uses the cluster for PySpark.

This choice lets you run your existing PySpark code as is while relying on a managed Spark environment that you can size small for a low cost trial. You get an interactive notebook experience that can submit Spark code to the cluster which is ideal for a quick proof of concept with minimal setup and minimal operational overhead.
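
For reference, a hedged sketch of creating that small cluster with the Dataproc Python client is below. The project, region, cluster name, and machine types are placeholders, and the same result is available through the console or gcloud with no code at all.

```python
from google.cloud import dataproc_v1

project, region = "my-project", "us-central1"  # placeholders

cluster_client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "project_id": project,
    "cluster_name": "pyspark-poc",
    "config": {
        "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-4"},
        "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-4"},
    },
}

operation = cluster_client.create_cluster(
    request={"project_id": project, "region": region, "cluster": cluster}
)
print("Created cluster:", operation.result().cluster_name)
```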

Use Cloud Dataflow by rewriting the job in Apache Beam and run it on a small Dataflow setup is wrong because it requires rewriting the PySpark job into Beam which increases effort and risk. It also changes the execution model and tooling which does not align with the requirement for the least setup to start testing.

Provision an e2-standard-4 Compute Engine VM and manually install Java Scala and Apache Spark is wrong because manual installation and configuration adds significant setup time and maintenance. A single self managed VM does not reflect the managed distributed Spark environment you want to test and it slows down the proof of concept.

Launch a Vertex AI Workbench notebook on an e2-standard-4 machine without attaching any Spark cluster is wrong because a notebook alone does not provide a Spark runtime. You would still need to install and configure Spark or attach a cluster which counters the goal of least setup and does not provide a realistic Spark execution environment.

Identify the option that preserves your existing code with the least setup. When you see PySpark and a quick proof of concept, favor a managed Spark service so you can run code with no rewrite and only minimal environment changes.

Question 14

Which GCP approach delivers low latency streaming predictions by reading from a Pub/Sub request topic, automatically reloading the model from Cloud Storage about every 30 minutes, and publishing results to BigQuery and to a Pub/Sub response topic?

  • ✓ B. Dataflow RunInference with WatchFilePattern

The correct option is Dataflow RunInference with WatchFilePattern. It satisfies the need for low latency streaming predictions from a Pub/Sub request topic, performs in-process model inference, automatically reloads updated model files from Cloud Storage on a periodic schedule of about every 30 minutes, and can write results to BigQuery as well as publish a response to a Pub/Sub topic.

This approach uses a streaming Dataflow pipeline that subscribes to Pub/Sub for continuous input and performs model inference on workers for low end-to-end latency. It can monitor a Cloud Storage path for new model artifacts and reload weights without redeploying the pipeline, which aligns with the requirement for automatic model refreshes roughly every half hour. Dataflow also provides built-in connectors to write outputs to BigQuery and to publish to Pub/Sub, which cleanly fulfills the dual sink requirement.
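
The pipeline shape might look roughly like the sketch below, with topic names, the model path, the TensorFlow model handler, and the decode and format helpers all placeholders. Exact handler classes and options vary by framework and Apache Beam version.

```python
import json

import apache_beam as beam
import tensorflow as tf
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerTensor
from apache_beam.ml.inference.utils import WatchFilePattern
from apache_beam.options.pipeline_options import PipelineOptions


def decode_request(msg: bytes):
    # Placeholder: turn the Pub/Sub payload into the tensor the model expects.
    return tf.convert_to_tensor(json.loads(msg)["features"], dtype=tf.float32)


def format_result(result):
    # RunInference emits PredictionResult objects with .example and .inference.
    return {"prediction": float(tf.reshape(result.inference, [-1])[0])}


options = PipelineOptions(streaming=True)  # plus runner, project, region, and so on
model_handler = TFModelHandlerTensor(model_uri="gs://my-bucket/models/current.h5")  # placeholder

with beam.Pipeline(options=options) as p:
    # Side input that re-emits model metadata whenever a newer file matches
    # the pattern, polled roughly every 30 minutes (1800 seconds).
    model_updates = p | "WatchModel" >> WatchFilePattern(
        file_pattern="gs://my-bucket/models/*.h5", interval=1800
    )

    results = (
        p
        | "ReadRequests" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/requests")
        | "Decode" >> beam.Map(decode_request)
        | "Infer" >> RunInference(model_handler, model_metadata_pcoll=model_updates)
        | "Format" >> beam.Map(format_result)
    )

    results | "ToBigQuery" >> beam.io.WriteToBigQuery(
        "my-project:ml.predictions", schema="prediction:FLOAT"
    )
    (results
     | "Encode" >> beam.Map(lambda row: json.dumps(row).encode("utf-8"))
     | "ToPubSub" >> beam.io.WriteToPubSub(topic="projects/my-project/topics/responses"))
```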

Cloud Functions with Pub/Sub is not ideal because it does not provide a built-in file watching mechanism to reload models from Cloud Storage on a timed basis and each invocation would need to manage model loading which adds latency and complexity. While it can write to BigQuery with custom code, it is not a streaming inference pipeline pattern optimized for sustained low latency.

Cloud Run with Pub/Sub push would require custom implementation for model watching and reloads and it does not offer a native mechanism to poll Cloud Storage and hot-swap model files in memory on a set cadence. You could integrate with BigQuery and Pub/Sub through code, yet this does not directly meet the automatic model reload requirement.

Vertex AI endpoint called from Dataflow shifts inference to a managed online endpoint rather than reloading models from Cloud Storage within the pipeline. This can achieve low latency but it does not implement the requested pattern of periodic model reloads from Cloud Storage inside the streaming job.

When a scenario mentions streaming from Pub/Sub, low latency, automatic reload of models from Cloud Storage, and outputs to BigQuery and a Pub/Sub response, map it to a Dataflow streaming pattern that does in-process inference and watches a Cloud Storage path for new model files.

Question 15

At BrightWave Insights you trained a model in a Vertex AI Workbench notebook that reports strong validation RMSE. You plan to search across 24 hyperparameters with defined ranges. You want a tuning approach that finishes in the least wall clock time, and you also want to keep cost, reproducibility, model quality, and scalability in mind as long as they do not slow the job. What should you do?

  • ✓ C. Containerize a parameterized Python training entrypoint and push the image to Artifact Registry then create a Vertex AI hyperparameter tuning job with gcloud using Random Search and set the maximum trial count equal to the parallel trial count

The correct choice is Containerize a parameterized Python training entrypoint and push the image to Artifact Registry then create a Vertex AI hyperparameter tuning job with gcloud using Random Search and set the maximum trial count equal to the parallel trial count.

This approach gives you the shortest wall clock time because setting the number of parallel trials equal to the total trials runs everything at once and the job finishes when the slowest trial completes. Random Search parallelizes cleanly without waiting for prior results, while the managed Vertex AI service schedules trials across scalable compute which preserves speed. Containerizing the trainer and storing it in Artifact Registry improves reproducibility and portability, and using gcloud makes the setup scriptable and repeatable while letting you control machine types and regions to manage cost.
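
The equivalent job expressed with the Vertex AI Python SDK, which carries the same fields as the gcloud command, might look like the hedged sketch below. The image URI, bucket, parameter names, and trial counts are placeholders.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

aiplatform.init(
    project="my-project", location="us-central1",
    staging_bucket="gs://my-bucket/staging",  # placeholders
)

worker_pool_specs = [{
    "machine_spec": {"machine_type": "n1-standard-8"},
    "replica_count": 1,
    "container_spec": {"image_uri": "us-docker.pkg.dev/my-project/train/trainer:latest"},  # placeholder image
}]

custom_job = aiplatform.CustomJob(
    display_name="rmse-trainer", worker_pool_specs=worker_pool_specs
)

tuning_job = aiplatform.HyperparameterTuningJob(
    display_name="rmse-random-search",
    custom_job=custom_job,
    metric_spec={"validation_rmse": "minimize"},
    parameter_spec={
        "learning_rate": hpt.DoubleParameterSpec(min=1e-4, max=1e-1, scale="log"),
        "num_layers": hpt.IntegerParameterSpec(min=2, max=12, scale="linear"),
        # ... the remaining hyperparameters with their defined ranges
    },
    search_algorithm="random",
    max_trial_count=48,       # all trials ...
    parallel_trial_count=48,  # ... run at once, so wall clock time is about one trial
)
tuning_job.run()
```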

Set up a hyperparameter study in the notebook using Vertex AI Vizier and specify validation_rmse as the study metric is not the best choice for least wall clock time because running trials from within a notebook typically leads to limited parallelism and orchestration overhead. It also reduces reproducibility and scalability compared to a managed hyperparameter tuning job that fans out many trials concurrently.

Use Hyperopt or Optuna in the notebook to run Bayesian optimization locally is suboptimal for speed because Bayesian optimization benefits from sequential learning and does not scale to high parallelism as effectively as Random Search. Running locally in a notebook further constrains compute resources and makes large parallel sweeps harder to manage, which increases overall completion time.

Containerize a parameterized Python training script and push the image to Artifact Registry then create a Vertex AI hyperparameter tuning job in the Google Cloud console and choose Grid Search is not appropriate for 24 hyperparameters with ranges because grid search explodes combinatorially and will require many more trials, so it is slower and more costly. It also provides no advantage for speed compared to running a highly parallel random search.

When speed is the top priority, use managed hyperparameter tuning with Random Search and set parallelTrialCount equal to maxTrialCount so all trials run concurrently. Containerize your trainer for reproducibility and easier scaling.

Question 16

In Vertex AI Workbench, how should you track experiment runs with parameters, metrics, datasets, and models and then promote the winning approach to production?

  • ✓ B. Vertex AI Experiments in SDK then build a pipeline

The correct answer is Vertex AI Experiments in SDK then build a pipeline.

In Vertex AI Workbench you begin in a notebook and use the Python SDK to log parameters, metrics, datasets and model artifacts to Vertex AI Experiments. This lets you compare runs, capture lineage and decide on the best configuration. After selecting a winner you operationalize the steps with Vertex AI Pipelines so training, evaluation, model registration and deployment run consistently in production.
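
A minimal sketch of the tracking side with the SDK follows, with the project, experiment name, run name, and logged values all placeholders.

```python
from google.cloud import aiplatform

aiplatform.init(
    project="my-project", location="us-central1",  # placeholders
    experiment="churn-experiments",
)

# One run per notebook trial: log what went in and what came out.
aiplatform.start_run("run-lgbm-depth8")
aiplatform.log_params({"model": "lightgbm", "max_depth": 8, "features": "v3"})
# ... train and evaluate ...
aiplatform.log_metrics({"validation_rmse": 0.412, "auc": 0.87})
aiplatform.end_run()

# Compare runs programmatically before promoting the winner to a pipeline.
df = aiplatform.get_experiment_df("churn-experiments")
print(df.sort_values("metric.validation_rmse").head())
```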

Vertex AI Pipeline first with Kubeflow artifacts is not the right starting point when the goal is to track and compare experiments. Pipelines are best once you know the approach and want reliable orchestration rather than during early iteration in notebooks.

Vertex ML Metadata only is too low level for this need. It stores lineage but does not provide the streamlined SDK and UI that Vertex AI Experiments offers for logging runs and comparing metrics.

When a question emphasizes comparing parameters and metrics from notebooks think Vertex AI Experiments for tracking and think Pipelines for moving the chosen approach to production.

Question 17

An ad tech startup named LumaDSP trained a scikit-learn model for click prediction and plans to deploy it on Vertex AI for both real time and batch predictions. The model expects 32 input features and several fields must be normalized during inference, and you want to package the solution with as little extra code as possible. What should you do?

  • ✓ B. Wrap the model with a Custom Prediction Routine that performs preprocessing and prediction, build a container image from the CPR artifacts, register it in Vertex AI Model Registry, deploy to Vertex AI Endpoints, and run Vertex AI batch prediction jobs

The correct option is Wrap the model with a Custom Prediction Routine that performs preprocessing and prediction, build a container image from the CPR artifacts, register it in Vertex AI Model Registry, deploy to Vertex AI Endpoints, and run Vertex AI batch prediction jobs.

This approach lets you keep your scikit-learn artifact intact and add only a lightweight Python package that handles input normalization and prediction. Vertex AI builds on a prebuilt serving image so you avoid writing a full server or Dockerfile while still supporting both online endpoints and batch prediction with the same logic. It therefore minimizes extra code while meeting the requirement to normalize several fields during inference and to accept the expected 32 features.
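
A hedged sketch of the pieces is shown below. The source layout, artifact file names, and image URI are assumptions, and the exact Predictor interface can vary slightly across SDK versions.

```python
# predictor/predictor.py -- assumed layout of the CPR source directory.
import pickle

import numpy as np
from google.cloud.aiplatform.prediction.predictor import Predictor
from google.cloud.aiplatform.utils import prediction_utils


class NormalizingPredictor(Predictor):
    """Loads the scikit-learn model plus a saved scaler and normalizes inputs."""

    def load(self, artifacts_uri: str) -> None:
        prediction_utils.download_model_artifacts(artifacts_uri)
        with open("model.pkl", "rb") as f:
            self._model = pickle.load(f)
        with open("scaler.pkl", "rb") as f:
            self._scaler = pickle.load(f)

    def predict(self, instances):
        x = self._scaler.transform(np.asarray(instances["instances"]))
        return {"predictions": self._model.predict_proba(x)[:, 1].tolist()}


# Run separately to build the serving image and register the model.
def build_and_upload():
    from google.cloud import aiplatform
    from google.cloud.aiplatform.prediction import LocalModel

    local_model = LocalModel.build_cpr_model(
        "predictor/",                                             # CPR source dir
        "us-docker.pkg.dev/my-project/serving/click-cpr:latest",  # placeholder image
        predictor=NormalizingPredictor,
        requirements_path="predictor/requirements.txt",
    )
    return aiplatform.Model.upload(
        local_model=local_model,
        display_name="click-model-cpr",
        artifact_uri="gs://my-bucket/models/click/",              # holds model.pkl and scaler.pkl
    )
```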

Use the prebuilt scikit-learn prediction container, upload the model to Vertex AI Model Registry, deploy to Vertex AI Endpoints, and configure batch prediction to transform input data by setting instanceConfig.instanceType is incorrect because the prebuilt container does not run custom preprocessing code during prediction, and instanceConfig.instanceType only controls how batch prediction formats instances for the container rather than applying transformations such as normalization. You would not be able to implement the required preprocessing with configuration alone.

Build a custom container for the scikit-learn model and implement a custom serving function that handles preprocessing and prediction, register the image in Vertex AI Model Registry, deploy to Endpoints, and create a batch prediction job is functionally viable but is not the best choice here because it requires significantly more code and ongoing maintenance to build and operate a custom image and server when a lighter option exists.

Store features in Vertex AI Feature Store and schedule Dataflow to compute transformations, then deploy the scikit-learn model with the prebuilt prediction container to Vertex AI Endpoints and use Vertex AI batch prediction for offline scoring is unnecessary for this scenario because Feature Store manages feature storage and serving rather than performing per request inference time normalization. It adds operational complexity without solving the immediate need to run preprocessing at prediction time.

When you need custom preprocessing at prediction time with minimal code, prefer a Custom Prediction Routine over a fully custom container. Choose prebuilt containers only when no custom inference logic is needed, and move to custom containers only when you need full control.

Question 18

Which event-driven method in Google Cloud triggers a Vertex AI Pipelines run only when a new Cloud Storage object is created while keeping compute costs low?

  • ✓ B. Cloud Functions on GCS finalize to start Vertex AI Pipelines run

The correct option is Cloud Functions on GCS finalize to start Vertex AI Pipelines run. This approach triggers only when a new object is finalized in Cloud Storage and it invokes short lived serverless code to start a pipeline run. Since it scales to zero when idle you pay only for invocations which keeps compute costs low.

This works by subscribing to the Cloud Storage object finalize event and using the event payload to decide which pipeline to launch. The function calls the pipelines API to create a run and no continuously running infrastructure is required which optimizes cost for sporadic arrivals.
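A minimal sketch of that trigger, assuming a 2nd gen Cloud Function with a Cloud Storage finalize trigger and a precompiled pipeline definition at a placeholder path:

```python
import functions_framework
from google.cloud import aiplatform


@functions_framework.cloud_event
def trigger_pipeline(cloud_event):
    """Starts a Vertex AI Pipelines run when a new object is finalized in the bucket."""
    data = cloud_event.data
    gcs_uri = f"gs://{data['bucket']}/{data['name']}"

    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.PipelineJob(
        display_name="retrain-on-new-data",
        template_path="gs://my-bucket/pipelines/train_pipeline.json",
        parameter_values={"input_uri": gcs_uri},
    )
    # submit() returns immediately, so the function stays short lived and cheap.
    job.submit()
```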

Cloud Composer with GCS sensor is not ideal when minimizing cost for sporadic events because the environment must run continuously and sensors often wait for files which adds baseline cost even if deferrable sensors reduce worker usage.

Vertex AI Pipelines via Cloud Scheduler is time based rather than event driven so it does not start immediately on new object creation and can trigger when no new data has arrived which wastes compute.

Dataflow streaming on GCS notifications can process notifications in real time but a streaming job runs continuously and keeps workers active which increases ongoing cost and it is unnecessary when the requirement is only to start a pipeline run on file arrival.

When a question asks for triggers on new Cloud Storage objects and low cost think event driven and scale to zero. Prefer serverless triggers that call an API only on file arrival rather than always on schedulers or long running workers.

Question 19

Solstice Analytics has built a Vertex AI Pipelines workflow that trains custom models with about 50 executions each week. The team wants the easiest way to collaborate when comparing metrics across executions through both programmatic access and interactive visualizations in Google Cloud. What should you add to the pipeline and which tools should the team use?

  • ✓ B. Add a pipeline component that records metrics to Vertex ML Metadata, then compare runs with Vertex AI Experiments and visualize with Vertex AI TensorBoard

The correct option is Add a pipeline component that records metrics to Vertex ML Metadata, then compare runs with Vertex AI Experiments and visualize with Vertex AI TensorBoard.

This is the most seamless approach for teams using Vertex AI Pipelines because Vertex ML Metadata is natively integrated with pipeline executions and automatically ties metrics and parameters to runs. Vertex AI Experiments lets the team compare runs interactively in the console and also provides programmatic access through the SDK so you can query and analyze results in code. Managed Vertex AI TensorBoard gives rich, interactive visualizations of scalars and other artifacts, is easy to share within the project, and requires minimal setup.
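For instance, a small KFP component can emit metrics that land in Vertex ML Metadata and surface in Experiments for comparison; the metric names and values here are placeholders:

```python
from kfp import dsl
from kfp.dsl import Metrics, Output


@dsl.component(base_image="python:3.10")
def evaluate_model(accuracy_in: float, metrics: Output[Metrics]):
    # Metrics logged here are attached to the pipeline execution in Vertex ML Metadata
    # and appear in the Vertex AI console for side by side run comparison.
    metrics.log_metric("accuracy", accuracy_in)
    metrics.log_metric("auc_roc", 0.93)
```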

Add a pipeline component that writes metrics to a BigQuery table named ml_metrics_ds.train_metrics_v3, then use SQL to compare runs and visualize through Looker Studio can work, yet it is not the easiest in Vertex AI. It requires designing and maintaining schemas, writing ingestion logic, and building and maintaining dashboards, and it does not automatically link metrics to pipeline lineage or the run comparison features that Experiments provides.

Add a pipeline component that records metrics to Vertex ML Metadata, then export them into a pandas DataFrame to compare runs and render plots with Matplotlib supports programmatic analysis but does not satisfy the need for interactive, collaborative visualizations in Google Cloud. This approach lacks a shared UI for side by side run comparison and requires custom plotting and hosting that the team would need to manage.

Add a pipeline component that pushes metrics to Vertex AI Model Registry, then monitor metrics with Cloud Monitoring dashboards misuses the services because the Model Registry is for model versions and related artifacts rather than per run training metrics, and Cloud Monitoring focuses on operational telemetry rather than custom experiment metrics. This does not provide an integrated experiment tracking and comparison experience.

When a question asks for the easiest collaboration across runs with both programmatic access and interactive visuals, think of Vertex AI Experiments with ML Metadata for tracking and Managed TensorBoard for visualization rather than building custom storage and dashboards.

Question 0

In a photo policy classifier, which Vertex AI AutoML objective should you choose to minimize false negatives for the noncompliant class?

  • ✓ B. Vertex AI AutoML high recall on noncompliant

The correct option is Vertex AI AutoML high recall on noncompliant.

Vertex AI AutoML high recall on noncompliant aligns with the goal of minimizing false negatives for the noncompliant class because recall measures how many actual positives are correctly identified. Recall equals true positives divided by true positives plus false negatives, so maximizing it directly drives down the number of missed noncompliant items.
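A quick arithmetic check with illustrative counts shows why recall is the lever that controls missed noncompliant photos:

```python
# Suppose 200 images are truly noncompliant and the model flags 188 of them.
true_positives = 188
false_negatives = 12   # noncompliant images the model missed

recall = true_positives / (true_positives + false_negatives)
print(round(recall, 3))  # 0.94, so 6% of noncompliant images still slip through
```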

Vertex AI AutoML maximize F1 score optimizes a balance between precision and recall and does not specifically minimize false negatives. When missing positives is costly, a balanced objective can still allow more false negatives than a recall focused choice.

Vertex AI AutoML high precision on noncompliant aims to reduce false positives rather than false negatives. This usually comes at the expense of recall, which can increase misses of noncompliant items.

Map the error type to the metric. Choose recall when minimizing false negatives and choose precision when minimizing false positives. Use F1 only when you need a balance.

Question 1

You are designing a product recommendation system for a midmarket home furnishings retailer. Three years of purchase history is stored in BigQuery and roughly 90 GB of clickstream logs are saved as CSV files in Cloud Storage. You need to run exploratory analysis, clean data, and train models repeatedly while trying different algorithms, and you want to keep costs and setup effort low. How should you set up your working environment?

  • ✓ C. Provision a Vertex AI Workbench user managed notebook on the default machine type and use the %%bigquery magic in Jupyter to query BigQuery tables

The correct option is Provision a Vertex AI Workbench user managed notebook on the default machine type and use the %%bigquery magic in Jupyter to query BigQuery tables.

This setup lets you iterate quickly while keeping costs down because most heavy data processing remains in BigQuery and the notebook only orchestrates and visualizes results. You can load the clickstream CSVs into BigQuery or define external tables over Cloud Storage so you query the data in place, then use the magic to pull only the results you need into the notebook. You also get full control over Python libraries for data cleaning and model experimentation, and you can stop the VM when you are done to avoid charges.
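A typical notebook cell then looks like the sketch below, assuming the BigQuery cell magic has been loaded with %load_ext google.cloud.bigquery and using placeholder project, dataset, and column names:

```python
%%bigquery purchases_df
SELECT customer_id,
       COUNT(*) AS orders,
       SUM(order_total) AS lifetime_value
FROM `my-project.sales.purchase_history`
GROUP BY customer_id
```

The aggregation runs entirely in BigQuery and only the summarized rows land in the purchases_df DataFrame, so the notebook VM can stay small and inexpensive.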

Start a Vertex AI Workbench managed notebook and use the BigQuery integration in JupyterLab to browse datasets and run queries is less suitable because the JupyterLab plugin is convenient for browsing but more limited for repeatable data science workflows than the magic and client libraries. Managed notebooks also give you less control over the environment while offering no clear cost advantage for this use case.

Use BigQuery Studio with BigQuery ML to explore data and build models directly in the BigQuery console does not fit well because BQML supports a defined set of algorithms and is SQL centric, which restricts trying many different libraries and custom preprocessing steps. You would still need a separate environment for Python based feature engineering and model experimentation.

Attach a Vertex AI Workbench managed notebook to a Dataproc cluster and access BigQuery through the Spark BigQuery connector adds unnecessary setup and ongoing cost for this workload. Ninety gigabytes can be explored efficiently with serverless BigQuery, and spinning up Spark clusters is better reserved for distributed processing needs that exceed what BigQuery can handle directly.

First decide where the heavy work should run and prefer keeping large scale processing in serverless services like BigQuery while using notebooks only to orchestrate and visualize. Choose the simplest tool that covers iterative analysis and modeling needs, and avoid Spark clusters unless you truly need distributed compute. Pull samples with the %%bigquery magic instead of moving full datasets into the notebook.

Question 2

What is the simplest way to use a BigQuery ML logistic regression model for real time predictions within a Dataflow streaming pipeline while keeping latency under 30 milliseconds per event?

  • ✓ C. In pipeline TensorFlow RunInference in Dataflow

The correct option is In pipeline TensorFlow RunInference in Dataflow. This embeds the model directly in the streaming pipeline so predictions run in memory on the worker and can meet a target of less than 30 milliseconds per event.

A BigQuery ML logistic regression can be exported as a TensorFlow SavedModel and stored in Cloud Storage, then loaded once per worker and reused for per element scoring. Keeping inference inside the pipeline removes network hops and SQL job startup costs which makes latency predictable and very low. This approach is also simple to operate because it uses the managed Dataflow workers and a built in Beam transform for inference rather than introducing an external serving system.
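A sketch of that pattern, assuming the BigQuery ML model has already been exported as a SavedModel to a placeholder Cloud Storage path (for example with the EXPORT MODEL statement) and that events arrive on Pub/Sub, which is an assumption for this example:

```python
import numpy as np
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerNumpy


def parse_event(message: bytes) -> np.ndarray:
    # Placeholder parsing: real code must match the exported model's serving signature.
    return np.array([float(v) for v in message.decode("utf-8").split(",")], dtype=np.float32)


def run():
    # The handler loads the exported SavedModel once per worker and keeps it in memory.
    model_handler = TFModelHandlerNumpy(model_uri="gs://my-bucket/models/ctr_logreg/")

    options = PipelineOptions(streaming=True)  # plus Dataflow runner and project flags
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                subscription="projects/my-project/subscriptions/clicks")
            | "ToFeatures" >> beam.Map(parse_event)
            | "Score" >> RunInference(model_handler)
            | "Format" >> beam.Map(lambda r: str(r.inference).encode("utf-8"))
            | "WriteScores" >> beam.io.WriteToPubSub(
                topic="projects/my-project/topics/click-scores")
        )


if __name__ == "__main__":
    run()
```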

Vertex AI online prediction is not ideal for this target because each event would require an external network call from the worker which often adds more than 30 milliseconds and increases operational complexity.

Per event ML.PREDICT in BigQuery is unsuitable because issuing a query per element introduces query planning and service overhead, and BigQuery is not designed as a low latency per request prediction service inside a streaming pipeline.

Cloud Run with TensorFlow Serving adds an extra service hop and potential cold starts which makes consistent sub 30 millisecond latency difficult and it also moves inference outside the pipeline which increases complexity.

When the requirement is very low latency inside a streaming pipeline, look for in-process inference patterns that avoid network hops and per request queries. If a BigQuery ML model can be exported to a SavedModel then using a built in Dataflow transform is usually the most direct path.

Question 3

At Northstar Retail’s analytics group the machine learning team needs to run many quick trials using different feature sets, model variants, and tuning parameters. They want a low maintenance approach that automatically records accuracy metrics for every run and allows engineers to retrieve those metrics over time through an API for comparisons and dashboards. Which approach should they adopt?

  • ✓ C. Use Kubeflow Pipelines to orchestrate experiments, export metrics from steps, and retrieve run metrics with the Kubeflow Pipelines API

The correct option is Use Kubeflow Pipelines to orchestrate experiments, export metrics from steps, and retrieve run metrics with the Kubeflow Pipelines API. This choice matches the need for many quick trials, automatic recording of accuracy metrics, and programmatic retrieval over time.

This approach is designed for experiment orchestration and tracking. You can emit metrics from components in a standard format so the system captures them for each run without extra maintenance. The run metadata persists these values and the client API lets engineers query runs and compare metrics, which makes building dashboards and long term comparisons straightforward.
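On the retrieval side, a short script against the Kubeflow Pipelines client API might look like the following; the host URL is a placeholder and the exact run fields vary by KFP SDK version:

```python
from kfp import Client

# Connect to the Kubeflow Pipelines API endpoint (URL is a placeholder).
client = Client(host="https://my-kfp-endpoint.example.com")

# Page through recent runs; each run object carries the metadata and metrics its steps exported.
response = client.list_runs(page_size=50, sort_by="created_at desc")
for run in response.runs:
    # Inspect the run objects to pick out the metric entries you want to chart or compare.
    print(run)
```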

Use Vertex AI Training to run jobs, write accuracy metrics to BigQuery, and access them with the BigQuery API is not the best fit because it requires building and maintaining custom logging code and schemas. It lacks native experiment lineage and run level comparison features, so the team would shoulder more operational work.

Use Vertex AI Training to execute runs and send accuracy values to Cloud Monitoring, then read them with the Monitoring API is not ideal because Monitoring focuses on time series for infrastructure and service health. While custom metrics are possible, it is not well suited for structured experiment tracking across runs and parameters and can introduce cardinality and retention constraints.

Use Vertex AI Workbench notebooks to run tests and log results in a shared Google Sheets workbook, then fetch values with the Google Sheets API is a manual and fragile workflow. It does not scale, it lacks strong versioning and lineage, and it increases the chance of human error, which conflicts with the requirement for a low maintenance and reliable solution.

When a question emphasizes automatic run metric capture and an accessible API for long term comparisons, look for native experiment tracking and pipeline orchestration tools rather than ad hoc logging solutions.

Question 4

Only 0.2% of labels are failures and a standard training approach predicts the majority class. How should you address this extreme class imbalance so the model learns to detect the rare failure class?

  • ✓ B. Downsample negatives and apply class weights for about 20% failures per batch

The correct option is Downsample negatives and apply class weights for about 20% failures per batch.

This approach directly confronts the severe imbalance by ensuring each mini-batch contains a meaningful proportion of failure examples so the model receives frequent gradient signals from the minority class. Weighting the classes then compensates for the altered batch composition so the optimization still reflects the true data distribution and prevents the learner from overfitting to the artificially balanced batches.
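A sketch of that recipe in TensorFlow with synthetic stand-in data shows both pieces working together; the architecture and class weights are illustrative and would need tuning against validation metrics:

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data where roughly 0.2% of labels are failures (class 1).
features = np.random.rand(100_000, 16).astype("float32")
labels = (np.random.rand(100_000) < 0.002).astype("int32")

pos_ds = tf.data.Dataset.from_tensor_slices((features[labels == 1], labels[labels == 1])).repeat()
neg_ds = tf.data.Dataset.from_tensor_slices((features[labels == 0], labels[labels == 0])).repeat()

# Resample so roughly 20% of each batch comes from the failure class.
balanced_ds = tf.data.Dataset.sample_from_datasets(
    [neg_ds, pos_ds], weights=[0.8, 0.2]
).batch(256)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Recall()])

# Class weights compensate for the resampled batches; the values here are illustrative.
model.fit(balanced_ds, steps_per_epoch=200, epochs=5, class_weight={0: 1.0, 1: 25.0})
```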

Vertex AI Vizier performs hyperparameter tuning and optimization rather than changing data balance during training. It does not modify class prevalence in the training signal and therefore will not by itself make the model learn the rare class.

Lower the classification threshold during inference only adjusts how probabilities are turned into labels after training. It does not help the model learn minority patterns during training and can increase false positives without improving the underlying representation.

When you see extreme class imbalance, think about fixing the training signal with resampling and class weights first, then fine tune the threshold only after the model can recognize the minority class.

Question 5

At the streaming service mcnz.com your team has deployed a model on a Vertex AI endpoint and a Vertex AI Pipeline retrains the model when a Cloud Function is invoked. You want to keep the model current while keeping training spend predictable and low. How should you trigger retraining to balance freshness and cost?

  • ✓ C. Enable Vertex AI model monitoring for input feature drift and publish notifications to Pub/Sub that invoke the Cloud Function when drift thresholds are exceeded

The correct option is Enable Vertex AI model monitoring for input feature drift and publish notifications to Pub/Sub that invoke the Cloud Function when drift thresholds are exceeded.

This approach uses Vertex AI monitoring to detect when the live input distribution diverges from the training baseline beyond your chosen thresholds and only then publishes to Pub/Sub to trigger the Cloud Function that starts the retraining pipeline. You gain freshness because retraining happens when the data meaningfully changes and you keep costs low and predictable because you control sensitivity through thresholds and windows rather than retraining on every schedule tick.
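A sketch of the monitoring configuration with the Vertex AI SDK follows, using placeholder feature names, thresholds, and resource IDs; routing the logged anomalies through a log sink to Pub/Sub and on to the Cloud Function is wired up separately:

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

aiplatform.init(project="my-project", location="us-central1")
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890")

# Alert only when live input features drift past these per feature thresholds.
drift_config = model_monitoring.DriftDetectionConfig(
    drift_thresholds={"watch_time_minutes": 0.3, "device_type": 0.2}
)
objective_config = model_monitoring.ObjectiveConfig(drift_detection_config=drift_config)

monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="endpoint-drift-monitor",
    endpoint=endpoint,
    objective_configs=objective_config,
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=0.2),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=6),  # hours
    # enable_logging writes detected anomalies to Cloud Logging, which a sink can forward to Pub/Sub.
    alert_config=model_monitoring.EmailAlertConfig(
        user_emails=["mlops@mcnz.com"], enable_logging=True),
)
```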

Enable model monitoring on the Vertex AI endpoint for anomaly detection and have Pub/Sub notify the Cloud Function when anomalies are found is not targeted to input feature drift thresholds that signal the need to retrain. Generic anomaly alerts can be noisy and may fire on transient outliers which can lead to unnecessary retraining and unpredictable spend.

Configure a Cloud Scheduler job to invoke the Cloud Function on a fixed cadence that aligns with the budget gives predictable timing but it is not data driven and can retrain when nothing has changed or miss important shifts between runs. This wastes compute or hurts freshness and does not balance both goals.

Create a Cloud Monitoring alert on endpoint latency anomalies and call the Cloud Function through a webhook when the alert fires ties retraining to serving performance rather than model quality or data drift. Latency issues are usually infrastructure or traffic related so retraining the model is unlikely to help and this would create unnecessary cost.

Prefer data driven retraining triggers such as feature drift or skew thresholds over fixed schedules. Tune thresholds and monitoring windows to control both freshness and spend.

Jira, Scrum & AI Certification

Want to get certified on the most popular software development technologies of the day? These resources will help you get Jira certified, Scrum certified and even AI Practitioner certified so your resume really stands out.

You can even get certified in the latest AI, ML and DevOps technologies. Advance your career today.

Cameron McKenzie is an AWS Certified AI Practitioner, Machine Learning Engineer, Copilot Expert, Solutions Architect and author of many popular books in the software development and Cloud Computing space. His growing YouTube channel training devs in Java, Spring, AI and ML has well over 30,000 subscribers.