Google Machine Learning Certification Sample Questions

All GCP questions come from my Google ML Udemy Course and certificationexams.pro
Free GCP Professional Machine Learning Engineer Exam Topics Test
The Google Cloud Professional Machine Learning Engineer certification validates your ability to design, build, and productionize machine learning models that support scalable, efficient, and reliable business solutions.
It focuses on the key domains of ML engineering, including data preparation, model development, deployment, and monitoring on Google Cloud.
To prepare effectively, begin with the GCP Professional Machine Learning Engineer Practice Questions.
These questions mirror the tone, logic, and structure of the real certification exam and help you become familiar with Google’s question style and reasoning approach.
You can also explore Real Google Cloud Machine Learning Certification Exam Questions for authentic, scenario-based challenges that simulate real-world machine learning engineering tasks.
For focused study, review GCP Professional ML Engineer Sample Questions covering data preprocessing, model evaluation, deployment with Vertex AI, and MLOps automation.
Google Certified Machine Learning Exam Simulator
Each section of the GCP Certified Professional ML Engineer Questions and Answers collection is designed to teach as well as test. These materials reinforce essential ML concepts and provide clear explanations that help you understand why specific responses are correct, preparing you to think like a professional ML engineer on Google Cloud.
For complete readiness, use the Google Certified Machine Learning Exam Simulator and take full-length Google Certified Professional ML Engineer Practice Tests. These simulations reproduce the pacing and structure of the actual Google Cloud certification exam so you can manage your time effectively and gain confidence under test conditions.
Google Cloud Certification Practice Exams
If you prefer focused study sessions, try the Google Machine Learning Engineer Certification Exam Dump, the Professional GCP Machine Learning Engineer Certification Braindump, and other curated sets of GCP Certified Professional ML Engineer Questions and Answers.
Working through these Google Machine Learning Engineer Certification Exam Questions builds both analytical and practical skills needed to design, train, and deploy ML models effectively.
By mastering these exercises, you will be ready to deliver AI solutions, maintain governance, and support enterprise ML initiatives on Google Cloud.
Start your preparation today with the GCP Professional Machine Learning Engineer Practice Questions.
Train using the Google Certified Machine Learning Exam Simulator and measure your progress with comprehensive practice tests. Prepare to earn your certification and advance your career as a trusted Google Cloud Machine Learning Engineer.
Google Cloud Machine Learning Professional Certification Sample Questions

Question 1
A commerce platform at scrumtuous.com trained an image classification model on premises with local product photos and now wants to deploy it on Vertex AI. Compliance rules prevent copying any raw datasets to the cloud. The team anticipates the distribution of incoming images will evolve over the next 90 days and they need to detect any production performance degradation caused by those shifts. What should they do?
-
❏ A. Export prediction logs to Cloud Monitoring and alert on response latency and error counts
-
❏ B. Use Vertex Explainable AI and configure feature based explanations
-
❏ C. Create a Vertex AI Model Monitoring job and enable training and serving skew detection for the endpoint
-
❏ D. Create a Vertex AI Model Monitoring job and enable feature attribution skew and drift monitoring
Question 2
Which solution provides automatic end to end lineage for training artifacts and for each batch prediction output in a scheduled workflow that retrains a custom model on 60 million rows every two weeks and runs batch predictions every 90 days?
-
❏ A. Vertex AI Experiments with Model Registry and batch prediction
-
❏ B. Kubeflow Pipelines on GKE with custom lineage tracking
-
❏ C. Vertex AI Pipelines using CustomTrainingJob and BatchPredict components
-
❏ D. Cloud Composer with custom training and Vertex AI batch prediction plus BigQuery metadata
Question 3
You are the machine learning engineer at QuizRush, a live trivia platform. Over the past two weeks the service has seen a spike in automated cheating that reduces revenue and frustrates honest players. You trained a binary classifier that determines whether a participant cheated once a match ends and a downstream process immediately suspends flagged accounts. The model performed well in validation and you must now deploy it. The system processes about 120,000 matches per day and requires prediction latency under 250 milliseconds after each match completes. You need a serving approach that returns decisions immediately so the workflow can react in real time. What should you do?
-
❏ A. Publish match-complete events to Pub/Sub and process them with a Dataflow streaming job that runs model inference in micro-batches
-
❏ B. Store the model in Cloud Storage and have a Cloud Function load the model file on each request to serve online predictions
-
❏ C. Import the model into Vertex AI Model Registry then deploy it to a Vertex AI endpoint and send online prediction requests
-
❏ D. Import the model into Vertex AI Model Registry then use Vertex AI Batch Prediction to run scheduled inference jobs
Question 4
You trained a next day time series model with randomly shuffled splits and achieved 92% validation accuracy, but production accuracy is 58%. What change to your data splitting and evaluation should you make to better match production performance?
-
❏ A. Use K-fold cross-validation
-
❏ B. Use chronological train and validation splits where past trains and future validates
-
❏ C. Normalize training and validation independently
-
❏ D. Fit preprocessing on the full dataset before splitting
Question 5
You are working for a subscription news platform called Beacon Media, and you need to forecast which subscribers are likely to cancel. You have 18 months of historical data that includes demographics, purchase records, and clickstream interactions from example.com. You must train the model using BigQuery ML and perform an in-depth evaluation of classification performance within BigQuery. What should you do?
-
❏ A. Train a linear regression model in BigQuery ML and evaluate it with ML.EVALUATE
-
❏ B. Use BigQuery ML boosted_tree_regressor and review ML.FEATURE_IMPORTANCE for insights
-
❏ C. Build a logistic regression model in BigQuery ML and compute a confusion matrix with ML.CONFUSION_MATRIX
-
❏ D. Train a logistic regression model in BigQuery ML then register it in Vertex AI Model Registry and examine metrics in Vertex AI
Question 6
Which approach provides low maintenance detection of training and serving skew for a Vertex AI endpoint by using a BigQuery training table as the baseline and supports retraining every 30 days?
-
❏ A. Dataplex data quality rules
-
❏ B. Use Vertex AI Model Monitoring with BigQuery baseline
-
❏ C. Prediction logging to BigQuery with scheduled queries
-
❏ D. Model Monitoring with Cloud Logging and Cloud Functions retraining
Question 7
FinServe Labs trains a recommendation model using datasets purchased from an external aggregator, and the supplier sometimes changes column names, types, and categorical encodings without notice which has caused training jobs to fail. You want to harden your Vertex AI training pipeline so it quickly detects format and schema shifts from the upstream source and blocks bad data from propagating. What should you do?
-
❏ A. Implement custom TensorFlow checks at the start of training to catch only known formatting issues
-
❏ B. Create Dataplex data quality rules to validate incoming records before they reach the training dataset
-
❏ C. Configure TensorFlow Data Validation to infer a schema and surface anomalies from upstream changes
-
❏ D. Use TensorFlow Transform to normalize features and replace any values that violate the schema with 0
Question 8
Which approach enables recurring batch predictions on the 25 TB BigQuery table foo.content_2027 with minimal data movement and operational overhead, given that the TensorFlow model was trained on Vertex AI?
-
❏ A. Cloud Run job that reads BigQuery and calls Vertex AI online prediction
-
❏ B. Vertex AI Batch Predictions with BigQuery export to Cloud Storage
-
❏ C. Import TensorFlow SavedModel into BigQuery ML and run ML.PREDICT
-
❏ D. Dataflow pipeline that reads BigQuery and invokes the SavedModel
Question 9
Your team at BlueCar Labs trained a model that depended on compute intensive feature engineering during training, and the same transformations are required at inference time. The model is serving from AI Platform for low latency and high throughput online predictions. What architecture should you adopt to ensure scalable preprocessing during serving?
-
❏ A. Publish requests to Pub/Sub and trigger a Cloud Function that applies preprocessing then invokes AI Platform prediction and writes results to a second Pub/Sub topic
-
❏ B. Stream requests into BigQuery and expose the preprocessing as a SQL view then poll for ready rows and send them to AI Platform and publish outputs to Pub/Sub
-
❏ C. Send prediction messages to Pub/Sub then run the preprocessing in a Dataflow pipeline before calling AI Platform online prediction and publish the predictions to an output Pub/Sub topic
-
❏ D. Train a new model that accepts raw inputs without preprocessing and deploy it on AI Platform for real time prediction
Question 10
On Google Cloud, which approach best scales TensorFlow Transformer NMT training on approximately 15 million sentence pairs stored in Cloud Storage while requiring minimal code changes and no cluster management?
-
❏ A. Vertex AI custom training with MultiWorkerMirroredStrategy on A2 GPUs
-
❏ B. Vertex AI training on Cloud TPU VMs with TPUStrategy
-
❏ C. GKE Autopilot with Horovod on GPU nodes
-
❏ D. Vertex AI custom training with tf.distribute.ParameterServerStrategy

Question 11
You are preparing a classification model in Vertex AI using a BigQuery table from example.com. During exploratory analysis you find that a categorical field named plan_tier shows strong predictive signal yet a portion of its entries are null. What should you do to retain the information from this feature while addressing the null values?
-
❏ A. Compute the mode of the column and replace all missing entries with that value
-
❏ B. Introduce a dedicated “missing” category for that column and add a new binary indicator that marks whether the value was missing
-
❏ C. Drop the column if more than 25% of its values are missing otherwise keep it unchanged
-
❏ D. Use a Dataflow pipeline to discard every record where the column is null before training
Question 12
After 45 days in production, a semantic segmentation model shows lower PR AUC than in offline validation. It performs well on sparse nighttime scenes but fails in dense rush hour traffic. What is the most plausible explanation?
-
❏ A. PR AUC is the wrong metric for segmentation
-
❏ B. Model overfit to sparse scenes and underfit dense traffic
-
❏ C. Training and serving preprocessing mismatch
-
❏ D. Training data overrepresented congested scenes
Question 13
At Meridian Clinical Research you are tasked with building natural language models to analyze clinician notes that include custom tags created by your annotation team. The data consists of free text with these labels and you need to detect spans and assign categories for a wide variety of medical terms based on the provided annotations. What should you do?
-
❏ A. Healthcare Natural Language API
-
❏ B. Train a custom model using AutoML Entity Extraction with your annotated medical text
-
❏ C. Fine-tune a BERT model for entity extraction using your labels on Vertex AI Training
-
❏ D. Document AI
Question 14
In Vertex AI, which feature allows you to explore an image classifier's latent space and retrieve similar examples to investigate misclassifications made with high confidence?
-
❏ A. Vertex AI Matching Engine
-
❏ B. Example-based explanations with an embedding layer
-
❏ C. Vertex AI Model Monitoring
Question 15
Riverline Gear operates a production demand forecasting pipeline. Raw events are stored in BigQuery and a Dataflow job applies Z score normalization to features and writes the transformed data back to BigQuery. New training data arrives every three days. You want to reduce processing time and ongoing manual effort while keeping the workflow simple and avoiding extra platforms. What should you do?
-
❏ A. Create a Cloud Run job that queries BigQuery applies Z score normalization in Python and writes the output to a results table
-
❏ B. Express Z score normalization in BigQuery SQL then implement it as a view or a scheduled query that materializes the results
-
❏ C. Normalize with Dataproc Serverless Spark using the BigQuery connector and write the standardized features back to BigQuery
-
❏ D. Move preprocessing into a Vertex AI Pipelines component that uses tf.transform to standardize features before training
Question 16
Which action can reduce per request latency for CPU based TensorFlow Serving deployed on Google Kubernetes Engine without changing the architecture?
-
❏ A. Tune inter_op and intra_op thread counts in TensorFlow Serving
-
❏ B. Build TensorFlow Serving with CPU specific optimizations and set a matching minimum CPU platform on the GKE node pool
-
❏ C. Dramatically raise the max_batch_size setting in TensorFlow Serving
-
❏ D. Switch to the tensorflow-model-server-universal build of TensorFlow Serving
Question 17
At Asteria Retail you manage Vertex AI Pipelines that train models and deploy them to a Vertex AI endpoint for online predictions. You plan to use Cloud Build for continuous integration and continuous delivery so that teams can experiment frequently while keeping production stable. You want a release process that promotes updated pipeline versions to production quickly and reduces the chance that a new pipeline disrupts the live service. What should you do?
-
❏ A. Create a Cloud Build workflow that compiles and tests the repository, then use the Google Cloud console to push the container to Artifact Registry and upload the compiled pipeline to Vertex AI Pipelines
-
❏ B. Configure Cloud Build to build the code and deploy artifacts to a staging environment, then after a successful staging execution rebuild from the main branch and release to production
-
❏ C. Configure Cloud Build to build and test the source, deploy the validated artifacts to a staging Vertex AI Pipelines environment, run the pipeline successfully in staging, then promote the same artifacts to production
-
❏ D. Set up Cloud Build to compile the code and deploy to a staging environment, run only unit tests in staging, and if the tests pass deploy the pipeline to production
Question 18
How should you apply MinMax scaling to numeric columns and one hot encoding to categorical features on a BigQuery table of about 300 million rows to minimize engineering effort and cost when training a TensorFlow classifier in Vertex AI Training?
-
❏ A. Perform numeric scaling in BigQuery and let TensorFlow one hot encode during training
-
❏ B. Vertex AI Feature Store transformations with offline materialization for training
-
❏ C. BigQuery SQL preprocessing with min max lookup and one hot in a view for training
-
❏ D. Use TFX Transform on Dataflow and write TFRecords to Cloud Storage before training
Question 19
Your team at Blue Harbor Outfitters built a Vertex AI forecasting model that generates monthly demand estimates for each SKU. You need to quickly assemble a stakeholder report that explains how the model arrives at its predictions, and you have two months of recent actual sales that were excluded from training. What should you do to produce the data for this report?
-
❏ A. Run a batch prediction on the recent actual sales and compare the predicted values to the actuals
-
❏ B. Enable Vertex AI Model Monitoring on the deployed endpoint and use its drift metrics to describe the predictions
-
❏ C. Create a batch prediction on the recent actual sales and enable feature attributions, then include the attributions alongside the predictions in the report
-
❏ D. Generate counterfactual datasets from the recent actuals and run batch predictions on both the original and counterfactual data, then compare the results in the report
Question 20
Which managed service offers a simple feature store that serves online features in under 20 milliseconds and supports point in time historical retrieval for up to 150 days?
-
❏ A. Cloud Bigtable
-
❏ B. Vertex AI Feature Store service
-
❏ C. Cloud Spanner
-
❏ D. Cloud Firestore
Certified GCP Machine Learning Professional Sample Questions Answered

Question 1
A commerce platform at scrumtuous.com trained an image classification model on premises with local product photos and now wants to deploy it on Vertex AI. Compliance rules prevent copying any raw datasets to the cloud. The team anticipates the distribution of incoming images will evolve over the next 90 days and they need to detect any production performance degradation caused by those shifts. What should they do?
-
✓ C. Create a Vertex AI Model Monitoring job and enable training and serving skew detection for the endpoint
The correct choice is Create a Vertex AI Model Monitoring job and enable training and serving skew detection for the endpoint.
Model Monitoring is designed to detect changes in the distribution of production inputs compared with the training baseline, which addresses the need to catch performance degradation as the incoming image distribution evolves over the next 90 days. With training and serving skew detection, Vertex AI computes baseline statistics from the training data and continuously compares them to live traffic. You can satisfy the compliance constraint by using baseline feature statistics instead of copying raw training examples, which lets you monitor for data shift without moving the original datasets to the cloud.
Export prediction logs to Cloud Monitoring and alert on response latency and error counts focuses on system health rather than data quality. Latency and error alerts do not reveal shifts in the input distribution or model performance degradation caused by drift.
Use Vertex Explainable AI and configure feature based explanations provides interpretability for individual predictions and helps you understand feature influence, but it does not monitor ongoing production data for skew or drift over time and therefore will not reliably detect distribution shift.
Create a Vertex AI Model Monitoring job and enable feature attribution skew and drift monitoring targets shifts in explanation attributions, which is useful when you already generate attributions, yet it is not the primary mechanism to detect distribution changes in input features or overall performance degradation. It also requires explanation configuration and additional compute that is unnecessary for the stated goal.
When you see a requirement to detect data skew or drift over time, think of Vertex AI Model Monitoring. Choose training serving skew for comparing production inputs to the training baseline and choose prediction drift for tracking shifts over rolling windows.
Question 2
Which solution provides automatic end to end lineage for training artifacts and for each batch prediction output in a scheduled workflow that retrains a custom model on 60 million rows every two weeks and runs batch predictions every 90 days?
-
✓ C. Vertex AI Pipelines using CustomTrainingJob and BatchPredict components
The correct option is Vertex AI Pipelines using CustomTrainingJob and BatchPredict components. This managed pipeline approach gives automatic end to end lineage for training artifacts and for each batch prediction output, and it supports scheduled retraining every two weeks and periodic batch predictions every ninety days.
In this approach, the pipeline records inputs and outputs to Vertex ML Metadata which powers Vertex AI Lineage. The CustomTrainingJob step captures datasets, models, and metrics as artifacts, and the Batch Predict step logs batch outputs as lineage tracked artifacts. You can set up recurring runs so the retraining and batch prediction cadence is automated, and the managed services scale to datasets with tens of millions of rows.
Vertex AI Experiments with Model Registry and batch prediction focuses on experiment tracking and model versioning. It does not orchestrate an end to end workflow with automatic lineage of every batch prediction output across scheduled runs.
Kubeflow Pipelines on GKE with custom lineage tracking can be made to work but it requires you to operate the platform and implement custom lineage capture. This is not automatic and adds operational overhead compared to the managed lineage that pipelines on Vertex AI provide.
Cloud Composer with custom training and Vertex AI batch prediction plus BigQuery metadata uses Airflow for orchestration, yet lineage would be manual and BigQuery metadata targets data lineage for tables rather than ML artifacts and model relationships. This does not deliver automatic end to end lineage for training artifacts and batch outputs.
When a scenario asks for automatic lineage across training and predictions in a recurring schedule, prefer a managed pipeline with built in components over custom orchestration or experiment tracking tools.
Question 3
You are the machine learning engineer at QuizRush, a live trivia platform. Over the past two weeks the service has seen a spike in automated cheating that reduces revenue and frustrates honest players. You trained a binary classifier that determines whether a participant cheated once a match ends and a downstream process immediately suspends flagged accounts. The model performed well in validation and you must now deploy it. The system processes about 120,000 matches per day and requires prediction latency under 250 milliseconds after each match completes. You need a serving approach that returns decisions immediately so the workflow can react in real time. What should you do?
-
✓ C. Import the model into Vertex AI Model Registry then deploy it to a Vertex AI endpoint and send online prediction requests
The correct option is Import the model into Vertex AI Model Registry then deploy it to a Vertex AI endpoint and send online prediction requests.
This approach provides synchronous inference with low latency. A Vertex AI endpoint keeps the model container warm so it avoids per request loading overhead and can meet a 250 millisecond target. Online prediction supports autoscaling and traffic management which comfortably handles about 120,000 requests per day and returns decisions immediately so the suspension workflow can react in real time. Using the Model Registry gives you versioning and controlled rollouts which simplifies safe promotion to production.
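To illustrate the flow, here is a minimal sketch using the Vertex AI SDK for Python. The project, model artifact path, serving container image, and feature payload are placeholders rather than values from the scenario, so treat this as an outline rather than a definitive implementation.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholder project and region

# Register the trained model, then deploy it to a warm, low latency online endpoint.
model = aiplatform.Model.upload(
    display_name="cheat-detector",
    artifact_uri="gs://my-bucket/models/cheat-detector/",  # placeholder SavedModel location
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-12:latest",
)
endpoint = model.deploy(machine_type="n1-standard-4", min_replica_count=1, max_replica_count=3)

# Synchronous online prediction returns a decision immediately after each match completes.
response = endpoint.predict(instances=[{"match_features": [0.12, 0.87, 0.03]}])
print(response.predictions)
```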
Publish match-complete events to Pub/Sub and process them with a Dataflow streaming job that runs model inference in micro-batches is asynchronous and introduces queuing and micro-batch latency. Even well tuned streaming pipelines do not provide a per request synchronous response right after a match finishes, so this does not meet the immediate decision requirement.
Store the model in Cloud Storage and have a Cloud Function load the model file on each request to serve online predictions suffers from cold starts and repeated model loading which adds substantial initialization time and network I/O. This overhead commonly pushes latency beyond 250 milliseconds and is unreliable under bursty traffic.
Import the model into Vertex AI Model Registry then use Vertex AI Batch Prediction to run scheduled inference jobs is designed for offline processing of large datasets. Batch jobs do not return results immediately after each match and therefore cannot satisfy real time suspension needs.
Scan for words like immediate and low latency and per request. These point to online prediction on managed endpoints. If you see batch or micro batch streaming then it is usually asynchronous and not suitable for real time decisions.
Question 4
You trained a next day time series model with randomly shuffled splits and achieved 92% validation accuracy, but production accuracy is 58%. What change to your data splitting and evaluation should you make to better match production performance?
-
✓ B. Use chronological train and validation splits where past trains and future validates
The correct option is Use chronological train and validation splits where past trains and future validates.
Chronological splits mirror how the model will be used in production because you train only on past data and evaluate on future data. Shuffled splits let information from the future leak into training which inflates validation accuracy and creates an unrealistic estimate. Chronological splits respect temporal order and better capture concept drift and seasonality so they provide evaluation metrics that are much closer to what you will see after deployment.
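A minimal pandas sketch of the idea follows. The file and column names are placeholders; the point is that training data comes strictly before the validation window and preprocessing statistics are fit on the training slice only.

```python
import pandas as pd

df = pd.read_csv("daily_metrics.csv", parse_dates=["date"])  # placeholder file and columns
df = df.sort_values("date").reset_index(drop=True)

# Past trains, future validates: split by position after sorting on time.
split_idx = int(len(df) * 0.8)
train = df.iloc[:split_idx].copy()
valid = df.iloc[split_idx:].copy()

# Fit preprocessing on the training slice only, then apply it to validation.
mean, std = train["demand"].mean(), train["demand"].std()
train["demand_z"] = (train["demand"] - mean) / std
valid["demand_z"] = (valid["demand"] - mean) / std
```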
Use K-fold cross-validation is not appropriate for time series when applied in its standard form because folds mix past and future which causes leakage and overestimates performance. You would need a time aware variant rather than plain K-fold.
Normalize training and validation independently is not a data splitting or evaluation change and it also breaks the proper preprocessing workflow. You should fit scalers on the training data and apply them to validation and test sets so independent normalization is incorrect and does not address the mismatch with production.
Fit preprocessing on the full dataset before splitting introduces leakage from validation and test into training and will further inflate validation accuracy rather than make it representative of production.
For time series questions, align validation with the prediction horizon and use chronological splits so training is strictly earlier than evaluation. Watch for leakage from shuffling or fitting transforms on all data and always fit preprocessing on the training set only.
Question 5
You are working for a subscription news platform called Beacon Media, and you need to forecast which subscribers are likely to cancel. You have 18 months of historical data that includes demographics, purchase records, and clickstream interactions from example.com. You must train the model using BigQuery ML and perform an in-depth evaluation of classification performance within BigQuery. What should you do?
-
✓ C. Build a logistic regression model in BigQuery ML and compute a confusion matrix with ML.CONFUSION_MATRIX
The correct option is Build a logistic regression model in BigQuery ML and compute a confusion matrix with ML.CONFUSION_MATRIX.
This choice fits a churn prediction task because churn is a binary outcome. It also satisfies the requirement to perform an in-depth evaluation inside BigQuery since the confusion matrix function reveals true positives, false positives, true negatives, and false negatives directly in SQL. You can complement this with ML.EVALUATE for aggregate metrics such as accuracy and AUC while staying entirely in BigQuery.
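A minimal sketch of that workflow, submitted through the BigQuery client library, is shown below. The project, dataset, table, and label column names are placeholders chosen for illustration.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

# Train a binary classifier on the historical subscriber data.
client.query("""
CREATE OR REPLACE MODEL `my_dataset.churn_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
SELECT * FROM `my_dataset.subscriber_features`
""").result()

# Evaluate classification performance without leaving BigQuery.
rows = client.query("""
SELECT * FROM ML.CONFUSION_MATRIX(
  MODEL `my_dataset.churn_model`,
  (SELECT * FROM `my_dataset.subscriber_features_eval`))
""").result()
for row in rows:
    print(dict(row))
```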
Train a linear regression model in BigQuery ML and evaluate it with ML.EVALUATE is incorrect because linear regression is for continuous targets and does not model a binary churn label properly. While ML.EVALUATE can return metrics, this setup will not provide a valid classification confusion matrix and will yield inappropriate regression outputs for churn.
Use BigQuery ML boosted_tree_regressor and review ML.FEATURE_IMPORTANCE for insights is incorrect because the regressor predicts continuous values rather than a binary class. Feature importance can provide insight into predictors, yet it does not deliver the required classification performance evaluation within BigQuery. A boosted tree classifier would be closer, but that is not what this option specifies.
Train a logistic regression model in BigQuery ML then register it in Vertex AI Model Registry and examine metrics in Vertex AI is incorrect because the question requires in-depth evaluation within BigQuery. Moving evaluation to Vertex AI is unnecessary and violates the constraint to keep the analysis in BigQuery, where the confusion matrix and other evaluation functions already meet the need.
When a question asks to evaluate within BigQuery, prefer BigQuery ML evaluation functions such as ML.CONFUSION_MATRIX and ML.EVALUATE. Map the prediction type first and choose a classification algorithm for churn rather than any regressor.
Question 6
Which approach provides low maintenance detection of training and serving skew for a Vertex AI endpoint by using a BigQuery training table as the baseline and supports retraining every 30 days?
-
✓ B. Use Vertex AI Model Monitoring with BigQuery baseline
The correct option is Use Vertex AI Model Monitoring with BigQuery baseline.
This uses Vertex AI Model Monitoring to detect training serving skew by comparing live prediction data on the endpoint with a baseline built from the BigQuery training table. You configure thresholds and schedules so it runs automatically and raises alerts when distributions diverge. This provides a managed and low maintenance approach, and you can pair Model Monitoring with a simple monthly retraining workflow so the model is refreshed every 30 days without custom glue.
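The sketch below shows one way to configure this with the Vertex AI SDK for Python. The endpoint resource name, BigQuery table, feature names, and thresholds are placeholders, and the exact arguments can vary by SDK version.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

aiplatform.init(project="my-project", location="us-central1")  # placeholders

skew_config = model_monitoring.SkewDetectionConfig(
    data_source="bq://my-project.training.features",   # BigQuery training table as the baseline
    target_field="label",
    skew_thresholds={"feature_a": 0.3, "feature_b": 0.3},  # placeholder features and thresholds
)
objective = model_monitoring.ObjectiveConfig(skew_detection_config=skew_config)

job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="endpoint-skew-monitor",
    endpoint="projects/123/locations/us-central1/endpoints/456",  # placeholder endpoint
    objective_configs=objective,
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=0.5),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),
    alert_config=model_monitoring.EmailAlertConfig(user_emails=["ml-team@example.com"]),
)
```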
Dataplex data quality rules checks data quality in datasets such as nulls, ranges, and schemas, which does not address training serving skew on a Vertex AI endpoint and does not use a BigQuery training table as the model baseline.
Prediction logging to BigQuery with scheduled queries requires building custom queries and thresholds to approximate skew detection and it lacks a managed baseline comparison and alerting, which increases maintenance effort.
Model Monitoring with Cloud Logging and Cloud Functions retraining relies on custom log parsing and bespoke retraining triggers, which is higher maintenance and unnecessary because Vertex AI Model Monitoring already provides baseline based skew detection without Cloud Logging or custom functions.
Match keywords like training serving skew, Vertex AI endpoint, and BigQuery baseline to the managed Vertex AI Model Monitoring feature, and avoid do it yourself answers that rely on custom queries or glue code.
Question 7
FinServe Labs trains a recommendation model using datasets purchased from an external aggregator, and the supplier sometimes changes column names, types, and categorical encodings without notice which has caused training jobs to fail. You want to harden your Vertex AI training pipeline so it quickly detects format and schema shifts from the upstream source and blocks bad data from propagating. What should you do?
-
✓ C. Configure TensorFlow Data Validation to infer a schema and surface anomalies from upstream changes
The correct option is Configure TensorFlow Data Validation to infer a schema and surface anomalies from upstream changes. This approach profiles each batch, infers or loads an expected schema, compares new data against a baseline, and flags schema or distribution shifts so the pipeline can fail fast and prevent bad data from reaching training.
With TensorFlow Data Validation you can automatically detect renamed or missing features, type changes, unexpected categorical values, and distribution drift across data spans. You can integrate it into a Vertex AI or TFX pipeline and configure anomalies to be treated as errors which blocks the subsequent training step until the issue is resolved.
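A minimal TensorFlow Data Validation sketch illustrates the gate. The Cloud Storage paths are placeholders, and in practice the check would run as a pipeline step before the training task.

```python
import tensorflow_data_validation as tfdv

# Infer the expected schema once from a known-good baseline delivery.
baseline_stats = tfdv.generate_statistics_from_csv("gs://my-bucket/baseline/data.csv")
schema = tfdv.infer_schema(baseline_stats)

# Validate every new delivery from the supplier against that schema.
new_stats = tfdv.generate_statistics_from_csv("gs://my-bucket/incoming/data.csv")
anomalies = tfdv.validate_statistics(new_stats, schema)

if anomalies.anomaly_info:
    # Renamed columns, type changes, or unexpected categories land here and block training.
    raise ValueError(f"Schema anomalies found, blocking training: {dict(anomalies.anomaly_info)}")
```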
Implement custom TensorFlow checks at the start of training to catch only known formatting issues is brittle and runs too late in the workflow. It does not infer a schema or uncover unknown anomalies and it allows problematic data to reach the training step before failing.
Create Dataplex data quality rules to validate incoming records before they reach the training dataset can help with governance and rule based checks on tables, yet it requires authoring rules and is not specialized for automatic ML feature schema inference and drift detection inside a training pipeline. It is also not as tightly integrated with TensorFlow based validation that can gate the training task.
Use TensorFlow Transform to normalize features and replace any values that violate the schema with 0 focuses on feature preprocessing rather than validation. Silently coercing bad values to zero hides real data quality problems and it does not detect upstream schema changes.
When a question emphasizes schema drift or unexpected upstream changes, prefer a tool that can infer a schema and flag anomalies automatically. Remember that TensorFlow Data Validation is for validation while TensorFlow Transform is for preprocessing.
Question 8
Which approach enables recurring batch predictions on the 25 TB BigQuery table foo.content_2027 with minimal data movement and operational overhead, given that the TensorFlow model was trained on Vertex AI?
-
✓ C. Import TensorFlow SavedModel into BigQuery ML and run ML.PREDICT
The correct option is Import TensorFlow SavedModel into BigQuery ML and run ML.PREDICT.
This approach lets you score the 25 TB foo.content_2027 table in place which minimizes data movement because prediction runs inside BigQuery. You can import the SavedModel artifact produced by training on Vertex AI and then use ML.PREDICT to generate results directly over the table. It is easy to operationalize recurring runs with a scheduled query and BigQuery scales the batch scoring without you provisioning or managing compute.
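As a rough sketch, the two statements below cover the one-time model import and the recurring scoring run, submitted here through the BigQuery client library. The model name, bucket path, and destination table are placeholders.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

# One-time import of the SavedModel that the Vertex AI training job exported to Cloud Storage.
client.query("""
CREATE OR REPLACE MODEL `foo.content_model`
OPTIONS (model_type = 'TENSORFLOW',
         model_path = 'gs://my-bucket/exported_model/*')
""").result()

# Recurring scoring runs entirely inside BigQuery; wrap this statement in a scheduled query.
client.query("""
CREATE OR REPLACE TABLE `foo.content_2027_predictions` AS
SELECT * FROM ML.PREDICT(MODEL `foo.content_model`, TABLE `foo.content_2027`)
""").result()
```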
Cloud Run job that reads BigQuery and calls Vertex AI online prediction is designed for low latency online serving rather than massive batch scoring. It would require orchestrating many requests, handling quotas, and moving large volumes of data out of BigQuery which increases operational overhead and cost.
Vertex AI Batch Predictions with BigQuery export to Cloud Storage introduces unnecessary data movement because you would first export 25 TB from BigQuery to Cloud Storage and then run batch prediction. The question asks for minimal movement and operations which this workflow does not achieve.
Dataflow pipeline that reads BigQuery and invokes the SavedModel adds development and operational burden to build, deploy, and schedule a pipeline and it still moves data out of BigQuery for inference. This is workable but not the simplest or most efficient path compared to running predictions directly in BigQuery.
When you see very large BigQuery tables and a requirement for minimal data movement prefer solutions that keep computation inside BigQuery such as BigQuery ML or remote models and plan to use scheduled queries for recurring runs.
Question 9
Your team at BlueCar Labs trained a model that depended on compute intensive feature engineering during training, and the same transformations are required at inference time. The model is serving from AI Platform for low latency and high throughput online predictions. What architecture should you adopt to ensure scalable preprocessing during serving?
-
✓ C. Send prediction messages to Pub/Sub then run the preprocessing in a Dataflow pipeline before calling AI Platform online prediction and publish the predictions to an output Pub/Sub topic
The correct architecture is Send prediction messages to Pub/Sub then run the preprocessing in a Dataflow pipeline before calling AI Platform online prediction and publish the predictions to an output Pub/Sub topic.
This pattern decouples producers from consumers with Pub/Sub so it smooths traffic spikes and provides durable buffering while maintaining low latency. Dataflow can horizontally scale streaming workers and efficiently execute CPU intensive feature engineering with strong throughput and predictable latency. After preprocessing, the pipeline can call the online prediction endpoint and then publish results for downstream consumers. AI Platform has since evolved into Vertex AI, which newer exams reference, yet the same pattern applies: Dataflow performs the heavy preprocessing and then invokes the online prediction service.
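The sketch below outlines the shape of such a pipeline with Apache Beam. The two helper functions stand in for the team's shared feature engineering code and prediction client, and the Pub/Sub resource names are placeholders.

```python
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def engineer_features(record):
    # Placeholder for the compute intensive transformations shared with training.
    return {"features": [record.get("raw_value", 0.0)]}

def call_online_prediction(features):
    # Placeholder for the call to the deployed online prediction endpoint.
    return {"score": 0.5, **features}

class PreprocessAndPredict(beam.DoFn):
    """Applies training-time feature engineering, then calls the online prediction service."""

    def process(self, message):
        record = json.loads(message.decode("utf-8"))
        prediction = call_online_prediction(engineer_features(record))
        yield json.dumps(prediction).encode("utf-8")

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "ReadRequests" >> beam.io.ReadFromPubSub(
           subscription="projects/my-project/subscriptions/prediction-requests")
     | "PreprocessAndPredict" >> beam.ParDo(PreprocessAndPredict())
     | "WriteResults" >> beam.io.WriteToPubSub(topic="projects/my-project/topics/prediction-results"))
```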
Publish requests to Pub/Sub and trigger a Cloud Function that applies preprocessing then invokes AI Platform prediction and writes results to a second Pub/Sub topic is not suited for compute intensive preprocessing because Cloud Functions have execution time limits, cold starts, and constrained resources that make heavy feature engineering inefficient at scale.
Stream requests into BigQuery and expose the preprocessing as a SQL view then poll for ready rows and send them to AI Platform and publish outputs to Pub/Sub misuses an analytics warehouse for per request serving. Streaming inserts are not instantly queryable and polling increases latency and complexity. Complex feature engineering often exceeds what is practical in SQL and orchestrating predictions from BigQuery is cumbersome for online use cases.
Train a new model that accepts raw inputs without preprocessing and deploy it on AI Platform for real time prediction changes the model contract and risks accuracy loss. The scenario requires the same transformations at inference as in training, so removing preprocessing does not meet the requirement and would force a new end to end model development effort.
When you see compute intensive preprocessing with low latency serving requirements, favor a decoupled streaming design with Pub/Sub for buffering and Dataflow for autoscaling transforms, then call the online prediction endpoint. Watch for answers that push heavy work into Cloud Functions or BigQuery because those are usually not a fit for real time inference pipelines.
Question 10
On Google Cloud, which approach best scales TensorFlow Transformer NMT training on approximately 15 million sentence pairs stored in Cloud Storage while requiring minimal code changes and no cluster management?
-
✓ B. Vertex AI training on Cloud TPU VMs with TPUStrategy
The correct option is Vertex AI training on Cloud TPU VMs with TPUStrategy because it delivers high throughput for Transformer models on large parallel corpora while removing the need to manage clusters and it typically requires only small adjustments to an existing TensorFlow training script.
This approach lets you use TensorFlow distribution with a strategy that is designed for TPUs and it integrates cleanly with common training loops. Vertex AI provisions and manages the TPU VM resources and networking so you focus on model code and data input from Cloud Storage. For large scale sequence to sequence workloads such as Transformer NMT, TPUs provide excellent scaling characteristics and memory bandwidth, and the code changes are usually limited to creating the strategy and placing model creation inside the strategy scope. Your input pipeline can continue to read from Cloud Storage using standard TensorFlow utilities, which aligns with the requirement for minimal code changes.
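A rough sketch of the code change is shown below, assuming an existing Keras model-building function and a tf.Example parsing function, both placeholders here. On a Cloud TPU VM the resolver discovers the local TPU without any cluster management.

```python
import tensorflow as tf

# On a Cloud TPU VM the local TPU is discovered automatically.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="local")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    model = build_transformer_model()   # placeholder: existing model-building code
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# The input pipeline keeps streaming sentence pairs straight from Cloud Storage.
files = tf.data.Dataset.list_files("gs://my-bucket/nmt/train-*.tfrecord")  # placeholder path
dataset = (files.interleave(tf.data.TFRecordDataset, num_parallel_calls=tf.data.AUTOTUNE)
           .map(parse_sentence_pair)    # placeholder: existing tf.Example parser
           .batch(1024)
           .prefetch(tf.data.AUTOTUNE))

model.fit(dataset, epochs=3)
```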
Vertex AI custom training with MultiWorkerMirroredStrategy on A2 GPUs is not the best fit because it still requires multi worker setup and orchestration and it tends to need more code changes to configure workers and coordinate distributed training. It also does not remove cluster management concerns to the same degree and it may not match TPU throughput for this class of model at this scale.
GKE Autopilot with Horovod on GPU nodes is not ideal because you would manage Kubernetes jobs, manifests, and Horovod integration, which increases operational overhead. This violates the requirement to avoid cluster management and it commonly requires meaningful code and container changes.
Vertex AI custom training with tf.distribute.ParameterServerStrategy is not preferred for modern Transformer training on accelerators since this strategy targets parameter server architectures that are better suited to asynchronous or sparse workloads. It also introduces additional roles and processes that add complexity and do not meet the goal of minimal code changes and no cluster management.
When a question emphasizes minimal code changes and no cluster management favor managed training on Vertex AI with native TensorFlow distribution strategies that map directly to the hardware such as TPU over solutions that require Kubernetes, Horovod, or multi worker setup.
Question 11
You are preparing a classification model in Vertex AI using a BigQuery table from example.com. During exploratory analysis you find that a categorical field named plan_tier shows strong predictive signal yet a portion of its entries are null. What should you do to retain the information from this feature while addressing the null values?
-
✓ B. Introduce a dedicated “missing” category for that column and add a new binary indicator that marks whether the value was missing
The correct answer is Introduce a dedicated “missing” category for that column and add a new binary indicator that marks whether the value was missing.
This approach preserves all training examples and allows the model to learn from both the observed categories and the fact that some values are absent. When missingness is related to the target, explicitly representing it can boost performance because the model can treat the absence as informative rather than noise. Creating a separate category avoids forcing nulls into an existing level, and the companion indicator cleanly captures the missingness signal without distorting the categorical distribution.
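A small pandas sketch shows the pattern. The values here are invented for illustration, and the same two derived columns could equally be produced in BigQuery SQL or a preprocessing step.

```python
import pandas as pd

df = pd.DataFrame({"plan_tier": ["basic", None, "premium", None, "standard"]})

# Keep the missingness signal as an explicit binary indicator.
df["plan_tier_was_missing"] = df["plan_tier"].isna().astype(int)

# Give null entries their own category instead of forcing them into an existing level.
df["plan_tier"] = df["plan_tier"].fillna("missing")

# One hot encode as usual; the "missing" level becomes a learnable column.
encoded = pd.get_dummies(df, columns=["plan_tier"], prefix="plan_tier")
print(encoded)
```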
Compute the mode of the column and replace all missing entries with that value is not ideal because it hides the missingness signal and can bias the category frequencies. If missingness is correlated with the label, replacing with the mode can mislead the model and reduce predictive power.
Drop the column if more than 25% of its values are missing otherwise keep it unchanged uses an arbitrary threshold and risks discarding a feature with demonstrated predictive value. Leaving the field unchanged also fails to make missingness explicit, which can leave useful signal untapped.
Use a Dataflow pipeline to discard every record where the column is null before training removes potentially valuable data, shrinks the sample size, and can introduce bias if the presence of nulls is systematic. It throws away signal rather than modeling it.
When a feature shows strong signal yet has nulls, consider whether the missingness itself might be informative. Prefer strategies that retain data and make missingness explicit rather than dropping rows or collapsing nulls into a frequent category.
Question 12
After 45 days in production, a semantic segmentation model shows lower PR AUC than in offline validation. It performs well on sparse nighttime scenes but fails in dense rush hour traffic. What is the most plausible explanation?
-
✓ B. Model overfit to sparse scenes and underfit dense traffic
The correct option is Model overfit to sparse scenes and underfit dense traffic.
Production performance dropping below offline validation along with strong results on sparse nighttime scenes but failure in dense rush hour traffic points to a training distribution that favored sparse conditions. The model likely learned features that generalize in empty or low-occlusion frames and did not learn robust representations for heavy congestion. This pattern is consistent with overfitting to sparse scenarios and underfitting to complex crowded scenes, which explains the lower PR AUC once real-world traffic mixes became denser over time.
PR AUC is the wrong metric for segmentation is not the most plausible explanation because precision recall metrics can meaningfully track pixel-level segmentation performance, especially with class imbalance. A suboptimal metric alone would not cause a targeted failure in dense traffic while performing well in sparse scenes.
Training and serving preprocessing mismatch would typically create broad performance degradation or obvious artifacts across conditions. The scenario describes good performance in sparse nighttime scenes and specific failure in dense traffic, which is more indicative of data distribution issues rather than a consistent preprocessing mismatch.
Training data overrepresented congested scenes would more likely produce strong performance during rush hour rather than failure. The observed behavior suggests the opposite data balance.
When performance drops only in certain contexts, look for distribution shift and imbalanced training coverage. Map the failure mode to which slices of data were likely underrepresented, and confirm with drift or skew monitoring.
Question 13
At Meridian Clinical Research you are tasked with building natural language models to analyze clinician notes that include custom tags created by your annotation team. The data consists of free text with these labels and you need to detect spans and assign categories for a wide variety of medical terms based on the provided annotations. What should you do?
-
✓ B. Train a custom model using AutoML Entity Extraction with your annotated medical text
The correct option is Train a custom model using AutoML Entity Extraction with your annotated medical text.
Train a custom model using AutoML Entity Extraction with your annotated medical text is designed for custom named entity recognition on free text when you have your own labeled spans and categories. It learns directly from your annotation schema and detects entity boundaries and types across varied clinical narratives without requiring you to build and tune architectures. It provides managed training, evaluation, and deployment which aligns with the need to use your team’s custom tags for a wide set of medical terms.
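As a rough sketch with the Vertex AI SDK for Python, the flow looks like the following. The JSONL annotations file, display names, and split fractions are placeholders, and exact arguments may differ by SDK version.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

# Import the annotated clinician notes (JSONL with labeled spans) as a text dataset.
dataset = aiplatform.TextDataset.create(
    display_name="clinician-notes",
    gcs_source="gs://my-bucket/annotations/notes.jsonl",  # placeholder annotations export
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.extraction,
)

# Train an AutoML entity extraction model directly on the custom labels.
job = aiplatform.AutoMLTextTrainingJob(
    display_name="medical-entity-extraction",
    prediction_type="extraction",
)
model = job.run(dataset=dataset, training_fraction_split=0.8,
                validation_fraction_split=0.1, test_fraction_split=0.1)
```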
Healthcare Natural Language API is a pretrained service that extracts clinical entities using predefined medical ontologies. It does not let you train on your own labels, so it cannot learn the custom tags defined by your annotation team.
Fine-tune a BERT model for entity extraction using your labels on Vertex AI Training could work but it requires significant custom modeling, tokenization and label alignment, hyperparameter tuning, and infrastructure management. The question points to a managed solution that trains directly on your annotations, which is better addressed by Train a custom model using AutoML Entity Extraction with your annotated medical text.
Document AI focuses on document parsing, OCR, and layout understanding for structured or semi-structured documents. It is not intended for training a custom free text entity extraction model based on your labeled spans and categories.
When you need custom entities based on your own labels choose AutoML for text entity extraction. Use pretrained APIs when their schema fits your needs and reserve custom training for cases where you need full control. Look for the keywords custom labels and span detection to identify AutoML entity extraction scenarios.
Question 14
In Vertex AI, which feature allows you to explore an image classifier's latent space and retrieve similar examples to investigate misclassifications made with high confidence?
-
✓ B. Example-based explanations with an embedding layer
The correct option is Example-based explanations with an embedding layer.
Example-based explanations with an embedding layer let you select a model layer to compute embeddings for inputs, which represents the model's latent space. Vertex AI then finds nearest neighbor examples from your dataset based on those embeddings. This makes it easy to inspect clusters in the latent space and to surface similar training or validation images that help you analyze confident misclassifications and understand decision boundaries.
Vertex AI Matching Engine is a managed vector search service that excels at large scale nearest neighbor retrieval, yet it is not the built in explanation feature used to visualize a classifier's latent space within Vertex AI model evaluation. You would need to build and manage an additional pipeline to use it for analysis, which is not what the question asks.
Vertex AI Model Monitoring focuses on tracking data and prediction drift, skew, and anomalies in production. It does not provide tools to inspect embeddings or to retrieve similar examples for a specific prediction, so it does not help analyze misclassifications in latent space.
When you see phrases like latent space and find similar examples think of embeddings and example based explanations rather than monitoring features or general vector databases.
Question 15
Riverline Gear operates a production demand forecasting pipeline. Raw events are stored in BigQuery and a Dataflow job applies Z score normalization to features and writes the transformed data back to BigQuery. New training data arrives every three days. You want to reduce processing time and ongoing manual effort while keeping the workflow simple and avoiding extra platforms. What should you do?
-
✓ B. Express Z score normalization in BigQuery SQL then implement it as a view or a scheduled query that materializes the results
The correct choice is Express Z score normalization in BigQuery SQL then implement it as a view or a scheduled query that materializes the results. This keeps all processing in BigQuery, eliminates pipeline spin up overhead, and lets you run the computation automatically every three days with a scheduler while keeping the workflow simple.
Computing Z scores in BigQuery is straightforward using aggregate functions to derive means and standard deviations and applying them in a query. A view provides on demand normalization when freshness at query time is acceptable, while a scheduled query materializes results on a cadence that matches the three day training data arrival. This BigQuery SQL view or scheduled query approach reduces operational effort and processing time because it avoids managing additional services and code.
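A sketch of the materializing query is below, submitted here through the BigQuery client library. The table and column names are placeholders, and in practice the same statement would be saved as a scheduled query on a three day cadence.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

client.query("""
CREATE OR REPLACE TABLE `demand.features_normalized` AS
SELECT
  sku,
  event_date,
  (units_sold - AVG(units_sold) OVER ()) / STDDEV_POP(units_sold) OVER () AS units_sold_z,
  (unit_price - AVG(unit_price) OVER ()) / STDDEV_POP(unit_price) OVER () AS unit_price_z
FROM `demand.raw_events`
""").result()
```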
Create a Cloud Run job that queries BigQuery applies Z score normalization in Python and writes the output to a results table is not ideal because it introduces another platform, requires containerized code, and needs separate scheduling. You can achieve the same result directly in SQL with less operational overhead.
Normalize with Dataproc Serverless Spark using the BigQuery connector and write the standardized features back to BigQuery adds unnecessary complexity for a simple SQL friendly transformation. Spark is well suited for complex or multi source processing, but here BigQuery can perform the calculation more simply and with lower operational burden.
Move preprocessing into a Vertex AI Pipelines component that uses tf.transform to standardize features before training adds an extra platform and tighter coupling to the ML pipeline for a task that BigQuery can do efficiently on its own. While tf.transform helps ensure training and serving consistency, it is overkill for periodic batch standardization already stored and consumed in BigQuery.
When data is already in BigQuery and the transformation is simple and periodic, prefer BigQuery views or scheduled queries to avoid extra platforms and reduce operational overhead.

Question 16
Which action can reduce per request latency for CPU based TensorFlow Serving deployed on Google Kubernetes Engine without changing the architecture?
-
✓ B. Build TensorFlow Serving with CPU specific optimizations and set a matching minimum CPU platform on the GKE node pool
The correct option is Build TensorFlow Serving with CPU specific optimizations and set a matching minimum CPU platform on the GKE node pool.
Compiling the server with CPU targeted instructions such as AVX or AVX2 and optional vendor math libraries enables vectorized kernels and reduces fallback paths which lowers per request latency on CPUs. Setting a minimum CPU platform on the node pool ensures the pods run on nodes that support those instructions so the optimized binary consistently takes the fastest execution paths without architectural changes.
This approach keeps the same serving topology on Google Kubernetes Engine and only changes the binary characteristics and the node pool constraint which directly targets single request latency rather than throughput tuning.
Tune inter_op and intra_op thread counts in TensorFlow Serving is not the best lever for reducing single request latency because thread tuning mostly trades off throughput and tail behavior under load and it does not make individual compute kernels execute faster.
Dramatically raise the max_batch_size setting in TensorFlow Serving typically increases latency because requests wait to form larger batches even if it can improve throughput, so it goes against the goal of reducing per request latency.
Switch to the tensorflow-model-server-universal build of TensorFlow Serving is counterproductive for latency because the universal build omits CPU specific optimizations to maximize portability which usually results in slower execution on modern CPUs.
When you see a latency focused question on CPU, look for options that enable CPU instruction optimizations and enforce a minimum CPU platform. Be cautious with thread tuning and batching since they often prioritize throughput over per request latency.
Question 17
At Asteria Retail you manage Vertex AI Pipelines that train models and deploy them to a Vertex AI endpoint for online predictions. You plan to use Cloud Build for continuous integration and continuous delivery so that teams can experiment frequently while keeping production stable. You want a release process that promotes updated pipeline versions to production quickly and reduces the chance that a new pipeline disrupts the live service. What should you do?
-
✓ C. Configure Cloud Build to build and test the source, deploy the validated artifacts to a staging Vertex AI Pipelines environment, run the pipeline successfully in staging, then promote the same artifacts to production
The correct option is Configure Cloud Build to build and test the source, deploy the validated artifacts to a staging Vertex AI Pipelines environment, run the pipeline successfully in staging, then promote the same artifacts to production. This approach validates changes in an isolated environment and then promotes the exact artifacts that passed testing which shortens release time and reduces the chance of disrupting the live endpoint.
Building and testing with Cloud Build ensures your container images and pipeline packages are consistent and versioned in Artifact Registry. Deploying to a staging Vertex AI Pipelines environment and executing the pipeline end to end verifies integration with data sources, components, and Vertex AI endpoints. Promoting the same immutable artifacts to production prevents drift between what was tested and what is released which is a core CI and CD best practice for reliability.
Create a Cloud Build workflow that compiles and tests the repository, then use the Google Cloud console to push the container to Artifact Registry and upload the compiled pipeline to Vertex AI Pipelines is incorrect because manual console steps break automation and introduce human error. It also skips an automated staging execution which increases risk to the production endpoint.
Configure Cloud Build to build the code and deploy artifacts to a staging environment, then after a successful staging execution rebuild from the main branch and release to production is incorrect because rebuilding after staging produces different artifacts than the ones validated. This creates drift and can reintroduce defects that were not present in staging.
Set up Cloud Build to compile the code and deploy to a staging environment, run only unit tests in staging, and if the tests pass deploy the pipeline to production is incorrect because unit tests alone do not prove the pipeline runs successfully. You need an end to end staging execution to validate components, data access, and deployment behavior before promotion.
Prefer answers that promote the same artifacts from staging to production and that include automated end to end validation in staging. Be cautious of manual console steps or rebuilds between stages since these add risk.
Question 18
How should you apply MinMax scaling to numeric columns and one hot encoding to categorical features on a BigQuery table of about 300 million rows to minimize engineering effort and cost when training a TensorFlow classifier in Vertex AI Training?
-
✓ C. BigQuery SQL preprocessing with min max lookup and one hot in a view for training
The correct option is BigQuery SQL preprocessing with min max lookup and one hot in a view for training.
This approach keeps preprocessing inside BigQuery so you use set based SQL to compute the min and max per numeric column once and reference them in a view that scales values on read. You can create one hot encoded columns using simple conditional expressions in the same view. The Vertex AI training job can read directly from the view using the BigQuery Storage Read API which avoids large data exports and reduces both engineering effort and cost. Using a view also keeps training and any later batch scoring consistent because the same transformation logic is applied each time without materializing new datasets.
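As an illustration of this pattern, here is a minimal sketch that creates such a view with the BigQuery client library. The project, dataset, table, and column names are hypothetical placeholders, not details from the question.

```python
# Sketch of creating a preprocessing view in BigQuery.
# Project, dataset, table, and column names are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

view_sql = """
CREATE OR REPLACE VIEW `my-project.ml.training_view` AS
WITH stats AS (
  -- Compute the min and max once over the full table
  SELECT MIN(price) AS price_min, MAX(price) AS price_max
  FROM `my-project.ml.sales_features`
)
SELECT
  -- MinMax scaling applied on read
  SAFE_DIVIDE(t.price - s.price_min, s.price_max - s.price_min) AS price_scaled,
  -- One hot encoding with simple conditional expressions
  IF(t.category = 'apparel', 1, 0) AS category_apparel,
  IF(t.category = 'footwear', 1, 0) AS category_footwear,
  t.label
FROM `my-project.ml.sales_features` AS t
CROSS JOIN stats AS s
"""

client.query(view_sql).result()  # the Vertex AI training job can then read this view directly
```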
Perform numeric scaling in BigQuery and let TensorFlow one hot encode during training splits the transformations across two systems, which increases code complexity and the chance of training serving skew. It also pushes categorical expansion into the trainer, which can raise compute and memory costs on a very large dataset.
Vertex AI Feature Store transformations with offline materialization for training introduces significant setup and operational overhead for registering features and materializing offline data. This is more expensive and complex than needed for straightforward scaling and one hot encoding on an existing BigQuery table.
Use TFX Transform on Dataflow and write TFRecords to Cloud Storage before training requires building and running a Dataflow pipeline and managing TFRecord artifacts which adds cost and engineering effort for simple transformations that SQL can express efficiently.
When a question asks you to minimize engineering effort and cost for simple feature prep on very large tables, favor BigQuery SQL and views that the training job can read directly rather than building new pipelines or a feature store.
Question 19
Your team at Blue Harbor Outfitters built a Vertex AI forecasting model that generates monthly demand estimates for each SKU. You need to quickly assemble a stakeholder report that explains how the model arrives at its predictions, and you have two months of recent actual sales that were excluded from training. What should you do to produce the data for this report?
-
✓ C. Create a batch prediction on the recent actual sales and enable feature attributions, then include the attributions alongside the predictions in the report
The correct answer is Create a batch prediction on the recent actual sales and enable feature attributions, then include the attributions alongside the predictions in the report.
This approach produces explanations that quantify how each input feature influenced each forecast, which is exactly what stakeholders need to understand how the model arrives at its predictions. Since the two months of recent actuals were excluded from training, they are appropriate inputs for generating predictions with explanations, and you can present the predicted values together with their feature attributions in the report.
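A minimal sketch of that batch prediction request with the Vertex AI SDK is shown below. The model resource name, BigQuery tables, project, and region are hypothetical placeholders, and the model is assumed to have been uploaded with an explanation configuration.

```python
# Sketch of a batch prediction job with feature attributions enabled.
# Project, region, model resource name, and BigQuery tables are hypothetical.
from google.cloud import aiplatform

aiplatform.init(project="blue-harbor", location="us-central1")

model = aiplatform.Model("projects/blue-harbor/locations/us-central1/models/1234567890")

batch_job = model.batch_predict(
    job_display_name="sku-forecast-explained",
    bigquery_source="bq://blue-harbor.sales.recent_actuals",      # two months of held out actuals
    bigquery_destination_prefix="bq://blue-harbor.sales.report",
    generate_explanation=True,   # attaches feature attributions to each prediction
    sync=True,
)
print(batch_job.output_info)     # location of the predictions plus attributions for the report
```

The output table then contains the predicted demand for each SKU together with per feature attributions, which is the data the stakeholder report needs.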
Run a batch prediction on the recent actual sales and compare the predicted values to the actuals only evaluates accuracy and does not explain why the model produced those values. Without attributions it does not satisfy the requirement to explain the predictions.
Enable Vertex AI Model Monitoring on the deployed endpoint and use its drift metrics to describe the predictions provides population level drift statistics and requires live traffic over time, which does not explain individual forecasts. It does not produce per prediction attributions for immediate stakeholder reporting.
Generate counterfactual datasets from the recent actuals and run batch predictions on both the original and counterfactual data, then compare the results in the report is unnecessary for this need and is not a built in workflow for AutoML forecasting. It would be manual and time consuming without guaranteeing clear explanations for stakeholders.
When a question asks you to explain model outputs, look for features like feature attributions and pair them with predictions on held out data. Monitoring and accuracy checks help different goals and do not replace explanations.
Question 20
Which managed service offers a simple feature store that serves online features in under 20 milliseconds and supports point in time historical retrieval for up to 150 days?
-
✓ B. Vertex AI Feature Store service
The correct option is Vertex AI Feature Store service.
Vertex AI Feature Store service is a fully managed feature store on Google Cloud that is designed for low latency online serving which is typically under 20 milliseconds. It also supports point in time correct historical retrieval so you can build training datasets that accurately reflect feature values as of a given timestamp and it retains history for up to 150 days. The service provides both an online store for real time inference and an offline store for training and batch workflows which is exactly what the question describes.
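To make the two retrieval modes concrete, here is a minimal sketch using the Vertex AI Feature Store resources in the Python SDK. The featurestore name, entity type, feature IDs, and the read instances DataFrame are hypothetical placeholders, and the exact API surface may differ depending on which Feature Store generation you use.

```python
# Sketch of online and point in time retrieval with Vertex AI Feature Store.
# Featurestore, entity type, and feature names are hypothetical placeholders.
import pandas as pd
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

fs = aiplatform.Featurestore(featurestore_name="retail_features")
sku = fs.get_entity_type(entity_type_id="sku")

# Low latency online read for real time inference
online_df = sku.read(entity_ids=["sku_123"], feature_ids=["avg_price", "units_sold_7d"])

# Point in time correct historical retrieval for building a training dataset
read_instances = pd.DataFrame(
    {"sku": ["sku_123"], "timestamp": [pd.Timestamp("2024-05-01T00:00:00Z")]}
)
training_df = fs.batch_serve_to_df(
    serving_feature_ids={"sku": ["avg_price", "units_sold_7d"]},
    read_instances_df=read_instances,
)
```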
Cloud Bigtable is a low latency wide column NoSQL database that excels at time series and large scale workloads. It does not provide built in feature store capabilities such as point in time joins or managed offline and online stores for ML features.
Cloud Spanner is a globally distributed relational database that offers strong consistency and SQL. It is not a managed feature store and it does not natively provide point in time feature retrieval or purpose built online and offline feature serving.
Cloud Firestore is a serverless document database for application data. It is not designed as a feature store and it lacks built in functionality for point in time feature retrieval and coordinated online and offline feature management.
When a question mentions online serving latency and point in time historical retrieval you should think of specialized ML feature stores rather than general purpose databases. Map the requirement to the managed service that explicitly advertises those capabilities.
Jira, Scrum & AI Certification |
---|
Want to get certified on the most popular software development technologies of the day? These resources will help you get Jira certified, Scrum certified and even AI Practitioner certified so your resume really stands out.
You can even get certified in the latest AI, ML and DevOps technologies. Advance your career today. |
Cameron McKenzie is an AWS Certified AI Practitioner, Machine Learning Engineer, Copilot Expert, Solutions Architect and author of many popular books in the software development and Cloud Computing space. His growing YouTube channel training devs in Java, Spring, AI and ML has well over 30,000 subscribers.