Sample Questions for the AWS Machine Learning Associate Certification

All questions come from my Udemy AWS ML course and certificationexams.pro.
Free AWS Machine Learning Exam Topics
The AWS Machine Learning Associate certification validates your ability to configure, build, and optimize ML solutions that support intelligent automation and data-driven decision-making. It focuses on the full machine learning lifecycle including data preparation, model training, evaluation, and deployment on AWS.
Start your preparation with the AWS ML Associate Practice Questions. These questions mirror the tone, logic, and format of the official exam. For deeper practice, use Real AWS Machine Learning Exam Questions to experience realistic, scenario-based challenges that reflect real-world AWS use cases.
For targeted study sessions, try AWS Machine Learning Sample Questions that cover topics such as feature engineering, data balancing, bias detection, and cost optimization. These practice items help you understand how AWS AI services integrate to deliver scalable and efficient ML solutions.
Real AI & ML Exam Simulator
Each question in the Machine Learning Associate Questions and Answers set is designed to teach as well as test. They include explanations that help you understand the reasoning behind each correct choice and the purpose of the incorrect options.
For complete exam readiness, use the AWS Machine Learning Exam Simulator and AWS ML Associate Practice Test. These full-length tests simulate the real exam experience, helping you manage time, interpret complex scenarios, and develop confidence.
If you prefer focused review, check out the AWS Machine Learning Associate Braindump and AWS ML Associate Exam Dump collections. Each study set is organized by topic to help you strengthen your skills in key areas of machine learning on AWS.
By mastering these exercises, you will be ready to design, deploy, and monitor ML models confidently within the AWS ecosystem. Begin today with the AWS ML Associate Practice Questions and train using the AWS Machine Learning Exam Simulator. Prepare to earn your certification and accelerate your career in machine learning and artificial intelligence.
AWS Machine Learning Associate Sample Questions
Question 1
Which AWS-managed setup enables end-to-end ML pipelines with built-in drift monitoring to adjust within 3 hours and deliver daily and weekly forecasts cost-effectively?
❏ A. AWS Glue, Amazon SageMaker, AWS AppConfig
❏ B. SageMaker Pipelines and SageMaker Model Monitor
❏ C. AWS Step Functions, Amazon SageMaker, Amazon CloudWatch
❏ D. Amazon Redshift, Amazon Forecast, Amazon QuickSight
Question 2
In SageMaker Data Wrangler, which transformation fills missing numeric values with the column mean?
❏ A. Normalize
❏ B. Impute (mean)
❏ C. Drop
❏ D. Encode
Question 3
How should you scale an Amazon SageMaker endpoint to absorb sudden traffic spikes while keeping p95 latency low and spend controlled?
❏ A. Use AWS Lambda with provisioned concurrency for inference and cap concurrency
❏ B. Enable SageMaker endpoint auto scaling with target tracking on InvocationsPerInstance and set AWS Budgets alerts
❏ C. Request a SageMaker service quota increase and add a CloudWatch utilization alarm
❏ D. Migrate to SageMaker Serverless Inference with a concurrency limit
Question 4
What does the bias-variance tradeoff describe?
❏ A. High variance causes underfitting and high bias causes overfitting
❏ B. Choosing to optimize only bias or only variance
❏ C. Balancing bias and variance to minimize generalization error
❏ D. Tradeoff between precision and recall
Question 5
Which AWS service should host an always-on, low-latency (under 100 ms) ML model for live inference with strong security controls?
❏ A. Amazon SageMaker Serverless Inference
❏ B. Amazon ECS
❏ C. Amazon SageMaker real-time endpoints
❏ D. AWS Lambda
Question 6
Which Amazon Comprehend approach should be used to extract key phrases, entities, and sentiment at scale from about 200,000 S3 documents per week with minimal setup?
❏ A. Use Comprehend synchronous APIs in Lambda for each review
❏ B. Comprehend asynchronous batch job over S3 inputs
❏ C. Train a custom NLP model in SageMaker and run batch inference
❏ D. Comprehend real-time endpoints behind an autoscaling service
Question 7
Which SageMaker deployment best provides low-latency real-time inference with automatic scaling from roughly 120 to 3,000 RPS?
❏ A. SageMaker Serverless Inference
❏ B. SageMaker real-time endpoint with Application Auto Scaling
❏ C. Elastic Beanstalk with manual scaling
❏ D. SageMaker multi-model endpoint
Question 8
How can Redshift ML in one account read training data in an S3 bucket in another account using only private connectivity with no public IPv4 paths?
❏ A. S3 Access Point with VPC restriction for cross-account reads
❏ B. Create an S3 VPC endpoint and a cross-account bucket policy restricted to that endpoint
❏ C. VPC peering between the accounts to reach S3
❏ D. S3 Transfer Acceleration
Question 9
Which SageMaker capabilities provide a no-code or low-code path to fine-tune and quickly deploy a foundation model for text summarization? (Choose 2)
❏ A. Amazon SageMaker Autopilot
❏ B. Amazon SageMaker JumpStart
❏ C. Amazon SageMaker Clarify
❏ D. Amazon SageMaker Canvas
❏ E. Amazon SageMaker Pipelines
❏ F. Amazon SageMaker Data Wrangler
Question 10
Which AWS service should orchestrate an end-to-end ML pipeline with strict dependencies, native experiments/lineage and model registry, and tight AWS integration, running up to 300 times per day?
❏ A. AWS Glue Workflows
❏ B. AWS Step Functions
❏ C. SageMaker Pipelines
❏ D. Amazon Managed Workflows for Apache Airflow

Question 11
Which AWS services should pair with SageMaker to provide interactive SQL on S3 and event-driven preprocessing so training data updates within 10 minutes of new arrivals?
❏ A. Amazon EMR with AWS Step Functions
❏ B. Amazon Athena plus S3-triggered AWS Lambda
❏ C. Amazon Redshift Spectrum with AWS Glue
❏ D. AWS Glue with Amazon Athena
Question 12
How can you continuously detect input data quality issues and feature drift for a deployed model to keep it reliable?
❏ A. Amazon CloudWatch
❏ B. Amazon SageMaker Model Monitor
❏ C. Add custom logging and review metrics every 4 days
❏ D. AWS Glue Data Quality
❏ E. Amazon QuickSight
❏ F. Amazon SageMaker Clarify
Question 13
Which serverless pattern enriches each Amazon Lookout for Vision inference with metadata from DynamoDB table foo and immediately updates the corresponding DynamoDB item?
❏ A. AWS Step Functions writing outputs to Amazon S3
❏ B. AWS Lambda calling Lookout for Vision, reading from DynamoDB foo, enriching, then updating DynamoDB
❏ C. Amazon EventBridge triggering AWS Glue ETL to write to DynamoDB
❏ D. Amazon Kinesis Data Firehose to Amazon Redshift for joins with DynamoDB data
❏ E. Amazon EC2 polling Lookout for Vision with AWS Batch to push updates to DynamoDB
Question 14
Which statement correctly distinguishes Amazon Bedrock from Amazon SageMaker JumpStart for rapid prototyping with minimal setup?
❏ A. SageMaker JumpStart gives a unified API to multiple third-party foundation models; Amazon Bedrock centers on AWS-only models
❏ B. Amazon Bedrock provides managed, multi-provider foundation models; SageMaker JumpStart offers a curated catalog of prebuilt models, algorithms, notebooks, and solution templates
❏ C. Amazon Bedrock is for building custom models end-to-end; SageMaker JumpStart only deploys pre-trained LLMs
❏ D. Amazon Bedrock focuses on training and hyperparameter tuning from scratch; SageMaker JumpStart mainly hosts third-party models
Question 15
With Amazon Lookout for Vision detecting defects at about 200 items per minute, which services enable immediate email alerts and centralized audit logging?
❏ A. Amazon Lookout for Vision with AWS Lambda and Amazon EventBridge
❏ B. Amazon Lookout for Vision with Amazon SNS and Amazon S3
❏ C. Amazon Lookout for Vision with Amazon SNS and CloudWatch Logs
❏ D. Amazon Lookout for Vision with AWS Step Functions and Amazon CloudWatch
Question 16
With 24 months of weekly time series data containing gaps and holiday-related spikes from data errors, what should you do first to ensure reliable forecasting?
❏ A. Standardize features and ignore gaps/outliers; train gradient-boosted trees
❏ B. Use SageMaker Autopilot to pick a model
❏ C. Perform EDA and fix missing timestamps and anomalies before modeling
❏ D. Start with a deep neural network and learn seasonality
Question 17
Which statement correctly contrasts K-Means and K-Nearest Neighbors regarding unlabeled clustering versus labeled prediction?
❏ A. K-Means clusters unlabeled data by similarity; KNN predicts from labeled neighbors
❏ B. K-Means requires labeled data; KNN is label-free
❏ C. Both are unsupervised clustering
❏ D. K-Means is supervised; KNN is unsupervised clustering
❏ E. Amazon SageMaker Autopilot
❏ F. K-Means performs regression; KNN does dimensionality reduction
Question 18
Which AWS service measures bias in training data and model predictions for ML fairness reporting?
❏ A. AWS Glue
❏ B. SageMaker Clarify
❏ C. Amazon SageMaker Model Monitor
❏ D. Amazon SageMaker Data Wrangler
Question 19
During repeated SageMaker training runs to test hyperparameters, which feature best reduces startup time between successive jobs?
❏ A. SageMaker Managed Spot Training
❏ B. SageMaker Warm Pools to reuse training infrastructure
❏ C. SageMaker Training Compiler
❏ D. SageMaker Pipelines step caching
Question 20
Which AWS service provides low-latency real-time model inference with automatic scaling?
❏ A. AWS Lambda with Amazon DynamoDB
❏ B. Amazon SageMaker Serverless Inference
❏ C. Amazon Kinesis Data Analytics
❏ D. Amazon SageMaker real-time endpoint
❏ E. AWS Fargate
AWS Machine Learning Associate Practice Questions: Answers

Question 1
Which AWS-managed setup enables end-to-end ML pipelines with built-in drift monitoring to adjust within 3 hours and deliver daily and weekly forecasts cost-effectively?
✓ B. SageMaker Pipelines and SageMaker Model Monitor
SageMaker Pipelines and SageMaker Model Monitor is correct because it provides a fully managed, end-to-end ML workflow with native monitoring for data and model quality on deployed endpoints. Model Monitor can run frequent schedules to detect drift and quality issues and, combined with Pipelines, can trigger retraining or rollbacks to adapt within tight time windows. This directly supports daily and weekly forecasting while keeping operational overhead low.
AWS Glue, Amazon SageMaker, AWS AppConfig is not ideal because AppConfig manages application configuration rather than ML-specific monitoring, so drift detection and automated responses would require custom build-out.
AWS Step Functions, Amazon SageMaker, Amazon CloudWatch can orchestrate and alert, but it lacks native drift detection. You would need to implement custom metrics, detectors, and retraining triggers, increasing complexity and operational cost.
Amazon Redshift, Amazon Forecast, Amazon QuickSight focuses on warehousing, forecasting, and visualization. While Forecast is strong for time series, this stack does not provide automatic model/data drift monitoring or rapid end-to-end retraining orchestration within hours.
Watch for keywords like end-to-end ML, fully managed, and drift monitoring. When the question emphasizes fast adaptation to changing data patterns on deployed models, think SageMaker Model Monitor plus Pipelines for native automation. If options rely on general orchestration or BI without explicit ML monitoring, they are usually distractors.
Question 2
In SageMaker Data Wrangler, which transformation fills missing numeric values with the column mean?
✓ B. Impute (mean)
The correct choice is Impute (mean) because Data Wrangler’s Impute transformation calculates a statistic such as the mean for each numeric column and replaces missing values accordingly, stabilizing model inputs without removing data.
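For intuition, mean imputation is the same operation you would apply with pandas outside Data Wrangler; a minimal sketch with a hypothetical price column:

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, None, 14.0, None, 12.0]})

# Replace missing values with the column mean (12.0 here), mirroring
# what Data Wrangler's Impute (mean) transform does for numeric columns
df["price"] = df["price"].fillna(df["price"].mean())
```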
Normalize is incorrect because it rescales data (e.g., z-score or min–max) and does not fill missing values.
Drop is incorrect since it eliminates rows or columns with nulls, which may discard useful data.
Encode is incorrect because it converts categorical data into numeric form and is not relevant to numeric mean imputation.
When you see keywords like missing values plus a numeric mean (or median/mode), think of the Impute transform. Terms like scale or normalize point to feature scaling, not missing value handling, and encode relates to categorical features.
Question 3
How should you scale an Amazon SageMaker endpoint to absorb sudden traffic spikes while keeping p95 latency low and spend controlled?
✓ B. Enable SageMaker endpoint auto scaling with target tracking on InvocationsPerInstance and set AWS Budgets alerts
The best approach is to Enable SageMaker endpoint auto scaling with target tracking on InvocationsPerInstance and set AWS Budgets alerts. Target tracking adjusts the number of endpoint instances to match demand, helping keep p95 latency low during surges. Pairing this with AWS Budgets provides visibility and guardrails on spend without manual intervention.
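As a sketch of how this is wired up with boto3, assuming a hypothetical endpoint and variant name (the target value and capacity bounds depend on your own load testing):

```python
import boto3

aas = boto3.client("application-autoscaling")
resource_id = "endpoint/my-endpoint/variant/AllTraffic"  # hypothetical names

# Register the endpoint variant's instance count as a scalable target
aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=2,
    MaxCapacity=10,
)

# Target tracking on the built-in InvocationsPerInstance metric
aas.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 400.0,  # invocations per instance per minute, tuned from testing
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleOutCooldown": 60,   # react quickly to spikes
        "ScaleInCooldown": 300,   # scale in conservatively
    },
)
```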
The option Use AWS Lambda with provisioned concurrency for inference and cap concurrency changes the architecture and can still hit cold starts once traffic exceeds the provisioned level, along with concurrency bottlenecks, which is not ideal for typical SageMaker-hosted models and larger ML frameworks.
The option Request a SageMaker service quota increase and add a CloudWatch utilization alarm only removes ceilings; it does not dynamically scale capacity or guarantee latency under sudden spikes.
The option Migrate to SageMaker Serverless Inference with a concurrency limit can work for intermittent traffic but may suffer from cold starts and less predictable latency, making it less suitable for strict p95 targets at sustained or bursty scale.
When a question stresses unpredictable spikes and latency SLOs, look for target tracking auto scaling on SageMaker endpoints with the metric InvocationsPerInstance. Purely increasing instance size or quotas does not auto-scale. Alternatives like Lambda or Serverless Inference can be useful but often trade off latency consistency and operational complexity. Combine scaling with AWS Budgets for cost control signals.
Question 4
What does the bias-variance tradeoff describe?
✓ C. Balancing bias and variance to minimize generalization error
Balancing bias and variance to minimize generalization error is correct because the bias–variance tradeoff is about finding the model complexity and regularization that jointly reduce underfitting and overfitting, thereby minimizing error on unseen data.
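One way to anchor this is the standard decomposition of expected squared error for a true function $f$, a learned model $\hat{f}$, and irreducible noise variance $\sigma^2$:

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible error}}
$$

Shrinking the bias term usually requires more model flexibility, which inflates the variance term, so the minimum of the sum is a balance rather than a zero of either term.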
High variance causes underfitting and high bias causes overfitting is incorrect because it swaps the relationships; high bias typically leads to underfitting, and high variance typically leads to overfitting.
Choosing to optimize only bias or only variance is wrong because it ignores that improving one often worsens the other; the goal is to balance them.
Tradeoff between precision and recall is unrelated; that concerns classification thresholds, not error decomposition.
Link concepts: bias → underfitting, variance → overfitting, and look for wording about generalization error and balance. Beware distractors about dataset splits or metric tradeoffs like precision–recall that do not address the bias–variance framework.
Question 5
Which AWS service should host an always-on, low-latency (under 100 ms) ML model for live inference with strong security controls?
✓ C. Amazon SageMaker real-time endpoints
The correct choice is Amazon SageMaker real-time endpoints because it delivers managed, always-on, low-latency inference with autoscaling, VPC networking, IAM, and KMS integration, which aligns with stringent security and steady high-throughput requirements.
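A minimal boto3 sketch of a real-time endpoint with security-relevant settings; the image URI, ARNs, and VPC/KMS identifiers below are placeholders:

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_model(
    ModelName="fraud-model",
    PrimaryContainer={
        "Image": "<inference-image-uri>",
        "ModelDataUrl": "s3://my-bucket/model.tar.gz",
    },
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerRole",
    # Keep inference traffic inside private subnets
    VpcConfig={"SecurityGroupIds": ["sg-0abc"], "Subnets": ["subnet-0abc"]},
)
sm.create_endpoint_config(
    EndpointConfigName="fraud-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "fraud-model",
        "InstanceType": "ml.c6i.xlarge",
        "InitialInstanceCount": 2,  # always-on capacity, no cold starts
    }],
    # Encrypt the storage volumes attached to the endpoint instances
    KmsKeyId="arn:aws:kms:us-east-1:123456789012:key/<key-id>",
)
sm.create_endpoint(EndpointName="fraud-endpoint", EndpointConfigName="fraud-config")
```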
Amazon SageMaker Serverless Inference is better for intermittent or spiky traffic and can introduce cold starts that break consistent under-100 ms latency targets.
Amazon ECS can host models but leaves you to manage scaling, observability, patching, and compliance controls, adding risk and overhead compared to a purpose-built managed ML endpoint.
AWS Lambda faces cold starts, duration limits, and scaling constraints that make it unreliable for sustained, ultra-low-latency workloads.
When you see always-on, tight latency SLOs, and strong security needs, map to SageMaker real-time endpoints. Use SageMaker Serverless Inference for sporadic traffic, Asynchronous Inference for long-running requests, and Batch Transform for offline scoring.
Question 6
Which Amazon Comprehend approach should be used to extract key phrases, entities, and sentiment at scale from about 200,000 S3 documents per week with minimal setup?
✓ B. Comprehend asynchronous batch job over S3 inputs
Comprehend asynchronous batch job over S3 inputs is correct because it natively reads large S3 corpora, scales automatically, supports key phrases, entities, and sentiment, and writes outputs back to S3 with minimal code.
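A sketch of kicking off one of these jobs with boto3; the bucket paths and role ARN are hypothetical, and entities and sentiment have analogous job APIs:

```python
import boto3

comprehend = boto3.client("comprehend")

# Asynchronous batch job over an S3 prefix of documents
resp = comprehend.start_key_phrases_detection_job(
    InputDataConfig={
        "S3Uri": "s3://my-bucket/reviews/",
        "InputFormat": "ONE_DOC_PER_FILE",
    },
    OutputDataConfig={"S3Uri": "s3://my-bucket/comprehend-output/"},
    DataAccessRoleArn="arn:aws:iam::123456789012:role/ComprehendS3Access",
    LanguageCode="en",
)
print(resp["JobId"])
# start_entities_detection_job and start_sentiment_detection_job
# follow the same pattern for the other two analyses
```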
The choice Use Comprehend synchronous APIs in Lambda for each review targets single-document, low-latency use cases and is inefficient and costly for high-volume weekly batches.
Train a custom NLP model in SageMaker and run batch inference introduces unnecessary development and operations when Comprehend’s prebuilt capabilities suffice.
Comprehend real-time endpoints behind an autoscaling service are designed for online inference, not bulk S3 batch jobs, and add cost and complexity.
When the prompt emphasizes a large S3 text corpus plus minimal setup and prebuilt NLP (entities, key phrases, sentiment), pick Comprehend asynchronous batch jobs. Use synchronous APIs or endpoints for low-latency, per-request needs, and choose SageMaker only when you require custom models beyond what Comprehend provides.
Question 7
Which SageMaker deployment best provides low-latency real-time inference with automatic scaling from roughly 120 to 3,000 RPS?
✓ B. SageMaker real-time endpoint with Application Auto Scaling
The best choice is SageMaker real-time endpoint with Application Auto Scaling because it natively supports automatic horizontal scaling of endpoint instances based on metrics such as InvocationsPerInstance, enabling low latency during peaks and cost efficiency off-peak. This directly addresses variable real-time demand in the 120–3,000 RPS range.
The option SageMaker Serverless Inference is appealing for spiky or intermittent workloads, but cold starts and concurrency limits can introduce latency and make it less suitable for consistently high throughput.
Elastic Beanstalk with manual scaling is not SageMaker-native and cannot automatically scale endpoints in response to rapid changes.
SageMaker multi-model endpoint is meant to consolidate many models behind one endpoint; model loading overhead and its focus on model consolidation rather than traffic-based scaling make it a poor fit for this requirement.
Watch for keywords like real-time, low latency, and automatic scaling. Associate these with SageMaker real-time endpoints plus Application Auto Scaling using the InvocationsPerInstance metric. Be cautious of distractors like multi-model endpoints (optimize hosting many models) and serverless inference (great for intermittent traffic but can suffer from cold starts at high sustained RPS).
Question 8
How can Redshift ML in one account read training data in an S3 bucket in another account using only private connectivity with no public IPv4 paths?
✓ B. Create an S3 VPC endpoint and a cross-account bucket policy restricted to that endpoint
The correct choice is Create an S3 VPC endpoint and a cross-account bucket policy restricted to that endpoint. An Amazon S3 VPC endpoint ensures all S3 traffic from the VPC remains on the AWS network without traversing the public internet. Combined with a cross-account bucket policy that grants the Redshift role access and restricts access by aws:SourceVpce (or aws:VpcEndpointIds), this enforces private-only connectivity and least-privilege access for Redshift ML to read training data across accounts.
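A sketch of the bucket-policy side of this setup; the account ID, role name, bucket name, and VPC endpoint ID are all hypothetical:

```python
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowRedshiftRoleViaVpceOnly",
        "Effect": "Allow",
        # The Redshift ML role in the consuming account
        "Principal": {"AWS": "arn:aws:iam::111122223333:role/RedshiftMLRole"},
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::training-data-bucket",
            "arn:aws:s3:::training-data-bucket/*",
        ],
        # Only allow requests that arrive through the specific S3 VPC endpoint
        "Condition": {"StringEquals": {"aws:SourceVpce": "vpce-0abc123def456gh78"}},
    }],
}

boto3.client("s3").put_bucket_policy(
    Bucket="training-data-bucket", Policy=json.dumps(policy)
)
```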
The option ‘S3 Access Point with VPC restriction for cross-account reads’ is insufficient by itself. While VPC-restricted Access Points help scope access, you still need an S3 VPC endpoint to guarantee private connectivity; without it, traffic may use public paths.
The option ‘VPC peering between the accounts to reach S3’ is incorrect because peering connects VPCs, not S3 service endpoints. S3 is accessed via service endpoints, so without an S3 VPC endpoint, traffic can still egress publicly.
The option ‘S3 Transfer Acceleration’ is wrong because it routes through public edge locations and the public internet, conflicting with the requirement to avoid public IPv4 paths.
When you see a requirement for S3 access that must avoid public internet, think VPC endpoints for S3 plus bucket policies restricted by aws:SourceVpce. For cross-account, remember to grant the consuming account’s role principal and limit access to the specific endpoint for defense in depth.
Question 9
Which SageMaker capabilities provide a no-code or low-code path to fine-tune and quickly deploy a foundation model for text summarization? (Choose 2)
✓ B. Amazon SageMaker JumpStart
✓ D. Amazon SageMaker Canvas
Amazon SageMaker JumpStart is correct because it provides a guided, low-code experience to select foundation models, run point-and-click fine-tuning with your dataset, and deploy endpoints directly from SageMaker Studio. Amazon SageMaker Canvas is also correct because it offers a no-code UI that integrates with JumpStart to initiate fine-tuning workflows and deploy models with minimal operational effort. Together, these capabilities deliver the fastest path to low-code or no-code FM fine-tuning and rapid deployment on SageMaker.
The option Amazon SageMaker Autopilot is not suitable because it targets automated modeling for tabular data, not fine-tuning large language models.
Amazon SageMaker Pipelines focuses on workflow orchestration and CI/CD; it does not provide a no-code fine-tuning interface.
Amazon SageMaker Data Wrangler handles data preparation, not model fine-tuning or serving.
Amazon SageMaker Clarify addresses bias and explainability rather than FM adaptation or deployment.
When you see no-code or low-code plus foundation model fine-tuning and rapid deployment, map to JumpStart and Canvas. Reserve Autopilot for automated tabular ML, Pipelines for orchestration, and Data Wrangler for data prep. Look for keywords like no-code, foundation models, and one-click deploy to quickly identify JumpStart and Canvas.
Question 10
Which AWS service should orchestrate an end-to-end ML pipeline with strict dependencies, native experiments/lineage and model registry, and tight AWS integration, running up to 300 times per day?
✓ C. SageMaker Pipelines
SageMaker Pipelines is purpose-built for managing end-to-end ML workflows. It provides native integration with SageMaker Experiments for experiment tracking, lineage tracking for datasets and models, the SageMaker Model Registry for versioned model governance, and first-class steps for training, hyperparameter tuning, and deployment. It supports strict task dependencies, conditional logic, parameters, and scales to frequent executions, aligning directly with the requirements.
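A minimal pipeline definition in the classic SageMaker Python SDK style might look like the sketch below; the image URI, role ARN, S3 paths, and model package group name are placeholders:

```python
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep
from sagemaker.workflow.step_collections import RegisterModel

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

est = Estimator(
    image_uri="<training-image-uri>",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/model-artifacts/",
)

# Training step; downstream steps depend on its output properties
train_step = TrainingStep(
    name="TrainModel",
    estimator=est,
    inputs={"train": TrainingInput("s3://my-bucket/train/")},
)

# Register the trained model in the SageMaker Model Registry
register_step = RegisterModel(
    name="RegisterModel",
    estimator=est,
    model_data=train_step.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="my-model-group",
)

pipeline = Pipeline(name="MyMlPipeline", steps=[train_step, register_step])
pipeline.upsert(role_arn=role)  # create or update the definition
# pipeline.start() can then be invoked as often as needed per day
```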
The option AWS Step Functions is a strong general orchestrator but does not natively provide ML experiment tracking, lineage, or a model registry, so it falls short for ML governance needs.
Amazon Managed Workflows for Apache Airflow can orchestrate complex DAGs but introduces more operational overhead and still lacks ML-native lineage and experiment tracking out of the box.
AWS Glue Workflows focuses on ETL orchestration and is not optimized for ML experiments, lineage, or model registry workflows.
When a question emphasizes experiments, lineage, and a model registry alongside tight SageMaker integration, prefer SageMaker Pipelines. If the prompt stresses generic service orchestration across many AWS services without ML governance, think Step Functions. For DAG-heavy but non-ML-native orchestration, consider MWAA. For ETL-only pipelines, consider Glue Workflows.
Question 11
Which AWS services should pair with SageMaker to provide interactive SQL on S3 and event-driven preprocessing so training data updates within 10 minutes of new arrivals?
✓ B. Amazon Athena plus S3-triggered AWS Lambda
Amazon Athena plus S3-triggered AWS Lambda is the best fit because Athena provides serverless, interactive SQL directly on data in Amazon S3, and S3 event notifications can invoke Lambda functions as soon as new objects arrive. This enables lightweight, event-driven preprocessing to keep SageMaker training data current within the 10-minute target, with minimal operational overhead and elastic scaling.
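The Lambda half of that pattern is only a few lines; the handler below assumes a standard S3 ObjectCreated event notification and a hypothetical curated/ output prefix:

```python
import urllib.parse
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Invoked by an S3 ObjectCreated event notification
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Illustrative lightweight preprocessing: trim and lowercase lines
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        cleaned = "\n".join(
            line.strip().lower() for line in body.splitlines() if line.strip()
        )

        # Write to the prefix that Athena queries and SageMaker trains from
        s3.put_object(Bucket=bucket, Key=f"curated/{key}", Body=cleaned.encode("utf-8"))
```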
The option Amazon EMR with AWS Step Functions relies on managed clusters and workflow orchestration, which adds provisioning latency and management overhead, making it less suitable for simple, near real-time S3-driven updates.
The option Amazon Redshift Spectrum with AWS Glue requires a Redshift cluster and introduces additional complexity and latency; while Spectrum can query S3, it is not as straightforward for event-driven preprocessing.
The option AWS Glue with Amazon Athena covers ETL and SQL exploration, but Glue alone is not the most direct event-driven mechanism from S3; it typically requires Lambda or EventBridge to react immediately to object creation.
When you see keywords like interactive SQL on S3, serverless, S3 event, and near real time, map them to Athena for querying and S3-triggered Lambda for event-driven transforms. Be cautious with choices that introduce clusters or warehouses (EMR, Redshift), as these often imply additional latency and operational management compared to a pure serverless approach for rapid refresh cycles.
Question 12
How can you continuously detect input data quality issues and feature drift for a deployed model to keep it reliable?
✓ B. Amazon SageMaker Model Monitor
Amazon SageMaker Model Monitor is designed to continuously track live inference data against a baseline to detect data quality issues and feature drift. It captures payloads from endpoints, profiles distributions, compares them to baseline statistics and constraints, and publishes metrics to CloudWatch so you can alarm and alert. This directly addresses ongoing reliability for models in production.
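With the SageMaker Python SDK, the baseline-then-schedule flow looks roughly like this sketch; the role, bucket paths, and endpoint name are placeholders:

```python
from sagemaker.model_monitor import DefaultModelMonitor, CronExpressionGenerator
from sagemaker.model_monitor.dataset_format import DatasetFormat

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Profile the training data to produce baseline statistics and constraints
monitor.suggest_baseline(
    baseline_dataset="s3://my-bucket/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline/",
)

# Compare captured live traffic against the baseline on a schedule
monitor.create_monitoring_schedule(
    monitor_schedule_name="hourly-data-quality",
    endpoint_input="my-endpoint",  # endpoint must have data capture enabled
    output_s3_uri="s3://my-bucket/monitor-reports/",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```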
Amazon CloudWatch is excellent for infrastructure and custom metrics but does not natively profile feature distributions for drift.
Add custom logging and review metrics every 4 days is manual and slow, so it will miss real-time shifts.
AWS Glue Data Quality targets data quality in pipelines and data lakes, not monitoring live model inputs at endpoints.
Amazon QuickSight creates dashboards but does not automate drift detection.
Amazon SageMaker Clarify focuses on bias and explainability; while it can integrate with monitoring for bias or attribution drift, it is not the primary service for end-to-end, continuous data quality and feature drift on deployed endpoints.
When a question mentions continuous, automated monitoring of live model input distributions and alerts for drift, look for SageMaker Model Monitor. Keywords like baseline vs. live traffic, data capture, statistics and constraints, and CloudWatch alarms are strong signals.
Question 13
Which serverless pattern enriches each Amazon Lookout for Vision inference with metadata from DynamoDB table foo and immediately updates the corresponding DynamoDB item?
✓ B. AWS Lambda calling Lookout for Vision, reading from DynamoDB foo, enriching, then updating DynamoDB
AWS Lambda calling Lookout for Vision, reading from DynamoDB foo, enriching, then updating DynamoDB is the best fit because it is fully serverless, low latency, and directly integrates the inference result with existing metadata via simple SDK calls. Lambda can synchronously invoke DetectAnomalies, fetch attributes from the DynamoDB table foo, and use UpdateItem to write back to the same item immediately for live dashboards.
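A sketch of such a handler, where the project name, the table's key schema, and the incoming event shape are all assumptions for illustration:

```python
import boto3

lfv = boto3.client("lookoutvision")
s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("foo")

def handler(event, context):
    # Hypothetical event shape: image location plus the item's partition key
    img = s3.get_object(Bucket=event["bucket"], Key=event["key"])["Body"].read()

    result = lfv.detect_anomalies(
        ProjectName="defect-detector",  # hypothetical project name
        ModelVersion="1",
        Body=img,
        ContentType="image/jpeg",
    )["DetectAnomalyResult"]

    # Read existing metadata for enrichment, then update the same item
    item = table.get_item(Key={"item_id": event["item_id"]}).get("Item", {})
    table.update_item(
        Key={"item_id": event["item_id"]},
        UpdateExpression="SET is_anomalous = :a, confidence = :c",
        ExpressionAttributeValues={
            ":a": result["IsAnomalous"],
            ":c": str(result["Confidence"]),  # stored as string; DynamoDB rejects floats
        },
    )
    return item  # enriched metadata available to the caller
```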
The option AWS Step Functions writing outputs to Amazon S3 emphasizes orchestration and storage in S3, which does not satisfy immediate updates to DynamoDB items.
The choice Amazon EventBridge triggering AWS Glue ETL to write to DynamoDB introduces batch ETL and additional latency, which is unnecessary for per-inference real-time enrichment.
The option Amazon Kinesis Data Firehose to Amazon Redshift for joins with DynamoDB data targets analytics and warehousing rather than transactional updates in DynamoDB.
The option Amazon EC2 polling Lookout for Vision with AWS Batch to push updates to DynamoDB adds operational burden and latency, contradicting the serverless, near-real-time requirement.
When you see keywords like serverless, per-inference enrichment, and immediate DynamoDB updates, prefer Lambda integrating directly with the required APIs (DetectAnomalies and DynamoDB UpdateItem). Avoid batch ETL, analytics warehouses, or compute-heavy orchestration unless the question explicitly requires them.
Question 14
Which statement correctly distinguishes Amazon Bedrock from Amazon SageMaker JumpStart for rapid prototyping with minimal setup?
✓ B. Amazon Bedrock provides managed, multi-provider foundation models; SageMaker JumpStart offers a curated catalog of prebuilt models, algorithms, notebooks, and solution templates
Amazon Bedrock provides managed, multi-provider foundation models; SageMaker JumpStart offers a curated catalog of prebuilt models, algorithms, notebooks, and solution templates is correct. Bedrock gives a fully managed API to invoke and orchestrate foundation models from multiple providers (such as Amazon, Anthropic, Meta, and others), plus tooling for orchestration and guardrails. SageMaker JumpStart, by contrast, is part of SageMaker that surfaces prebuilt models, algorithms, notebooks, and end-to-end solution templates to speed both traditional ML and generative AI within SageMaker.
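For a sense of what the unified Bedrock runtime API looks like, here is a sketch of a single InvokeModel call; the model ID is one example of many, and the request body format varies by provider:

```python
import json
import boto3

brt = boto3.client("bedrock-runtime")

resp = brt.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example third-party model
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 200,
        "messages": [{"role": "user", "content": "Summarize: <document text>"}],
    }),
)
# Response body is a stream; parse the provider-specific JSON payload
print(json.loads(resp["body"].read())["content"][0]["text"])
```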
The option Amazon Bedrock is for building custom models end-to-end; SageMaker JumpStart only deploys pre-trained LLMs is wrong because Bedrock is not an end-to-end custom training platform, and JumpStart offers far more than LLM deployment, including classical ML assets and templates.
The option Amazon Bedrock focuses on training and hyperparameter tuning from scratch; SageMaker JumpStart mainly hosts third-party models is incorrect since Bedrock does not target from-scratch training, and JumpStart includes AWS, open-source, and third-party resources with customization paths.
The option SageMaker JumpStart gives a unified API to multiple third-party foundation models; Amazon Bedrock centers on AWS-only models is incorrect because the unified, multi-provider FM access is Bedrock’s role, and Bedrock supports third-party models, not just AWS models.
If you see multi-provider foundation model access via a managed API, think Bedrock. If you see curated prebuilt models, algorithms, notebooks, and solution templates within SageMaker, think JumpStart. Distinguish FM inference/orchestration (Bedrock) from SageMaker development accelerators (JumpStart).
Question 15
With Amazon Lookout for Vision detecting defects at about 200 items per minute, which services enable immediate email alerts and centralized audit logging?
✓ C. Amazon Lookout for Vision with Amazon SNS and CloudWatch Logs
The correct choice is Amazon Lookout for Vision with Amazon SNS and CloudWatch Logs. SNS supports immediate email notifications via topic subscriptions, and CloudWatch Logs is purpose-built for centralized, durable event log retention with querying and retention policies. Together they cover real-time alerting and audit logging cleanly.
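Wiring both requirements takes two SDK calls per defect event, sketched below; the topic ARN and log group/stream names are hypothetical, and the log group and stream are assumed to already exist:

```python
import json
import time
import boto3

sns = boto3.client("sns")
logs = boto3.client("logs")

def on_defect(result: dict) -> None:
    # Immediate email alert via an SNS topic with email subscriptions
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:defect-alerts",
        Subject="Defect detected",
        Message=json.dumps(result),
    )
    # Centralized, queryable audit record in CloudWatch Logs
    logs.put_log_events(
        logGroupName="/factory/lookout-for-vision",
        logStreamName="inspections",
        logEvents=[{
            "timestamp": int(time.time() * 1000),
            "message": json.dumps(result),
        }],
    )
```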
The option Amazon Lookout for Vision with AWS Lambda and Amazon EventBridge can route events and invoke code, but it does not natively deliver emails and does not itself provide a centralized log store. You would still need SNS and a log service such as CloudWatch Logs or S3.
The option Amazon Lookout for Vision with Amazon SNS and Amazon S3 sends emails via SNS, and S3 can store artifacts, but S3 is not an operational log service. It lacks native log querying and aggregation features that CloudWatch Logs provides for compliance reviews.
The option Amazon Lookout for Vision with AWS Step Functions and Amazon CloudWatch focuses on orchestration and metrics/alarms. It does not provide direct email notifications or centralized log ingestion without adding other services.
Map requirements to native services. For email alerts, think Amazon SNS. For centralized log retention and search, think CloudWatch Logs.
EventBridge routes events, Lambda runs code, and Step Functions orchestrates workflows, but none of these alone solve email plus centralized logging.

Question 16
With 24 months of weekly time series data containing gaps and holiday-related spikes from data errors, what should you do first to ensure reliable forecasting?
✓ C. Perform EDA and fix missing timestamps and anomalies before modeling
Perform EDA and fix missing timestamps and anomalies before modeling is correct because time series forecasting depends on consistent, correctly indexed data. Identifying and remediating gaps, duplicates, and spurious holiday spikes ensures patterns like trend and seasonality are learnable and not distorted by data quality issues. Typical steps include verifying a complete timestamp index, imputing or reconstructing missing periods, removing or capping erroneous spikes, and validating feature distributions before model selection.
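A typical first-pass cleanup in pandas, assuming hypothetical file and column names and week-ending timestamps that align to the weekly frequency:

```python
import pandas as pd

df = pd.read_csv("weekly_sales.csv", parse_dates=["week"])  # hypothetical file/columns
df = df.set_index("week").sort_index()

# Reindex to a complete weekly timestamp index so gaps become explicit NaNs
full_index = pd.date_range(df.index.min(), df.index.max(), freq="W")
df = df.reindex(full_index)

# Fill gaps by interpolation, then cap erroneous spikes at a robust upper bound
df["units"] = df["units"].interpolate(limit_direction="both")
cap = df["units"].quantile(0.99)
df["units"] = df["units"].clip(upper=cap)
```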
The option Standardize features and ignore gaps/outliers; train gradient-boosted trees is incorrect because scaling does not address structural data defects and will propagate noise.
The option Use SageMaker Autopilot to pick a model is incorrect since AutoML still requires clean, well-formed data; it will not fix missing timestamps or anomalies for you.
The option Start with a deep neural network and learn seasonality is incorrect because complex models cannot compensate for corrupted inputs and may overfit to errors.
Always think data quality first for time series. Look for cues like “missing timestamps,” “holiday spikes,” or “reporting delays.” The right first action is EDA and cleaning before choosing algorithms or services. Remember that services like Amazon Forecast or AutoML tools assume you bring clean, consistent time series. Use complete timestamp indexes, imputation, and anomaly handling prior to model training.
Question 17
Which statement correctly contrasts K-Means and K-Nearest Neighbors regarding unlabeled clustering versus labeled prediction?
✓ A. K-Means clusters unlabeled data by similarity; KNN predicts from labeled neighbors
The K-Means clusters unlabeled data by similarity; KNN predicts from labeled neighbors option is correct because K-Means is an unsupervised clustering algorithm that partitions data into k groups by minimizing within-cluster variance, while K-Nearest Neighbors is a supervised, instance-based method that uses labeled neighbors to classify or regress new points.
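The contrast is easy to see in scikit-learn, where K-Means never sees labels and KNN cannot fit without them; the data below is synthetic:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((100, 2))

# K-Means: unsupervised; no labels, k = number of clusters
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# KNN: supervised; requires labels, k = number of neighbors consulted
y = (X[:, 0] > 0.5).astype(int)  # synthetic labels for illustration
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
pred = knn.predict([[0.2, 0.7]])
```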
The option K-Means requires labeled data; KNN is label-free is wrong because it reverses the roles; only KNN depends on labels.
The option Both are unsupervised clustering is incorrect since KNN is supervised.
The option K-Means is supervised; KNN is unsupervised clustering swaps the paradigms and is false.
The option Amazon SageMaker Autopilot does not address the conceptual difference between K-Means and KNN.
The option K-Means performs regression; KNN does dimensionality reduction is incorrect because K-Means is for clustering and KNN is for classification/regression, not dimensionality reduction.
Anchor on whether labels are available. If there are no labels and the goal is to find structure, think K-Means and clustering. If labels exist and the goal is to classify or predict, think KNN and supervised learning. Remember that the “K” in each algorithm refers to different things: number of clusters for K-Means versus number of neighbors for KNN. In AWS, both algorithms are available as built-in algorithms in SageMaker, but the learning paradigm dictates when to use each.
Question 18
Which AWS service measures bias in training data and model predictions for ML fairness reporting?
✓ B. SageMaker Clarify
The correct choice is SageMaker Clarify because it computes bias metrics on both datasets and model predictions, producing detailed fairness reports that help quantify and diagnose potential bias before and after training.
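A pre-training bias check with the SageMaker Python SDK looks roughly like this sketch; the facet, label, headers, and S3 paths are illustrative:

```python
from sagemaker import clarify

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

processor = clarify.SageMakerClarifyProcessor(
    role=role, instance_count=1, instance_type="ml.m5.xlarge"
)
data_config = clarify.DataConfig(
    s3_data_input_path="s3://my-bucket/train.csv",
    s3_output_path="s3://my-bucket/clarify-report/",
    label="approved",
    headers=["age", "income", "approved"],
    dataset_type="text/csv",
)
bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1],   # the favorable outcome
    facet_name="age",                # the attribute checked for bias
    facet_values_or_threshold=[40],  # e.g., applicants over 40
)
# Computes pre-training metrics such as class imbalance and writes a report to S3
processor.run_pre_training_bias(data_config=data_config, bias_config=bias_config)
```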
The option AWS Glue is incorrect because it is an ETL service focused on data integration and transformation, not fairness analysis or bias metrics.
Amazon SageMaker Model Monitor is incorrect since it primarily monitors models in production; while it can run bias checks on endpoints, it does not offer the same breadth of dataset-level bias analysis as Clarify.
Amazon SageMaker Data Wrangler is incorrect because it is designed for data preparation and exploratory analysis but does not generate bias or fairness metrics.
When the question highlights bias metrics, dataset bias, and prediction bias or asks for fairness reports, map directly to SageMaker Clarify. If the scenario focuses on live endpoint monitoring or drift in production, lean toward Model Monitor. If it mentions PII detection or anonymization, think of Macie; if it mentions data prep pipelines, consider Data Wrangler or Glue.
Question 19
During repeated SageMaker training runs to test hyperparameters, which feature best reduces startup time between successive jobs?
✓ B. SageMaker Warm Pools to reuse training infrastructure
The correct choice is SageMaker Warm Pools to reuse training infrastructure. Warm Pools retain provisioned training instances after a job completes so subsequent jobs can start on already-initialized infrastructure, cutting cold-start and container pull time. This is designed for rapid iteration during experimentation with back-to-back jobs.
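In the SageMaker Python SDK, warm pools are enabled per job with a single parameter; the image URI, role, and S3 path below are placeholders:

```python
from sagemaker.estimator import Estimator

est = Estimator(
    image_uri="<training-image-uri>",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_count=1,
    instance_type="ml.g5.xlarge",
    keep_alive_period_in_seconds=1800,  # retain instances for 30 min after the job
)
est.fit({"train": "s3://my-bucket/train/"})
# A follow-up job with a matching instance configuration launches on the
# retained (warm) instances, skipping provisioning and image pull time
```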
SageMaker Managed Spot Training reduces cost by using spare capacity but does not specifically reduce startup time and can introduce additional delays due to capacity acquisition or interruptions.
SageMaker Training Compiler accelerates training computations for supported deep learning models but does not impact instance provisioning or container startup latency.
SageMaker Pipelines step caching can skip repeated steps when inputs are identical, but it does not consistently reduce training infrastructure startup time, especially when hyperparameters change between runs.
When a question emphasizes reducing startup time between successive training jobs or reusing infrastructure, look for warm pools. Distinguish between cost optimization (Spot) and latency reduction, and between algorithm-level acceleration (Training Compiler) and infrastructure warmup.
Question 20
Which AWS service provides low-latency real-time model inference with automatic scaling?
✓ D. Amazon SageMaker real-time endpoint
The correct choice is Amazon SageMaker real-time endpoint because it is a fully managed service purpose-built for hosting ML models behind HTTPS endpoints that deliver consistently low-latency predictions and support automatic scaling based on traffic. It integrates with Amazon CloudWatch for metrics, offers production-ready deployment workflows, and simplifies rolling updates and monitoring.
AWS Lambda with Amazon DynamoDB is serverless and scalable, but Lambda is not optimized for high-throughput, consistently low-latency ML inference and would require custom packaging, scaling logic, and potentially suffer from cold starts.
Amazon SageMaker Serverless Inference is convenient and auto-scales, but cold starts can add latency, making it less ideal for stringent low-latency SLAs.
Amazon Kinesis Data Analytics processes streaming data but does not host models for online inference.
AWS Fargate can run containers, yet you must build your own inference stack, scaling, and observability, which SageMaker handles natively.
When you see requirements for “low-latency real-time inference” plus “automatic scaling” for ML models, strongly consider SageMaker real-time endpoints. Watch for distractors like serverless compute or stream processing that do not specifically provide managed model hosting. Keywords to look for include real-time inference, auto scaling, and managed model hosting.
Cameron McKenzie is an AWS Certified AI Practitioner, Machine Learning Engineer, Copilot Expert, Solutions Architect and author of many popular books in the software development and Cloud Computing space. His growing YouTube channel, which trains devs in Java, Spring, AI and ML, has well over 30,000 subscribers.