35 Tough AWS AI Practitioner Sample Questions and Answers

Why the AWS AI Practitioner certification is important

Over the past few months, my team’s been working hard to help professionals adapt to the rapid rise of artificial intelligence and discover exciting new opportunities in cloud and data-driven careers.

Part of that journey is strengthening your resume with the right certifications, and the one I recommend every learner start with is the AWS Certified AI Practitioner credential.

Whether you’re a Scrum Master, Product Owner, DevOps Engineer, or senior developer, this AI certification builds the foundation you need to understand AI and machine learning the way AWS implements them: securely, at scale, and in the cloud.

In today’s tech landscape, you can’t afford to be unaware of how AI services power modern applications.

Every successful technologist needs to understand how to use foundation models, integrate APIs like Amazon Bedrock, and design solutions that take advantage of AWS’s AI ecosystem.

That’s exactly what the AWS AI Practitioner Exam validates. It tests your understanding of fundamental AI and ML concepts, AWS AI services, responsible AI practices, and how businesses can use AI to innovate faster and more efficiently.

AWS AI Practitioner exam simulators

Through my Udemy courses on the AWS Cloud Practitioner exam, machine learning, AWS DevOps and solutions architect certifications, and through the free practice question banks at certificationexams.pro, I’ve seen which AI concepts challenge learners the most.

Based on thousands of student sessions and performance data, these are 35 of the toughest AWS AI Practitioner certification exam questions currently circulating in the practice pool.

Each question is carefully explained at the end of the set. Take your time, think like an AWS AI architect, and review the reasoning behind each answer to strengthen your understanding.

If you’re preparing for the AWS AI Practitioner Exam or exploring related certifications from AWS, Google Cloud, or Azure, you’ll find hundreds more free practice exam questions and detailed explanations at certificationexams.pro as well.

And just to be clear, these are not AI Practitioner exam dumps or braindumps.

Every question is original and designed to teach you how to think about AI and cloud integration, not just what to memorize.

Each answer includes its own tip and guidance to help you pass the real exam with confidence.

Now, let’s dive into the 35 toughest AWS AI Practitioner certification exam questions.

Good luck, and remember, every great cloud career in the age of AI starts with understanding how AWS brings artificial intelligence to life.

AWS AI Practitioner Exam Questions

Certification Exam Simulator Questions

BlueQuill Systems, a design automation startup, wants to add generative AI to tools used by eight internal teams and is comparing Amazon Q for automating enterprise workflows with Amazon Bedrock for working with foundation models; which statements correctly explain their key differences in capabilities, model choice, and typical use so the team can assign the right service to each workload? (Choose 2)

  • ❏ A. Amazon Q allows you to pick from multiple foundation models, whereas Amazon Bedrock restricts you to a single default model

  • ❏ B. Amazon Q is a generative AI assistant for ready-to-use business and developer workflows, while Amazon Bedrock is a managed platform to build and scale generative AI solutions with foundation models

  • ❏ C. Both Amazon Q and Amazon Bedrock are assistants that ship as prepackaged generative AI apps

  • ❏ D. Amazon Bedrock offers a catalog of foundation models to choose from, but Amazon Q does not let you select the underlying model

  • ❏ E. Amazon SageMaker

An academic consortium is building a climate forecasting model using Amazon SageMaker and wants other scientists to verify how the model makes decisions. They are considering adopting an open-source model to improve openness and reviewability. What is the primary transparency advantage of choosing an open-source model?

  • ❏ A. Open-source projects include automatic explanations for every prediction

  • ❏ B. Open-source licensing always permits unrestricted commercial use

  • ❏ C. The source code can be examined and adapted, making behavior easier to understand

  • ❏ D. Amazon SageMaker Clarify

Meridian Finance operates an AI-powered HR analytics platform on AWS that processes confidential employee records. The security team needs a service that continuously evaluates the environment against security best practices and provides prioritized remediation guidance. Which service should they use?

  • ❏ A. Amazon Macie

  • ❏ B. AWS Well-Architected Tool

  • ❏ C. AWS Trusted Advisor

  • ❏ D. AWS CloudTrail

A parcel delivery network at MetroShip uses Amazon Bedrock to forecast late arrivals from traffic feeds. The predictions often miss abrupt events such as lane closures or multi-vehicle accidents that occur within minutes. What change would most improve the model’s responsiveness to these rapid conditions?

  • ❏ A. Embedding indexing

  • ❏ B. Ingest live traffic events with Amazon Kinesis Data Streams

  • ❏ C. Fine-tuning

  • ❏ D. Amazon SageMaker Clarify

A regional airline plans to launch a customer help assistant built with Amazon Bedrock to answer booking and policy questions. Which AWS feature will most improve the assistant’s factual accuracy and safe, consistent behavior?

  • ❏ A. SageMaker Model Monitor

  • ❏ B. Guardrails

  • ❏ C. Amazon Polly

  • ❏ D. Amazon Rekognition

A ride-hailing startup aims to group trip paths from the last 90 days into similar clusters based on geographic proximity and congestion trends to improve driver dispatching. Which category of machine learning should they use?

  • ❏ A. Supervised Learning

  • ❏ B. Reinforcement Learning

  • ❏ C. Unsupervised learning

  • ❏ D. Semi-Supervised Learning

A regional insurance provider plans a virtual assistant that can search through about 8 million policy documents and claim notes and then deliver grounded, contextual answers to staff questions. Which AI design pattern should they adopt?

  • ❏ A. Basic retrieval system without generation

  • ❏ B. Fine-tuned model on a general-purpose dataset

  • ❏ C. Retrieval-Augmented Generation (RAG) approach

  • ❏ D. Pretrained model with no access to external knowledge

A retail marketplace named RiverCart is moving to AWS and plans to power personalized product suggestions with Amazon Bedrock. The team has a curated dataset of 250,000 labeled clicks and purchases and wants the customization to remain private to their account while improving the chosen foundation model’s outputs. Which approach best meets this requirement?

  • ❏ A. Train the base FM directly in Amazon Bedrock using the labeled dataset

  • ❏ B. Build a brand-new model from scratch in Amazon Bedrock using the labeled dataset

  • ❏ C. Create a private fine-tuned copy of the selected FM in Amazon Bedrock and train that copy with the labeled dataset

  • ❏ D. Create a publicly shareable copy of the base FM in Amazon Bedrock and train that public copy with the labeled dataset

At a cybersecurity startup using Amazon SageMaker to classify unsolicited email collected over the last 120 days, you need to compare models by how well they distinguish spam from legitimate messages regardless of where the prediction threshold is set. Which evaluation method should you use to understand performance across all possible decision cutoffs?

  • ❏ A. Average precision on the precision–recall curve

  • ❏ B. Confusion matrix at a chosen score cutoff

  • ❏ C. AUC-ROC to evaluate threshold-independent separability

  • ❏ D. F1 score computed at a single operating point

A streaming startup called AuroraPlay wants to add machine learning so that when a viewer selects a title, the app shows closely related content to boost discovery and engagement. Which option describes a capability that Amazon Personalize can provide?

  • ❏ A. Provide natural-language enterprise search across company content

  • ❏ B. Automate telecom customer workflows such as SIM activation and plan changes

  • ❏ C. Recommend items that are most similar to a product a user is viewing

  • ❏ D. Extract document layout elements like paragraphs, titles, and lists from files

A logistics company is using Amazon SageMaker to train a vision model with 90,000 labeled warehouse photos to detect damaged parcels across different lighting conditions and camera angles. The team wants the model to generalize to new images and avoid both overfitting and underfitting. Which statements correctly differentiate overfitting from underfitting? (Choose 2)

  • ❏ A. More complex models invariably outperform simpler ones

  • ❏ B. Simplifying the model can reduce overfitting, while adding more labeled training data can lessen underfitting

  • ❏ C. Underfitting shows strong training accuracy but weak validation accuracy, while overfitting performs poorly on both

  • ❏ D. Amazon CloudWatch

  • ❏ E. Overfitting occurs when a model performs well on training data but poorly on new data, while underfitting fails to capture patterns even in the training data

A digital media company is fine-tuning a foundation model in Amazon Bedrock to study audience interactions and anticipate content demand. The team plans to run validation on roughly 40 TB of labeled data and needs a storage location that Bedrock natively supports for accessing large datasets during model validation. Which storage service should the team choose?

  • ❏ A. Amazon RDS

  • ❏ B. Amazon S3

  • ❏ C. Amazon EBS

  • ❏ D. Amazon EFS

ByteCraft Electronics operates a customer help portal where shoppers ask about device compatibility and warranty terms. The team needs Amazon Q Business to return precise answers that reflect catalog and policy changes that sync every 45 minutes from internal systems. Which capability of Amazon Q Business ensures answers stay accurate and relevant to the most current business data?

  • ❏ A. Generates responses only from the model’s general training data without checking company-specific sources

  • ❏ B. Uses Retrieval Augmented Generation to pull the latest catalog and policy content and ground the answer in that context

  • ❏ C. Delivers replies from a static, rule-driven knowledge base that is not updated in real time

  • ❏ D. Enforces guardrails and content filters to block unsafe topics instead of retrieving fresh business data

A sports arena in Chicago operates gated parking with IP cameras at each entrance. The operations team wants to read license plate text from live video and instantly compare it to a denylist of about 1,200 plate numbers stored in their system so security is alerted when a match is found. Which AWS service should they use to implement this solution?

  • ❏ A. Amazon Bedrock

  • ❏ B. Amazon Rekognition

  • ❏ C. Amazon Kendra

  • ❏ D. Amazon Comprehend

A drone logistics startup is building an AI flight controller to improve obstacle avoidance and route selection for autonomous quadcopters. The team will use deep learning and wants to clearly understand how a neural network learns from large volumes of flight and video data so it can make split-second choices. What best describes the training process for a deep learning model?

  • ❏ A. Training in deep learning involves manually configuring all weights and biases according to fixed heuristics

  • ❏ B. Amazon SageMaker

  • ❏ C. Training in deep learning iteratively updates network weights and biases using backpropagation with optimizers like gradient descent on large datasets to minimize a loss function

  • ❏ D. Training in deep learning requires no dataset, since the model learns solely from built-in rules

Northstar Analytics is building an internal assistant with an Amazon Bedrock foundation model and wants to lower monthly token spend while keeping answer quality high. Which approach is the most effective to achieve this?

  • ❏ A. Raise the model’s temperature to produce outputs faster

  • ❏ B. Fine-tune the Bedrock model with domain data so prompts can be shorter while maintaining quality

  • ❏ C. Perform continued pretraining on Amazon SageMaker using the company’s full dataset

  • ❏ D. Lower the maximum tokens per response to reduce output size

At Helios Retail Labs, a small analytics group is transforming raw datasets before training a predictive model. They apply one-hot encoding, standardization, and build interaction terms to improve predictive quality. Which choice best describes the aim of feature engineering in this situation?

  • ❏ A. Automatically choosing the best algorithm or architecture based on the dataset and task

  • ❏ B. Amazon SageMaker Autopilot

  • ❏ C. Applying domain expertise to convert raw data into informative features that boost model performance and efficiency

  • ❏ D. Systematically searching for the most effective hyperparameters over many training runs

An analytics startup is using a foundation model with Amazon Bedrock to power an internal question-and-answer assistant, and the team wants to improve response quality by progressing from the simplest approach to the most advanced; which sequence represents the techniques in increasing implementation complexity?

  • ❏ A. Retrieval-Augmented Generation (RAG), Prompt engineering, Fine-tuning

  • ❏ B. Fine-tuning, Retrieval-Augmented Generation (RAG), Prompt engineering

  • ❏ C. Prompt engineering, Retrieval-Augmented Generation (RAG), Fine-tuning

  • ❏ D. Prompt engineering, Fine-tuning, Retrieval-Augmented Generation (RAG)

Orion Logistics built a machine learning model to predict which employees might leave. Over the last 90 days the evaluation results show that accuracy differs widely by division, ranging from 91% in Finance to 67% in Warehouse Operations. Which approach would best uncover and correct this uneven performance across groups?

  • ❏ A. Amazon Polly

  • ❏ B. Automated hyperparameter tuning

  • ❏ C. Subgroup evaluation of model performance by division

  • ❏ D. Amazon Transcribe

A healthcare research lab called NorthBay Insights is rolling out a machine learning platform that processes confidential datasets. The team must ensure only authorized users can read or modify the information and wants to enforce least-privilege access across AWS resources. Which approach should they use to protect access to the data?

  • ❏ A. AWS CloudTrail

  • ❏ B. Use AWS IAM roles with fine-grained policies and permissions to enforce access control

  • ❏ C. Amazon Macie

  • ❏ D. Encrypt the data but do not configure any specific access permissions

AeroTrip, an online travel agency, handles about 25,000 customer support tickets each day and plans to roll out Agents for Amazon Bedrock to streamline how requests are resolved. Which benefits of these agents would most directly address this need?

  • ❏ A. Automatically building and training new proprietary foundation models to anticipate customer intent

  • ❏ B. Offloading all task coordination and state management to AWS Step Functions

  • ❏ C. Automating repetitive actions and orchestrating multi-step workflows with tool and API calls

  • ❏ D. Choosing and switching between multiple foundation models and merging their outputs without setup

A global insurance carrier is building machine learning models in AWS to flag suspicious claims and refine underwriting risk scores. To enable five distributed data science squads to share work and keep training and inference consistent, they want a centralized place to store, discover, and version reusable ML features. Which AWS service best fits this requirement?

  • ❏ A. Amazon SageMaker Data Wrangler

  • ❏ B. Amazon SageMaker Feature Store

  • ❏ C. Amazon SageMaker Ground Truth

  • ❏ D. Amazon SageMaker Canvas

Skylark Retail is launching a customer support platform and needs to derive insights from recorded customer conversations. The team processes about 45,000 calls each month and wants to automatically convert the call audio into text so downstream tools can extract key details. Which AWS service should they use?

  • ❏ A. Amazon Lex

  • ❏ B. Amazon Transcribe

  • ❏ C. Amazon SageMaker Model Monitor

  • ❏ D. Amazon Comprehend

A same-day courier, Northwind Express, uses Amazon Bedrock to generate turn-by-turn routes using real-time congestion data and package urgency. During the evening rush, traffic to the API rises to about 450 requests per second and p95 latency grows from 250 ms to 1.4 seconds, slowing dispatch decisions. Which change should the team prioritize to bring latency back under target while demand is high?

  • ❏ A. Fine-tune the model with more route data

  • ❏ B. Increase provisioned throughput and concurrency in Amazon Bedrock

  • ❏ C. AWS Global Accelerator

  • ❏ D. Lower the maximum tokens per response

A fintech startup wants to enforce fairness and transparent predictions in its ML workflow. During data preparation, which capability of Amazon SageMaker Clarify would help achieve this goal?

  • ❏ A. Amazon SageMaker Model Monitor

  • ❏ B. Flags potential dataset bias during data preparation

  • ❏ C. Amazon SageMaker Model Cards

  • ❏ D. Amazon Bedrock Knowledge Bases

Orion Mutual Insurance processes tens of thousands of policy contracts and addendums each week and plans to use AWS AI to extract key terms, detect missing clauses, and classify agreements from scanned documents. The compliance team is worried that confidential customer information could be exposed when interacting with AI models. Which AWS practices best reduce the risk of disclosing sensitive data? (Choose 2)

  • ❏ A. Store document data in Amazon RDS instead of Amazon S3

  • ❏ B. Encrypt stored documents with AWS Key Management Service (KMS)

  • ❏ C. Disable API logging to prevent data leakage

  • ❏ D. AWS Shield Advanced

  • ❏ E. Implement IAM policies that tightly restrict access to AI models and datasets

A regional retailer, Luma Outfitters, is rolling out a support chatbot on Amazon Bedrock. The team wants the model to generate replies while limiting choices to the smallest set of highest-probability next tokens whose cumulative likelihood reaches a chosen threshold, for example about 88%. Which inference parameter should they configure?

  • ❏ A. Temperature

  • ❏ B. Top-k

  • ❏ C. Top-p sampling

  • ❏ D. Epochs

NorthGrid Power, a regional utility, plans to deploy a customer support chatbot that handles about 12,000 chats per month and steadily improves its replies by learning from thumbs-up or thumbs-down ratings on prior conversations as well as drawing on newly added knowledge sources like updated product guides and community posts. Which machine learning approach best enables this continuous improvement?

  • ❏ A. Supervised learning with a labeled dataset of high-quality and low-quality responses

  • ❏ B. Unsupervised learning to cluster similar support questions

  • ❏ C. Reinforcement learning that uses customer ratings as reward signals

  • ❏ D. Supervised learning that retrains when the FAQ and knowledge articles are updated

A travel booking startup called Skyline Trails plans to use Amazon Bedrock to help writers and designers produce trip guides, ad copy, and promotional visuals for campaigns. The team wants a simple explanation of how generative models can create original text, images, or audio from what they have learned so they can select the right workflow and safeguards. How do these systems generate new content?

  • ❏ A. Using fixed rules and templates created by developers with no learning from examples

  • ❏ B. Sampling novel outputs from models that learned statistical patterns in large training datasets

  • ❏ C. Amazon Rekognition

  • ❏ D. Producing content by making arbitrary random choices without reference to prior data

A precision medicine startup is using a Foundation Model in Amazon Bedrock to analyze high-throughput sequencing results and generate insights for new therapies. The team wants the model to become highly specialized in genomics so it can better understand domain-specific terminology, patterns, and data sources. Which approaches would most effectively transform the general FM into a genomics-focused expert? (Choose 2)

  • ❏ A. Continued pretraining on a large, curated corpus of genomics literature, variant databases, and sequencing data

  • ❏ B. Reinforcement learning with reward signals from human feedback to adapt the model to genomics

  • ❏ C. Supervised learning on a labeled downstream task such as gene–disease classification to specialize the FM

  • ❏ D. Domain adaptation fine-tuning of the FM using high-quality, domain-specific genomics datasets

  • ❏ E. Incremental learning to add new genomics samples over time without catastrophic forgetting

A mobile gaming studio wants to automatically tag player reviews as positive, negative, or neutral to monitor customer sentiment. Which learning approach should they use?

  • ❏ A. Clustering

  • ❏ B. Unsupervised learning

  • ❏ C. Supervised learning

  • ❏ D. Reinforcement learning

A video streaming startup is using Amazon Bedrock to produce personalized show summaries and viewing suggestions. The machine learning team is adjusting decoding settings and wants to understand what changing the Top K parameter actually controls so they can balance consistency and variety in the outputs. What should you tell the team about Top K?

  • ❏ A. Specifies character sequences that cause generation to stop

  • ❏ B. Controls the cumulative probability mass of candidates considered for the next token

  • ❏ C. Sets how many of the highest-probability token candidates are eligible for the next step

  • ❏ D. Adjusts randomness by increasing the chance of choosing lower-probability tokens

Rivera Robotics plans to add language-model features to rugged handhelds used by field crews at remote wind farms. The team needs sub-50 ms responses and cannot depend on consistent network access. Which approach will best achieve this?

  • ❏ A. Deploy compressed large language models locally on the edge devices

  • ❏ B. Use a central small language model endpoint with asynchronous calls from devices

  • ❏ C. Run tuned small language models directly on the edge hardware

  • ❏ D. Integrate a centralized large language model endpoint with asynchronous communication

A data science group at Aurora Retail is developing a machine learning service that must adhere to company fairness policies and equal opportunity laws to prevent discriminatory outcomes. Which capability should they prioritize to confirm the model treats all user segments fairly?

  • ❏ A. Model compression techniques

  • ❏ B. Bias detection and mitigation

  • ❏ C. Amazon SageMaker Automatic Model Tuning

  • ❏ D. Cross-validation

A wealth management startup is building a generative AI tool to condense equity research memos from the past 18 months into brief summaries. The security team worries that the model’s responses could unintentionally surface proprietary trading approaches and nonpublic methods. Which discipline should the team focus on first to address this risk?

  • ❏ A. Identity and access management

  • ❏ B. AI risk management

  • ❏ C. Network segmentation and isolation

  • ❏ D. Centralized logging and monitoring

Answers to the Certification Exam Simulator Questions

BlueQuill Systems, a design automation startup, wants to add generative AI to tools used by eight internal teams and is comparing Amazon Q for automating enterprise workflows with Amazon Bedrock for working with foundation models; which statements correctly explain their key differences in capabilities, model choice, and typical use so the team can assign the right service to each workload? (Choose 2)

  • ✓ B. Amazon Q is a generative AI assistant for ready-to-use business and developer workflows, while Amazon Bedrock is a managed platform to build and scale generative AI solutions with foundation models

  • ✓ D. Amazon Bedrock offers a catalog of foundation models to choose from, but Amazon Q does not let you select the underlying model

The correct answers are Amazon Q is a generative AI assistant for ready-to-use business and developer workflows, while Amazon Bedrock is a managed platform to build and scale generative AI solutions with foundation models and Amazon Bedrock offers a catalog of foundation models to choose from, but Amazon Q does not let you select the underlying model.

Amazon Q is designed as a ready-to-use assistant that speeds task completion and automates workflows for business and developer teams. It provides prebuilt workflows and integrations so teams can adopt generative AI without building end-to-end ML pipelines. Amazon Bedrock is a managed platform for building and scaling generative AI applications, and it provides a catalog of foundation models from multiple providers along with capabilities for retrieval-augmented generation, model customization, and orchestration.

Amazon Bedrock exposes multiple foundation models that you can choose from and combine for different workloads. Amazon Q does not let you pick the underlying foundation model and instead offers a packaged assistant experience optimized for productivity and common business workflows.
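
For the team weighing Bedrock’s model choice, a minimal sketch of browsing the model catalog and then invoking a selected model with the Converse API might look like the following. The region and model ID are illustrative assumptions, not a recommendation.

```python
import boto3

# Control-plane client: browse the catalog of foundation models available in the Region
bedrock = boto3.client("bedrock", region_name="us-east-1")
for model in bedrock.list_foundation_models()["modelSummaries"]:
    print(model["modelId"], model["providerName"])

# Runtime client: call one of the models chosen from the catalog
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
response = runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # illustrative model ID
    messages=[{"role": "user", "content": [{"text": "Summarize our design review checklist."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```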

Amazon Q allows you to pick from multiple foundation models, whereas Amazon Bedrock restricts you to a single default model is wrong because Amazon Bedrock supports multiple FMs from different providers and Amazon Q does not provide direct model selection.

Both Amazon Q and Amazon Bedrock are assistants that ship as prepackaged generative AI apps is incorrect because only Amazon Q functions as a ready to use assistant while Amazon Bedrock is a platform for developers to build and operate custom generative AI solutions.

Amazon SageMaker is not a statement about Q versus Bedrock and is therefore irrelevant to this choice. SageMaker is AWS's service for building, training, and hosting machine learning models, and it is distinct from the assistant focus of Amazon Q and the foundation model platform role of Amazon Bedrock.

Map each workload to the right service role: choose Amazon Q when you want an assistant for quick productivity and prebuilt workflows, and choose Amazon Bedrock when you need model choice and a platform for production generative AI capabilities.

An academic consortium is building a climate forecasting model using Amazon SageMaker and wants other scientists to verify how the model makes decisions. They are considering adopting an open-source model to improve openness and reviewability. What is the primary transparency advantage of choosing an open-source model?

  • ✓ C. The source code can be examined and adapted, making behavior easier to understand

The source code can be examined and adapted, making behavior easier to understand is correct because it gives researchers direct access to the implementation so they can audit algorithms, trace decision logic, and reproduce or modify behavior for validation.

Because the source code is available, reviewers can follow data handling and model architecture details that are not visible from predictions alone, and they can run experiments that verify claims or reveal issues.

Open-source projects include automatic explanations for every prediction is incorrect because simply being open source does not guarantee that per prediction explanations exist or are implemented. Explainability depends on the model design and the tooling used to produce explanations.

Open-source licensing always permits unrestricted commercial use is incorrect because licenses vary and some require attribution or impose share alike or noncommercial terms. License review is necessary before assuming commercial rights.

Amazon SageMaker Clarify is a useful AWS service for explainability and bias detection but it is not an inherent property of open-source models and it does not by itself represent the primary transparency advantage of choosing open source.

When a question stresses transparency or auditability, look for language about inspecting, reviewing, or modifying the source code. That phrasing usually points to an open-source advantage.

Meridian Finance operates an AI-powered HR analytics platform on AWS that processes confidential employee records. The security team needs a service that continuously evaluates the environment against security best practices and provides prioritized remediation guidance. Which service should they use?

  • ✓ C. AWS Trusted Advisor

The correct choice is AWS Trusted Advisor. It provides continuous automated checks across security and other categories and gives prioritized, actionable recommendations to improve configurations across the AWS environment.

AWS Trusted Advisor runs account-wide checks and highlights high priority items with guidance you can follow to remediate issues. It covers security, fault tolerance, performance, cost optimization, and service limits and is designed to provide practical remediation steps rather than only collecting logs.
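
As a rough illustration, and assuming a Business or Enterprise support plan that unlocks the Trusted Advisor API, the security checks and their flagged resources can be pulled programmatically:

```python
import boto3

# The AWS Support API exposes Trusted Advisor checks (us-east-1 endpoint)
support = boto3.client("support", region_name="us-east-1")

checks = support.describe_trusted_advisor_checks(language="en")["checks"]
security_checks = [c for c in checks if c["category"] == "security"]

for check in security_checks:
    result = support.describe_trusted_advisor_check_result(checkId=check["id"])["result"]
    print(check["name"], result["status"], len(result.get("flaggedResources", [])), "flagged")
```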

Amazon Macie focuses on discovering and classifying sensitive data stored in Amazon S3 and it does not perform comprehensive, account-wide best-practice posture checks or provide prioritized remediation across services.

AWS CloudTrail captures and records API activity for auditing, investigations, and forensics, but it does not evaluate resource configurations or generate best-practice remediation guidance.

AWS Well-Architected Tool supports structured design reviews and guidance for individual workloads and it is intended for periodic assessments rather than continuous, account-wide monitoring with prioritized remediation like Trusted Advisor.

When a question asks for continuous best-practice checks and prioritized remediation choose AWS Trusted Advisor. Map data discovery to Amazon Macie and API audit logs to AWS CloudTrail.

A parcel delivery network at MetroShip uses Amazon Bedrock to forecast late arrivals from traffic feeds. The predictions often miss abrupt events such as lane closures or multi-vehicle accidents that occur within minutes. What change would most improve the model’s responsiveness to these rapid conditions?

  • ✓ B. Ingest live traffic events with Amazon Kinesis Data Streams

The correct option is Ingest live traffic events with Amazon Kinesis Data Streams. Real-time streaming of traffic events lets models and the delivery orchestration receive new incidents as they happen, which improves responsiveness to lane closures and multi-vehicle accidents.

By ingesting events with Amazon Kinesis Data Streams, the system can update forecasts or trigger rerouting logic within seconds or minutes. Streaming reduces the latency between an incident occurring and that signal being available to Amazon Bedrock or downstream components, and it complements model improvements and retraining by supplying current signals.
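
A minimal producer sketch, assuming a stream named traffic-events already exists, shows how a sudden incident could be pushed into the pipeline the moment it is reported:

```python
import boto3
import datetime
import json

kinesis = boto3.client("kinesis", region_name="us-east-1")

incident = {
    "type": "lane_closure",
    "road": "I-90 westbound",
    "reported_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
}

# Each record lands on a shard chosen by the partition key and is available to consumers within seconds
kinesis.put_record(
    StreamName="traffic-events",  # hypothetical stream name
    Data=json.dumps(incident).encode("utf-8"),
    PartitionKey=incident["road"],
)
```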

Embedding indexing improves retrieval for retrieval augmented generation and it can help find relevant historical context but it does not provide fresh, time sensitive traffic signals that reflect sudden incidents.

Fine-tuning adapts a model to historical patterns and it can improve baseline accuracy but it cannot make the model see events that occur after training unless those events are delivered in real time.

Amazon SageMaker Clarify is intended for bias detection and explainability and it does not ingest live streaming events or reduce latency for rapidly occurring traffic incidents.

When scenario details emphasize changes that happen by the minute choose streaming ingestion such as Kinesis instead of relying only on retraining or indexing.

A regional airline plans to launch a customer help assistant built with Amazon Bedrock to answer booking and policy questions. Which AWS feature will most improve the assistant’s factual accuracy and safe, consistent behavior?

  • ✓ B. Guardrails

The correct choice is Guardrails. Guardrails in Amazon Bedrock let you define safety and content policies and enforce company rules so the assistant gives more factual and consistent answers.

Guardrails are applied to model outputs to restrict topics and enforce policy constraints which reduces hallucinations and improves alignment with company guidelines. They operate at the Bedrock orchestration level so you can shape response content and behavior without changing the underlying LLM.
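
A hedged sketch of attaching a previously created guardrail to a Bedrock Converse call follows; the guardrail ID, version, and model ID are placeholders:

```python
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # illustrative model ID
    messages=[{"role": "user", "content": [{"text": "What is your checked-bag fee?"}]}],
    guardrailConfig={
        "guardrailIdentifier": "gr-1234567890",  # hypothetical guardrail ID
        "guardrailVersion": "1",
    },
)
# The stop reason indicates whether the guardrail intervened in the generated answer
print(response["stopReason"])
print(response["output"]["message"]["content"][0]["text"])
```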

SageMaker Model Monitor is designed to detect data and model quality drift for SageMaker endpoints and does not directly control or constrain Bedrock chat responses.

Amazon Polly provides text to speech capabilities and it does not affect the factual accuracy or safety of text generated by a language model.

Amazon Rekognition performs image and video analysis and is unrelated to improving a text assistant’s reliability.

When a question asks about improving LLM safety and factual consistency in Bedrock choose policy and control features such as guardrails rather than monitoring tools or modality services.

A ride-hailing startup aims to group trip paths from the last 90 days into similar clusters based on geographic proximity and congestion trends to improve driver dispatching. Which category of machine learning should they use?

  • ✓ C. Unsupervised learning

Unsupervised learning is correct because the task is to group unlabeled trip paths by geographic proximity and congestion trends to improve driver dispatching.

Unsupervised learning uses clustering algorithms such as k-means and DBSCAN to group similar GPS traces and traffic features without labels. These clusters can reveal common corridors and congestion patterns that inform scheduling and routing rules, and they can be implemented with platforms such as Amazon SageMaker or with standard machine learning libraries when processing the last 90 days of trip data.
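
To make the idea concrete, a small scikit-learn sketch clusters trips by pickup coordinates and an average congestion score; the column names and values are invented for illustration:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical trip features from the last 90 days
trips = pd.DataFrame({
    "pickup_lat": [41.88, 41.89, 41.79, 41.80, 41.95],
    "pickup_lon": [-87.63, -87.62, -87.60, -87.59, -87.65],
    "congestion": [0.8, 0.7, 0.3, 0.2, 0.5],
})

# Scale features so coordinates and congestion contribute comparably
X = StandardScaler().fit_transform(trips)

# Group trips into clusters with no labels involved
trips["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(trips)
```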

Supervised Learning is incorrect because it requires labeled targets such as predefined route categories and the scenario has no labels to train against.

Reinforcement Learning is incorrect because it focuses on learning policies from rewards through interaction over time and not on discovering structure in a static dataset of past trips.

Semi-Supervised Learning is incorrect because it is intended for problems where a small labeled set is available to guide learning together with many unlabeled examples and this task is purely about clustering unlabeled trip data so semi supervised approaches are unnecessary.

When the data is unlabeled and you need to group similar records choose unsupervised clustering. Look for mentions of algorithms like k-means or DBSCAN on the exam.

A regional insurance provider plans a virtual assistant that can search through about 8 million policy documents and claim notes and then deliver grounded, contextual answers to staff questions. Which AI design pattern should they adopt?

  • ✓ C. Retrieval-Augmented Generation (RAG) approach

The correct choice is Retrieval-Augmented Generation (RAG) approach. This option best fits a virtual assistant that must search eight million policy documents and claim notes and deliver grounded, contextual answers to staff.

Retrieval-Augmented Generation (RAG) approach couples a retrieval step that finds the most relevant passages from the insurer’s corpus with a generation step that composes concise, contextual responses. The retrieval component can scale with vector search or other indexed methods and the generator uses the retrieved text to produce answers that are grounded in the organization’s own documents.

Retrieval-Augmented Generation (RAG) approach also allows the document index to be updated independently of the model so the assistant can reflect new policies and recent claim notes without retraining. This separation reduces the risk of hallucination because the generator conditions its output on authoritative, retrieved content.
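
With Amazon Bedrock Knowledge Bases, a single API call can run both the retrieval step and the grounded generation step; in this sketch the knowledge base ID and model ARN are placeholders:

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "What is the waiting period for water damage claims?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KBID123456",  # hypothetical knowledge base over the policy corpus
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
# The answer is grounded in the retrieved passages, which are also returned as citations
print(response["output"]["text"])
```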

Basic retrieval system without generation is insufficient because it returns documents or passages but does not synthesize a conversational, concise answer for staff. That approach forces users to read and interpret results which reduces usability and speed.

Fine-tuned model on a general-purpose dataset may improve some behaviors but it does not provide live access to the insurer’s proprietary corpus. Fine-tuning on general data cannot guarantee that responses are grounded in current internal documents and it will become stale as policies change.

Pretrained model with no access to external knowledge is limited to what it learned during training and cannot incorporate organization specific policies or recent claim notes. That makes it unsuitable for delivering up to date, grounded answers for internal staff.

When a question requires answers that are grounded in private documents or must stay up to date pick a retrieval plus generation design rather than pure retrieval or a static fine-tune.

A retail marketplace named RiverCart is moving to AWS and plans to power personalized product suggestions with Amazon Bedrock. The team has a curated dataset of 250,000 labeled clicks and purchases and wants the customization to remain private to their account while improving the chosen foundation model’s outputs. Which approach best meets this requirement?

  • ✓ C. Create a private fine-tuned copy of the selected FM in Amazon Bedrock and train that copy with the labeled dataset

Create a private fine-tuned copy of the selected FM in Amazon Bedrock and train that copy with the labeled dataset is correct because Bedrock creates a separate, account scoped model variant that you can train with your data while leaving the provider’s base model unchanged and keeping your customization private.

Fine-tuning a private copy lets RiverCart use the curated dataset of 250,000 labeled clicks and purchases to adapt model outputs for personalized product suggestions. The fine-tuned copy is managed in your account and Region, so your training data and the resulting model behavior remain private.
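
A rough sketch of starting that customization, assuming the labeled interaction data has already been converted to Bedrock's JSONL format and uploaded to S3; every name, ARN, and hyperparameter value below is a placeholder:

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

bedrock.create_model_customization_job(
    jobName="rivercart-recs-ft-001",
    customModelName="rivercart-recs-private",  # the private copy stays in this account
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-lite-v1",  # illustrative base FM
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://rivercart-ml/clicks-purchases-train.jsonl"},
    outputDataConfig={"s3Uri": "s3://rivercart-ml/customization-output/"},
    hyperParameters={"epochCount": "2", "batchSize": "8", "learningRate": "0.00001"},
)
```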

Train the base FM directly in Amazon Bedrock using the labeled dataset is incorrect because Bedrock does not alter the shared base foundation model directly and instead provides private customized variants for customers.

Build a brand-new model from scratch in Amazon Bedrock using the labeled dataset is incorrect because Bedrock is intended for using and customizing foundation models and it is not the service for end to end training from scratch. Building a model from scratch is typically handled with Amazon SageMaker.

Create a publicly shareable copy of the base FM in Amazon Bedrock and train that public copy with the labeled dataset is incorrect because Bedrock customizations are private to the account and are not published as public base models for other customers to use.

When you need to improve a foundation model with your data and keep the result private choose fine-tuning a private Bedrock model copy rather than trying to retrain the shared base model or build from scratch.

At a cybersecurity startup using Amazon SageMaker to classify unsolicited email collected over the last 120 days, you need to compare models by how well they distinguish spam from legitimate messages regardless of where the prediction threshold is set. Which evaluation method should you use to understand performance across all possible decision cutoffs?

  • ✓ C. AUC-ROC to evaluate threshold-independent separability

AUC-ROC to evaluate threshold-independent separability is correct because it measures how well a model ranks positive examples above negative ones across every possible decision threshold.

AUC-ROC to evaluate threshold-independent separability aggregates the true positive rate and false positive rate over all thresholds, and the area under the ROC curve quantifies class separability, so a higher area indicates the model better distinguishes spam from legitimate messages independent of any chosen cutoff.
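
A minimal scikit-learn sketch shows that AUC-ROC is computed from predicted scores rather than from hard labels at one cutoff; the labels and scores are illustrative:

```python
from sklearn.metrics import roc_auc_score

# 1 = spam, 0 = legitimate; scores are the model's predicted probabilities of spam
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_scores = [0.92, 0.40, 0.75, 0.65, 0.30, 0.55, 0.88, 0.10]

# AUC-ROC summarizes ranking quality across every possible threshold
print(roc_auc_score(y_true, y_scores))
```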

Confusion matrix at a chosen score cutoff only reports performance at a single threshold and it cannot show how the model behaves as you change the decision cutoff so it is not suitable for threshold-agnostic comparison.

F1 score computed at a single operating point gives a balanced measure of precision and recall but only at one operating point so it does not reflect performance across all thresholds.

Average precision on the precision–recall curve summarizes ranking quality in precision–recall space and it is useful for imbalanced datasets, but it emphasizes precision–recall trade-offs rather than ROC separability, which is the specific concern when comparing models across every possible cutoff.

Think ROC AUC for questions that ask about performance independent of thresholds and use PR metrics when precision versus recall matters for imbalanced positive classes.

A streaming startup called AuroraPlay wants to add machine learning so that when a viewer selects a title, the app shows closely related content to boost discovery and engagement. Which option describes a capability that Amazon Personalize can provide?

  • ✓ C. Recommend items that are most similar to a product a user is viewing

The correct option is Recommend items that are most similar to a product a user is viewing.

Amazon Personalize provides item to item similarity recommendations and it enables an application to surface related titles or products to increase discovery and viewer engagement and to improve conversions.

Provide natural-language enterprise search across company content is a use case for Amazon Kendra and it focuses on intelligent search over organizational data rather than delivering personalized item suggestions.

Automate telecom customer workflows such as SIM activation and plan changes points to Amazon Lex and its prebuilt telecom bots which support conversational self service rather than recommendation systems.

Extract document layout elements like paragraphs, titles, and lists from files corresponds to Amazon Textract which performs OCR and layout extraction instead of providing personalization features.

When you see keywords like personalized recommendations or similar items think Amazon Personalize and not search or OCR services.

A logistics company is using Amazon SageMaker to train a vision model with 90,000 labeled warehouse photos to detect damaged parcels across different lighting conditions and camera angles. The team wants the model to generalize to new images and avoid both overfitting and underfitting. Which statements correctly differentiate overfitting from underfitting? (Choose 2)

  • ✓ B. Simplifying the model can reduce overfitting, while adding more labeled training data can lessen underfitting

  • ✓ E. Overfitting occurs when a model performs well on training data but poorly on new data, while underfitting fails to capture patterns even in the training data

Simplifying the model can reduce overfitting, while adding more labeled training data can lessen underfitting and Overfitting occurs when a model performs well on training data but poorly on new data, while underfitting fails to capture patterns even in the training data are correct because they describe the common causes and remedies for poor generalization.

Simplifying the model can reduce overfitting, while adding more labeled training data can lessen underfitting is correct because reducing model capacity or adding regularization prevents memorizing noise in the training set and because more diverse labeled examples help the model learn real variation instead of missing important patterns.

Overfitting occurs when a model performs well on training data but poorly on new data, while underfitting fails to capture patterns even in the training data is correct because overfitting yields high training accuracy and low validation accuracy while underfitting produces low accuracy on both sets due to insufficient capacity or inadequate training.
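
A quick diagnostic, echoed in the exam tip below, is to compare training and validation accuracy; the thresholds and numbers in this sketch are illustrative heuristics, not fixed rules:

```python
def diagnose(train_acc, val_acc, gap_threshold=0.10, low_threshold=0.75):
    """Very rough heuristic for classifying generalization behavior."""
    if train_acc < low_threshold and val_acc < low_threshold:
        return "likely underfitting: weak on training and validation data"
    if train_acc - val_acc > gap_threshold:
        return "likely overfitting: strong on training data, weak on validation data"
    return "reasonable generalization"

print(diagnose(train_acc=0.97, val_acc=0.78))  # -> likely overfitting
print(diagnose(train_acc=0.64, val_acc=0.61))  # -> likely underfitting
```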

More complex models invariably outperform simpler ones is wrong because adding complexity can increase the risk of memorizing training noise and can hurt performance on unseen data when not paired with proper regularization or more data.

Underfitting shows strong training accuracy but weak validation accuracy, while overfitting performs poorly on both is incorrect because this statement swaps the definitions and does not match observed training and validation behaviors.

Amazon CloudWatch is wrong in this context because it is a monitoring and logging service and it does not define model generalization or replace model design and data strategies for addressing overfitting and underfitting.

Watch training and validation curves and use regularization or a simpler model if training performance is much better than validation performance and use more data or greater model capacity if both training and validation perform poorly.

A digital media company is fine-tuning a foundation model in Amazon Bedrock to study audience interactions and anticipate content demand. The team plans to run validation on roughly 40 TB of labeled data and needs a storage location that Bedrock natively supports for accessing large datasets during model validation. Which storage service should the team choose?

  • ✓ B. Amazon S3

The correct choice is Amazon S3. Amazon Bedrock natively integrates with Amazon S3 to supply large training and validation datasets and it uses S3 URIs to read objects at scale for model customization and evaluation.

Amazon S3 provides durable and scalable object storage that matches Bedrock data access patterns for large unstructured datasets. The service supports multi terabyte and petabyte scale storage and allows Bedrock to stream or batch read validation files without attaching block volumes to compute instances.

Amazon RDS is a managed relational database and is not suitable for large unstructured dataset ingestion because Bedrock expects object storage rather than row oriented relational tables.

Amazon EBS is block storage that must be attached to EC2 instances and it is not a native source for Bedrock dataset ingestion or validation jobs.

Amazon EFS is a network file system and it does not offer the native Bedrock dataset integration that Amazon S3 provides which makes it a less appropriate choice for this use case.

If the question highlights native support or integration with Amazon Bedrock for training or validation data then choose Amazon S3 unless the exam states a different supported integration.

ByteCraft Electronics operates a customer help portal where shoppers ask about device compatibility and warranty terms. The team needs Amazon Q Business to return precise answers that reflect catalog and policy changes that sync every 45 minutes from internal systems. Which capability of Amazon Q Business ensures answers stay accurate and relevant to the most current business data?

  • ✓ B. Uses Retrieval Augmented Generation to pull the latest catalog and policy content and ground the answer in that context

The correct choice is Uses Retrieval Augmented Generation to pull the latest catalog and policy content and ground the answer in that context. This capability ensures Amazon Q Business can fetch and use the most recent catalog and policy data that sync from internal systems so answers remain accurate and relevant.

Retrieval augmented generation combines a retrieval step that pulls authoritative documents from your enterprise sources with a generation step that composes responses grounded in those documents. In Amazon Q Business the retrieval step indexes the catalog and policy content and the generator uses that indexed context to produce precise answers that reflect changes made every 45 minutes.

Generates responses only from the model’s general training data without checking company-specific sources is incorrect because a pretrained model alone cannot guarantee current or organization specific facts and it will not reflect recent catalog or policy updates.

Delivers replies from a static, rule-driven knowledge base that is not updated in real time is incorrect since a static knowledge base does not adapt to frequent catalog or policy changes and cannot ground answers in newly updated enterprise data.

Enforces guardrails and content filters to block unsafe topics instead of retrieving fresh business data is incorrect because guardrails help with safety and compliance but they do not perform retrieval or add up to date factual content to responses.

When a question asks about keeping generative answers factual and current choose options that reference retrieval augmented generation or say answers are grounded in enterprise data.

A sports arena in Chicago operates gated parking with IP cameras at each entrance. The operations team wants to read license plate text from live video and instantly compare it to a denylist of about 1,200 plate numbers stored in their system so security is alerted when a match is found. Which AWS service should they use to implement this solution?

  • ✓ B. Amazon Rekognition

The correct choice is Amazon Rekognition because it provides image and video analysis APIs with text detection that can extract license plate characters from live camera feeds and allow immediate comparison against a denylist.

Amazon Rekognition supports OCR from images and video streams and can be used in near real time to detect and return text in frames. The service can be integrated into a pipeline that pulls frames or streams from IP cameras, performs text detection, and compares the extracted plate strings to an internal denylist so security can be alerted on a match.
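
A simplified per-frame sketch is shown below; the denylist, frame file, and plate formats are assumptions, and a production system would typically use Rekognition Video stream processors or a frame-sampling pipeline rather than single still images:

```python
import boto3

rekognition = boto3.client("rekognition", region_name="us-east-1")
denylist = {"ABC1234", "XYZ9876"}  # hypothetical plate numbers loaded from the internal system

def check_frame(frame_bytes):
    """Detect text in a single camera frame and flag any denylisted plate."""
    detections = rekognition.detect_text(Image={"Bytes": frame_bytes})["TextDetections"]
    for det in detections:
        if det["Type"] != "LINE":
            continue
        plate = det["DetectedText"].replace(" ", "").upper()
        if plate in denylist:
            return plate  # hand off to the alerting workflow
    return None

with open("entrance-cam-frame.jpg", "rb") as f:  # illustrative frame capture
    match = check_frame(f.read())

if match:
    print("ALERT: denylisted plate", match, "detected")
else:
    print("no denylisted plate in frame")
```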

Amazon Bedrock is focused on generative AI and foundation models and is not intended for computer vision tasks or live video OCR which are required for license plate reading.

Amazon Kendra provides intelligent search over indexed documents and does not analyze image or video content in real time so it is not suitable for this use case.

Amazon Comprehend performs natural language processing on text and cannot extract characters from images or video streams so it cannot perform license plate recognition.

When a question asks about OCR from live camera feeds choose Rekognition for images and video and map other services to their primary modalities.

A drone logistics startup is building an AI flight controller to improve obstacle avoidance and route selection for autonomous quadcopters. The team will use deep learning and wants to clearly understand how a neural network learns from large volumes of flight and video data so it can make split-second choices. What best describes the training process for a deep learning model?

  • ✓ C. Training in deep learning iteratively updates network weights and biases using backpropagation with optimizers like gradient descent on large datasets to minimize a loss function

The correct answer is Training in deep learning iteratively updates network weights and biases using backpropagation with optimizers like gradient descent on large datasets to minimize a loss function. This choice correctly names the iterative weight updates and the use of a loss function to drive learning.

The training workflow performs forward passes to produce predictions and then computes a loss that measures error. Backpropagation computes gradients of that loss with respect to each parameter and an optimizer such as gradient descent or Adam applies incremental updates to weights and biases over many epochs. This iterative process is what the described training process refers to and it is why models improve as they see more flight and video data.
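
A tiny NumPy sketch of that loop on a one-parameter linear model makes the mechanics concrete: a forward pass, a loss, a gradient, and a weight update repeated over many epochs. The data is made up so that the true weight is roughly 3.

```python
import numpy as np

# Toy data: y is roughly 3 * x, and the model must learn that weight from examples
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 5.9, 9.2, 11.8])

w, lr = 0.0, 0.01  # initial weight and learning rate
for epoch in range(200):
    pred = w * x                        # forward pass
    loss = np.mean((pred - y) ** 2)     # mean squared error loss
    grad = np.mean(2 * (pred - y) * x)  # dLoss/dw via the chain rule (backpropagation in miniature)
    w -= lr * grad                      # gradient descent update

print(round(w, 3), round(loss, 4))      # w converges toward roughly 3
```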

Training in deep learning involves manually configuring all weights and biases according to fixed heuristics is wrong because neural networks do not rely on hand tuning every connection. The whole point of training is to let the algorithm learn parameter values from data rather than setting them manually.

Amazon SageMaker is not a description of the learning algorithm. It is an AWS service that can orchestrate and scale training jobs and experiments, but it does not itself define how weights and biases are updated inside a neural network.

Training in deep learning requires no dataset, since the model learns solely from built-in rules is incorrect because data is essential to compute a loss and provide the signal that guides parameter updates. Without data there is no measurable objective to minimize.

On the exam look for mentions of backpropagation and gradient descent together with minimizing a loss function over multiple epochs as the hallmark of supervised deep learning training.

Northstar Analytics is building an internal assistant with an Amazon Bedrock foundation model and wants to lower monthly token spend while keeping answer quality high. Which approach is the most effective to achieve this?

  • ✓ B. Fine-tune the Bedrock model with domain data so prompts can be shorter while maintaining quality

Fine-tune the Bedrock model with domain data so prompts can be shorter while maintaining quality is the correct option.

By applying fine-tuning you align the foundation model to your domain language and tasks so prompts no longer need long context to elicit accurate responses. This reduces both prompt and response tokens used during inference and directly lowers monthly token spend while preserving answer quality.

Raise the model’s temperature to produce outputs faster is incorrect because temperature controls output randomness and diversity rather than latency or token usage, and increasing it can harm accuracy.

Perform continued pretraining on Amazon SageMaker using the company’s full dataset is incorrect because continued pretraining is resource intensive and expensive and it is unnecessary when managed Bedrock models can be customized more cheaply through fine-tuning or instruction tuning.

Lower the maximum tokens per response to reduce output size is incorrect because hard caps can truncate useful content and degrade answer quality which defeats the goal of maintaining high quality even though costs may drop.

Prefer fine-tuning to shorten prompts and cut token costs while keeping quality high. On exams pick options that reduce tokens through customization rather than by changing randomness or by truncating responses.

At Helios Retail Labs, a small analytics group is transforming raw datasets before training a predictive model. They apply one-hot encoding, standardization, and build interaction terms to improve predictive quality. Which choice best describes the aim of feature engineering in this situation?

  • ✓ C. Applying domain expertise to convert raw data into informative features that boost model performance and efficiency

Applying domain expertise to convert raw data into informative features that boost model performance and efficiency is correct because the team is transforming raw datasets with one-hot encoding, standardization, and interaction terms to make patterns easier for models to learn.

Applying domain expertise to convert raw data into informative features that boost model performance and efficiency focuses on reshaping inputs so learning algorithms can detect signal more easily. The one-hot encoding and scaling steps create consistent numeric inputs and the interaction terms expose relationships that raw values do not make explicit. These changes can improve accuracy and training efficiency because models receive clearer, more informative inputs.
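
A small pandas and scikit-learn sketch of the three transformations mentioned follows; the column names and values are invented for illustration:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

raw = pd.DataFrame({
    "store_type": ["outlet", "flagship", "outlet"],
    "basket_value": [42.0, 130.0, 58.0],
    "visits_per_month": [2, 6, 3],
})

# One-hot encode the categorical column
features = pd.get_dummies(raw, columns=["store_type"])

# Standardize the numeric columns to zero mean and unit variance
numeric_cols = ["basket_value", "visits_per_month"]
features[numeric_cols] = StandardScaler().fit_transform(features[numeric_cols])

# Build an interaction term that exposes a relationship the raw columns hide
features["value_x_visits"] = features["basket_value"] * features["visits_per_month"]
print(features)
```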

Automatically choosing the best algorithm or architecture based on the dataset and task is incorrect because that describes AutoML or model selection and it chooses models rather than transforming the data.

Amazon SageMaker Autopilot is incorrect because it is an AutoML service that automates model selection and tuning rather than the process of crafting features from raw inputs.

Systematically searching for the most effective hyperparameters over many training runs is incorrect because that refers to hyperparameter tuning and it optimizes model settings instead of creating or transforming input features.

When you see operations such as one-hot encoding, scaling, or interaction terms, think feature engineering and not AutoML or hyperparameter tuning.

An analytics startup is using a foundation model with Amazon Bedrock to power an internal question-and-answer assistant, and the team wants to improve response quality by progressing from the simplest approach to the most advanced; which sequence represents the techniques in increasing implementation complexity?

  • ✓ C. Prompt engineering, Retrieval-Augmented Generation (RAG), Fine-tuning

The correct sequence is Prompt engineering, Retrieval-Augmented Generation (RAG), Fine-tuning.

Prompt engineering is the quickest and lowest effort way to improve responses because you only change the input instructions and context without touching model weights or adding infrastructure. Retrieval-Augmented Generation (RAG) increases implementation complexity because it requires an external retrieval layer and content indexing to provide up-to-date or domain specific context that the model can use to reduce hallucinations. Fine-tuning is the most complex and resource intensive option because it changes model parameters and needs curated training data and ML workflows to produce reliable results.
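
To make the middle step concrete, here is a hedged boto3 sketch of a Retrieval-Augmented Generation call through an Amazon Bedrock knowledge base; the knowledge base ID, model ARN, and question are placeholders.

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime")

# Knowledge base ID and model ARN are placeholders
response = runtime.retrieve_and_generate(
    input={"text": "What is our refund policy for enterprise customers?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(response["output"]["text"])
```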

Retrieval-Augmented Generation (RAG), Prompt engineering, Fine-tuning is incorrect because it places RAG before the simple step of adjusting prompts and it skips the usual progression from quick prompt tweaks to added retrieval and then to model retraining.

Fine-tuning, Retrieval-Augmented Generation (RAG), Prompt engineering is incorrect because it starts with the most costly and permanent step and then moves to simpler options, which reverses the recommended escalation of effort.

Prompt engineering, Fine-tuning, Retrieval-Augmented Generation (RAG) is incorrect because it puts fine-tuning before adding retrieval, but teams commonly try retrieval first to incorporate fresh or proprietary knowledge without the overhead of retraining.

Begin with small prompt changes, then add RAG for domain knowledge, and use fine-tuning only when prompts and retrieval cannot achieve the required behavior.

Orion Logistics built a machine learning model to predict which employees might leave. Over the last 90 days the evaluation results show that accuracy differs widely by division, ranging from 91% in Finance to 67% in Warehouse Operations. Which approach would best uncover and correct this uneven performance across groups?

  • ✓ C. Subgroup evaluation of model performance by division

Subgroup evaluation of model performance by division is the correct choice because it measures accuracy and other metrics for each division so you can detect where the model performs poorly and apply targeted remediation.

This approach examines per-group metrics and uncovers systematic differences in performance so you can take corrective actions such as collecting more representative data for underperforming divisions, reweighting or resampling training data, or applying calibration and fairness-aware techniques. Amazon SageMaker Clarify supports subgroup analysis and bias reporting, which helps automate these checks and present actionable diagnostics.
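
A per-division accuracy check can be as simple as a pandas group-by over the evaluation results, as in this sketch with made-up rows:

```python
import pandas as pd

# Assumes an evaluation frame with actual labels, predictions, and a division column
results = pd.DataFrame({
    "division":  ["Finance", "Finance", "Warehouse Operations", "Warehouse Operations"],
    "actual":    [1, 0, 1, 0],
    "predicted": [1, 0, 0, 1],
})

per_division_accuracy = (
    results.assign(correct=results["actual"] == results["predicted"])
           .groupby("division")["correct"]
           .mean()
)
print(per_division_accuracy)
```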

Automated hyperparameter tuning can improve overall model performance by searching for better settings but it does not reveal which divisions have lower accuracy and it does not by itself fix group-level imbalances or fairness problems.

Amazon Transcribe is a speech to text service and it does not help analyze structured HR data or evaluate subgroup fairness in a classification model.

Amazon Polly converts text to spoken audio and it is unrelated to diagnosing or remediating per-division accuracy differences in predictive models.

When you see different accuracy across groups prefer a subgroup analysis or fairness tool to find where the model fails and then apply targeted fixes rather than only tuning global hyperparameters.

A healthcare research lab called NorthBay Insights is rolling out a machine learning platform that processes confidential datasets. The team must ensure only authorized users can read or modify the information and wants to enforce least-privilege access across AWS resources. Which approach should they use to protect access to the data?

  • ✓ B. Use AWS IAM roles with fine-grained policies and permissions to enforce access control

The correct choice is Use AWS IAM roles with fine-grained policies and permissions to enforce access control. This approach enforces least privilege across AWS and ensures only authorized principals can read or modify sensitive datasets.

Use AWS IAM roles with fine-grained policies and permissions to enforce access control lets you attach precise permissions to roles and grant temporary credentials to users and services so access is limited to what is required. IAM policies and roles integrate with service control policies and with key policies for encryption keys so you can combine identity based rules and key based controls to protect data end to end.
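
As a rough illustration of least privilege, the sketch below creates a read-only policy scoped to a single dataset prefix with boto3; the policy name, bucket, and prefix are placeholders.

```python
import json
import boto3

iam = boto3.client("iam")

# Placeholder bucket and prefix; grants read-only access to one dataset path
policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::research-datasets/confidential/*",
    }],
}

iam.create_policy(
    PolicyName="ReadConfidentialDatasetOnly",
    PolicyDocument=json.dumps(policy_document),
)
```

The policy would then be attached to a role that authorized users or services assume, so access is granted through temporary credentials rather than long-lived keys.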

AWS CloudTrail records API activity for auditing and governance and it does not grant or deny permissions so it cannot by itself enforce who can access data.

Amazon Macie helps discover and classify sensitive data and it provides visibility and alerts and it does not provide identity based access control to block reads or writes.

Encrypt the data but do not configure any specific access permissions is insufficient because encryption alone does not control who can decrypt or use the data and it must be combined with IAM policies and key policies to properly restrict access.

Think authorization not just encryption when the question asks who can access data. Prefer IAM roles and least privilege to control access.

AeroTrip, an online travel agency, handles about 25,000 customer support tickets each day and plans to roll out Agents for Amazon Bedrock to streamline how requests are resolved. Which benefits of these agents would most directly address this need?

  • ✓ C. Automating repetitive actions and orchestrating multi-step workflows with tool and API calls

Automating repetitive actions and orchestrating multi-step workflows with tool and API calls is correct because Agents for Amazon Bedrock are designed to decompose user intent into steps and invoke external tools and APIs to complete tasks, which directly supports handling high volumes of support tickets.

Automating repetitive actions and orchestrating multi-step workflows with tool and API calls lets agents plan sequences of actions, use action groups to call services, retrieve relevant knowledge, and coordinate end-to-end resolutions so workflows are automated and human effort is reduced, which helps scale to thousands of daily tickets.
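
For a sense of how an application hands a ticket to an agent, here is a hedged boto3 sketch that calls an already configured agent; the agent, alias, and session identifiers are placeholders.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

# Agent, alias, and session identifiers are placeholders
response = agent_runtime.invoke_agent(
    agentId="AGENT123456",
    agentAliasId="ALIAS123456",
    sessionId="ticket-987654",
    inputText="Customer requests a refund for order 4412 delivered damaged.",
)

# The completion is streamed back as chunks
for event in response["completion"]:
    if "chunk" in event:
        print(event["chunk"]["bytes"].decode("utf-8"), end="")
```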

Automatically building and training new proprietary foundation models to anticipate customer intent is incorrect because agents do not train or create foundation models. Building and training models is done with separate services such as Amazon SageMaker and is not an inherent agent capability.

Offloading all task coordination and state management to AWS Step Functions is incorrect because AWS Step Functions is a separate orchestration service. Agents can call Step Functions as a tool but they do not automatically transfer all coordination and state handling to it.

Choosing and switching between multiple foundation models and merging their outputs without setup is incorrect because agents do not automatically ensemble or route between models without explicit configuration. Model selection and output merging require the builder to configure routing or ensembling.

Think of agents as workflow planners that call tools and APIs to automate tasks and reduce manual ticket handling, and do not confuse them with model training or automatic model ensembling.

A global insurance carrier is building machine learning models in AWS to flag suspicious claims and refine underwriting risk scores. To enable five distributed data science squads to share work and keep training and inference consistent, they want a centralized place to store, discover, and version reusable ML features. Which AWS service best fits this requirement?

  • ✓ B. Amazon SageMaker Feature Store

Amazon SageMaker Feature Store is the correct choice because it provides a centralized and governed repository for machine learning features with built in versioning and both online and offline stores to ensure consistency between training and inference.

The Feature Store lets teams register, discover, and reuse features across squads and supports point-in-time access, access control, and encryption to meet compliance and production needs. It offers an online store for low-latency inference and an offline store for batch training so features remain consistent across development and production workflows.
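
A minimal sketch of registering such a feature group with boto3 is shown below; every name, ARN, and S3 location is a placeholder.

```python
import boto3

sm = boto3.client("sagemaker")

# All names, ARNs, and S3 locations below are placeholders
sm.create_feature_group(
    FeatureGroupName="claims-risk-features",
    RecordIdentifierFeatureName="policy_id",
    EventTimeFeatureName="event_time",
    FeatureDefinitions=[
        {"FeatureName": "policy_id", "FeatureType": "String"},
        {"FeatureName": "event_time", "FeatureType": "String"},
        {"FeatureName": "claims_last_12m", "FeatureType": "Integral"},
        {"FeatureName": "avg_claim_amount", "FeatureType": "Fractional"},
    ],
    OnlineStoreConfig={"EnableOnlineStore": True},
    OfflineStoreConfig={"S3StorageConfig": {"S3Uri": "s3://example-feature-store/offline/"}},
    RoleArn="arn:aws:iam::123456789012:role/SageMakerFeatureStoreRole",
)
```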

Amazon SageMaker Data Wrangler focuses on data preparation and transformation and it does not act as a feature registry or provide built in feature versioning for cross team reuse.

Amazon SageMaker Ground Truth is a labeling and annotation service and it is unrelated to storing or sharing engineered features for model training and inference.

Amazon SageMaker Canvas is a no code modeling environment and it does not manage a centralized versioned feature store for multiple data science teams.

When a question asks for a centralized, versioned feature repository that must serve both training and inference choose a feature store rather than a data preparation or labeling service.

Skylark Retail is launching a customer support platform and needs to derive insights from recorded customer conversations. The team processes about 45,000 calls each month and wants to automatically convert the call audio into text so downstream tools can extract key details. Which AWS service should they use?

  • ✓ B. Amazon Transcribe

Amazon Transcribe is the correct choice because it converts recorded call audio into text so downstream tools can extract entities, sentiment, and key phrases from the transcripts.

Amazon Transcribe performs automatic speech-to-text at scale and supports batch and streaming transcription along with speaker identification, timestamps, and vocabulary tuning to improve accuracy. The resulting transcripts are the required input for NLP services, so you can send them to tools that perform entity extraction, sentiment analysis, and other text analytics.
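
A batch transcription job can be started with a single boto3 call, roughly as sketched below; the job name and bucket names are placeholders.

```python
import boto3

transcribe = boto3.client("transcribe")

# Bucket and object names are placeholders
transcribe.start_transcription_job(
    TranscriptionJobName="support-call-2024-06-01-0001",
    LanguageCode="en-US",
    MediaFormat="mp3",
    Media={"MediaFileUri": "s3://example-call-recordings/2024/06/01/call-0001.mp3"},
    OutputBucketName="example-call-transcripts",
)
```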

Amazon Lex is focused on building chatbots and conversational interfaces and it does not provide a service to transcribe recorded audio files into text.

Amazon SageMaker Model Monitor is intended to monitor data and model quality in machine learning deployments and it does not perform audio transcription or text extraction from recordings.

Amazon Comprehend provides NLP capabilities such as entity recognition, sentiment detection, and key phrase extraction, but it operates on text only and therefore requires a speech-to-text step first, so it cannot by itself convert call recordings into transcripts.

Match the data type to the service by choosing transcription first and then applying NLP. Use transcription for audio and then use text analysis tools on the transcripts.

A same-day courier, Northwind Express, uses Amazon Bedrock to generate turn-by-turn routes using real-time congestion data and package urgency. During the evening rush, traffic to the API rises to about 450 requests per second and p95 latency grows from 250 ms to 1.4 seconds, slowing dispatch decisions. Which change should the team prioritize to bring latency back under target while demand is high?

  • ✓ B. Increase provisioned throughput and concurrency in Amazon Bedrock

Increase provisioned throughput and concurrency in Amazon Bedrock is the correct change to prioritize because increasing capacity reduces queuing and brings p95 latency back under target during high request rates.

Provisioning more throughput and concurrency adds parallel model capacity so requests are handled faster and dispatch decisions are not delayed by backend wait time. This change directly addresses the capacity bottleneck observed when traffic rises to about 450 requests per second and p95 latency grows from 250 milliseconds to 1.4 seconds.
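
As a rough sketch, provisioned throughput is purchased through the Bedrock control plane; the model identifier and the number of model units below are placeholders chosen only for illustration.

```python
import boto3

bedrock = boto3.client("bedrock")

# Model identifier and unit count are placeholders for illustration
bedrock.create_provisioned_model_throughput(
    provisionedModelName="routing-peak-capacity",
    modelId="amazon.titan-text-express-v1",
    modelUnits=2,
)
```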

Fine-tune the model with more route data may improve route quality over time but it does not relieve an immediate throughput or concurrency bottleneck. Model training and evaluation are slow to implement and will not reduce peak-time queueing.

AWS Global Accelerator can improve network path performance for some workloads but it will not reduce compute wait time when the model backend is saturated. It does not replace the need to scale Bedrock capacity when the service is the limiting factor.

Lower the maximum tokens per response can slightly reduce per-request compute time but it risks producing incomplete route guidance and usually provides only marginal latency improvement compared to increasing provisioned throughput and concurrency.

When latency rises only at peak load think scale capacity and concurrency first and reserve enough provisioned throughput to prevent backend queuing.

A fintech startup wants to enforce fairness and transparent predictions in its ML workflow. During data preparation, which capability of Amazon SageMaker Clarify would help achieve this goal?

  • ✓ B. Flags potential dataset bias during data preparation

The correct answer is Flags potential dataset bias during data preparation. This capability is provided by Amazon SageMaker Clarify and it is the feature that helps enforce fairness and transparency during the data preparation phase.

Amazon SageMaker Clarify computes bias metrics on datasets before training and it also provides explainability for model predictions. These capabilities let teams detect dataset imbalances and measure fairness metrics early so they can remediate bias before models are trained and deployed.
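
For illustration, a pre-training bias report can be started with the SageMaker Python SDK roughly as follows; the S3 paths, role ARN, and column names are placeholders.

```python
from sagemaker import Session, clarify

session = Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

processor = clarify.SageMakerClarifyProcessor(
    role=role, instance_count=1, instance_type="ml.m5.xlarge", sagemaker_session=session
)

data_config = clarify.DataConfig(
    s3_data_input_path="s3://example-bucket/prepared/train.csv",
    s3_output_path="s3://example-bucket/clarify-reports/",
    label="approved",
    headers=["approved", "income", "age_group", "region"],
    dataset_type="text/csv",
)

# Measure imbalance with respect to a sensitive attribute before training
bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1], facet_name="age_group"
)

processor.run_pre_training_bias(data_config=data_config, data_bias_config=bias_config)
```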

Amazon SageMaker Model Monitor focuses on monitoring models in production for drift and data quality and it does not perform pre-training bias detection.

Amazon SageMaker Model Cards centralize documentation and governance metadata about models and they do not run automated bias analysis on datasets.

Amazon Bedrock Knowledge Bases support retrieval augmented generation and knowledge management for applications and they are not related to dataset fairness or explainability in the training phase.

Map the requirement to the ML lifecycle and choose Clarify for bias detection during data preparation.

Orion Mutual Insurance processes tens of thousands of policy contracts and addendums each week and plans to use AWS AI to extract key terms, detect missing clauses, and classify agreements from scanned documents. The compliance team is worried that confidential customer information could be exposed when interacting with AI models. Which AWS practices best reduce the risk of disclosing sensitive data? (Choose 2)

  • ✓ B. Encrypt stored documents with AWS Key Management Service (KMS)

  • ✓ E. Implement IAM policies that tightly restrict access to AI models and datasets

The correct controls are Encrypt stored documents with AWS Key Management Service (KMS) and Implement IAM policies that tightly restrict access to AI models and datasets.

Encrypt stored documents with AWS Key Management Service (KMS) secures data at rest with managed, auditable keys and supports key rotation and fine-grained control over who can use the keys to decrypt. Implement IAM policies that tightly restrict access to AI models and datasets enforces least privilege for users, roles, and services, and limits who can upload documents, call model endpoints, and view inference results.
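
A minimal sketch of the encryption half with boto3 follows; the bucket, object key, and KMS key ARN are placeholders, and the IAM half is enforced separately with tightly scoped policies as described above.

```python
import boto3

s3 = boto3.client("s3")

# Bucket name, object key, and KMS key ARN are placeholders
with open("policy-contract-8841.pdf", "rb") as document:
    s3.put_object(
        Bucket="example-policy-documents",
        Key="contracts/policy-contract-8841.pdf",
        Body=document,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId="arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555",
    )
```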

Store document data in Amazon RDS instead of Amazon S3 does not automatically improve confidentiality because both storage options can be secured and the real protections are encryption access controls and monitoring rather than swapping storage engines.

Disable API logging to prevent data leakage is counterproductive because logging provides audit trails for detection and investigation and you should protect logs with encryption and access controls while keeping them enabled.

AWS Shield Advanced protects against distributed denial of service attacks but it does not address data confidentiality or access control for AI workloads and so it is not sufficient to prevent sensitive data exposure.

Prioritize encrypting data at rest and in transit and enforce least privilege while keeping audit logs enabled to investigate incidents.

A regional retailer, Luma Outfitters, is rolling out a support chatbot on Amazon Bedrock. The team wants the model to generate replies while limiting choices to the smallest set of highest-probability next tokens whose cumulative likelihood reaches a chosen threshold, for example about 88%. Which inference parameter should they configure?

  • ✓ C. Top-p sampling

The correct choice is Top-p sampling. This option restricts the model to the smallest set of next-token candidates whose cumulative probability meets the threshold p, for example 0.88, which matches the requirement to limit choices to the smallest group of highest-probability tokens whose combined likelihood reaches the chosen cutoff.

In nucleus sampling the model ranks tokens by probability and then includes tokens until their combined probability reaches the chosen p value. Using Top-p sampling therefore adapts the candidate set size to the probability distribution and ensures the generation considers only the top probability mass up to the cutoff.
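
In Amazon Bedrock this threshold is passed as an inference parameter; here is a hedged sketch using the Converse API with topP set to 0.88, where the model identifier is a placeholder.

```python
import boto3

runtime = boto3.client("bedrock-runtime")

# Model identifier is a placeholder
response = runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": "Suggest a friendly greeting for a returning customer."}]}],
    inferenceConfig={"topP": 0.88, "temperature": 0.7, "maxTokens": 256},
)
print(response["output"]["message"]["content"][0]["text"])
```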

Temperature scales logits to make the output more or less random by flattening or sharpening the distribution but it does not enforce a cumulative probability cutoff or directly limit token candidates by percentage.

Top-k restricts selection to a fixed number of highest probability tokens which can yield more or less than the desired cumulative probability depending on the distribution so it does not guarantee a target percentage like 88 percent.

Epochs refers to the number of passes over training data during model training or fine tuning and it does not control inference sampling behavior in Amazon Bedrock.

Remember that Top-p targets a cumulative probability threshold while Top-k limits a fixed count and Temperature adjusts randomness.

NorthGrid Power, a regional utility, plans to deploy a customer support chatbot that handles about 12,000 chats per month and steadily improves its replies by learning from thumbs-up or thumbs-down ratings on prior conversations as well as drawing on newly added knowledge sources like updated product guides and community posts. Which machine learning approach best enables this continuous improvement?

  • ✓ C. Reinforcement learning that uses customer ratings as reward signals

The best fit is Reinforcement learning that uses customer ratings as reward signals.

Reinforcement learning that uses customer ratings as reward signals optimizes a decision policy by maximizing cumulative reward so thumbs up and thumbs down directly guide the chatbot to prefer responses that earn positive feedback. New product guides and community posts can be included as additional context or state features and the policy can learn when to consult or cite those sources during conversations. Over time the agent adapts based on real user reactions rather than relying only on static labeled examples.
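
As a highly simplified illustration, not a production reinforcement learning pipeline, the sketch below treats candidate reply styles as bandit arms and updates their estimated value from thumbs-up and thumbs-down rewards; the style names are made up.

```python
import random

# Toy bandit over candidate reply styles; ratings act as the reward signal
styles = {"concise": 0.0, "detailed": 0.0, "empathetic": 0.0}
counts = {name: 0 for name in styles}

def choose_style(epsilon=0.1):
    # Explore occasionally, otherwise exploit the best-rated style so far
    if random.random() < epsilon:
        return random.choice(list(styles))
    return max(styles, key=styles.get)

def record_feedback(style, thumbs_up):
    # Incremental average of rewards (1 for thumbs up, 0 for thumbs down)
    reward = 1.0 if thumbs_up else 0.0
    counts[style] += 1
    styles[style] += (reward - styles[style]) / counts[style]

chosen = choose_style()
record_feedback(chosen, thumbs_up=True)
```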

Supervised learning with a labeled dataset of high-quality and low-quality responses is useful for initial training because it teaches mapping from inputs to desired outputs. It does not, however, provide a built-in mechanism to continuously learn from live thumbs up and thumbs down unless you add a separate feedback loop.

Unsupervised learning to cluster similar support questions helps organize and discover patterns in incoming queries and it can support routing or template creation. Clustering does not choose or optimize actions based on customer ratings so it is not sufficient for continuous improvement driven by feedback.

Supervised learning that retrains when the FAQ and knowledge articles are updated can incorporate new content into models during periodic retraining and that improves coverage after updates. It still lacks an intrinsic, reward-driven update path from each conversation so it will not continuously improve from per-chat ratings in the same way reinforcement learning does.

When a scenario emphasizes ongoing user feedback favor reinforcement learning over static supervised or purely unsupervised approaches.

A travel booking startup called Skyline Trails plans to use Amazon Bedrock to help writers and designers produce trip guides, ad copy, and promotional visuals for campaigns. The team wants a simple explanation of how generative models can create original text, images, or audio from what they have learned so they can select the right workflow and safeguards. How do these systems generate new content?

  • ✓ B. Sampling novel outputs from models that learned statistical patterns in large training datasets

The correct answer is Sampling novel outputs from models that learned statistical patterns in large training datasets. This option describes how generative models create new text, images, or audio by learning statistical relationships in large datasets and then producing new outputs that follow those learned patterns.

Generative models are trained to predict elements of data such as the next word or pixel based on context. During generation the model samples from the learned distribution to assemble sequences or pixels that are coherent and varied. Sampling parameters and strategies influence creativity and safety so teams can tune outputs for quality and control.

Using fixed rules and templates created by developers with no learning from examples is incorrect because rule or template engines do not learn patterns from data and they cannot generate novel phrasing beyond what is explicitly coded.

Amazon Rekognition is incorrect because that service focuses on image and video analysis tasks such as detection and classification rather than producing new content.

Producing content by making arbitrary random choices without reference to prior data is incorrect because unguided randomness does not produce coherent outputs that reflect learned semantics and structure.

Look for phrases like learns training data and sampling when answering generative AI questions. Distractors that emphasize fixed rules or pure randomness are usually wrong.

A precision medicine startup is using a Foundation Model in Amazon Bedrock to analyze high-throughput sequencing results and generate insights for new therapies. The team wants the model to become highly specialized in genomics so it can better understand domain-specific terminology, patterns, and data sources. Which approaches would most effectively transform the general FM into a genomics-focused expert? (Choose 2)

  • ✓ A. Continued pretraining on a large, curated corpus of genomics literature, variant databases, and sequencing data

  • ✓ D. Domain adaptation fine-tuning of the FM using high-quality, domain-specific genomics datasets

The most effective approaches are Continued pretraining on a large, curated corpus of genomics literature, variant databases, and sequencing data and Domain adaptation fine-tuning of the FM using high-quality, domain-specific genomics datasets. These two strategies together prepare a Foundation Model in Amazon Bedrock to become a genomics-focused expert.

Continued pretraining on a large, curated corpus of genomics literature, variant databases, and sequencing data exposes the model to domain vocabulary, recurring patterns, and contextual usages that are rare or absent in general corpora. This builds deep statistical knowledge of genomics language and relationships. Domain adaptation fine-tuning of the FM using high-quality, domain-specific genomics datasets then adjusts the model parameters on curated examples so the model aligns with task signals and dataset conventions while retaining the broader knowledge from pretraining.

Reinforcement learning with reward signals from human feedback to adapt the model to genomics is focused on shaping outputs via reward optimization and it helps with alignment or behavior but it is less effective as the primary path to broad domain knowledge learned from text corpora.

Supervised learning on a labeled downstream task such as gene disease classification to specialize the FM can yield strong performance on that specific task but it does not by itself create a comprehensive domain expert across diverse genomics tasks and data types.

Incremental learning to add new genomics samples over time without catastrophic forgetting supports continual updates and model maintenance and it is useful for keeping a model current but it is not a substitute for large-scale pretraining and targeted domain fine-tuning when the goal is deep initial specialization.

Prioritize continued pretraining to absorb genomics language and context and then apply domain adaptation fine-tuning on high quality datasets to align the model to your specific genomics tasks.

A mobile gaming studio wants to automatically tag player reviews as positive, negative, or neutral to monitor customer sentiment. Which learning approach should they use?

  • ✓ C. Supervised learning

Supervised learning is the correct choice because the studio has known sentiment categories and can train a model on reviews already labeled as positive, negative, or neutral so the model learns to classify new reviews into those classes.

A supervised approach builds a classifier from examples where each review is paired with its sentiment label and the model discovers patterns in the text that map to each class. This lets teams use simple models such as logistic regression or Naive Bayes for smaller workloads or transformer based neural networks for higher accuracy and scale, and it also allows evaluation on validation data to tune performance.
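
A minimal supervised sketch with scikit-learn, using a handful of made-up labeled reviews, looks like this:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny labeled sample standing in for the studio's review dataset
reviews = ["Love the new levels", "Crashes on startup", "It is okay I guess",
           "Best update yet", "Ads are unbearable", "Nothing special"]
labels = ["positive", "negative", "neutral", "positive", "negative", "neutral"]

model = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(reviews, labels)

print(model.predict(["The controls feel great"]))
```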

Clustering is not ideal because it groups reviews by similarity without labels and it may separate texts by topic or writing style rather than by sentiment.

Unsupervised learning in general does not use labeled outputs so it cannot directly learn the three sentiment classes required, and it would need extra steps to map discovered groups to positive, negative, or neutral.

Reinforcement learning focuses on learning policies to maximize rewards in sequential decision problems and it is not typically used for straightforward single label text classification tasks.

If the question states you have labeled examples choose supervised learning on the exam and use unsupervised only when you must discover structure without labels.

A video streaming startup is using Amazon Bedrock to produce personalized show summaries and viewing suggestions. The machine learning team is adjusting decoding settings and wants to understand what changing the Top K parameter actually controls so they can balance consistency and variety in the outputs. What should you tell the team about Top K?

  • ✓ C. Sets how many of the highest-probability token candidates are eligible for the next step

The correct option is Sets how many of the highest-probability token candidates are eligible for the next step. Top K limits generation to the K most likely next tokens so increasing K permits the model to consider more candidates and produce more varied outputs while decreasing K narrows the choices and yields more consistent or deterministic text.

Top K works by trimming the full probability distribution down to a fixed number of top tokens at each decoding step. Sampling or selection then happens only from that reduced set which makes it a straightforward way to control diversity by count rather than by probability mass.
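
The contrast with Top P is easy to see in a toy NumPy sketch over a made-up next-token distribution:

```python
import numpy as np

# Toy next-token probabilities, sorted from most to least likely
probs = np.array([0.40, 0.25, 0.15, 0.10, 0.06, 0.04])

k = 3
top_k_candidates = np.arange(len(probs))[:k]  # fixed count of candidates

p = 0.88
cumulative = np.cumsum(probs)
top_p_candidates = np.arange(len(probs))[: np.searchsorted(cumulative, p) + 1]  # fixed probability mass

print("Top-k keeps", len(top_k_candidates), "tokens")  # always 3
print("Top-p keeps", len(top_p_candidates), "tokens")  # depends on the distribution
```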

Specifies character sequences that cause generation to stop is incorrect because that behavior belongs to stop sequences. Stop sequences halt output when matched and do not change how many candidate tokens are considered for each step.

Controls the cumulative probability mass of candidates considered for the next token is incorrect because that describes nucleus sampling often called Top P. Top P selects tokens until a probability mass threshold is reached rather than selecting a fixed number of tokens.

Adjusts randomness by increasing the chance of choosing lower-probability tokens is incorrect because that describes temperature. Temperature rescales the distribution to make low probability tokens relatively more or less likely but it does not directly limit how many candidate tokens are eligible.

Top K limits candidates by count and Top P limits by cumulative probability. Tune temperature to make choices more or less random and adjust K to trade consistency for variety.

Rivera Robotics plans to add language-model features to rugged handhelds used by field crews at remote wind farms. The team needs sub-50 ms responses and cannot depend on consistent network access. Which approach will best achieve this?

  • ✓ C. Run tuned small language models directly on the edge hardware

Run tuned small language models directly on the edge hardware is the correct option because it delivers predictable, local inference that can meet strict latency targets and does not rely on network availability.

Executing a tuned small model on-device eliminates network round trips and the variability of remote endpoints, so responses are far more likely to arrive under 50 milliseconds in remote settings. Small, optimized models are also more likely to fit the memory, compute, and power constraints of rugged handhelds while still supporting practical language features.

Deploy compressed large language models locally on the edge devices is less suitable because compressed large models still often exceed the compute and memory of typical field handhelds and they can increase latency and power consumption compared with tuned small models.

Use a central small language model endpoint with asynchronous calls from devices remains dependent on network connectivity and it introduces round trip delays, so it cannot guarantee the low and consistent latency required at remote wind farms.

Integrate a centralized large language model endpoint with asynchronous communication is the poorest match because large centralized models increase bandwidth and latency demands and they require reliable connectivity that is not available in many remote field deployments.

When a scenario stresses very low latency and unreliable connectivity choose on-device inference with a small, optimized model that fits the device hardware and power budget.

A data science group at Aurora Retail is developing a machine learning service that must adhere to company fairness policies and equal opportunity laws to prevent discriminatory outcomes. Which capability should they prioritize to confirm the model treats all user segments fairly?

  • ✓ B. Bias detection and mitigation

Bias detection and mitigation is the correct capability because it directly measures and helps remediate disparate outcomes across sensitive attributes and it enables the team to validate equitable treatment and meet anti discrimination requirements.

SageMaker Clarify provides bias analysis during training and inference and it can compute fairness metrics and produce reports that help the team identify and reduce discriminatory outcomes. Using bias detection and mitigation lets the data science group test model behavior across user segments and implement corrective actions before deployment.

Model compression techniques focus on reducing model size and improving latency which helps deployment but they do not detect or correct unfair predictions across demographic groups.

Amazon SageMaker Automatic Model Tuning searches for hyperparameters to improve model performance and it does not evaluate or mitigate bias across protected classes.

Cross-validation improves confidence that the model will generalize and it reduces overfitting but it does not surface inequities among protected classes and therefore does not ensure fairness.

When a question highlights fairness or equitable outcomes prefer answers that mention bias detection or mitigation tools such as SageMaker Clarify to verify treatment across user groups.

A wealth management startup is building a generative AI tool to condense equity research memos from the past 18 months into brief summaries. The security team worries that the model’s responses could unintentionally surface proprietary trading approaches and nonpublic methods. Which discipline should the team focus on first to address this risk?

  • ✓ B. AI risk management

The correct choice is AI risk management because it specifically targets risks that arise from what a model generates and helps prevent accidental disclosure of proprietary trading approaches.

AI risk management focuses on assessing model behavior and applying controls such as prompt hardening, output filtering, red teaming, human-in-the-loop review, and data governance for training and fine tuning. These measures address semantic leakage and nonpublic method disclosure when condensing internal research into summaries, and they enable policies and testing that reduce the chance of sensitive information surfacing in responses.

Identity and access management constrains who can access the model and underlying systems and that is important for limiting exposure, but it does not control the semantic content that the model may synthesize from permitted inputs.

Centralized logging and monitoring increase observability and support detection and post incident analysis, but they do not prevent the model from generating sensitive material in the first place.

Network segmentation and isolation reduce attack surface and protect infrastructure, but they do not mitigate the risk of confidential details being produced by the model during normal use.

When a scenario calls out risks from model outputs pick controls that focus on model behavior and governance rather than only on infrastructure or access controls.

Darcy DeClute

Darcy DeClute is a Certified Cloud Practitioner and author of the Scrum Master Certification Guide. Popular both on Udemy and social media, Darcy’s @Scrumtuous account has well over 250K followers on Twitter/X.

