20 Tough AWS AI Practitioner Certification Exam Questions and Answers

Why the AWS AI Practitioner certification is important

Over the past few months, I’ve been working hard to help professionals adapt to the rapid rise of artificial intelligence and discover exciting new opportunities in cloud and data-driven careers.

Part of that journey is strengthening your resume with the right certifications, and the one I recommend every learner start with is the AWS Certified AI Practitioner credential.

Whether you’re a Scrum Master, Product Owner, DevOps Engineer, or senior developer, this AI certification builds the foundation you need to understand AI and machine learning the way AWS implements them: securely, at scale, and in the cloud.

In today’s tech landscape, you can’t afford to be unaware of how AI services power modern applications. Every successful technologist needs to understand how to use foundation models, integrate APIs like Amazon Bedrock, and design solutions that take advantage of AWS’s AI ecosystem.

That’s exactly what the AWS AI Practitioner Exam validates. It tests your understanding of fundamental AI and ML concepts, AWS AI services, responsible AI practices, and how businesses can use AI to innovate faster and more efficiently.

AWS AI Practitioner exam simulators

Through my Udemy courses on the AWS Cloud Practitioner exam, machine learning, AWS DevOps and solutions architect certifications, and through the free practice question banks at certificationexams.pro, I’ve seen which AI concepts challenge learners the most.

Based on thousands of student sessions and performance data, these are 20 of the toughest AWS AI Practitioner certification exam questions currently circulating in the practice pool.

Each question is carefully explained at the end of the set. Take your time, think like an AWS AI architect, and review the reasoning behind each answer to strengthen your understanding.

If you’re preparing for the AWS AI Practitioner Exam or exploring related certifications from AWS, Google Cloud, or Azure, you’ll find hundreds more free practice exam questions and detailed explanations at certificationexams.pro as well.

AWS AI Practitioner Udemy Course

You can find over 500 more questions in my AI Practitioner Udemy course.

And just to be clear, these are not AI Practitioner exam dumps or braindumps. Every question is original and designed to teach you how to think about AI and cloud integration, not just what to memorize. Each answer includes its own tip and guidance to help you pass the real exam with confidence.

Now, let’s dive into the 20 toughest AWS AI Practitioner certification exam questions.

Good luck, and remember, every great cloud career in the age of AI starts with understanding how AWS brings artificial intelligence to life.

Git, GitHub & GitHub Copilot Certification Made Easy

Want to get certified on the most popular AI, ML & DevOps technologies of the day? These resources will help you get GitHub certified in a hurry.

Get certified in the latest AI, ML and DevOps technologies. Advance your career today.

AWS AI Practitioner Exam Questions

Question 1

A digital retailer, NovaMart, is building an AI solution to automatically label items in photos that customers upload to product listings. The ML engineers are reviewing different neural network families to achieve high-accuracy image categorization at scale. Which neural network architecture is the most appropriate for this image classification use case?

  • A. Amazon Rekognition
  • B. Generative adversarial networks (GANs)
  • C. Convolutional neural networks (CNNs)
  • D. Recurrent neural networks (RNNs)

Question 2

NorthBridge Ledger, a fintech firm, uses AI to analyze highly sensitive payment transactions. The company must demonstrate conformance with global security frameworks such as ISO/IEC 27001 and SOC 2 during quarterly audits. Which AWS service provides direct access to AWS compliance reports and certifications to support these reviews?

  • A. AWS Config
  • B. AWS Artifact
  • C. AWS Trusted Advisor
  • D. AWS CloudTrail

Question 3

Which statement best describes the Transformer architecture for modeling long text with preserved context?

  • A. Amazon Translate
  • B. A recurrent neural network that processes tokens sequentially
  • C. A deep learning architecture using self-attention to capture token relationships
  • D. A technique limited to numeric time-series forecasting

Question 4

A product team at Vega Retail has launched a text-generation model on Amazon Bedrock with safety guardrails enabled. During internal red-team exercises, testers craft carefully worded inputs that bypass the guardrails and cause the model to return responses that violate the content policy even though restrictions are in place. Which security risk best describes this behavior?

  • A. Indirect prompt injection through chained context
  • B. Model jailbreak using adversarial prompting
  • C. Training-time backdoor that exposes parameters without authorization
  • D. Inference-time leakage of memorized sensitive data

Question 5

A retail startup, Aurora Threads, is deploying a customer support chatbot on Amazon Bedrock and wants every reply to match their brand’s friendly and confident voice. Which technique should the team use to reliably guide the model to use the desired tone without retraining the model?

  • A. Few-shot prompting
  • B. Temperature parameter tuning
  • C. Prompt engineering
  • D. Model fine-tuning

Question 6

During a 90-day pilot of an internal Q&A assistant on Amazon Bedrock, how should impact on employee efficiency be measured?

  • A. BLEU score
  • B. CloudWatch latency and token usage
  • C. Measure productivity KPIs like average handle time, resolutions per hour, and time saved per request
  • D. Increase retrieval recall

Question 7

BlueSky Metrics, a media research startup, is building a knowledge hub with Amazon Bedrock to drive semantic search and summarization. They expect to index about 12 million embeddings for natural language queries and document retrieval and want to avoid managing an external vector store. If they let Knowledge Bases for Amazon Bedrock create the storage automatically, which vector database will be used by default?

  • A. Amazon DynamoDB
  • B. Amazon OpenSearch Serverless vector store
  • C. Amazon Aurora
  • D. Redis Enterprise Cloud

Question 8

Meridian Pay is building a customer support assistant on AWS using Amazon SageMaker and Amazon Bedrock. The team wants to ensure the assistant acts responsibly and maintains user trust during conversations. Which guideline should the team prioritize to meet this goal?

  • A. AWS Shield Advanced
  • B. Build transparency and explainability into the assistant so users can understand how responses are generated
  • C. Use proprietary customer datasets even if consent is not explicit
  • D. Optimize solely for speed and efficiency

Question 9

What does implementing data residency primarily ensure?

  • A. Blocks access to encrypted data except for a specified IAM role
  • B. Keeps personal data stored and processed within a chosen region to meet legal or contractual requirements
  • C. Enables automatic KMS encryption for all data at rest
  • D. Automatically records all training datasets in CloudTrail for audits

Question 10

Riverstone Outfitters uses Amazon Bedrock to draft automated replies to customer emails and live chat messages. Leadership wants every message to consistently reflect the company’s friendly and trustworthy brand voice. What change would be the most effective to achieve this?

  • A. Adjust the temperature setting
  • B. Prompt engineering with explicit tone instructions
  • C. Guardrails for Amazon Bedrock
  • D. Model fine-tuning

Question 11

A fintech startup that processes mobile payments notices its classifier achieves very high precision but noticeably lower recall. They want a single evaluation metric that fairly balances both. What does the F1 score represent?

  • A. The arithmetic sum of precision and recall
  • B. The average of the true positive rate and the true negative rate
  • C. The harmonic mean of precision and recall
  • D. Amazon SageMaker

Question 12

Which AWS capability provides bias detection, explainability, and continuous production monitoring for responsible AI?

  • A. Amazon Bedrock Guardrails
  • B. Amazon CloudWatch Alarms
  • C. SageMaker Clarify plus SageMaker Model Monitor
  • D. SageMaker Model Cards and Model Registry

Question 13

A travel-tech startup, AeroTrail, is building an assistant on Amazon Bedrock that must carry out multi-turn reasoning over several steps, select and invoke external tools at runtime, and make HTTP calls to third-party APIs based on intermediate outcomes. Which AWS service or feature should the team choose to meet these needs?

  • A. AWS Step Functions
  • B. Amazon Bedrock Agents
  • C. Amazon SageMaker JumpStart
  • D. Amazon Augmented AI (A2I)

Question 14

Aurora Media Group built a multilingual foundation model that translates everyday text well but misses terminology used in oil and gas compliance reports. The team plans to fine-tune the model in Amazon SageMaker with about 1,600 domain documents and a bilingual glossary to improve accuracy. Which data preparation approach is most critical to capture this specialized vocabulary?

  • A. Amazon Translate
  • B. Randomly sample sentences from the corpus without quality checks
  • C. Curate and label domain-parallel examples using a consistent glossary
  • D. Rely solely on the base pre-trained model without any domain data curation

Question 15

Which method best grounds LLM responses in trusted internal data to reduce hallucinations?

  • A. Fine-tuning
  • B. Prompt engineering
  • C. RAG over trusted data
  • D. Guardrails for Amazon Bedrock

Question 16

A recruiting analytics startup is building a machine learning tool to assist employers with initial candidate screening. During evaluation, the team observes that the model appears to systematically prefer certain demographic groups. They believe the training dataset reflects biased hiring patterns from the past, which could drive skewed predictions. The team now wants to pinpoint how human decision making can introduce bias into the system. Which scenario best demonstrates human bias influencing machine learning outcomes?

  • A. A machine learning algorithm that forecasts customer churn using last month’s data produces misleading results because the dataset reflects a holiday spike and off-season slump
  • B. Amazon Comprehend shows inconsistent sentiment scores for social media posts that contain sarcasm and slang
  • C. A data scientist manually chooses input variables based on their own assumptions about what matters, which embeds personal bias into the model
  • D. A model trained on legacy recruiting records consistently ranks male applicants higher for software engineering roles

Question 17

Aurora Outfitters, an online marketplace, stores 18 months of purchase history, inventory counts, and revenue metrics in a central relational database. Leadership wants regional supervisors and marketing coordinators to ask questions in plain English and get answers without knowing schemas or SQL. Which approach best enables this capability?

  • A. Create fixed dashboards backed by preset SQL that users can pick from
  • B. Use a generative AI system to translate natural language prompts into executable SQL in real time
  • C. Implement a simple keyword search that matches terms but cannot interpret full questions into structured queries
  • D. Amazon Kendra

Question 18

Which approach enforces, at generation time, that Amazon Bedrock responses never include PII?

  • A. Amazon Comprehend PII entity detection
  • B. Guardrails for Amazon Bedrock plus CloudWatch alarms
  • C. AWS WAF sensitive data rules
  • D. Amazon Macie on S3 outputs

Question 19

A subscription-based video platform is training a recommendation model using 30 months of user interaction data, and the ML team must tune hyperparameters to balance model complexity and generalization. They want to avoid both overfitting and underfitting to ensure strong results on historical data as well as validation sets. How should the team distinguish overfitting from underfitting when evaluating performance on training data versus new data?

  • A. Overfitting and underfitting both describe a model that performs similarly well on training data and on unseen data
  • B. Amazon SageMaker Debugger
  • C. Overfitting means the model scores high on the training set but degrades on new data, whereas underfitting means the model performs poorly on both training and new data
  • D. Overfitting happens when a model is too simple to represent the data, while underfitting happens when a model is overly complex and memorizes noise

Question 20

A retail marketplace called NovaStreet is creating a customer assistant that accepts written questions plus uploaded product photos or screenshots of problems. The team wants the assistant to jointly interpret text and images and return accurate, context-aware replies while keeping inference costs under control. Which approach would be the most cost-effective way to support these multimodal requests?

  • A. A convolutional neural network
  • B. A text-only large language model
  • C. A multimodal embedding model that aligns image and text inputs in a common vector space
  • D. A multimodal generative model

AWS AI Practitioner Exam Answers

AWS AI Practitioner Udemy Course

You can find more exam questions in my AI Practitioner Udemy course.

Question 1


A digital retailer, NovaMart, is building an AI solution to automatically label items in photos that customers upload to product listings. The ML engineers are reviewing different neural network families to achieve high-accuracy image categorization at scale. Which neural network architecture is the most appropriate for this image classification use case?


The best fit is Convolutional neural networks (CNNs). CNNs exploit local connectivity and weight sharing through convolutions and pooling, learning spatial features such as edges, textures, and shapes that make them highly effective and efficient for image classification.

Amazon Rekognition is an AWS managed service for vision tasks, not a model architecture, so it does not directly answer the choice of network type.

Generative adversarial networks (GANs) are optimized for generating new, realistic images and are not the standard approach for supervised classification.

Recurrent neural networks (RNNs) are designed for sequences like text or time series and generally underperform CNNs on static image classification.

Exam Tip

Match model families to data: use CNNs for images, RNNs/Transformers for sequences, and GANs for data generation; AWS services like Amazon Rekognition often leverage underlying CNNs to solve vision problems.

Question 2


NorthBridge Ledger, a fintech firm, uses AI to analyze highly sensitive payment transactions. The company must demonstrate conformance with global security frameworks such as ISO/IEC 27001 and SOC 2 during quarterly audits. Which AWS service provides direct access to AWS compliance reports and certifications to support these reviews?


The correct choice is AWS Artifact. It provides on-demand access to AWS’s audit artifacts, including ISO/IEC 27001 and SOC 2 reports, which customers can download and present to regulators and auditors.

AWS Config helps track and evaluate configuration compliance of your resources but does not furnish the official certifications auditors request.

AWS Trusted Advisor focuses on best-practice checks across cost, performance, fault tolerance, service quotas, and security, not formal attestations.

AWS CloudTrail captures API activity for forensic and governance needs, yet it does not provide AWS compliance certifications or third-party audit reports.

Exam Tip

When a question mentions audit-ready reports, certifications, or standards like ISO/IEC 27001 or SOC 2, map it to AWS Artifact; if it emphasizes configuration rules or API logging, think AWS Config or AWS CloudTrail instead.

Question 3


Which statement best describes the Transformer architecture for modeling long text with preserved context?


The correct answer is A deep learning architecture using self-attention to capture token relationships. Transformers use self-attention to directly relate every token to every other token in a sequence, enabling strong context preservation and parallel processing, which is highly effective for tasks like summarization and translation.

The option Amazon Translate is a managed service, not a model architecture, so it does not describe what a Transformer is.

The option A recurrent neural network that processes tokens sequentially is inaccurate because RNNs rely on recurrence and struggle with long-range dependencies and parallelization, which Transformers address with self-attention.

The option A technique limited to numeric time-series forecasting incorrectly narrows the scope; Transformers are widely used across NLP and beyond.

Exam Tip

Watch for keywords like self-attention, parallel processing, and long-range dependencies. If an option emphasizes recurrence or a specific AWS service, it likely is not describing the Transformer architecture.
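To make self-attention concrete, here is a minimal NumPy sketch of scaled dot-product attention. As a simplification, it uses the input itself as queries, keys, and values instead of learned projection matrices, so it illustrates the mechanism rather than a production Transformer layer:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of token vectors.

    X has shape (seq_len, d). Every token attends to every other token,
    which is how Transformers preserve long-range context.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)       # pairwise token-to-token similarity
    weights = softmax(scores, axis=-1)  # each row is a distribution over tokens
    return weights @ X                  # context-aware token representations

# Toy sequence: 4 tokens, 8-dimensional embeddings
X = np.random.default_rng(0).normal(size=(4, 8))
out = self_attention(X)
print(out.shape)  # (4, 8): one updated vector per token
```

Note that every score is computed in one matrix product, which is why Transformers parallelize well where recurrent networks must step through tokens one at a time.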

Question 4


A product team at Vega Retail has launched a text-generation model on Amazon Bedrock with safety guardrails enabled. During internal red-team exercises, testers craft carefully worded inputs that bypass the guardrails and cause the model to return responses that violate the content policy even though restrictions are in place. Which security risk best describes this behavior?


The correct risk is Model jailbreak using adversarial prompting. Testers are crafting inputs that intentionally bypass guardrails to generate restricted content, which is characteristic of jailbreak attacks against generative models.

Indirect prompt injection through chained context is about untrusted external data or tool outputs influencing the model during multi-step flows, which is not present in this direct user-prompt scenario.

Training-time backdoor that exposes parameters without authorization involves malicious data or triggers inserted during training or fine-tuning, whereas the described behavior emerges solely from cleverly phrased prompts at inference.

Inference-time leakage of memorized sensitive data would be the model revealing private training data, but the problem here is bypassing safety policies rather than data exposure.

Exam Tip

On the exam, map direct user-crafted prompts that bypass safety controls to jailbreak, while prompt injection typically involves untrusted external content in tool use or chaining scenarios.

Question 5


A retail startup, Aurora Threads, is deploying a customer support chatbot on Amazon Bedrock and wants every reply to match their brand’s friendly and confident voice. Which technique should the team use to reliably guide the model to use the desired tone without retraining the model?


The best choice is Prompt engineering. By explicitly instructing the model with style guidance (for example, persona, tone, and do/don’t lists), teams can reliably enforce a brand voice without changing model weights.

Few-shot prompting can nudge outputs by showing examples, but it is less consistent for tone than direct style instructions and often requires more tokens and maintenance.

Temperature parameter tuning primarily controls randomness; lowering temperature makes outputs more deterministic but does not ensure a specific voice.

Model fine-tuning can embed style but adds cost, governance overhead, and time, and is unnecessary when simple prompt directives achieve the goal.

Exam Tip

For tone and style, prefer prompt engineering. Use system prompts or instructions to set persona and tone; reserve fine-tuning for domain adaptation or consistent task performance, not mere voice control.

Question 6


During a 90-day pilot of an internal Q&A assistant on Amazon Bedrock, how should impact on employee efficiency be measured?


The best way to assess efficiency impact is to track business outcome metrics tied to worker productivity. Measure productivity KPIs like average handle time, resolutions per hour, and time saved per request because these directly capture whether the assistant reduces effort and speeds up task completion.

The option BLEU score is inappropriate because it measures text-overlap quality for machine translation and does not relate to workplace efficiency.

CloudWatch latency and token usage provides useful operational and cost metrics but does not quantify productivity gains for end users.

Increase retrieval recall tunes the retrieval layer, yet it remains a system metric; without outcome KPIs it cannot validate real efficiency improvements.

Exam Tip

Tie generative AI evaluation to business outcomes. Prioritize productivity, quality, cost, and risk metrics over pure model or system metrics. For pilots, define baseline KPIs, run A/B or before/after comparisons, and attribute improvements to the assistant by controlling for confounders.

Question 7


BlueSky Metrics, a media research startup, is building a knowledge hub with Amazon Bedrock to drive semantic search and summarization. They expect to index about 12 million embeddings for natural language queries and document retrieval and want to avoid managing an external vector store. If they let Knowledge Bases for Amazon Bedrock create the storage automatically, which vector database will be used by default?


Amazon OpenSearch Serverless vector store is the default vector database that Knowledge Bases for Amazon Bedrock provisions when you allow Bedrock to create the store, using the OpenSearch Serverless vector engine for embedding storage and similarity search.

Amazon DynamoDB is a key-value and document database and is not supported as a default or native vector store for Knowledge Bases.

Amazon Aurora (with pgvector on PostgreSQL) can be configured as an integration, but it is customer-managed and not the default Bedrock-managed option.

Redis Enterprise Cloud is supported as an external connector for vector storage, yet it is not automatically created as the default store.

Exam Tip

When a prompt highlights default behavior or that Bedrock will create the vector store for you in Knowledge Bases, select Amazon OpenSearch Serverless; other stores are customer-managed or external connectors.

Question 8


Meridian Pay is building a customer support assistant on AWS using Amazon SageMaker and Amazon Bedrock. The team wants to ensure the assistant acts responsibly and maintains user trust during conversations. Which guideline should the team prioritize to meet this goal?


The correct choice is Build transparency and explainability into the assistant so users can understand how responses are generated. Transparency and explainability provide insight into model outputs, support accountability, and help detect and mitigate bias, which are core to responsible AI on AWS.

AWS Shield Advanced is unrelated to AI ethics because it focuses on DDoS protection rather than model transparency.

Use proprietary customer datasets even if consent is not explicit conflicts with privacy and data governance requirements, which undermines responsible AI.

Optimize solely for speed and efficiency ignores fairness, accountability, and user trust, which are essential for ethical deployment.

Exam Tip

When a scenario stresses ethical or responsible AI, look for options mentioning transparency, explainability, human oversight, and privacy rather than performance-only goals or unrelated AWS services.

Question 9


What does implementing data residency primarily ensure?


The correct answer is Keeps personal data stored and processed within a chosen region to meet legal or contractual requirements. Data residency focuses on the geographic placement of data to satisfy laws, regulations, and contractual obligations that mandate data remain within specific boundaries. On AWS, this means selecting and constraining use of particular Regions for storage and processing so data does not leave those jurisdictions.

The option Blocks access to encrypted data except for a specified IAM role is incorrect because it describes access control and encryption enforcement, which are IAM and KMS concerns, not geographic placement.

The option Enables automatic KMS encryption for all data at rest is incorrect because encryption at rest is separate from residency; you can have encrypted data in any Region.

The option Automatically records all training datasets in CloudTrail for audits is incorrect because CloudTrail logs API events and does not automatically capture or retain datasets, nor does it control where data is stored or processed.

Exam Tip

When you see data residency, think where data is stored and processed. Do not confuse it with encryption (how data is secured), access control (who can access), or logging/auditing (what happened). Keywords like Region, geographic boundary, and regulatory requirements usually point to data residency.
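As an illustration of how residency boundaries are commonly enforced on AWS, a service control policy (or IAM policy) can deny requests outside approved Regions using the `aws:RequestedRegion` condition key. The Region list below is a made-up example, and real policies typically carve out exemptions for global services such as IAM:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyOutsideApprovedRegions",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": ["eu-central-1", "eu-west-1"]
        }
      }
    }
  ]
}
```

This is a sketch of the access-control side of residency; the certification angle is simply that residency is about geographic placement, which such a policy helps enforce but does not define.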

Question 10


Riverstone Outfitters uses Amazon Bedrock to draft automated replies to customer emails and live chat messages. Leadership wants every message to consistently reflect the company’s friendly and trustworthy brand voice. What change would be the most effective to achieve this?


The most effective change is Prompt engineering with explicit tone instructions. By specifying the desired voice, style, role, and providing short exemplars, you directly steer the model to produce consistent, brand-aligned language across replies without additional training.

Adjust the temperature setting changes output variability but does not reliably enforce a consistent brand voice, so messages may still drift in tone.

Guardrails for Amazon Bedrock are designed for safety, topic control, and content policies, which are different from shaping a specific brand tone.

Model fine-tuning could imprint style but adds cost and operational complexity and is usually unnecessary when clear prompt instructions can achieve the tone goal.

Exam Tip

For controlling tone or style, think prompt engineering first; use temperature for variability, guardrails for safety/policy, and reserve fine-tuning for durable behavior changes or when prompts are insufficient.

Question 11


A fintech startup that processes mobile payments notices its classifier achieves very high precision but noticeably lower recall. They want a single evaluation metric that fairly balances both. What does the F1 score represent?


The F1 score combines precision and recall into a single number using the harmonic mean, which penalizes extreme imbalance. The harmonic mean of precision and recall ensures the score is low if either precision or recall is low, making it suitable when both matter.

The arithmetic sum of precision and recall is incorrect because adding the two does not balance them and can overstate performance when one is poor.

The average of the true positive rate and the true negative rate is incorrect because that describes balanced accuracy and includes true negatives, which F1 does not.

Amazon SageMaker is not a metric; it is an AWS service for building and deploying ML models and does not define what F1 represents.

Exam Tip

When you see a need to balance precision and recall, think F1; remember it is the harmonic mean, not the arithmetic mean.
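As a quick sanity check on why the harmonic mean matters, here is a few lines of Python (the precision and recall values are made up for illustration):

```python
def f1_score(precision, recall):
    # Harmonic mean: drops toward zero if either component is low
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# High precision but low recall drags F1 down far more
# than the arithmetic mean would suggest.
print(f1_score(0.95, 0.40))   # ≈ 0.563
print((0.95 + 0.40) / 2)      # arithmetic mean: 0.675
```

The gap between the two numbers is the whole point: F1 punishes the imbalance the fintech startup in this question is seeing.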

Question 12


Which AWS capability provides bias detection, explainability, and continuous production monitoring for responsible AI?


The best choice is SageMaker Clarify plus SageMaker Model Monitor. Clarify provides bias detection and explanation capabilities (including feature attribution) during training and inference, while Model Monitor sets up continuous monitoring for data quality, model quality, and drift in production. Together, they directly address fairness, transparency, and ongoing safety checks.

The option Amazon Bedrock Guardrails focuses on safety policies and content filtering for generative applications but does not deliver systematic bias detection or model explainability across datasets and predictions.

The option Amazon CloudWatch Alarms can alert on metric thresholds but does not assess bias, fairness, or explainability of ML models.

The option SageMaker Model Cards and Model Registry strengthens governance, lineage, and approval workflows, yet it does not provide bias analysis or continuous runtime monitoring.

Exam Tip

When you see keywords like bias, explainability, and continuous monitoring in production, map them to Clarify (bias/explainability) and Model Monitor (ongoing monitoring). Be careful not to confuse content safety features like Guardrails or general observability tools like CloudWatch with responsible AI tooling for fairness and explainability.

Question 13


A travel-tech startup, AeroTrail, is building an assistant on Amazon Bedrock that must carry out multi-turn reasoning over several steps, select and invoke external tools at runtime, and make HTTP calls to third-party APIs based on intermediate outcomes. Which AWS service or feature should the team choose to meet these needs?


Amazon Bedrock Agents is the right choice because it extends foundation models with planning, memory, and tool-use so the model can break tasks into steps, manage state across turns, and invoke external APIs dynamically based on intermediate results.

AWS Step Functions coordinates explicit, predefined steps but does not provide LLM-native reasoning or autonomous tool selection at inference time.

Amazon SageMaker JumpStart supplies prebuilt models and solutions for quick starts, not multi-step orchestration or tool calling by an LLM at runtime.

Amazon Augmented AI (A2I) focuses on human-in-the-loop review and is not designed for autonomous API invocation or multi-step decision-making by an LLM.

Exam Tip

When you see an LLM on Bedrock needing multi-step reasoning plus dynamic tool/API calls, think Bedrock Agents; do not confuse this with Step Functions, which orchestrates static workflows rather than LLM-driven tool use.

Question 14


Aurora Media Group built a multilingual foundation model that translates everyday text well but misses terminology used in oil and gas compliance reports. The team plans to fine-tune the model in Amazon SageMaker with about 1,600 domain documents and a bilingual glossary to improve accuracy. Which data preparation approach is most critical to capture this specialized vocabulary?


The essential strategy is to prepare high-quality, domain-specific training pairs that are explicitly labeled with your glossary so the model can learn how specialized terms should be translated. Curate and label domain-parallel examples using a consistent glossary directly exposes the model to the correct terminology and patterns in context, which is exactly what domain adaptation requires.

Amazon Translate is a managed service for performing translations at inference time and is not a data preparation method for training or fine-tuning your own model.

Randomly sample sentences from the corpus without quality checks introduces noise and irrelevant data, making it less likely the model will learn precise domain terminology.

Rely solely on the base pre-trained model without any domain data curation does not adapt the model to the specialized vocabulary and therefore will not resolve the accuracy gap for the target domain.

Exam Tip

When a scenario asks about improving performance for a specialized domain, look for curated and labeled domain data rather than adding more general data, skipping quality checks, or using a service that operates at inference time.
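For illustration, curated domain-parallel training data is often prepared as JSON Lines, one source/target pair per line, with terminology normalized against the glossary. The exact schema depends on the model and fine-tuning job, and the sentences below are invented examples:

```json
{"source": "The blowout preventer failed the scheduled pressure test.", "target": "Le bloc obturateur de puits a échoué au test de pression programmé."}
{"source": "Flare stack emissions must be logged per the compliance schedule.", "target": "Les émissions de la torchère doivent être consignées selon le calendrier de conformité."}
```

The key property is that every specialized term appears in context with its approved glossary translation, so the model sees the terminology it is expected to reproduce.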

Question 15


Which method best grounds LLM responses in trusted internal data to reduce hallucinations?


RAG over trusted data is correct because it retrieves authoritative documents at query time and provides that context to the model, which anchors responses in verified sources and reduces hallucinations. This is the standard pattern for grounding LLM outputs in enterprise data.
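The pattern is easy to sketch locally. Below, keyword overlap stands in for the vector similarity a real embedding store would compute, and the policy documents are invented for the example.

```python
# Minimal RAG sketch: retrieve the best-matching trusted document (keyword
# overlap stands in for vector similarity), then ground the prompt with it.

DOCS = [
    "Refund policy: refunds are issued within 14 days of purchase.",
    "Shipping policy: standard delivery takes 3 to 5 business days.",
]

def retrieve(query: str, docs: list) -> str:
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def grounded_prompt(query: str) -> str:
    context = retrieve(query, DOCS)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(grounded_prompt("When are refunds issued?"))
```

Because the authoritative text is retrieved at query time, updating the document store immediately updates the model's grounding, with no retraining.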

Fine-tuning is not ideal for keeping answers factually accurate against dynamic data, since it changes weights but does not connect to up-to-date sources.

Prompt engineering can improve instructions but cannot ensure factual grounding without retrieval.

Guardrails for Amazon Bedrock help enforce safety and content policies but do not supply correct facts.

Exam Tip

When you see keywords like grounding, enterprise data, or reducing hallucinations, think RAG. Services like knowledge bases can implement this pattern, but the approach itself is retrieval plus generation.

Question 16


A recruiting analytics startup is building a machine learning tool to assist employers with initial candidate screening. During evaluation, the team observes that the model appears to systematically prefer certain demographic groups. They believe the training dataset reflects biased hiring patterns from the past, which could drive skewed predictions. The team now wants to pinpoint how human decision making can introduce bias into the system. Which scenario best demonstrates human bias influencing machine learning outcomes?


The best answer is A data scientist manually chooses input variables based on their own assumptions about what matters, which embeds personal bias into the model. This directly illustrates human bias because subjective judgments during feature selection can encode personal beliefs into the model, affecting its outcomes.

The scenario where a churn-forecasting model produces misleading results because last month's data reflects a holiday spike and off-season slump describes temporal or seasonal data skew, which is a data quality issue rather than human bias.

The scenario where Amazon Comprehend shows inconsistent sentiment scores for social media posts containing sarcasm and slang reflects a model's difficulty with linguistic nuance, not bias caused by human decisions.

A model trained on legacy recruiting records consistently ranks male applicants higher for software engineering roles is an example of bias inherited from historical data, often termed algorithmic or data bias, rather than bias introduced by human choices during modeling.

Exam Tip

When a question asks about human bias, look for clues about subjective choices by people, such as feature selection, labeling criteria, or excluding attributes; outcomes driven by historic patterns point to data or algorithmic bias instead.

Question 17


Aurora Outfitters, an online marketplace, stores 18 months of purchase history, inventory counts, and revenue metrics in a central relational database. Leadership wants regional supervisors and marketing coordinators to ask questions in plain English and get answers without knowing schemas or SQL. Which approach best enables this capability?


The best solution is to use a generative AI system to translate natural language prompts into executable SQL in real time. An NL-to-SQL approach allows nontechnical staff to ask questions like “Show last week’s top five items by revenue” and have the system generate and run the appropriate SQL over the relational dataset.
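A minimal round trip of the pattern can be sketched with an in-memory database. In a real system the generative model would emit the SQL; here a tiny stub translator stands in, and the schema and data are invented.

```python
# NL-to-SQL sketch: a real system would have the LLM emit the SQL; a stub
# translator stands in here so the round trip is runnable. Schema is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("boots", 900.0), ("hat", 120.0), ("coat", 450.0)])

def to_sql(question: str) -> str:
    """Stub for the generative model's NL-to-SQL translation."""
    if "top" in question.lower() and "revenue" in question.lower():
        return "SELECT item FROM sales ORDER BY revenue DESC LIMIT 2"
    raise ValueError("question not understood")

rows = conn.execute(to_sql("Show the top items by revenue")).fetchall()
print(rows)  # highest-revenue items first: [('boots',), ('coat',)]
```

The key property is that the query is generated on demand from the user's wording, which is exactly what fixed dashboards and keyword search cannot do.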

Creating fixed dashboards backed by preset SQL that users pick from only supports predefined questions and cannot adapt to ad hoc, plain-language requests.

Implementing a simple keyword search matches individual terms but cannot interpret full questions into structured queries; it lacks understanding of user intent and does not produce executable SQL for precise analytics.

Amazon Kendra is designed for intelligent document and FAQ search across content repositories, not for translating natural language into SQL against transactional data.

Exam Tip

When the requirement is ad hoc analytics for nontechnical users who speak in plain language, look for NL-to-SQL with a generative model; static dashboards or keyword/document search tools do not satisfy this need.

Question 18


Which approach enforces, at generation time, that Amazon Bedrock responses never include PII?


Guardrails for Amazon Bedrock plus CloudWatch alarms is correct because Guardrails provide native, proactive enforcement to block PII during response generation, ensuring sensitive information never leaves the model. CloudWatch alarms can notify on guardrail events for operational awareness.
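Guardrails do this natively inside Bedrock and far more robustly than hand-rolled checks, but the output-side blocking idea can be illustrated with a simple filter. The patterns below are crude stand-ins, not the mechanism Guardrails actually uses.

```python
# Illustration only: Guardrails for Amazon Bedrock enforce PII policies
# natively; this regex filter just shows the output-side blocking idea.
import re

PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # US SSN shape
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),     # email address
]

def screen_response(text: str) -> str:
    if any(p.search(text) for p in PII_PATTERNS):
        return "Response blocked: possible PII detected."
    return text

print(screen_response("Contact me at jane@example.com"))  # blocked
print(screen_response("Your order has shipped."))         # passes through
```

The point for the exam is where the control sits: Guardrails evaluate the response before it is returned, rather than detecting PII after the fact.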

The option Amazon Comprehend PII entity detection identifies PII but requires custom integration and does not provide centralized, policy-based blocking for Bedrock outputs.

The option AWS WAF sensitive data rules focuses on HTTP traffic inspection at the edge and is not intended to filter or control Bedrock model responses.

The option Amazon Macie on S3 outputs scans data at rest in S3, so it cannot prevent PII from being generated or returned in real time.

Exam Tip

When you see requirements like “ensure no PII in model outputs” for Bedrock, choose the native enforcement feature. Prefer proactive controls (guardrails) over detection-only tools such as logging, at-rest scanning, or edge firewalls for runtime model safety.

Question 19


A subscription-based video platform is training a recommendation model using 30 months of user interaction data, and the ML team must tune hyperparameters to balance model complexity and generalization. They want to avoid both overfitting and underfitting to ensure strong results on historical data as well as validation sets. How should the team distinguish overfitting from underfitting when evaluating performance on training data versus new data?


The correct differentiation is Overfitting means the model scores high on the training set but degrades on new data, whereas underfitting means the model performs poorly on both training and new data. Overfitting indicates the model has learned patterns plus noise specific to the training set, while underfitting signals the model is too simple to capture the underlying relationships.
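The exam rule of thumb reduces to comparing train and validation metrics. The thresholds in this sketch are arbitrary illustrations, not standard cutoffs.

```python
# Heuristic diagnosis from train vs validation accuracy. The 0.7 floor and
# 0.15 gap thresholds are arbitrary and chosen only for illustration.

def diagnose(train_acc: float, val_acc: float) -> str:
    if train_acc < 0.7 and val_acc < 0.7:
        return "underfitting"        # poor on both sets: model too simple
    if train_acc - val_acc > 0.15:
        return "overfitting"         # large generalization gap: memorized noise
    return "generalizing well"

print(diagnose(0.98, 0.71))  # → overfitting
print(diagnose(0.55, 0.53))  # → underfitting
print(diagnose(0.90, 0.88))  # → generalizing well
```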

Overfitting and underfitting both describe a model that performs similarly well on training data and on unseen data is wrong because similar strong performance on both sets indicates good generalization, not either problem.

Amazon SageMaker Debugger is not a definition; it is a tool that can help detect training issues, but it does not explain the conceptual difference requested here.

Overfitting happens when a model is too simple to represent the data, while underfitting happens when a model is overly complex and memorizes noise is incorrect because it inverts the standard definitions.

Exam Tip

Link the terms to performance patterns: overfitting equals high train and low test performance, while underfitting equals low train and low test performance; use the generalization gap to quickly decide which issue you are seeing.

Question 20


A retail marketplace called NovaStreet is creating a customer assistant that accepts written questions plus uploaded product photos or screenshots of problems. The team wants the assistant to jointly interpret text and images and return accurate, context-aware replies while keeping inference costs under control. Which approach would be the most cost-effective way to support these multimodal requests?


The correct answer is A multimodal embedding model that aligns image and text inputs in a common vector space. Embeddings let you encode both images and text into a shared vector space for similarity search and grounding, which is typically cheaper and faster than running full multimodal generation for every request. You can store vectors in a managed vector database and perform retrieval to provide relevant, context-aware responses with lower inference cost.
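The retrieval side of this pattern can be sketched with toy vectors. A real multimodal model would produce the embeddings; the 3-D vectors, file names, and query below are hand-made for the example.

```python
# Toy shared embedding space: image and text vectors are hand-made 3-D stand-ins
# for real multimodal embeddings. Cosine similarity drives retrieval without
# invoking a generative model per request.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

image_index = {                        # pretend image embeddings
    "red_sneaker.jpg": [0.9, 0.1, 0.0],
    "blue_jacket.jpg": [0.1, 0.9, 0.2],
}
query_embedding = [0.85, 0.15, 0.05]   # pretend embedding of "red running shoe"

best = max(image_index, key=lambda k: cosine(query_embedding, image_index[k]))
print(best)  # → red_sneaker.jpg
```

Embedding once and searching many times is what keeps per-request cost low; the expensive generative model is reserved for composing the final reply, if it is needed at all.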

A convolutional neural network is primarily for image processing and lacks native mechanisms to jointly reason over text and images without additional components.

A text-only large language model cannot ingest or interpret images, so it fails to meet the multimodal requirement.

A multimodal generative model can handle both modalities end to end, but it generally costs more to run continuously compared to retrieval powered by multimodal embeddings for most support and search-style interactions.

Exam Tip

When the goal is to understand images and text together with cost efficiency, think embeddings for retrieval before reaching for full multimodal generation.

Jira, Scrum & AI Certification

Want to get certified on the most popular software development technologies of the day? These resources will help you get Jira certified, Scrum certified and even AI Practitioner certified so your resume really stands out.

You can even get certified in the latest AI, ML and DevOps technologies. Advance your career today.

Cameron McKenzie is an AWS Certified AI Practitioner, Machine Learning Engineer, Solutions Architect and author of many popular books in the software development and Cloud Computing space. His growing YouTube channel training devs in Java, Spring, AI and ML has well over 30,000 subscribers.