AWS DevOps Professional Practice Tests on Exam Topics
The AWS Certified DevOps Engineer Professional exam validates your ability to provision, operate, and manage distributed systems on AWS, with a strong focus on automation, observability, reliability, and security.
It targets practitioners with two or more years of hands-on experience with AWS, CI/CD pipelines, infrastructure as code, and operational excellence practices.
If you are mapping a broader certification plan, see the full AWS certification catalog and compare adjacent tracks such as Solutions Architect, Security, Developer, Data Engineer, and ML Specialty.
For a cross-cloud perspective, you can also review the Google Cloud pathways, including the Professional Cloud DevOps Engineer and Professional Cloud Architect certifications.
DevOps Exam Basics
The AWS Certified DevOps Engineer Professional exam measures how you design and run automated delivery systems, secure and govern multi account environments, monitor complex workloads, and respond to incidents. Question types include multiple choice and multiple response items.
The exam uses a scaled score from 100 to 1000 with a minimum passing score of 750. Scoring is compensatory, which means you pass based on overall performance rather than needing to clear a separate bar in each domain.
The AWS Certified DevOps Engineer Professional exam includes unscored questions that AWS uses to evaluate future content.
If this is your first AWS exam, warm up with the Cloud Practitioner to learn the format, then step up to professional-level expectations, which are similar in difficulty to the Solutions Architect Professional exam.
- About 65 scored questions plus unscored trial items
- Multiple choice and multiple response formats
- Scaled scoring with a 750 minimum to pass
- Compensatory scoring across domains
DevOps Exam Sample Questions
All DevOps questions come from certificationexams.pro and my Certified DevOps Engineer Udemy course.
A Vancouver-based fintech has opened a satellite office in Paris, France. Employees in Paris are experiencing slow responses from the company’s CRM. The application is hosted on Amazon EC2 instances behind an Application Load Balancer and stores data in an Amazon DynamoDB table. The workload currently runs in the us-west-2 Region. A DevOps engineer must reduce end-user latency and improve resiliency for users in both North America and Europe. What actions should be taken? (Choose 3)
-
❏ A. Put Amazon CloudFront in front of the current ALB to accelerate requests globally
-
❏ B. Configure Amazon Route 53 latency-based routing with health checks to direct users to the closest ALB
-
❏ C. Create a new DynamoDB table in eu-west-3 and enable cross-Region replication for it
-
❏ D. Deploy a new Application Load Balancer and an Auto Scaling group in eu-west-3 and configure the new ALB to route traffic to the new Auto Scaling group
-
❏ E. Add an Auto Scaling group in eu-west-3 and register it with the existing ALB in us-west-2
-
❏ F. Use DynamoDB Global Tables to add a replica of the CRM table in eu-west-3
A regional logistics startup is moving its Docker workloads from its data center to AWS. They will run several services on Amazon ECS using the EC2 launch type behind an Application Load Balancer. The operations team needs the platform to automatically aggregate all container and load balancer logs and deliver them to an Amazon S3 bucket for near-real-time analysis, targeting about a two-minute end-to-end delay. How should the team configure the ECS environment to achieve this? (Choose 3)
-
❏ A. Enable access logging on the Application Load Balancer and configure the destination to the specified S3 bucket
-
❏ B. Set up Amazon Macie to scan the S3 bucket and provide near-real-time analysis of the access logs
-
❏ C. Use the awslogs log driver in ECS task definitions, install the CloudWatch Logs agent on the container instances, and grant the needed permissions on the instance role
-
❏ D. Create a CloudWatch Logs subscription filter to an Amazon Kinesis Data Firehose delivery stream that writes continuously to the S3 bucket
-
❏ E. Turn on Detailed Monitoring in CloudWatch for the load balancer to store access logs in S3
-
❏ F. Use an AWS Lambda function triggered by an Amazon EventBridge schedule every 2 minutes to call CreateLogGroup and CreateExportTask to move logs to S3
An engineering sandbox account at Orion Robotics allows unrestricted experimentation with AWS services. A recent review in AWS Trusted Advisor flagged multiple Amazon EC2 instances using default security groups that expose SSH on port 22 to the internet. The security team requires that SSH be limited to the corporate data center’s public IP, even in this nonproduction account. The team also wants near real-time alerts about Trusted Advisor security findings and automated remediation when port 22 is found to be open to the world. Which actions should be implemented? (Choose 3)
-
❏ A. Set up a custom AWS Config remediation that calls a Lambda function to update security groups for port 22 to the office IP
-
❏ B. Create an AWS Config rule that flags any security group with port 22 open to 0.0.0.0/0 and sends a notification to an SNS topic when noncompliant
-
❏ C. Enable AWS Security Hub and use AWS Foundational Security Best Practices to auto-remediate open port 22 findings from Trusted Advisor
-
❏ D. Schedule an Amazon EventBridge rule to invoke a Lambda function that calls the AWS Support API to refresh Trusted Advisor checks and publishes findings to an SNS topic
-
❏ E. Implement a custom AWS Config rule with an AWS Systems Manager Automation remediation runbook to restrict port 22 in security groups to the headquarters IP
-
❏ F. Configure a Lambda function to run every 15 minutes to refresh Trusted Advisor checks and rely on Trusted Advisor email alerts for changes
A platform engineer at Aurora Outfitters operates an Amazon EKS cluster with managed node groups. The cluster has an OIDC provider configured for IAM roles for service accounts, and applications request gp3-backed persistent volumes via a StorageClass named gp3-standard. When creating a 40 GiB PersistentVolumeClaim, kubectl describe pvc shows failed to provision volume with StorageClass and could not create volume in EC2: UnauthorizedOperation. What is the most appropriate way to eliminate these errors?
-
❏ A. Create a Kubernetes ClusterRole and RoleBinding that lets persistent volumes and storage objects be listed, watched, created, and deleted
-
❏ B. Change the PVC to reference a pre-existing EBS volume and disable dynamic provisioning
-
❏ C. Configure an IAM role for the EBS CSI driver using IRSA and attach it to the add-on so it can call the required EC2 APIs
-
❏ D. Add the necessary EC2 permissions to the node group instance profile so pods inherit them
A data analytics consultancy named Ridgewave Partners collaborates with about 18 client companies. The consultancy must roll out the same application to client-owned AWS accounts across three Regions by using AWS CloudFormation. Each client has authorized the consultancy to create IAM roles in their account solely for the deployment. The consultancy wants to minimize ongoing operations and avoid per-account stack babysitting. What is the most appropriate approach to deploy to all accounts?
-
❏ A. Create an AWS Organization with all features, invite the client accounts, and use CloudFormation StackSets with service-managed permissions
-
❏ B. Establish a cross-account IAM role in every client account that trusts the consultancy account for StackSets operations, then deploy using CloudFormation StackSets with self-managed permissions
-
❏ C. Upload the template to a shared Amazon S3 bucket, create admin users in each client account, and sign in to build stacks from the shared template
-
❏ D. Adopt AWS Control Tower and use a delegated administrator to roll out the application with service-managed StackSets
A ticketing startup operates a Node.js web service on Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer. During abrupt traffic surges, some scale-out events fail and intermittent errors appear. CloudWatch logs show messages like: ‘Instance did not finish the user’s lifecycle action; lifecycle action with token was abandoned due to heartbeat timeout.’ What should a DevOps engineer do to capture logs from all impacted instances and preserve them for later analysis? (Choose 3)
-
❏ A. Enable VPC Flow Logs for the subnets hosting the Auto Scaling group and export them to Amazon S3
-
❏ B. Configure an Auto Scaling lifecycle hook on instance termination and use an Amazon EventBridge rule to trigger an AWS Lambda function that runs AWS Systems Manager Run Command to pull application logs and upload them to Amazon S3
-
❏ C. Update the Auto Scaling group health check to point to the correct application port and protocol
-
❏ D. Turn on access logging at the target group and send logs to an Amazon S3 bucket
-
❏ E. Adjust the AWS CodeDeploy deployment group to resolve account deployment concurrency limits that have been reached
-
❏ F. Use Amazon Athena to query the logs directly from the Amazon S3 bucket
BlueOrbit, a fintech startup, uses Amazon Cognito user pools to authenticate its web portal users. MFA is enforced, and the team needs to protect access to user profiles. They also require an automatic email notification to the user whenever a sign-in succeeds. Which approach is the most operationally efficient way to achieve this?
-
❏ A. Create an AWS Lambda function that uses Amazon SES to send an email and front it with an Amazon API Gateway endpoint that the client app calls after confirming login
-
❏ B. Configure Amazon Cognito to deliver authentication logs to Amazon Kinesis Data Firehose, process them with AWS Lambda, and send emails based on login outcomes
-
❏ C. Attach an AWS Lambda function that uses Amazon SES to the Amazon Cognito post authentication trigger to send the login email
-
❏ D. Use Amazon EventBridge to match AWS CloudTrail events for Cognito sign-ins and invoke an AWS Lambda function that emails the user via Amazon SES
At Aurora Dynamics, the platform team uses AWS CodeDeploy to publish a new version of an AWS Lambda function once a CodeBuild stage in an AWS CodePipeline completes successfully. Before traffic is shifted, the pipeline may trigger a Step Functions state machine named AssetReorgFlow-v3 that runs an Amazon ECS on Fargate task to reorganize objects in an Amazon S3 bucket for forward compatibility. The Lambda update must not receive production traffic until that reorganization has fully finished. What should you implement to keep traffic from reaching the new Lambda version until the workflow is done?
-
❏ A. Use an AfterAllowTraffic hook in the AppSpec to verify the Step Functions execution after traffic is shifted
-
❏ B. Configure a CodeDeploy canary to send 20% of traffic for 15 minutes during the S3 reorganization
-
❏ C. Add a BeforeAllowTraffic lifecycle hook in the Lambda AppSpec that waits for the Step Functions execution to complete
-
❏ D. Insert a Step Functions state that calls a CodeDeploy API to tell it to switch the Lambda alias when ready
NovaRetail uses AWS Control Tower to run a landing zone with 36 AWS accounts across two Regions. Each product team owns its own application account, and a shared DevOps account provides CI/CD pipelines. A CodeBuild project in the DevOps account must deploy to an Amazon EKS cluster named prodclaims that runs in a team account. The team already has a deployment IAM role in its account, and the cluster uses the default aws-auth ConfigMap. CodeBuild runs with a service role in the DevOps account, but kubectl commands fail with Unauthorized when connecting cross account. What should be changed so the pipeline can authenticate and deploy successfully?
-
❏ A. Modify the DevOps account deployment role to trust the application account using sts:AssumeRoleWithSAML and grant the role access to CodeBuild and the EKS cluster
-
❏ B. Create an IAM OIDC identity provider and have CodeBuild call sts:AssumeRoleWithWebIdentity to reach the cluster without updating aws-auth
-
❏ C. In the application account, update the deployment role trust policy to trust the DevOps account with sts:AssumeRole, attach EKS permissions to that role, and add the role to the cluster aws-auth ConfigMap
-
❏ D. Configure the trust relationship on the centralized DevOps role to allow the application account using sts:AssumeRole, then grant EKS permissions and update aws-auth
A streaming media startup runs a serverless backend that handles tens of thousands of API calls with AWS Lambda and stores state in Amazon DynamoDB. Clients invoke a Lambda function through an Amazon API Gateway HTTP API to read large batches from the DynamoDB PlaybackSessions table. Although the table uses DynamoDB Accelerator, users still see cold-start delays of roughly 8–12 seconds during afternoon surges. Traffic reliably peaks from 3 PM to 6 PM and tapers after 9 PM. What Lambda configuration change should a DevOps engineer make to keep latency consistently low at all times?
-
❏ A. Configure reserved concurrency for the function and use Application Auto Scaling to set reserved concurrency to roughly half of observed peak traffic
-
❏ B. Enable provisioned concurrency for the function and configure Application Auto Scaling with a minimum of 2 and a maximum of 120 provisioned instances
-
❏ C. Increase the Lambda memory size to 8,192 MB to reduce initialization and execution time
-
❏ D. Set the function’s ephemeral storage to 10,240 MB to cache data in /tmp between invocations
Lumina Labs runs workloads in nine AWS accounts and wants to monitor the cost-effectiveness of Amazon EC2 across these accounts. All EC2 resources are tagged with environment, cost center, and division for chargeback. Leadership has asked the DevOps team to automate cross-account cost optimization in shared environments, including detecting persistently low-utilization EC2 instances and taking action to remove them. What is the most appropriate approach to implement this automation?
-
❏ A. Create Amazon CloudWatch dashboards filtered by environment, cost center, and division tags to monitor utilization, and use Amazon EventBridge rules with AWS Lambda to stop or terminate underused instances
-
❏ B. Use AWS Trusted Advisor with a Business or Enterprise Support plan integrated with Amazon EventBridge, and invoke AWS Lambda to filter on tags and automatically terminate consistently low-utilization EC2 instances
-
❏ C. Enroll all accounts in AWS Compute Optimizer and use its recommendations to automatically shut down instances via an Amazon EventBridge rule and AWS Lambda
-
❏ D. Build a scheduled EC2-based script to collect utilization across accounts, store results in Amazon DynamoDB, visualize in Amazon QuickSight, and trigger AWS Lambda to terminate idle instances
Zento Labs operates a media-sharing service where users upload images and receive device-optimized variants in iOS and Android apps, desktop browsers, and popular chat platforms. A Lambda function inspects the request User-Agent and chooses the right image size. To deliver consistently low latency for viewers across several continents, what should you implement?
-
❏ A. Expose the Lambda through Amazon API Gateway using an edge-optimized endpoint
-
❏ B. Associate the function with a CloudFront distribution using Lambda@Edge
-
❏ C. Use CloudFront Functions for the device-based response logic
-
❏ D. Configure the Lambda function as the origin for a CloudFront distribution
A travel-tech company, Lumen Journeys, operates a web application on an Auto Scaling group of Amazon EC2 instances across multiple Availability Zones behind an Application Load Balancer. The database runs on Amazon RDS for MySQL, and Amazon Route 53 directs clients to the load balancer. The ALB health check hits an endpoint that verifies the application can connect to the database. Due to regulatory requirements, leadership mandates a disaster recovery site in a separate AWS Region. The recovery point objective is 10 minutes and the recovery time objective is 90 minutes. Which approach would require the fewest changes to the existing stack?
-
❏ A. Enable RDS Multi-AZ and place the standby in a second Region, duplicate the app tier there, and use Route 53 latency routing to direct traffic during failures
-
❏ B. Recreate the full stack in another Availability Zone, add a MySQL read replica in that AZ, and configure Route 53 failover to switch when the primary AZ is unavailable
-
❏ C. Deploy the application tier in a different AWS Region without RDS, create a cross-Region RDS MySQL read replica, point the DR app to the local replica, and use Route 53 failover routing
-
❏ D. Migrate the database to Amazon Aurora MySQL-Compatible and use Aurora Global Database for cross-Region DR
NovaEdge Solutions runs a CI/CD pipeline in AWS that builds with CodeBuild and deploys to a fleet of Amazon EC2 instances through CodeDeploy. The team needs a mandatory human sign-off before any release reaches production, even when all unit and integration tests pass, and the workflow is managed by CodePipeline. What is the simplest and most cost-effective way to add this enforced approval gate?
-
❏ A. Run the unit and integration tests with AWS Step Functions, then add a test action after the last deploy, add a manual approval with SNS notifications, and finally add a deploy action to promote to production
-
❏ B. Use CodeBuild to execute the tests, insert a Manual approval action in CodePipeline immediately before the production CodeDeploy stage with SNS notifications to approvers, then proceed to the production deploy after approval
-
❏ C. Use CodeBuild for tests and create a custom CodePipeline action with a bespoke job worker to perform the approval, notify through SNS, and promote on success
-
❏ D. Perform the tests in a self-managed Jenkins or GitLab on EC2, add a test action, add a manual approval in the pipeline with SNS notifications, and then deploy to production
A DevOps engineer at Apex Retail Solutions must design a disaster recovery plan for a live web application. The application runs on Amazon EC2 instances in an Auto Scaling group across multiple Availability Zones behind an Application Load Balancer, and Amazon Route 53 uses an alias to the ALB. The database is Amazon RDS for PostgreSQL. Leadership requires an RTO of up to four hours and an RPO near 20 minutes while keeping ongoing costs low. Which disaster recovery approach best meets these goals?
-
❏ A. Maintain a warm standby in another Region with a smaller but fully functional environment and scale out with Auto Scaling during failover
-
❏ B. Implement a pilot light in a secondary Region with a cross-Region RDS PostgreSQL read replica, a minimal application stack ready to start, Route 53 health checks for failover, and promote the replica to primary during disaster
-
❏ C. Run multi-site active/active across two Regions with Route 53 latency-based routing and replicate all components for near-zero RPO
-
❏ D. Use a backup-and-restore pattern with AWS Backup to copy RDS snapshots to another Region, rebuild EC2 and ALB on demand, and update DNS after the restore completes
A healthcare analytics startup operates workloads across 12 AWS accounts that are organized under AWS Organizations. After a compliance assessment, leadership requires that all threat detections and security logs be funneled into a single security account where analysts will triage findings and retain raw events for at least 120 days. Which approach enables automated detection of attacks on EC2 instances across every account and centralized delivery of findings into an S3 bucket in the security account?
-
❏ A. Use Amazon Inspector across the organization, assign a delegated administrator, add an EventBridge rule in that account, and stream results to a central S3 bucket with Kinesis Data Firehose
-
❏ B. Enable Amazon GuardDuty for the entire organization with a delegated administrator, capture GuardDuty findings via an EventBridge rule in the admin account, and deliver them to an S3 bucket using Kinesis Data Firehose
-
❏ C. Deploy Amazon Macie in every account, make one account the Macie administrator, forward alerts with an EventBridge rule, and archive them to S3 through Kinesis Data Firehose
-
❏ D. Run Amazon GuardDuty separately in each account, send findings to Kinesis Data Streams, and write directly to S3 from the stream
Vertex Media Labs uses cost allocation tags to charge back AWS spend. Their analytics service runs on Amazon EC2 instances in an Auto Scaling group that launches from a template. Newly attached Amazon EBS volumes on these instances are missing the required CostCenterId tag value 8421. A DevOps engineer must implement the most efficient change so that EBS volumes receive the correct cost center tags automatically at creation time. What should the engineer do?
-
❏ A. Create an Amazon EventBridge rule for CreateVolume and invoke an AWS Lambda function to apply the CostCenterId tag to new EBS volumes
-
❏ B. Add the CostCenterId tag to the Auto Scaling group and enable PropagateAtLaunch
-
❏ C. Update the Auto Scaling launch template to include TagSpecifications for EBS volumes with the required cost center tags
-
❏ D. Use AWS Config to prevent creation of EBS volumes that are missing the CostCenterId tag
The platform engineering team at NovaCare Analytics manages more than 320 AWS accounts through AWS Organizations. Security mandates that every EC2 instance launches from a centrally approved, hardened base AMI. When a new AMI version is released, the team must ensure no new instances are started from the previous AMI, and they also need a centralized and auditable view of AMI compliance across all accounts. What approach should be implemented to meet these goals across the organization? (Choose 2)
-
❏ A. Use AWS Systems Manager Automation distributed with AWS CloudFormation StackSets to build the AMI inside every account
-
❏ B. Deploy an AWS Config custom rule with AWS CloudFormation StackSets to check instance AMI IDs against an approved list and aggregate results in an AWS Config aggregator in the management account
-
❏ C. Create the AMI in a central account and copy it to each account and Region whenever a new version is published
-
❏ D. Use AWS Systems Manager Automation to produce the AMI in a central account and share it with organizational accounts, then revoke sharing on the previous AMI and share the new one when updated
-
❏ E. Publish the approved AMI as a product in AWS Service Catalog across the organization
Delta Ledger, a regional fintech startup, manages about 36 AWS accounts across five organizational units with AWS Organizations. The governance team believes an unknown external AWS account was invited into the organization and granted broad permissions, though no harmful actions were taken. Which monitoring setup would most effectively deliver near-real-time alerts when organization membership or account-related changes occur? (Choose 2)
-
❏ A. Use AWS Config with an organization aggregator to evaluate changes across all accounts and send notifications through Amazon SNS or Amazon EventBridge when the organization’s structure or account configuration changes
-
❏ B. Deploy a third-party SIEM from AWS Marketplace, integrate with Amazon GuardDuty findings, and publish administrator alerts via Amazon SNS
-
❏ C. Create an organization-level trail in AWS CloudTrail that records all AWS Organizations API and console activity, and wire Amazon EventBridge rules to trigger an SNS alert for administrator-defined events such as invited or created accounts
-
❏ D. Use AWS Systems Manager with Amazon EventBridge to watch for organizational updates and notify the platform team of new activities
-
❏ E. Enable AWS Security Hub across the organization and pair it with Amazon Detective to surface suspicious behavior and notify the security team
NorthPeak Media runs a containerized microservice on Amazon ECS inside a private VPC, and over the past 14 days users have experienced intermittent timeouts and high latency in production. Leadership asked the DevOps engineer to enable distributed tracing to identify which request paths and downstream services are causing the slow responses. What should the engineer implement to collect end-to-end traces from the ECS tasks?
-
❏ A. Add an xray-daemon.config file to the image and map UDP 2000 in the task definition
-
❏ B. AWS CloudTrail
-
❏ C. Build an image for the AWS X-Ray daemon, push it to ECR, run it as a sidecar in the ECS task, and open UDP 2000 via task definition port mappings
-
❏ D. Install the X-Ray daemon via user data in /etc/ecs/ecs.config and expose TCP 3000 on the container agent
A DevOps engineer at a digital ticketing startup named VeloTix manages about 40 Windows Server Amazon EC2 instances and must automate patching with an approved baseline while ensuring that reboots do not occur at the same time to avoid payment delays and lost revenue. How should the engineer design the patching process to stagger reboots and maintain availability?
-
❏ A. Create one patch group tagged on all Windows instances, associate AWS-DefaultPatchBaseline, configure a single Systems Manager maintenance window, and run AWS-RunPatchBaseline during that window
-
❏ B. Configure two patch groups with unique tags mapped to all Windows instances, attach AWS-DefaultPatchBaseline to both, use EventBridge rules to trigger Systems Manager Run Command with a cron schedule, and manage custom steps with State Manager during execution
-
❏ C. Define two distinct patch groups with unique tags for the Windows fleet, associate each to AWS-DefaultPatchBaseline, create two non-overlapping Systems Manager maintenance windows targeted by those patch group tags, and schedule AWS-RunPatchBaseline in each window with different start times
-
❏ D. Use AWS Systems Manager Distributor to push Windows updates to instances on demand and rely on Auto Scaling rolling updates to stagger reboots across the fleet
Northwind Outfitters operates a web API that fronts Amazon EC2 instances behind an Application Load Balancer by using an Amazon API Gateway REST API. The engineering team wants new releases to be rolled out with minimal user impact and with the ability to revert quickly if defects are found. What approach will achieve this with the least changes to the existing application?
-
❏ A. Use AWS CodeDeploy blue/green with the Auto Scaling group behind the ALB and shift production traffic to the new revision
-
❏ B. Create a parallel environment behind the ALB with the new build and configure API Gateway canary release to send a small portion of requests to it
-
❏ C. Stand up a duplicate environment and update the Route 53 alias to the new stack
-
❏ D. Create a new target group for the ALB and have API Gateway weight requests directly to that target group
CineWave Media is wrapping up its move to AWS and has about 30 engineers, with a handful holding the AWS Certified DevOps Engineer Professional while many newer teammates have not earned Associate-level certifications yet. The company enforces specific architecture patterns and mandatory tags and wants to reduce the chance that less experienced users deploy noncompliant resources, without blocking them from provisioning what they need. What should the DevOps engineer implement?
-
❏ A. Enable AWS Config with custom rules powered by AWS Lambda to evaluate compliance and give users broad permissions while you improve the rules over time
-
❏ B. Create a starter IAM group for novices and attach a policy that requires senior engineer approval before any resource creation, using an SNS topic for the approval workflow
-
❏ C. Package approved architectures as AWS CloudFormation templates and publish them as products in AWS Service Catalog with required tags, then allow beginners to launch only Service Catalog products and deny write access to other services
-
❏ D. Define reusable AWS CloudFormation templates and permit beginners to create stacks directly in CloudFormation while restricting write access to other services
A regional credit union operates a VMware-based automated golden image pipeline in its on-premises data center. The DevOps engineer needs a way to validate those server images with the current on-prem pipeline while closely mirroring how they will run on Amazon EC2. The team is moving to Amazon Linux 2 in AWS and wants to confirm functionality, surface incompatibilities, and identify dependencies before a 90-day cutover. What approach should be implemented to meet these goals?
-
❏ A. Deploy AWS Outposts racks with Amazon Linux 2 hosts connected to the data center and execute tests there
-
❏ B. Use AWS VM Import/Export to import the on-prem VMware image as an EC2 AMI, validate on EC2, then export the imported instance as a VMware-compatible OVA to Amazon S3 and load it back into vSphere for on-prem testing
-
❏ C. Provision a local server on Ubuntu or Fedora since any Linux distribution is effectively equivalent to Amazon Linux 2 for testing
-
❏ D. Download an Amazon Linux 2 ISO and install it directly on a physical host in the data center for validation
BrightCart, a national online retailer, recently migrated from another cloud to AWS. The web tier runs in an Auto Scaling group behind an Application Load Balancer. The team will publish an Amazon CloudFront distribution with a custom domain whose origin is the ALB. What should the engineers configure to enforce HTTPS from viewers to CloudFront and from CloudFront back to the ALB?
-
❏ A. Create a self-signed certificate for the ALB, set CloudFront Viewer Protocol Policy to HTTPS Only, and use a third-party certificate imported into ACM or the IAM certificate store for the custom domain
-
❏ B. Import a trusted CA certificate into ACM for the ALB, choose Match Viewer for the CloudFront Viewer Protocol Policy, and use a third-party certificate imported into ACM or the IAM certificate store for the custom domain
-
❏ C. Request or import an ACM certificate for the ALB, associate an ACM certificate in us-east-1 with the CloudFront distribution’s custom domain, set Viewer Protocol Policy to HTTPS Only, and configure the origin to use HTTPS
-
❏ D. Upload a trusted CA certificate to an Amazon S3 bucket for CloudFront to reference, set Viewer Protocol Policy to HTTPS Only, use the default CloudFront certificate for the custom domain, and forward requests to the origin over HTTP
A platform team at a fintech startup plans to launch workloads in six Amazon VPCs across two AWS accounts. The services must have any-to-any connectivity with transitive routing among the VPCs. Leadership wants centralized administration of network traffic policies for consistent security. What architecture should the team implement to meet these needs with the least operational overhead?
-
❏ A. Configure VPC peering between every VPC to build a full mesh and centralize WebACLs with AWS WAF
-
❏ B. Use AWS Transit Gateway for transitive connectivity among VPCs and manage network access policies centrally with AWS Firewall Manager
-
❏ C. Set up AWS PrivateLink endpoints between each VPC and use AWS Security Hub for centralized security policies
-
❏ D. Establish AWS Site-to-Site VPN tunnels between each pair of VPCs and manage policies with AWS Firewall Manager
A fitness streaming startup stores all workout videos in an Amazon S3 bucket. Over the past 9 months, traffic has surged by more than 200x, and the team plans to roll out premium subscription tiers. They must quickly and inexpensively determine which individual video objects are most frequently viewed and downloaded to guide content strategy and pricing. What is the most cost-effective approach that can be implemented immediately?
-
❏ A. Turn on Amazon S3 Storage Lens and visualize activity metrics in the dashboard or Amazon QuickSight
-
❏ B. Enable S3 server access logging and use Amazon Athena with an external table to query the log files and identify top GETs and downloads
-
❏ C. Enable S3 server access logging and use Amazon Redshift Spectrum after provisioning a Redshift cluster to query the logs
-
❏ D. Stream new S3 access logs with event notifications to AWS Lambda, deliver to Kinesis Data Firehose, and index into Amazon OpenSearch Service for analysis
At a media-streaming startup called Polar Pixel, several product squads share one AWS account, and most photos and clips reside in Amazon S3. Some buckets must stay publicly readable on the internet while others are limited to internal services. The company wants to use AWS Trusted Advisor to flag any public buckets and to verify that only approved principals have List permissions. They also want immediate alerts if a public bucket drifts to unsafe settings and automatic fixes when appropriate. What should the DevOps engineer implement to meet these goals? (Choose 3)
-
❏ A. Configure a custom AWS Config rule that evaluates S3 bucket policies for public access and publishes noncompliance notifications to an Amazon SNS topic
-
❏ B. Create a custom Amazon Inspector rule to scan S3 bucket permissions and invoke AWS Systems Manager to fix the bucket policy
-
❏ C. Use Amazon EventBridge to capture Trusted Advisor S3 bucket permission check state changes and trigger an SNS email notification
-
❏ D. Enable S3 Block Public Access at the account level to prevent any bucket from being public and rely on manual exceptions for the few that must be public
-
❏ E. Create a custom AWS Config rule that emits an EventBridge event on violation and have an EventBridge rule invoke a Lambda function to automatically correct the S3 bucket policy
-
❏ F. Schedule an AWS Lambda function to call the Trusted Advisor API every 30 minutes and subscribe to Trusted Advisor summary emails to receive results
A media startup, NovaStream, runs its main web service on Amazon EC2 instances behind a single Application Load Balancer with instances managed by Auto Scaling. The team has created separate launch templates and Auto Scaling groups for blue and green, each registered with distinct target groups, and a Route 53 alias record with a 45 second TTL points to the ALB. They want to perform an immediate cutover of all traffic from the blue version to the newly deployed green version using this single ALB. What should the engineer do to accomplish this?
-
❏ A. Perform an all-at-once deployment to the blue Auto Scaling group, then update the Route 53 alias to an ALB endpoint for the green target group
-
❏ B. Use an AWS CLI command to switch the ALB listener to the green target group first, then run a rolling restart on the green Auto Scaling group to deploy the new build
-
❏ C. Run a rolling update on the green Auto Scaling group to roll out the new build, then use the AWS CLI to move the ALB listener to the green target group
-
❏ D. Run a rolling restart on the green Auto Scaling group to deploy the new build, then change the Route 53 alias to point to the green environment on the ALB
A fintech startup runs a critical service on Amazon EC2 instances in an Auto Scaling group. A lightweight health probe on each instance runs every 5 seconds to verify that the application responds. The DevOps engineer must use the probe results for monitoring and to raise an alarm when failures occur. Metrics must be captured at 1 minute intervals while keeping costs low. What should the engineer implement?
-
❏ A. Use a default CloudWatch metric at standard resolution and add a dimension so the script can publish once every 60 seconds
-
❏ B. Amazon CloudWatch Synthetics
-
❏ C. Create a custom CloudWatch metric and publish statistic sets that roll up the 5 second results, sending one update every 60 seconds
-
❏ D. Use a custom CloudWatch metric at high resolution and push data every 5 seconds
You are a DevOps Engineer at a fintech startup where your application is deployed via AWS CloudFormation into an Auto Scaling group. You have created a new launch configuration that moves to a newer instance family, and the group currently runs 8 instances while at least 5 must remain in service during the update. In the template, which configuration will update the group’s instances in batches so they adopt the new launch configuration while maintaining the required in-service capacity?
-
❏ A. AutoScalingReplacingUpdate
-
❏ B. AWS CodeDeploy
-
❏ C. AutoScalingRollingUpdate
-
❏ D. AutoScalingLaunchTemplateUpdate
Northwind Diagnostics runs its customer portals on Amazon EC2 and wants a managed approach that continuously identifies software vulnerabilities and unexpected network exposure on those instances. The security team also needs a centralized audit trail of all user logins to the servers kept for 90 days. Which solution best satisfies these needs?
-
❏ A. Configure Amazon GuardDuty with Amazon EventBridge and AWS Lambda for automated remediation
-
❏ B. Use AWS Systems Manager SSM Agent to detect vulnerabilities on EC2 and run an Automation runbook to patch them
-
❏ C. Deploy Amazon Inspector for EC2 vulnerability and exposure scans, install the CloudWatch Agent to forward login logs to CloudWatch Logs, and send CloudTrail events to CloudWatch Logs for centralized auditing
-
❏ D. Enable Amazon ECR image scanning with EventBridge notifications and route CloudTrail data to EventBridge for processing
Meridian Assurance, a global insurance firm, is rolling out centralized compliance controls across its AWS organization. Every API invocation in all member accounts must be captured for audits, and the company relies on AWS CloudTrail to record activity and flag sensitive operations. Leadership has asked the platform team to implement an automated guardrail so that if CloudTrail logging is turned off in any account, it is quickly turned back on with minimal interruption to log delivery. Which approach best meets this requirement?
-
❏ A. Create a CloudWatch Logs metric filter for StopLogging events and alarm to an SNS topic for notifications only
-
❏ B. Deploy the cloudtrail-enabled AWS Config managed rule with a 30-minute periodic evaluation and use an EventBridge rule for AWS Config compliance changes to invoke a Lambda function that calls StartLogging on the affected trail
-
❏ C. Use EventBridge with a Lambda scheduled every two minutes to query CloudTrail and if DeleteTrail is detected, recreate it with CreateTrail
-
❏ D. Enable the cloudtrail-enabled AWS Config rule with a Configuration changes trigger and rely on its default automatic remediation
Orion Byte Labs runs several services with a MERN front end behind NGINX and uses AWS CodeDeploy to automate rollouts. The team has a QA deployment group and will add PREPROD and PRODUCTION groups later. They want the NGINX log level to be set dynamically at deploy time so each group can have different verbosity without creating separate application revisions or maintaining different scripts per environment. Which approach provides the lowest ongoing management effort and avoids multiple script variants?
-
❏ A. Invoke a script during ApplicationStart that uses the DEPLOYMENT_GROUP_ID environment variable to detect the group and update the NGINX log level
-
❏ B. Use a single script that reads the DEPLOYMENT_GROUP_NAME environment variable in CodeDeploy and call it in the BeforeInstall hook to set NGINX logging per group
-
❏ C. Tag each EC2 instance with its deployment group and have a ValidateService hook script call aws ec2 describe-tags to choose the log level
-
❏ D. Define a custom environment variable in CodeDeploy for each environment such as QA, PREPROD, and PROD, and have a ValidateService hook script read it to set the log level
A DevOps specialist at Norstar Media operates a web application behind an Application Load Balancer across three Availability Zones in one AWS Region. Several EC2 instances in a single zone have begun failing health checks and returning errors. The specialist must isolate that Availability Zone and shift client requests to the healthy zones with minimal changes to the stack. What should be implemented?
-
❏ A. Configure Auto Scaling health checks to replace unhealthy instances in the degraded zone
-
❏ B. Turn off cross-zone load balancing on the ALB and use Amazon Route 53 Application Recovery Controller to start a zonal shift away from the impaired Availability Zone
-
❏ C. Enable cross-zone load balancing and set Amazon Route 53 failover routing so requests are spread evenly across all zones
-
❏ D. AWS Global Accelerator
DevOps Sample Questions Answered
All DevOps questions come from certificationexams.pro and my Certified DevOps Engineer Udemy course.
A Vancouver-based fintech has opened a satellite office in Paris, France. Employees in Paris are experiencing slow responses from the company’s CRM. The application is hosted on Amazon EC2 instances behind an Application Load Balancer and stores data in an Amazon DynamoDB table. The workload currently runs in the us-west-2 Region. A DevOps engineer must reduce end-user latency and improve resiliency for users in both North America and Europe. What actions should be taken? (Choose 3)
-
✓ B. Configure Amazon Route 53 latency-based routing with health checks to direct users to the closest ALB
-
✓ D. Deploy a new Application Load Balancer and an Auto Scaling group in eu-west-3 and configure the new ALB to route traffic to the new Auto Scaling group
-
✓ F. Use DynamoDB Global Tables to add a replica of the CRM table in eu-west-3
The correct actions are Configure Amazon Route 53 latency-based routing with health checks to direct users to the closest ALB, Deploy a new Application Load Balancer and an Auto Scaling group in eu-west-3 and configure the new ALB to route traffic to the new Auto Scaling group, and Use DynamoDB Global Tables to add a replica of the CRM table in eu-west-3, because these steps place compute closer to European users, route traffic to the nearest healthy endpoint, and keep data synchronized across Regions.
The Configure Amazon Route 53 latency-based routing with health checks to direct users to the closest ALB choice steers clients to the Region with the lowest latency and avoids unhealthy endpoints. The Deploy a new Application Load Balancer and an Auto Scaling group in eu-west-3 and configure the new ALB to route traffic to the new Auto Scaling group option provides local load balancing and autoscaled EC2 capacity so responses for Paris users are faster and more resilient. The Use DynamoDB Global Tables to add a replica of the CRM table in eu-west-3 option enables active-active replication so reads and writes can be served locally and changes are propagated automatically between us-west-2 and eu-west-3.
The Put Amazon CloudFront in front of the current ALB to accelerate requests globally option is not sufficient because CloudFront accelerates static or cacheable content, but it does not eliminate application write latency to DynamoDB or provide regional compute capacity. The Create a new DynamoDB table in eu-west-3 and enable cross-Region replication for it option is incorrect because DynamoDB does not offer a generic cross-Region replication mechanism outside of Global Tables, so a standalone table would require custom synchronization. The Add an Auto Scaling group in eu-west-3 and register it with the existing ALB in us-west-2 option will not work because Application Load Balancers and their target groups are scoped to a single Region and you cannot register targets from another Region.
Focus on multi-Region compute and active-active data replication. Remember that ALBs are Regional and that DynamoDB Global Tables are the supported way to replicate data across Regions.
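As a rough illustration, the boto3 sketch below shows the two data and routing changes; the table name, domain, hosted zone ID, and ALB values are hypothetical placeholders, not values from the scenario.

```python
import boto3

# Add a eu-west-3 replica to the existing table (Global Tables version 2019.11.21).
# The table must meet Global Tables requirements, such as having streams enabled.
dynamodb = boto3.client("dynamodb", region_name="us-west-2")
dynamodb.update_table(
    TableName="crm-sessions",  # hypothetical table name
    ReplicaUpdates=[{"Create": {"RegionName": "eu-west-3"}}],
)

# Create latency-based alias records so each user reaches the closest healthy ALB.
route53 = boto3.client("route53")
albs = {  # placeholder ALB DNS names and ALB hosted zone IDs per Region
    "us-west-2": {"dns": "crm-usw2-123.us-west-2.elb.amazonaws.com", "zone": "Z1H1FL5HABSF5"},
    "eu-west-3": {"dns": "crm-euw3-456.eu-west-3.elb.amazonaws.com", "zone": "Z3Q77PNBQS71R4"},
}
changes = [
    {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "crm.example.com",
            "Type": "A",
            "SetIdentifier": region,           # one record per Region
            "Region": region,                  # latency-based routing
            "AliasTarget": {
                "HostedZoneId": alb["zone"],
                "DNSName": alb["dns"],
                "EvaluateTargetHealth": True,  # acts as the health check for failover
            },
        },
    }
    for region, alb in albs.items()
]
route53.change_resource_record_sets(
    HostedZoneId="Z0123456EXAMPLE",  # hypothetical public hosted zone
    ChangeBatch={"Changes": changes},
)
```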
A regional logistics startup is moving its Docker workloads from its data center to AWS. They will run several services on Amazon ECS using the EC2 launch type behind an Application Load Balancer. The operations team needs the platform to automatically aggregate all container and load balancer logs and deliver them to an Amazon S3 bucket for near-real-time analysis, targeting about a two-minute end-to-end delay. How should the team configure the ECS environment to achieve this? (Choose 3)
-
✓ A. Enable access logging on the Application Load Balancer and configure the destination to the specified S3 bucket
-
✓ C. Use the awslogs log driver in ECS task definitions, install the CloudWatch Logs agent on the container instances, and grant the needed permissions on the instance role
-
✓ D. Create a CloudWatch Logs subscription filter to an Amazon Kinesis Data Firehose delivery stream that writes continuously to the S3 bucket
The correct options are Enable access logging on the Application Load Balancer and configure the destination to the specified S3 bucket, Use the awslogs log driver in ECS task definitions, install the CloudWatch Logs agent on the container instances, and grant the needed permissions on the instance role, and Create a CloudWatch Logs subscription filter to an Amazon Kinesis Data Firehose delivery stream that writes continuously to the S3 bucket.
The Enable access logging on the Application Load Balancer and configure the destination to the specified S3 bucket option is correct because Application Load Balancers can deliver detailed request logs directly to S3 which provides the raw ALB request data needed for analysis and meets the requirement to aggregate load balancer logs into the same S3 target.
The Use the awslogs log driver in ECS task definitions, install the CloudWatch Logs agent on the container instances, and grant the needed permissions on the instance role option is correct because the awslogs driver sends container STDOUT and STDERR to CloudWatch Logs and installing the CloudWatch Logs agent on EC2-backed container instances centralizes node level logs while proper IAM on the instance role lets tasks and agents push logs without filling local disk.
The Create a CloudWatch Logs subscription filter to an Amazon Kinesis Data Firehose delivery stream that writes continuously to the S3 bucket option is correct because subscription filters stream log events from CloudWatch Logs in near real time and Firehose can buffer and deliver continuously to S3, which meets a roughly two-minute end-to-end delay target.
-
Set up Amazon Macie to scan the S3 bucket and provide near-real-time analysis of the access logs is incorrect because Amazon Macie focuses on sensitive data discovery and classification in S3 and is not a streaming ingestion or real-time log analysis service.
Turn on Detailed Monitoring in CloudWatch for the load balancer to store access logs in S3 is incorrect because Detailed Monitoring increases metric resolution and does not generate or deliver ALB access log files to S3.
Use an AWS Lambda function triggered by an Amazon EventBridge schedule every 2 minutes to call CreateLogGroup and CreateExportTask to move logs to S3 is incorrect because CloudWatch Logs export tasks are batch-oriented, CreateExportTask is not suited to near-real-time delivery, and the approach adds complexity compared with a subscription filter plus Firehose pipeline.
Stream container logs to CloudWatch with the awslogs driver and then use a subscription filter into Kinesis Data Firehose to write continuously to S3 while enabling ALB access logging to the same bucket.
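As a rough sketch of the two delivery paths, the boto3 snippet below (using hypothetical ARNs, bucket, and log group names) turns on ALB access logging to S3 and wires the ECS log group, populated by the awslogs driver, to an existing Firehose delivery stream through a subscription filter.

```python
import boto3

s3_bucket = "central-ops-logs"  # hypothetical destination bucket

# 1. Deliver ALB access logs straight to S3.
#    The bucket policy must allow log delivery from Elastic Load Balancing.
elbv2 = boto3.client("elbv2")
elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/app/web/abc123",
    Attributes=[
        {"Key": "access_logs.s3.enabled", "Value": "true"},
        {"Key": "access_logs.s3.bucket", "Value": s3_bucket},
        {"Key": "access_logs.s3.prefix", "Value": "alb"},
    ],
)

# 2. Stream the container log group to Firehose, which buffers and writes
#    continuously to the same bucket.
logs = boto3.client("logs")
logs.put_subscription_filter(
    logGroupName="/ecs/orders-service",  # hypothetical log group from the awslogs driver
    filterName="ecs-to-firehose",
    filterPattern="",                    # empty pattern forwards every event
    destinationArn="arn:aws:firehose:us-east-1:111122223333:deliverystream/ecs-logs-to-s3",
    roleArn="arn:aws:iam::111122223333:role/CWLogsToFirehoseRole",  # must allow firehose:PutRecord*
)
```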
An engineering sandbox account at Orion Robotics allows unrestricted experimentation with AWS services. A recent review in AWS Trusted Advisor flagged multiple Amazon EC2 instances using default security groups that expose SSH on port 22 to the internet. The security team requires that SSH be limited to the corporate data center’s public IP, even in this nonproduction account. The team also wants near real-time alerts about Trusted Advisor security findings and automated remediation when port 22 is found to be open to the world. Which actions should be implemented? (Choose 3)
-
✓ B. Create an AWS Config rule that flags any security group with port 22 open to 0.0.0.0/0 and sends a notification to an SNS topic when noncompliant
-
✓ D. Schedule an Amazon EventBridge rule to invoke a Lambda function that calls the AWS Support API to refresh Trusted Advisor checks and publishes findings to an SNS topic
-
✓ E. Implement a custom AWS Config rule with an AWS Systems Manager Automation remediation runbook to restrict port 22 in security groups to the headquarters IP
Create an AWS Config rule that flags any security group with port 22 open to 0.0.0.0/0 and sends a notification to an SNS topic when noncompliant, Schedule an Amazon EventBridge rule to invoke a Lambda function that calls the AWS Support API to refresh Trusted Advisor checks and publishes findings to an SNS topic, and Implement a custom AWS Config rule with an AWS Systems Manager Automation remediation runbook to restrict port 22 in security groups to the headquarters IP are correct.
The AWS Config rule gives continuous, resource-level evaluation and it can publish noncompliant alerts to an SNS topic for immediate notification. The Systems Manager Automation remediation runbook provides a supported and repeatable remediation path that Config can invoke to change security group rules and restrict SSH to the corporate IP. The EventBridge rule that triggers a Lambda function to call the AWS Support API for Trusted Advisor refreshes lets you pull Trusted Advisor results on a schedule and publish those findings to SNS so you get near-real-time awareness of issues Trusted Advisor detects.
-
Set up a custom AWS Config remediation that calls a Lambda function to update security groups for port 22 to the office IP is not ideal because the exam and best practice emphasize using Systems Manager Automation runbooks as the native Config remediation mechanism rather than relying on a separate Lambda-based fix.
Enable AWS Security Hub and use AWS Foundational Security Best Practices to auto-remediate open port 22 findings from Trusted Advisor is incorrect because Security Hub does not ingest Trusted Advisor findings nor does it provide automatic remediation for Trusted Advisor results.
Configure a Lambda function to run every 15 minutes to refresh Trusted Advisor checks and rely on Trusted Advisor email alerts for changes is wrong because relying on email notifications is not near real time and periodic polling without publishing your own SNS alerts does not meet the requirement for timely notifications and automated remediation.
Use Systems Manager Automation runbooks for AWS Config remediations and use EventBridge to trigger Lambda functions that publish findings to SNS for near-real-time alerts.
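A scheduled Lambda handler along these lines could perform the refresh-and-notify step. This is a sketch only; it assumes a Business or Enterprise Support plan (required for the AWS Support API), a hypothetical SNS topic ARN, and a placeholder Trusted Advisor check ID supplied through environment variables.

```python
import json
import os

import boto3

# The AWS Support API (Trusted Advisor) is only available in us-east-1.
support = boto3.client("support", region_name="us-east-1")
sns = boto3.client("sns")

CHECK_ID = os.environ["TA_CHECK_ID"]       # e.g. the "Specific Ports Unrestricted" check; placeholder
TOPIC_ARN = os.environ["ALERT_TOPIC_ARN"]  # hypothetical SNS topic


def handler(event, context):
    # Ask Trusted Advisor to re-run the check, then read the latest result.
    # The refresh is asynchronous, so a production version would poll the
    # refresh status before reading the result.
    support.refresh_trusted_advisor_check(checkId=CHECK_ID)
    result = support.describe_trusted_advisor_check_result(checkId=CHECK_ID, language="en")

    flagged = result["result"].get("flaggedResources", [])
    if flagged:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject="Trusted Advisor: security groups with unrestricted ports",
            Message=json.dumps(flagged, default=str),
        )
    return {"flagged": len(flagged)}
```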
A platform engineer at Aurora Outfitters operates an Amazon EKS cluster with managed node groups. The cluster has an OIDC provider configured for IAM roles for service accounts, and applications request gp3-backed persistent volumes via a StorageClass named gp3-standard. When creating a 40 GiB PersistentVolumeClaim, kubectl describe pvc shows failed to provision volume with StorageClass and could not create volume in EC2: UnauthorizedOperation. What is the most appropriate way to eliminate these errors?
-
✓ C. Configure an IAM role for the EBS CSI driver using IRSA and attach it to the add-on so it can call the required EC2 APIs
The correct option is Configure an IAM role for the EBS CSI driver using IRSA and attach it to the add-on so it can call the required EC2 APIs. This gives the EBS CSI controller the AWS permissions it needs to create and manage gp3 EBS volumes so dynamic provisioning succeeds.
The error shows that the component attempting to create the EBS volume lacks permission to call Amazon EC2 APIs. The EBS CSI driver performs dynamic provisioning and runs as Kubernetes pods. By creating an IAM role that trusts the cluster OIDC provider, granting it the required EC2 actions, and attaching that role to the CSI driver add-on or service account, you let the controller assume the role and call CreateVolume, CreateTags, DescribeVolumes, and related APIs, which eliminates the UnauthorizedOperation errors.
Create a Kubernetes ClusterRole and RoleBinding that lets persistent volumes and storage objects be listed, watched, created, and deleted is incorrect because Kubernetes RBAC controls access to the Kubernetes API and does not grant permissions to call AWS EC2 APIs required to provision EBS volumes.
Change the PVC to reference a pre-existing EBS volume and disable dynamic provisioning is incorrect because it is only a workaround that avoids dynamic provisioning. It does not solve the underlying AWS permission problem and it reduces automation and flexibility.
Add the necessary EC2 permissions to the node group instance profile so pods inherit them is incorrect because widening the node instance profile grants every pod on the node those EC2 permissions rather than providing fine-grained access. AWS recommends using IRSA so the CSI controller assumes a dedicated IAM role and follows least privilege.
When you see EKS dynamic EBS provisioning fail with UnauthorizedOperation and an OIDC provider exists, check whether the EBS CSI driver has an IRSA IAM role with EC2 permissions before changing RBAC or node profiles.
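This is usually done with eksctl, but the shape of the fix looks roughly like the boto3 sketch below, assuming placeholder account, cluster, and OIDC issuer values: create a role trusted by the cluster OIDC provider for the ebs-csi-controller-sa service account, attach the AWS managed AmazonEBSCSIDriverPolicy, and associate the role with the EBS CSI add-on.

```python
import json

import boto3

ACCOUNT_ID = "111122223333"  # placeholder account ID
OIDC_ISSUER = "oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"  # placeholder
CLUSTER = "aurora-prod"      # placeholder cluster name

iam = boto3.client("iam")
eks = boto3.client("eks")

# Trust policy that only lets the EBS CSI controller service account assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Federated": f"arn:aws:iam::{ACCOUNT_ID}:oidc-provider/{OIDC_ISSUER}"},
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {"StringEquals": {
            f"{OIDC_ISSUER}:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa",
            f"{OIDC_ISSUER}:aud": "sts.amazonaws.com",
        }},
    }],
}

role = iam.create_role(
    RoleName="AmazonEKS_EBS_CSI_DriverRole",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
iam.attach_role_policy(
    RoleName="AmazonEKS_EBS_CSI_DriverRole",
    PolicyArn="arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy",
)

# Install the add-on with the IRSA role attached (use update_addon if it already exists).
eks.create_addon(
    clusterName=CLUSTER,
    addonName="aws-ebs-csi-driver",
    serviceAccountRoleArn=role["Role"]["Arn"],
)
```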
A data analytics consultancy named Ridgewave Partners collaborates with about 18 client companies. The consultancy must roll out the same application to client-owned AWS accounts across three Regions by using AWS CloudFormation. Each client has authorized the consultancy to create IAM roles in their account solely for the deployment. The consultancy wants to minimize ongoing operations and avoid per-account stack babysitting. What is the most appropriate approach to deploy to all accounts?
-
✓ B. Establish a cross-account IAM role in every client account that trusts the consultancy account for StackSets operations, then deploy using CloudFormation StackSets with self-managed permissions
Establish a cross-account IAM role in every client account that trusts the consultancy account for StackSets operations, then deploy using CloudFormation StackSets with self-managed permissions is correct because clients have permitted creation of IAM roles and this lets the consultancy push the same CloudFormation stacks across each client-owned account and across the three Regions with minimal ongoing per-account work.
With this approach you create the administration role in the consultancy account and an execution role in each client account, then use self-managed StackSets from the consultancy account to deploy and update the stacks at scale. This design avoids creating users in client accounts and it removes the need to babysit individual stacks after they are deployed because updates can be pushed through the StackSets model.
Create an AWS Organization with all features, invite the client accounts, and use CloudFormation StackSets with service-managed permissions is not appropriate because it requires the target accounts to be in the same AWS Organizations trust boundary and enabled for service-managed StackSets, which is unlikely for independent client companies.
Upload the template to a shared Amazon S3 bucket, create admin users in each client account, and sign in to build stacks from the shared template is a poor choice because it increases operational toil and violates the constraint that clients only authorized role creation rather than the creation of admin users in their accounts.
Adopt AWS Control Tower and use a delegated administrator to roll out the application with service-managed StackSets does not fit the scenario because it assumes the client accounts are enrolled in your Control Tower landing zone and managed under the same governance model, which is not the case for separate client-owned accounts.
Choose self-managed StackSets with cross-account roles when deploying to independent client accounts that only allow role creation and when you need multi-Region automated updates.
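In outline, the self-managed rollout can be driven from the consultancy account roughly as follows; the template URL, role names, account IDs, and Regions are placeholders, and the AWSCloudFormationStackSetExecutionRole must already exist in each client account with a trust on the administrator account.

```python
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

client_accounts = ["111111111111", "222222222222"]      # placeholder client account IDs
regions = ["us-east-1", "eu-west-1", "ap-southeast-2"]  # the three target Regions

# Define the stack set once in the administrator (consultancy) account.
cfn.create_stack_set(
    StackSetName="ridgewave-app",
    TemplateURL="https://s3.amazonaws.com/ridgewave-artifacts/app.yaml",  # hypothetical template
    PermissionModel="SELF_MANAGED",
    AdministrationRoleARN="arn:aws:iam::999999999999:role/AWSCloudFormationStackSetAdministrationRole",
    ExecutionRoleName="AWSCloudFormationStackSetExecutionRole",  # role created in each client account
    Capabilities=["CAPABILITY_NAMED_IAM"],
)

# Roll the stacks out to every client account and Region in controlled batches.
cfn.create_stack_instances(
    StackSetName="ridgewave-app",
    Accounts=client_accounts,
    Regions=regions,
    OperationPreferences={"MaxConcurrentCount": 3, "FailureToleranceCount": 1},
)
```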
A ticketing startup operates a Node.js web service on Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer. During abrupt traffic surges, some scale-out events fail and intermittent errors appear. CloudWatch logs show messages like: ‘Instance did not finish the user’s lifecycle action; lifecycle action with token was abandoned due to heartbeat timeout.’ What should a DevOps engineer do to capture logs from all impacted instances and preserve them for later analysis? (Choose 3)
-
✓ B. Configure an Auto Scaling lifecycle hook on instance termination and use an Amazon EventBridge rule to trigger an AWS Lambda function that runs AWS Systems Manager Run Command to pull application logs and upload them to Amazon S3
-
✓ E. Adjust the AWS CodeDeploy deployment group to resolve account deployment concurrency limits that have been reached
-
✓ F. Use Amazon Athena to query the logs directly from the Amazon S3 bucket
The correct choices are Configure an Auto Scaling lifecycle hook on instance termination and use an Amazon EventBridge rule to trigger an AWS Lambda function that runs AWS Systems Manager Run Command to pull application logs and upload them to Amazon S3, Adjust the AWS CodeDeploy deployment group to resolve account deployment concurrency limits that have been reached, and Use Amazon Athena to query the logs directly from the Amazon S3 bucket.
Configure an Auto Scaling lifecycle hook on instance termination and use an Amazon EventBridge rule to trigger an AWS Lambda function that runs AWS Systems Manager Run Command to pull application logs and upload them to Amazon S3 gives you a deterministic window to collect instance level artifacts before EC2 termination. A lifecycle hook lets you pause termination so the EventBridge triggered Lambda can invoke Systems Manager Run Command to gather application logs and push them to S3 for durable storage.
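A minimal Lambda sketch of that collection step is shown below. The bucket name, log path, and command are assumptions; the event detail fields follow the documented EC2 Instance-terminate Lifecycle Action event shape.

```python
import boto3

ssm = boto3.client("ssm")
autoscaling = boto3.client("autoscaling")

def handler(event, context):
    # EventBridge detail for "EC2 Instance-terminate Lifecycle Action"
    detail = event["detail"]
    instance_id = detail["EC2InstanceId"]

    # Copy application logs to S3 before the instance terminates
    # (bucket and log path are placeholders).
    ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [
            f"aws s3 cp /var/log/app/ s3://example-log-archive/{instance_id}/ --recursive"
        ]},
    )

    # In production you would poll ssm.get_command_invocation until the copy finishes.
    # Then allow the Auto Scaling group to continue terminating the instance.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=detail["LifecycleHookName"],
        AutoScalingGroupName=detail["AutoScalingGroupName"],
        LifecycleActionToken=detail["LifecycleActionToken"],
        LifecycleActionResult="CONTINUE",
    )
```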
Adjust the AWS CodeDeploy deployment group to resolve account deployment concurrency limits that have been reached addresses the root cause that can cause lifecycle actions to stall. When CodeDeploy concurrency limits are hit, deployments may be throttled and lifecycle hooks can time out, so lifting or tuning those limits lets your log collection workflow complete reliably during traffic surges.
Use Amazon Athena to query the logs directly from the Amazon S3 bucket enables fast serverless analysis of the exported logs without moving data. Once logs are persisted to S3 you can run SQL queries in Athena to filter and aggregate events for troubleshooting.
Enable VPC Flow Logs for the subnets hosting the Auto Scaling group and export them to Amazon S3 is not suitable because VPC Flow Logs capture network metadata and they do not contain application level logs needed to debug the service errors.
Update the Auto Scaling group health check to point to the correct application port and protocol does not capture or store instance logs and there is no indication that a health check misconfiguration explains the abandoned lifecycle actions.
Turn on access logging at the target group and send logs to an Amazon S3 bucket records load balancer access entries and it will not provide the instance level application logs required to diagnose the failures.
Use lifecycle hooks to export logs from ephemeral instances before termination and store them in S3 so you can analyze them with Athena.
BlueOrbit, a fintech startup, uses Amazon Cognito user pools to authenticate its web portal users. MFA is enforced, and the team needs to protect access to user profiles. They also require an automatic email notification to the user whenever a sign-in succeeds. Which approach is the most operationally efficient way to achieve this?
-
✓ C. Attach an AWS Lambda function that uses Amazon SES to the Amazon Cognito post authentication trigger to send the login email
The correct choice is Attach an AWS Lambda function that uses Amazon SES to the Amazon Cognito post authentication trigger to send the login email. This option uses Cognito native hooks to run code immediately after a successful sign in and it sends the notification without extra moving parts.
This approach is efficient because the Lambda function attached to the Amazon Cognito post authentication trigger runs only after the user has completed authentication, including MFA. It keeps email delivery server side so the client does not need to call additional endpoints, and it avoids building and operating extra APIs or log pipelines. Using the built in post authentication trigger lets you enforce profile access controls in the same flow that validates the user.
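A minimal handler for the post authentication trigger could look like the sketch below; the sender address is a placeholder and would need to be a verified SES identity.

```python
import boto3

ses = boto3.client("ses")

def handler(event, context):
    # Cognito post authentication trigger: runs after a successful sign-in (MFA included).
    email = event["request"]["userAttributes"].get("email")
    if email:
        ses.send_email(
            Source="security@example.com",  # verified SES identity (placeholder)
            Destination={"ToAddresses": [email]},
            Message={
                "Subject": {"Data": "New sign-in to your account"},
                "Body": {"Text": {"Data": "We noticed a successful sign-in to your account."}},
            },
        )
    # Cognito triggers must return the event object.
    return event
```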
Create an AWS Lambda function that uses Amazon SES to send an email and front it with an Amazon API Gateway endpoint that the client app calls after confirming login is unnecessary because it forces the client to orchestrate email delivery and it increases attack surface and operational overhead by adding an API endpoint that Cognito can already trigger directly.
Configure Amazon Cognito to deliver authentication logs to Amazon Kinesis Data Firehose, process them with AWS Lambda, and send emails based on login outcomes is heavier to operate and slower because it depends on log delivery and stream processing. It also increases cost and latency for a simple immediate notification.
Use Amazon EventBridge to match AWS CloudTrail events for Cognito sign-ins and invoke an AWS Lambda function that emails the user via Amazon SES is unreliable for end user sign ins because CloudTrail does not emit the per-user user pool authentication events that you would need for immediate notifications.
Prefer native service triggers such as post authentication when you need immediate actions after sign in. They reduce latency and operational complexity compared with client driven calls or log based pipelines.
At Aurora Dynamics, the platform team uses AWS CodeDeploy to publish a new version of an AWS Lambda function once a CodeBuild stage in an AWS CodePipeline completes successfully. Before traffic is shifted, the pipeline may trigger a Step Functions state machine named AssetReorgFlow-v3 that runs an Amazon ECS on Fargate task to reorganize objects in an Amazon S3 bucket for forward compatibility. The Lambda update must not receive production traffic until that reorganization has fully finished. What should you implement to keep traffic from reaching the new Lambda version until the workflow is done?
-
✓ C. Add a BeforeAllowTraffic lifecycle hook in the Lambda AppSpec that waits for the Step Functions execution to complete
The correct option is Add a BeforeAllowTraffic lifecycle hook in the Lambda AppSpec that waits for the Step Functions execution to complete. This ensures CodeDeploy does not move the Lambda alias and send production traffic until the AssetReorgFlow-v3 state machine and the ECS Fargate reorganization finish.
Implement the BeforeAllowTraffic lifecycle hook so it polls or verifies the Step Functions execution or checks a completion signal from the Fargate task. The hook runs during the Lambda deployment before the alias shift and it can block the traffic transition until the S3 reorganization is fully complete.
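The hook function could look roughly like the following sketch. The environment variable carrying the AssetReorgFlow-v3 execution ARN is an assumption, and a production hook would typically retry while the execution is still RUNNING rather than failing immediately.

```python
import os
import boto3

sfn = boto3.client("stepfunctions")
codedeploy = boto3.client("codedeploy")

def handler(event, context):
    # CodeDeploy invokes the hook with these two identifiers.
    deployment_id = event["DeploymentId"]
    hook_execution_id = event["LifecycleEventHookExecutionId"]

    # Execution ARN of the reorganization workflow (hypothetical env var for this sketch).
    execution = sfn.describe_execution(executionArn=os.environ["REORG_EXECUTION_ARN"])
    status = "Succeeded" if execution["status"] == "SUCCEEDED" else "Failed"

    # Report back to CodeDeploy; the alias shift proceeds only on "Succeeded".
    codedeploy.put_lifecycle_event_hook_execution_status(
        deploymentId=deployment_id,
        lifecycleEventHookExecutionId=hook_execution_id,
        status=status,
    )
```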
Use an AfterAllowTraffic hook in the AppSpec to verify the Step Functions execution after traffic is shifted is incorrect because that hook runs after the alias has already moved and the new Lambda version would be receiving production traffic before the workflow completes.
Configure a CodeDeploy canary to send 20% of traffic for 15 minutes during the S3 reorganization is incorrect because a canary still routes a portion of production traffic to the new version while the reorganization is in progress and it does not prevent traffic until completion.
Insert a Step Functions state that calls a CodeDeploy API to tell it to switch the Lambda alias when ready is incorrect because deployment progression is controlled by CodeDeploy lifecycle hooks and there is no supported way for a running Step Functions execution to directly advance an in-progress Lambda deployment.
Remember that the BeforeAllowTraffic hook runs before the alias shift and use it to block or verify external workflows before allowing production traffic.
NovaRetail uses AWS Control Tower to run a landing zone with 36 AWS accounts across two Regions. Each product team owns its own application account, and a shared DevOps account provides CI/CD pipelines. A CodeBuild project in the DevOps account must deploy to an Amazon EKS cluster named prodclaims that runs in a team account. The team already has a deployment IAM role in its account, and the cluster uses the default aws-auth ConfigMap. CodeBuild runs with a service role in the DevOps account, but kubectl commands fail with Unauthorized when connecting cross account. What should be changed so the pipeline can authenticate and deploy successfully?
-
✓ C. In the application account, update the deployment role trust policy to trust the DevOps account with sts:AssumeRole, attach EKS permissions to that role, and add the role to the cluster aws-auth ConfigMap
In the application account, update the deployment role trust policy to trust the DevOps account with sts:AssumeRole, attach EKS permissions to that role, and add the role to the cluster aws-auth ConfigMap is correct because the build in the centralized DevOps account must assume a role that exists in the team account and that assumed role must be authorized by the EKS cluster for kubectl to succeed.
The correct pattern places the trust policy on the role that will be assumed in the application account and allows the DevOps account to call sts:AssumeRole. You must attach the required EKS IAM permissions to that role so it can interact with the cluster and then map the role into the cluster aws-auth ConfigMap so Kubernetes RBAC recognizes the assumed identity.
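A sketch of the trust update, run in the application account, is shown below; the role name and account IDs are placeholders, and the aws-auth mapping is illustrated only as a comment.

```python
import json
import boto3

iam = boto3.client("iam")  # run in the application (team) account

# Allow the DevOps account (placeholder ID) to assume the existing deployment role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111111111111:root"},  # DevOps account
        "Action": "sts:AssumeRole",
    }],
}
iam.update_assume_role_policy(
    RoleName="eks-deployment-role",
    PolicyDocument=json.dumps(trust_policy),
)

# The same role must also be mapped into the cluster's aws-auth ConfigMap, for example:
#   mapRoles:
#     - rolearn: arn:aws:iam::222222222222:role/eks-deployment-role
#       username: devops-deployer
#       groups:
#         - system:masters   # or a narrower RBAC group
```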
Modify the DevOps account deployment role to trust the application account using sts:AssumeRoleWithSAML and grant the role access to CodeBuild and the EKS cluster is wrong because SAML federation is not used in this pipeline and the trust relationship belongs on the target role in the application account rather than on the DevOps role.
Create an IAM OIDC identity provider and have CodeBuild call sts:AssumeRoleWithWebIdentity to reach the cluster without updating aws-auth is incorrect because CodeBuild uses an IAM service role by default and not a web identity token in this scenario and EKS still requires the IAM role to be mapped in aws-auth for cluster authorization.
Configure the trust relationship on the centralized DevOps role to allow the application account using sts:AssumeRole, then grant EKS permissions and update aws-auth is incorrect because giving trust to the centralized role does not let it assume a role in the application account. The trust must be on the role in the application account that the DevOps account will assume.
For cross account EKS deployments remember the two step pattern of assume the target account role then authorize that role in the cluster aws-auth. Put trust on the role being assumed and map that role into the cluster.
A streaming media startup runs a serverless backend that handles tens of thousands of API calls with AWS Lambda and stores state in Amazon DynamoDB. Clients invoke a Lambda function through an Amazon API Gateway HTTP API to read large batches from the DynamoDB PlaybackSessions table. Although the table uses DynamoDB Accelerator, users still see cold-start delays of roughly 8–12 seconds during afternoon surges. Traffic reliably peaks from 3 PM to 6 PM and tapers after 9 PM. What Lambda configuration change should a DevOps engineer make to keep latency consistently low at all times?
-
✓ B. Enable provisioned concurrency for the function and configure Application Auto Scaling with a minimum of 2 and a maximum of 120 provisioned instances
Enable provisioned concurrency for the function and configure Application Auto Scaling with a minimum of 2 and a maximum of 120 provisioned instances is correct because provisioned concurrency keeps pre‑initialized execution environments ready and scaling that pool to match the predictable afternoon peaks keeps request latency low.
Provisioned concurrency pre-warms runtimes and loaded code and it reduces the initialization delay that causes cold starts. Application Auto Scaling lets you maintain a small minimum to cover baseline traffic and increase the pool during the 3 PM to 6 PM surge so users see consistent latency while you avoid overprovisioning outside peak hours.
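Under the assumption that the function is published behind an alias named prod, the configuration might be sketched as follows.

```python
import boto3

lambda_client = boto3.client("lambda")
autoscaling = boto3.client("application-autoscaling")

FUNCTION = "playback-api"  # function and alias names are placeholders
ALIAS = "prod"
resource_id = f"function:{FUNCTION}:{ALIAS}"

# Keep a small warm pool at all times.
lambda_client.put_provisioned_concurrency_config(
    FunctionName=FUNCTION, Qualifier=ALIAS, ProvisionedConcurrentExecutions=2
)

# Let Application Auto Scaling grow the pool between 2 and 120 based on utilization.
autoscaling.register_scalable_target(
    ServiceNamespace="lambda",
    ResourceId=resource_id,
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    MinCapacity=2,
    MaxCapacity=120,
)
autoscaling.put_scaling_policy(
    PolicyName="provisioned-concurrency-tracking",
    ServiceNamespace="lambda",
    ResourceId=resource_id,
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 0.7,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "LambdaProvisionedConcurrencyUtilization"
        },
    },
)
```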
Configure reserved concurrency for the function and use Application Auto Scaling to set reserved concurrency to roughly half of observed peak traffic isolates capacity and caps concurrent executions but it does not pre-initialize runtimes and therefore does not eliminate cold starts.
Increase the Lambda memory size to 8,192 MB to reduce initialization and execution time can shorten execution time and sometimes reduce latency but increasing memory does not reliably remove the cold-start initialization delay and it increases cost.
Set the function’s ephemeral storage to 10,240 MB to cache data in /tmp between invocations increases temporary disk space but the execution environment can be recycled and cached files do not prevent cold-start initialization on new environments.
When spikes are predictable prefer provisioned concurrency with Application Auto Scaling and set a sensible minimum to cover baseline traffic and a higher maximum for peak windows.
Lumina Labs runs workloads in nine AWS accounts and wants to monitor the cost-effectiveness of Amazon EC2 across these accounts. All EC2 resources are tagged with environment, cost center, and division for chargeback. Leadership has asked the DevOps team to automate cross-account cost optimization in shared environments, including detecting persistently low-utilization EC2 instances and taking action to remove them. What is the most appropriate approach to implement this automation?
-
✓ B. Use AWS Trusted Advisor with a Business or Enterprise Support plan integrated with Amazon EventBridge, and invoke AWS Lambda to filter on tags and automatically terminate consistently low-utilization EC2 instances
Use AWS Trusted Advisor with a Business or Enterprise Support plan integrated with Amazon EventBridge, and invoke AWS Lambda to filter on tags and automatically terminate consistently low-utilization EC2 instances is correct because Trusted Advisor provides a managed Low Utilization Amazon EC2 Instances check and can emit status-change events to EventBridge so you can build an automated, tag-aware remediation workflow across accounts.
Trusted Advisor with Business or Enterprise Support exposes the full set of checks that include low utilization detection and it integrates natively with EventBridge to deliver events when checks change state and findings appear. You can subscribe across accounts or centralize events and then invoke Lambda to evaluate the environment, cost center, and division tags and take safe automated actions such as stopping or terminating consistently idle instances.
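A simplified remediation handler is sketched below. The Trusted Advisor event field names follow the documented check item refresh notification but should be treated as assumptions, the tag keys are placeholders, and the cross-account role assumption needed to act in member accounts is omitted for brevity.

```python
import boto3

ec2 = boto3.client("ec2")  # for member accounts, assume a cross-account role first

def handler(event, context):
    detail = event.get("detail", {})
    if detail.get("check-name") != "Low Utilization Amazon EC2 Instances":
        return
    if detail.get("status") != "WARN":
        return

    instance_id = detail.get("resource_id")
    if not instance_id:
        return

    # Only act on instances carrying the chargeback tags (keys are placeholders).
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    tags = {t["Key"] for r in reservations
            for i in r["Instances"] for t in i.get("Tags", [])}
    if {"environment", "cost-center", "division"} <= tags:
        ec2.terminate_instances(InstanceIds=[instance_id])
```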
Create Amazon CloudWatch dashboards filtered by environment, cost center, and division tags to monitor utilization, and use Amazon EventBridge rules with AWS Lambda to stop or terminate underused instances is weaker because CloudWatch does not natively identify low-utilization EC2 instances across accounts from tags alone and you would need to build custom collectors or metrics which increases operational overhead compared with a managed check that emits events.
Enroll all accounts in AWS Compute Optimizer and use its recommendations to automatically shut down instances via an Amazon EventBridge rule and AWS Lambda is not ideal because Compute Optimizer focuses on rightsizing and cost recommendations and it does not emit built-in idle detection events for automatic termination so you would need additional polling or orchestration to achieve the same automation.
Build a scheduled EC2-based script to collect utilization across accounts, store results in Amazon DynamoDB, visualize in Amazon QuickSight, and trigger AWS Lambda to terminate idle instances is overly complex and high-maintenance because this approach requires custom collection, storage, and visualization layers and QuickSight does not directly trigger remediation, making it inferior to using Trusted Advisor checks with EventBridge.
Trusted Advisor events pair well with EventBridge and Lambda for tag-aware, cross-account remediation. Choose Trusted Advisor for managed low-utilization detection and use Lambda to enforce your chargeback tags.
Zento Labs operates a media-sharing service where users upload images and receive device-optimized variants in iOS and Android apps, desktop browsers, and popular chat platforms. A Lambda function inspects the request User-Agent and chooses the right image size. To deliver consistently low latency for viewers across several continents, what should you implement?
-
✓ B. Associate the function with a CloudFront distribution using Lambda@Edge
Associate the function with a CloudFront distribution using Lambda@Edge is correct because Lambda@Edge runs your inspection and routing logic at CloudFront edge locations in response to viewer and origin events and this lets you read the User-Agent header and route or rewrite requests to the device-appropriate image variant with minimal latency for a global audience.
Lambda@Edge executes code close to viewers and it can modify requests and responses so you can select the right image size or redirect to a specific variant without an extra round trip to a regional backend. Placing the logic at the edge reduces latency and provides consistent performance across continents.
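A viewer-request handler for Lambda@Edge might look like the sketch below; the variant paths and rewrite rules are illustrative assumptions.

```python
def handler(event, context):
    # CloudFront viewer-request event for Lambda@Edge.
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    user_agent = ""
    if "user-agent" in headers:
        user_agent = headers["user-agent"][0]["value"].lower()

    # Pick a device-appropriate image variant by rewriting the request URI.
    if "iphone" in user_agent or "android" in user_agent:
        request["uri"] = request["uri"].replace("/images/", "/images/mobile/")
    else:
        request["uri"] = request["uri"].replace("/images/", "/images/desktop/")

    return request
```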
Expose the Lambda through Amazon API Gateway using an edge-optimized endpoint is not ideal because the edge-optimized hostname only fronts the API with CloudFront and the API still terminates in a specific AWS Region and your Lambda runs in that region rather than at every edge location.
Use CloudFront Functions for the device-based response logic is unsuitable here because CloudFront Functions are limited to very lightweight JavaScript that cannot perform network I/O or run the full Lambda runtime, and they are best for simple header or URL rewrites rather than the non-trivial selection and transformation flows implemented in Lambda.
Configure the Lambda function as the origin for a CloudFront distribution is invalid because CloudFront origins must be HTTP endpoints such as S3 buckets, load balancers, or custom web servers and you cannot register a Lambda function directly as an origin.
Remember that Lambda@Edge runs at CloudFront edge locations so use it for per request logic that needs global low latency and access to request headers. Use CloudFront Functions only for ultra light header or URL rewrites when you do not need the Lambda runtime.
A travel-tech company, Lumen Journeys, operates a web application on an Auto Scaling group of Amazon EC2 instances across multiple Availability Zones behind an Application Load Balancer. The database runs on Amazon RDS for MySQL, and Amazon Route 53 directs clients to the load balancer. The ALB health check hits an endpoint that verifies the application can connect to the database. Due to regulatory requirements, leadership mandates a disaster recovery site in a separate AWS Region. The recovery point objective is 10 minutes and the recovery time objective is 90 minutes. Which approach would require the fewest changes to the existing stack?
-
✓ C. Deploy the application tier in a different AWS Region without RDS, create a cross-Region RDS MySQL read replica, point the DR app to the local replica, and use Route 53 failover routing
Deploy the application tier in a different AWS Region without RDS, create a cross-Region RDS MySQL read replica, point the DR app to the local replica, and use Route 53 failover routing is correct because it provides a warm standby in a separate Region while requiring the fewest changes to the existing stack.
The approach uses a cross-Region RDS MySQL read replica to maintain an asynchronous copy of the primary database and typical replication lag is measured in minutes which can meet the 10 minute RPO. Promoting that replica to be writable and switching clients with Route 53 failover routing can achieve the 90 minute RTO in a predictable way. This option avoids an engine migration and limits application changes to repointing the database endpoint which makes it the least disruptive solution.
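A rough sketch of the replica creation and later promotion is shown below; the Regions, instance identifiers, and instance class are placeholders, and encrypted sources would need additional parameters.

```python
import boto3

# Run against the DR Region (placeholder Region name).
rds_dr = boto3.client("rds", region_name="eu-west-1")

# Create the cross-Region read replica from the primary instance ARN.
rds_dr.create_db_instance_read_replica(
    DBInstanceIdentifier="crm-mysql-replica",
    SourceDBInstanceIdentifier="arn:aws:rds:us-west-2:111111111111:db:crm-mysql",
    DBInstanceClass="db.r6g.large",
)

# During failover, promote the replica so it accepts writes, then Route 53
# failover routing shifts clients to the DR load balancer.
rds_dr.promote_read_replica(DBInstanceIdentifier="crm-mysql-replica")
```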
Enable RDS Multi-AZ and place the standby in a second Region, duplicate the app tier there, and use Route 53 latency routing to direct traffic during failures is wrong because RDS Multi-AZ is confined to a single Region and cannot place a synchronous standby in another Region.
Recreate the full stack in another Availability Zone, add a MySQL read replica in that AZ, and configure Route 53 failover to switch when the primary AZ is unavailable is wrong because building a recovery site in another Availability Zone does not provide geographic isolation and it does not protect against a Regional outage.
Migrate the database to Amazon Aurora MySQL-Compatible and use Aurora Global Database for cross-Region DR is viable for low RPOs but it requires migrating engines and adapting operations and application behavior which makes it more invasive than the chosen read replica approach.
Focus on RPO and RTO when mapping DR patterns to solutions. Multi-AZ protects an AZ loss while cross-Region read replicas provide a warm standby for Region-wide failures and they usually require fewer changes.
NovaEdge Solutions runs a CI/CD pipeline in AWS that builds with CodeBuild and deploys to a fleet of Amazon EC2 instances through CodeDeploy. The team needs a mandatory human sign-off before any release reaches production, even when all unit and integration tests pass, and the workflow is managed by CodePipeline. What is the simplest and most cost-effective way to add this enforced approval gate?
-
✓ B. Use CodeBuild to execute the tests, insert a Manual approval action in CodePipeline immediately before the production CodeDeploy stage with SNS notifications to approvers, then proceed to the production deploy after approval
Use CodeBuild to execute the tests, insert a Manual approval action in CodePipeline immediately before the production CodeDeploy stage with SNS notifications to approvers, then proceed to the production deploy after approval is the correct choice because it uses CodePipeline’s built in Manual approval action to enforce a mandatory human gate before production while CodeBuild performs the tests and SNS handles notifications.
The built in Manual approval action makes this approach the simplest because it requires no custom infrastructure and it integrates directly into the existing pipeline, so there is no extra service to run or maintain.
The same approach is also the most cost effective because it leverages managed AWS services you already use so there is no ongoing cost for custom workers or self hosted CI servers and approvers receive notifications through SNS so the pipeline only proceeds after explicit sign off.
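For reference, a stage fragment like the following could be added to the pipeline definition, for example through codepipeline.update_pipeline; the stage name and SNS topic ARN are placeholders.

```python
# Manual approval stage placed immediately before the production CodeDeploy stage.
approval_stage = {
    "name": "ProductionApproval",
    "actions": [{
        "name": "ManualSignOff",
        "actionTypeId": {
            "category": "Approval",
            "owner": "AWS",
            "provider": "Manual",
            "version": "1",
        },
        "configuration": {
            "NotificationArn": "arn:aws:sns:us-east-1:111111111111:release-approvers",
            "CustomData": "Approve to promote the build to production",
        },
        "runOrder": 1,
    }],
}
```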
Run the unit and integration tests with AWS Step Functions, then add a test action after the last deploy, add a manual approval with SNS notifications, and finally add a deploy action to promote to production is incorrect because introducing Step Functions adds complexity without delivering the required manual gate and CodeBuild already covers the build and test stages.
Use CodeBuild for tests and create a custom CodePipeline action with a bespoke job worker to perform the approval, notify through SNS, and promote on success is incorrect because a custom action requires development and ongoing maintenance and it duplicates functionality that the managed Manual approval action already provides.
Perform the tests in a self-managed Jenkins or GitLab on EC2, add a test action, add a manual approval in the pipeline with SNS notifications, and then deploy to production is incorrect because running and maintaining self managed CI/CD tooling increases operational overhead and cost compared with using CodeBuild and CodePipeline built in features.
When a question asks for the simplest and most cost effective human gate in CodePipeline choose the built in Manual approval action paired with SNS rather than custom workers or self hosted CI.
A DevOps engineer at Apex Retail Solutions must design a disaster recovery plan for a live web application. The application runs on Amazon EC2 instances in an Auto Scaling group across multiple Availability Zones behind an Application Load Balancer, and Amazon Route 53 uses an alias to the ALB. The database is Amazon RDS for PostgreSQL. Leadership requires an RTO of up to four hours and an RPO near 20 minutes while keeping ongoing costs low. Which disaster recovery approach best meets these goals?
-
✓ B. Implement a pilot light in a secondary Region with a cross-Region RDS PostgreSQL read replica, a minimal application stack ready to start, Route 53 health checks for failover, and promote the replica to primary during disaster
The best option is Implement a pilot light in a secondary Region with a cross-Region RDS PostgreSQL read replica, a minimal application stack ready to start, Route 53 health checks for failover, and promote the replica to primary during disaster. This approach hits the near 20 minute RPO and the up to four hour RTO while keeping steady state costs low.
The Implement a pilot light in a secondary Region with a cross-Region RDS PostgreSQL read replica, a minimal application stack ready to start, Route 53 health checks for failover, and promote the replica to primary during disaster option works because the database is continuously replicated to the cross-Region read replica so data loss is minimal. The minimal application stack is kept ready as a small footprint and it can be scaled out during a failover to meet the hours-level RTO. Route 53 health checks and aliasing to the load balancer allow DNS based failover, and the promoted replica becomes the primary quickly, which reduces recovery time compared with a full restore from backups.
Maintain a warm standby in another Region with a smaller but fully functional environment and scale out with Auto Scaling during failover is functional but it keeps more resources running continuously and it therefore raises ongoing cost. That approach would reduce recovery time but it is not the most cost efficient match for a four hour RTO requirement.
Run multi-site active/active across two Regions with Route 53 latency-based routing and replicate all components for near-zero RPO meets aggressive RTO and RPO targets but it is the most complex and expensive option. The higher cost and operational complexity conflict with the requirement to keep ongoing costs low.
Use a backup-and-restore pattern with AWS Backup to copy RDS snapshots to another Region, rebuild EC2 and ALB on demand, and update DNS after the restore completes is the least expensive in steady state but it cannot reliably achieve a near 20 minute RPO and it risks exceeding a four hour RTO because snapshot copy and restore take time.
Match RTO and RPO to DR patterns by thinking about cost trade offs. Pilot light is a good fit when you need minutes of RPO and hours of RTO with low steady state cost.
A healthcare analytics startup operates workloads across 12 AWS accounts that are organized under AWS Organizations. After a compliance assessment, leadership requires that all threat detections and security logs be funneled into a single security account where analysts will triage findings and retain raw events for at least 120 days. Which approach enables automated detection of attacks on EC2 instances across every account and centralized delivery of findings into an S3 bucket in the security account?
-
✓ B. Enable Amazon GuardDuty for the entire organization with a delegated administrator, capture GuardDuty findings via an EventBridge rule in the admin account, and deliver them to an S3 bucket using Kinesis Data Firehose
The correct approach is to use Enable Amazon GuardDuty for the entire organization with a delegated administrator, capture GuardDuty findings via an EventBridge rule in the admin account, and deliver them to an S3 bucket using Kinesis Data Firehose.
Amazon GuardDuty can integrate with AWS Organizations so a delegated administrator account centrally manages and aggregates findings from all member accounts. Findings are emitted to Amazon EventBridge which lets you route detections from every account into the administrator account and then trigger a Kinesis Data Firehose delivery stream. Firehose provides managed, direct delivery to a central S3 bucket and it can buffer, transform, compress, and retry so raw events can be retained for the required 120 day window.
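In the delegated administrator account, the EventBridge wiring might be sketched as follows; the rule name, Firehose delivery stream ARN, and role ARN are placeholders.

```python
import json
import boto3

events = boto3.client("events")  # run in the GuardDuty delegated administrator account

# Match every GuardDuty finding aggregated in this account.
events.put_rule(
    Name="guardduty-findings-to-s3",
    EventPattern=json.dumps({
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
    }),
    State="ENABLED",
)

# Deliver matched findings to a Kinesis Data Firehose stream that writes to the
# security account S3 bucket.
events.put_targets(
    Rule="guardduty-findings-to-s3",
    Targets=[{
        "Id": "firehose-delivery",
        "Arn": "arn:aws:firehose:us-east-1:111111111111:deliverystream/guardduty-archive",
        "RoleArn": "arn:aws:iam::111111111111:role/eventbridge-to-firehose",
    }],
)
```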
Use Amazon Inspector across the organization, assign a delegated administrator, add an EventBridge rule in that account, and stream results to a central S3 bucket with Kinesis Data Firehose is incorrect because Amazon Inspector focuses on vulnerability and network reachability assessments and not on detecting active attacks from sources like VPC Flow Logs, CloudTrail, and DNS logs.
Deploy Amazon Macie in every account, make one account the Macie administrator, forward alerts with an EventBridge rule, and archive them to S3 through Kinesis Data Firehose is incorrect because Amazon Macie is designed for sensitive data discovery and classification in S3 and not for threat detection on EC2 instances or log analytics across multiple accounts.
Run Amazon GuardDuty separately in each account, send findings to Kinesis Data Streams, and write directly to S3 from the stream is incorrect because it does not use the Organizations delegated administrator to centrally manage findings and because Kinesis Data Streams does not natively deliver to S3 without a custom consumer, whereas Firehose provides managed delivery to S3.
Look for services that integrate with AWS Organizations and provide managed delivery to S3. Focus on GuardDuty with a delegated administrator and EventBridge to trigger Firehose for direct S3 retention.
Vertex Media Labs uses cost allocation tags to charge back AWS spend. Their analytics service runs on Amazon EC2 instances in an Auto Scaling group that launches from a template. Newly attached Amazon EBS volumes on these instances are missing the required CostCenterId tag value 8421. A DevOps engineer must implement the most efficient change so that EBS volumes receive the correct cost center tags automatically at creation time. What should the engineer do?
-
✓ C. Update the Auto Scaling launch template to include TagSpecifications for EBS volumes with the required cost center tags
Update the Auto Scaling launch template to include TagSpecifications for EBS volumes with the required cost center tags is correct because this ensures the CostCenterId tag is applied at creation time to the volumes launched with the instances.
When an Auto Scaling group launches instances from a launch template with TagSpecifications the template controls instance and volume creation. Placing the cost center tag in the template TagSpecifications tags EBS volumes immediately which prevents tagging drift and removes the need for extra automation that would tag after creation.
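A sketch of that change is shown below; the launch template name is a placeholder.

```python
import boto3

ec2 = boto3.client("ec2")

# Add volume and instance tagging to a new launch template version.
response = ec2.create_launch_template_version(
    LaunchTemplateName="analytics-web",
    SourceVersion="$Latest",
    LaunchTemplateData={
        "TagSpecifications": [
            {"ResourceType": "volume",
             "Tags": [{"Key": "CostCenterId", "Value": "8421"}]},
            {"ResourceType": "instance",
             "Tags": [{"Key": "CostCenterId", "Value": "8421"}]},
        ]
    },
)

# Make the new version the default so the Auto Scaling group uses it at launch.
ec2.modify_launch_template(
    LaunchTemplateName="analytics-web",
    DefaultVersion=str(response["LaunchTemplateVersion"]["VersionNumber"]),
)
```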
Create an Amazon EventBridge rule for CreateVolume and invoke an AWS Lambda function to apply the CostCenterId tag to new EBS volumes is not ideal because it introduces additional services and complexity and it applies tags after volumes are created which can produce race conditions and operational overhead.
Add the CostCenterId tag to the Auto Scaling group and enable PropagateAtLaunch is incorrect because PropagateAtLaunch only propagates tags to the EC2 instances and does not apply tags to the EBS volumes created by the launch process.
Use AWS Config to prevent creation of EBS volumes that are missing the CostCenterId tag is incorrect because AWS Config evaluates and reports resource compliance and can trigger remediation but it does not block resource creation in real time.
Put required EBS tags in the launch template TagSpecifications because PropagateAtLaunch only affects instances and tagging after creation adds complexity.
The platform engineering team at NovaCare Analytics manages more than 320 AWS accounts through AWS Organizations. Security mandates that every EC2 instance launches from a centrally approved, hardened base AMI. When a new AMI version is released, the team must ensure no new instances are started from the previous AMI, and they also need a centralized and auditable view of AMI compliance across all accounts. What approach should be implemented to meet these goals across the organization? (Choose 2)
-
✓ B. Deploy an AWS Config custom rule with AWS CloudFormation StackSets to check instance AMI IDs against an approved list and aggregate results in an AWS Config aggregator in the management account
-
✓ D. Use AWS Systems Manager Automation to produce the AMI in a central account and share it with organizational accounts, then revoke sharing on the previous AMI and share the new one when updated
The combination of Deploy an AWS Config custom rule with AWS CloudFormation StackSets to check instance AMI IDs against an approved list and aggregate results in an AWS Config aggregator in the management account and Use AWS Systems Manager Automation to produce the AMI in a central account and share it with organizational accounts, then revoke sharing on the previous AMI and share the new one when updated is the correct approach to provide both enforcement and centralized auditing across the organization.
Producing the AMI centrally with Systems Manager Automation and sharing only the current image lets you revoke access to the previous AMI so new launches cannot use it. The sharing and unsharing pattern enforces a single approved image per version across accounts. The AWS Config custom rule deployed with CloudFormation StackSets checks running instances against the approved AMI list and the Config aggregator in the management account gives a single auditable view of compliance across all accounts and Regions.
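The evaluation logic of the custom rule could be sketched roughly as follows; the approved AMI ID is a placeholder, and a production rule would typically read the approved list from a parameter rather than hard-coding it.

```python
import json
import boto3

config = boto3.client("config")

APPROVED_AMIS = {"ami-0abc1234de56f7890"}  # placeholder approved AMI list

def handler(event, context):
    # AWS Config invokes the rule with the changed resource's configuration item.
    invoking_event = json.loads(event["invokingEvent"])
    item = invoking_event["configurationItem"]

    compliance = "NOT_APPLICABLE"
    if item["resourceType"] == "AWS::EC2::Instance":
        ami_id = item["configuration"]["imageId"]
        compliance = "COMPLIANT" if ami_id in APPROVED_AMIS else "NON_COMPLIANT"

    config.put_evaluations(
        Evaluations=[{
            "ComplianceResourceType": item["resourceType"],
            "ComplianceResourceId": item["resourceId"],
            "ComplianceType": compliance,
            "OrderingTimestamp": item["configurationItemCaptureTime"],
        }],
        ResultToken=event["resultToken"],
    )
```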
Use AWS Systems Manager Automation distributed with AWS CloudFormation StackSets to build the AMI inside every account is operationally heavy and creates many independent AMI copies that are harder to retire consistently. That approach increases the risk that old images remain launchable.
Create the AMI in a central account and copy it to each account and Region whenever a new version is published proliferates stale copies that remain launchable unless every copy is tracked and revoked. Copying does not by itself prevent new instances from using older images.
Publish the approved AMI as a product in AWS Service Catalog across the organization can steer provisioning toward approved images but it does not prevent users from bypassing the catalog and launching older AMIs directly. Service Catalog alone also does not provide the same centralized, aggregated compliance reporting that AWS Config does.
Enforce AMI usage by sharing only the current image and revoking the previous one, and audit compliance with an AWS Config rule aggregated to the management account.
Delta Ledger, a regional fintech startup, manages about 36 AWS accounts across five organizational units with AWS Organizations. The governance team believes an unknown external AWS account was invited into the organization and granted broad permissions, though no harmful actions were taken. Which monitoring setup would most effectively deliver near-real-time alerts when organization membership or account-related changes occur? (Choose 2)
-
✓ A. Use AWS Config with an organization aggregator to evaluate changes across all accounts and send notifications through Amazon SNS or Amazon EventBridge when the organization’s structure or account configuration changes
-
✓ C. Create an organization-level trail in AWS CloudTrail that records all AWS Organizations API and console activity, and wire Amazon EventBridge rules to trigger an SNS alert for administrator-defined events such as invited or created accounts
The correct choices are Use AWS Config with an organization aggregator to evaluate changes across all accounts and send notifications through Amazon SNS or Amazon EventBridge when the organization’s structure or account configuration changes and Create an organization-level trail in AWS CloudTrail that records all AWS Organizations API and console activity, and wire Amazon EventBridge rules to trigger an SNS alert for administrator-defined events such as invited or created accounts.
The Create an organization-level trail in AWS CloudTrail that records all AWS Organizations API and console activity, and wire Amazon EventBridge rules to trigger an SNS alert for administrator-defined events such as invited or created accounts option provides near real time visibility into Organizations API and console actions. CloudTrail captures events such as InviteAccountToOrganization, CreateAccount, AcceptHandshake, and LeaveOrganization and EventBridge can match those events and route them to SNS or other responders so administrators are alerted immediately when membership or OU changes occur.
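An EventBridge rule for those Organizations events might be sketched as follows; the rule name and SNS topic ARN are placeholders, and the rule lives in us-east-1 because Organizations API activity is recorded there.

```python
import json
import boto3

events = boto3.client("events", region_name="us-east-1")

events.put_rule(
    Name="org-membership-changes",
    EventPattern=json.dumps({
        "source": ["aws.organizations"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["organizations.amazonaws.com"],
            "eventName": [
                "InviteAccountToOrganization",
                "CreateAccount",
                "AcceptHandshake",
                "LeaveOrganization",
                "RemoveAccountFromOrganization",
            ],
        },
    }),
    State="ENABLED",
)

events.put_targets(
    Rule="org-membership-changes",
    Targets=[{"Id": "alert-admins",
              "Arn": "arn:aws:sns:us-east-1:111111111111:org-change-alerts"}],
)
```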
The Use AWS Config with an organization aggregator to evaluate changes across all accounts and send notifications through Amazon SNS or Amazon EventBridge when the organization’s structure or account configuration changes option gives centralized compliance and drift detection across accounts and Regions. Config aggregator lets you record and evaluate account configuration baselines and resource relationships so you can detect structural or policy drift that may accompany an unexpected account invite or privilege change.
Deploy a third-party SIEM from AWS Marketplace, integrate with Amazon GuardDuty findings, and publish administrator alerts via Amazon SNS focuses on threat detection and log correlation and not on producing authoritative, immediate signals for Organizations API events. A SIEM may add useful context but it introduces extra processing and latency and it is not the primary source for Organization membership events.
Use AWS Systems Manager with Amazon EventBridge to watch for organizational updates and notify the platform team of new activities is not appropriate because Systems Manager manages instances and configuration inside accounts and it does not record AWS Organizations membership or policy change events. EventBridge can be used but Systems Manager is not the right producer for organization change events.
Enable AWS Security Hub across the organization and pair it with Amazon Detective to surface suspicious behavior and notify the security team aggregates and investigates security findings and it does not serve as the primary mechanism to detect Organization membership or account creation events. These services are useful for investigation but they do not replace CloudTrail and Config for governance change detection.
Use CloudTrail organization trails plus EventBridge for immediate alerts and add an AWS Config organization aggregator to track compliance and structural drift across accounts.
NorthPeak Media runs a containerized microservice on Amazon ECS inside a private VPC, and over the past 14 days users have experienced intermittent timeouts and high latency in production. Leadership asked the DevOps engineer to enable distributed tracing to identify which request paths and downstream services are causing the slow responses. What should the engineer implement to collect end-to-end traces from the ECS tasks?
-
✓ C. Build an image for the AWS X-Ray daemon, push it to ECR, run it as a sidecar in the ECS task, and open UDP 2000 via task definition port mappings
Build an image for the AWS X-Ray daemon, push it to ECR, run it as a sidecar in the ECS task, and open UDP 2000 via task definition port mappings is correct because the X-Ray SDK sends trace segments to a local daemon that batches and forwards them to the X-Ray service and colocating the daemon with the application in the same task provides reliable local ingestion.
Running the daemon as a sidecar lets your application containers send UDP traffic to localhost and it keeps tracing configuration scoped to the task. Exposing UDP 2000 in the task definition and using the appropriate ECS network mode ensures the daemon can receive segments from the SDK. Building or using an image for the X-Ray daemon and pushing it to ECR is the common approach when you run tasks in a private VPC or when you need a controlled image source.
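A simplified task definition with the sidecar might look like the sketch below; the image URIs, sizes, and family name are placeholders, and awsvpc networking lets the app container reach the daemon on localhost.

```python
import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="northpeak-service",
    networkMode="awsvpc",
    requiresCompatibilities=["EC2"],
    cpu="512",
    memory="1024",
    containerDefinitions=[
        {
            "name": "app",
            "image": "111111111111.dkr.ecr.us-east-1.amazonaws.com/northpeak-service:latest",
            "essential": True,
            "environment": [{"name": "AWS_XRAY_DAEMON_ADDRESS", "value": "localhost:2000"}],
        },
        {
            # X-Ray daemon sidecar listening on UDP 2000.
            "name": "xray-daemon",
            "image": "111111111111.dkr.ecr.us-east-1.amazonaws.com/xray-daemon:latest",
            "essential": True,
            "portMappings": [{"containerPort": 2000, "protocol": "udp"}],
        },
    ],
)
```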
Add an xray-daemon.config file to the image and map UDP 2000 in the task definition is wrong because that configuration file is used by Elastic Beanstalk rather than ECS and it does not implement the recommended ECS sidecar daemon pattern.
AWS CloudTrail is incorrect because CloudTrail records AWS API calls and account activity and it does not provide application level distributed traces or service to service latency information.
Install the X-Ray daemon via user data in /etc/ecs/ecs.config and expose TCP 3000 on the container agent is not appropriate because the X-Ray daemon ingests over UDP port 2000 and installing the daemon via instance user data modifies the host instead of following the ECS task sidecar pattern. Exposing TCP port 3000 is the wrong protocol and port for X-Ray.
Run the X-Ray daemon as a sidecar and open UDP 2000 in the task definition. Use the official daemon image or build and push it to ECR when you need a private registry.
A DevOps engineer at a digital ticketing startup named VeloTix manages about 40 Windows Server Amazon EC2 instances and must automate patching with an approved baseline while ensuring that reboots do not occur at the same time to avoid payment delays and lost revenue. How should the engineer design the patching process to stagger reboots and maintain availability?
-
✓ C. Define two distinct patch groups with unique tags for the Windows fleet, associate each to AWS-DefaultPatchBaseline, create two non-overlapping Systems Manager maintenance windows targeted by those patch group tags, and schedule AWS-RunPatchBaseline in each window with different start times
The correct option is Define two distinct patch groups with unique tags for the Windows fleet, associate each to AWS-DefaultPatchBaseline, create two non-overlapping Systems Manager maintenance windows targeted by those patch group tags, and schedule AWS-RunPatchBaseline in each window with different start times. This approach staggers patching and reboots so that only part of the fleet is updated at any one time while the approved baseline is applied.
Using patch groups together with Patch Manager and Maintenance Windows lets you map subsets of instances to a baseline and to controlled schedules. Maintenance Windows provide built-in targeting and timing so you can set non-overlapping start times and avoid simultaneous reboots. Scheduling the AWS-RunPatchBaseline document in each window applies the AWS-DefaultPatchBaseline automatically and produces compliance reporting to verify the approved baseline is enforced.
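One window for the first patch group might be sketched as follows; repeat the same calls with a later cron schedule and the second patch group tag so the reboots never overlap. Names, schedules, and concurrency values are placeholders.

```python
import boto3

ssm = boto3.client("ssm")

window = ssm.create_maintenance_window(
    Name="windows-group-a-patching",
    Schedule="cron(0 2 ? * SUN *)",   # group B would use a later, non-overlapping start
    Duration=3,
    Cutoff=1,
    AllowUnassociatedTargets=False,
)

target = ssm.register_target_with_maintenance_window(
    WindowId=window["WindowId"],
    ResourceType="INSTANCE",
    Targets=[{"Key": "tag:Patch Group", "Values": ["windows-group-a"]}],
)

ssm.register_task_with_maintenance_window(
    WindowId=window["WindowId"],
    TaskType="RUN_COMMAND",
    TaskArn="AWS-RunPatchBaseline",
    Targets=[{"Key": "WindowTargetIds", "Values": [target["WindowTargetId"]]}],
    TaskInvocationParameters={"RunCommand": {"Parameters": {"Operation": ["Install"]}}},
    MaxConcurrency="25%",
    MaxErrors="1",
    Priority=1,
)
```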
Create one patch group tagged on all Windows instances, associate AWS-DefaultPatchBaseline, configure a single Systems Manager maintenance window, and run AWS-RunPatchBaseline during that window is wrong because a single shared window risks patching and rebooting the entire fleet at once and it does not provide the staggered reboots needed to maintain availability.
Configure two patch groups with unique tags mapped to all Windows instances, attach AWS-DefaultPatchBaseline to both, use EventBridge rules to trigger Systems Manager Run Command with a cron schedule, and manage custom steps with State Manager during execution is wrong because EventBridge and Run Command add unnecessary complexity for scheduling patch baselines and do not replace the built in scheduling and targeting features of Maintenance Windows that are designed for safe patch orchestration.
Use AWS Systems Manager Distributor to push Windows updates to instances on demand and rely on Auto Scaling rolling updates to stagger reboots across the fleet is wrong because Distributor is intended for distributing software packages and is not the supported mechanism for Windows OS patching and Auto Scaling rolling updates do not coordinate Patch Manager operations or baselines for a static EC2 fleet.
Map instances into patch groups and schedule AWS-RunPatchBaseline in separate, non overlapping Maintenance Windows so reboots never occur across the whole fleet at once.
Northwind Outfitters operates a web API that fronts Amazon EC2 instances behind an Application Load Balancer by using an Amazon API Gateway REST API. The engineering team wants new releases to be rolled out with minimal user impact and with the ability to revert quickly if defects are found. What approach will achieve this with the least changes to the existing application?
-
✓ B. Create a parallel environment behind the ALB with the new build and configure API Gateway canary release to send a small portion of requests to it
Create a parallel environment behind the ALB with the new build and configure API Gateway canary release to send a small portion of requests to it is correct because it lets you route a small percentage of API traffic to the new EC2 instances behind the existing ALB and observe behavior before increasing the rollout.
This canary release approach requires no changes to application code and only minimal configuration at the API stage. You deploy a parallel backend behind the ALB and then use API Gateway canary weights to shift traffic incrementally. If a defect is detected you can quickly revert by reducing the canary weight or disabling the canary which gives fast rollback without touching DNS or redeploying the fleet.
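As a sketch, shifting a small share of traffic and rolling it back could look like the following; the REST API ID and percentages are placeholders.

```python
import boto3

apigw = boto3.client("apigateway")

REST_API_ID = "a1b2c3d4e5"  # placeholder REST API ID

# Deploy the new backend configuration as a canary receiving 10% of traffic.
apigw.create_deployment(
    restApiId=REST_API_ID,
    stageName="prod",
    canarySettings={"percentTraffic": 10.0},
)

# Roll back instantly by dropping the canary share to zero if defects appear.
apigw.update_stage(
    restApiId=REST_API_ID,
    stageName="prod",
    patchOperations=[{"op": "replace",
                      "path": "/canarySettings/percentTraffic",
                      "value": "0.0"}],
)
```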
Use AWS CodeDeploy blue/green with the Auto Scaling group behind the ALB and shift production traffic to the new revision can achieve safe cutovers but it is more invasive because it typically requires deployment agents and AppSpec configuration and introduces additional deployment tooling compared with controlling traffic at the API layer.
Stand up a duplicate environment and update the Route 53 alias to the new stack forces a full DNS cutover that exposes all users at once and can be slower to roll back due to DNS propagation and caching which makes progressive validation harder.
Create a new target group for the ALB and have API Gateway weight requests directly to that target group is not viable because API Gateway does not natively weight traffic across ALB target groups. The supported weighted exposure mechanism when API Gateway fronts a backend is a canary deployment at the API stage.
When an API Gateway REST API fronts an ALB prefer using canary deployments to stage traffic and enable an immediate rollback without changing application code.
CineWave Media is wrapping up its move to AWS and has about 30 engineers, with a handful holding the AWS Certified DevOps Engineer Professional while many newer teammates have not earned Associate-level certifications yet. The company enforces specific architecture patterns and mandatory tags and wants to reduce the chance that less experienced users deploy noncompliant resources, without blocking them from provisioning what they need. What should the DevOps engineer implement?
-
✓ C. Package approved architectures as AWS CloudFormation templates and publish them as products in AWS Service Catalog with required tags, then allow beginners to launch only Service Catalog products and deny write access to other services
The correct choice is Package approved architectures as AWS CloudFormation templates and publish them as products in AWS Service Catalog with required tags, then allow beginners to launch only Service Catalog products and deny write access to other services. This approach enforces preventative controls so novices can self serve while the platform protects compliance.
AWS Service Catalog products are preapproved, parameterized artifacts so teams can only provision vetted architectures with required tagging and configuration. Packaging templates as AWS CloudFormation products inside the Service Catalog gives you centralized versioning, parameter constraints, and the ability to deny other write paths so less experienced users cannot bypass governance.
Enable AWS Config with custom rules powered by AWS Lambda to evaluate compliance and give users broad permissions while you improve the rules over time offers only detective controls and will not prevent noncompliant resources from being created in the first place.
Create a starter IAM group for novices and attach a policy that requires senior engineer approval before any resource creation, using an SNS topic for the approval workflow is impractical because IAM does not provide a native precreation approval mechanism and SNS cannot act as an enforcement gate to block resource creation.
Define reusable AWS CloudFormation templates and permit beginners to create stacks directly in CloudFormation while restricting write access to other services still permits broad provisioning through CloudFormation so users could deploy unvetted templates or parameter values and it is harder to centrally enforce required tags and product constraints than using a managed catalog.
Favor preventative self service by offering vetted templates through a managed catalog and use AWS Config rules for ongoing, detective compliance checks.
A regional credit union operates a VMware-based automated golden image pipeline in its on-premises data center. The DevOps engineer needs a way to validate those server images with the current on-prem pipeline while closely mirroring how they will run on Amazon EC2. The team is moving to Amazon Linux 2 in AWS and wants to confirm functionality, surface incompatibilities, and identify dependencies before a 90-day cutover. What approach should be implemented to meet these goals?
-
✓ B. Use AWS VM Import/Export to import the on-prem VMware image as an EC2 AMI, validate on EC2, then export the imported instance as a VMware-compatible OVA to Amazon S3 and load it back into vSphere for on-prem testing
The correct approach is Use AWS VM Import/Export to import the on-prem VMware image as an EC2 AMI, validate on EC2, then export the imported instance as a VMware-compatible OVA to Amazon S3 and load it back into vSphere for on-prem testing.
This method uses VM Import/Export to bridge your existing vSphere images and Amazon EC2 so you can run the identical image in AWS and then return the validated instance to your on premises pipeline. Importing the VMware VM as an EC2 AMI lets you confirm how the image behaves on Amazon Linux 2 in EC2 and surface incompatibilities and dependencies. Exporting the instance that was previously imported produces a VMware compatible OVA so you can run the same validated image under your current VMware automation and compare results.
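A rough sketch of the import and the later export is shown below; the bucket, key, and instance ID are placeholders, and both tasks run asynchronously so you would poll their status before proceeding.

```python
import boto3

ec2 = boto3.client("ec2")

# Import the exported VMware image (OVA uploaded to S3) as an EC2 AMI.
ec2.import_image(
    Description="golden-image-validation",
    DiskContainers=[{
        "Format": "ova",
        "UserBucket": {"S3Bucket": "example-image-staging", "S3Key": "golden-image.ova"},
    }],
)

# After validating an instance launched from the imported AMI, export that instance
# back to a VMware-compatible image in S3 for on-premises testing.
ec2.create_instance_export_task(
    InstanceId="i-0123456789abcdef0",  # the validated imported instance
    TargetEnvironment="vmware",
    ExportToS3Task={
        "ContainerFormat": "ova",
        "DiskImageFormat": "VMDK",
        "S3Bucket": "example-image-staging",
        "S3Prefix": "validated/",
    },
)
```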
Deploy AWS Outposts racks with Amazon Linux 2 hosts connected to the data center and execute tests there is unnecessary and costly for this validation because Outposts is intended to extend AWS infrastructure on premises and it is not required to mirror EC2 behavior while keeping your current VMware toolchain.
Provision a local server on Ubuntu or Fedora since any Linux distribution is effectively equivalent to Amazon Linux 2 for testing is incorrect because distributions differ in kernels, libraries, package versions, and system tooling and those differences can hide or introduce incompatibilities versus Amazon Linux 2.
Download an Amazon Linux 2 ISO and install it directly on a physical host in the data center for validation is not viable because AWS provides images tailored for virtual environments and the supported path for parity testing with EC2 is VM Import/Export rather than installing a generic ISO on bare metal.
Remember that you can export only instances that were originally imported via VM Import/Export, so import to EC2 first, validate on Amazon Linux 2, and then export the imported instance for on-premises testing.
BrightCart, a national online retailer, recently migrated from another cloud to AWS. The web tier runs in an Auto Scaling group behind an Application Load Balancer. The team will publish an Amazon CloudFront distribution with a custom domain whose origin is the ALB. What should the engineers configure to enforce HTTPS from viewers to CloudFront and from CloudFront back to the ALB?
-
✓ C. Request or import an ACM certificate for the ALB, associate an ACM certificate in us-east-1 with the CloudFront distribution’s custom domain, set Viewer Protocol Policy to HTTPS Only, and configure the origin to use HTTPS
Request or import an ACM certificate for the ALB, associate an ACM certificate in us-east-1 with the CloudFront distribution’s custom domain, set Viewer Protocol Policy to HTTPS Only, and configure the origin to use HTTPS is correct because it enforces HTTPS for the viewer to CloudFront connection and for the CloudFront to ALB origin connection.
Request or import an ACM certificate for the ALB, associate an ACM certificate in us-east-1 with the CloudFront distribution’s custom domain, set Viewer Protocol Policy to HTTPS Only, and configure the origin to use HTTPS works because CloudFront requires a certificate for custom domains to be in the us-east-1 region and the Viewer Protocol Policy set to HTTPS Only forces clients to use TLS. The ALB must present a trusted certificate so that CloudFront can complete an HTTPS connection to the origin and using an ACM certificate on the ALB provides a managed trusted certificate for that purpose.
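For orientation, the HTTPS-related parts of the CloudFront DistributionConfig might look like the fragment below; the ARNs and domain names are placeholders and the other required distribution fields are omitted.

```python
# Fragment of a CloudFront DistributionConfig focused on the HTTPS settings.
https_settings = {
    "ViewerCertificate": {
        # Certificate for the custom domain must live in us-east-1.
        "ACMCertificateArn": "arn:aws:acm:us-east-1:111111111111:certificate/abcd-1234",
        "SSLSupportMethod": "sni-only",
        "MinimumProtocolVersion": "TLSv1.2_2021",
    },
    "DefaultCacheBehavior": {
        "TargetOriginId": "alb-origin",
        "ViewerProtocolPolicy": "https-only",      # force TLS from viewers to CloudFront
    },
    "Origins": {"Quantity": 1, "Items": [{
        "Id": "alb-origin",
        "DomainName": "app-alb-1234567890.us-east-1.elb.amazonaws.com",
        "CustomOriginConfig": {
            "HTTPPort": 80,
            "HTTPSPort": 443,
            "OriginProtocolPolicy": "https-only",  # force TLS from CloudFront to the ALB
        },
    }]},
}
```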
Create a self-signed certificate for the ALB, set CloudFront Viewer Protocol Policy to HTTPS Only, and use a third-party certificate imported into ACM or the IAM certificate store for the custom domain is incorrect because CloudFront does not trust self-signed origin certificates and a self-signed certificate on the ALB will not provide a trusted TLS connection from CloudFront to the origin.
Import a trusted CA certificate into ACM for the ALB, choose Match Viewer for the CloudFront Viewer Protocol Policy, and use a third-party certificate imported into ACM or the IAM certificate store for the custom domain is incorrect because choosing Match Viewer does not force TLS from the client and it can allow HTTP which breaks end to end encryption between the viewer and CloudFront.
Upload a trusted CA certificate to an Amazon S3 bucket for CloudFront to reference, set Viewer Protocol Policy to HTTPS Only, use the default CloudFront certificate for the custom domain, and forward requests to the origin over HTTP is incorrect because CloudFront cannot reference certificates from S3 and the default CloudFront certificate only covers the cloudfront.net domain. Forwarding to the origin over HTTP breaks end to end HTTPS and does not secure the CloudFront to ALB leg.
Remember that CloudFront custom domain certificates must live in us-east-1 and you must set Viewer Protocol Policy to HTTPS Only while configuring the origin to require HTTPS.
A platform team at a fintech startup plans to launch workloads in six Amazon VPCs across two AWS accounts. The services must have any-to-any connectivity with transitive routing among the VPCs. Leadership wants centralized administration of network traffic policies for consistent security. What architecture should the team implement to meet these needs with the least operational overhead?
-
✓ B. Use AWS Transit Gateway for transitive connectivity among VPCs and manage network access policies centrally with AWS Firewall Manager
The correct option is Use AWS Transit Gateway for transitive connectivity among VPCs and manage network access policies centrally with AWS Firewall Manager. Transit Gateway provides a scalable hub that enables native transitive routing so the six VPCs can have any to any connectivity while Firewall Manager centralizes and enforces network security policies across accounts and VPCs.
Transit Gateway removes the need to manage a large mesh of point to point links and it scales more easily than peering or many VPN tunnels. Firewall Manager lets you deploy and govern network firewall rules and related protections from a central place so policies remain consistent and operational overhead is reduced.
Configure VPC peering between every VPC to build a full mesh and centralize WebACLs with AWS WAF is incorrect because VPC peering does not support transitive routing and AWS WAF provides application layer web protections rather than centralized network level routing or firewall enforcement.
Set up AWS PrivateLink endpoints between each VPC and use AWS Security Hub for centralized security policies is incorrect because PrivateLink exposes individual services via interface endpoints and does not provide general VPC routing, and Security Hub aggregates findings rather than enforcing network access policies.
Establish AWS Site-to-Site VPN tunnels between each pair of VPCs and manage policies with AWS Firewall Manager is incorrect because building pairwise VPN tunnels is operationally heavy and VPNs are primarily intended for connecting on premises networks to AWS rather than providing an efficient VPC mesh.
When a question requires transitive routing and centralized network policy across multiple VPCs and accounts, choose AWS Transit Gateway with AWS Firewall Manager for scalability and lower operational overhead.
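A minimal boto3 sketch of the hub-and-spoke setup, assuming placeholder VPC and subnet IDs in one of the two accounts; the gateway would be shared to the second account with AWS RAM before its VPCs attach, and Firewall Manager policies are configured separately at the organization level.

```python
# Hedged sketch: create one Transit Gateway hub and attach each VPC in this
# account to it. IDs are placeholders.
import boto3

ec2 = boto3.client("ec2")

tgw = ec2.create_transit_gateway(
    Description="Hub for any-to-any routing across six VPCs",
    Options={
        "DefaultRouteTableAssociation": "enable",
        "DefaultRouteTablePropagation": "enable",
    },
)
tgw_id = tgw["TransitGateway"]["TransitGatewayId"]

# Attachments propagate into the default route table, which is what gives
# the VPCs transitive any-to-any connectivity through the hub.
vpc_attachments = {
    "vpc-aaa111": ["subnet-aaa111"],   # hypothetical IDs
    "vpc-bbb222": ["subnet-bbb222"],
    "vpc-ccc333": ["subnet-ccc333"],
}
for vpc_id, subnet_ids in vpc_attachments.items():
    ec2.create_transit_gateway_vpc_attachment(
        TransitGatewayId=tgw_id,
        VpcId=vpc_id,
        SubnetIds=subnet_ids,
    )
```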
A fitness streaming startup stores all workout videos in an Amazon S3 bucket. Over the past 9 months, traffic has surged by more than 200x, and the team plans to roll out premium subscription tiers. They must quickly and inexpensively determine which individual video objects are most frequently viewed and downloaded to guide content strategy and pricing. What is the most cost-effective approach that can be implemented immediately?
-
✓ B. Enable S3 server access logging and use Amazon Athena with an external table to query the log files and identify top GETs and downloads
Enable S3 server access logging and use Amazon Athena with an external table to query the log files and identify top GETs and downloads is the correct choice because it provides per object request details and can be queried immediately with minimal cost and setup.
S3 server access logs include the object key and operation for each request so you can compute top N viewed or downloaded videos. Athena is serverless and uses schema on read so you can create an external table over the log files and run SQL queries without provisioning infrastructure which keeps time to value and cost low.
Turn on Amazon S3 Storage Lens and visualize activity metrics in the dashboard or Amazon QuickSight is not sufficient because Storage Lens provides aggregated metrics at the bucket, prefix, or account level and does not expose the per object request counts needed to rank individual videos.
Enable S3 server access logging and use Amazon Redshift Spectrum after provisioning a Redshift cluster to query the logs can produce the needed results but it requires provisioning and managing a Redshift cluster which adds setup time and ongoing cost that is unnecessary when Athena is available.
Stream new S3 access logs with event notifications to AWS Lambda, deliver to Kinesis Data Firehose, and index into Amazon OpenSearch Service for analysis enables near real time analytics but it adds multiple services and indexing overhead which increases complexity and cost compared with querying logs directly using Athena for this batch analysis.
For quick per object access analysis use S3 server access logging and Athena because they are serverless and low cost. Create an external table on the log files and run a GROUP BY on the object key to find top videos.
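A short sketch of that query path, assuming an external table named s3_access_logs_db.access_logs has already been created over the log files with the usual key and operation columns; the results bucket is a placeholder.

```python
# Hedged sketch: run the top-objects aggregation against the access log table
# with Athena. Database, table, and output location are placeholders.
import boto3

athena = boto3.client("athena")

top_objects_sql = """
SELECT key, COUNT(*) AS downloads
FROM s3_access_logs_db.access_logs
WHERE operation = 'REST.GET.OBJECT'
GROUP BY key
ORDER BY downloads DESC
LIMIT 20
"""

athena.start_query_execution(
    QueryString=top_objects_sql,
    QueryExecutionContext={"Database": "s3_access_logs_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```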
At a media-streaming startup called Polar Pixel, several product squads share one AWS account, and most photos and clips reside in Amazon S3. Some buckets must stay publicly readable on the internet while others are limited to internal services. The company wants to use AWS Trusted Advisor to flag any public buckets and to verify that only approved principals have List permissions. They also want immediate alerts if a public bucket drifts to unsafe settings and automatic fixes when appropriate. What should the DevOps engineer implement to meet these goals? (Choose 3)
-
✓ A. Configure a custom AWS Config rule that evaluates S3 bucket policies for public access and publishes noncompliance notifications to an Amazon SNS topic
-
✓ C. Use Amazon EventBridge to capture Trusted Advisor S3 bucket permission check state changes and trigger an SNS email notification
-
✓ E. Create a custom AWS Config rule that emits an EventBridge event on violation and have an EventBridge rule invoke a Lambda function to automatically correct the S3 bucket policy
The correct choices are Configure a custom AWS Config rule that evaluates S3 bucket policies for public access and publishes noncompliance notifications to an Amazon SNS topic, Use Amazon EventBridge to capture Trusted Advisor S3 bucket permission check state changes and trigger an SNS email notification, and Create a custom AWS Config rule that emits an EventBridge event on violation and have an EventBridge rule invoke a Lambda function to automatically correct the S3 bucket policy.
Configure a custom AWS Config rule that evaluates S3 bucket policies for public access and publishes noncompliance notifications to an Amazon SNS topic provides continuous, rule based evaluation of bucket ACLs and policies so drift is detected as soon as Config runs. This gives you consistent compliance checks and a simple SNS notification path for alerting owners and automations.
Use Amazon EventBridge to capture Trusted Advisor S3 bucket permission check state changes and trigger an SNS email notification delivers near real time alerts when Trusted Advisor identifies a risky public bucket so you get immediate visibility from the support tooling.
Create a custom AWS Config rule that emits an EventBridge event on violation and have an EventBridge rule invoke a Lambda function to automatically correct the S3 bucket policy enables automated remediation so known safe corrections run without manual work and only affect buckets that actually drifted out of the approved state.
Create a custom Amazon Inspector rule to scan S3 bucket permissions and invoke AWS Systems Manager to fix the bucket policy is incorrect because Amazon Inspector focuses on host and container vulnerabilities and does not evaluate S3 bucket policies or public access settings.
Enable S3 Block Public Access at the account level to prevent any bucket from being public and rely on manual exceptions for the few that must be public is incorrect because an account level block prevents per bucket public settings and would block the buckets that intentionally need public read access.
Schedule an AWS Lambda function to call the Trusted Advisor API every 30 minutes and subscribe to Trusted Advisor summary emails to receive results is incorrect because polling and email summaries add latency and complexity when EventBridge and native alerts provide timelier and more automated integration.
Use AWS Config for continuous checks and pair it with EventBridge to trigger notifications and automated remediation via Lambda.
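The remediation piece could look roughly like the Lambda sketch below, which assumes the EventBridge rule matches Config compliance change events for the custom rule; the corrective action shown here simply re-applies Block Public Access, so buckets approved for public reads would need to be excluded before it runs.

```python
# Hedged sketch of the remediation Lambda invoked by an EventBridge rule on
# "Config Rules Compliance Change" events. Field names follow the Config
# compliance change event shape.
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    detail = event["detail"]
    if detail.get("newEvaluationResult", {}).get("complianceType") != "NON_COMPLIANT":
        return
    bucket = detail["resourceId"]          # for AWS::S3::Bucket this is the bucket name
    # Example correction: re-apply Block Public Access on the drifted bucket.
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return {"remediated": bucket}
```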
A media startup, NovaStream, runs its main web service on Amazon EC2 instances behind a single Application Load Balancer with instances managed by Auto Scaling. The team has created separate launch templates and Auto Scaling groups for blue and green, each registered with distinct target groups, and a Route 53 alias record with a 45 second TTL points to the ALB. They want to perform an immediate cutover of all traffic from the blue version to the newly deployed green version using this single ALB. What should the engineer do to accomplish this?
-
✓ C. Run a rolling update on the green Auto Scaling group to roll out the new build, then use the AWS CLI to move the ALB listener to the green target group
The correct choice is Run a rolling update on the green Auto Scaling group to roll out the new build, then use the AWS CLI to move the ALB listener to the green target group. This sequence makes sure the green instances are running the new version and passing health checks before traffic is cut over at the load balancer.
Updating the green Auto Scaling group with a rolling update lets instances come up and register to the green target group while health checks validate readiness. After the green fleet is healthy switching the ALB listener to the green target group provides an immediate cutover at the ALB layer and avoids DNS propagation or TTL delays.
Perform an all-at-once deployment to the blue Auto Scaling group, then update the Route 53 alias to an ALB endpoint for the green target group is incorrect because it targets the wrong environment and a Route 53 alias to an ALB cannot select a specific target group. Changing DNS would also rely on TTL and delay a full cutover.
Use an AWS CLI command to switch the ALB listener to the green target group first, then run a rolling restart on the green Auto Scaling group to deploy the new build is incorrect because switching the listener before the green fleet is ready sends live traffic to instances that may not have the new software or may fail health checks.
Run a rolling restart on the green Auto Scaling group to deploy the new build, then change the Route 53 alias to point to the green environment on the ALB is incorrect because DNS cannot target a specific target group and changing the alias would introduce TTL related delay. The listener update on the ALB is the proper way to achieve an instant cutover.
Remember that when a single ALB fronts both versions you perform the cutover at the ALB listener for an instant switch and to avoid DNS TTL delays.
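A hedged sketch of the cutover step using boto3 rather than the raw CLI; the listener and target group ARNs are placeholders, and the waiter simply confirms green targets are in service before the switch.

```python
# Hedged sketch: once the green fleet is healthy, repoint the ALB listener's
# default action at the green target group. ARNs are placeholders.
import boto3

elbv2 = boto3.client("elbv2")

LISTENER_ARN = "arn:aws:elasticloadbalancing:...:listener/app/web/abc/def"   # placeholder
GREEN_TG_ARN = "arn:aws:elasticloadbalancing:...:targetgroup/green/123"      # placeholder

# Wait until the green targets report healthy before switching traffic.
waiter = elbv2.get_waiter("target_in_service")
waiter.wait(TargetGroupArn=GREEN_TG_ARN)

# The listener change takes effect immediately, so no DNS TTL is involved.
elbv2.modify_listener(
    ListenerArn=LISTENER_ARN,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": GREEN_TG_ARN}],
)
```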
A fintech startup runs a critical service on Amazon EC2 instances in an Auto Scaling group. A lightweight health probe on each instance runs every 5 seconds to verify that the application responds. The DevOps engineer must use the probe results for monitoring and to raise an alarm when failures occur. Metrics must be captured at 1 minute intervals while keeping costs low. What should the engineer implement?
-
✓ C. Create a custom CloudWatch metric and publish statistic sets that roll up the 5 second results, sending one update every 60 seconds
The correct choice is Create a custom CloudWatch metric and publish statistic sets that roll up the 5 second results, sending one update every 60 seconds. This option satisfies the requirement to record metrics at one minute intervals while keeping ingestion and storage costs low.
Create a custom CloudWatch metric and publish statistic sets that roll up the 5 second results, sending one update every 60 seconds works because you aggregate the high frequency probe locally on each instance and send a single PutMetricData call per minute using the StatisticValues payload. This approach reduces the number of PutMetricData requests and retains the required one minute granularity so alarms can be evaluated without incurring high costs from high resolution ingestion.
Use a default CloudWatch metric at standard resolution and add a dimension so the script can publish once every 60 seconds is incorrect because default metrics are emitted by AWS services and you cannot push your application script output into them. Adding a dimension only annotates an existing metric and does not enable publishing custom application data.
Amazon CloudWatch Synthetics is not the best fit because Synthetics runs external canaries or browser tests and usually incurs additional cost compared to batching custom metric publishes. Synthetics also does not leverage the lightweight in-instance probe that the startup already runs every five seconds.
Use a custom CloudWatch metric at high resolution and push data every 5 seconds is unnecessary for this requirement because the desired visibility is one minute. Pushing high resolution metrics every five seconds increases PutMetricData calls and cost, so reserve high resolution for true sub minute alarm needs.
When you sample frequently but need only 1 minute visibility aggregate locally and publish a statistic set once per minute to cut PutMetricData calls and cost.
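A minimal sketch of that pattern, assuming a hypothetical namespace, metric name, and health endpoint for the probe; twelve 5 second samples are rolled into one statistic set and published once per minute.

```python
# Hedged sketch: aggregate 5 second probe results locally and publish a single
# statistic set per minute with one PutMetricData call.
import time
import urllib.request
import boto3

cloudwatch = boto3.client("cloudwatch")

def run_probe():
    """Placeholder probe: latency in ms for a hypothetical local health endpoint."""
    start = time.monotonic()
    try:
        urllib.request.urlopen("http://localhost:8080/health", timeout=2)
    except Exception:
        pass  # a real probe would record failures as a separate metric
    return (time.monotonic() - start) * 1000.0

def publish_minute_rollup(samples):
    """Send one statistic set summarizing the last minute of probe results."""
    cloudwatch.put_metric_data(
        Namespace="Startup/HealthProbe",               # hypothetical namespace
        MetricData=[{
            "MetricName": "ProbeLatencyMs",
            "StatisticValues": {
                "SampleCount": len(samples),
                "Sum": sum(samples),
                "Minimum": min(samples),
                "Maximum": max(samples),
            },
            "Unit": "Milliseconds",
        }],
    )

buffer = []
while True:
    buffer.append(run_probe())
    time.sleep(5)
    if len(buffer) >= 12:                              # twelve 5 second samples per minute
        publish_minute_rollup(buffer)
        buffer = []
```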
You are a DevOps Engineer at a fintech startup where your application is deployed via AWS CloudFormation into an Auto Scaling group. You have created a new launch configuration that moves to a newer instance family, and the group currently runs 8 instances while at least 5 must remain in service during the update. In the template, which configuration will update the group’s instances in batches so they adopt the new launch configuration while maintaining the required in-service capacity?
-
✓ C. AutoScalingRollingUpdate
The correct choice is AutoScalingRollingUpdate. This UpdatePolicy directs CloudFormation to replace instances in the Auto Scaling group in controlled batches and it lets you set MinInstancesInService and MaxBatchSize so you can keep at least five instances serving traffic while the group adopts the new launch configuration.
The AutoScalingRollingUpdate policy performs a rolling replacement of instances so capacity can be preserved during the update. You tune the batch size and the minimum in service to ensure the Auto Scaling group never drops below the required number of healthy instances while new instances launch with the updated configuration.
AutoScalingReplacingUpdate is not appropriate because it creates a replacement Auto Scaling group rather than rolling instances in controlled batches, which can jeopardize the in-service capacity requirement.
AWS CodeDeploy is a deployment service that manages application deployments and hooks into instances, and it is not a CloudFormation UpdatePolicy for controlling how an Auto Scaling group updates within a template.
AutoScalingLaunchTemplateUpdate is not a valid CloudFormation UpdatePolicy name and so it cannot be used to implement the rolling update behavior required for this scenario.
When you must preserve capacity during an Auto Scaling group update remember that an UpdatePolicy with configurable MinInstancesInService and MaxBatchSize enables rolling updates that keep instances serving traffic.
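For reference, here is a JSON-style UpdatePolicy fragment for the Auto Scaling group resource, written as a Python dict for readability. MinInstancesInService of 5 comes straight from the requirement, while the MaxBatchSize of 3 and the optional PauseTime and WaitOnResourceSignals settings are illustrative choices.

```python
# Hedged sketch: the UpdatePolicy block that would sit on the
# AWS::AutoScaling::AutoScalingGroup resource in the template.
update_policy = {
    "UpdatePolicy": {
        "AutoScalingRollingUpdate": {
            "MinInstancesInService": 5,     # never drop below 5 healthy instances
            "MaxBatchSize": 3,              # replace at most 3 of the 8 at a time
            "PauseTime": "PT5M",            # optional wait between batches
            "WaitOnResourceSignals": True,  # optional: require cfn-signal from new instances
        }
    }
}
```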
Northwind Diagnostics runs its customer portals on Amazon EC2 and wants a managed approach that continuously identifies software vulnerabilities and unexpected network exposure on those instances. The security team also needs a centralized audit trail of all user logins to the servers kept for 90 days. Which solution best satisfies these needs?
-
✓ C. Deploy Amazon Inspector for EC2 vulnerability and exposure scans, install the CloudWatch Agent to forward login logs to CloudWatch Logs, and send CloudTrail events to CloudWatch Logs for centralized auditing
Deploy Amazon Inspector for EC2 vulnerability and exposure scans, install the CloudWatch Agent to forward login logs to CloudWatch Logs, and send CloudTrail events to CloudWatch Logs for centralized auditing is correct because it provides continuous vulnerability and network exposure scanning for EC2 and a way to centralize and retain OS login and API audit logs for 90 days.
Amazon Inspector continuously evaluates EC2 instances for known CVEs and unintended network reachability so it meets the vulnerability and exposure requirement. CloudWatch Agent can forward operating system login files such as auth.log and /var/log/secure to CloudWatch Logs where you can set a 90 day retention policy. CloudTrail events delivered to CloudWatch Logs give a centralized record of API activity that you can keep alongside the instance login trails for comprehensive auditing.
Configure Amazon GuardDuty with Amazon EventBridge and AWS Lambda for automated remediation is incorrect because Amazon GuardDuty focuses on threat detection and account or network anomalies and it does not perform full host vulnerability scanning or collect OS login files.
Use AWS Systems Manager SSM Agent to detect vulnerabilities on EC2 and run an Automation runbook to patch them is incorrect because SSM Agent and Patch Manager can assess and apply patches but they are not a dedicated vulnerability management service that continuously scans for CVEs and unintended network exposure the way Inspector does.
Enable Amazon ECR image scanning with EventBridge notifications and route CloudTrail data to EventBridge for processing is incorrect because ECR image scanning only covers container image CVEs and does not scan running EC2 hosts or collect server OS login events.
Remember the difference between threat detection and vulnerability management and match the service to the requirement in the question. For OS login auditing think CloudWatch Logs or SSM Session Manager logging and set the retention policy to meet the retention requirement.
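A small boto3 sketch of the retention piece, with placeholder log group names; the same 90 day retention applies to the login log group fed by the CloudWatch Agent and to the log group receiving CloudTrail events.

```python
# Hedged sketch: set 90 day retention on the audit log groups once the
# CloudWatch Agent and CloudTrail deliveries are in place.
import boto3

logs = boto3.client("logs")

for log_group in ("/ec2/portal/ssh-logins", "/cloudtrail/organization"):  # placeholders
    logs.put_retention_policy(
        logGroupName=log_group,
        retentionInDays=90,   # 90 is one of the allowed retention values
    )
```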
All DevOps questions come from certificationexams.pro and my Certified DevOps Engineer Udemy course.
Meridian Assurance, a global insurance firm, is rolling out centralized compliance controls across its AWS organization. Every API invocation in all member accounts must be captured for audits, and the company relies on AWS CloudTrail to record activity and flag sensitive operations. Leadership has asked the platform team to implement an automated guardrail so that if CloudTrail logging is turned off in any account, it is quickly turned back on with minimal interruption to log delivery. Which approach best meets this requirement?
-
✓ B. Deploy the cloudtrail-enabled AWS Config managed rule with a 30-minute periodic evaluation and use an EventBridge rule for AWS Config compliance changes to invoke a Lambda function that calls StartLogging on the affected trail
Deploy the cloudtrail-enabled AWS Config managed rule with a 30-minute periodic evaluation and use an EventBridge rule for AWS Config compliance changes to invoke a Lambda function that calls StartLogging on the affected trail is correct because it uses AWS Config to detect when CloudTrail logging is not enabled and it pairs that detection with an automated remediation that calls StartLogging to resume logging quickly.
Deploy the cloudtrail-enabled AWS Config managed rule with a 30-minute periodic evaluation and use an EventBridge rule for AWS Config compliance changes to invoke a Lambda function that calls StartLogging on the affected trail works well because the managed rule supports scheduled evaluations and does not depend on CloudTrail events that might be missing when logging is off. Tying AWS Config compliance change events to EventBridge lets you invoke a Lambda function automatically and the Lambda can call the CloudTrail StartLogging API to restore logging with minimal interruption to log delivery.
Create a CloudWatch Logs metric filter for StopLogging events and alarm to an SNS topic for notifications only is incorrect because it only sends notifications and requires human intervention to fix logging. Notifications alone do not meet the requirement to automatically turn CloudTrail back on quickly.
Use EventBridge with a Lambda scheduled every two minutes to query CloudTrail and if DeleteTrail is detected, recreate it with CreateTrail is incorrect because the failure mode to address is StopLogging rather than DeleteTrail and the appropriate API is StartLogging. Polling also adds unnecessary delay and operational complexity compared with event driven remediation.
Enable the cloudtrail-enabled AWS Config rule with a Configuration changes trigger and rely on its default automatic remediation is incorrect because this managed rule is supported only for periodic evaluations and it does not provide built in automatic remediation unless you implement a remediation path such as EventBridge plus Lambda.
Pair the cloudtrail-enabled AWS Config managed rule with an EventBridge rule and a Lambda that calls StartLogging so remediation is automated and downtime is minimized.
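A hedged sketch of that remediation Lambda: rather than parsing a trail name out of the event, it checks every trail's status and restarts any that stopped logging, which keeps the function safe to invoke from the Config compliance change rule.

```python
# Hedged sketch: restore CloudTrail logging on any trail that has been stopped.
import boto3

cloudtrail = boto3.client("cloudtrail")

def handler(event, context):
    restarted = []
    for trail in cloudtrail.describe_trails()["trailList"]:
        arn = trail["TrailARN"]
        status = cloudtrail.get_trail_status(Name=arn)
        if not status["IsLogging"]:
            cloudtrail.start_logging(Name=arn)   # the StartLogging API call
            restarted.append(arn)
    return {"restarted": restarted}
```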
Orion Byte Labs runs several services with a MERN front end behind NGINX and uses AWS CodeDeploy to automate rollouts. The team has a QA deployment group and will add PREPROD and PRODUCTION groups later. They want the NGINX log level to be set dynamically at deploy time so each group can have different verbosity without creating separate application revisions or maintaining different scripts per environment. Which approach provides the lowest ongoing management effort and avoids multiple script variants?
-
✓ B. Use a single script that reads the DEPLOYMENT_GROUP_NAME environment variable in CodeDeploy and call it in the BeforeInstall hook to set NGINX logging per group
Use a single script that reads the DEPLOYMENT_GROUP_NAME environment variable in CodeDeploy and call it in the BeforeInstall hook to set NGINX logging per group is correct because it uses what CodeDeploy already exposes and lets you apply configuration before services start.
This method avoids extra AWS API calls and credentials and it prevents having multiple application revisions or many script variants to maintain. Reading the deployment group name inside a single script called in the BeforeInstall hook allows the log level to be set early and keeps the deployment simple and low maintenance.
Invoke a script during ApplicationStart that uses the DEPLOYMENT_GROUP_ID environment variable to detect the group and update the NGINX log level is less desirable because ApplicationStart runs later in the lifecycle and configuring after services may have started increases risk. Using the numeric group ID adds complexity when the group name is already available.
Tag each EC2 instance with its deployment group and have a ValidateService hook script call aws ec2 describe-tags to choose the log level introduces additional tag management and requires AWS CLI calls during deployment. ValidateService runs late in the lifecycle and this approach increases operational overhead compared with using built in variables.
Define a custom environment variable in CodeDeploy for each environment such as QA, PREPROD, and PROD, and have a ValidateService hook script read it to set the log level is not a practical choice because CodeDeploy hook scripts rely on predefined environment variables and do not provide arbitrary per-deployment custom variables for scripts. This also uses the late ValidateService event which is not ideal for pre-start configuration.
Use DEPLOYMENT_GROUP_NAME in a single deploy script and run it in BeforeInstall so NGINX logging is set before services start.
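One way such a script could look, sketched in Python with hypothetical log levels and config path; the appspec.yml would call it from the BeforeInstall hook, and NGINX is reloaded later in the lifecycle.

```python
#!/usr/bin/env python3
# Hedged sketch of a single BeforeInstall hook script that maps the CodeDeploy
# deployment group name to an NGINX error_log level. Paths and levels are
# placeholders.
import os
import subprocess

LOG_LEVELS = {
    "QA": "debug",
    "PREPROD": "info",
    "PRODUCTION": "error",
}

# DEPLOYMENT_GROUP_NAME is provided by CodeDeploy to lifecycle hook scripts.
group = os.environ.get("DEPLOYMENT_GROUP_NAME", "QA")
level = LOG_LEVELS.get(group, "warn")

# Write a small include file rather than editing nginx.conf in place.
with open("/etc/nginx/conf.d/log_level.conf", "w") as f:
    f.write(f"error_log /var/log/nginx/error.log {level};\n")

# Validate the configuration; the reload happens later in the deployment.
subprocess.run(["nginx", "-t"], check=True)
```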
A DevOps specialist at Norstar Media operates a web application behind an Application Load Balancer across three Availability Zones in one AWS Region. Several EC2 instances in a single zone have begun failing health checks and returning errors. The specialist must isolate that Availability Zone and shift client requests to the healthy zones with minimal changes to the stack. What should be implemented?
-
✓ B. Turn off cross-zone load balancing on the ALB and use Amazon Route 53 Application Recovery Controller to start a zonal shift away from the impaired Availability Zone
Turn off cross-zone load balancing on the ALB and use Amazon Route 53 Application Recovery Controller to start a zonal shift away from the impaired Availability Zone is correct because it prevents the load balancer from routing new requests into the failing Availability Zone and it gives operators a controlled way to move traffic away from that zone to healthy ones.
This works in practice because disabling cross-zone distribution on the ALB stops it from balancing traffic into the degraded zone and the Amazon Route 53 Application Recovery Controller provides a zonal shift mechanism that drains and redirects traffic at the AZ level without requiring major stack changes. That combination lets you isolate the affected AZ quickly while keeping existing target groups and configuration intact.
Configure Auto Scaling health checks to replace unhealthy instances in the degraded zone is insufficient because replacing instances takes time and it does not immediately stop the load balancer from sending traffic to targets still associated with the impaired AZ.
Enable cross-zone load balancing and set Amazon Route 53 failover routing so requests are spread evenly across all zones is wrong because cross-zone distribution will continue to send traffic into the unhealthy zone and Route 53 failover routing does not provide the per-Availability Zone drain and shift control that ARC offers.
AWS Global Accelerator is not the right choice for this problem because it optimizes global client ingress and regional failover patterns and it does not provide the intra-region, per-AZ control needed to bypass a single impaired Availability Zone.
When an entire Availability Zone is degraded prefer a zonal shift tool to move traffic off that AZ and verify your load balancer settings will not counteract the shift.
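A rough boto3 sketch of starting the zonal shift; the load balancer ARN is a placeholder, the awayFrom value is the Availability Zone ID of the impaired zone, and cross-zone load balancing is assumed to be turned off as described above.

```python
# Hedged sketch: shift traffic away from the impaired AZ for the ALB using the
# ARC zonal shift API. Identifiers are placeholders.
import boto3

zonal_shift = boto3.client("arc-zonal-shift")

zonal_shift.start_zonal_shift(
    resourceIdentifier="arn:aws:elasticloadbalancing:...:loadbalancer/app/web/abc",  # placeholder
    awayFrom="use1-az1",   # Availability Zone ID of the degraded zone
    expiresIn="3h",        # the shift expires automatically unless extended
    comment="Shift traffic away from impaired AZ",
)
```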
DevOps exam content domains and weights
The blueprint is organized into six domains that reflect how DevOps engineers deliver value on AWS. Each domain contributes a percentage to your total score so you should allocate study time accordingly. If your background is primarily development, supplement with Scrum Master and Product Owner resources to sharpen delivery flow and release planning. Builders coming from ML should also glance at AWS ML topics and even AI Practitioner for responsible AI context that increasingly intersects with DevOps.
- Domain 1 SDLC automation – 22 percent
- Domain 2 configuration management and infrastructure as code – 17 percent
- Domain 3 resilient cloud solutions – 15 percent
- Domain 4 monitoring and logging – 15 percent
- Domain 5 incident and event response – 14 percent
- Domain 6 security and compliance – 17 percent
Domain 1 SDLC automation
This domain focuses on building robust delivery pipelines and choosing safe deployment strategies. If you like structured drills, timed sets on platforms such as Udemy practice exams help you internalize AWS exam phrasing even when the course title targets another cert.
Implement CI and CD pipelines
You configure version control integration, build stages, artifact management, and automated deployments that span single and multi account environments. The mindset transfers from associate tracks like Developer Associate and Solutions Architect Associate.
- Connect repositories and trigger builds and tests on pull requests and merges
- Generate and promote artifacts through environments
- Choose deployment patterns that reduce risk and enable rollback
Integrate automated testing
You position unit, integration, acceptance, security, and performance tests at the right stages in the pipeline. Teams that embrace Scrum practices often find test placement and definition of done easier to standardize.
- Gate promotions on test outcomes and health signals
- Exercise applications at scale and interpret exit codes and metrics
Build and manage artifacts
You create secure artifact repositories and control lifecycles and provenance. These patterns complement data movement concerns that appear on Data Engineer and GCP Data Engineer Professional.
- Produce container images, function bundles, and AMIs through repeatable builds
- Sign, scan, and promote artifacts across accounts
Deploy to instances, containers, and serverless
You select deployment approaches that fit each runtime and business need. The blue green and canary release patterns will also appear in GCP DevOps Engineer prep, which makes them worth mastering.
- Use blue green or canary for safe releases
- Automate agent configuration and troubleshoot rollout failures
Domain 2 configuration management and infrastructure as code
This domain covers how you define, standardize, and govern cloud infrastructure at scale. If your long term goal includes architecture leadership, align these skills with the guidance in Solutions Architect Professional.
Define cloud infrastructure with reusable components
You compose templates and modules that encode security controls, guardrails, and best practices.
- Model networks, compute, storage, and policies as code
- Deploy stacks consistently across accounts and Regions
Automate account provisioning and governance
You standardize account creation and baseline configuration for multi account setups. The same governance ideas show up on GCP Workspace Administrator and GCP Security Engineer.
- Apply organization structures and service control policies
- Enable configuration recording, drift detection, and change approval flows
Automate operations at scale
You build runbooks and workflows that keep fleets configured, patched, and compliant. For end to end delivery literacy consider complementing with GCP Developer Professional or GCP Data Practitioner resources.
- Automate inventory, patching, and state management
- Orchestrate complex tasks with event driven functions and workflows
Domain 3 resilient cloud solutions
This domain validates your ability to keep systems highly available, scalable, and recoverable. Many resilience patterns overlap with architect content, so browsing Solutions Architect Associate summaries can help.
Design for high availability and fault tolerance
You translate business objectives into technical resilience and remove single points of failure.
- Use multi Availability Zone and multi Region patterns where needed
- Implement health checks, graceful degradation, and failover routing
Scale elastically
You select the right auto scaling, load balancing, and caching strategies for each layer.
- Scale instances, containers, serverless functions, and data services with appropriate metrics
- Design loosely coupled and distributed architectures for growth
Automate recovery and disaster readiness
You meet recovery time and recovery point targets with tested procedures. If you also study analytics stacks, compare backup choices with those seen in GCP Database Engineer Pro.
- Choose backup and recovery patterns such as pilot light and warm standby
- Practice failovers and document restoration workflows
Domain 4 monitoring and logging
This domain ensures you can observe complex systems and turn signals into action. The mental models carry nicely to GCP Network Engineer troubleshooting and GCP Associate Engineer operations.
Collect, aggregate, and store telemetry
You capture logs and metrics securely with the right retention and encryption.
- Create custom metrics and metric filters from logs
- Stream telemetry to analytics and long term storage
Audit and analyze signals
You build dashboards, anomaly alarms, and queries that reveal health and trends.
- Correlate traces, metrics, and logs to pinpoint issues
- Use managed analytics to search and visualize events
Automate monitoring and event management
You connect events to notifications and remediation without manual effort.
- Trigger alerts, functions, and self healing actions on thresholds and patterns
- Install and manage agents safely across fleets
Domain 5 incident and event response
This domain focuses on how you detect, triage, and resolve operational issues. Many event driven patterns echo what you will use in ML pipelines or AI agents so a scan of AI Practitioner and GCP Generative AI Leader can broaden your perspective.
Process events and notify the right channels
You integrate native event sources and build fan out pipelines for processing.
- Capture platform health, audit trails, and service events
- Drive queues, streams, and workflows to coordinate actions
Apply configuration changes safely
You modify infrastructure and application settings in response to events without creating new risk.
- Roll back misconfigurations quickly
- Automate fleet wide updates through managed services
Troubleshoot and perform root cause analysis
You analyze failed deployments, scaling behaviors, and timeouts with structured methods. For deeper architecture tradeoffs, the pro level architect guide at Solutions Architect Professional is a helpful companion.
- Use traces, metrics, logs, and health data to isolate faults
- Document findings and preventive actions for future resilience
Domain 6 security and compliance
This domain confirms that you can secure identities, data, and networks at scale and prove it with evidence. Security topics frequently overlap with GCP Security Engineer and AI governance elements from AI Practitioner.
Manage identity and access at scale
You design least privilege access for humans and machines across many accounts.
- Apply roles, boundaries, and federation patterns
- Rotate credentials and enforce strong authentication practices
Automate security controls and data protection
You implement layered defenses and privacy controls through automation. If your org builds ML services, align controls with practices referenced in the AWS ML Specialty and GCP ML Engineer Pro tracks.
- Combine network controls, certificates, and encryption for defense in depth
- Discover sensitive data and protect it at rest and in transit
Monitor and audit security continuously
You collect evidence, detect threats, and alert on anomalous behavior. These habits also strengthen success in AWS Security exams.
- Enable audit trails and configuration recording
- Analyze findings and integrate remediation into workflows
Out of scope tasks
The exam does not require expert level routing design, deep database optimization, or full stack application development. You are not expected to provide deep security architecture reviews to developers or write complex application code. Focus on DevOps delivery, operations, and automation on AWS.
All DevOps questions come from certificationexams.pro and my Certified DevOps Engineer Udemy course.
How to prepare
A strong plan improves both knowledge and exam judgment. Combine guided study with hands on practice and timed drills.
For technique breakdowns and common exam patterns, the Cameron McKenzie YouTube channel has lots to offer.
When you want extra practice with exam style phrasing, timed sets like this Udemy practice series can help build speed.
As you progress, branch to adjacent guides such as Cloud Practitioner, Solutions Architect Associate, and the GCP DevOps Engineer to cross check patterns.
- Start with the official DOP C02 guide and list of in scope services
- Map each task statement to labs for pipelines, IaC, observability, and incident response
- Build small projects that exercise blue green and canary deployments, multi account governance, and automated remediation
- Create dashboards, alarms, and alerts and test them with synthetic traffic
- Take full length practice exams and write out why wrong options are wrong
- Review security and reliability best practices and read service quotas and limits
If you cover every domain with real practice and test yourself under time pressure, you will be ready to pass and to operate production systems with confidence.
| Jira, Scrum & AI Certification |
|---|
| Want to get certified on the most popular software development technologies of the day? These resources will help you get Jira certified, Scrum certified and even AI Practitioner certified so your resume really stands out. You can even get certified in the latest AI, ML and DevOps technologies. Advance your career today. |
Cameron McKenzie is an AWS Certified AI Practitioner, Machine Learning Engineer, Copilot Expert, Solutions Architect and author of many popular books in the software development and Cloud Computing space. His growing YouTube channel training devs in Java, Spring, AI and ML has well over 30,000 subscribers.
Other AWS Certification Books
If you want additional certifications and career momentum, explore this series:
- AWS Certified Cloud Practitioner Book of Exam Questions — pair with the roadmap at Cloud Practitioner topics.
- AWS Certified Developer Associate Book of Exam Questions — cross-check with Developer Certification guides.
- AWS Certified AI Practitioner Book of Exam Questions & Answers — align with AI Practitioner exam objectives and ML exam services.
- AWS Certified Machine Learning Associate Book of Exam Questions — a bridge toward ML Specialty Certification.
- AWS Certified DevOps Professional Book of Exam Questions — complements DevOps Certification Exam study.
- AWS Certified Data Engineer Associate Book of Exam Questions — use with Data Engineer content.
- AWS Certified Solutions Architect Associate Book of Exam Questions — see the companion AWS Solutions Architect Certification track.
For multi-cloud awareness, compare with GCP paths such as ML Engineer Professional, Developer Professional, Data Engineer Professional, Security Engineer, DevOps Engineer, Network Engineer, Associate Cloud Engineer, and leadership tracks like Generative AI Leader and Solutions Architect Professional.
