Azure Data Fundamentals Questions and Answers

DP-900 Azure Data Fundamentals Exam Topics

Over the past few months, I have been helping software developers, solutions architects, cloud engineers, and even Scrum Masters learn core Azure data concepts and earn the certifications they need to stay competitive in a rapidly evolving industry.

One of the most respected introductory Azure data certifications is the DP-900 Microsoft Certified Azure Data Fundamentals.

So how do you pass the DP-900 certification? You practice by using DP-900 exam simulators, going over sample DP-900 test questions, and taking online DP-900 practice exams like this one.

Keep practicing until you can consistently answer data storage, processing, and analytics questions with confidence.

DP-900 Data Fundamentals Practice Questions

In helping students prepare for this exam, I have identified a number of commonly misunderstood DP-900 topics that tend to appear in practice questions, which is why this set of DP-900 questions and answers was created. If you can answer these correctly, you are well on your way to passing the exam.

One important note: these are not DP-900 exam dumps. There are plenty of braindump websites that focus on cheating, but there is no value in earning a certification without real knowledge. These questions are representative of the DP-900 exam style and subjects but are not duplicates of real exam content.

Now here are the DP-900 practice questions and answers. Good luck!

Data Fundamentals Sample Questions

What approach lets you restrict which external IP addresses can reach your Cosmos DB account over the public internet?

  • ❏ A. Disable public network connectivity for the Cosmos DB account

  • ❏ B. Rotate the account primary keys to invalidate existing credentials

  • ❏ C. Apply an IP address allow list using an IP firewall policy

  • ❏ D. Use a managed network firewall to inspect and block incoming traffic

A retail analytics team at NovaMart needs an Azure offering that applies machine learning to identify unusual activity and repeating patterns in time based datasets. Which Azure service should they choose?

  • ❏ A. Azure Stream Analytics

  • ❏ B. Azure Machine Learning

  • ❏ C. Google Cloud Monitoring

  • ❏ D. Azure Anomaly Detector

After provisioning an Azure SQL database which login will always be able to connect to the database?

  • ❏ A. Azure Active Directory administrator account for the server

  • ❏ B. Server admin login of the logical server

  • ❏ C. Microsoft Entra ID account that created the database

  • ❏ D. The traditional sa SQL account

How does the concept of “data privacy” differ from “data confidentiality” in practical terms?

  • ❏ A. Data privacy and data confidentiality are simply two names for the same idea

  • ❏ B. Data privacy is a compliance responsibility for the customer while data confidentiality is a technical duty for the cloud vendor

  • ❏ C. Data privacy concerns the proper and lawful handling of personal information while data confidentiality concerns preventing unauthorized access to any sensitive information

  • ❏ D. Data privacy only applies to personally identifiable fields while data confidentiality only applies to encrypted datasets

A retail reporting team at Summit Analytics is deciding whether to create a paginated report for detailed sales receipts and inventory listings. Which scenario would justify building a paginated report?

  • ❏ A. The report is expected to produce about 250000 pages

  • ❏ B. It will be viewed interactively in Looker Studio

  • ❏ C. It needs exact page layout for printing or fixed file exports

  • ❏ D. It should be a visual summary with charts and interactive elements

Why might an engineer pick Azure Cosmos DB Table API over Azure Table Storage for a globally distributed key value store that requires very low latency?

  • ❏ A. Service level agreement of 99.995% availability

  • ❏ B. Cloud Bigtable

  • ❏ C. SDKs available for many programming languages and platforms

  • ❏ D. Provides single digit millisecond read and write latency under 8 ms at the 99th percentile worldwide

Which Azure service provides real time analytics for high velocity event streams?

  • ❏ A. Azure Databricks

  • ❏ B. Google Cloud Dataflow

  • ❏ C. Azure Stream Analytics

  • ❏ D. Azure Synapse Analytics

An analytics team at Harbor Analytics is tuning a sizable Azure SQL Database table named sales_archive_2025 which contains many columns that are seldom referenced in queries. Which indexing approach will best improve overall query performance while avoiding excessive index maintenance?

  • ❏ A. Create nonclustered indexes on every seldom used column

  • ❏ B. Avoid adding indexes for the infrequently accessed columns

  • ❏ C. Create filtered indexes that target specific subsets of rows for the infrequently used columns

  • ❏ D. Create a clustered index on each of the rarely used columns

In the context of Azure reporting at Evergreen Analytics, which term best fills the blank in the following sentence? "[?] are charts of coloured rectangles with each rectangle sized to show the relative value of items, and they can include nested rectangles to show hierarchy."

  • ❏ A. Key influencers

  • ❏ B. Filled map

  • ❏ C. Matrix

  • ❏ D. Treemap visual

A retail analytics firm named Meridian Analytics wants to automate provisioning a nonrelational database on Google Cloud Platform. Which provisioning approach uses a script made of commands that you can run from any operating system shell such as Linux macOS or Windows?

  • ❏ A. Deployment Manager templates

  • ❏ B. gcloud command line or Cloud SDK

  • ❏ C. Cloud Console

  • ❏ D. Client libraries

A regional retailer that uses Contoso Cloud storage plans to classify datasets by usage patterns. What is the primary characteristic that differentiates hot data from cold data?

  • ❏ A. Hot data tends to be more expensive to retain while cold data normally costs less to store

  • ❏ B. Hot data is structured while cold data is unstructured

  • ❏ C. Cold data is accessed infrequently while hot data is accessed frequently and needs rapid availability

  • ❏ D. Hot data is stored in primary cloud storage while cold data is kept on local on premise servers

A regional retail chain named Meridian Retail depends on data modeling and visualizations for business insight and needs to combine datasets from several sources to produce an interactive report. Which tool should be used to import data from multiple data sources and author the report?

  • ❏ A. Azure Data Factory

  • ❏ B. Looker Studio

  • ❏ C. Power BI Desktop

  • ❏ D. Power BI Mobile app

In the context of Microsoft Azure, which job title best completes this sentence? A(n) [?] operates database systems, grants user access, maintains backup copies of data, and recovers data after failures.

  • ❏ A. Data Engineer

  • ❏ B. Cloud SQL

  • ❏ C. Database Administrator

  • ❏ D. Data Analyst

What is the main function of a data warehouse appliance that a cloud vendor delivers to enterprises for analytics?

  • ❏ A. A serverless managed analytics service like BigQuery

  • ❏ B. A preconfigured integrated hardware and software system optimized for analytics

  • ❏ C. A repository for storing massive raw datasets in multiple formats

  • ❏ D. A tool that orchestrates ingestion and transformation workflows

A regional e commerce company named ClearWave assigns a high number of Request Units per second to an Azure Cosmos DB container for a new product catalog service. What consequence will this overallocation have on their account and on database performance?

  • ❏ A. Provisioned RU/s do not affect billing or performance

  • ❏ B. You are billed for the provisioned RU/s regardless of consumption

  • ❏ C. Cosmos DB will automatically decrease allocated RU/s when usage drops

  • ❏ D. Query throughput automatically scales in line with the provisioned RU/s and improves performance proportionally

A logistics analytics team at Harborline needs a live dashboard that displays operational metrics from their Azure data pipeline so managers can monitor performance as it happens. Which Azure service should they use to build and share interactive dashboards and reports?

  • ❏ A. Azure Databricks

  • ❏ B. Azure Data Factory

  • ❏ C. Azure Synapse Analytics

  • ❏ D. Power BI

Which configuration option for a Cosmos database account can only be chosen when the account is first created?

  • ❏ A. Multi region writes

  • ❏ B. API selection for the account

  • ❏ C. Geo replication regions

  • ❏ D. Account provisioning tier for production or development

You are administering a relational database hosted in a cloud SQL instance for Orion Systems and you must ensure that transactions are atomic, consistent, isolated and durable. Which database property enforces those ACID guarantees?

  • ❏ A. Database normalization

  • ❏ B. Transaction logging and write ahead logs

  • ❏ C. Transaction isolation level settings

  • ❏ D. High availability and failover

A small startup named LumenChat is building an instant messaging platform and requires message storage that supports extremely fast reads and writes to serve live conversations. Which Azure data service best fits this requirement?

  • ❏ A. Azure Cache for Redis

  • ❏ B. Azure Blob Storage

  • ❏ C. Azure SQL Database

  • ❏ D. Azure Cosmos DB

Fill in the blank in this Power BI sentence for Contoso Cloud. When you are prepared to share a single page from a Power BI report or a group of visualizations you create a [?]. A Power BI [?] is a single page collection of visuals that you can distribute to other users?

  • ❏ A. Report

  • ❏ B. BigQuery table

  • ❏ C. Dashboard

  • ❏ D. Dataset

A data engineering group is comparing Azure Synapse Analytics and Azure Databricks for Spark based workloads and wants to know which capability Synapse provides that Databricks does not offer?

  • ❏ A. Capability to dynamically scale compute resources to meet changing workloads

  • ❏ B. Native connector integration with Google BigQuery

  • ❏ C. Support for T SQL based analytics that will be familiar to SQL Server developers

  • ❏ D. Ability to execute distributed parallel processing across clusters

A regional ecommerce startup named MeadowMart stores product records in a Cosmos DB container that uses a partition key called item_type. You need to fetch all products that belong to a specific item_type. What Cosmos DB operation will most efficiently return those items?

  • ❏ A. Use the change feed to track updates in the container

  • ❏ B. Filter documents by the partition key value

  • ❏ C. BigQuery

  • ❏ D. Scan every document in the container

When protecting information for a cloud application how would you define “data at rest”?

  • ❏ A. Data moving across a network between systems

  • ❏ B. Data stored persistently on disks, optical media, or cloud object storage

  • ❏ C. Data held temporarily in a system’s RAM or CPU cache

  • ❏ D. Data queued in a write buffer while awaiting final persistence

A regional retailer has several data flows and needs to assign the correct workload pattern to each flow. Which workload type best matches each scenario?

  • ❏ A. Streaming workload for the product index that is refreshed every 24 hours to a data warehouse, micro batch workload for customer purchases that are pushed to the warehouse as they occur, and batch workload for inventory reconciliations that are processed after every 1,200 transactions

  • ❏ B. Batch workload for the product catalog that is loaded every 24 hours to a data warehouse, streaming workload for online purchases loaded to the warehouse as they occur, and micro batch workload for inventory updates applied after every 1,200 transactions

  • ❏ C. Micro batch workload for the product catalog that is refreshed every 24 hours to a data warehouse, batch workload for online purchases that are accumulated before loading, and streaming workload for inventory updates that are emitted after every 1,200 transactions

In what situation would an engineer use an Azure Resource Manager template to deploy infrastructure and configuration in a repeatable way?

  • ❏ A. Assign complex access permissions to existing Azure resources

  • ❏ B. Automate the repeatable deployment of an interdependent set of Azure resources

  • ❏ C. Provision whole Azure subscriptions and enforce tenant level policies across several tenants

  • ❏ D. Use Google Cloud Deployment Manager instead of an Azure template

Which type of database exhibits these drawbacks? Transactional updates that span multiple entities are not guaranteed because consistency must be considered. There is no referential integrity and relationships between rows must be managed outside the table. Filtering or sorting on fields that are not keys is difficult and queries on non key fields can force full table scans?

  • ❏ A. Cloud Bigtable

  • ❏ B. Contoso Table Storage

  • ❏ C. Cloud SQL for PostgreSQL

  • ❏ D. Cloud Spanner

  • ❏ E. Cloud SQL for MySQL

The analytics team at Meridian Outfitters plans to build a modern data warehouse to store and analyze very large volumes of structured and semi structured information. Which Azure service is the best fit for this requirement?

  • ❏ A. Azure Data Lake Storage

  • ❏ B. Azure SQL Database

  • ❏ C. Google BigQuery

  • ❏ D. Azure Synapse Analytics

Review the following statements and identify which one accurately describes a SQL concept?

  • ❏ A. An index is a data structure that speeds up lookups on table columns

  • ❏ B. A materialized view is a physical snapshot of a query result stored for faster access

  • ❏ C. A stored procedure is a saved routine that performs a set of SQL operations

  • ❏ D. A view is a virtual table whose contents are produced by a SQL query

An engineer at AuroraApps cannot establish a connection from their home network to a specific Azure SQL Database while teammates can connect without issues, and the engineer can connect to other Azure SQL Databases in the same subscription with the same client tools installed. What is the most likely reason for the connection failure?

  • ❏ A. Outbound TCP port 1433 is blocked by the home router or ISP

  • ❏ B. A private endpoint or virtual network rule is preventing access from the public internet

  • ❏ C. The user does not have explicit database level permissions granted

  • ❏ D. The server firewall does not include the engineer’s home public IP

A regional logistics startup is evaluating databases that can hold semi structured documents and graphs without a fixed table layout. What is a primary benefit of selecting a nonrelational database instead of a traditional relational database?

  • ❏ A. Enforces referential integrity with foreign key constraints across tables

  • ❏ B. Cloud Bigtable

  • ❏ C. Provides strong multi row ACID transactions by default

  • ❏ D. Allows flexible schemas so data can be stored as key value pairs documents or graph models

In the context of Contoso Cloud analytics what type of visualization completes this sentence A(n) [?] can be used to show how a value differs in proportion across a geographic area or region?

  • ❏ A. Treemap

  • ❏ B. Choropleth map

  • ❏ C. Line chart

  • ❏ D. Scatter plot

In a cloud data platform how do row level security and cell level security differ in the scope of data they control?

  • ❏ A. Row level security is always easier to configure than cell level security

  • ❏ B. Row level security applies to relational tables while cell level security is only for key value stores

  • ❏ C. Row level security restricts access to whole rows while cell level security controls individual fields within a row

  • ❏ D. Row level security provides stronger protection than cell level security

A regional retail company operates a PostgreSQL server in its own data center and needs to migrate the data into an Azure Database for PostgreSQL instance. Which approach is recommended to carry out this migration?

  • ❏ A. Use Azure Data Factory to copy the data from the on premise PostgreSQL to the Azure database

  • ❏ B. Export a logical SQL dump from the local PostgreSQL and then import that dump into the Azure Database for PostgreSQL instance

  • ❏ C. Set up a site to site VPN and stream replication traffic directly into the Azure database

  • ❏ D. Use Azure Database Migration Service to perform a managed migration of the on premise PostgreSQL database

A retail analytics team at NovaMart must ingest and query very large volumes of structured telemetry and access logs interactively and with low latency. Which Azure service is most appropriate for storing and querying those massive structured log records?

  • ❏ A. Azure Data Lake Storage

  • ❏ B. Azure Blob Storage

  • ❏ C. Azure SQL Database

  • ❏ D. Azure Data Explorer

In the context of a company called Summit Retail identify the missing words in this statement. OLTP systems are the original distributed transactional sources across the enterprise and OLAP systems pull together data from those transactional stores to present a multi dimensional perspective for reporting and analytics. Together OLTP and OLAP form the two sides of what concept?

  • ❏ A. BigQuery

  • ❏ B. Transaction management

  • ❏ C. Hybrid transactional analytical processing

  • ❏ D. Data warehousing

Read the sample customer records described in the paragraph that follows. Customer 101 has an id of 101 and a name of Anna Rivera and the telephone property contains an object with home business and mobile entries using numbers such as 44 20 7123 4567 and 44 20 8123 4567 and 44 7700 900111 and the address property contains two nested entries for home at 12 Baker Lane Some City NY 10021 and office at 9 Tower Road Some City NY 10022. Customer 202 has an id of 202 and a name of Luis Moreno and the telephone property contains home and mobile entries with country prefixes such as 001 and 011 and the address property contains nested locations for UK and US with complete address lines. What kind of database model does this representation demonstrate?

  • ❏ A. Relational database

  • ❏ B. Cloud Spanner

  • ❏ C. SQL database

  • ❏ D. Non relational database

A development team is building a web service that must connect to an Azure SQL database. Which method provides the recommended secure way to store and retrieve the database connection string for the application?

  • ❏ A. Embed the connection string directly in application code

  • ❏ B. Use Google Secret Manager

  • ❏ C. Keep the connection string in a local configuration file

  • ❏ D. Azure Key Vault

Identify the missing word or words in the following sentence in the Microsoft Azure context. A(n) [?] is something about which information must be known or held. For the online retailer RetailCo you might create tables for clients, items, and purchases. A table contains rows and each row represents a single instance of a(n) [?].?

  • ❏ A. Record

  • ❏ B. Resource

  • ❏ C. Entity

  • ❏ D. Attribute

A retail insights team at BlueHarbor is building an analytics platform and they want a managed Apache Spark environment that includes interactive notebooks for data exploration and collaboration. Which Azure service should they select?

  • ❏ A. Azure Synapse Analytics

  • ❏ B. Azure Functions

  • ❏ C. Azure HDInsight

  • ❏ D. Azure Databricks

A retail technology firm named Northbridge Labs uses Azure Cosmos DB to run a client facing API and they must charge customers by tracking the total operation count and outbound data transferred. Which Cosmos DB metric should they observe to reflect billing consumption?

  • ❏ A. Index storage usage

  • ❏ B. Provisioned throughput

  • ❏ C. Total Request Units

  • ❏ D. Normalized RU consumption

When an on premises application from AlderTech connects to a cloud database using a Proxy connection policy what behavior should you expect?

  • ❏ A. After the initial connection the application sends all traffic straight to the database and no longer uses the gateway

  • ❏ B. Connections are established through the gateway so that all subsequent requests continue to flow through the gateway and individual calls can be handled by different nodes in the cluster

  • ❏ C. Cloud SQL Proxy

  • ❏ D. All connections are validated centrally to confirm they originate from trusted clients

A regional archive team needs to construct a knowledge mining workflow to derive insights from millions of unstructured records such as scanned reports and photos. Which Azure service should they use to accomplish this task?

  • ❏ A. Google Cloud Natural Language API

  • ❏ B. Azure Cognitive Search

  • ❏ C. Azure Databricks

  • ❏ D. Azure Form Recognizer

A regional retailer is evaluating CloudCo’s managed relational database service for its web applications. What are two benefits of adopting a managed PaaS relational database offering compared with running database servers yourself? (Choose 2)

  • ❏ A. Reduced operational overhead for server provisioning patching and backups

  • ❏ B. Immediate access to automatic platform updates and new database features

  • ❏ C. Total control of low level file system layout and backup schedules

  • ❏ D. Easier linking with object storage for archival and analytic workflows

A boutique travel tech firm needs to store about 40 thousand guest feedback entries and then determine whether each entry expresses positive negative or neutral feelings. Which Azure Cognitive Service should they use for analyzing the text of these reviews?

  • ❏ A. Azure Computer Vision

  • ❏ B. Azure Speech Service

  • ❏ C. Azure Language Service

  • ❏ D. Azure Text Analytics

A smart agriculture company operates roughly 9,600 wireless temperature probes that send readings every 40 seconds and the analytics team will examine changes in temperature over time. Which type of storage system is best suited to hold these time ordered sensor measurements?

  • ❏ A. Graph database

  • ❏ B. BigQuery

  • ❏ C. Time series database

  • ❏ D. Relational database

Within the context of Contoso Cloud identify the missing word or words in this sentence. [?] is a collection of services apps and connectors that lets you connect to your data wherever it happens to reside filter it if necessary and then bring it into [?] to create compelling visualizations that you can share with others?

  • ❏ A. Azure Databricks

  • ❏ B. Azure Synapse Spark

  • ❏ C. Microsoft Power BI

  • ❏ D. Azure Synapse Studio

  • ❏ E. Azure Data Factory

Identify the missing term in the context of Contoso Cloud storage where items are referred to as rows and attributes are described as columns. This storage does not provide relationships, stored procedures, secondary indexes, or foreign keys and data is normally denormalized with each row holding all the information for a single logical entity?

  • ❏ A. Cloud Bigtable

  • ❏ B. Azure Database for PostgreSQL

  • ❏ C. Azure Tables

  • ❏ D. Azure Database for MySQL

  • ❏ E. Cloud Firestore

An e commerce analytics group has an Azure SQL Database table that holds tens of millions of rows and analysts often filter queries by a single column that does not have an index. Which action will most effectively accelerate those filtered queries?

  • ❏ A. Partition the table on that column

  • ❏ B. Scale up the database service tier to increase compute and IO capacity

  • ❏ C. Create a clustered index on the column

  • ❏ D. Add a nonclustered index on the frequently filtered column

A regional analytics startup called Meridian Insights must choose a data representation that supports nested fields for configuration and document style records and that can be shared between services and stored in document databases. Which data format supports hierarchical structures and is commonly used for configuration files data interchange and document storage in NoSQL systems?

  • ❏ A. Comma separated values CSV

  • ❏ B. Cloud Firestore

  • ❏ C. JavaScript Object Notation JSON

  • ❏ D. Extensible Markup Language XML

Within the context of Contoso Cloud identify the missing term in this sentence. A(n) [?] partners with stakeholders to design and build data assets such as ingestion pipelines cleansing and transformation workflows and data stores for analytical workloads and they use a broad set of data platform technologies including relational and non relational databases file storage and streaming sources?

  • ❏ A. Contoso Cloud Solutions Architect

  • ❏ B. Contoso Data Analyst

  • ❏ C. Contoso Data Engineer

  • ❏ D. Contoso Database Administrator

At Meridian Data Labs a row oriented file format created by an Apache project stores each record with a header that defines the record structure and that header is encoded as JSON while the record payload is stored as binary. Applications read the header to interpret and extract the fields. Which format is being described?

  • ❏ A. Parquet

  • ❏ B. BigQuery

  • ❏ C. JSON

  • ❏ D. ORC

  • ❏ E. XLSX

  • ❏ F. CSV

  • ❏ G. Avro

Identify the missing word or phrase in this Microsoft Azure scenario. The blank approach is ideal for migrations and for applications that require full operating system level access. SQL Server virtual machines follow a lift and shift pattern and you can transfer an on premises installation into a cloud virtual machine with minimal adjustments so the system behaves largely as it did in its original environment?

  • ❏ A. Platform as a Service

  • ❏ B. Function as a Service

  • ❏ C. Infrastructure as a Service

  • ❏ D. Software as a Service

Which class of analytics workload is most suitable for handling a continuous flow of incoming data that has no defined start or finish?

  • ❏ A. Microbatch processing for frequent scheduled jobs

  • ❏ B. Cloud Dataflow

  • ❏ C. Continuous stream processing with low latency

  • ❏ D. Traditional ETL batch pipeline

A regional bookseller plans to migrate its on-site SQL Server instance to a Google Cloud Compute Engine virtual machine and asserts that SQL Server hosted in the VM behaves the same as the physical on-premises server and that migrating the databases is identical to copying them between two local servers. Is that claim true?

  • ❏ A. False

  • ❏ B. True

Which term best completes this Contoso Cloud sentence where the blank data store contains a collection of objects data values and named string fields formatted as JSON?

  • ❏ A. Time series

  • ❏ B. Graph

  • ❏ C. Key-Value

  • ❏ D. Document

DP-900 Sample Questions Answered

What approach lets you restrict which external IP addresses can reach your Cosmos DB account over the public internet?

  • ✓ C. Apply an IP address allow list using an IP firewall policy

Apply an IP address allow list using an IP firewall policy is correct because Cosmos DB provides account level IP firewall rules that let you permit only specified IPv4 addresses or ranges to reach the account over the public internet.

An IP firewall policy is configured at the Cosmos DB account level and can be managed from the Azure portal, Azure CLI, PowerShell, or ARM templates. This feature enforces network level access based on source IP addresses and is the appropriate mechanism when the requirement is to restrict which external IP addresses can connect to the public endpoint.

Disable public network connectivity for the Cosmos DB account is incorrect because turning off public network connectivity blocks all public access and does not provide a way to selectively allow certain external IP addresses. That option is used when you want access only via private endpoints.

Rotate the account primary keys to invalidate existing credentials is incorrect because key rotation affects authentication credentials and does not control network level access or which IP addresses can reach the service.

Use a managed network firewall to inspect and block incoming traffic is incorrect because a managed network firewall such as Azure Firewall inspects and filters traffic for virtual networks or routed traffic and is not the mechanism for applying an allow list directly to a Cosmos DB public endpoint. To restrict direct public access by source IP you would use the Cosmos DB IP firewall or move access to private endpoints.

When a question asks about limiting which external IP addresses can reach a public endpoint look for an IP allow list or firewall rule answer rather than key rotation or disabling public access entirely.

A retail analytics team at NovaMart needs an Azure offering that applies machine learning to identify unusual activity and repeating patterns in time based datasets. Which Azure service should they choose?

  • ✓ D. Azure Anomaly Detector

The correct option is Azure Anomaly Detector.

The service applies machine learning specifically to time series data to identify unusual activity and repeating patterns. It offers a ready made REST API and SDKs for real time and batch anomaly detection which makes it a direct fit for the retail analytics team requirements.

Azure Stream Analytics is a real time stream processing and query service for event data and it is not an out of the box machine learning anomaly detection API. It can integrate with external models but it does not itself provide the turnkey time series anomaly detection capability.

Azure Machine Learning is a comprehensive platform for building training and deploying custom machine learning models and it requires you to design and manage models rather than giving a simple, dedicated time series anomaly detection endpoint.

Google Cloud Monitoring is a monitoring and observability service for Google Cloud resources and applications and it is not the Azure offering. It focuses on metrics, logs, and alerts and is therefore not the Azure service that directly provides the described anomaly detection feature.

None of the listed services are deprecated or retired so the current Azure naming and capabilities apply.

When a question asks about detecting anomalies in temporal data look for services that explicitly mention time series and anomaly detection or that advertise a dedicated anomaly detection API.

After provisioning an Azure SQL database which login will always be able to connect to the database?

  • ✓ B. Server admin login of the logical server

Server admin login of the logical server is the account that will always be able to connect to the database after provisioning.

Server admin login of the logical server is created when you provision the logical server and it acts as the server level principal with permissions to connect to and manage databases on that server. This login is the persistent administrative SQL authentication identity that you can use to access any database hosted by the logical server unless you explicitly remove or disable it.
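
As a quick illustration, here is a minimal Python sketch of connecting with that server admin login through pyodbc. The logical server name, database name, and credentials are hypothetical placeholders, not values from the question.

```python
import pyodbc

# Hypothetical logical server, database, and server admin credentials
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:contoso-sql-srv.database.windows.net,1433;"
    "Database=salesdb;"
    "Uid=sqladminuser;"
    "Pwd=<server-admin-password>;"
    "Encrypt=yes;"
)

# The server admin principal can connect to any database hosted on the logical server
print(conn.cursor().execute("SELECT DB_NAME();").fetchone()[0])
```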

Azure Active Directory administrator account for the server is not guaranteed to exist by default because an Azure Active Directory administrator must be explicitly configured at the server level. If it is not configured then that account will not be available to connect.

Microsoft Entra ID account that created the database does not automatically receive server admin rights. The creator of a database is not automatically granted the logical server administrator role unless that account was separately configured as the server admin or given appropriate permissions.

The traditional sa SQL account is not provided for Azure SQL Database. Microsoft does not expose a built in ‘sa’ account for customer use on the managed platform.

When a question asks which login always has access think about the server level principal. The server admin created during provisioning is the reliable answer unless the exam states an explicit change.

How does the concept of “data privacy” differ from “data confidentiality” in practical terms?

  • ✓ C. Data privacy concerns the proper and lawful handling of personal information while data confidentiality concerns preventing unauthorized access to any sensitive information

The correct answer is Data privacy concerns the proper and lawful handling of personal information while data confidentiality concerns preventing unauthorized access to any sensitive information.

Data privacy focuses on the rights of individuals and on the rules that govern how personal information is collected, used, shared, retained and deleted. It emphasizes legal requirements such as consent, purpose limitation and data subject rights and it requires governance processes, data inventories, contractual controls and policies to meet those obligations.

Data confidentiality is a security property that aims to prevent unauthorized access to information through technical and operational controls. Typical controls include encryption, access control, least privilege, network segmentation and key management, and these controls apply to any sensitive information whether it is personal data or not.

Privacy and confidentiality overlap and confidentiality is one set of controls that supports privacy. Privacy however is broader and also requires legal and procedural measures that go beyond technical access controls.

Data privacy and data confidentiality are simply two names for the same idea is incorrect because the terms refer to related but distinct concerns. Privacy is about lawful handling and individual rights while confidentiality is about preventing unauthorized access.

Data privacy is a compliance responsibility for the customer while data confidentiality is a technical duty for the cloud vendor is incorrect because responsibilities are shared in the cloud. Customers often hold primary privacy obligations but vendors also have obligations and both parties implement confidentiality controls and safeguards.

Data privacy only applies to personally identifiable fields while data confidentiality only applies to encrypted datasets is incorrect because privacy applies to all personal data and not only to specific fields. Confidentiality is enforced by multiple controls and not only by encryption.

When distinguishing these concepts on an exam think of privacy as the policy and legal side and think of confidentiality as the technical controls that prevent unauthorized access. Remember that confidentiality supports privacy but does not cover legal obligations on its own.

A retail reporting team at Summit Analytics is deciding whether to create a paginated report for detailed sales receipts and inventory listings. Which scenario would justify building a paginated report?

  • ✓ C. It needs exact page layout for printing or fixed file exports

The correct answer is It needs exact page layout for printing or fixed file exports.

A paginated report is designed to produce a precise, page oriented layout that is suitable for printing and for fixed file exports such as PDF. It provides control over page breaks, headers and footers, and pixel level placement which is important when receipts and inventory listings must match a printed or archival format.

The report is expected to produce about 250000 pages is not a sufficient justification on its own because large page counts do not necessarily require a paginated layout. Very large outputs may raise performance and processing concerns but they do not define the need for fixed page formatting.

It will be viewed interactively in Looker Studio is incorrect because interactive dashboards and reports are intended for exploration and interactivity. Paginated reports are optimized for static, print friendly output rather than interactive viewing in tools like Looker Studio.

It should be a visual summary with charts and interactive elements is incorrect because a visual, interactive summary is best delivered by dashboarding tools. Paginated reports do not provide the same level of interactivity and are focused on fixed layouts for printing and export.

When you see phrases like exact page layout or print-ready on the exam favor paginated reports and choose dashboards or interactive reports when the question emphasizes charts and exploration.

Why might an engineer pick Azure Cosmos DB Table API over Azure Table Storage for a globally distributed key value store that requires very low latency?

  • ✓ D. Provides single digit millisecond read and write latency under 8 ms at the 99th percentile worldwide

Provides single digit millisecond read and write latency under 8 ms at the 99th percentile worldwide is correct because it directly addresses the requirement for a globally distributed key value store that needs very low latency.

This option reflects the performance characteristics of the Cosmos DB Table API, which is engineered for global distribution and offers multi region replication and low tail latency for reads and writes.

Service level agreement of 99.995% availability is incorrect because an availability SLA does not guarantee the single digit millisecond latency under global load and the question is specifically about latency rather than uptime.

Cloud Bigtable is incorrect because it is a Google Cloud product and not an Azure Table Storage or Cosmos DB option, so it does not answer why one would pick Cosmos DB Table API over Azure Table Storage.

SDKs available for many programming languages and platforms is incorrect because while broad SDK support is useful it is not the primary reason to choose a service when the requirement is very low, worldwide latency.

When a question highlights very low latency worldwide look for answer choices that state explicit millisecond latency or percentile guarantees rather than general features like SDK availability or high level SLAs.

Which Azure service provides real time analytics for high velocity event streams?

  • ✓ C. Azure Stream Analytics

The correct answer is Azure Stream Analytics.

Azure Stream Analytics is a fully managed real time analytics service that is built to ingest and process high velocity event streams with low latency. It offers a SQL like query language and built in windowing functions for event time processing and it integrates natively with Event Hubs and IoT Hub so it is the best fit for the scenario described.

Azure Databricks is a unified analytics platform based on Apache Spark and it excels at batch processing and advanced analytics. It can perform streaming with Spark Structured Streaming but it is not the serverless, purpose built service for simple low latency event stream analytics that the question asks for.

Google Cloud Dataflow is a managed stream and batch processing service on Google Cloud and it is not an Azure service, so it is not the correct choice for an Azure focused question.

Azure Synapse Analytics is an integrated analytics service that targets data warehousing and large scale analytical workloads. It can support streaming scenarios through additional components but it is primarily focused on big data analytics and not on being the dedicated real time event stream processor asked for here.

Look for keywords like real time and event streams and choose the Azure service that is explicitly designed for low latency streaming and seamless integration with Event Hubs and IoT Hub.

An analytics team at Harbor Analytics is tuning a sizable Azure SQL Database table named sales_archive_2025 which contains many columns that are seldom referenced in queries. Which indexing approach will best improve overall query performance while avoiding excessive index maintenance?

  • ✓ B. Avoid adding indexes for the infrequently accessed columns

Avoid adding indexes for the infrequently accessed columns is correct.

Adding indexes to columns that are seldom referenced usually costs more than it helps because each index increases storage and slows down inserts updates and deletes due to extra maintenance. For a large archive table the overall query performance is best improved by indexing columns that are actually used by queries and by relying on fewer well chosen indexes rather than creating many indexes that are rarely used.
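
Before adding or dropping indexes it helps to confirm how the existing indexes are actually used. The following Python sketch queries the sys.dm_db_index_usage_stats dynamic management view through pyodbc. The table name and connection string are assumptions for illustration.

```python
import pyodbc

USAGE_QUERY = """
SELECT i.name AS index_name,
       s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
    ON s.object_id = i.object_id AND s.index_id = i.index_id
WHERE i.object_id = OBJECT_ID('dbo.sales_archive_2025');
"""

# Placeholder connection string for the target Azure SQL Database
conn = pyodbc.connect("<azure-sql-connection-string>")

# Indexes with many updates but few seeks or scans are costing more than they return,
# and columns that queries rarely touch rarely justify a new index
for row in conn.cursor().execute(USAGE_QUERY):
    print(row.index_name, row.user_seeks, row.user_scans, row.user_updates)
```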

Create nonclustered indexes on every seldom used column is wrong because creating an index for every seldom used column creates heavy maintenance overhead and extra storage use and it will slow write operations without delivering meaningful read performance gains for most queries.

Create filtered indexes that target specific subsets of rows for the infrequently used columns is not the best answer for this scenario because filtered indexes can be useful for targeted query patterns but they still add maintenance and are only beneficial when a clear and frequent filter pattern exists. Creating many filtered indexes across many seldom used columns can still produce excessive overhead and complexity.

Create a clustered index on each of the rarely used columns is incorrect because a table can have only one clustered index and clustered keys should be chosen for uniqueness and frequent access patterns. It is not possible to create multiple clustered indexes and it would be inappropriate to cluster on many rarely used columns.

When you decide whether to add an index think in terms of cost versus benefit and only add indexes that are justified by frequent query use. Use query plans and monitoring to validate any indexing change.

In the context of Azure reporting at Evergreen Analytics, which term best fills the blank in the following sentence? "[?] are charts of coloured rectangles with each rectangle sized to show the relative value of items, and they can include nested rectangles to show hierarchy."

  • ✓ D. Treemap visual

The correct answer is Treemap visual.

Treemap visual displays data as coloured rectangles where each rectangle size represents the relative value of an item and the visual can nest rectangles to show hierarchical relationships. The layout makes it easy to compare parts of a whole and to reveal structure within categories in a compact space.

Key influencers is a statistical visual that highlights factors that affect a metric and it does not present data as sized coloured rectangles or nested hierarchy.

Filled map colors geographic regions to show spatial value distributions and it is focused on maps rather than rectangles or hierarchical nesting.

Matrix displays data in rows and columns similar to a pivot table and it does not use area sized rectangles or nested boxes to represent values and hierarchy.

Focus on the visual description words such as rectangles, area, and nested when choosing a visual. Those keywords usually point to a Treemap visual rather than a map or tabular visual.

A retail analytics firm named Meridian Analytics wants to automate provisioning a nonrelational database on Google Cloud Platform. Which provisioning approach uses a script made of commands that you can run from any operating system shell such as Linux macOS or Windows?

  • ✓ B. gcloud command line or Cloud SDK

The correct option is gcloud command line or Cloud SDK.

gcloud command line or Cloud SDK provides a cross platform command line tool that you can install on Linux macOS or Windows and you can place the gcloud commands into a script that runs from any operating system shell.

With the Cloud SDK you can automate provisioning of nonrelational databases by running a sequence of gcloud commands in bash or PowerShell and you can integrate those scripts into CI pipelines or run them interactively as needed.

Deployment Manager templates are declarative infrastructure as code artifacts described in YAML or Jinja and they are applied as templates rather than executed as a linear script of shell commands, so they do not match the question.

Cloud Console is the web based graphical interface for Google Cloud and it is designed for interactive use in a browser rather than for running a script of commands from an operating system shell.

Client libraries are language specific SDKs used by application code to call Google Cloud APIs and they require writing program code in languages like Python Java or Go rather than composing a shell script made of commands.

When a question asks for a “script made of commands” that runs from any OS shell think gcloud and Cloud SDK and remember to test scripts first in Cloud Shell.

A regional retailer that uses Contoso Cloud storage plans to classify datasets by usage patterns. What is the primary characteristic that differentiates hot data from cold data?

  • ✓ C. Cold data is accessed infrequently while hot data is accessed frequently and needs rapid availability

Cold data is accessed infrequently while hot data is accessed frequently and needs rapid availability is correct.

Hot data describes datasets that are accessed frequently and that require low latency and quick retrieval. Systems or storage classes for hot data are optimized for immediate availability and performance so applications can read and write rapidly.

Cold data is characterized by infrequent access and by tolerance for slower retrieval times which makes it suitable for lower cost storage tiers or archives. The primary differentiator is the access pattern and the required availability rather than format or location.
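
In Azure Blob Storage this access pattern distinction maps onto access tiers. The sketch below uses the azure-storage-blob Python SDK to move an infrequently accessed blob to the Cool tier. The account URL, container, and blob names are illustrative assumptions.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# Hypothetical storage account and blob used only for illustration
service = BlobServiceClient(
    account_url="https://contosostore.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)
blob = service.get_blob_client(container="sales-archive", blob="receipts-2021.csv")

# Cold data that is read infrequently can sit in a cheaper tier,
# while hot data stays in the Hot tier for rapid availability
blob.set_standard_blob_tier("Cool")
```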

Hot data tends to be more expensive to retain while cold data normally costs less to store is incorrect because cost differences are a consequence of performance and access requirements rather than the defining attribute. Cost may often align with hot versus cold but it is not the core distinction.

Hot data is structured while cold data is unstructured is incorrect because whether data is structured or unstructured does not determine its temperature. Both structured and unstructured datasets can be hot or cold depending on how often they are accessed.

Hot data is stored in primary cloud storage while cold data is kept on local on premise servers is incorrect because cold data is commonly placed into lower cost cloud storage tiers or archival services rather than necessarily being kept on premise. The classification is about access frequency and availability rather than physical location.

When answering questions about hot versus cold data focus on the frequency of access and the speed of retrieval required. Those phrases are usually the strongest clues.

A regional retail chain named Meridian Retail depends on data modeling and visualizations for business insight and needs to combine datasets from several sources to produce an interactive report. Which tool should be used to import data from multiple data sources and author the report?

  • ✓ C. Power BI Desktop

The correct option is Power BI Desktop.

Power BI Desktop is a desktop authoring tool that can connect to a wide variety of data sources and combine them into a unified data model for analysis and visualization.

Power BI Desktop provides built in data transformation and modeling features and a visual canvas to create interactive reports that you can publish to the Power BI service for sharing with stakeholders.

Azure Data Factory is a cloud service for building and orchestrating data pipelines and it focuses on data movement and transformation rather than interactive report authoring.

Looker Studio is Google Cloud's reporting and dashboard product and it is a different vendor's solution, so it is not the Microsoft desktop authoring tool expected by this question.

Power BI Mobile app is intended for viewing and interacting with published reports on phones and tablets and it does not provide the authoring and data import capabilities of Power BI Desktop.

When a question asks about importing, modeling, and authoring interactive reports pick an authoring tool such as Power BI Desktop rather than a mobile viewer or a pipeline orchestration service.

In the context of Microsoft Azure, which job title best completes this sentence? A(n) [?] operates database systems, grants user access, maintains backup copies of data, and recovers data after failures.

  • ✓ C. Database Administrator

Database Administrator is correct because that job title specifically names the person who operates database systems grants user access maintains backup copies of data and recovers data after failures.

A Database Administrator installs and configures database software and manages day to day operations. A Database Administrator is responsible for access control and permissions and for implementing backup and recovery procedures and for restoring data after hardware or software failures.

Data Engineer is incorrect because data engineers build and maintain data pipelines and transformation processes and they do not typically own the operational tasks of backups access control and disaster recovery.

Cloud SQL is incorrect because that name refers to a managed database service offered by Google Cloud and it is not a job title on Microsoft Azure.

Data Analyst is incorrect because data analysts focus on querying analyzing and visualizing data and they do not generally perform the operational duties of running and recovering database systems.

When a question describes operational responsibilities such as backup access control and recovery look for a role that matches day to day system administration tasks rather than data pipeline or analytics titles.

What is the main function of a data warehouse appliance that a cloud vendor delivers to enterprises for analytics?

  • ✓ B. A preconfigured integrated hardware and software system optimized for analytics

The correct answer is A preconfigured integrated hardware and software system optimized for analytics.

A data warehouse appliance refers to a tightly integrated bundle of hardware, storage, networking and database software that is tuned and validated to run analytical workloads efficiently. Vendors deliver these appliances so enterprises get predictable query performance, simplified deployment and a system that is optimized end to end for analytics rather than a collection of separate components.

Appliances are typically used where organizations want on premises or dedicated systems that reduce the work of integrating and tuning servers, storage, and analytics software. The integrated design supports parallel processing, high throughput I/O, and storage layouts that accelerate complex reporting and aggregation queries.

A serverless managed analytics service like BigQuery is incorrect because a serverless cloud service is delivered as a managed software offering and does not include a prepackaged hardware appliance that the vendor ships or deploys to the customer environment.

A repository for storing massive raw datasets in multiple formats is incorrect because that describes a data lake, which focuses on flexible raw storage and schema on read rather than an optimized, integrated system designed for fast analytical query processing.

A tool that orchestrates ingestion and transformation workflows is incorrect because orchestration and ETL tools manage data flows and processing pipelines and do not provide the integrated hardware plus analytics software that characterizes a data warehouse appliance.

When you see the word appliance focus on whether hardware is part of the deliverable. If the vendor provides bundled hardware and tuned software then the appliance answer is likely correct rather than a serverless or orchestration option.

A regional e commerce company named ClearWave assigns a high number of Request Units per second to an Azure Cosmos DB container for a new product catalog service. What consequence will this overallocation have on their account and on database performance?

  • ✓ B. You are billed for the provisioned RU/s regardless of consumption

The correct option is You are billed for the provisioned RU/s regardless of consumption.

This is because Azure Cosmos DB charges based on the throughput you provision for a container or database. If you allocate a high number of RU/s you will incur costs for that provisioned capacity even when your workload does not consume all of those RUs. Overprovisioning therefore increases cost without guaranteed proportional gains in real world query latency or efficiency.
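
One practical way to see how much of the provisioned throughput you actually consume is to inspect the request charge that Cosmos DB returns with each response. Here is a minimal Python sketch with the azure-cosmos SDK, using a hypothetical endpoint, database, and container.

```python
from azure.cosmos import CosmosClient

# Hypothetical account endpoint and key for illustration only
client = CosmosClient("https://clearwave.documents.azure.com:443/", credential="<account-key>")
container = client.get_database_client("catalog").get_container_client("products")

# Read one catalog item, then look at the RU charge for that single operation
item = container.read_item(item="sku-1001", partition_key="sku-1001")
charge = container.client_connection.last_response_headers.get("x-ms-request-charge")

# Comparing per-operation charges like this against the provisioned RU/s shows
# how much of the capacity you are paying for is actually being used
print(f"Request charge: {charge} RUs")
```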

Provisioned RU/s do not affect billing or performance is incorrect because provisioned RU/s directly determine billing and they set the available capacity for operations which can affect performance when they are too low or unevenly distributed.

Cosmos DB will automatically decrease allocated RU/s when usage drops is incorrect because Cosmos DB does not automatically lower a manually provisioned RU/s. There is an autoscale mode that can adjust throughput within a configured range, but you still set the maximum throughput and billing is tied to the configured throughput model.

Query throughput automatically scales in line with the provisioned RU/s and improves performance proportionally is incorrect because increasing provisioned RU/s raises available capacity but query performance depends on indexing, partition key design, and the RU cost of the query. Higher RU/s does not guarantee a proportional improvement in query performance.

When a question mentions billing focus on the word provisioned. Remember that Cosmos DB charges are tied to the throughput you configure not to the RUs you actually consume.

A logistics analytics team at Harborline needs a live dashboard that displays operational metrics from their Azure data pipeline so managers can monitor performance as it happens. Which Azure service should they use to build and share interactive dashboards and reports?

  • ✓ D. Power BI

Power BI is the correct choice for building and sharing interactive dashboards and reports.

Power BI provides real time dashboards and supports streaming datasets so operational metrics can be displayed as they happen. It integrates with Azure data sources and offers interactive visuals along with easy sharing and access control so managers can monitor performance through the Power BI service and apps.
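
For live tiles, one common pattern is to push rows to a Power BI streaming dataset over its Push URL. The sketch below assumes a streaming dataset has already been created in the Power BI service and its push URL copied from the API info page; the URL and field names are placeholders.

```python
import datetime
import requests

# Placeholder push URL copied from a Power BI streaming dataset's API info page
PUSH_URL = "https://api.powerbi.com/beta/<tenant-id>/datasets/<dataset-id>/rows?key=<api-key>"

row = {
    "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
    "shipmentsPerMinute": 42,
    "avgDeliveryMinutes": 37.5,
}

# Each POST appends rows that real time dashboard tiles pick up immediately
response = requests.post(PUSH_URL, json=[row])
response.raise_for_status()
```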

Azure Databricks is focused on big data processing and machine learning and it is not primarily a dashboarding or reporting tool.

Azure Data Factory is an orchestration and data movement service used to build ETL and ELT pipelines and it does not provide built in interactive dashboards for end users.

Azure Synapse Analytics is a unified analytics and data warehousing platform that can store and prepare data for reporting but it is not the service used to build and share interactive dashboards.

When a question asks about creating and sharing live dashboards choose a visualization service like Power BI rather than data processing or orchestration tools.

Which configuration option for a Cosmos database account can only be chosen when the account is first created?

  • ✓ B. API selection for the account

The correct option is API selection for the account.

You must choose the account API when you create an Azure Cosmos DB account and the API selection for the account cannot be changed later for that account. The choice determines the protocol and data model such as Core (SQL), MongoDB, Cassandra, Gremlin, or Table and so if you need a different API you must create a new account and migrate your data.

Multi region writes can be enabled or disabled and configured by adding or removing regions and updating account settings after the account exists, so this option is not restricted to the initial creation.

Geo replication regions are managed through the Global Distribution feature and you can add or remove replica regions after the account is created, so region replication is not fixed at creation time.

Account provisioning tier for production or development is not a single immutable selection that is enforced only at creation. Many provisioning and throughput settings can be adjusted after creation, although very large or structural changes may require creating a new account and migrating data.

Remember that API selection is an immutable choice for an Azure Cosmos DB account, so pick the correct API up front and practice account migration so you can handle cases where a different API is required.

You are administering a relational database hosted in a cloud SQL instance for Orion Systems and you must ensure that transactions are atomic, consistent, isolated and durable. Which database property enforces those ACID guarantees?

  • ✓ C. Transaction isolation level settings

The correct answer is Transaction isolation level settings.

Transaction isolation level settings determine how concurrent transactions see and interact with each other and they directly control the isolation guarantee of ACID. By selecting levels such as read uncommitted, read committed, repeatable read, or serializable, the database prevents undesirable phenomena like dirty reads, non repeatable reads, and phantom reads, and thus enforces isolation among transactions.

Transaction isolation level settings operate together with atomic commit protocols and logging to provide full ACID behavior but the isolation configuration is the database property that specifically governs how strictly transactions are isolated from one another.

Database normalization is a design technique for organizing schema to reduce redundancy and update anomalies. It does not control transactional concurrency or isolation and therefore it does not enforce ACID transaction semantics.

Transaction logging and write ahead logs are essential for durability and for recovering to a consistent state after a crash and they support atomic commits. They do not by themselves control how concurrent transactions are isolated so they are not the property that enforces isolation across transactions.

High availability and failover focus on keeping the database available during failures and on minimizing downtime. These features help with availability and continuity but they do not directly enforce the transactional ACID guarantees such as isolation.

When asked about ACID map each option to a specific ACID property and remember that isolation is controlled by isolation level settings while durability and atomicity are implemented with logging and commit protocols.
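
To make the isolation level idea concrete, here is a minimal sketch that sets the isolation level for a session before querying. The connection string and the dbo.Orders table are placeholders, and the example assumes the pyodbc driver is installed; SET TRANSACTION ISOLATION LEVEL itself is standard T-SQL.

```python
import pyodbc

# Placeholder connection string for an Azure SQL database; replace with real values.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:<server>.database.windows.net,1433;"
    "Database=<database>;Uid=<user>;Pwd=<password>;Encrypt=yes;"
)
cursor = conn.cursor()

# Choose how strictly this session's transactions are isolated from concurrent work.
cursor.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;")

# Hypothetical table; the point is that the read below runs under the chosen level.
cursor.execute("SELECT COUNT(*) FROM dbo.Orders WHERE Status = 'OPEN';")
print("Open orders:", cursor.fetchone()[0])

conn.commit()  # pyodbc starts an implicit transaction, so commit it explicitly
conn.close()
```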

A small startup named LumenChat is building an instant messaging platform and requires message storage that supports extremely fast reads and writes to serve live conversations, which Azure data service best fits this requirement?

  • ✓ D. Azure Cosmos DB

The correct answer is Azure Cosmos DB.

Azure Cosmos DB is a globally distributed, multi model database that offers guaranteed single digit millisecond read and write latency at the 99th percentile and automatic horizontal partitioning for high throughput. It also provides automatic indexing and tunable consistency levels which make it a good fit for live chat workloads that need extremely fast reads and writes together with durable storage and optional global replication.

Azure Cache for Redis provides very low latency because it is an in memory cache, but it is primarily a cache rather than a durable primary message store. Persistence options are limited and using it as the sole storage layer risks data loss or added complexity for durability and global replication.

Azure Blob Storage is object storage that is optimized for large binary objects and high throughput streaming. It is not designed for the small, low latency indexed reads and writes and the rich querying patterns that a chat message store typically requires.

Azure SQL Database is a managed relational database that supports transactions and complex queries. It can scale well for many scenarios but it does not natively provide the same predictable ultra low latency and global distribution at massive scale without additional sharding or architectural complexity.

When a question emphasizes extremely fast reads and writes and also requires durable or globally distributed storage prefer databases that advertise single digit millisecond latency and global distribution. Confirm whether the requirement is for a cache or for a primary durable store before selecting an in memory solution like Azure Cache for Redis.

Fill in the blank in this Power BI sentence for Contoso Cloud. When you are prepared to share a single page from a Power BI report or a group of visualizations you create a [?]. A Power BI [?] is a single page collection of visuals that you can distribute to other users?

  • ✓ C. Dashboard

Dashboard is correct.

A Dashboard in Power BI is a single page canvas that collects visuals and tiles pinned from reports and other sources and it is designed to be shared or distributed to other users as a consolidated view.

Report is incorrect because reports typically contain one or more pages for detailed exploration and they are not the single page, shareable summary described by the question.

BigQuery table is incorrect because that term refers to a Google BigQuery data storage object and it is not a Power BI artifact for packaging or sharing visuals.

Dataset is incorrect because a dataset is the underlying data model or source that reports and dashboards use and it is not a single page collection of visuals to distribute.

When a question highlights a single page or a collection of visuals intended for distribution, think Dashboard rather than report or dataset. Focus on the keywords in the prompt to choose the correct Power BI artifact.

A data engineering group is comparing Azure Synapse Analytics and Azure Databricks for Spark based workloads and wants to know which capability Synapse provides that Databricks does not offer?

  • ✓ C. Support for T SQL based analytics that will be familiar to SQL Server developers

Support for T SQL based analytics that will be familiar to SQL Server developers is correct because Azure Synapse offers a native T SQL query surface that aligns with SQL Server conventions and tools.

Azure Synapse provides both dedicated SQL pools and serverless SQL pools that support T SQL syntax and familiar SQL Server constructs such as views and stored procedures. This reduces friction for SQL Server developers who want to run analytics without learning a different query language or execution model.

Databricks is built around Apache Spark and uses Spark SQL and other Spark APIs. While Databricks can run SQL queries it does not provide the same native T SQL compatibility and SQL Server oriented tooling that Synapse delivers.

Capability to dynamically scale compute resources to meet changing workloads is incorrect because both Synapse and Databricks support autoscaling and flexible compute provisioning, so this capability is not unique to Synapse.

Native connector integration with Google BigQuery is incorrect because integration with external systems like BigQuery is not a unique Synapse capability and both platforms can connect to external data sources through connectors or integration tools.

Ability to execute distributed parallel processing across clusters is incorrect because distributed parallel processing is a core capability of Spark and of Synapse distributed query engines, so it does not uniquely differentiate Synapse from Databricks.

When asked which feature is unique think about native language and developer surface rather than general capabilities. Features like T SQL compatibility are often the differentiator between Synapse and Spark based platforms.
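
To illustrate the familiar T-SQL surface, the sketch below runs a serverless SQL pool query over parquet files in a data lake through pyodbc. The workspace endpoint, storage path, and credentials are placeholders; the query itself is ordinary T-SQL using OPENROWSET.

```python
import pyodbc

# Placeholder serverless SQL endpoint of a Synapse workspace; replace with real values.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<workspace>-ondemand.sql.azuresynapse.net;"
    "Database=master;Uid=<user>;Pwd=<password>;Encrypt=yes;"
)

# Plain T-SQL, familiar to SQL Server developers, reading parquet files in the lake.
query = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<storage-account>.dfs.core.windows.net/<container>/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS sales;
"""

cursor = conn.cursor()
for row in cursor.execute(query).fetchall():
    print(row)
conn.close()
```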

A regional ecommerce startup named MeadowMart stores product records in a Cosmos DB container that uses a partition key called item_type. You need to fetch all products that belong to a specific item_type. What Cosmos DB operation will most efficiently return those items?

  • ✓ B. Filter documents by the partition key value

Filter documents by the partition key value is correct.

Filtering by the partition key directs the query to the partition or partitions that contain the requested items so the database can avoid scanning other partitions. This approach is the most efficient way to return all products for a given item type because partitions are how Azure Cosmos DB organizes and indexes documents and targeted queries use fewer request units and complete faster.

Use the change feed to track updates in the container is incorrect because the change feed is intended to stream inserts and updates for incremental processing or eventing. It is not designed as a general purpose query mechanism to return all current documents for a particular partition key.

BigQuery is incorrect because BigQuery is a Google Cloud analytics service and it does not query documents inside an Azure Cosmos DB container directly. It is not the proper operation to fetch items from Cosmos DB.

Scan every document in the container is incorrect because scanning the full container is inefficient and will consume far more request units and time. When a partition key exists you should target that key to avoid a full container scan.

When a container uses a partition key prefer queries that filter by that partition key to reduce RU charges and improve performance. Use point reads when you have both the id and partition key for the fastest access.
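
The difference between a targeted query and a container scan is visible directly in the SDK. The sketch below uses the azure-cosmos Python package; the account endpoint, key, database, container, and sample values are placeholders.

```python
from azure.cosmos import CosmosClient

# Placeholder account details; replace with real values.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<account-key>")
container = client.get_database_client("retail").get_container_client("products")

# Targeted query: filtering on the partition key lets Cosmos DB route the request
# to the relevant partition instead of scanning the whole container.
items = container.query_items(
    query="SELECT * FROM c WHERE c.item_type = @item_type",
    parameters=[{"name": "@item_type", "value": "outdoor-gear"}],
    partition_key="outdoor-gear",
)
for item in items:
    print(item["id"], item.get("name"))

# Point read: the fastest and cheapest access when both id and partition key are known.
one_item = container.read_item(item="prod-001", partition_key="outdoor-gear")
print(one_item["id"])
```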

When protecting information for a cloud application how would you define “data at rest”?

  • ✓ B. Data stored persistently on disks, optical media, or cloud object storage

The correct answer is Data stored persistently on disks, optical media, or cloud object storage.

This phrase refers to information that resides on physical storage or in cloud object stores when it is not being processed or transmitted. Examples include files on disks, database records, backups, and blobs in object storage. Protecting this stored data typically involves encryption at rest, access controls, and secure key management to prevent unauthorized access to persisted content.

Data moving across a network between systems is incorrect because that describes data in transit. Data in transit is protected with transport level controls such as TLS and secure networking rather than the controls used for data at rest.

Data held temporarily in a system’s RAM or CPU cache is incorrect because that describes data in use. Data in memory is ephemeral and is addressed with different protections like memory encryption and runtime isolation.

Data queued in a write buffer while awaiting final persistence is incorrect because write buffers are transient and part of ongoing input output operations. Queued data is considered in flight or in use until it is committed to persistent storage and is not classified as data at rest.

Remember that data at rest means data persisted on storage while data in transit moves across networks and data in use sits in memory. Match protections to the correct state when answering questions.

A regional retailer has several data flows and needs to assign the correct workload pattern to each flow. Which workload type best matches each scenario?

  • ✓ B. Batch workload for the product catalog that is loaded every 24 hours to a data warehouse, streaming workload for online purchases loaded to the warehouse as they occur, and micro batch workload for inventory updates applied after every 1,200 transactions

Batch workload for the product catalog that is loaded every 24 hours to a data warehouse, streaming workload for online purchases loaded to the warehouse as they occur, and micro batch workload for inventory updates applied after every 1,200 transactions is the correct option.

The product catalog is refreshed once every 24 hours so it matches a batch workload. Batch processing is appropriate for scheduled, bulk transfers to a data warehouse where latency can be higher and throughput is the priority.

Online purchases arrive as individual events and need to be loaded as they occur so they match a streaming workload. Streaming processing supports low latency ingestion and immediate availability of each event in the warehouse.

Inventory updates that are applied after every 1,200 transactions match a micro batch workload because the updates are collected into small frequent batches before being applied. Micro batching gives a balance between throughput and near real time responsiveness for grouped events.

Streaming workload for the product index that is refreshed every 24 hours to a data warehouse, micro batch workload for customer purchases that are pushed to the warehouse as they occur, and batch workload for inventory reconciliations that are processed after every 1,200 transactions is incorrect because the product index refreshed daily is a batch pattern rather than streaming, customer purchases that arrive as they occur need streaming rather than micro batching, and inventory reconciliations triggered every 1,200 transactions are a micro batch pattern rather than a large scheduled batch.

Micro batch workload for the product catalog that is refreshed every 24 hours to a data warehouse, batch workload for online purchases that are accumulated before loading, and streaming workload for inventory updates that are emitted after every 1,200 transactions is incorrect because a product catalog updated daily is a batch workload and not micro batch, online purchases that are accumulated before loading describe batch processing rather than streaming, and inventory updates emitted after a count of transactions are better described as micro batch rather than continuous streaming.

When you decide between batch, streaming, and micro batch think about how often data arrives and how quickly consumers need it. Streaming is for per event low latency, batch is for scheduled bulk loads, and micro batch is for small frequent groups of events.
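
As a small illustration of the micro batch pattern, the sketch below buffers incoming events and flushes them to a hypothetical sink after every 1,200 updates; a streaming pipeline would emit each event immediately and a batch job would wait for a scheduled window.

```python
from typing import Any, Dict, List

BATCH_SIZE = 1_200  # flush after every 1,200 inventory updates, as in the scenario


def flush(batch: List[Dict[str, Any]]) -> None:
    """Hypothetical sink; in practice this would write the group to the warehouse."""
    print(f"Applying micro batch of {len(batch)} inventory updates")


buffer: List[Dict[str, Any]] = []


def handle_inventory_update(event: Dict[str, Any]) -> None:
    """Micro batching: collect events and apply them in small, frequent groups."""
    buffer.append(event)
    if len(buffer) >= BATCH_SIZE:
        flush(buffer)
        buffer.clear()


# Simulate a stream of inventory updates.
for i in range(3_000):
    handle_inventory_update({"sku": f"SKU-{i % 50}", "delta": -1})
flush(buffer)  # apply any remaining events (here, the final 600)
```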

In what situation would an engineer use an Azure Resource Manager template to deploy infrastructure and configuration in a repeatable way?

  • ✓ B. Automate the repeatable deployment of an interdependent set of Azure resources

The correct answer is Automate the repeatable deployment of an interdependent set of Azure resources.

Azure Resource Manager templates are declarative files that describe the resources you want to deploy and the relationships between them so you can recreate the same architecture reliably. Templates are idempotent and support parameters, outputs, and linked or nested templates, which makes them suitable for automating deployments of interdependent resources in CI/CD pipelines and across environments.

Assign complex access permissions to existing Azure resources is incorrect because granular access control is primarily handled by Azure role based access control and Azure Active Directory. While templates can include role assignments, assigning and managing complex permission sets is usually done with RBAC tools, policy, or administrative scripts.

Provision whole Azure subscriptions and enforce tenant level policies across several tenants is incorrect because templates deploy resources at resource group, subscription, or management group scope and they do not provision subscriptions across tenants. Creating subscriptions and applying tenant wide governance is usually done with management groups, Azure Blueprints, or tenant level APIs and tooling.

Use Google Cloud Deployment Manager instead of an Azure template is incorrect because that option refers to a different cloud provider. Google Cloud Deployment Manager is not an Azure tool and it would not be used to deploy Azure infrastructure.

When a question describes repeatable infrastructure as code think of declarative templates and choose the option that mentions automating interdependent resource deployments.
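
To show what a declarative template looks like in practice, here is a minimal sketch that writes a tiny ARM template describing a single storage account. The resource name and API version are illustrative placeholders, and real templates usually add parameters, variables, and outputs.

```python
import json

# A minimal, illustrative ARM template: it declares the desired resource rather than
# scripting the steps to create it. The name and apiVersion below are placeholders;
# check the current provider reference before deploying.
template = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "resources": [
        {
            "type": "Microsoft.Storage/storageAccounts",
            "apiVersion": "2023-01-01",
            "name": "stdemostorage001",
            "location": "[resourceGroup().location]",
            "sku": {"name": "Standard_LRS"},
            "kind": "StorageV2",
        }
    ],
}

with open("azuredeploy.json", "w") as handle:
    json.dump(template, handle, indent=2)
print("Template written; deploying it repeatedly produces the same resources each time.")
```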

Which type of database exhibits these drawbacks? Transactional updates that span multiple entities are not guaranteed because consistency must be considered. There is no referential integrity and relationships between rows must be managed outside the table. Filtering or sorting on fields that are not keys is difficult and queries on non key fields can force full table scans?

  • ✓ B. Contoso Table Storage

Contoso Table Storage is the correct answer.

This choice describes a simple table storage model used by key value or basic NoSQL table services that do not enforce relationships between rows and that do not provide multi entity transactional guarantees. Because data is organized around partition and row keys there is no built in referential integrity and transactions that span multiple entities are not guaranteed. Filtering or sorting on attributes that are not part of the primary key is difficult and queries on non key fields can force full table scans or require external indexing to be efficient.

Cloud Bigtable is a high scale wide column NoSQL database designed for very large throughput and low latency. It supports row level atomicity and is optimized for range scans and time series patterns, so the simple table storage limitations described in the question are not the typical characterization of this managed service.

Cloud SQL for PostgreSQL is a relational database that provides ACID transactions, foreign keys, and rich indexing options. It therefore enforces referential integrity and supports complex queries without the table storage constraints in the question.

Cloud Spanner is a distributed relational database that offers strong consistency and global transactions along with SQL querying and schema constraints. It is explicitly designed to avoid the lack of transactional guarantees and integrity problems mentioned in the question.

Cloud SQL for MySQL is a managed MySQL service that supports transactions, foreign keys, and indexes, and it does not suffer from the lack of referential integrity and difficulty querying non key fields that the question describes.

When a question mentions lack of referential integrity and difficulty querying non key fields think of simple table or key value stores rather than relational databases or strongly consistent distributed SQL systems.

The analytics team at Meridian Outfitters plans to build a modern data warehouse to store and analyze very large volumes of structured and semi structured information. Which Azure service is the best fit for this requirement?

  • ✓ D. Azure Synapse Analytics

Azure Synapse Analytics is the correct choice for building a modern data warehouse to store and analyze very large volumes of structured and semi structured information.

Azure Synapse Analytics combines enterprise data warehousing, big data integration, and analytics in a single service. Synapse provides dedicated SQL pools and serverless SQL along with Spark pools so teams can run large scale queries and process both structured and semi structured data efficiently.

Synapse is designed for petabyte scale analytics and it integrates closely with Azure Data Lake Storage for storage and with other Azure services for ingestion and orchestration. Those integrated capabilities make it the best fit when a modern data warehouse is needed.

Azure Data Lake Storage is primarily an object storage service that is optimized for storing massive amounts of raw data. It is not itself a data warehouse because it does not provide the integrated query engines and warehousing features that Synapse offers, although it is commonly used together with Synapse.

Azure SQL Database is a managed relational database service that is optimized for transactional workloads and for smaller scale analytics. It does not natively provide the large scale analytics, big data integration, and mixed workload capabilities required of a modern data warehouse.

Google BigQuery is a highly scalable serverless data warehouse but it is a Google Cloud service and not an Azure offering. The question asks for the best Azure service so BigQuery is not the correct choice here.

When an exam question asks for a modern data warehouse in Azure focus on services that combine storage and large scale analytics. Pay attention to whether an option is a pure storage service, a transactional database, or a full analytics platform.

Review the following statements and identify which one accurately describes a SQL concept?

  • ✓ D. A view is a virtual table whose contents are produced by a SQL query

A view is a virtual table whose contents are produced by a SQL query is the correct option.

A view is defined by a query and presents results as if they were a table without necessarily storing the rows separately in the database. When you reference a view, the underlying query is executed and the result set is returned, which is why it is called a virtual table.

An index is a data structure that speeds up lookups on table columns is incorrect because that statement describes a performance structure used to speed queries rather than a virtual table produced by a query. An index stores pointers to rows to make searches faster and does not itself present query results as a table.

A materialized view is a physical snapshot of a query result stored for faster access is incorrect in this context because that option describes a stored copy of query results. A materialized view stores data and typically requires refresh operations so it is not the same concept as a virtual view produced on demand.

A stored procedure is a saved routine that performs a set of SQL operations is incorrect because a stored procedure is a programmatic object that executes logic and can perform many actions. It is not defined as a virtual table produced by a query even though it can sometimes return result sets.

Read each option and match whether it refers to stored data, a performance structure, a programmatic routine, or a virtual construct and then choose the option that exactly matches the definition in the question.
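
Here is a short, self contained illustration of the view concept. It uses SQLite only because it runs anywhere; the same CREATE VIEW idea applies to Azure SQL Database in T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "Anna", 120.0, "OPEN"), (2, "Luis", 80.0, "SHIPPED"), (3, "Anna", 45.5, "OPEN")],
)

# The view stores no rows of its own; its contents come from the query each time it is referenced.
conn.execute("CREATE VIEW open_orders AS SELECT id, customer, total FROM orders WHERE status = 'OPEN'")

for row in conn.execute("SELECT * FROM open_orders"):
    print(row)  # (1, 'Anna', 120.0) and (3, 'Anna', 45.5)
conn.close()
```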

An engineer at AuroraApps cannot establish a connection from their home network to a specific Azure SQL Database while teammates can connect without issues, and the engineer can connect to other Azure SQL Databases in the same subscription with the same client tools installed. What is the most likely reason for the connection failure?

  • ✓ D. The server firewall does not include the engineer’s home public IP

The correct option is The server firewall does not include the engineer’s home public IP.

This is most likely because Azure SQL Database uses server level firewall rules to allow or block client connections by public IP address. The engineer can connect to other databases from the same home network, and teammates can connect to the problem server from their networks, which points to the problem server simply not including the engineer’s home public IP in its allowed list. Adding the home public IP to the server firewall rules or using a dynamic update process will allow the connection.

Outbound TCP port 1433 is blocked by the home router or ISP is unlikely because the engineer is already able to connect to other Azure SQL Databases from the same home network using the same client tools, so a global port block would prevent those connections as well.

A private endpoint or virtual network rule is preventing access from the public internet is not the best explanation here because a private endpoint would block public access for all users who are not on the private network. The fact that teammates can connect to this server suggests that the server is reachable from other public IPs and that a private endpoint is not the cause.

The user does not have explicit database level permissions granted is also unlikely because lack of database permissions would allow a network connection but then fail at authentication or authorization. In this case the issue prevents establishing the connection from the engineer’s home IP, which points to a firewall or network rule rather than database permissions.

When a single user cannot reach an Azure SQL server but others can, first verify the user’s public IP against the server level firewall rules before changing credentials or network hardware.

A regional logistics startup is evaluating databases that can hold semi structured documents and graphs without a fixed table layout. What is a primary benefit of selecting a nonrelational database instead of a traditional relational database?

  • ✓ D. Allows flexible schemas so data can be stored as key value pairs documents or graph models

The correct answer is: Allows flexible schemas so data can be stored as key value pairs documents or graph models.

This option is correct because nonrelational databases are designed to accept semi structured and evolving data without a fixed table layout. Flexible schemas let you store different records with different fields and nested structures which makes it easier to model documents key value pairs or graphs and to iterate on the data model as requirements change.

Choosing a nonrelational store gives you schema flexibility and horizontal scalability which helps when you need to ingest varied or rapidly changing data. Those benefits come with trade offs that you should evaluate but they align directly with the startup use case that wants to hold semi structured documents and graphs.

Enforces referential integrity with foreign key constraints across tables is incorrect because referential integrity and foreign keys are features typically provided by relational databases. Many NoSQL systems do not enforce foreign key constraints automatically and expect the application to manage those relationships.

Cloud Bigtable is incorrect because it names a specific GCP product rather than stating a general benefit. Cloud Bigtable is a wide column key value store that is optimized for large scale time series and analytical workloads and it is not primarily a document or graph database.

Provides strong multi row ACID transactions by default is incorrect because most nonrelational databases do not offer strong multi row ACID transactions by default. Strong multi row ACID guarantees are more commonly associated with traditional relational systems or specialized distributed databases that are designed to provide both scale and transactional semantics.

When a question contrasts database types focus on the capabilities described such as flexible schema or transaction guarantees and not on product names.

In the context of Contoso Cloud analytics what type of visualization completes this sentence A(n) [?] can be used to show how a value differs in proportion across a geographic area or region?

  • ✓ B. Choropleth map

The correct option is Choropleth map.

A Choropleth map shades geographic areas such as countries states or counties according to a numerical value so it directly shows how a value differs in proportion across a region. This makes it the standard choice when you need to visualize regional proportions on a map.

The Treemap displays hierarchical proportions using nested rectangles and does not map values to geographic regions so it is not appropriate for showing regional differences on a map.

The Line chart is used to show trends over time or ordered categories and it does not represent spatial areas so it cannot show proportional differences across regions.

The Scatter plot plots points to show relationships between two numeric variables in Cartesian space and it is not used to shade or color geographic regions by value.

When a question asks about showing values by geographic area look for a visualization that fills regions based on a measure and not one that plots points or uses nested shapes. Use choropleth map when areas need to be shaded by value.

In a cloud data platform how do row level security and cell level security differ in the scope of data they control?

  • ✓ C. Row level security restricts access to whole rows while cell level security controls individual fields within a row

The correct option is Row level security restricts access to whole rows while cell level security controls individual fields within a row.

Row level security controls whether an entire record is visible to a user based on policy attributes or filters. It operates at the row or record level so a user either sees the whole row or does not, and it is commonly used to enforce tenant isolation and data partitioning across users.

Cell level security controls access to specific fields inside a row so users may see some columns while other sensitive fields are masked or hidden. This finer granularity is often called column level or cell level security and it is useful when only particular attributes such as personal identifiers need stronger protection.

Row level security is always easier to configure than cell level security is incorrect because ease of configuration depends on the platform and available features. In some systems row rules are simpler but in others built in column protections make cell level controls straightforward, so you cannot assume one is always easier.

Row level security applies to relational tables while cell level security is only for key value stores is incorrect because both row and cell level controls are used across relational and non relational systems. Column or cell level protections are common in data warehouses and relational databases as well as in some key value or document stores.

Row level security provides stronger protection than cell level security is incorrect because strength depends on what you need to protect. Cell level security can provide stronger protection for specific sensitive attributes while row level security is better when entire records must be restricted. The approaches are complementary rather than one universally stronger.

When answering, focus on the scope words such as row and cell or column to decide whether the control applies to whole records or to individual fields.
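
For readers who want to see row level security in practice, here is a minimal sketch of the SQL Server and Azure SQL pattern of a predicate function plus a security policy, wrapped in Python. The connection string, table, and session key names are illustrative placeholders.

```python
import pyodbc

# Placeholder connection string for an Azure SQL database; dbo.Orders with a TenantId
# column is assumed to exist already.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<server>;Database=<db>;Uid=<user>;Pwd=<pwd>;Encrypt=yes;"
)
cursor = conn.cursor()

# 1. An inline table-valued function decides whether a given row is visible.
cursor.execute("""
CREATE FUNCTION dbo.fn_tenant_filter(@TenantId int)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS allowed
       WHERE @TenantId = CAST(SESSION_CONTEXT(N'TenantId') AS int);
""")

# 2. A security policy applies that predicate to every query against the table,
#    so a user only ever sees whole rows that belong to their tenant.
cursor.execute("""
CREATE SECURITY POLICY dbo.TenantFilterPolicy
ADD FILTER PREDICATE dbo.fn_tenant_filter(TenantId) ON dbo.Orders
WITH (STATE = ON);
""")

conn.commit()
conn.close()
```

Cell or column level protection, by contrast, is typically handled with column permissions or dynamic data masking rather than a filter predicate.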

A regional retail company operates a PostgreSQL server in its own data center and needs to migrate the data into an Azure Database for PostgreSQL instance. Which approach is recommended to carry out this migration?

  • ✓ D. Use Azure Database Migration Service to perform a managed migration of the on premise PostgreSQL database

Use Azure Database Migration Service to perform a managed migration of the on premise PostgreSQL database is the correct option.

Azure Database Migration Service provides a guided and managed migration path for PostgreSQL migrations to Azure Database for PostgreSQL and it supports both offline and online migration scenarios which helps reduce downtime and automate schema and data movement. It integrates with Azure services and it handles validation and cutover tasks so large or production migrations are safer and less manual than ad hoc approaches.

Use Azure Data Factory to copy the data from the on premise PostgreSQL to the Azure database is not the recommended choice because Data Factory is an ETL and bulk data movement tool and it does not provide a full database migration workflow with schema conversion, validation, or orchestrated cutover. You could use it to copy tables but you would need extra manual steps to handle schema and to minimize downtime.

Export a logical SQL dump from the local PostgreSQL and then import that dump into the Azure Database for PostgreSQL instance is technically possible for small or simple databases but it is not ideal for production migrations because it is manual, often requires extended downtime, and does not provide automated validation or incremental synchronization for cutover.

Set up a site to site VPN and stream replication traffic directly into the Azure database is incorrect because managed Azure Database for PostgreSQL does not support acting as a physical streaming replica of an external primary. Streaming physical replication into the managed service from an on premise server is not supported and you would instead use supported logical replication or a managed migration tool for continuous or near zero downtime migrations.

When a question asks about migrating a database to Azure look for the managed migration tool. Choose Azure Database Migration Service for schema aware, validated, and low downtime migrations rather than generic copy tools.

A retail analytics team at NovaMart must ingest and query very large volumes of structured telemetry and access logs interactively and with low latency. Which Azure service is most appropriate for storing and querying those massive structured log records?

  • ✓ D. Azure Data Explorer

Azure Data Explorer is correct.

This service is purpose built to ingest very large volumes of structured telemetry and access logs and to support fast, interactive, low latency queries over that data. It uses a columnar store and automatic indexing and it exposes a powerful query language that is optimized for ad hoc exploration and time series analysis which makes it suitable for massive log workloads.

Azure Data Lake Storage is not appropriate because it is designed for scalable file and object storage and batch analytics rather than for low latency interactive query and log exploration.

Azure Blob Storage is not appropriate because it provides general object storage without the built in query engine and indexing needed for fast, interactive analysis of structured log records.

Azure SQL Database is not appropriate because it is a relational database service that is not optimized for the very high ingest rates and the petabyte scale time series log queries that this scenario requires.

When a question mentions large volumes of telemetry or logs and requires interactive, low latency queries focus on services built for high ingest rates and ad hoc log exploration rather than general purpose object storage or transactional databases.
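
As a sketch of the interactive query experience described above, the snippet below runs a small KQL query through the azure-kusto-data package. The cluster URL, database, and table names are placeholders, and the aggregation shown is only an example.

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

# Placeholder cluster and database; replace with real values.
cluster = "https://<cluster>.<region>.kusto.windows.net"
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster)
client = KustoClient(kcsb)

# KQL is designed for ad hoc exploration of time ordered telemetry and logs.
query = """
AccessLogs
| where Timestamp > ago(1h)
| summarize requests = count() by bin(Timestamp, 5m), StatusCode
| order by Timestamp asc
"""

response = client.execute("<database>", query)
for row in response.primary_results[0]:
    print(row)
```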

In the context of a company called Summit Retail identify the missing words in this statement. OLTP systems are the original distributed transactional sources across the enterprise and OLAP systems pull together data from those transactional stores to present a multi dimensional perspective for reporting and analytics. Together OLTP and OLAP form the two sides of what concept?

  • ✓ D. Data warehousing

The correct answer is Data warehousing.

Data warehousing is the practice of consolidating data from transactional OLTP sources and organizing it into OLAP structures so analysts and reporting systems can work with a consistent multidimensional view across the enterprise. The question describes OLTP as the transactional sources and OLAP as the analytical views and that pairing is precisely what Data warehousing represents.

Data warehousing focuses on integration, transformation, and long term storage of data for reporting, business intelligence, and analytics. A data warehouse draws from many operational systems and builds analytic schemas and aggregates so that reporting and analysis are efficient and consistent.

BigQuery is a specific cloud data warehouse product and not the general concept the question asks for. The question seeks the overarching concept and not a vendor implementation so BigQuery is not correct.

Transaction management is about ensuring atomicity, consistency, isolation, and durability for transactions in operational systems. It does not describe the enterprise level consolidation of OLTP and OLAP for reporting and analytics and so Transaction management is not the right answer.

Hybrid transactional analytical processing refers to systems that attempt to run transactional and analytical workloads on the same database. The question describes separate OLTP sources and OLAP analytic stores that together form a consolidated reporting approach which is Data warehousing, so Hybrid transactional analytical processing does not match the described concept.

Pay attention to whether the question asks for a broad concept or a specific product. If it describes roles or architecture pick the broad concept rather than a vendor name.

Read the sample customer records described in the paragraph that follows. Customer 101 has an id of 101 and a name of Anna Rivera, the telephone property contains an object with home, business, and mobile entries using numbers such as 44 20 7123 4567, 44 20 8123 4567, and 44 7700 900111, and the address property contains two nested entries for home at 12 Baker Lane Some City NY 10021 and office at 9 Tower Road Some City NY 10022. Customer 202 has an id of 202 and a name of Luis Moreno, the telephone property contains home and mobile entries with country prefixes such as 001 and 011, and the address property contains nested locations for UK and US with complete address lines. What kind of database model does this representation demonstrate?

  • ✓ D. Non relational database

The correct answer is Non relational database.

The sample records show nested objects inside a single customer record and varying fields for different customers which matches a document style, schema flexible approach rather than a fixed table layout. This kind of hierarchical and flexible data organization is typical of document oriented or other NoSQL systems and that is why Non relational database is correct.

Relational database is incorrect because relational systems model data in tables with rows and columns and they rely on a fixed schema and joins to relate entities. The sample data does not look like normalized tables with foreign keys and consistent columns.

Cloud Spanner is incorrect because Cloud Spanner is a distributed SQL database that provides relational semantics and schema based tables. It is not the best match for the nested, document style records shown in the example.

SQL database is incorrect because that term refers to databases that use structured query language and typically rely on fixed schemas and tables. The example shows flexible, nested structures which point to a NoSQL document model rather than a traditional SQL table design.

Look for nested objects and variable fields across records as strong indicators of a document or NoSQL model. When you see hierarchical JSON like data think non relational rather than a fixed table.
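
Writing the Customer 101 record from the question out as JSON makes the document style structure easier to see, because the nesting and per customer variation are exactly what a fixed relational layout does not capture well.

```python
import json

# The Customer 101 record from the question, written out as a nested document.
customer_101 = {
    "id": 101,
    "name": "Anna Rivera",
    "telephone": {
        "home": "44 20 7123 4567",
        "business": "44 20 8123 4567",
        "mobile": "44 7700 900111",
    },
    "address": {
        "home": "12 Baker Lane, Some City, NY 10021",
        "office": "9 Tower Road, Some City, NY 10022",
    },
}

print(json.dumps(customer_101, indent=2))
```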

A development team is building a web service that must connect to an Azure SQL database. Which method provides the recommended secure way to store and retrieve the database connection string for the application?

  • ✓ D. Azure Key Vault

The correct answer is Azure Key Vault.

Azure Key Vault is designed to securely store and manage secrets such as database connection strings and it provides encryption at rest, integration with Azure Active Directory for access control, and support for managed identities so the application can retrieve secrets without embedding credentials.

Using Azure Key Vault also enables secret versioning and rotation and it provides logging and auditing so you can track secret access. These capabilities reduce the risk of accidental exposure and make operational security simpler than storing secrets in code or files.

Embed the connection string directly in application code is insecure because code repositories can be leaked and it makes rotation difficult. This practice exposes credentials to developers and to any system with access to the source code.

Use Google Secret Manager refers to a legitimate secrets service, but it is not the recommended choice for protecting an Azure SQL connection string in an Azure environment. Using the cloud provider's native secret store such as Azure Key Vault simplifies authentication and access control and avoids cross cloud complexity.

Keep the connection string in a local configuration file is risky because files can be copied or accidentally committed and they are hard to secure at scale. Configuration files do not provide the centralized access controls and auditing that a secrets manager provides.

When answering exam questions about credential storage prefer the cloud provider native secrets service and think about managed identities for retrieval without embedded secrets. For Azure resources choose Key Vault and mention access policies and rotation.
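
Here is a minimal sketch of retrieving a connection string from Key Vault using a managed identity or a developer sign in; the vault URL and secret name are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# DefaultAzureCredential picks up a managed identity in Azure or a developer login
# locally, so no secret has to be embedded in the application itself.
credential = DefaultAzureCredential()
client = SecretClient(vault_url="https://<vault-name>.vault.azure.net", credential=credential)

# Placeholder secret name holding the Azure SQL connection string.
secret = client.get_secret("sql-connection-string")
connection_string = secret.value  # pass this to the database driver at runtime
print("Retrieved secret:", secret.name)
```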

Identify the missing word or words in the following sentence in the Microsoft Azure context. A(n) [?] is something about which information must be known or held. For the online retailer RetailCo you might create tables for clients, items, and purchases. A table contains rows and each row represents a single instance of a(n) [?].?

  • ✓ C. Entity

The correct option is Entity.

Entity is the right term in Azure table and data modeling contexts because it refers to an object or row about which information must be known or held. For an online retailer you would create tables for clients, items, and purchases, and each row represents a single instance of an Entity with properties that store values such as name, price, and quantity.

Record is not the preferred Azure term even though it can mean a row in other database systems. Azure documentation and table services use the term Entity rather than Record to describe a row.

Resource refers to an Azure managed object such as a virtual machine storage account or database instance. It does not mean a table row or instance in the data model so it is not correct.

Attribute describes a property or column that holds a single piece of data about an entity. It is not the whole instance or row so it does not fit the sentence.

When a question asks what a row or instance is called in Azure tables look for the term entity and remember that columns are called properties or attributes while resources refer to Azure services.

A retail insights team at BlueHarbor is building an analytics platform and they want a managed Apache Spark environment that includes interactive notebooks for data exploration and collaboration. Which Azure service should they select?

  • ✓ D. Azure Databricks

The correct answer is Azure Databricks.

Azure Databricks provides a managed Apache Spark environment with integrated interactive notebooks that support collaborative data exploration and development. It offers an optimized Spark runtime and workspace features that simplify cluster management and enable teams to work together in notebooks.

Azure Databricks also integrates with Azure storage services and supports Delta Lake for reliable analytics and data engineering. For a retail insights team that wants a managed Spark platform with interactive notebooks and collaboration, this service is purpose built for that scenario.

Azure Synapse Analytics can run Spark pools and includes notebooks, but it is a broader unified analytics platform focused on integrated SQL, data warehousing, and orchestration and it does not provide the same specialized Databricks collaborative workspace experience.

Azure Functions is a serverless event driven compute service for running small pieces of code on demand and it does not provide a managed Apache Spark environment or interactive notebooks for collaborative data exploration.

Azure HDInsight is a managed Hadoop and Spark cluster service, but it requires more cluster administration and it does not deliver the same level of integrated collaborative notebooks and optimized managed Spark runtime that Azure Databricks provides.

When a question asks for a managed Apache Spark environment with collaborative notebooks look for services that explicitly mention managed Spark or Databricks. Pay attention to words like managed Spark and interactive notebooks to choose the best fit.

A retail technology firm named Northbridge Labs uses Azure Cosmos DB to run a client facing API and they must charge customers by tracking the total operation count and outbound data transferred. Which Cosmos DB metric should they observe to reflect billing consumption?

  • ✓ C. Total Request Units

Total Request Units is the correct option.

Total Request Units is the Cosmos DB metric that reports the consumed request units across operations and it directly reflects throughput consumption that you are billed for. This metric aggregates the cost of reads, writes, and queries and the resources used to process them, so it is the best single indicator of operation volume and the resource consumption associated with outbound data processing.

Index storage usage is incorrect because it reports how much disk space indexes use and it does not reflect operation volume or throughput consumption.

Provisioned throughput is incorrect because it shows the configured RU per second capacity that you reserve rather than the actual RUs consumed over time.

Normalized RU consumption is incorrect because it is not the standard billing metric for total consumed RUs and it does not replace the Total Request Units metric for billing purposes.

When you need to bill customers by activity, monitor Total Request Units for consumed throughput and also check data egress and storage usage since those can be billed separately.
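
One practical way to see consumed request units per operation is to read the request charge that Cosmos DB returns with each response. The sketch below uses the azure-cosmos Python SDK with placeholder account, database, and container details, and how the SDK surfaces the charge can vary between versions.

```python
from azure.cosmos import CosmosClient

# Placeholder account details; the container is assumed to use /customerId as its partition key.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<account-key>")
container = client.get_database_client("billing").get_container_client("events")

container.create_item({"id": "evt-001", "customerId": "cust-42", "operation": "read"})

# The service reports the RU cost of the last operation in a response header; summing
# these per customer approximates the consumption behind the Total Request Units metric.
charge = container.client_connection.last_response_headers.get("x-ms-request-charge")
print("Request charge (RUs):", charge)
```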

When an on premises application from AlderTech connects to a cloud database using a Proxy connection policy what behavior should you expect?

  • ✓ B. Connections are established through the gateway so that all subsequent requests continue to flow through the gateway and individual calls can be handled by different nodes in the cluster

The correct answer is Connections are established through the gateway so that all subsequent requests continue to flow through the gateway and individual calls can be handled by different nodes in the cluster.

This means the proxy keeps the gateway in the request path so that individual requests can be routed to different backend nodes and load balanced as needed. The gateway handles or forwards each request rather than allowing the application to talk directly to a single database node after the first handshake.

Keeping the gateway in place provides connection management, routing flexibility, and resiliency when backend nodes change. That is why subsequent calls can be handled by different nodes in the cluster while still flowing through the gateway.

After the initial connection the application sends all traffic straight to the database and no longer uses the gateway is incorrect. That option describes a handoff model where the gateway is removed after setup and traffic goes direct to the database, which is not how a proxy connection policy operates in this context.

Cloud SQL Proxy is incorrect because it is just the name of a proxy product and does not by itself describe the specific connection policy behavior asked for. The question asks about how connections flow under a proxy connection policy rather than naming a product.

All connections are validated centrally to confirm they originate from trusted clients is incorrect because central validation may be part of security, but it does not describe the key behavior that subsequent requests continue to flow through the gateway and are routed to cluster nodes.

Read the options for phrases like continue to flow through the gateway versus sent straight to the database to quickly decide whether the proxy stays in path or only handles an initial handshake.

A regional archive team needs to construct a knowledge mining workflow to derive insights from millions of unstructured records such as scanned reports and photos. Which Azure service should they use to accomplish this task?

  • ✓ B. Azure Cognitive Search

Azure Cognitive Search is the correct choice for building a knowledge mining workflow across millions of unstructured records such as scanned reports and photos.

Azure Cognitive Search provides a managed search index with built in AI enrichment that can perform OCR on images, extract text and entities, and apply custom skillsets for entity recognition and language processing. This service scales to handle large ingestion volumes and integrates with other Azure services to create a complete knowledge mining pipeline that indexes and surfaces insights from diverse unstructured sources.

Google Cloud Natural Language API is a text analysis API that can extract entities and sentiment from text but it is not a managed, end to end knowledge mining index with built in document enrichment and search capabilities and it is not an Azure native solution for an archive team.

Azure Databricks is a strong platform for large scale data processing and machine learning and it can be used as part of a custom solution, but it does not provide the out of the box document enrichment pipeline and searchable index that Azure Cognitive Search offers for knowledge mining of documents and images.

Azure Form Recognizer specializes in extracting structured fields and tables from forms, receipts, and invoices and it works well for form like documents, but it is not designed to provide a full scale knowledge mining index across millions of varied unstructured records and photos without additional search and indexing infrastructure.

When a question mentions knowledge mining, index, skillset, or OCR look for a managed search service with AI enrichment which usually points to Azure Cognitive Search.

A regional retailer is evaluating CloudCo’s managed relational database service for its web applications. What are two benefits of adopting a managed PaaS relational database offering compared with running database servers yourself? (Choose 2)

  • ✓ A. Reduced operational overhead for server provisioning patching and backups

  • ✓ B. Immediate access to automatic platform updates and new database features

The correct options are Reduced operational overhead for server provisioning patching and backups and Immediate access to automatic platform updates and new database features.

Reduced operational overhead for server provisioning patching and backups is correct because a managed relational database service takes on routine operational tasks such as provisioning instances, applying operating system and database patches, running automated backups, and handling many maintenance activities so the retailer can focus on its applications and business logic.

Immediate access to automatic platform updates and new database features is correct because the provider applies security fixes and platform updates and can roll out new database capabilities to customers, which reduces the effort required to manage upgrades and lets teams adopt improvements faster than with many self managed deployments.

Total control of low level file system layout and backup schedules is incorrect because managed PaaS offerings intentionally abstract low level system details and do not provide full control over file system layout even though they usually provide configurable backup settings.

Easier linking with object storage for archival and analytic workflows is incorrect because integration with object storage depends on available connectors and the chosen architecture rather than on whether the database is managed, and both managed and self managed databases require work to connect and orchestrate archival and analytical pipelines.

When comparing managed services to self managed servers focus on who performs operational tasks and which responsibilities are offloaded. Look for answers that mention reduced operational burden or automatic updates as indicators of managed service benefits.

A boutique travel tech firm needs to store about 40 thousand guest feedback entries and then determine whether each entry expresses positive negative or neutral feelings. Which Azure Cognitive Service should they use for analyzing the text of these reviews?

  • ✓ D. Azure Text Analytics

Azure Text Analytics is the correct choice.

Azure Text Analytics provides a prebuilt sentiment analysis API that classifies text as positive negative or neutral and it can process large batches of documents which makes it suitable for analyzing about 40 thousand guest feedback entries.

Azure Computer Vision is designed to analyze images and video rather than written text so it would not be used for sentiment analysis of guest reviews.

Azure Speech Service is focused on converting speech to text, producing and recognizing audio, and related tasks, so it only becomes relevant if the feedback were spoken and you first needed to transcribe it.

Azure Language Service is a broader collection of language capabilities, and Microsoft has been unifying language features under that umbrella, but the dedicated and commonly referenced API for straightforward sentiment classification in documentation and exam scenarios is Azure Text Analytics, which is why that option is the expected answer.

When a question asks about sentiment or opinion detection look for the service that explicitly references text analytics or sentiment analysis in its name or documentation.
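
A minimal sketch of batch sentiment scoring with the Text Analytics client library follows; the endpoint and key are placeholders, and a real workload of forty thousand reviews would be submitted in chunks.

```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key for a Language / Text Analytics resource.
client = TextAnalyticsClient(
    endpoint="https://<resource-name>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<api-key>"),
)

reviews = [
    "The room was spotless and the staff were wonderful.",
    "Check in took an hour and nobody apologised.",
    "The stay was fine, nothing special.",
]

# Each result is labelled positive, negative, neutral, or mixed, with confidence scores.
for review, result in zip(reviews, client.analyze_sentiment(documents=reviews)):
    if not result.is_error:
        print(result.sentiment, "-", review)
```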

A smart agriculture company operates roughly 9,600 wireless temperature probes that send readings every 40 seconds and the analytics team will examine changes in temperature over time. Which type of storage system is best suited to hold these time ordered sensor measurements?

  • ✓ C. Time series database

The correct option is Time series database.

Time series database systems are designed for high write throughput and efficient queries over time ordered data which matches the probe workload. With 9,600 probes sending a reading every 40 seconds the system receives about 240 writes per second and about 20.7 million measurements per day, and Time series database engines handle this volume with features like compression, partitioning by time, and optimized time-range queries.

Time series database solutions also provide built in retention policies, downsampling and aggregation mechanisms which make long term storage and historical analysis practical without excessive cost. Those features let the analytics team quickly compute trends and windowed statistics over the sensor data.

Graph database is not a good fit because it is optimized for relationship traversal and complex connected queries rather than high volume sequential time stamped measurements.

BigQuery is a scalable analytical warehouse and it works well for large scale ad hoc analysis, but it is not ideal for very frequent raw inserts and time series primitives like automatic downsampling and retention management.

Relational database can store time stamped rows, but traditional relational systems struggle with sustained high ingest rates and they lack specialized compression and time based query optimizations that purpose built time series databases provide.

When a question describes many sensors and frequent time ordered measurements favor a time series database or a managed time series service and then think about retention and downsampling strategies.
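
The ingest figures in the explanation can be checked with a few lines of arithmetic.

```python
probes = 9_600
interval_seconds = 40

writes_per_second = probes / interval_seconds
writes_per_day = writes_per_second * 24 * 60 * 60

print(f"{writes_per_second:.0f} writes per second")            # 240
print(f"{writes_per_day / 1_000_000:.1f} million writes per day")  # about 20.7
```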

Within the context of Contoso Cloud identify the missing word or words in this sentence. [?] is a collection of services apps and connectors that lets you connect to your data wherever it happens to reside filter it if necessary and then bring it into [?] to create compelling visualizations that you can share with others?

  • ✓ C. Microsoft Power BI

The correct option is Microsoft Power BI.

Microsoft Power BI is a suite of services apps and connectors that lets you connect to your data wherever it happens to reside filter it if necessary and then bring it into Microsoft Power BI to create compelling visualizations that you can share with others.

Azure Databricks is an analytics platform built on Apache Spark for big data processing and machine learning and it is not primarily a visualization service for end user reports and dashboards.

Azure Synapse Spark refers to the Spark runtime within Azure Synapse and it is focused on large scale data processing rather than on providing the reporting and sharing experience that Power BI delivers.

Azure Synapse Studio is the integrated development environment for data integration and analytics inside Synapse and it focuses on pipelines notebooks and SQL development rather than on the interactive visualization and sharing features of Power BI.

Azure Data Factory is an orchestration and ETL service for moving and transforming data and it does not provide the visualization and interactive reporting capabilities that identify Power BI as the correct answer.

Focus on keywords such as visualizations and connectors to distinguish reporting and sharing tools like Power BI from data processing or orchestration services when you answer exam questions.

Identify the missing term in the context of Contoso Cloud storage where items are referred to as rows and attributes are described as columns. This storage does not provide relationships, stored procedures, secondary indexes, or foreign keys and data is normally denormalized with each row holding all the information for a single logical entity?

  • ✓ C. Azure Tables

The correct answer is Azure Tables.

Azure Tables is the Azure Storage Table service that provides a simple NoSQL key and attribute store where entities are treated as rows and their properties are treated as columns. The service does not include relational features such as foreign keys or stored procedures and it lacks built in secondary indexes beyond the PartitionKey and RowKey, so data is typically denormalized with each row holding all information for a single logical entity.

Cloud Bigtable is a Google Cloud wide column NoSQL database and not the Azure Table service described in the question. It targets large scale and low latency workloads and belongs to a different cloud ecosystem, so it is not the correct match here.

Azure Database for PostgreSQL is a managed relational database and it supports schemas, foreign keys, stored procedures and secondary indexes. That makes it inconsistent with the simple, schema free and denormalized Table storage model.

Azure Database for MySQL is also a managed relational database and it provides the relational features that the question explicitly says are not present, so it cannot be the right answer.

Cloud Firestore is a document oriented database that provides richer querying and automatic indexing and it uses collections and documents rather than the simple row and attribute structure described, so it does not match the Table storage description.

Look for keywords such as rows, attributes as columns, no foreign keys and denormalized to identify a simple NoSQL table or key attribute store. Also check the vendor name to make sure the service matches the cloud platform referenced in the question.

An e commerce analytics group has an Azure SQL Database table that holds tens of millions of rows and analysts often filter queries by a single column that does not have an index. Which action will most effectively accelerate those filtered queries?

  • ✓ D. Add a nonclustered index on the frequently filtered column

The correct answer is Add a nonclustered index on the frequently filtered column.

A nonclustered index creates an ordered structure on the column so the query optimizer can perform index seeks instead of scanning tens of millions of rows. This reduces IO and latency for selective filters. You can also add included columns to a nonclustered index to make it covering and avoid additional lookups.
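
To illustrate, the sketch below uses pyodbc to issue the index creation statement against a hypothetical Orders table. The connection string, table, and column names are placeholders rather than anything from the question.

```python
# Minimal sketch using pyodbc against Azure SQL Database (pip install pyodbc).
# The connection string, table, and column names are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<server>.database.windows.net;DATABASE=<db>;UID=<user>;PWD=<password>"
)
cursor = conn.cursor()

# A nonclustered index on the filtered column lets the optimizer seek instead of scan.
# The INCLUDE clause adds the selected columns so the index covers the query and
# avoids extra lookups against the base table.
cursor.execute("""
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId)
    INCLUDE (OrderDate, TotalAmount);
""")
conn.commit()
```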

Partition the table on that column is not the best first action because partitioning is primarily for manageability and for pruning whole partitions when queries align with the partition boundaries. Partitioning adds complexity and often does not eliminate the need for an index to efficiently locate rows.

Scale up the database service tier to increase compute and IO capacity can improve raw throughput but it does not change the execution plan and it is a costly way to mask missing indexes. Without an index queries will still perform large scans and scaling up is usually less effective than adding the right index.

Create a clustered index on the column might improve performance in some cases because it defines the physical row order, but only one clustered index can exist and changing the clustering key can be intrusive. For accelerating a frequently filtered column a nonclustered index is typically the least disruptive and most targeted solution.

When a column is used often in WHERE clauses try adding an index on that column first before considering scaling or partitioning because indexes usually give the biggest performance gain for the least cost and complexity.

A regional analytics startup called Meridian Insights must choose a data representation that supports nested fields for configuration and document style records and that can be shared between services and stored in document databases. Which data format supports hierarchical structures and is commonly used for configuration files data interchange and document storage in NoSQL systems?

  • ✓ C. JavaScript Object Notation JSON

The correct choice is JavaScript Object Notation JSON.

JavaScript Object Notation JSON natively supports nested objects and arrays which makes it suitable for hierarchical configuration and document style records. It is language independent and widely used for data interchange between services and for storing documents in NoSQL systems because it is human readable and easy to parse.
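
A short example with Python's standard json module shows how nested objects and arrays round trip between text and in-memory structures. The field names are made up purely for illustration.

```python
# Minimal sketch: a nested configuration serialized and parsed with the standard library.
import json

config = {
    "service": "meridian-analytics",  # hypothetical names for illustration
    "retries": {"max_attempts": 5, "backoff_seconds": [1, 2, 4]},
    "endpoints": [
        {"name": "ingest", "url": "https://example.com/ingest"},
        {"name": "query", "url": "https://example.com/query"},
    ],
}

# The same JSON text can be exchanged between services or stored as a document
# in a NoSQL document database.
text = json.dumps(config, indent=2)
restored = json.loads(text)
assert restored["retries"]["max_attempts"] == 5
```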

Comma separated values CSV is intended for flat, tabular data and it does not natively support nested fields so it is not appropriate for hierarchical configuration or document style records.

Cloud Firestore is a managed NoSQL document database and not a standalone data representation format. The question asks for a data format that can be shared and stored, so the format itself rather than a specific database service is the correct answer.

Extensible Markup Language XML does support hierarchical structures and it can be used for configuration and document storage, but it is less commonly used today for lightweight APIs and many modern NoSQL systems and configuration files prefer JavaScript Object Notation JSON for its simplicity and ubiquity.

Choose answer choices that name a data format rather than a platform or service and remember that JSON is the common lightweight option for nested configuration and NoSQL documents.

Within the context of Contoso Cloud, identify the missing term in this sentence. A(n) [?] partners with stakeholders to design and build data assets such as ingestion pipelines, cleansing and transformation workflows, and data stores for analytical workloads, and they use a broad set of data platform technologies including relational and non relational databases, file storage, and streaming sources?

  • ✓ C. Contoso Data Engineer

Contoso Data Engineer is correct. This role description matches a data engineer who partners with stakeholders to design and build data assets such as ingestion pipelines, cleansing and transformation workflows, and data stores for analytical workloads.

The data engineer builds and owns the pipelines and workflows that bring raw data into analytical stores and that prepare data for analysis. They work across relational and non relational databases, file storage, and streaming sources and they implement cleansing, transformation, and integration logic so analysts and data scientists can use reliable data.

Contoso Cloud Solutions Architect is incorrect because cloud architects focus on overall solution design, governance, and platform choices rather than hands on construction of ingestion pipelines and transformation workflows.

Contoso Data Analyst is incorrect because analysts focus on exploring and interpreting prepared data and producing reports and dashboards rather than building the underlying data ingestion and transformation infrastructure.

Contoso Database Administrator is incorrect because database administrators concentrate on operating, tuning, and securing database systems rather than designing broad data pipelines and streaming workflows for analytical workloads.

Focus on the verbs in the question such as design, build, and ingestion pipelines to pick the role that performs engineering and construction of data assets rather than roles that only analyze or operate databases.

At Meridian Data Labs a row oriented file format created by an Apache project stores each record with a header that defines the record structure and that header is encoded as JSON while the record payload is stored as binary. Applications read the header to interpret and extract the fields. Which format is being described?

  • ✓ G. Avro

Avro is correct because Apache Avro stores a JSON-encoded schema with records and writes the record payload in a compact binary form so applications read the schema to interpret and extract fields.

Avro was designed as a row oriented serialization system and it embeds the schema with the data so readers can discover field definitions at read time and then decode the binary payload accordingly. This combination of a JSON header for the schema and binary-encoded record data is a defining characteristic of Avro.
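
A minimal sketch with the fastavro library shows the pattern in practice. The schema, field names, and file name are illustrative assumptions rather than anything from the question.

```python
# Minimal sketch using fastavro (pip install fastavro).
from fastavro import parse_schema, reader, writer

schema = parse_schema({
    "type": "record",
    "name": "Reading",
    "fields": [
        {"name": "sensor_id", "type": "string"},
        {"name": "value", "type": "double"},
    ],
})

# The writer stores the JSON schema in the file header and encodes records as compact binary.
with open("readings.avro", "wb") as out:
    writer(out, schema, [{"sensor_id": "s-17", "value": 21.4}])

# A reader parses the header schema first and then uses it to decode the binary payload.
with open("readings.avro", "rb") as inp:
    for record in reader(inp):
        print(record["sensor_id"], record["value"])
```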

Parquet is incorrect because Parquet is a columnar storage format optimized for analytical reads and it does not embed per-record JSON headers for binary payloads.

BigQuery is incorrect because BigQuery is a managed data warehouse service and not a file format that embeds a JSON schema with binary record payloads.

JSON is incorrect because JSON is a text based format that represents both schema and data in human readable text rather than storing a separate JSON header with a binary payload.

ORC is incorrect because ORC is a columnar file format for Hadoop style workloads and it uses columnar encodings and metadata rather than per record JSON headers and binary record payloads.

XLSX is incorrect because XLSX is a zipped XML based spreadsheet format and it does not follow the Avro pattern of a JSON schema header plus binary encoded records.

CSV is incorrect because CSV is a plain text row format without an embedded JSON schema header and it stores values as text rather than as a binary payload with an accompanying JSON description.

When a question mentions a schema embedded with each record as JSON and a compact binary payload think Avro and rule out columnar formats like Parquet or ORC and managed services like BigQuery.

Identify the missing word or phrase in this Microsoft Azure scenario. The blank approach is ideal for migrations and for applications that require full operating system level access. SQL Server virtual machines follow a lift and shift pattern and you can transfer an on premises installation into a cloud virtual machine with minimal adjustments so the system behaves largely as it did in its original environment?

  • ✓ C. Infrastructure as a Service

The correct option is Infrastructure as a Service.

Infrastructure as a Service provides full operating system level access by letting you run virtual machines in the cloud, which makes it ideal for lift and shift migrations. Deploying SQL Server on an Azure virtual machine lets you transfer an on premises installation into a cloud VM with minimal adjustments so the system behaves much like it did in its original environment.

Platform as a Service is focused on managed application platforms and does not give you full OS level control, so it is not the right fit for a straightforward lift and shift of an entire server.

Function as a Service is a serverless model for running small, event driven functions and it does not provide an operating system or long running VMs, so it cannot host a full SQL Server instance.

Software as a Service delivers fully managed applications where the provider controls the OS and infrastructure, so you would not be able to perform an on premises style lift and shift of the underlying server.

Watch for keywords like full operating system level access and lift and shift in the question because they usually point to an IaaS solution rather than PaaS, FaaS, or SaaS.

Which class of analytics workload is most suitable for handling a continuous flow of incoming data that has no defined start or finish?

  • ✓ C. Continuous stream processing with low latency

Continuous stream processing with low latency is the correct option.

Continuous stream processing with low latency is designed for unbounded data where there is no defined start or finish and it processes events as they arrive which keeps end to end latency low.

The approach supports event by event processing, windowing, and stateful operations which are required when you need near real time insights from a continuous flow of data.
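
The plain Python sketch below illustrates the idea of tumbling windows over an unbounded source. A production workload would use a managed streaming engine, and the simulated sensor events here are purely illustrative.

```python
# Minimal sketch of tumbling-window aggregation over an unbounded event stream.
import itertools
import random
from collections import defaultdict

def sensor_events():
    """Simulates an unbounded stream; in principle it never ends."""
    event_time = 0.0
    while True:
        event_time += 0.5  # simulated event time advances half a second per event
        yield {"sensor": random.choice(["a", "b"]), "value": random.random(), "ts": event_time}

WINDOW_SECONDS = 5.0
window_start = 0.0
totals = defaultdict(float)  # per-sensor running state for the current window

# islice bounds the demo; a real streaming job would consume events indefinitely.
for event in itertools.islice(sensor_events(), 40):
    if event["ts"] - window_start >= WINDOW_SECONDS:
        print(f"window starting at {window_start}:", dict(totals))  # emit and reset
        totals.clear()
        window_start = event["ts"]
    totals[event["sensor"]] += event["value"]
```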

Microbatch processing for frequent scheduled jobs is not suitable because it relies on fixed schedules and it introduces latency that is higher than what low latency streaming requires.

Cloud Dataflow is a specific managed service that can implement streaming workloads but it is not the class of workload itself and the question asked for the type of analytics workload rather than a product.

Traditional ETL batch pipeline is built for bounded datasets with a clear start and finish and it cannot provide the low latency needed for continuous unbounded streams.

Focus on whether the data is bounded or unbounded and on latency needs. Unbounded continuous flows with real time requirements point to stream processing rather than batch approaches.

A regional bookseller plans to migrate its on-site SQL Server instance to a Google Cloud Compute Engine virtual machine and asserts that SQL Server hosted in the VM behaves the same as the physical on-premises server and that migrating the databases is identical to copying them between two local servers. Is that claim true?

  • ✓ B. True

The correct answer is True.

A SQL Server instance running inside a Google Cloud Compute Engine virtual machine uses the same SQL Server binaries and operating system configuration so the database engine behaves the same from an application and administration perspective. You can use the same SQL Server management tools and features and the VM runs the same database engine as an on premises server.

That said, there are important cloud specific differences to consider when you move to a VM. Persistent disk performance characteristics, network latency and bandwidth, and licensing or high availability choices can differ from a local physical environment and those differences affect performance tuning and operational procedures. You should size VM types and attach appropriate disk types and configure backups and HA according to cloud best practices.

Database migration workflows are similar to on premises methods in that you can use backup and restore, detach and attach, or replication. However copying data between an on premises server and a cloud VM is not always identical to a local server to server copy because you must account for network transfer, consistent snapshots, potential downtime windows, and any cloud specific tooling or transfer appliances for very large datasets.
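
As an illustration of the backup and restore path only, here is a minimal pyodbc sketch. The DSNs, database name, and file paths are hypothetical placeholders, and a real migration would also have to plan the network transfer of the backup file between the two steps.

```python
# Minimal sketch of a backup and restore migration issued through pyodbc.
# BACKUP and RESTORE cannot run inside a transaction, so autocommit is enabled.
import pyodbc

# 1. Back up the on premises database to a file.
source = pyodbc.connect("DSN=onprem-sql;UID=<user>;PWD=<password>", autocommit=True)
source.cursor().execute(
    "BACKUP DATABASE SalesDb TO DISK = N'C:\\backups\\SalesDb.bak' WITH COMPRESSION;"
)

# 2. Copy SalesDb.bak to the cloud virtual machine over the network or a transfer
#    appliance, then restore it on the SQL Server instance running in the VM.
target = pyodbc.connect("DSN=cloudvm-sql;UID=<user>;PWD=<password>", autocommit=True)
target.cursor().execute(
    "RESTORE DATABASE SalesDb FROM DISK = N'D:\\staging\\SalesDb.bak' WITH RECOVERY;"
)
```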

False is incorrect because saying the claim is simply false ignores that the SQL Server engine itself does behave the same in a VM. The statement is best understood as true with practical caveats about storage, networking, licensing and migration mechanics.

When exam statements claim two environments behave the same, check for infrastructure differences such as storage performance, network transfer and licensing that can change operational or migration procedures.

Which term best completes this Contoso Cloud sentence where the blank data store contains a collection of objects, data values, and named string fields formatted as JSON?

  • ✓ D. Document

The correct option is Document.

Document stores hold semi structured records as JSON or similar formats and they contain named fields and nested objects and arrays. A document database organizes data into collections of such documents so you can query by field names and store complex object graphs inside each record.
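
For a concrete picture, here is a minimal sketch using the azure-cosmos Python SDK. The account endpoint, key, database, container, and document fields are hypothetical placeholders.

```python
# Minimal sketch with the azure-cosmos SDK (pip install azure-cosmos).
# The endpoint, key, and all names below are hypothetical placeholders.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("catalog").get_container_client("products")

# Each document is a JSON object with named fields and nested structures.
container.create_item({
    "id": "p-100",
    "category": "books",
    "title": "Data Fundamentals",
    "pricing": {"list": 29.99, "currency": "USD"},
})

# Documents are queried by their named fields rather than by key alone.
for item in container.query_items(
    query="SELECT c.title FROM c WHERE c.category = 'books'",
    enable_cross_partition_query=True,
):
    print(item["title"])
```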

Time series is not correct because time series stores are optimized for sequences of timestamped measurements and not for arbitrary JSON documents with named string fields.

Graph is not correct because graph databases represent data as nodes and edges for relationship queries rather than as collections of JSON documents.

Key-Value is not correct because key value stores map a key to an opaque value or blob and they do not expose structured named fields within the stored value in the same way a document store does.

When the question mentions JSON documents with named fields and collections think document database and look for words like collection nested objects and queryable fields.

Jira, Scrum & AI Certification

Want to get certified on the most popular software development technologies of the day? These resources will help you get Jira certified, Scrum certified, and even AI Practitioner certified so your resume really stands out.

You can even get certified in the latest AI, ML and DevOps technologies. Advance your career today.

Cameron McKenzie is an AWS Certified AI Practitioner, Machine Learning Engineer, Copilot Expert, Solutions Architect and author of many popular books in the software development and Cloud Computing space. His growing YouTube channel training devs in Java, Spring, AI and ML has well over 30,000 subscribers.