At GreenField Analytics, a group of seven researchers plans to build a bespoke generative AI model and they want to iterate quickly with popular libraries while sharing work in the same environment. They also need frictionless access to BigQuery datasets without lengthy configuration and they prefer to spend their time coding models rather than maintaining servers and packages. Which Google Cloud service best meets these needs by offering a fully managed collaborative Jupyter environment with common ML frameworks already available?
-
❏ A. Vertex AI Pipelines
-
❏ B. Cloud Shell Editor
-
❏ C. Vertex AI Workbench
-
❏ D. Vertex AI Model Garden
A learning and development lead at a regional credit union needs to produce role-specific welcome videos for seven departments, compress lengthy compliance guides into executive briefs, and analyze open-ended survey comments to spot skills gaps. For the task of delivering tailored onboarding videos for each department, which generative AI capability provides the greatest value?
-
❏ A. Summarization
-
❏ B. Discovery
-
❏ C. Creation
-
❏ D. Vertex AI Search
A market intelligence firm named Elkstone Insights plans to launch a generative AI assistant that lets analysts ask about breaking developments and get answers that include citations from trustworthy and current public news sites. They want the assistant grounded on broad open web “world data” rather than only their private repositories, so which Google Cloud grounding capability should they use?
-
❏ A. Vertex AI Extensions with a third-party news API connector
-
❏ B. Vertex AI Search configured for internal repositories
-
❏ C. Grounding with Google Search
-
❏ D. RAG APIs with a custom vector store built from historical news archives
EduBright at example.com uses a generative AI mentor to assemble personalized study plans for its learners. Over the past 24 months the model was trained on interaction logs in which roughly 80 percent came from a single cultural and regional cohort, and students from other groups now receive plans that are less engaging and less effective. What area should the team focus on to ensure the system provides equitable experiences and outcomes for all student populations?
-
❏ A. Model scalability
-
❏ B. Explainable AI
-
❏ C. Fairness in AI systems
-
❏ D. Computational efficiency
A compliance team at a multinational bank plans to launch an internal assistant that performs grounded reasoning over regulatory PDFs. They need to automate the creation and lifecycle of “spaces” and to ingest multiple source files from example.com and to issue question and answer prompts against those sources using code instead of the browser UI. Which Google Cloud service provides an API to enable this programmatic control?
-
❏ A. Document AI API
-
❏ B. Gemini API
-
❏ C. Cloud NotebookLM API
-
❏ D. Vertex AI Search API
Which MLOps tooling should a team that retrains models every 30 days use to ensure reproducible training by versioning datasets and shared features in alignment with code commits?
-
❏ A. Vertex AI Metadata
-
❏ B. Feature store or DVC style data versioning
-
❏ C. Data catalogs such as Data Catalog
-
❏ D. Experiment tracking systems
An aerospace parts distributor maintains a large internal repository of service bulletins, repair procedures, and wiring diagrams that are scattered across wikis, file shares, and document libraries. The company wants its maintenance engineers to ask natural language questions and quickly retrieve exact passages from these complex materials. Which Google Cloud product is purpose built to create an enterprise search experience over an organization’s private content?
-
❏ A. Document AI
-
❏ B. Vertex AI Search
-
❏ C. Cloud Storage
-
❏ D. Google Public Search
A travel assistant built on GCP receives a request that says “Is it raining in Sydney right now and what is the capital of Australia”. The agent first plans that it must fetch live weather for Sydney and also retrieve the country capital, then it invokes a weather API tool and a knowledge base lookup, and finally it composes a single response. This loop of thinking through subtasks and then calling tools to gather facts best represents which prompting pattern?
-
❏ A. Chain-of-Thought prompting
-
❏ B. ReAct approach
-
❏ C. Retrieval-augmented generation
-
❏ D. Few-shot prompting
A consumer insights team at BeaconField Analytics is choosing a feedback collection strategy for a new product launch. They are evaluating typed questionnaires or 45-minute video conversations with 120 customers and they decide to move forward with video interviews. What is an important business consequence of selecting video data rather than text for the AI analysis?
-
❏ A. Model training completes faster because video contains denser information
-
❏ B. Store the raw interviews in BigQuery to reduce costs and simplify analytics
-
❏ C. You can capture multimodal cues such as tone and emotion for deeper insights although storage and compute costs increase
-
❏ D. Overall storage and processing expenses will be substantially lower with video
A content intelligence team at Ravenbrook Publishing wants to use AI to search across PDFs, product mockup images, and recorded webinars so they can gather insights from their internal asset library. They need one approach that can interpret and retrieve information from different modalities in a single workflow. Which Google Cloud capability should they choose?
-
❏ A. Document AI
-
❏ B. Multimodal Search
-
❏ C. Prompt Tuning
-
❏ D. Vertex AI Feature Store
All questions come from Cameron McKenzie’s Generative AI Practice Exams Udemy course and certificationexams.pro
An Operations Director at MetroParcel Logistics is evaluating a proposal for an AI enabled dispatch platform. The proposal describes four components. It includes a Google Kubernetes Engine cluster with GPUs to run the workloads. It names the foundation model to be used for route planning. It lists a browser portal at example.com that drivers will use to view their manifests. It also specifies an AI component that monitors real time traffic feeds and independently updates routes to hit delivery goals. Within the generative AI stack, which component aligns with the Agent layer?
-
❏ A. The foundation model hosted on Vertex AI for route planning
-
❏ B. The web portal for drivers on example.com
-
❏ C. Google Kubernetes Engine with attached GPUs
-
❏ D. The AI that autonomously adjusts routes based on live traffic
Which Google Cloud feature offers auditable model documentation and bias evaluation across demographic groups?
-
❏ A. Vertex AI Explainable AI
-
❏ B. Vertex AI Model Monitoring
-
❏ C. Vertex AI Model Cards and Fairness Indicators
BlueRiver Freight is building a new streaming pipeline on Google Cloud. The team captures raw telemetry from delivery drones and writes every event into a Cloud Storage landing bucket for at least 90 days before any parsing, filtering, or transformation occurs. Which phase of the machine learning lifecycle best describes this activity?
-
❏ A. Model Deployment
-
❏ B. Data Preparation
-
❏ C. Data Ingestion
-
❏ D. Model Training
A compliance team at a mid sized bank wants to use generative AI to analyze and summarize entire loan agreements that are typically 80 to 160 pages in length. The bank’s technology advisor explains that some models can only handle a small portion of the contract in one request while others can process the full agreement but each analysis costs more. Which model attribute is most important for this workload?
-
❏ A. Tailored fine tuning for financial and legal vocabulary
-
❏ B. Large context window to accommodate the entire document in one pass
-
❏ C. Knowledge cutoff date of the model
-
❏ D. Temperature settings to adjust creativity versus determinism
Apex Manufacturing wants to launch an internal chatbot within 14 days that can answer employee questions using about 60,000 HR and compliance documents stored in Google Drive and Cloud Storage, and leadership requires that responses be grounded in the source files to reduce hallucinations while preferring a solution that leverages Google’s search expertise rather than a complex custom build, so which Google Cloud service should they use to quickly deliver this RAG powered assistant?
-
❏ A. Dialogflow CX
-
❏ B. Vertex AI Endpoints
-
❏ C. Vertex AI Search
-
❏ D. Training a foundation model from scratch in Vertex AI
A mid sized apparel brand called RiverStone Outfitters launches a generative AI assistant that answers customer questions using its internal support articles. An attacker submits a message with concealed directions that convince the assistant to ignore its guardrails and reveal sensitive configuration details. What is this attack technique called when the adversary manipulates model behavior by placing malicious instructions inside the user prompt?
-
❏ A. Model inversion
-
❏ B. Prompt injection
-
❏ C. Data poisoning
-
❏ D. Denial-of-service
A small media analytics group at Blue Harbor Labs has built a custom data processing routine into a container image and wants to publish it as an HTTPS API that a generative AI agent on example.com can invoke as a tool. The platform must be fully managed and elastic and it must scale down to zero when idle and it should handle bursts of about 4,000 requests per minute during promotions. Which Google Cloud service should they choose to run this containerized endpoint?
-
❏ A. Cloud Functions
-
❏ B. GKE Autopilot
-
❏ C. Cloud Run
-
❏ D. Vertex AI Model Garden
In Google Cloud, what is the most effective way to adapt a general model so it consistently produces text in a legal style and uses appropriate legal terminology?
-
❏ A. Retrieval augmented generation with a private legal knowledge base
-
❏ B. Domain fine tuning on a curated legal corpus
-
❏ C. Prompt engineering with legal style instructions
-
❏ D. Lower model temperature
Cobalt Pixel Labs is developing new foundation models for multimodal search and needs on demand access to large fleets of TPUs and GPUs along with high throughput networking and durable storage that are fully operated by their cloud provider. Within the generative AI stack, which layer primarily delivers these foundational compute capabilities?
-
❏ A. Models
-
❏ B. Infrastructure
-
❏ C. Platforms
-
❏ D. Applications
Novamart is piloting a voice-enabled virtual concierge for its customer support line that must capture callers’ spoken questions, infer meaning and detect sentiment, then speak a personalized reply in real time. Which combination of Google Cloud AI APIs should be used to deliver these core capabilities?
-
❏ A. Dialogflow CX and Cloud Run
-
❏ B. Speech-to-Text API, Cloud Natural Language API, and Text-to-Speech API
-
❏ C. Cloud Vision API and Cloud Video Intelligence API
-
❏ D. Translation API and Document AI API
All questions come from Cameron McKenzie’s Generative AI Practice Exams Udemy course and certificationexams.pro
Skylark Journeys, an online travel agency, introduced a generative AI virtual assistant to triage customer chats, and the operations team is comparing the average handle time for cases that escalate to human agents during the 45 days after launch with the average handle time in the 90 days before launch to assess impact. This measurement reflects which kind of outcome?
-
❏ A. Customer satisfaction indicators
-
❏ B. Direct revenue generation
-
❏ C. Operational efficiency and cost savings
-
❏ D. Brand perception lift
A content operations manager at scrumtuous.com is weighing two ways to improve generative writing workflows. One path focuses on teaching staff to craft clearer and more targeted prompts and the other involves adopting a tool that automatically adapts the underlying model to match their editorial voice and tasks. Which statement best describes these two approaches?
-
❏ A. The first is few shot prompting and the second is full model retraining on Vertex AI
-
❏ B. The first is prompt engineering and the second is prompt tuning and prompt tuning typically reduces ongoing enablement effort while requiring greater upfront investment
-
❏ C. Both are prompt engineering and they require the same level of investment
-
❏ D. The first is prompt tuning and the second is prompt engineering and they demand the same technical depth
A regional healthcare network wants its internal help desk to quickly surface answers from more than 25,000 clinical guidelines, internal FAQs, and support transcripts. They require an AI driven experience that understands natural language intent, retrieves precise results across private repositories, and improves relevance over time using click and rating feedback. Which Google Cloud service best meets these needs?
-
❏ A. Vertex AI Vector Search
-
❏ B. Google Search
-
❏ C. Vertex AI Agent Builder
-
❏ D. Vertex AI Search
In a generative workflow, which technique uses the output of one prompt as context for the next prompt?
-
❏ A. Zero-shot prompting
-
❏ B. Prompt chaining
-
❏ C. Retrieval-augmented generation
A national e commerce retailer wants to automate customer support for complicated order and delivery requests that must pull data from three internal systems, apply company policies, and reply in a consistent brand voice. The team tried a generic chatbot and found it could not orchestrate workflows or connect to tools as required. What approach on Google Cloud should they adopt to meet these needs?
-
❏ A. Vertex AI Search
-
❏ B. Gemini Advanced
-
❏ C. Vertex AI Agent Builder
-
❏ D. Fine tune a foundation model with Model Garden
An insurance carrier plans to use generative AI to draft customer policy summaries. State regulators require that each document follows a fixed layout with three sections, includes the approved legal disclaimer verbatim, and uses a glossary managed by the compliance team without deviation. When selecting a model, which factor is most critical to meet these strict regulatory obligations?
-
❏ A. The model’s ability to produce highly imaginative and varied text
-
❏ B. Strong customization features that let the model learn and apply mandated templates, disclaimers and vocabulary
-
❏ C. The lowest cost per generated document
-
❏ D. Native support for structured JSON style output
Oriole Outfitters is a multinational retailer that plans to roll out generative AI across nine business units. The leadership wants a unified governance approach for AI security and risk management, and they must ensure their AI programs follow industry standards and comply with regulations in every region. Which Google Cloud offering or framework is purpose built to guide organizations on AI security governance and risk control across the lifecycle?
-
❏ A. Assured Workloads
-
❏ B. Vertex AI Model Garden
-
❏ C. Google Cloud AI optimized infrastructure TPUs and GPUs
-
❏ D. Google’s Secure AI Framework SAIF
BrightHire Labs launches a generative assistant that ranks resumes for hiring teams, and after two months an internal fairness check finds that it often scores candidates from certain communities as “lower potential” even when their education and experience are equivalent. The model was trained largely on 12 years of historical hiring decisions from a field with persistent underrepresentation of those communities. What is the most plausible root cause of this behavior?
-
❏ A. Production data drift that changed resume patterns since deployment
-
❏ B. Skewed training data that encodes past hiring bias and teaches unfair patterns
-
❏ C. Lack of model interpretability which makes it hard to see why predictions were made
-
❏ D. Vertex AI Model Monitoring
A regional telecom provider named CoastalConnect plans to roll out a generative AI assistant for its help desk analysts. Over three weeks the leadership schedules hands-on enablement sessions, explains the expected benefits and workflow impacts, and opens a dedicated channel for ongoing feedback and questions from staff. Which organizational area do these actions most directly target during gen AI integration?
-
❏ A. Responsible AI governance and model monitoring
-
❏ B. Data security and privacy compliance
-
❏ C. Managing organizational change and user adoption
-
❏ D. Algorithm selection and model fine-tuning
What is the primary platform-level risk of using a no-code AI builder that offers only one general-purpose model and fixed UI widgets?
-
❏ A. Unpredictable infrastructure costs
-
❏ B. Limited access to Vertex AI features
-
❏ C. Limited capabilities reduce evolution and differentiation
-
❏ D. Need to hire more infrastructure engineers
A marketing coordinator at the online retailer mcnz.com spends several hours each week turning transcripts from Google Meet that are about 90 minutes long into succinct summaries and composing follow-up messages in Gmail. They want an out of the box generative AI capability that works natively inside their current Google Workspace apps so they can automate these tasks without switching tools. Which Google Cloud offering should they use?
-
❏ A. NotebookLM
-
❏ B. Gemini for Google Workspace
-
❏ C. Vertex AI Search
-
❏ D. The standalone Gemini app
A retail analytics firm named BlueMarket Labs is building an assistant on Vertex AI that uses one model to condense long vendor documents, another to categorize customer comments, and another to brainstorm campaign ideas. The lead architect wants the solution to be responsive and economical. Which approach would not be a valid way to optimize both cost and performance on Vertex AI?
-
❏ A. Cache answers for frequent or identical prompts to avoid repeat calls
-
❏ B. Route all traffic only to the region with the lowest per unit compute price even if it increases user latency
-
❏ C. Use a lightweight model such as Gemini 1.5 Flash for simple classification and a stronger model such as Gemini 2.0 Pro for complex reasoning
-
❏ D. Constrain the “max output tokens” in requests so verbose replies do not inflate cost
An applied research lab at example.com is building generative AI prototypes and wants the freedom to combine open-source frameworks and community models with managed cloud capabilities. They want to reduce vendor lock-in and benefit from rapid innovation across the wider AI ecosystem. Which element of Google Cloud’s generative AI strategy would best align with these goals?
-
❏ A. Enterprise security and compliance controls
-
❏ B. Prebuilt industry accelerators for AI
-
❏ C. Commitment to an open ecosystem with support for open-source models, tools, and interoperability
-
❏ D. Cloud TPU
A creative agency plans to use a foundation model in Vertex AI to produce 20 second promotional videos from short text prompts for social networks. When evaluating candidate models, what is the single most important first criterion to verify for this task?
-
❏ A. Inference latency for generating video clips
-
❏ B. Availability of customization or fine-tuning features
-
❏ C. Support for the required modalities that is text to video
-
❏ D. Maximum input context window size for prompts
A regional convenience store cooperative plans to deploy generative AI to produce localized marketing copy for more than 650 independently owned locations. Each message must reflect nearby events, regional phrasing, and the precise product availability at each store. The leadership team needs an approach that can expand efficiently and be quickly adapted for each owner’s needs. Given these requirements, which consideration should drive the selection of the generative AI solution?
-
❏ A. Maximizing the size of the base model above all other factors
-
❏ B. Prioritizing advanced mathematical and logical problem solving
-
❏ C. Ensuring the approach can scale broadly and be easily tailored for each locale
-
❏ D. Minimizing per-token cost as the primary decision criterion
Generative AI Leader Practice Exam Answers
All questions come from Cameron McKenzie’s Generative AI Practice Exams Udemy course and certificationexams.pro
At GreenField Analytics, a group of seven researchers plans to build a bespoke generative AI model and they want to iterate quickly with popular libraries while sharing work in the same environment. They also need frictionless access to BigQuery datasets without lengthy configuration and they prefer to spend their time coding models rather than maintaining servers and packages. Which Google Cloud service best meets these needs by offering a fully managed collaborative Jupyter environment with common ML frameworks already available?
-
✓ C. Vertex AI Workbench
The correct option is Vertex AI Workbench.
This service provides a fully managed Jupyter environment that teams can share so researchers can collaborate in the same workspace. It comes with common machine learning frameworks already installed which lets the group start coding immediately. It also offers streamlined access to BigQuery through built-in connectors and project authentication so there is no lengthy configuration. Because the platform manages infrastructure and packages, the team can focus on iterating on their generative model rather than maintaining servers.
Vertex AI Pipelines is built for orchestrating and automating ML workflows, not for interactive, collaborative notebook development. It does not provide a shared Jupyter environment with preinstalled frameworks.
Cloud Shell Editor is a lightweight browser-based IDE meant for general development and administration. It is not optimized for data science collaboration, lacks the managed Jupyter experience, and does not provide the ready-to-use ML frameworks or scalable notebook runtimes that the scenario requires.
Vertex AI Model Garden is a catalog to discover and use foundation models and solutions. It is not a managed notebook environment for building custom models or for collaborative Jupyter-based development.
When a scenario stresses a managed and collaborative Jupyter experience with easy BigQuery access, prefer the notebook platform over orchestration tools or general IDEs.
A learning and development lead at a regional credit union needs to produce role-specific welcome videos for seven departments, compress lengthy compliance guides into executive briefs, and analyze open-ended survey comments to spot skills gaps. For the task of delivering tailored onboarding videos for each department, which generative AI capability provides the greatest value?
-
✓ C. Creation
The correct option is Creation because the task is to generate new, department-specific onboarding videos rather than simply condense or retrieve existing information.
This capability focuses on producing original multimedia outputs that can be tailored by department. It can use role context and brand guidelines to draft scripts, propose storyboards, craft narration and assemble visuals so each team receives a video that matches its responsibilities and terminology.
Summarization is designed to condense lengthy content into shorter formats and it fits the executive brief need but it does not generate new video assets for different departments.
Discovery helps uncover relevant information and patterns which can guide what to include in training yet it still does not create the actual onboarding videos.
Vertex AI Search provides retrieval and question answering over indexed enterprise content and it improves findability rather than producing department-specific video content.
Match the verb in the scenario to the capability. If the need is to create new content choose creation. If the need is to condense material choose summarization. If the need is to find or surface information choose discovery or a search tool.
A market intelligence firm named Elkstone Insights plans to launch a generative AI assistant that lets analysts ask about breaking developments and get answers that include citations from trustworthy and current public news sites. They want the assistant grounded on broad open web “world data” rather than only their private repositories, so which Google Cloud grounding capability should they use?
-
✓ C. Grounding with Google Search
The correct answer is Grounding with Google Search.
Grounding with Google Search lets the model consult live open web content and return answers with citations and attributions to trustworthy public sources. It is designed for broad world knowledge and freshness so it is well suited for questions about breaking developments where current information and source links are essential.
Vertex AI Extensions with a third-party news API connector is not a grounding capability and would limit the assistant to whatever that single API exposes. This approach does not provide comprehensive open web coverage and it does not inherently deliver broad, up to date citations across many public sites.
Vertex AI Search configured for internal repositories focuses on enterprise search over your own indexed data. It is optimized for private and internal sources rather than the public web, so it does not meet the need for broad world data from current news sites.
RAG APIs with a custom vector store built from historical news archives retrieves only from the content you ingest, which would be limited to your archive. This cannot guarantee freshness for breaking news and it will not automatically provide citations from across the live public web.
When a question emphasizes current information from the public web and needs citations, prefer grounding with Google Search. When it emphasizes private or enterprise data, prefer Vertex AI Search or RAG over your indexed content.
EduBright at example.com uses a generative AI mentor to assemble personalized study plans for its learners. Over the past 24 months the model was trained on interaction logs in which roughly 80 percent came from a single cultural and regional cohort, and students from other groups now receive plans that are less engaging and less effective. What area should the team focus on to ensure the system provides equitable experiences and outcomes for all student populations?
-
✓ C. Fairness in AI systems
The correct option is Fairness in AI systems. The training data overwhelmingly reflected one cohort and this led to unequal engagement and effectiveness for other groups, so the team needs to focus on equitable data coverage and outcomes across demographics.
Concentrating on this area means auditing subgroup representation in the interaction logs and defining clear equity goals with measurable metrics across groups. The team should rebalance or reweight the data where appropriate, expand data collection for underrepresented students, and incorporate bias mitigation during training and evaluation. They should also monitor outcomes by demographic group after deployment so the model continues to deliver equitable study plans as usage evolves.
Model scalability concerns the ability to handle more users or larger workloads and it does not address disparities caused by skewed training data, so it would not resolve the unequal outcomes across student populations.
Explainable AI improves transparency and helps diagnose issues, yet explanation alone does not correct performance gaps between demographic groups, so it is not the primary focus for restoring equity.
Computational efficiency targets speed and resource usage, which does not tackle representational imbalance or unequal outcomes, so it is not the right emphasis for this scenario.
When outcomes differ by demographic groups, prioritize fairness and representative data before tuning explainability, scalability, or efficiency, and verify improvements with subgroup metrics.
A compliance team at a multinational bank plans to launch an internal assistant that performs grounded reasoning over regulatory PDFs. They need to automate the creation and lifecycle of “spaces” and to ingest multiple source files from example.com and to issue question and answer prompts against those sources using code instead of the browser UI. Which Google Cloud service provides an API to enable this programmatic control?
-
✓ C. Cloud NotebookLM API
The correct option is Cloud NotebookLM API.
Cloud NotebookLM API is designed to programmatically manage the full lifecycle of the assistant experience built around curated sources. It exposes endpoints to create and administer spaces, to add and update sources such as PDFs and web URLs, and to send prompts that return grounded answers that cite those sources. This matches the requirement to automate creation of spaces, to ingest multiple files from example.com, and to issue question and answer prompts without relying on the browser UI.
Document AI API focuses on structured extraction and classification from documents using processors. It does not provide the concept of spaces or an assistant workflow for grounded Q and A over a curated set of sources, so it does not meet the need for programmatic control of spaces and prompts.
Gemini API provides access to large language models for content generation and reasoning. It does not natively manage spaces or ingestion of enterprise sources for grounded retrieval, so you would have to build and host your own ingestion and grounding layer rather than use a ready API for spaces.
Vertex AI Search API powers search and question answering over indexed data through data stores and serving configurations. While it can support retrieval and Q and A, it does not offer the spaces model or the NotebookLM workflow the question requires, so it is not the API that enables programmatic creation and lifecycle of spaces.
Watch for unique product nouns such as spaces or sources. These often map directly to specific services and can quickly eliminate close alternatives that do not expose those exact primitives by API.
Which MLOps tooling should a team that retrains models every 30 days use to ensure reproducible training by versioning datasets and shared features in alignment with code commits?
-
✓ B. Feature store or DVC style data versioning
The correct option is Feature store or DVC style data versioning. This is the only choice that provides dataset and shared feature versioning that can be aligned with code commits so the team can reliably reproduce training every 30 days.
This approach gives you immutable snapshots of data and features that can be tied to commit hashes which ensures the exact same inputs can be retrieved later. A managed feature service supports feature versioning and point in time retrieval so you avoid data leakage and can recreate the precise training view. A Git integrated data versioning tool records data pointers alongside code so restoring the state for any past commit becomes straightforward.
Vertex AI Metadata records lineage and metadata for artifacts and pipelines which is valuable for traceability but it does not provide dataset or shared feature versioned storage aligned with code commits. It complements data versioning rather than replacing it.
Data catalogs such as Data Catalog help discover and govern datasets and they store metadata and tags but they do not create immutable dataset or feature versions nor do they align those versions with code commits.
Experiment tracking systems capture runs, parameters, metrics and artifacts for comparison but they do not version the underlying datasets or shared features which is required for reproducible retraining.
When a question emphasizes reproducibility and alignment with code commits choose tools that version the training data and features rather than tools that only track experiments or metadata.
An aerospace parts distributor maintains a large internal repository of service bulletins, repair procedures, and wiring diagrams that are scattered across wikis, file shares, and document libraries. The company wants its maintenance engineers to ask natural language questions and quickly retrieve exact passages from these complex materials. Which Google Cloud product is purpose built to create an enterprise search experience over an organization’s private content?
-
✓ B. Vertex AI Search
The correct option is Vertex AI Search.
This product is purpose built to deliver enterprise search across an organization�s private content. It can index and unify content from wikis, file shares, document libraries and cloud object stores, then let users ask natural language questions and get exact passages and citations. It supports semantic and keyword retrieval, connectors to common repositories, and respects access controls so engineers only see what they are allowed to see. It can also power conversational experiences over your data which aligns with the need for quick passage level answers from complex materials.
Document AI focuses on parsing and extracting structured data from documents using OCR and specialized processors. While its outputs can feed other systems, it does not provide an end to end enterprise search experience with natural language querying and passage ranking across many sources.
Cloud Storage is durable object storage for files. It does not index content for search or provide natural language question answering, and you would still need a separate search layer to retrieve and rank passages.
Google Public Search is a consumer web search engine for publicly available content on the internet. It cannot index or securely search your private enterprise repositories, so it does not satisfy the requirement.
When you see a need for natural language retrieval and exact passages over private content, select the purpose built enterprise search service rather than storage or document processing products.
A travel assistant built on GCP receives a request that says “Is it raining in Sydney right now and what is the capital of Australia”. The agent first plans that it must fetch live weather for Sydney and also retrieve the country capital, then it invokes a weather API tool and a knowledge base lookup, and finally it composes a single response. This loop of thinking through subtasks and then calling tools to gather facts best represents which prompting pattern?
-
✓ B. ReAct approach
The correct option is ReAct approach.
This pattern fits because the agent explicitly reasons about the subtasks it must perform, decides to call external tools to fetch live weather and look up a fact, executes those tool calls, and then synthesizes the final answer. That think and act loop is the defining characteristic of this approach.
Chain-of-Thought prompting is not correct because it focuses on guiding the model to show its reasoning steps, yet it does not require the model to invoke tools or perform actions to gather fresh facts.
Retrieval-augmented generation is not correct because RAG emphasizes retrieving documents or knowledge snippets to ground the answer, which is usually a single retrieval step that informs generation. The scenario highlights iterative reasoning and multiple tool calls rather than only retrieval to ground the response.
Few-shot prompting is not correct because few-shot relies on exemplars in the prompt to shape behavior. It does not inherently involve planning subtasks or calling external tools.
When a scenario explicitly interleaves thinking with tool use to gather up-to-date facts, identify the ReAct pattern. If it only retrieves documents to ground the model, think RAG. If it shows reasoning without actions, think Chain-of-Thought. If it shows examples in the prompt, think Few-shot.
A consumer insights team at BeaconField Analytics is choosing a feedback collection strategy for a new product launch. They are evaluating typed questionnaires or 45-minute video conversations with 120 customers and they decide to move forward with video interviews. What is an important business consequence of selecting video data rather than text for the AI analysis?
-
✓ C. You can capture multimodal cues such as tone and emotion for deeper insights although storage and compute costs increase
The correct option is You can capture multimodal cues such as tone and emotion for deeper insights although storage and compute costs increase.
Video interviews carry audio and visual signals that text cannot provide which can improve the richness of insights. Tone of voice, pacing, facial expressions and gestures can all help models detect sentiment and nuance more effectively than typed responses. This benefit comes with larger files and more complex processing which means higher storage usage and greater compute time for feature extraction and model inference. On Google Cloud you would typically store the raw footage in Cloud Storage and extract transcripts or features for downstream analytics in BigQuery or Vertex AI.
Model training completes faster because video contains denser information is incorrect because video is high dimensional and includes many frames and an audio track which expands the data volume and typically increases preprocessing and training time rather than reducing it.
Store the raw interviews in BigQuery to reduce costs and simplify analytics is incorrect because BigQuery is a data warehouse optimized for structured and semi structured data such as tables and it is not intended for storing large binary video files. Raw media should be stored in Cloud Storage and only derived metadata or transcripts should be loaded into BigQuery for analytics.
Overall storage and processing expenses will be substantially lower with video is incorrect because video requires significantly more storage than text and it also increases compute for decoding, transcription and visual analysis which generally raises costs.
When a scenario compares text with video or audio first ask what additional signal the richer modality provides and then confirm the tradeoff in storage and compute. If raw files are involved remember that Cloud Storage holds unstructured media and BigQuery holds structured analytics data.
A content intelligence team at Ravenbrook Publishing wants to use AI to search across PDFs, product mockup images, and recorded webinars so they can gather insights from their internal asset library. They need one approach that can interpret and retrieve information from different modalities in a single workflow. Which Google Cloud capability should they choose?
-
✓ B. Multimodal Search
The correct option is Multimodal Search because it lets you interpret and retrieve information from text in PDFs, visual content in images, and audio or video transcripts in a single unified search workflow.
With Multimodal Search, you can index heterogeneous assets and use shared embeddings to perform semantic retrieval across modalities. This capability returns relevant results whether the query or the answer lives in text, images, or recorded webinars, which directly matches the team’s goal of gathering insights across their internal asset library in one approach.
Document AI is designed for document understanding and extraction such as parsing PDFs and forms, yet it is not a unified cross modality search solution and does not provide end to end retrieval across images and recorded webinars.
Prompt Tuning is a parameter efficient way to adapt language models for specific tasks, but it does not provide indexing or retrieval and it is not a search capability across different media types.
Vertex AI Feature Store manages and serves machine learning features for training and online inference, and it does not offer content indexing or semantic search across documents, images, or videos.
When a question emphasizes one approach to search across text, images, and video, look for capabilities that explicitly support multimodal indexing and retrieval. If an option focuses on document extraction, model tuning, or feature management, it is likely not the right fit.
All questions come from Cameron McKenzie’s Generative AI Practice Exams Udemy course and certificationexams.pro
An Operations Director at MetroParcel Logistics is evaluating a proposal for an AI enabled dispatch platform. The proposal describes four components. It includes a Google Kubernetes Engine cluster with GPUs to run the workloads. It names the foundation model to be used for route planning. It lists a browser portal at example.com that drivers will use to view their manifests. It also specifies an AI component that monitors real time traffic feeds and independently updates routes to hit delivery goals. Within the generative AI stack, which component aligns with the Agent layer?
-
✓ D. The AI that autonomously adjusts routes based on live traffic
The correct option is The AI that autonomously adjusts routes based on live traffic because the Agent layer represents autonomous goal driven components that perceive context, decide, and act.
This component monitors real time signals, invokes models and tools as needed, and updates routes without requiring a human instruction for each step. It optimizes toward delivery objectives, which is the defining behavior of an agent in the generative AI stack.
The foundation model hosted on Vertex AI for route planning belongs to the model layer, which provides generation and reasoning capabilities, yet it does not independently plan and execute actions to meet goals.
The web portal for drivers on example.com is a user interface in the application or presentation layer and it does not perform autonomous decision making or tool use.
Google Kubernetes Engine with attached GPUs is infrastructure that hosts workloads and accelerates computation, yet it is not an agent because it does not make decisions or act toward goals.
Identify the component that acts autonomously toward a goal to find the Agent layer, then place models in the model layer, hosting in the infrastructure layer, and user interfaces in the presentation layer.
Which Google Cloud feature offers auditable model documentation and bias evaluation across demographic groups?
-
✓ C. Vertex AI Model Cards and Fairness Indicators
The correct option is Vertex AI Model Cards and Fairness Indicators.
Model Cards in Vertex AI give you structured and auditable documentation that records a model’s intended use, training data characteristics, evaluation metrics, and known limitations. Fairness Indicators complements this by computing and comparing metrics across demographic or user-defined slices so you can identify and quantify potential bias. Together they directly satisfy the need for auditable model documentation and bias evaluation across demographic groups.
Vertex AI Explainable AI focuses on interpreting individual predictions through feature attributions and related explanations. It helps you understand why a model made a prediction, yet it does not provide end to end model documentation or systematic fairness evaluations across groups.
Vertex AI Model Monitoring tracks data drift, prediction drift, and anomalies in production to maintain model health. It is not designed to create auditable documentation or to perform fairness analysis across demographic groups.
Map the requirement to the capability. If the question asks for auditable documentation and bias evaluation across groups then think of Model Cards for documentation and Fairness Indicators for sliced metrics. If it asks for explanations think Explainable AI, and if it mentions drift think Model Monitoring.
BlueRiver Freight is building a new streaming pipeline on Google Cloud. The team captures raw telemetry from delivery drones and writes every event into a Cloud Storage landing bucket for at least 90 days before any parsing, filtering, or transformation occurs. Which phase of the machine learning lifecycle best describes this activity?
-
✓ C. Data Ingestion
The correct option is Data Ingestion.
Capturing raw telemetry events from drones and writing each event to a Cloud Storage landing bucket for retention before any parsing, filtering, or transformation is the activity of bringing data into the platform and persisting it in its original form so that later stages can use it. This is the intake stage of the machine learning lifecycle where data is collected and stored as is to enable subsequent validation, preparation, and modeling steps.
Model Deployment is about pushing a trained model into a serving environment for online or batch predictions and it does not involve collecting or landing raw events.
Data Preparation covers cleaning, parsing, filtering, transforming, and feature engineering. The scenario states that none of these steps occur yet, which means this phase has not started.
Model Training involves fitting algorithms to prepared data to learn parameters. The described activity does not train any model.
When data is being captured and stored in its original form with no changes, think ingestion. If the question mentions parsing, cleaning, filtering, or feature engineering, that points to preparation. If a model is being learned from data, that is training, and if the model is being served, that is deployment.
A compliance team at a mid sized bank wants to use generative AI to analyze and summarize entire loan agreements that are typically 80 to 160 pages in length. The bank’s technology advisor explains that some models can only handle a small portion of the contract in one request while others can process the full agreement but each analysis costs more. Which model attribute is most important for this workload?
-
✓ B. Large context window to accommodate the entire document in one pass
The correct option is Large context window to accommodate the entire document in one pass.
A large context window determines how much input the model can process in a single request, which is crucial when loan agreements are 80 to 160 pages long. With a sufficiently large context window the model can read the entire contract without aggressive chunking, which helps preserve cross references and context so the summaries are more accurate and coherent. This directly addresses the advisor’s point that some models cannot handle the full agreement in one go while others can.
Tailored fine tuning for financial and legal vocabulary can improve domain specificity and tone, yet it does not expand how much text the model can accept at once, so it does not solve the core requirement to process the entire agreement in one pass.
Knowledge cutoff date of the model is less relevant because the task is to summarize content that is provided in the prompt, and summarization relies on the ability to attend to the supplied text rather than recalling external facts.
Temperature settings to adjust creativity versus determinism influence randomness and consistency of outputs, but they do not affect the amount of text the model can ingest, so they are not the limiting factor for this workload.
When the input is very long, prioritize the model’s context window and token limits. Ask whether the entire source can fit in a single prompt, and if not consider approaches like retrieval augmentation or careful chunking.
Apex Manufacturing wants to launch an internal chatbot within 14 days that can answer employee questions using about 60,000 HR and compliance documents stored in Google Drive and Cloud Storage, and leadership requires that responses be grounded in the source files to reduce hallucinations while preferring a solution that leverages Google’s search expertise rather than a complex custom build, so which Google Cloud service should they use to quickly deliver this RAG powered assistant?
-
✓ C. Vertex AI Search
The correct option is Vertex AI Search because it provides a managed enterprise retrieval augmented generation service that can index tens of thousands of documents from Google Drive and Cloud Storage, return grounded answers with citations to the source files, and it leverages Google search expertise so you can deliver an internal chatbot within the 14 day timeline without a complex custom build.
This service offers native connectors to Google Drive and Cloud Storage so you can ingest about 60,000 HR and compliance documents quickly and keep them synchronized. It generates answers that are grounded in your indexed sources and includes citations, which directly addresses the requirement to reduce hallucinations and to tie responses back to specific files. It is designed for rapid setup through the console and APIs and it provides ranking and retrieval capabilities that reflect Google search quality so teams can move from ingestion to a working assistant rapidly.
Dialogflow CX focuses on conversational flow design and intent management, yet it does not provide enterprise document indexing, retrieval, and grounding with citations on its own. You would still need to build or integrate a separate retrieval layer to meet the requirements for grounded answers over Drive and Cloud Storage, which would slow delivery.
Vertex AI Endpoints is used to deploy custom models as scalable endpoints, but it does not include retrieval over enterprise content or connectors to Google Drive and Cloud Storage. Choosing it would require you to design and implement the entire RAG pipeline, including indexing, retrieval, and grounding, which is not suitable for a two week timeline.
Training a foundation model from scratch in Vertex AI is unnecessary and impractical for this use case since it is costly and time consuming and it would not by itself provide grounded answers or enterprise search over your documents. The requirement is best met by a managed retrieval and grounding service rather than model pretraining.
When a scenario emphasizes fast delivery, enterprise sources like Google Drive or Cloud Storage, and the need for grounded answers with citations, prefer the managed search and RAG service rather than building a custom pipeline.
A mid sized apparel brand called RiverStone Outfitters launches a generative AI assistant that answers customer questions using its internal support articles. An attacker submits a message with concealed directions that convince the assistant to ignore its guardrails and reveal sensitive configuration details. What is this attack technique called when the adversary manipulates model behavior by placing malicious instructions inside the user prompt?
-
✓ B. Prompt injection
The correct option is Prompt injection.
This attack fits the scenario because the adversary hides instructions inside the user message that persuade the assistant to ignore its safeguards and reveal sensitive information. It happens at inference time and exploits the model�s instruction following behavior rather than altering the training process or overwhelming system resources.
Model inversion is incorrect because it involves inferring or reconstructing details about the training data from model outputs, which is different from inserting malicious directions into a prompt.
Data poisoning is incorrect because it targets the training data or pipeline to influence the learned model during training rather than manipulating the model at inference through a crafted prompt.
Denial-of-service is incorrect because it focuses on disrupting availability by exhausting resources or flooding the service, not on coercing the model to disclose sensitive configuration details.
Map the attack to the phase it targets. Hidden instructions in the user message point to prompt injection. Tampered training sets point to data poisoning. Extracting secrets about training data points to model inversion. Overwhelming resources points to denial of service.
A small media analytics group at Blue Harbor Labs has built a custom data processing routine into a container image and wants to publish it as an HTTPS API that a generative AI agent on example.com can invoke as a tool. The platform must be fully managed and elastic and it must scale down to zero when idle and it should handle bursts of about 4,000 requests per minute during promotions. Which Google Cloud service should they choose to run this containerized endpoint?
-
✓ C. Cloud Run
The correct option is Cloud Run because it is a fully managed and elastic service that runs container images behind an HTTPS endpoint, scales down to zero when idle, and can rapidly scale to handle short bursts such as about 4,000 requests per minute.
It executes stateless containers and provides automatic request based autoscaling with configurable concurrency so it can absorb promotional traffic spikes and then scale back to zero when requests stop. It provisions HTTPS by default and is well suited to expose a simple tool endpoint that an external generative AI agent can call securely.
Cloud Functions is serverless but it is designed for code written as functions rather than running an arbitrary container image, so it does not fit the requirement to deploy an existing container as is.
GKE Autopilot can run containers but it is a Kubernetes platform that introduces cluster concepts and does not natively scale workloads to zero by default, which adds operational overhead and misses the strict scale to zero requirement.
Vertex AI Model Garden is for discovering and consuming foundation models and it is not intended to host a custom containerized HTTPS API, so it does not match the use case.
Map keywords to services. The combination of container image, HTTPS endpoint, and scale to zero strongly points to Cloud Run. If the prompt emphasizes a single function without a container think Cloud Functions. If it stresses Kubernetes control think GKE.
In Google Cloud, what is the most effective way to adapt a general model so it consistently produces text in a legal style and uses appropriate legal terminology?
-
✓ B. Domain fine tuning on a curated legal corpus
The correct option is Domain fine tuning on a curated legal corpus.
This approach uses Vertex AI to adapt a base model to the language patterns, structure, and vocabulary of legal writing. Training on a high quality domain dataset teaches the model to produce legally appropriate tone and terminology on its own, which leads to more consistent results across different prompts and use cases.
It is the most reliable way to achieve persistent stylistic control because the style becomes part of the model’s learned behavior. This reduces reliance on fragile prompt wording and gives you predictable outputs even as tasks and inputs vary.
Retrieval augmented generation with a private legal knowledge base focuses on bringing in the right facts from your documents, which improves grounding and accuracy, but it does not make the model consistently write in legal style. It is a complementary technique rather than a substitute for training the model to produce a particular tone.
Prompt engineering with legal style instructions can nudge the model, yet it remains situational and often brittle. It does not change the model’s parameters, so consistency degrades as prompts or contexts shift and it will not match the stability of a model that has been trained on domain examples.
Lower model temperature only reduces randomness. It does not teach legal phrasing or structure, so it can make outputs more deterministic but still off style or incorrect in terminology.
When the question asks for consistent style or tone across many prompts, prefer fine tuning on a representative domain dataset. Use RAG to improve factual grounding, and adjust temperature for variability, but do not expect either to enforce a writing style.
Cobalt Pixel Labs is developing new foundation models for multimodal search and needs on demand access to large fleets of TPUs and GPUs along with high throughput networking and durable storage that are fully operated by their cloud provider. Within the generative AI stack, which layer primarily delivers these foundational compute capabilities?
-
✓ B. Infrastructure
Infrastructure is correct because it is the layer that provides on demand access to TPUs and GPUs along with high throughput networking and durable storage that are fully operated by the cloud provider.
This layer delivers the underlying compute accelerators such as Cloud TPUs and GPUs, the interconnects needed for high bandwidth training and inference, and managed storage for durable datasets and checkpoints. It handles provisioning, scaling, and reliability so teams can focus on building and evaluating models rather than operating hardware and networking.
Models is not correct because this layer contains the pretrained or custom models themselves which consume the underlying compute rather than provide it.
Platforms is not correct because this layer offers managed tooling to build, train, tune, and serve models such as Vertex AI, and it runs on top of the underlying compute instead of being the source of the raw accelerators, networking, and storage.
Applications is not correct because this layer includes end user solutions that use the models and platform capabilities and it does not deliver foundational compute resources.
Map the described capability to the stack layer. When you see TPUs or GPUs with high throughput networking and durable storage that are operated by the provider, choose the infrastructure layer.
Novamart is piloting a voice-enabled virtual concierge for its customer support line that must capture callers’ spoken questions, infer meaning and detect sentiment, then speak a personalized reply in real time. Which combination of Google Cloud AI APIs should be used to deliver these core capabilities?
-
✓ B. Speech-to-Text API, Cloud Natural Language API, and Text-to-Speech API
The correct option is Speech-to-Text API, Cloud Natural Language API, and Text-to-Speech API.
This combination directly maps to the required pipeline for a voice concierge. Speech recognition captures callers’ questions as audio and turns them into text in real time, which enables quick handoff to language understanding. Natural language processing then extracts meaning with entities and categories and it can detect sentiment to tailor the response tone. Finally speech synthesis speaks the personalized reply back to the caller with natural voices for a smooth and responsive experience.
Dialogflow CX and Cloud Run is not sufficient here because it does not explicitly provide dedicated speech recognition and speech synthesis along with sentiment analysis. Cloud Run is a compute platform rather than an AI API, and relying only on Dialogflow for all capabilities leaves gaps for the stated speech and sentiment requirements.
Cloud Vision API and Cloud Video Intelligence API are designed for images and videos and do not address speech input, language understanding, or spoken output.
Translation API and Document AI API focus on language translation and document parsing and do not provide speech recognition, sentiment analysis, or speech synthesis.
Map each requirement to the matching API. Speech input points to Speech-to-Text and sentiment or entity extraction points to Cloud Natural Language and voice output points to Text-to-Speech. Watch for the phrase real time and prefer APIs that support streaming.
All questions come from Cameron McKenzie’s Generative AI Practice Exams Udemy course and certificationexams.pro
Skylark Journeys, an online travel agency, introduced a generative AI virtual assistant to triage customer chats, and the operations team is comparing the average handle time for cases that escalate to human agents during the 45 days after launch with the average handle time in the 90 days before launch to assess impact. This measurement reflects which kind of outcome?
-
✓ C. Operational efficiency and cost savings
The correct option is Operational efficiency and cost savings.
Comparing average handle time before and after launching a triage assistant evaluates how efficiently human agents process escalated cases and whether the intervention reduces time and cost to serve. Average handle time is a classic operations metric that reflects process efficiency and is commonly used to assess contact center performance and cost outcomes when deploying virtual agents or agent assist capabilities.
Customer satisfaction indicators is not correct because the measurement is not directly capturing customer sentiment or satisfaction. Metrics like CSAT, NPS, or sentiment analysis would better indicate satisfaction, while handle time is an operational process measure.
Direct revenue generation is not correct because the comparison does not track sales, conversions, average order value, or revenue attribution. It only measures time spent handling escalated cases.
Brand perception lift is not correct because the metric does not assess brand awareness or perception changes. Those outcomes would rely on surveys, brand tracking studies, or sentiment at scale rather than handle time.
Map the metric to the outcome category. If you see time to resolve, average handle time, deflection, or agent productivity, think operational efficiency. If the metric is conversions or sales, think revenue. If it is CSAT, NPS, or sentiment, think customer satisfaction. If it is awareness or favorability, think brand.
A content operations manager at scrumtuous.com is weighing two ways to improve generative writing workflows. One path focuses on teaching staff to craft clearer and more targeted prompts and the other involves adopting a tool that automatically adapts the underlying model to match their editorial voice and tasks. Which statement best describes these two approaches?
-
✓ B. The first is prompt engineering and the second is prompt tuning and prompt tuning typically reduces ongoing enablement effort while requiring greater upfront investment
The correct option is The first is prompt engineering and the second is prompt tuning and prompt tuning typically reduces ongoing enablement effort while requiring greater upfront investment.
Teaching staff to write clearer and more targeted prompts is prompt engineering. This improves outputs by guiding the model with better instructions and examples while leaving model weights unchanged. Adopting a tool that adapts the underlying model to the organization’s voice and tasks aligns with prompt tuning on Vertex AI. This creates a tuned variant that teams can reuse across workflows which lowers the ongoing enablement burden. However it requires upfront data collection, configuration, evaluation, and a tuning run which is why the initial investment is higher.
The first is few shot prompting and the second is full model retraining on Vertex AI is incorrect because few shot prompting is only one pattern within prompt engineering rather than the full scope of the first approach. It is also incorrect to describe the second approach as full model retraining because Vertex AI uses lightweight tuning methods such as prompt tuning or adapters for foundation models rather than retraining the entire base model.
Both are prompt engineering and they require the same level of investment is incorrect because the second approach involves tuning a model variant which demands upfront effort for data preparation and evaluation while prompt engineering alone typically has lower initial cost.
The first is prompt tuning and the second is prompt engineering and they demand the same technical depth is incorrect because the approaches are reversed and the effort is not the same. Prompt engineering emphasizes user guidance and iteration while tuning introduces additional setup and evaluation that increase the initial technical depth.
Translate scenario clues into method names. If you see people learning to write better inputs that indicates prompt engineering. If you see the model being adapted to a specific voice or task with a reusable variant that indicates prompt tuning. If the text mentions retraining the whole model that points to full fine tuning which is rarely needed for foundation models.
A regional healthcare network wants its internal help desk to quickly surface answers from more than 25,000 clinical guidelines, internal FAQs, and support transcripts. They require an AI driven experience that understands natural language intent, retrieves precise results across private repositories, and improves relevance over time using click and rating feedback. Which Google Cloud service best meets these needs?
-
✓ D. Vertex AI Search
Vertex AI Search is the correct choice because it delivers enterprise-grade natural language search across private repositories, supports connectors to many data sources, and learns from user interactions such as clicks and ratings to continually improve result relevance.
This service is designed for enterprise search use cases where content is spread across documents, FAQs, and transcripts. It understands user intent with semantic retrieval, enforces access control over private data, and offers out-of-the-box ranking that can be tuned using implicit and explicit feedback so results get better over time. It also integrates with retrieval augmented generation patterns to power answer synthesis when needed.
Vertex AI Vector Search is a managed vector database for similarity search and indexing which is powerful infrastructure but it does not by itself provide an end-to-end enterprise search experience with connectors, relevance tuning from click and rating signals, or turnkey ranking. You would need to build significant application logic on top to match the requirements.
Google Search targets public web content and is not intended for secure retrieval over private enterprise repositories, nor does it provide enterprise-specific connectors or relevance tuning based on your organization’s feedback data.
Vertex AI Agent Builder is an umbrella product for building AI agents and experiences. The specific capability that satisfies enterprise search across private data with feedback-driven relevance is the dedicated search product, therefore the more precise choice for this scenario is the search service itself.
Map the requirement to retrieve across private data with improving relevance to an enterprise search product. Look for clues like connectors, access control, and relevance feedback which point to a managed search solution rather than a vector index or public web search.
In a generative workflow, which technique uses the output of one prompt as context for the next prompt?
-
✓ B. Prompt chaining
The correct option is Prompt chaining.
This technique takes the generated output from one step and feeds it into the next step as additional context. It enables multi step workflows such as first extracting key facts then using those facts to draft a summary or first generating an outline then expanding it into full content. It helps structure complex tasks into smaller prompts that build on one another.
Zero-shot prompting does not pass outputs between steps and instead asks the model to perform the task with a single instruction and no examples or prior intermediate results.
Retrieval-augmented generation focuses on grounding a prompt with retrieved documents from an external knowledge source. It enriches a single prompt with retrieved context rather than chaining model outputs from one prompt to the next.
Spot phrases like use the output of one prompt as input to the next to identify chaining. If the question highlights no examples it likely points to zero shot. If it highlights retrieved documents or grounding then it is RAG.
A national e commerce retailer wants to automate customer support for complicated order and delivery requests that must pull data from three internal systems, apply company policies, and reply in a consistent brand voice. The team tried a generic chatbot and found it could not orchestrate workflows or connect to tools as required. What approach on Google Cloud should they adopt to meet these needs?
-
✓ C. Vertex AI Agent Builder
The correct option is Vertex AI Agent Builder.
This service is designed to build production agents that can call tools and APIs, orchestrate multi step workflows, and ground responses in enterprise data. It can connect to multiple internal systems through extensions and connectors, apply company policies with controlled system instructions, and maintain a consistent brand voice through prompt design and safety controls. It also supports retrieval and grounding so the agent can fetch order and delivery details before taking actions.
Vertex AI Search focuses on enterprise retrieval and grounding, which helps answer questions from internal data, but it does not by itself provide end to end workflow orchestration or robust tool execution needed to call multiple systems and apply complex policies.
Gemini Advanced is a consumer offering for individual use rather than a managed platform for building enterprise agents. It does not provide the enterprise grade orchestration, connectors, and deployment capabilities required here.
Fine tune a foundation model with Model Garden would adjust model behavior or style but it does not add the ability to connect to internal systems, execute tools, or manage workflows. It would also risk stale knowledge for order and delivery data that changes frequently.
When a question calls for tool integration, multi step workflow orchestration, and a controlled brand voice, look for the option that builds end to end agents. Favor agent platforms over pure retrieval or model fine tuning alone.
An insurance carrier plans to use generative AI to draft customer policy summaries. State regulators require that each document follows a fixed layout with three sections, includes the approved legal disclaimer verbatim, and uses a glossary managed by the compliance team without deviation. When selecting a model, which factor is most critical to meet these strict regulatory obligations?
-
✓ B. Strong customization features that let the model learn and apply mandated templates, disclaimers and vocabulary
The correct choice is Strong customization features that let the model learn and apply mandated templates, disclaimers and vocabulary. This matters most because the solution must reliably enforce a fixed three section layout, include the approved legal disclaimer word for word, and apply a controlled glossary without any deviation.
Customization lets you embed templates and constraints into the model so it reproduces the exact structure and wording every time. You can tune behavior, use strict instructions, and validate outputs so the approved disclaimer remains verbatim and the glossary terms are consistently applied. This directly aligns the model with regulatory obligations rather than hoping a general purpose setting will keep content compliant.
The model’s ability to produce highly imaginative and varied text is not suitable because creativity increases variation and the chance of drifting from the mandated layout, required disclaimer, or controlled vocabulary.
The lowest cost per generated document does not ensure compliance and prioritizing price over control can introduce unacceptable regulatory risk.
Native support for structured JSON style output can help with machine readable fields, yet it does not guarantee the natural language will match an exact legal template or include the disclaimer verbatim, which are the core requirements here.
When requirements demand a fixed layout and verbatim language, prioritize capabilities that enforce constraints through tuning, instructions, and validation. Do not confuse structured formats like JSON with guaranteed adherence to legal templates and wording.
Oriole Outfitters is a multinational retailer that plans to roll out generative AI across nine business units. The leadership wants a unified governance approach for AI security and risk management, and they must ensure their AI programs follow industry standards and comply with regulations in every region. Which Google Cloud offering or framework is purpose built to guide organizations on AI security governance and risk control across the lifecycle?
-
✓ D. Google’s Secure AI Framework SAIF
The correct answer is Google’s Secure AI Framework SAIF. It is the only option that is purpose built to guide organizations on AI security governance and risk control across the lifecycle and it supports alignment with industry standards and regulatory requirements across regions.
This framework provides actionable security controls for data, models and applications through the entire AI lifecycle from design and development to deployment and ongoing operations. It gives organizations a unified governance approach so multiple business units can follow consistent policies while meeting regional compliance needs.
Assured Workloads focuses on enforcing data residency and compliance constraints for regulated workloads rather than providing an AI specific governance and risk framework across the model lifecycle.
Vertex AI Model Garden is a catalog to discover and use models and tools and it does not deliver organization wide governance guidance or risk management controls.
Google Cloud AI optimized infrastructure TPUs and GPUs refers to compute hardware for training and inference and it is not a governance or risk management framework.
When a question stresses governance, the AI lifecycle, and regulatory alignment, prefer a framework level solution rather than a product that runs workloads or hosts models.
BrightHire Labs launches a generative assistant that ranks resumes for hiring teams, and after two months an internal fairness check finds that it often scores candidates from certain communities as “lower potential” even when their education and experience are equivalent. The model was trained largely on 12 years of historical hiring decisions from a field with persistent underrepresentation of those communities. What is the most plausible root cause of this behavior?
-
✓ B. Skewed training data that encodes past hiring bias and teaches unfair patterns
The correct option is Skewed training data that encodes past hiring bias and teaches unfair patterns. The model was trained on many years of historical hiring decisions from a field with persistent underrepresentation, which makes it very likely that it learned to reproduce those biased patterns even when candidates have equivalent qualifications.
Historical decisions often encode human and systemic biases in the labels. Training on those labels teaches the model to associate proxies for protected characteristics with lower potential. The fairness finding that candidates from certain communities are scored lower despite equivalent education and experience strongly indicates label and representation bias in the training set rather than an issue that emerged after deployment. This type of bias would be present from day one and will persist unless data, labels, or training objectives are corrected.
Production data drift that changed resume patterns since deployment is unlikely because the disparity appears despite equivalent qualifications and aligns with long standing historical underrepresentation. That points to biased training signals rather than a sudden change in incoming resumes after launch.
Lack of model interpretability which makes it hard to see why predictions were made is not a root cause. Limited explainability can hinder diagnosis and remediation, yet the unfair scoring stems from what the model learned from biased data, not from whether we can explain the predictions.
Vertex AI Model Monitoring is a monitoring capability and not a cause of biased outcomes. It can help detect drift or skew and surface quality issues, but it does not create the unfair pattern described.
When a model replicates disparities seen in historical decisions, suspect biased training data and labels as the root cause. Options that name monitoring or tooling usually describe detection or visibility features rather than causal factors.
A regional telecom provider named CoastalConnect plans to roll out a generative AI assistant for its help desk analysts. Over three weeks the leadership schedules hands-on enablement sessions, explains the expected benefits and workflow impacts, and opens a dedicated channel for ongoing feedback and questions from staff. Which organizational area do these actions most directly target during gen AI integration?
-
✓ C. Managing organizational change and user adoption
The correct option is Managing organizational change and user adoption.
Enablement sessions, communication about expected benefits and workflow impacts, and an open feedback channel are classic change management activities. They prepare people for new ways of working, build confidence and buy in, and create a feedback loop that accelerates Managing organizational change and user adoption.
By investing in training and two way communication, leadership is targeting the people and process side of adoption. This drives sustained usage and helps analysts integrate the assistant into daily work in a way that improves outcomes and reduces friction.
The option Responsible AI governance and model monitoring is not the focus here because that area centers on policies, fairness and safety reviews, human oversight, and operational monitoring of models in production, which are not described in the scenario.
The option Data security and privacy compliance is not correct because the actions do not address data handling requirements, access controls, encryption, consent, or regulatory obligations.
The option Algorithm selection and model fine-tuning is incorrect because the scenario does not involve choosing model architectures, tuning hyperparameters, or adapting models to domain data. It describes preparing users to adopt the solution.
When a scenario highlights training, communication of benefits, new workflow expectations, and ongoing feedback, link it to change management and user adoption rather than technical tasks.
What is the primary platform-level risk of using a no-code AI builder that offers only one general-purpose model and fixed UI widgets?
-
✓ C. Limited capabilities reduce evolution and differentiation
The correct option is Limited capabilities reduce evolution and differentiation.
A no-code AI builder that forces a single general model and fixed UI widgets imposes tight constraints on customization and integration. Over time these constraints hinder your ability to adopt new models, experiment with different architectures, or add unique product capabilities. This reduces differentiation and slows evolution, which is exactly the kind of risk that emerges from the platform layer rather than from infrastructure or staffing.
Unpredictable infrastructure costs is not the primary platform-layer risk in this scenario because no-code platforms generally abstract infrastructure and often make cost more predictable. The main concern is loss of flexibility and innovation velocity, not cost volatility.
Limited access to Vertex AI features is not the core issue described. The risk comes from the builder’s own restrictions on models and UI, which limit your ability to evolve and differentiate, regardless of whether specific Vertex AI features are exposed.
Need to hire more infrastructure engineers runs counter to the premise of using a no-code platform. These tools usually reduce infrastructure management needs, so the main risk is not increased staffing but diminished flexibility and differentiation.
When you see cues like single general model or fixed UI widgets, align the risk with the constrained layer. If the platform restricts what you can build, the likely answer centers on lost flexibility and reduced differentiation rather than infrastructure or staffing concerns.
All questions come from Cameron McKenzie’s Generative AI Practice Exams Udemy course and certificationexams.pro
A marketing coordinator at the online retailer mcnz.com spends several hours each week turning transcripts from Google Meet that are about 90 minutes long into succinct summaries and composing follow-up messages in Gmail. They want an out of the box generative AI capability that works natively inside their current Google Workspace apps so they can automate these tasks without switching tools. Which Google Cloud offering should they use?
-
✓ B. Gemini for Google Workspace
The correct option is Gemini for Google Workspace.
This service embeds generative AI directly into Gmail and Google Meet so the coordinator can summarize long meeting content and draft follow-up emails without leaving their existing apps. It is designed to work natively inside Workspace which satisfies the request for an out of the box capability with no context switching.
In Gmail it provides help me write to compose or refine follow-up messages based on prompts or context. In Meet and across Workspace it can generate concise summaries and notes from long discussions and transcripts which fits the 90 minute meeting scenario.
NotebookLM is a separate note taking and research tool and it is not a native Workspace capability inside Gmail and Meet. Using it would require working outside the coordinator’s existing app workflow.
Vertex AI Search is for enterprise search and retrieval augmented experiences across data sources and it is not meant to add generative features directly inside Gmail or Meet for writing emails and summarizing meetings.
The standalone Gemini app is a separate conversational interface and using it would require switching tools rather than getting in-product assistance inside Gmail and Google Meet.
When you see a requirement for features working inside Gmail, Docs or Meet, map it to Gemini for Google Workspace. If the scenario mentions building custom apps or enterprise search then consider Vertex AI offerings instead.
A retail analytics firm named BlueMarket Labs is building an assistant on Vertex AI that uses one model to condense long vendor documents, another to categorize customer comments, and another to brainstorm campaign ideas. The lead architect wants the solution to be responsive and economical. Which approach would not be a valid way to optimize both cost and performance on Vertex AI?
-
✓ B. Route all traffic only to the region with the lowest per unit compute price even if it increases user latency
The option that would not be a valid way to optimize both cost and performance is Route all traffic only to the region with the lowest per unit compute price even if it increases user latency.
This approach prioritizes unit price over user experience and can lead to noticeably slower responses as latency grows with distance and network paths. It can also add cross region egress charges and complicate data residency which can offset any savings and reduce reliability. Vertex AI guidance is to select regions that align with your users and data in order to keep latency low while balancing price, so forcing a single distant region does not achieve the goal of being both responsive and economical.
Cache answers for frequent or identical prompts to avoid repeat calls is a sound optimization because reusing responses for repeated inputs reduces billed tokens and can return results faster. This keeps latency low for common queries and saves money without changing model quality.
Use a lightweight model such as Gemini 1.5 Flash for simple classification and a stronger model such as Gemini 2.0 Pro for complex reasoning is a best practice since routing easy tasks to a low cost and low latency model and reserving a more capable model for hard problems balances cost and performance effectively.
Constrain the max output tokens in requests so verbose replies do not inflate cost is valid because shorter outputs are cheaper and usually return faster. Setting an appropriate cap prevents unnecessary verbosity while preserving the needed information.
When a question asks what would not optimize both goals, look for the choice that sacrifices latency or reliability to chase lower unit cost. Favor balanced strategies that combine smart region selection, model right sizing, and prudent token limits.
An applied research lab at example.com is building generative AI prototypes and wants the freedom to combine open-source frameworks and community models with managed cloud capabilities. They want to reduce vendor lock-in and benefit from rapid innovation across the wider AI ecosystem. Which element of Google Cloud’s generative AI strategy would best align with these goals?
-
✓ C. Commitment to an open ecosystem with support for open-source models, tools, and interoperability
The correct option is Commitment to an open ecosystem with support for open-source models, tools, and interoperability. This choice fits a research lab that wants to mix community models and open frameworks with managed services while avoiding vendor lock in and keeping up with rapid innovation.
With this open approach, teams can bring their preferred frameworks and libraries and run or fine tune community and third party models while still using managed capabilities for training, tuning, orchestration, and deployment. This lets them adopt new models and tools as they emerge and it preserves portability across environments.
Interoperability also means containerized workloads, standard interfaces, and broad integration points so researchers can iterate quickly and switch components without rearchitecting. That combination of flexibility and managed reliability is exactly what the scenario describes.
Enterprise security and compliance controls are essential for protecting data and meeting regulatory needs, yet they do not address the requirement to combine community models with open frameworks or to minimize lock in. They focus on governance rather than enabling broad interoperability.
Prebuilt industry accelerators for AI provide packaged solutions for specific vertical use cases, which speeds delivery for common patterns but limits the freedom to explore diverse open-source frameworks and rapidly evolving community models.
Cloud TPU supplies powerful hardware for training and inference, however it is an infrastructure option rather than a strategy for openness. It does not by itself ensure portability across tools and models or reduce dependence on proprietary stacks.
Look for cues like vendor lock in, open source, and interoperability. When the scenario emphasizes combining community models with managed services, choose the option that highlights an open ecosystem rather than security features, packaged solutions, or specific hardware.
A creative agency plans to use a foundation model in Vertex AI to produce 20 second promotional videos from short text prompts for social networks. When evaluating candidate models, what is the single most important first criterion to verify for this task?
-
✓ C. Support for the required modalities that is text to video
The single most important first criterion to verify is Support for the required modalities that is text to video.
<pYou must first ensure the candidate model can actually generate video from text prompts. Vertex AI offers different foundation models that target different modalities such as text, image, audio, and video. If a model does not support text to video generation then it cannot meet the requirement regardless of any other performance or customization features. Once this capability is confirmed you can evaluate quality, control features, and operational characteristics.
Inference latency for generating video clips is important for user experience and scaling, yet it only matters after you confirm the model can perform the text to video task at all.
Availability of customization or fine-tuning features can help align outputs with brand style or domain needs, but it is secondary to verifying that the base model supports text to video generation in the first place.
Maximum input context window size for prompts is mainly a constraint for large language models doing text processing. The scenario uses short prompts to create video, so context window size is not the primary gating factor compared to verifying modality support.
First check the modality and task fit. Only after confirming the model can perform the required input and output should you compare quality, latency, and customization options.
A regional convenience store cooperative plans to deploy generative AI to produce localized marketing copy for more than 650 independently owned locations. Each message must reflect nearby events, regional phrasing, and the precise product availability at each store. The leadership team needs an approach that can expand efficiently and be quickly adapted for each owner’s needs. Given these requirements, which consideration should drive the selection of the generative AI solution?
-
✓ C. Ensuring the approach can scale broadly and be easily tailored for each locale
The correct option is Ensuring the approach can scale broadly and be easily tailored for each locale.
The cooperative must generate copy for hundreds of independent stores while reflecting local events, regional phrasing, and live product availability. The success factor is an approach that supports large scale deployment and fine grained customization for each store. In Google Cloud this typically means using reusable prompt templates, structured inputs, and retrieval augmented generation that grounds responses with store specific data so each owner can adapt quickly while the platform scales efficiently.
This focus also enables consistent governance and quality controls across the network while allowing each owner to adjust content to their market. It reduces operational overhead by centralizing the core system and pushing only lightweight configuration per location.
Maximizing the size of the base model above all other factors is not the driver for this use case because larger models do not automatically improve local accuracy and they can increase cost and latency. The ability to ground on store data and to template prompts for each locale matters more than raw model size.
Prioritizing advanced mathematical and logical problem solving does not align with marketing copy generation. The requirement is targeted localization and branding rather than complex reasoning on math or logic tasks.
Minimizing per-token cost as the primary decision criterion is short sighted for this scenario. Cost control is important but should follow the ability to scale and tailor outputs since choosing only by price can undermine localization quality and controllability.
Identify the core business driver first. When a scenario stresses many locations and unique local needs, favor options that emphasize scalability and tailoring rather than model size or raw cost.

