Question.41 You are planning to migrate your current on-premises Apache Hadoop deployment to the cloud. You need to ensure that the deployment is as fault-tolerant and cost-effective as possible for long-running batch jobs. You want to use a managed service. What should you do?
(A) Deploy a Cloud Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://
(B) Deploy a Cloud Dataproc cluster. Use an SSD persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://
(C) Install Hadoop and Spark on a 10-node Compute Engine instance group with standard instances. Install the Cloud Storage connector, and store the data in Cloud Storage. Change references in scripts from hdfs:// to gs://
(D) Install Hadoop and Spark on a 10-node Compute Engine instance group with preemptible instances. Store data in HDFS. Change references in scripts from hdfs:// to gs://
Answer is (A) Deploy a Cloud Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://
Cloud Dataproc is the managed service for running Hadoop in the cloud, and standard (HDD) persistent disks are the more cost-effective choice; SSDs add cost without benefiting long-running batch jobs. Storing the data in Cloud Storage (gs://) instead of HDFS keeps it durable even when preemptible workers are reclaimed, which makes using 50% preemptible workers a safe way to cut cost.
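As a minimal sketch of the script change the answer describes, a PySpark batch job only needs its input and output URIs switched from hdfs:// to gs:// once the data lives in Cloud Storage (the bucket and paths below are hypothetical):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch-wordcount").getOrCreate()

# Before the migration the job read from the cluster-local HDFS:
#   lines = spark.read.text("hdfs:///data/logs/2023/")
# After moving the data to Cloud Storage, only the URI scheme changes;
# the Cloud Storage connector is preinstalled on Dataproc clusters.
lines = spark.read.text("gs://example-bucket/data/logs/2023/")  # hypothetical bucket

counts = (lines.rdd
          .flatMap(lambda row: row.value.split())
          .map(lambda word: (word, 1))
          .reduceByKey(lambda a, b: a + b))

counts.toDF(["word", "count"]).write.parquet("gs://example-bucket/output/wordcount/")
```

Because the data no longer lives on the cluster's own disks, losing a preemptible worker only costs some recomputation, not data.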
Question.42 You need to choose a database for a new project that has the following requirements:
– Fully managed
– Able to automatically scale up
– Transactionally consistent
– Able to scale up to 6 TB
– Able to be queried using SQL
Which database do you choose?
(A) Cloud SQL
(B) Cloud Bigtable
(C) Cloud Spanner
(D) Cloud Datastore
Answer is (A) Cloud SQL
The requirement is to scale up (vertically), which Cloud SQL supports; horizontal scaling (scaling out) is not possible in Cloud SQL, but it is not required here. Cloud SQL is fully managed, transactionally consistent, queried with standard SQL, and comfortably handles 6 TB of data.
Automatic storage increase (from the Cloud SQL documentation): "If you enable this setting, Cloud SQL checks your available storage every 30 seconds. If the available storage falls below a threshold size, Cloud SQL automatically adds additional storage capacity. If the available storage repeatedly falls below the threshold size, Cloud SQL continues to add storage until it reaches the maximum of 30 TB."
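If you want to enable this setting programmatically, a hedged sketch using the Cloud SQL Admin API through google-api-python-client is below. The project and instance names are hypothetical, and the settings field names (storageAutoResize, storageAutoResizeLimit) are assumptions based on that API's Settings resource:

```python
from googleapiclient import discovery

# Build a client for the Cloud SQL Admin API using application-default credentials.
sqladmin = discovery.build("sqladmin", "v1beta4")

# Patch only the storage settings: turn on automatic storage increase and cap
# growth at 500 GB (assumed field names; "0" would mean no limit, up to 30 TB).
body = {
    "settings": {
        "storageAutoResize": True,
        "storageAutoResizeLimit": "500",
    }
}

response = sqladmin.instances().patch(
    project="my-project",      # hypothetical project ID
    instance="my-instance",    # hypothetical instance name
    body=body,
).execute()
print(response["operationType"], response["status"])
```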
Question.43 What are two of the benefits of using denormalized data structures in BigQuery?
(A) Reduces the amount of data processed, reduces the amount of storage required
(B) Increases query speed, makes queries simpler
(C) Reduces the amount of storage required, increases query speed
(D) Reduces the amount of data processed, increases query speed
Answer is (B) Increases query speed, makes queries simpler
Cannot be A or C because:
“Denormalized schemas aren’t storage-optimal, but BigQuery’s low cost of storage addresses concerns about storage inefficiency.”
Cannot be D because the amount of data processed is the same.
As for why queries become "simpler", the documentation doesn't state it directly, but it is hinted at: "Expressing records by using nested and repeated fields simplifies data load using JSON or Avro files" and "Expressing records using nested and repeated structures can provide a more natural representation of the underlying data."
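To make the nested/repeated idea concrete, here is a small sketch using the google-cloud-bigquery Python client to define a denormalized table in which each order row embeds its line items as a repeated RECORD. The project, dataset, table, and field names are made up for illustration:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Denormalized schema: line items are nested inside each order row instead of
# living in a separate table that would have to be joined at query time.
schema = [
    bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("customer", "STRING"),
    bigquery.SchemaField(
        "line_items", "RECORD", mode="REPEATED",
        fields=[
            bigquery.SchemaField("sku", "STRING"),
            bigquery.SchemaField("quantity", "INTEGER"),
            bigquery.SchemaField("price", "NUMERIC"),
        ],
    ),
]

table = bigquery.Table("my-project.sales.orders", schema=schema)  # hypothetical IDs
client.create_table(table)

# Queries read the nested data with UNNEST instead of a join:
query = """
    SELECT order_id, item.sku, item.quantity
    FROM `my-project.sales.orders`, UNNEST(line_items) AS item
"""
for row in client.query(query).result():
    print(row.order_id, row.sku, row.quantity)
```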
Reference:
https://cloud.google.com/solutions/bigquery-data-warehouse
Question.44 Which of the following are examples of hyperparameters? (Select 2 answers.)
(A) Number of hidden layers
(B) Number of nodes in each hidden layer
(C) Biases
(D) Weights
Answers are:
(A) Number of hidden layers
(B) Number of nodes in each hidden layer
Hyperparameters are configuration variables set before training begins and are not changed by the training process itself, whereas weights and biases are parameters the model learns during training.
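A small sketch of that distinction, using Keras as one possible framework: the number of hidden layers and the number of nodes per layer are fixed choices made before training, while the weights and biases inside each Dense layer are what training adjusts. All values below are illustrative.

```python
import tensorflow as tf

# Hyperparameters: chosen before training and tuned from outside the model.
num_hidden_layers = 3      # answer (A)
nodes_per_layer = 64       # answer (B)
learning_rate = 0.001

model = tf.keras.Sequential([tf.keras.Input(shape=(20,))])
for _ in range(num_hidden_layers):
    model.add(tf.keras.layers.Dense(nodes_per_layer, activation="relu"))
model.add(tf.keras.layers.Dense(1))

# Parameters: the weights and biases inside each Dense layer (options C and D)
# are initialized here and then learned by the optimizer during model.fit().
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mse")
print(model.count_params(), "trainable weights and biases")
```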
Reference:
https://cloud.google.com/ai-platform/training/docs/hyperparameter-tuning-overview
Question.45 Which of the following are feature engineering techniques? (Select 2 answers)
(A) Hidden feature layers
(B) Feature prioritization
(C) Crossed feature columns
(D) Bucketization of a continuous feature
Answers are:
(C) Crossed feature columns
(D) Bucketization of a continuous feature
Selecting and crafting the right set of feature columns is key to learning an effective model. Bucketization is a process of dividing the entire range of a continuous feature into a set of consecutive bins/buckets, and then converting the original numerical feature into a bucket ID (as a categorical feature) depending on which bucket that value falls into.
Using each base feature column separately may not be enough to explain the data. To learn the differences between different feature combinations, we can add crossed feature columns to the model.
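As an illustrative sketch of both techniques using TensorFlow's (now legacy) feature-column API, with hypothetical "age" and "education" input features:

```python
import tensorflow as tf

# Bucketization: turn the continuous "age" feature into a categorical feature
# by assigning each value to one of several consecutive age ranges.
age = tf.feature_column.numeric_column("age")
age_buckets = tf.feature_column.bucketized_column(
    age, boundaries=[18, 25, 35, 45, 55, 65])

# Crossed feature column: combine two base features so the model can learn
# patterns specific to each (age bucket, education) combination.
education = tf.feature_column.categorical_column_with_vocabulary_list(
    "education", ["HS", "Bachelors", "Masters", "Doctorate"])
age_x_education = tf.feature_column.crossed_column(
    [age_buckets, education], hash_bucket_size=1000)

# The bucketized column can be fed to a model directly; the crossed column is
# wrapped in an indicator column so it can be used as a model input as well.
feature_columns = [
    age_buckets,
    tf.feature_column.indicator_column(age_x_education),
]
```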
Reference:
https://cloud.google.com/solutions/machine-learning/ml-on-structured-data-model-2