Question.21 You are designing storage for two relational tables that are part of a 10-TB database on Google Cloud. You want to support transactions that scale horizontally. You also want to optimize data for range queries on non-key columns. What should you do? (A) Use Cloud SQL for storage. Add secondary indexes to support query patterns. (B) Use Cloud SQL for storage. Use Cloud Dataflow to transform data to support query patterns. (C) Use Cloud Spanner for storage. Add secondary indexes to support query patterns. (D) Use Cloud Spanner for storage. Use Cloud Dataflow to transform data to support query patterns.
Answer is (C) Use Cloud Spanner for storage. Add secondary indexes to support query patterns.
Cloud Spanner scales transactional relational tables horizontally, and secondary indexes optimize range queries on non-key columns. Cloud SQL does not scale horizontally for writes, which rules out options (A) and (B).
Reference:
https://cloud.google.com/spanner/docs/secondary-indexes
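For illustration, here is a minimal sketch of creating and querying a secondary index with the Python client for Cloud Spanner; the project, instance, database, table, and column names are hypothetical, not taken from the question.

```python
# A minimal sketch, assuming a hypothetical Trades table with a
# non-key Price column; all names are placeholders.
from google.cloud import spanner

client = spanner.Client(project="my-project")
database = client.instance("my-instance").database("my-database")

# Create a secondary index on the non-key column used in range queries.
op = database.update_ddl(["CREATE INDEX TradesByPrice ON Trades(Price)"])
op.result()  # block until the schema change completes

# Range query on the non-key column; FORCE_INDEX hints the new index.
with database.snapshot() as snapshot:
    results = snapshot.execute_sql(
        "SELECT TradeId, Price FROM Trades@{FORCE_INDEX=TradesByPrice} "
        "WHERE Price BETWEEN @lo AND @hi",
        params={"lo": 10.0, "hi": 20.0},
        param_types={
            "lo": spanner.param_types.FLOAT64,
            "hi": spanner.param_types.FLOAT64,
        },
    )
    for row in results:
        print(row)
```

Without the index, Spanner would have to scan the whole table to evaluate the predicate; the secondary index turns it into an index range read.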
Question.22 Your financial services company is moving to cloud technology and wants to store 50 TB of financial time-series data in the cloud. This data is updated frequently and new data will be streaming in all the time. Your company also wants to move their existing Apache Hadoop jobs to the cloud to get insights into this data. Which product should they use to store the data? (A) Cloud Bigtable (B) Google BigQuery (C) Google Cloud Storage (D) Google Cloud Datastore
Answer is (A) Cloud Bigtable
Bigtable is GCP’s managed wide-column database. It is also a good option for migrating on-premises Hadoop HBase databases because Bigtable has an HBase-compatible interface.
Cloud Bigtable is a wide-column NoSQL database for high-volume workloads that require low, single-digit-millisecond latency; it is commonly used for IoT, time-series, finance, and similar applications.
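As a sketch of why Bigtable suits time-series ingestion, here is a hedged example of writing one quote with the google-cloud-bigtable Python client; the instance, table, column family, and row-key scheme are all hypothetical, and the "quotes" column family is assumed to exist.

```python
# A minimal sketch; instance, table, and column-family names are
# placeholders, not from the question.
import datetime

from google.cloud import bigtable

client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("market-data")

# Time-series row-key design: entity id plus a timestamp suffix, so
# points for one series sort contiguously and range scans stay cheap.
now = datetime.datetime.now(datetime.timezone.utc)
row_key = f"EURUSD#{now:%Y%m%d%H%M%S}".encode("utf-8")

row = table.direct_row(row_key)
row.set_cell("quotes", b"bid", b"1.0842", timestamp=now)
row.set_cell("quotes", b"ask", b"1.0844", timestamp=now)
row.commit()
```

Because Bigtable exposes an HBase-compatible API, existing Hadoop/HBase jobs can be pointed at the same table with little change.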
Question.23 You are responsible for writing your company’s ETL pipelines to run on an Apache Hadoop cluster. The pipeline will require some checkpointing and splitting pipelines. Which method should you use to write the pipelines? (A) PigLatin using Pig (B) HiveQL using Hive (C) Java using MapReduce (D) Python using MapReduce
Answer is (A) PigLatin using Pig
Pig Latin is a scripting language built for expressing data pipelines: SPLIT branches a pipeline, and storing intermediate relations provides checkpoints (see the sketch below). HiveQL is oriented toward querying rather than pipelines, and writing raw MapReduce would require far more code.
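A minimal Pig Latin sketch of a split pipeline; the paths, schema, and split condition are hypothetical.

```pig
-- Hypothetical input; schema and paths are placeholders.
raw = LOAD 'gs://my-bucket/input/*.csv'
      USING PigStorage(',') AS (id:int, amount:double);

-- SPLIT fans one relation out into parallel branch pipelines.
SPLIT raw INTO small IF amount < 100.0, large IF amount >= 100.0;

-- Each STORE materializes its branch, which also serves as a
-- checkpoint that downstream jobs can restart from.
STORE small INTO 'gs://my-bucket/out/small' USING PigStorage(',');
STORE large INTO 'gs://my-bucket/out/large' USING PigStorage(',');
```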
Question.24 You need to migrate a 2 TB relational database to Google Cloud Platform. You do not have the resources to significantly refactor the application that uses this database, and cost to operate is of primary concern. Which service do you select for storing and serving your data? (A) Cloud Spanner (B) Cloud Bigtable (C) Cloud Firestore (D) Cloud SQL
Answer is (D) Cloud SQL
Cloud SQL supports MySQL 5.6 or 5.7 and provides up to 624 GB of RAM and 30 TB of data storage, with the option to automatically increase the storage size as needed. A 2 TB database fits comfortably within those limits, and a lift-and-shift to Cloud SQL needs no significant application refactoring, whereas Cloud Spanner would cost considerably more to operate.
Question.25 You are designing an Apache Beam pipeline to enrich data from Cloud Pub/Sub with static reference data from BigQuery. The reference data is small enough to fit in memory on a single worker. The pipeline should write enriched results to BigQuery for analysis. Which job type and transforms should this pipeline use? (A) Batch job, PubSubIO, side-inputs (B) Streaming job, PubSubIO, JdbcIO, side-outputs (C) Streaming job, PubSubIO, BigQueryIO, side-inputs (D) Streaming job, PubSubIO, BigQueryIO, side-outputs
Answer is (C) Streaming job, PubSubIO, BigQueryIO, side-inputs
A streaming job is required because data arrives continuously from Cloud Pub/Sub. PubSubIO reads the stream, BigQueryIO writes the enriched results back to BigQuery, and side inputs are the standard way to hand each worker an in-memory copy of small reference data (see the sketch below the reference).
Reference:
https://cloud.google.com/architecture/e-commerce/patterns/slow-updating-side-inputs
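For illustration, a minimal Apache Beam (Python) sketch of this pattern; the project, topic, table names, and message format are hypothetical. The reference data is read once as a bounded side input here; the linked article shows how to refresh it periodically.

```python
# A minimal sketch; project, topic, and table names are placeholders.
# Assumes Pub/Sub messages are JSON objects with an "sku" field.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def enrich(message, ref):
    """Decode one Pub/Sub message and attach the looked-up category."""
    record = json.loads(message.decode("utf-8"))
    record["category"] = ref.get(record["sku"], "unknown")
    return record


options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as p:
    # Bounded read of the small reference table; it fits in worker
    # memory, so AsDict can hand the whole mapping to every worker.
    ref = (
        p
        | "ReadRef" >> beam.io.ReadFromBigQuery(
            query="SELECT sku, category FROM `my-project.dataset.ref`",
            use_standard_sql=True)
        | "ToKV" >> beam.Map(lambda row: (row["sku"], row["category"]))
    )

    (
        p
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/events")
        | "Enrich" >> beam.Map(enrich, ref=beam.pvalue.AsDict(ref))
        | "WriteBQ" >> beam.io.WriteToBigQuery(
            "my-project:dataset.enriched",
            schema="sku:STRING,category:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )
```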