Question.26 You want to analyze hundreds of thousands of social media posts daily at the lowest cost and with the fewest steps. You have the following requirements:
– You will batch-load the posts once per day and run them through the Cloud Natural Language API.
– You will extract topics and sentiment from the posts.
– You must store the raw posts for archiving and reprocessing.
– You will create dashboards to be shared with people both inside and outside your organization.
You need to store both the data extracted from the API to perform analysis as well as the raw social media posts for historical archiving. What should you do?
(A) Store the social media posts and the data extracted from the API in BigQuery.
(B) Store the social media posts and the data extracted from the API in Cloud SQL.
(C) Store the raw social media posts in Cloud Storage, and write the data extracted from the API into BigQuery.
(D) Feed the social media posts into the API directly from the source, and write the extracted data from the API into BigQuery.
Answer is (C) Store the raw social media posts in Cloud Storage, and write the data extracted from the API into BigQuery.
Social media posts can include images and videos, which cannot be stored in BigQuery. Cloud Storage is the right place to archive the raw posts for reprocessing, while the structured output of the Natural Language API is written to BigQuery for analysis and dashboarding.
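As an illustration only, a minimal Python sketch of the daily batch step could look like the following; the bucket raw-posts-archive and the table my-project.social.posts_nlp are hypothetical names, not part of the question.

```python
from google.cloud import language_v1, storage, bigquery

storage_client = storage.Client()
language_client = language_v1.LanguageServiceClient()
bq_client = bigquery.Client()

def archive_and_analyze(post_id, post_text):
    # Archive the raw post in Cloud Storage for reprocessing later.
    bucket = storage_client.bucket("raw-posts-archive")  # assumed bucket name
    bucket.blob(f"posts/{post_id}.txt").upload_from_string(post_text)

    # Extract sentiment (entities/topics can be requested the same way).
    document = language_v1.Document(
        content=post_text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    sentiment = language_client.analyze_sentiment(
        request={"document": document}
    ).document_sentiment

    # Write the extracted, structured results to BigQuery for dashboarding.
    bq_client.insert_rows_json(
        "my-project.social.posts_nlp",  # assumed table
        [{"post_id": post_id,
          "score": sentiment.score,
          "magnitude": sentiment.magnitude}],
    )
```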
Question.27 You want to automate execution of a multi-step data pipeline running on Google Cloud. The pipeline includes Cloud Dataproc and Cloud Dataflow jobs that have multiple dependencies on each other. You want to use managed services where possible, and the pipeline will run every day. Which tool should you use?
(A) cron
(B) Cloud Composer
(C) Cloud Scheduler
(D) Workflow Templates on Cloud Dataproc
Answer is (B) Cloud Composer
Cloud Composer is a managed Apache Airflow service and is well suited to orchestrating pipelines whose steps depend on one another, such as chained Dataproc and Dataflow jobs. Cloud Scheduler is only a managed cron service and cannot express dependencies between jobs.
Reference:
https://stackoverflow.com/questions/59841146/cloud-composer-vs-cloud-scheduler
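For reference, a minimal Cloud Composer (Airflow) DAG sketch of such a pipeline might look like the following; the cluster name, bucket paths, and template path are assumed placeholders.

```python
from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.operators.dataflow import DataflowTemplatedJobStartOperator

with DAG("daily_pipeline", start_date=datetime(2024, 1, 1),
         schedule_interval="@daily", catchup=False) as dag:

    # Step 1: a Dataproc (PySpark) job on an assumed existing cluster.
    prepare = DataprocSubmitJobOperator(
        task_id="dataproc_prepare",
        region="us-central1",
        job={
            "placement": {"cluster_name": "etl-cluster"},  # assumed cluster
            "pyspark_job": {"main_python_file_uri": "gs://my-bucket/prepare.py"},
        },
    )

    # Step 2: a Dataflow job launched from an assumed template path.
    transform = DataflowTemplatedJobStartOperator(
        task_id="dataflow_transform",
        template="gs://my-bucket/templates/transform",
        location="us-central1",
    )

    # Dependencies between the jobs are expressed directly in the DAG.
    prepare >> transform
```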
Question.28 You work for a shipping company that uses handheld scanners to read shipping labels. Your company has strict data privacy standards that require scanners to only transmit tracking numbers when events are sent to Kafka topics. A recent software update caused the scanners to accidentally transmit recipients’ personally identifiable information (PII) to analytics systems, which violates user privacy rules. You want to quickly build a scalable solution using cloud-native managed services to prevent exposure of PII to the analytics systems. What should you do?
(A) Create an authorized view in BigQuery to restrict access to tables with sensitive data.
(B) Install a third-party data validation tool on Compute Engine virtual machines to check the incoming data for sensitive information.
(C) Use Stackdriver logging to analyze the data passed through the total pipeline to identify transactions that may contain sensitive information.
(D) Build a Cloud Function that reads the topics and makes a call to the Cloud Data Loss Prevention API. Use the tagging and confidence levels to either pass or quarantine the data in a bucket for review.
Answer is (D) Build a Cloud Function that reads the topics and makes a call to the Cloud Data Loss Prevention API. Use the tagging and confidence levels to either pass or quarantine the data in a bucket for review.
Cloud Data Loss Prevention (DLP) is the managed service for protecting sensitive data such as PII: it can inspect, tag, and de-identify data in migrations, data workloads, and real-time collection and processing, which makes it a good fit for screening scanner events before they reach the analytics systems.
Reference:
https://cloud.google.com/dlp
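A rough sketch of option (D) as a Pub/Sub-triggered Cloud Function in Python could look like the following; the project ID, info types, and bucket names are assumptions made for illustration.

```python
import base64
from google.cloud import dlp_v2, storage

dlp = dlp_v2.DlpServiceClient()
storage_client = storage.Client()

def scan_message(event, context):
    # Decode the Pub/Sub message payload (scanner event).
    text = base64.b64decode(event["data"]).decode("utf-8")

    # Inspect the payload with the DLP API for likely PII.
    response = dlp.inspect_content(
        request={
            "parent": "projects/my-project",  # assumed project
            "inspect_config": {
                "info_types": [{"name": "PERSON_NAME"}, {"name": "EMAIL_ADDRESS"}],
                "min_likelihood": dlp_v2.Likelihood.LIKELY,
            },
            "item": {"value": text},
        }
    )

    # Route based on the findings: quarantine for review or pass through.
    bucket_name = "quarantine-bucket" if response.result.findings else "clean-bucket"
    storage_client.bucket(bucket_name).blob(context.event_id).upload_from_string(text)
```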
Question.29 You are a retailer that wants to integrate your online sales capabilities with different in-home assistants, such as Google Home. You need to interpret customer voice commands and issue an order to the backend systems. Which solution should you choose?
(A) Cloud Speech-to-Text API
(B) Cloud Natural Language API
(C) Dialogflow Enterprise Edition
(D) Cloud AutoML Natural Language
Answer is (C) Dialogflow Enterprise Edition
Dialogflow is the right choice because the solution must both recognize the spoken command and interpret the customer's intent before issuing an order to the backend systems. Speech-to-Text alone only transcribes audio, and the Natural Language APIs alone only analyze text.
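As a sketch, a backend call to Dialogflow that resolves a customer utterance into an intent might look like this; the project and session IDs are placeholders, and in practice Dialogflow can also accept audio input directly.

```python
from google.cloud import dialogflow_v2 as dialogflow

def detect_order_intent(project_id, session_id, text):
    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)

    # Send the (already transcribed) utterance to the Dialogflow agent.
    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code="en-US")
    )
    response = client.detect_intent(
        request={"session": session, "query_input": query_input}
    )

    # The matched intent (and its parameters) can then drive the order backend.
    result = response.query_result
    return result.intent.display_name, result.fulfillment_text
```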
Question.30 You are designing a data processing pipeline. The pipeline must be able to scale automatically as load increases. Messages must be processed at least once and must be ordered within windows of 1 hour. How should you design the solution?
(A) Use Apache Kafka for message ingestion and use Cloud Dataproc for streaming analysis.
(B) Use Apache Kafka for message ingestion and use Cloud Dataflow for streaming analysis.
(C) Use Cloud Pub/Sub for message ingestion and Cloud Dataproc for streaming analysis.
(D) Use Cloud Pub/Sub for message ingestion and Cloud Dataflow for streaming analysis.
Answer is (D) Use Cloud Pub/Sub for message ingestion and Cloud Dataflow for streaming analysis.
Cloud Pub/Sub is the managed ingestion service with at-least-once delivery, and Cloud Dataflow autoscales with load and supports windowing, so messages can be ordered within 1-hour windows during streaming analysis.
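A minimal Apache Beam (Cloud Dataflow) sketch of such a pipeline, assuming a hypothetical subscription name and placeholder per-window processing:

```python
import apache_beam as beam
from apache_beam import window
from apache_beam.options.pipeline_options import PipelineOptions

# In practice this would be launched with --runner=DataflowRunner.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub")  # assumed
        | "Window1h" >> beam.WindowInto(window.FixedWindows(60 * 60))
        | "KeyAll" >> beam.Map(lambda msg: ("all", msg))
        | "GroupWithinWindow" >> beam.GroupByKey()
        | "Process" >> beam.Map(lambda kv: len(kv[1]))  # placeholder per-window processing
    )
```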