GCP Data Engineering Dataproc – Quiz

This quiz contains 25 questions, each worth 1 point.

1. What is Google Cloud Dataproc?
A data analysis tool
A data storage service
A managed Spark and Hadoop service
A database management system

2. Which of the following is a component of Dataproc's architecture?
Compute Engine
App Engine
Firebase
BigQuery

3. What is the role of the master node in Dataproc clusters?
Storing data
Running the user's job
Coordinating the distribution of tasks
None of the above

4. Which command is used to create a Dataproc cluster using the command-line interface?
gcloud dataproc clusters create
gcloud dataproc clusters start
gcloud dataproc clusters init
gcloud dataproc clusters build
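
For review after answering question 4, here is a minimal sketch of how a cluster-creation command might be assembled in Python. The helper name `build_cluster_create_command`, the cluster name, and the region are illustrative assumptions, not part of the quiz; the sketch only builds the argument list and does not call `gcloud`.

```python
import shlex


def build_cluster_create_command(cluster_name, region, num_workers=2):
    """Return the argv list for creating a Dataproc cluster with gcloud.

    Hypothetical helper for illustration; it only builds the command.
    """
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--region={region}",
        f"--num-workers={num_workers}",
    ]


if __name__ == "__main__":
    # "example-cluster" and "us-central1" are placeholder values.
    argv = build_cluster_create_command("example-cluster", "us-central1")
    print(shlex.join(argv))
```

Passing the resulting list to `subprocess.run` would execute it, assuming the Google Cloud CLI is installed and authenticated.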

5. Which of the following applications is supported by Google Cloud Dataproc?
Apache Spark
MongoDB
Oracle DB
MySQL

6. What is the command to submit a Spark job in Dataproc?
gcloud dataproc jobs submit spark
gcloud dataproc jobs run spark
gcloud dataproc jobs start spark
gcloud dataproc jobs execute spark
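
For review after answering question 6, the following sketch shows how a Spark job submission command could be put together. The helper `build_spark_submit_command` and all argument values are assumptions chosen for illustration; the code builds the argument list only and does not run anything.

```python
def build_spark_submit_command(cluster, region, main_class, jars, job_args=()):
    """Return the argv list for submitting a Spark job to a Dataproc cluster.

    Hypothetical helper for illustration; it only builds the command.
    """
    argv = [
        "gcloud", "dataproc", "jobs", "submit", "spark",
        f"--cluster={cluster}",
        f"--region={region}",
        f"--class={main_class}",
        f"--jars={','.join(jars)}",
    ]
    if job_args:
        # Everything after "--" is passed to the Spark application itself.
        argv.append("--")
        argv.extend(job_args)
    return argv


if __name__ == "__main__":
    # Placeholder cluster, region, class, and jar path.
    print(build_spark_submit_command(
        "example-cluster",
        "us-central1",
        "org.apache.spark.examples.SparkPi",
        ["file:///usr/lib/spark/examples/jars/spark-examples.jar"],
        ["1000"],
    ))
```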

7. How can you monitor Dataproc jobs?
Using Google Cloud Logging
By checking the job output
Using Google Cloud Monitoring
All of the above

8. Which of the following can be used to optimize the costs of running Dataproc?
Using preemptible VMs
Deleting clusters when not in use
Using autoscaling
All of the above

9. Which of the following is a best practice when working with Dataproc?
Using default settings for all clusters
Keeping clusters running indefinitely
Storing sensitive data on clusters
Separating compute and storage resources

10. What is the default filesystem used by Dataproc?
HDFS
Cloud Storage
Local disk
NFS

11. Which of the following is not a component of Dataproc?
Master node
Worker node
Preemptible worker node
Serverless node

12. How can you view the status of a running job in Dataproc?
Using the Dataproc console
By checking the job logs
Via the gcloud command-line tool
All of the above

13. Which of the following is a benefit of using Dataproc?
Faster cluster startup times
Simplified cluster management
Cost efficiency
All of the above

14. What is the purpose of Dataproc initialization actions?
To enable autoscaling
To configure clusters and install additional software
To delete clusters
To monitor cluster performance

15. What is the primary use case of preemptible VMs in Dataproc clusters?
To store data
To run critical tasks
To save costs on non-critical or fault-tolerant tasks
To improve performance

16. Which of the following can be used to authenticate to Dataproc?
OAuth 2.0
API key
Service account
All of the above

17. What is the primary difference between worker nodes and preemptible worker nodes in Dataproc clusters?
Worker nodes store data, while preemptible worker nodes do not
Worker nodes are more expensive, while preemptible worker nodes are cheaper
Worker nodes can be used for critical tasks, while preemptible worker nodes should not be used for such tasks
All of the above

18. Which of the following is not a supported job type in Dataproc?
Spark
Hadoop
Hive
PostgreSQL

19. What is a Dataproc region?
A specific geographic location where your clusters can be deployed
A specific range of IP addresses used by your clusters
A security boundary for your clusters
A type of storage used by your clusters

20. Can you access data stored in Google Cloud Storage directly from a Dataproc cluster?
Yes
No

21. How can you scale a Dataproc cluster?
By adding more worker nodes
By adding more preemptible worker nodes
By using autoscaling policies
All of the above

22. What is the purpose of the Cloud Dataproc Job API?
To create and manage Dataproc clusters
To run and manage Dataproc jobs
To monitor Dataproc clusters
To store data used by Dataproc clusters

23. Which of the following is not a best practice when working with Dataproc?
Using Cloud Storage instead of HDFS for data storage
Keeping clusters running even when they're not in use
Separating compute and storage resources
Using preemptible VMs for non-critical tasks

24. What is the main advantage of Dataproc over self-managed Hadoop and Spark clusters?
Better performance
Lower costs
Simpler management and faster startup times
All of the above

25. Which of the following can be used to control access to Dataproc resources?
IAM roles
Network firewall rules
OAuth tokens
All of the above
