i2tutorials

PySpark SQL and Data Warehousing – Quiz

This quiz contains 25 questions, each worth 1 point.

1. What is PySpark SQL?
A programming language
A library in PySpark for processing structured and semi-structured data
A database management system
A type of data

2. Which of the following allows you to execute SQL queries using PySpark DataFrames?
SparkSQL
Hive
Python
Java

3. PySpark SQL can interact with:
Only internal data
Only external data
Both internal and external data
None of the above

4. What is the role of Hive in Spark SQL?
Data storage
Data processing
Both data storage and data processing
None of the above

5. Which of the following is NOT a data warehousing concept?
Star Schema
Snowflake Schema
Data Mining
Data Lake

6. In the context of data warehousing, what does a star schema represent?
A single fact table connected to dimension tables
A single dimension table connected to fact tables
Multiple fact tables connected to a single dimension table
None of the above

7. A snowflake schema in data warehousing is characterized by:
No normalization
Full normalization
Partial normalization
None of the above

8. What does the 'fact table' in a star or snowflake schema typically contain?
Transactional data
Descriptive data
Metadata
All of the above

9. In a star schema, dimension tables are usually:
Normalized
Denormalized
Partially normalized
None of the above

10. Which of the following is a best practice for data warehousing?
Data should be stored in a single, large table
Data should be fully normalized
Data should be properly indexed and partitioned
All of the above

11. Which types of data sources does PySpark SQL support?
CSV
JSON
Parquet
All of the above

12. The use of __________ improves the performance of Spark SQL.
Hive
Catalyst Optimizer
PySpark
Java

13. The DataFrame API in Spark SQL was inspired by data frames in:
R and Python
Java
C++
JavaScript

14. Which of the following SQL operations can be performed with PySpark DataFrames?
SELECT
JOIN
AGGREGATE
All of the above

15. The snowflake schema is an extension of:
Fact constellation schema
Star schema
Both fact constellation schema and star schema
None of the above

16. In data warehousing, a fact table typically does NOT contain:
Measurements
Numeric data
Foreign keys
Hierarchical data

17. The star schema is:
Completely normalized
Partially normalized
Completely denormalized
Partially denormalized

18. In the context of PySpark SQL, a UDF is:
Unidentified Data Function
User-Defined Function
Unified Data Format
User-Driven Format

19. PySpark SQL can read data directly from:
Hive
Hadoop HDFS
Local filesystem
All of the above

20. The central table in a star schema is called the:
Dimension table
Fact table
Star table
Central table

21. Which of the following is a primary advantage of the star schema?
It is highly normalized
It simplifies queries and improves query performance
It reduces the amount of data stored
It avoids data redundancy

22. What is a DataFrame in PySpark?
A distributed collection of data
A type of data structure
A type of data format
A type of database

23. Which of the following is NOT a benefit of using Spark SQL?
Ability to seamlessly mix SQL queries with Spark programs
Support for complex data processing
Support for a variety of data sources
Ability to perform operations in real time

24. Which of the following is NOT a component of a data warehouse?
Fact tables
Dimension tables
Metadata
Real-time processing engine

25. Which of the following is NOT true about a snowflake schema?
It has more joins than a star schema
It uses less space than a star schema
It is less complex than a star schema
It has normalized dimension tables
