This blog lists commonly asked and important Apache Spark interview
questions and answers that you should prepare. Each question comes
with a detailed answer to help you face Apache Spark interviews with
confidence, along with tips for cracking the interview.
To learn more about Apache Spark, follow this introductory guide.
Q. What features and characteristics of Apache Spark make it
superior to other Big Data solutions such as Hadoop MapReduce?
Q. What is a Resilient Distributed Dataset (RDD) in Apache Spark? How
does it provide abstraction in Spark and make Spark operator-rich?
Q. What is the RDD lineage graph (lineage operation) in Apache Spark?
Explain the lineage graph operator in Apache Spark and how it enables
fault tolerance in Spark.
Q. Explain the Apache Spark ecosystem components: Spark SQL, Spark Streaming, Spark MLlib and GraphX.
In which scenarios can we use these components? What types of problems can be solved using them?
Q. What is the difference between RDDs and DataFrames?
Q. What exactly is the difference between the reduce and fold operations in Spark?
Q. How do you process data using transformation operations in Spark?
Why are transformations needed in Spark? List the transformations
available in Spark.
Q. Briefly explain RDDs in Apache Spark. Why are RDDs used to
process data? What are the major features and characteristics of RDDs
(Resilient Distributed Datasets)?
Q. Briefly explain what an action is in Apache Spark. How are actions
used to generate final results? Provide some examples of actions.
Q. What is the use of the Spark driver, and where does it execute on the cluster?
Q. What is the Parquet file format? Where should the Parquet format be used? How do you convert data to the Parquet format?
Q. What are the benefits of Spark over MapReduce (Spark vs. MapReduce)?
Q. How do you split a single HDFS block into multiple RDD partitions?
Q. What are the roles and responsibilities of worker nodes in an
Apache Spark cluster? Is a worker node in Spark the same as a slave node?
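
As a starting point for the reduce versus fold question above, here is a minimal sketch that simulates both operations in plain Python rather than calling Spark. The helper names `simulated_reduce` and `simulated_fold` are hypothetical, and partitions are modelled as a list of lists. It illustrates the behaviour usually discussed in interviews: fold takes a zero value that is applied once per partition and once more when the partition results are merged, while reduce takes no zero value and fails on an empty dataset.

```python
from functools import reduce as py_reduce
import operator


def simulated_reduce(partitions, op):
    """Mimic reduce: combine every element; there is no zero value to fall back on."""
    # Reduce each non-empty partition, then merge the partial results.
    partials = [py_reduce(op, part) for part in partitions if part]
    if not partials:
        raise ValueError("cannot reduce an empty collection")
    return py_reduce(op, partials)


def simulated_fold(partitions, zero, op):
    """Mimic fold: the zero value seeds every partition and the final merge."""
    partials = [py_reduce(op, part, zero) for part in partitions]
    return py_reduce(op, partials, zero)


# Hypothetical data: four numbers spread over two partitions.
partitions = [[1, 2], [3, 4]]

reduce_sum = simulated_reduce(partitions, operator.add)       # 10
fold_neutral = simulated_fold(partitions, 0, operator.add)    # 10, zero is neutral
fold_biased = simulated_fold(partitions, 1, operator.add)     # 13, zero added 3 times
fold_empty = simulated_fold([[], []], 0, operator.add)        # 0, no error on empty data
```

With a neutral zero value (0 for addition) the two operations agree. With a non-neutral zero value the fold result depends on the number of partitions, which is why the zero value should be the identity element of the operation.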