Big Data Engineer interview questions shared by candidates
Given a continuous stream of numbers (perhaps positive integers) arriving on a UDP port, how would you find the distinct numbers every 5 minutes? Is the question complete, and does it give sufficient information?
I told him (the head of data intelligence) that it could easily be done using an auxiliary data structure such as a dictionary, cache, or set, but he wasn't ready to listen or understand. It was clear that he was inexperienced and unqualified for the job. He told me that it's a hard problem to solve and that I should read some books to understand it. He was so outdated and out of place that he was clearly a bad fit for the org. If Airtel wants to improve or move up, they clearly need more educated, skilled, and modern people in such roles. Such a waste.
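The set-based idea from the answer above can be sketched roughly as follows. This is a minimal illustration, not the approach the interviewer had in mind: the names `distinct_per_window`, `parse_datagram`, and `listen` are hypothetical, and the wire format (ASCII integers separated by whitespace in each datagram) is an assumption, since the question does not specify it. Windows are treated as tumbling 5-minute buckets keyed by arrival time.

```python
import socket
import time
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

WINDOW_SECONDS = 5 * 60  # 5-minute tumbling window, as in the question


def parse_datagram(payload: bytes) -> List[int]:
    """Assumed wire format: ASCII integers separated by whitespace."""
    return [int(token) for token in payload.decode("ascii").split()]


def distinct_per_window(
    events: Iterable[Tuple[float, int]],
    window_seconds: int = WINDOW_SECONDS,
) -> Dict[int, Set[int]]:
    """Map each window start (in seconds) to the set of distinct numbers seen in it.

    `events` yields (arrival_time_in_seconds, number) pairs.
    """
    windows: Dict[int, Set[int]] = defaultdict(set)
    for arrival_time, number in events:
        # Round the arrival time down to the start of its window.
        window_start = int(arrival_time // window_seconds) * window_seconds
        windows[window_start].add(number)
    return dict(windows)


def listen(port: int) -> None:
    """Read datagrams forever and print the distinct numbers of each finished window.

    A window is only flushed when the first datagram of a later window arrives;
    a production version would also flush on a timer.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    current_window = None
    seen: Set[int] = set()
    while True:
        payload, _sender = sock.recvfrom(65535)
        window = int(time.time() // WINDOW_SECONDS)
        if window != current_window:
            if current_window is not None:
                print(current_window * WINDOW_SECONDS, sorted(seen))
            current_window, seen = window, set()
        seen.update(parse_datagram(payload))
```

An exact set needs memory proportional to the number of distinct values in a window. Whether that is acceptable depends on details the question leaves open, such as the value range, the arrival rate, and whether an approximate answer would do.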
Project-level discussion of the work mentioned on the resume. Hadoop and Hive questions and one scenario-based MapReduce question. Spark questions on topics like RDD, DataFrame, Dataset, and Spark Streaming with Kafka. SQL and NoSQL questions and queries.
1. Hive heap size memory issue
2. Hive optimization
3. Hive sub-partition deletion
4. Hive stages
5. Add a column in between existing columns in Hive
6. Map join in Hive
7. Change or check block size
8. Spark optimization
9. Spark shared variables
10. Spark stages
11. Compression and decompression time of Parquet compared with other formats
12. Bloom filter in Hive
13. 3rd highest salary in each department using a Spark DataFrame (not using Hive or Spark SQL)
14. Spark cache vs. broadcast
15. Create a DataFrame in Spark
16. Spark partitions
17. ACID properties in Hive and Spark
18. Count in Spark without using the count() function
19. Shuffle read and shuffle write in Spark
20. Is Spark cache an action or a transformation?
21. Spark 1.x vs. 2.x
22. What is a hint?
Mostly questions on Java, Spark, and Hive. If you can explain the concepts clearly to them, you have a chance. Two people interviewed me, both with good knowledge of these topics. Overall a simple process, and they are very cooperative.