How do you debug Spark jobs?
Data Engineer Interview Questions
20,279 data engineer interview questions shared by candidates
Given a time-series dataset with the columns device_type, timestamp, and number_of_requests, find, for each device type, the minute at which the following 60 minutes sum to the greatest total number of requests.
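One possible approach to the question above is a two-pointer sliding window per device type. The sketch below is only an illustration under stated assumptions: rows arrive as (device_type, timestamp, number_of_requests) tuples with datetime timestamps, request counts are non-negative, and "the next 60 minutes" is read as the half-open interval [t, t + 60 min). The function name best_window_start is hypothetical, and an interviewer may intend a different window boundary.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def best_window_start(rows, window=timedelta(minutes=60)):
    """Return {device_type: (start_timestamp, total)} for the best window.

    Assumes non-negative counts, so the best window can always be taken to
    start at a timestamp that actually appears in the data.
    """
    # Group counts by device, merging rows that share a timestamp.
    by_device = defaultdict(lambda: defaultdict(int))
    for device, ts, requests in rows:
        by_device[device][ts] += requests

    result = {}
    for device, counts in by_device.items():
        times = sorted(counts)
        best_start, best_total = None, -1
        running = 0  # sum of counts for times[left:right]
        right = 0
        for start in times:
            # Grow the right edge while it stays inside [start, start + window).
            while right < len(times) and times[right] < start + window:
                running += counts[times[right]]
                right += 1
            if running > best_total:
                best_start, best_total = start, running
            # Drop the leftmost minute before moving the window start forward.
            running -= counts[start]
        result[device] = (best_start, best_total)
    return result


sample_rows = [
    ("phone", datetime(2024, 1, 1, 10, 0), 5),
    ("phone", datetime(2024, 1, 1, 10, 30), 7),
    ("phone", datetime(2024, 1, 1, 11, 0), 20),
    ("phone", datetime(2024, 1, 1, 11, 45), 1),
    ("tablet", datetime(2024, 1, 1, 9, 0), 3),
    ("tablet", datetime(2024, 1, 1, 10, 0), 4),
]
best = best_window_start(sample_rows)
```

After sorting, each timestamp enters and leaves the window once, so the scan is linear per device. In SQL or PySpark, the same idea is usually expressed with a range-based window frame over the timestamp, partitioned by device_type.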
Can you send us your resume?
Are you willing to relocate?
Tell me about the main differences between transient clusters and permanent (and shared) clusters.
SQL (follow-up): Now imagine a user who is "always" accessing table A, and you need to move the changes (in 10,000 rows) from the temp table into table A without affecting that user. How would you do that?
SQL: Imagine you want to make some changes to table A without actually changing table A. How do you do it?
Python: Have you ever used a context manager?
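For readers preparing for the context manager question, here is a minimal sketch of one written with the standard library's contextlib.contextmanager decorator. The names managed_resource and log are hypothetical and exist only for illustration. The point it tries to show is that the cleanup step runs even when the body of the with block raises.

```python
from contextlib import contextmanager


@contextmanager
def managed_resource(name, log):
    """Record setup and teardown around a block of code."""
    log.append(f"open {name}")  # setup: runs when the with block is entered
    try:
        yield name.upper()  # the value bound by "as" in the with statement
    finally:
        log.append(f"close {name}")  # teardown: runs even if the body raises


events = []
with managed_resource("db", events) as handle:
    events.append(f"using {handle}")

failed_events = []
try:
    with managed_resource("cache", failed_events):
        raise ValueError("simulated failure inside the block")
except ValueError:
    pass  # the exception propagates, but cleanup has already run
```

The same behavior can be written as a class with __enter__ and __exit__ methods; the decorator form is often shorter for simple setup and teardown pairs.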
Tech questions related to data engineering: Python, SQL, PySpark, and general CS questions.
What is the most complex thing you faced in your recent project? How did you resolve it? What was your approach? What did you learn from it?