# 2K

Senior Data Engineer interview questions shared by candidates

## Top Interview Questions

Senior Data Scientist was asked...21 October 2014

### How would you test if survey responses were filled at random by certain individuals, as opposed to truthful selections?

This is a very basic psychometrics question. Calculate Cronbach's alpha for the survey items. If it is low (below .5), it is very likely that the questions were answered at random.
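As a rough illustration of the Cronbach's alpha suggestion above, here is a minimal sketch using only the Python standard library. The function name `cronbach_alpha` and the input layout (one row per respondent, one column per survey item) are assumptions made for this example. Note that alpha summarizes a group of respondents, so flagging specific individuals would need some per-respondent measure on top of it.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (respondents) of numeric item scores.

    Hypothetical helper: assumes at least 2 items, at least 2 respondents,
    and no missing values.
    """
    k = len(responses[0])  # number of survey items
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Perfectly consistent answers (every respondent gives the same score to all items) give an alpha of 1, while independent random answers give values near 0, which is what the .5 threshold in the answer is meant to catch.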

I would design the test so that certain information is asked in two different ways. If the two answers disagree with each other, I would seriously doubt the validity of the answers.

We need to build a histogram for each question in the survey to see the distribution of answers to that question. The histograms will likely follow a normal distribution if the selections are truthful. If more than half of one response's answers lie outside the 95% confidence interval of the corresponding histograms, the response will be categorized as random: falling outside the mean plus tw…

### How would you build and test a metric to compare two users' ranked lists of movie/TV show preferences?

1) Develop a list of shows/movies that are representative of different taste categories (more on this below).
2) Obtain rankings of the items in the list from 2 users.
3) Use Spearman's rho (or another test that works with rankings) to assess dependence/congruence between the 2 people's rankings.

To find shows/movies to include in the measurement instrument, maybe do a cluster analysis on a large number of viewers' viewing habits.
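The Spearman's rho step can be sketched as follows. This is a minimal illustration, assuming both users rank the same set of titles with no ties, so the shortcut formula rho = 1 - 6 Σd² / (n(n² - 1)) applies. The function name and the dict-based input format are assumptions for this example.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings of the same titles (no ties assumed).

    rank_a, rank_b: dicts mapping title -> rank, where 1 is the favourite.
    """
    if set(rank_a) != set(rank_b):
        raise ValueError("both users must rank the same titles")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two titles")
    # Sum of squared rank differences over all titles
    d_squared = sum((rank_a[t] - rank_b[t]) ** 2 for t in rank_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

A value of 1 means identical rankings, -1 means exactly reversed rankings, and values near 0 suggest little agreement. With ties, a rank-correlation routine that handles them would be needed instead of this shortcut.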

Look at the mean average precision of the movies that the users watch out of the rankings. So if, out of 10 recommended movies, one user prefers the third and the other user prefers the sixth, the recommendation engine of the user who preferred the third would be better. InterviewQuery.com has a more in-depth answer.

It's essential to demonstrate that you can really go deep... there are plenty of follow-up questions and (sometimes tangential) angles to explore. There are a lot of Senior Data Scientist experts who've worked at Netflix and who provide this sort of practice through mock interviews. There's a whole list of them curated on Prepfully. prepfully.com/practice-interviews

### The percentage of female customer base

Wrote the SQL query to answer this question

Do you have any details on Python questions?

You need demographics data for this. The query would be fairly simple.
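For illustration only, one possible shape of such a query is sketched below, run against an in-memory SQLite database from Python. The table name `customers`, the column `gender`, and the `'F'` coding are all hypothetical, since the original question does not describe the schema.

```python
import sqlite3


def percent_female(conn):
    """Percentage of customers with gender 'F' among customers with a known gender.

    Assumes a hypothetical table customers(id, gender).
    """
    query = """
        SELECT 100.0 * SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END)
               / COUNT(gender) AS pct_female
        FROM customers
    """
    return conn.execute(query).fetchone()[0]


# Tiny made-up dataset, purely to exercise the query
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, gender TEXT)")
conn.executemany(
    "INSERT INTO customers (gender) VALUES (?)",
    [("F",), ("M",), ("F",), (None,), ("M",)],
)
pct = percent_female(conn)
```

`COUNT(gender)` skips NULLs, so customers with unknown gender are left out of the denominator. Whether that is the right choice is worth confirming with the interviewer; `COUNT(*)` would include them.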

### Given a list, create a new list that does not include the duplicates of the original list.

a = old list, b = new list. Code: b = list(set(a)). Note that this does not preserve the original order of the elements.
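If the order of first appearance matters, a small variation works. The sketch below assumes the elements are hashable; the function name `dedupe` is made up for this example.

```python
def dedupe(items):
    """Return a new list without duplicates, keeping first-occurrence order.

    Relies on dicts preserving insertion order (Python 3.7+).
    """
    return list(dict.fromkeys(items))
```

For example, `dedupe([3, 1, 3, 2, 1])` returns `[3, 1, 2]`, whereas converting through a set gives no ordering guarantee.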

Maybe they were asking to do it in-place. In that case, switch the duplicate elements to the end.
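One way to read the in-place suggestion is sketched below: unique elements are swapped towards the front and duplicates drift to the tail. This is an assumption about what the interviewer might have wanted, not a confirmed requirement. It uses an auxiliary set, so the elements must be hashable.

```python
def dedupe_in_place(items):
    """Move first occurrences to the front of items, duplicates to the end.

    Returns the number of unique elements; items[:count] holds them in
    first-occurrence order.
    """
    seen = set()
    write = 0  # next slot for a unique element
    for read in range(len(items)):
        if items[read] not in seen:
            seen.add(items[read])
            # Everything between write and read is a duplicate, so this swap
            # pushes one duplicate back and pulls the new unique value forward.
            items[write], items[read] = items[read], items[write]
            write += 1
    return write
```

Calling it on `[1, 1, 2, 1, 3]` rearranges the list to `[1, 2, 3, 1, 1]` and returns 3.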

Python, 4 lines of code.

### Technical case interview which is a mix of modelling skills + classical case interview structure

Wow, sorry to hear that. Of all of Gamma’s shortcomings, lack of common courtesy/EQ would not be on my radar’s radar.

Hi there, thank you for sharing your experience. Just a quick question: do you remember how long you waited until you heard back after the business case interview stage? Thanks!

Hi there, sorry you had a bad experience with this interviewer. Do you mind giving us the first name of this interviewer? Or at least first and last initials? I'll be sure to contact this employee and point them to training resources at BCG. Thanks.

### Imagine you have N pieces of rope in a bucket. You reach in and grab one end-piece, then reach in and grab another end-piece, and tie those two together. What is the expected value of the number of loops in the bucket?

Do the question and answer make sense? I thought the answer was 1/(2n-1). I don't understand why the solution adds the probabilities for all cases from 1 to N together. For the 2-rope case, p(1 loop) = 1/3, so the expected number of loops is also 1/3. Why is the answer 1 + 1/3 = 4/3? Am I missing something?

You are right, the long answer fails a simple boundary condition: if you tie only once after picking two ends, the maximum number of loops is one! So p(n) lies in [0, 1], lol.
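The two numbers in this thread appear to come from two readings of the question. With a single tie, the expected number of loops is 1/(2N-1). If the tying is repeated until no free ends remain (an assumption, since the question as posted does not say so), linearity of expectation gives the sum of 1/(2k-1) for k = 1..N, which is 1 + 1/3 = 4/3 for N = 2. The sketch below computes that sum exactly and checks it by simulation; the function names are made up for this example.

```python
import random
from fractions import Fraction


def expected_loops(n):
    """Exact expected loop count when tying continues until no free ends remain."""
    return sum(Fraction(1, 2 * k - 1) for k in range(1, n + 1))


def simulate_loops(n, trials, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Each free end is labelled with the strand it belongs to
        ends = [strand for strand in range(n) for _ in range(2)]
        while ends:
            a = ends.pop(rng.randrange(len(ends)))
            b = ends.pop(rng.randrange(len(ends)))
            if a == b:
                total += 1  # both ends of one strand: a loop closes
            else:
                # Two strands merge; the remaining end of b now belongs to a
                ends[ends.index(b)] = a
    return total / trials
```

Under this reading, `expected_loops(2)` is `Fraction(4, 3)`, and the simulated average for N = 2 should land close to it.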

I got the correct answer, but the mathematician yelled at me for arriving too slowly at such an "easy" answer.

### If you take 3 consecutive numbers (n, n+1, n+2) and know that n and n+2 are prime numbers, can you prove that n+1 is always divisible by 6?

3, 4, 5. 3 and 5 are prime. 4 is not divisible by 6.

For n > 3: n+1 is divisible by 2, since n and n+2 are primes greater than 2 and therefore odd. Exactly one of n, n+1 and n+2 is divisible by 3; n and n+2 are primes greater than 3, so it must be n+1. Hence n+1 is divisible by 6.

1. If n and n+2 are prime (and n > 3), then n+1 is divisible by 2. 2. One of the three consecutive numbers (n, n+1, n+2) must be divisible by 3; because n and n+2 are prime, n+1 is divisible by 3. So n+1 is divisible by 6.
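A quick numeric check of the argument, including the (3, 5) counterexample raised in the first answer, might look like this. It is a brute-force sketch for small values, not a proof, and the helper names are made up for this example.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def twin_prime_middles(limit):
    """Values n + 1 for every n < limit such that n and n + 2 are both prime."""
    return [n + 1 for n in range(2, limit) if is_prime(n) and is_prime(n + 2)]
```

`twin_prime_middles(20)` returns `[4, 6, 12, 18]`: the only middle value not divisible by 6 is 4, from the pair (3, 5), which is why the claim needs n > 3.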

### Do you like beer or wine more?

I was a bit taken aback, then I answered beer.

My "deal" bottle is an 18 year Glenfiddich. I drink whiskey neat when relaxing at home or with co-workers. I drink gin/vodka soda with lime when at business functions so I can order a soda water with lime after the first round to keep up appearances while keeping my wits about me : )

### You have a bucket full of gems of 3 colors (i.e. Red, Green, Blue). There is more than 1 gem of each color. What is the minimum number of gems you would have to take to always ensure that you have 2 of the same color? What if I had n colors?

stupid!

You would need to pick 4 gems. With n colors, you would need n+1 gems.
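This is the pigeonhole principle: n gems can all be different colors, but n+1 cannot. For small n, the claim can be checked by brute force, as in the sketch below. It assumes the bucket has enough gems of every color that any color sequence can be drawn, and the function name is made up for this example.

```python
from itertools import product


def min_draws_for_pair(num_colors):
    """Smallest number of draws that guarantees two gems of the same color.

    Brute force over every possible color sequence; only practical for
    small num_colors.
    """
    colors = range(num_colors)
    draws = 1
    # Keep increasing the draw count while some sequence has all-distinct colors
    while any(len(set(seq)) == len(seq) for seq in product(colors, repeat=draws)):
        draws += 1
    return draws
```

For 3 colors this returns 4, matching the answer above, and for the small values of n checked it agrees with n+1.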