# Data Scientist Interview Questions in San Jose, CA, US

Data scientist interview questions shared by candidates

## Top Interview Questions

Find the second largest element in a binary search tree.

Find the rightmost element. If it is a node with no children, return its parent. If it has a left child, return the largest element of its left subtree.

One addition is the situation where the tree has no right branch, so the root is the largest element. In this special case the largest element has no parent. It is better to keep track of both a parent pointer and a current pointer: if they differ, the original method by the candidate works well; if they are the same (the root situation), return the largest element of the root's left branch.

`if (root == null || !root.hasRightChild()) { return null; } else return findSecondGreatest(root, root.getValue());`

`value findSecondGreatest(Node curr, value oldValue) { if (curr.hasRightChild()) { return findSecondGreatest(curr.getRightChild(), curr.value); } else return oldValue; }`

As written, this recursive version returns the parent of the rightmost node, so it does not cover a rightmost node that has a left subtree, and it returns null when the root has no right child.
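The parent/current pointer approach described in these responses can be sketched in Python as follows. This is a minimal illustration, not any candidate's actual answer: the `Node` class and the function names are hypothetical, and the sketch assumes all values in the tree are distinct.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical minimal BST node, introduced only for this sketch."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def largest(node: Node) -> Node:
    """Return the rightmost (largest) node of a non-empty subtree."""
    while node.right is not None:
        node = node.right
    return node


def second_largest(root: Optional[Node]) -> Optional[int]:
    """Return the second largest value, or None if the tree has fewer than two nodes."""
    if root is None or (root.left is None and root.right is None):
        return None
    parent, curr = None, root
    # Walk to the rightmost node, remembering its parent along the way.
    while curr.right is not None:
        parent, curr = curr, curr.right
    # If the largest node has a left subtree, the answer is that subtree's maximum.
    # This also covers the case where the root itself is the largest node.
    if curr.left is not None:
        return largest(curr.left).value
    # Otherwise the largest node is a leaf, so its parent is the second largest.
    return parent.value
```

The walk visits one root-to-leaf path, so the running time is proportional to the height of the tree, and no extra storage beyond the two pointers is needed.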

Generate a sorted vector from two sorted vectors.
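One common way to approach this question is the two-pointer merge step from merge sort. The sketch below assumes both inputs are already sorted in ascending order and uses Python lists in place of vectors; the function name `merge_sorted` is made up for illustration.

```python
from typing import List


def merge_sorted(a: List[int], b: List[int]) -> List[int]:
    """Merge two ascending lists into one ascending list."""
    merged: List[int] = []
    i = j = 0
    # Repeatedly take the smaller front element of the two inputs.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of these slices is non-empty; append the leftover tail.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged
```

Each element is examined once, so the merge is linear in the combined length of the inputs.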

1. What's the relationship between PCA and k-means clustering?
2. What are the requirements for a matrix to represent a kernel? What happens if we run an SVM using a "kernel" that does not satisfy these requirements?
3. Problems using Python lists and dictionaries.
4. SQL joins, aggregates (count, sum, avg), and cases.
5. If you were given a dataset with [X] features (numerical, categorical, etc.) and you wanted to build a model (to detect fraudulent transactions, say), how would you determine which features are best to use in the model?

How do you know if one algorithm is better than another?

Business sense. A question on how to assess the impact of a hypothetical feature and possible problems.

What makes you special, what makes you stand out over everyone else? Heh heh.

Classification vs. regression metrics for evaluation; how to handle missing or corrupt data; segmentation; objective/loss function definitions; how do you imagine an ML system; broadcasting in NumPy.

How would you design a model for time series data using an LSTM?

There are 25 horses. You can race any 5 of them at once, and all you get is the order they finished. How many races would you need to find the 3 fastest horses?

Given an array of integers, find the maximum cumulative sum of a sub-set of the array.
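The question does not say whether "sub-set" means a contiguous run of elements. Assuming it does (the usual maximum subarray problem) and that the subarray must be non-empty, a minimal sketch using Kadane's algorithm looks like this. The function name is hypothetical. If arbitrary subsets were meant instead, the answer would simply be the sum of the positive elements.

```python
from typing import List


def max_subarray_sum(nums: List[int]) -> int:
    """Return the largest sum over all non-empty contiguous subarrays."""
    if not nums:
        raise ValueError("nums must be non-empty")
    # best: best sum seen so far; current: best sum of a subarray ending here.
    best = current = nums[0]
    for x in nums[1:]:
        # Either extend the previous subarray or start fresh at x.
        current = max(x, current + x)
        best = max(best, current)
    return best
```

The loop makes a single pass over the array and keeps only two running values, so it uses linear time and constant extra space.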
