Are machine learning and algorithm interviews too hard?
Source:
https://mp.weixin.qq.com/s/GrkCvU2Ia_mEaQmiffLotQ
Shi Xiaowen
In August I took part in a number of early-batch recruiting interviews, at companies including Alibaba, Baidu, Toutiao, Beike, Yidian Zixun and others. I have collected some of the interview questions to share with everyone.
First, basic machine learning questions
1. The formulas of the LSTM
2. Why do gradients vanish in an RNN, and the derivation of BPTT
3. What is the basic principle of DQN?
4. What is the difference between GBDT and random forest?
5. The principle of GBDT, and how it is used for classification and regression
6. In which respects is a random forest random?
7. The principle of Wide & Deep
8. How is GBDT+LR done?
9. Why does the DQN model use experience replay?
10. What happens if the data are not independent and identically distributed?
11. Explain the principle of AUC
12. The difference between XGBoost and GBDT
13. The difference between reinforcement learning and supervised learning
14. What loss functions are used in neural networks?
15. What are the common activation functions in machine learning? Why is a zero-mean output usually wanted?
16. Introduce DeepFM
17. The derivation of FM
18. What is the difference between boosting and bagging?
19. Why can bagging reduce variance?
20. The cross-entropy loss function, and its form for binary (0-1) classification. What is a convex function? For binary classification, why use cross-entropy rather than the squared loss?
21. What is the difference between L1 and L2? Explain from a mathematical point of view why L2 can improve the generalization ability of the model.
22. What are the differences between L2 and dropout in deep learning?
23. What are the benefits of L1 regularization?
24. If there are 10,000 geographical coordinates, converted to the numbers 1-10000, can a decision tree be used?
25. What is the difference between the CART classification tree and ID3 and C4.5?
26. Tree ensemble models can be built in several ways, such as bagging and boosting; many details were asked during the answer. In which respects is a random forest random, how does AdaBoost change the sample weights, and what does a GBDT classification tree fit?
27. What is the difference between Dueling DQN and DQN?
28. What effect does early stopping have on the parameters?
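For question 11, one way to make AUC concrete is its pairwise interpretation: AUC is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. The following is a minimal sketch under that definition; the function name auc_by_pairs and the sample data are illustrative assumptions, not part of the original questions.

```python
def auc_by_pairs(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of model scores, same length as labels.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels and scores, for illustration only.
    print(auc_by_pairs([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

The double loop costs O(P x N) and is meant for clarity; sorting the scores once and using ranks gives the same value in O(n log n).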
Second, data structure and algorithm questions
1. Given K sorted arrays, find the shortest interval that contains at least one number from each array
2. Given n numbers in the range [0,n], count the occurrences of each number (without allocating additional space)
3. All permutations of an array (space complexity O(1))
4. Split a pile of banknotes as evenly as possible (using the idea of the knapsack problem)
5. The maximum of the shortest-path lengths in an acyclic graph (Floyd algorithm)
6. Level-order traversal of a binary tree
7. The longest common subsequence of strings (dynamic programming)
8. Pre-order traversal and zigzag traversal of a tree (non-recursive)
9. In an array, every number appears twice except one number that appears once; return this number (bitwise operations)
10. In an array, one number appears more than half of the time; return this number
11. Return the result of a division as a string: if the division terminates, return the quotient; if it does not, mark the infinitely repeating part with [].
12. Array sorting: assuming the absolute difference between each element's position after sorting and its position before sorting is less than K, what algorithm is faster than a standard sort?
13. The first common ancestor of two nodes in a tree
14. Determine whether a linked list is a palindrome
15. Determine whether two linked lists share a common node
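Questions 9 and 10 both have well-known constant-space answers. The sketch below uses XOR for question 9 (as the bitwise hint suggests) and, as one possible approach that the original does not name, the Boyer-Moore majority vote for question 10. The function names are illustrative, and majority_element assumes that a majority element really exists, as the question states.

```python
from functools import reduce
from operator import xor


def single_number(nums):
    """Return the value that appears once when every other value appears twice.

    x ^ x == 0 and x ^ 0 == x, so all paired values cancel out.
    """
    return reduce(xor, nums, 0)


def majority_element(nums):
    """Boyer-Moore majority vote: O(n) time, O(1) extra space.

    Assumes some value occurs more than len(nums) / 2 times; without that
    guarantee the returned candidate must be verified with a second pass.
    """
    candidate, count = None, 0
    for x in nums:
        if count == 0:
            # Start a new candidate when the previous one has been cancelled.
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    return candidate


if __name__ == "__main__":
    print(single_number([4, 1, 2, 1, 2]))
    print(majority_element([2, 2, 1, 1, 2]))
```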
Third, practical questions
1. If you want to add a feature to the model, how do you determine whether the feature is effective?
2. What is the difference between LR and FM? Does FM need cross features to be chosen by hand? If LR with a hand-picked set of cross features achieves a better result than FM, why might that be? If FM is changed to DeepFM and then outperforms LR, why is that?
3. If all the samples for a logistic regression are positive samples, what hyperplane does it learn?
4. For which classification scenarios is the cross-entropy loss function not suitable?
5. What do you think is the most important part of the recommendation system?
6. For the multi-armed bandit there are many methods, such as epsilon-greedy, Thompson sampling and UCB; what are the applicable scenarios of each?
7. How would you predict the sales of a sub-category in a shop?
8. Sampling from an information stream: there are n items, but n is not known in advance; design a sampling algorithm so that each item is selected with the same probability.
9. When a model is evaluated offline and then used online, the actual online results are often worse than the offline results; please analyze the possible reasons.
10. In a CTR estimation problem, suppose the ratio of positive to negative samples is 1:4 in the training data and also 1:4 in the test data, so the model's average predicted click-through rate on the test set is 1/(1+4). If an undersampling strategy is used so that the ratio of positive to negative samples becomes 1:1, what should the average predicted CTR on the same test set be? (Large sample size; the initial total number of samples is 1 billion.)
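Two of the questions above lend themselves to a short sketch. For question 8, a standard answer is reservoir sampling, which keeps k items from a stream of unknown length so that every item ends up selected with probability k/n. For question 10, a relation often used when negatives have been undersampled maps a prediction p' made on the sampled scale back to the original scale as p = p' / (p' + (1 - p') / w), where w is the fraction of negatives kept. The names reservoir_sample, recalibrate and neg_keep_rate are illustrative assumptions, and the code is not presented as the answer the interviewers expected.

```python
import random


def reservoir_sample(stream, k=1, rng=random):
    """Uniformly sample up to k items from an iterable of unknown length.

    The first k items fill the reservoir. Item number i (0-based, i >= k)
    then replaces a random slot with probability k / (i + 1).
    """
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)  # both endpoints are inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


def recalibrate(p_sampled, neg_keep_rate):
    """Map a probability predicted on negative-undersampled data back to
    the original class balance.

    neg_keep_rate is the fraction of negatives kept during undersampling,
    e.g. 0.25 when a 1:4 ratio is reduced to 1:1 (an assumption matching
    the numbers in question 10).
    """
    return p_sampled / (p_sampled + (1.0 - p_sampled) / neg_keep_rate)


if __name__ == "__main__":
    print(reservoir_sample(range(1000), k=5))
    print(recalibrate(0.5, 0.25))
```

With the numbers in question 10, a raw prediction of 0.5 on the 1:1 scale maps back to 0.2 on the original 1:4 scale. What the uncorrected average on the test set turns out to be depends on how well the model separates the classes.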