Are machine learning and algorithm interviews too hard?

Last Update: 2018-09-06 Source: Internet

Author: User

Source:
https://mp.weixin.qq.com/s/GrkCvU2Ia_mEaQmiffLotQ
Shi Xiaowen

In August I took part in a number of early-batch recruiting interviews, including Alibaba, Baidu, Toutiao, Beike, Yidian Zixun, and others. I collected some of the interview questions and am sharing them here.

First, basic machine learning questions

1. Write out the LSTM formulas.

2. Why do gradients vanish in an RNN, and how is BPTT derived?

3. What is the basic principle of DQN?

4. What is the difference between GBDT and random forest?

5. How does GBDT work, and how is it used for classification and for regression?

6. Where does the randomness in a random forest come from?

7. How does Wide & Deep work?

8. How is GBDT+LR done?

9. Why does the DQN model use experience replay?

10. What happens if the data is not independent and identically distributed (i.i.d.)?

11. Explain the principle behind AUC.

12. What is the difference between XGBoost and GBDT?

13. What is the difference between reinforcement learning and supervised learning?

14. What loss functions are used in neural networks?

15. What are the common activation functions in machine learning? Why is a zero-mean output usually desirable?

16. Introduce DeepFM.

17. Derive FM.

18. What is the difference between boosting and bagging?

19. Why can bagging reduce variance?

20. The cross-entropy loss function: what form does it take for 0-1 (binary) classification? What is a convex function? For 0-1 classification, why use cross-entropy rather than the squared loss?

21. What is the difference between L1 and L2 regularization? Explain from a mathematical point of view why L2 can improve the generalization ability of a model.

22. What are the differences between L2 regularization and dropout in deep learning?

23. What are the benefits of L1 regularization?

24. If 10,000 geographical coordinates are converted to the numbers 1 to 10,000, can a decision tree be used on them?

25. What is the difference between the CART classification tree and ID3 and C4.5?

26. Tree ensemble models can be built in several ways, such as bagging and boosting. Many follow-up details were asked during the answer: where does the randomness in a random forest come from, how does AdaBoost change the sample weights, and what does each tree fit in GBDT classification?

27. What is the difference between Dueling DQN and DQN?

28. What effect does early stopping have on the parameters?
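As an illustration for question 11, AUC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The function name `auc_pairwise` and the half-credit rule for ties are assumptions of this example, not something the source specifies, and the quadratic loop is meant for clarity rather than speed.

```python
def auc_pairwise(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of model scores, same length as labels.
    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0   # positive ranked above negative
            elif p == n:
                correct += 0.5   # tie: half credit
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    print(auc_pairwise([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Because only the relative order of scores matters here, the value does not change under any strictly increasing transformation of the scores.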

Second, data structure and algorithm questions

1. Given K sorted arrays, find the shortest interval that contains at least one number from each array.

2. Given n numbers in the range [0, n], count the occurrences of each number (no additional space may be allocated).

3. Generate all permutations of an array (space complexity O(1)).

4. Divide a pile of banknotes as evenly as possible (using the idea of the knapsack problem).

5. Find the maximum of the shortest-path lengths in an acyclic graph (Floyd algorithm).

6. Level-order traversal of a binary tree.

7. The longest common subsequence of strings (dynamic programming).

8. Preorder traversal and zigzag traversal of a tree (non-recursive).

9. In an array, every number appears twice except for one number that appears once; return that number (bitwise operations).

10. In an array, one number appears more than half of the time; return that number.

11. Return the result of a division as a string. If the division terminates, return the quotient; if it does not, mark the infinitely repeating part with [].

12. Array sorting: assuming that each element's position after sorting differs from its position before sorting by less than K in absolute value, what faster sorting algorithm can be used?

13. The first common ancestor of two nodes in a tree.

14. Determine whether a linked list is a palindrome.

15. Determine whether two linked lists share a common node.
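As an illustration for questions 9 and 10, here is a minimal Python sketch. Question 9 names bitwise operations as the intended approach, so XOR is used there. For question 10 the source names no method; the Boyer-Moore voting algorithm below is one common choice and is an assumption of this example, as are the function names.

```python
from functools import reduce
from operator import xor


def single_number(nums):
    """Return the value that appears once when every other value appears twice."""
    # x ^ x == 0 and x ^ 0 == x, so all paired values cancel out.
    return reduce(xor, nums, 0)


def majority_element(nums):
    """Boyer-Moore voting: return the value occurring more than len(nums) / 2 times.

    Assumes such a value exists; if it does not, the returned candidate
    is not meaningful and would need a second verification pass.
    """
    candidate, count = None, 0
    for x in nums:
        if count == 0:
            candidate = x          # start tracking a new candidate
        count += 1 if x == candidate else -1
    return candidate


if __name__ == "__main__":
    print(single_number([4, 1, 2, 1, 2]))
    print(majority_element([2, 2, 1, 1, 1, 2, 2]))
```

Both functions run in one pass with O(1) extra space.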

Third, practical questions

1. If you want to add a feature to a model, how do you determine whether the feature is effective?

2. What is the difference between LR and FM? Does FM require choosing which features to cross? If LR with a hand-picked set of crossed features performs better than FM, why might that be? If FM is replaced with DeepFM and it performs better than LR, why?

3. If all the samples given to a logistic regression are positive, what hyperplane does it learn?

4. In which classification scenarios is the cross-entropy loss function not suitable?

5. What do you think is the most important part of the recommendation system?

6. The multi-armed bandit problem has many methods, such as ε-greedy, Thompson sampling, and UCB. Which scenarios is each method suited to?

7. How would you forecast the sales of a sub-category in a shop?

8. Sampling from a stream: there are n data items, but n is not known in advance. Design a sampling algorithm so that each item is selected with the same probability.

9. Between offline evaluation and online use of a model, the actual online performance is often worse than the offline performance. Analyze the possible reasons.

10. In a CTR estimation problem, assume the ratio of positive to negative samples in the training data is 1:4 and the ratio in the test data is also 1:4, so the model's average predicted click-through rate on the test set is 1/(1+4). If an undersampling strategy is used so that the ratio of positive to negative samples becomes 1:1, what should the average predicted CTR be on the same test set? (The sample size is large; the initial total number of samples is 1 billion.)
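As an illustration for question 8, reservoir sampling is one standard way to sample uniformly from a stream whose length is not known in advance. The sketch below is a minimal version under that assumption. The function name `reservoir_sample`, the sample size parameter `k`, and the optional `rng` argument are hypothetical names introduced for this example.

```python
import random


def reservoir_sample(stream, k=1, rng=None):
    """Pick k items uniformly at random from an iterable of unknown length.

    Returns all items if the stream holds fewer than k of them.
    """
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) enters the reservoir with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(iter(range(1000)), k=5, rng=random.Random(0)))
```

The stream is read once and only k items are kept in memory. By induction, after i + 1 items have been seen, each of them is in the reservoir with probability k / (i + 1).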
