Hello everyone, I am Dong Fei from Silicon Valley. At the invitation of friends in China, I am very happy to talk with you about interview strategies for big data engineers in the U.S.
Personal Introduction
First, a self-introduction. After finishing my undergraduate degree at Nankai University, I joined a startup, Kuxun, where I worked on real-time information retrieval. I then joined Baidu's infrastructure group and built an early version of the Baidu App Engine. After that I went to Duke University for a master's degree, where I worked on Starfish, a research project on Hadoop and big data. I interned in Amazon's EC2 department to learn about their internal architecture, and after graduation I joined LinkedIn's ads infrastructure group, working on Hadoop tuning, data pipelines, and offline, online, and real-time systems. Most recently I have been working as a data engineer at Coursera.
Over the years, besides pursuing technology, I have also accumulated a lot of interview experience: from top-tier Chinese internet companies (Baidu, Alibaba, Qihoo, Renren), to top-tier U.S. companies (Facebook, Google, LinkedIn, Twitter, Amazon), to popular startups (Uber, Pinterest, Airbnb, Box, Dropbox, Snapchat, Houzz). I received 10+ offers, and while at LinkedIn I interviewed 100+ candidates and helped write the interview questions. I am glad to share this and help more people succeed in their job search and reach their goals.
Introduction to Silicon Valley high-tech companies
Let's take a look at this map of Silicon Valley. It is in California, a long strip running from San Jose to San Francisco around San Francisco Bay, also called the Bay Area. The name comes from silicon, which the core processors of computers cannot do without. Over the past 30 years, Silicon Valley has become the cradle of countless technology startups. More than 20 years ago, many hardware and enterprise software companies went public successfully, such as Intel, Oracle, Apple, and Cisco. About 10 years ago, the rise of the internet produced Yahoo, Google, and eBay. Now Tesla, Facebook, Twitter, and LinkedIn are among the leading U.S. high-tech stocks. The market capitalization of these companies ranges from tens of billions to hundreds of billions of dollars, with P/E ratios from negative to over a thousand. Behind the crazy valuations, the world is changing.
If I had to give reasons for the success of Silicon Valley, I think there are two:
On the one hand, its location has a unique power to attract talent. Stanford and the University of California are here to provide a brain trust, and in Silicon Valley you can meet some of the smartest people from around the world. Chinese, Indian, and Jewish engineers make up a large share of the workforce. Although programmers in China jokingly call themselves "code farmers", being a good engineer in Silicon Valley is quite rewarding.
On the other hand, entrepreneurship is an eternal topic. At Stanford there is a saying that the air itself smells of entrepreneurship. Some early employees, having built up wealth and experience through IPOs, have become angel investors, and Y Combinator, the various technical forums, meetups, and startup mentors are all very active. The power of capital cannot be ignored either: early VCs created a snowball effect through investments, acquisitions, and IPOs. People always like to ask what the next big thing is, who will be the next Facebook or the next Musk. According to statistics, it used to take about 10 years to build a company worth more than a billion dollars, and that process is now getting shorter.
I'll take LinkedIn as an example of what a high-tech company in the FLG group (Facebook, LinkedIn, Google) looks like. Founded in 2003, it is a professional social networking site. It grew steadily over about 10 years rather than exploding overnight, and it currently has 300 million users worldwide. That does not compare with the 1 billion+ users of Facebook and Google, but it has a good moat: its users are precisely targeted high-end professionals, and the value per user is high. In this photo, the founder, Reid Hoffman, a member of the PayPal Mafia, is a major figure in Silicon Valley and is now a board member and investor. In the middle is the CEO, Jeff Weiner, rated by Glassdoor in 2013 as a top CEO. As a professional manager he has successfully helped LinkedIn grow. He likes to talk about transformation, and hopes every employee can challenge themselves and evolve in their own role.
LinkedIn offers good benefits for its employees: what is said to be the best free cafeteria in the Bay Area, a monthly InDay and Hack Day, and an [in]cubator program that helps employees pursue their own ideas. Its hallmark is data-driven product development, for example People You May Know and Jobs You May Be Interested In. The Sponsored Ads I worked on all need a strong data background and the support of data scientists. Its business model is also distinctive, with three lines: recruiting services for companies, marketing and advertising services, and personalized subscription services, plus the newest addition, Sales Solutions. With so many possibilities, it has become a darling of Wall Street.
The latest entrepreneurial trends in Silicon Valley
When talking about Silicon Valley, besides the big companies that have already succeeded, I have to mention the latest startup trends, because these represent the next FLG. I have summarized some areas and representative companies: cloud computing (Box, Dropbox), big data (Cloudera), consumer internet (Pinterest), health (Fitbit), communication (Snapchat), payments (Square), and lifestyle (Uber).
This is the latest Wall Street update on funding rounds. For example, Uber has reached an $18 billion valuation. I had an offer from them and did not take it, and I still find that number crazy. If you look at this table, you can see that the companies in Silicon Valley (blue), especially in San Francisco, are much larger than those in other regions, so geography still matters. In China, the two on the list, Xiaomi and JD (Jingdong), are both in Beijing. Recently we have also seen some talk of a bubble, such as whether Alibaba's IPO marks the top for U.S. stocks, and the founder of Jingwei VC has also reminded us of the bubble risk. I cannot judge. It would be fun to take part in the next wave. I recommend the books "On Top of Tides" and "The Singularity Is Near". I still look forward to the technological revolution of the next 20 years.
Big data technologies
I personally love big data, and in Silicon Valley it is a topic everyone enjoys talking about. There is a joke that big data is like teenage sex: everyone talks about it, nobody really knows how to do it. In fact it is better to be driven by interest and not be so utilitarian. Big data involves too many technologies. Day-to-day work is slow accumulation, with countless pitfalls and technical details to overcome. The hottest technology is not necessarily the one you should use. If it fails you, the pressure is enormous. For example, suppose you use an open source database and find that it occasionally loses data. If this is an online service and you keep receiving alerts, then the scalability and fault tolerance you chose it for are meaningless.
On to big data itself, where Hadoop is the industry standard. Among the companies I have interviewed with, apart from Google and Microsoft, which do not need it, almost all of them use it, so I recommend that you take advantage of this opportunity. There are three big players. Cloudera is the veteran Hadoop consulting company, and Hadoop's creator works there as chief architect. Hortonworks also has many Hadoop committers. MapR is known for proposing an efficient erasure-coding approach to HDFS. They are all heavily funded, and their business models are similar: they start with a free community edition and sell a commercial edition that provides better management. And this year there is a dark horse, Spark. Simply put, it computes in memory, which saves I/O compared with the Hadoop framework, it makes use of caching, and it fits batch, iterative, and streaming computation.
Here is a look at its ecosystem. Learning Hadoop is a step-by-step process. First understand the core: HDFS, MapReduce, and Common. Around the core there are countless tools that make development easier. I have personally used Avro as a data format, ZooKeeper as a high-reliability coordination component, Solr as a search interface, Pig to build workflows, Hive for data warehouse queries, Oozie to manage workflows, HBase as a distributed key-value store, Mahout as a data mining library, and Cassandra as a NoSQL database. I suggest beginners consider the ChinaHadoop courses.
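To make the MapReduce model mentioned above concrete, here is a minimal word-count sketch in plain Python. It only imitates the map, shuffle, and reduce phases in a single process. The function names are my own illustration and are not Hadoop's actual (Java) API.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the mappers and reducers run on different machines and the shuffle moves data over the network, which is where most of the tuning effort goes.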
Hadoop itself has also evolved: a few years ago it was 0.19, then 0.20, then 0.23 brought in the YARN architecture, which eventually became Hadoop 2.0. The interfaces and components of Hadoop 1.0 and 2.0 are quite different, but overall Hadoop 2.0 is the trend. Because it has YARN as a separate resource management platform, applications can be developed as plug-ins, which frees up productivity, and newer engines such as Spark and Storm support Hadoop 2.0.
Here is the Hortonworks community distribution. Companies like this can be called the standard setters. Other companies simply use the stable versions they release and have little say. But there are a large number of companies building big data applications, so you do not necessarily have to join one of the standard-setting companies. Application work is also a real test of how flexible your architecture is and how good your eye is for actual products.
When it comes to what is hot in 2014, it is still Spark. The second Spark Summit has been held, with a thousand or more attendees, and countless people are excited by the advertised performance improvement of up to 100x over Hadoop. Its background is Berkeley's AMPLab, which has the well-known BDAS (Berkeley Data Analytics Stack), and Spark has become a top-level Apache project. Last year the lab's professors and students went out and founded Databricks, raising tens of millions of dollars in venture capital. People ask whether Spark is the terminator of Hadoop. At the 2014 Spark Summit I saw that all of the Hadoop companies are very supportive. Cloudera, for instance, has embraced Spark even though it has its own Impala.
If this continues, a single spark can start a prairie fire. Spark is written in Scala, a functional language. It has many components: Shark supports SQL in the way Hive does, and there are also Spark Streaming, MLlib, GraphX, SparkR, and BlinkDB. Its core data structure is the RDD, and it can run on a variety of distributed systems. Overall it is an inclusive and aggressive system. I am personally optimistic about its development.
I worked on big-data ad systems at LinkedIn, and I will just mention a few of the things I learned:
LinkedIn has its own set of open source data systems, including Voldemort (distributed key-value storage), Kafka (distributed real-time message queue), Espresso (large-scale storage built on MySQL), and Databus (change data capture). You can see them at http://data.linkedin.com
Lambda architecture: use Hadoop offline for the batch pipeline, and a near-line layer for efficient aggregation. This hybrid architecture is a compromise between real-time results and consistency.
Kafka is foundational at LinkedIn. On the one hand, all real-time tracking goes through it. On the other hand, it is a data bridge: for example, the graph systems can be connected seamlessly through Kafka. Without it, it is hard to imagine how the heterogeneous data sources would talk to each other, because direct communication between n systems has O(n^2) complexity.
Distributed is not cool for its own sake. If you need high reliability and strong consistency, and your data volume is not unimaginably large, you do not necessarily have to use a distributed system. Try to use mature, reliable components such as MySQL and Memcached.
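To illustrate the Lambda architecture point above, here is a minimal sketch in plain Python, assuming click events that carry a timestamp. The event fields (`ad_id`, `ts`), the `cutoff` parameter, and all function names are hypothetical. In a real system the batch view would come from a Hadoop job and the real-time view from a streaming layer.

```python
from collections import Counter


def batch_view(events, cutoff):
    """Batch layer: recompute counts from every event up to the last batch run."""
    return Counter(e["ad_id"] for e in events if e["ts"] <= cutoff)


def realtime_view(events, cutoff):
    """Speed layer: counts only for events newer than the last batch run."""
    return Counter(e["ad_id"] for e in events if e["ts"] > cutoff)


def query(ad_id, batch, realtime):
    """Serving layer: merge the complete-but-stale view with the fresh-but-partial one."""
    # Counter returns 0 for keys it has not seen, so unknown ads need no special case.
    return batch[ad_id] + realtime[ad_id]
```

The design choice is that the batch view can always be rebuilt from the raw events, so a mistake in the real-time layer only affects results until the next batch run.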
Job Search Experience
Based on my job search experience, I give a few suggestions:
When I interview someone, I first check whether their experience matches the role. For new graduates I look at whether they have internship experience. An internship at Google or LinkedIn is a definite plus. The school of course also matters. For example, when we recruit, candidates from UC Berkeley have a better chance.
I have seen a lot of resumes, and I do not recommend the Doc format, because it renders differently on different systems. A resume should not be too long: unless you are a big name, keep it to 2 pages. Try to highlight how your skills match the position. If the company uses C++ and you have no C++, it is not a good fit. I also do not like to see the word "proficient". If you write that, it easily gets you into trouble. It is better to state how many years of experience you have and how well you know each specific technology.
Of course, the interview needs preparation, but last-minute cramming is not enough. I suggest you pick 1 or 2 of your experiences and be ready to cover how you worked with the team, the technical details, the difficulties, and how you overcame them. Do not try to cover everything. Just prepare what you are using now. Nobody cares what you did 5 years ago.
Networking is very important. The simplest way is to go to job fairs, where you can become a familiar face. Online, besides recruiting sites such as Dice and Indeed, you should make good use of LinkedIn. With a premium account you can see alumni resources and send InMail. A referral is much more efficient than applying blindly online.
How do you find interview questions?
There are a lot of resources online. Glassdoor is an anonymous site that often has interview questions, and technical forums such as Stack Overflow and CareerCup also have many reference questions.
How do you know whether a company is reliable?
You can look at which companies the strong people you know have chosen. If the company is not well known, check where it sits in the traffic rankings. If it is not ranked, look at the scale of its funding, and also check on LinkedIn whether its employees are excellent.
When do you know you're ready for an interview?
Algorithms: can you write recursion and dynamic programming?
Coding: are you up to standard, and can you write bug-free code without an IDE?
Design: are you up to standard, and can you explain the tradeoffs?
Project experience: can you explain the architecture, the difficulties, and your own contribution?
Bonus items: GitHub, a blog, participation in open source.
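As a quick self-check for the recursion and dynamic programming item in the list above, here is a small sketch of my own choosing (not one of the speaker's questions): counting the ways to climb n stairs taking 1 or 2 steps at a time, first by plain recursion and then bottom-up.

```python
def ways_recursive(n):
    """Plain recursion: correct, but exponential time because subproblems repeat."""
    if n <= 1:
        return 1
    return ways_recursive(n - 1) + ways_recursive(n - 2)


def ways_dp(n):
    """Bottom-up dynamic programming: same recurrence, O(n) time and O(1) space."""
    prev, curr = 1, 1  # ways(0) and ways(1)
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr
```

Being able to start from the recursive version and then explain why the bottom-up version avoids the repeated work is the kind of answer an interviewer is looking for.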
How to answer a behavioral interview question?
For example: have you ever failed at something, what would you do if your boss gave you a task you do not like, what kind of person do you see yourself as? Here, on the one hand, you can draw on your own experience. On the other hand, pay attention to the company's About page, including the founders' background, the company culture, and the hiring requirements. You can do all of this homework in advance. As far as possible, show your passion, sense of responsibility, diligence, and other good qualities.
How do I get a U.S. work visa?
To work in the United States you generally need H-1B status, which requires an employer to sponsor you and file with the Department of Labor. In the current situation the annual quota is used up immediately, so a lottery decides who gets a slot. Applications are filed before April 1 and the lottery is drawn after April 1. If you have a master's degree or above from a U.S. school, you get priority and your odds are higher. In 2014 the average probability of being selected was about 50%.
If you are not selected and you have a U.S. master's degree, you can work on OPT, and you also save the Social Security tax. If you are overseas, you can only wait for next year's lottery. Global companies such as Google and Facebook also offer positions in their offices in other countries, and after a year of work you can move to the United States on an L-1 or H-1B. In addition, once you hold a quota slot, changing jobs is a transfer and does not need a new slot. The H-1B is renewed every 3 years, up to 6 years. If you apply for a green card while on H-1B, it can be extended.
Interview process
If you get an interview, the process starts with a phone interview. For engineers, basic algorithm and coding skills are a must. Have your own small whiteboard ready, and give your ideas and code according to the interviewer's question. It sounds easy, but even for a little over 10 lines of code, more than 80% of people fail.
Then comes the onsite. To show that they put talent first, American companies will bring you onsite. If you are remote, they reimburse flights, local transport, hotel, and meals, so it sounds like a free trip. But the onsite is not easy either: it is basically 4 to 6 rounds, 45 minutes to 1 hour each, and it pushes you to your limit, often leaving you with a splitting headache. My toughest stretch was 7 onsites in 10 days, flying and interviewing nonstop. It was real torture.
What they are assessing, put simply, is whether you are smart, whether you have had engineering training, and whether you can work with others. The questions fall into 3 categories. The first is technical: algorithms and system experience. The second is communication: your personal experience and interests. The third is the behavioral questions that HR likes to ask, such as whether you have ever failed at something, what kind of person you think you are, and what you would do if your boss gave you a task you do not like.
Interview Preparation
The technical interview covers a wide range. Even seemingly simple coding may trip you up. If you do not believe it, try writing a string search. I do not need you to know KMP, a brute-force solution is fine, yet 90% of people fail on this problem. For algorithms, the common ones are hash table, heap, and trie. System design also frightens many students. Many people say they have never designed systems like that.
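For reference, here is one way the brute-force string search mentioned above could be written in Python. This is a minimal sketch, and the function name and the empty-pattern convention are my own choices. It tries every start position and compares character by character, which is O(n*m) in the worst case.

```python
def find_substring(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1 if absent."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0  # convention: the empty pattern matches at index 0, like str.find
    for start in range(n - m + 1):  # empty range when the pattern is longer than the text
        j = 0
        while j < m and text[start + j] == pattern[j]:
            j += 1
        if j == m:  # every character of the pattern matched
            return start
    return -1
```

The places people usually slip are the loop bound `n - m + 1`, the empty pattern, and a pattern longer than the text, so those are worth testing out loud in an interview.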
If everyone passes the algorithm questions, system design is what further separates candidates and shows your level. There are also some fairly random questions: math, combinatorics and probability, and common Linux commands may all come up.
Here I list some basic Hadoop questions. They are relatively simple and you can Google them. There are also detailed write-ups on my Zhihu page.
Algorithms are the most important thing to study. I have summarized a number of high-frequency questions, which you can also find on my Zhihu page.
These two questions are ones I was actually asked. They are not conventional, but they are worth thinking about. The area question was asked by Apple. Can you work it out in 15 minutes?
Choosing a job
Suppose you have survived the interviews and received an offer. The next question is how to choose. Before considering an offer, do some research on the company. How big is it? What is its product? How do employees rate it on Glassdoor? Do you like the position? It is like choosing a school: if you choose wrong, you will have to take a lot of detours.
My personal criteria are: first, whether the company is on the rise, then whether I love the product, whether the team is strong, and whether I can learn something. Companies also fall into different categories. Hortonworks is purely technical and enterprise-focused, so not many people may know it, while Uber is a mass consumer product that many of my friends have used. The hot area now is the mobile internet, so you can give that more consideration.
Everyone cares about compensation at Silicon Valley companies, so I will give an introduction here too. There is the base salary, which according to Glassdoor is roughly $100K to $200K a year, and in Silicon Valley a rising tide is lifting all boats. Some big companies also pay bonuses (Google and Facebook, 15% to 20%). A listed company will give restricted stock that vests over 3 to 4 years, while startups generally give options. The difference is that restricted stock is granted for free and you pay nothing out of pocket, whereas options have to be bought by you, at a price that differs depending on when they were granted. Taxes on selling stock are very high, though, and options leave some room for long-term tax planning.
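The difference between restricted stock and options described above comes down to simple arithmetic. Here is a rough sketch that ignores taxes and vesting schedules. All names are hypothetical, and `strike_price` stands for the price at which the options let you buy.

```python
def rsu_value(shares, share_price):
    """Restricted stock: granted for free, so its value is simply shares times price."""
    return shares * share_price


def option_exercise_cost(shares, strike_price):
    """Options: what you pay out of pocket to buy the shares."""
    return shares * strike_price


def option_value(shares, share_price, strike_price):
    """Options: gain after paying the strike price, or zero when the options are underwater."""
    return shares * max(share_price - strike_price, 0.0)
```

With made-up numbers, 1,000 restricted shares at a $50 share price are worth $50,000, while 1,000 options with a $10 strike are worth $40,000 at the same price and cost $10,000 to exercise. If the share price falls below the strike, the options are worth nothing but the restricted stock still has value.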
Finally, consider your interests and your risk tolerance. If you go to a big company to be a cog in the machine and enjoy a "communist" life, that is understandable. At a small company the pressure is higher and you grow faster. But be prepared for failure, and look at the lesson of Zynga.
Reflections on the workplace
Silicon Valley is home to a group of people who do not care what others think but have crazy ideas. Everyone here talks about innovation, technology, and entrepreneurship. With capital flowing in and companies competing for talent, everyone has very high expectations and cannot wait for change. This restlessness may be a driving force of social progress.
The topics I covered above are the hottest ones, and every one of these fields is worth billions or tens of billions of dollars. Alibaba's recent successful listing, the largest IPO ever, showed everyone the explosive growth of China's internet. China's pace of development and its vast market give everyone unlimited room to imagine. Baidu has set up an artificial intelligence institute in Silicon Valley, and Alibaba is preparing to recruit an R&D team of a thousand or more people in Silicon Valley, so the fight for talent keeps growing. Some Chinese internet products are also going abroad: WeChat, Xiaomi, and 360 are all investing and positioning broadly for the future. Sometimes I wonder: when everyone talks about technology changing the world, and a small app is valued at billions or even tens of billions of dollars, has the world really changed because of you? We also need to think more independently.