45 Hadoop Interview Questions and Answers


1. In which three modes can a Hadoop cluster run?

Stand-alone (local) mode

Pseudo-distributed mode

Fully distributed mode

2. What should you pay attention to in stand-alone (local) mode?

There are no daemons in stand-alone mode; everything runs in a single JVM. There is also no DFS here; the local file system is used instead. Stand-alone mode is suitable for running MapReduce programs during development, and it is also the least used mode.

3. What should you pay attention to in pseudo-distributed mode?

Pseudo-distributed mode is suitable for development and test environments; all daemons run on the same machine.

4. Can a virtual machine (VM) be called pseudo-distributed?

No, they are two different things: pseudo-distributed mode refers only to how Hadoop itself is deployed.

5. What should you know about fully distributed mode?

Fully distributed mode is usually used for production environments, where N hosts form a Hadoop cluster and Hadoop daemons run on each host: one host runs the Namenode, others run Datanodes, and others run Task Trackers. In a fully distributed environment, master and slave nodes are separated.

6. Does Hadoop follow the UNIX pattern?

Yes, as in UNIX, Hadoop also has a "conf" directory.

7. In which directory is Hadoop installed?

Cloudera and Apache use the same directory structure; Hadoop is installed in /usr/lib/hadoop-0.20/.

8. What are the port numbers for the Namenode, Job Tracker, and Task Tracker?

Namenode, 50070; Job Tracker, 50030; Task Tracker, 50060 (the default web UI ports in Hadoop 0.20/1.x).

9. What is Hadoop's core configuration?

Hadoop's core configuration used to be done through two XML files: 1. hadoop-default.xml and 2. hadoop-site.xml. Both use XML format, with a name and a value for each property. However, these files no longer exist in current releases.

10. How is Hadoop configured now?

Hadoop now has three configuration files: 1. core-site.xml; 2. hdfs-site.xml; 3. mapred-site.xml. These files are stored in the conf/ subdirectory.
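As a rough illustration (the values are assumptions for a pseudo-distributed Hadoop 0.20/1.x setup, not required defaults), core-site.xml and mapred-site.xml might contain:

```xml
<!-- conf/core-site.xml: where HDFS lives (illustrative value) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>  <!-- URI of the Namenode -->
  </property>
</configuration>

<!-- conf/mapred-site.xml: where the Job Tracker lives (illustrative value) -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>  <!-- host:port of the Job Tracker -->
  </property>
</configuration>
```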

11. What is the RAM spill factor?

The spill factor is the threshold at which in-memory map output is written ("spilled") to temporary files on disk, under Hadoop's temp directory.
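In Hadoop 0.20/1.x terms, this behavior is governed by the io.sort.mb and io.sort.spill.percent properties; a hedged sketch for mapred-site.xml (the values shown are the usual defaults, but verify against your release):

```xml
<!-- conf/mapred-site.xml: map-side spill tuning (Hadoop 1.x property names) -->
<property>
  <name>io.sort.mb</name>             <!-- in-memory buffer for map output, in MB -->
  <value>100</value>
</property>
<property>
  <name>io.sort.spill.percent</name>  <!-- start spilling when the buffer is 80% full -->
  <value>0.80</value>
</property>
```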

12. Is fs.mapr.working.dir a single directory?

Yes, fs.mapr.working.dir is just a single directory.

13. What are the three main properties of hdfs-site.xml?

dfs.name.dir determines the path where the Namenode stores its metadata, and whether DFS stores it on disk or at a remote location

dfs.data.dir determines the path where data is stored

fs.checkpoint.dir is used by the Secondary Namenode
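A minimal hdfs-site.xml using these three properties might look like this (the paths are illustrative):

```xml
<!-- conf/hdfs-site.xml: the three properties above (example paths) -->
<configuration>
  <property>
    <name>dfs.name.dir</name>        <!-- where the Namenode keeps its metadata -->
    <value>/var/lib/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>        <!-- where Datanodes store blocks -->
    <value>/var/lib/hadoop/dfs/data</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>   <!-- where the Secondary Namenode keeps checkpoints -->
    <value>/var/lib/hadoop/dfs/namesecondary</value>
  </property>
</configuration>
```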

14. How do you exit input mode (in the vi editor)?

To exit input mode: 1. press Esc; 2. type :q (if you have not made changes) or :wq (if you have made changes), then press Enter.

15. What does it mean when running hadoop fsck / produces a "connection refused" Java exception?

It means that the Namenode is not running on your VM.

16. We use Ubuntu and Cloudera. Where do we download Hadoop from, or does it come installed with Ubuntu by default?

Hadoop does not come with Ubuntu by default; you have to download it from Cloudera or from Edureka's Dropbox and then run it on your system. You can also set up your own configuration, but you need a Linux box, whether Ubuntu or Red Hat. Installation steps are available on the Cloudera website or in Edureka's Dropbox.

17. "jps" command's usefulness?

This command checks whether the Namenode, Datanode, Task Tracker, and Job Tracker are running properly.
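For example (a hypothetical session; the PIDs and the set of daemons listed will vary with what is actually running):

```sh
$ jps
4823 NameNode
4951 DataNode
5084 SecondaryNameNode
5171 JobTracker
5298 TaskTracker
5410 Jps
```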

18. How do you restart the Namenode?

Run stop-all.sh and then start-all.sh; or

type sudo hdfs (press Enter), su-hdfs (press Enter), /etc/init.d/ha (press Enter), and /etc/init.d/hadoop-0.20-namenode start (press Enter).
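A sketch of the whole-cluster restart plus a service-level restart (assuming CDH-style init scripts are installed and the Hadoop bin/ scripts are on your PATH):

```sh
# Option 1: bounce every daemon in the cluster
stop-all.sh
start-all.sh

# Option 2: restart only the Namenode service
sudo /etc/init.d/hadoop-0.20-namenode stop
sudo /etc/init.d/hadoop-0.20-namenode start
```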

19. What is fsck short for?

The full name is File System Check.

20. How do you check whether the Namenode is working properly?

To check whether the Namenode is working, use the command /etc/init.d/hadoop-0.20-namenode status, or simply jps.

21. What is the role of the mapred.job.tracker property?

It lets you know which node is the Job Tracker: the property specifies the host and port the Job Tracker runs on.

22. What is the role of /etc/init.d?

/etc/init.d describes the location and status of daemons (services); this is really a Linux feature and has little to do with Hadoop itself.

23. How do you find the Namenode in the browser?

If you really need to reach the Namenode in your browser, you no longer use localhost:8021; the Namenode's web UI port is 50070 (e.g., http://localhost:50070/ on a pseudo-distributed setup).

24. How do you switch from su back to the Cloudera user?

To go from su back to Cloudera, just type exit.

25. Which files are used by the start and stop commands?

The slaves and masters files.

26. What does the slaves file consist of?

The slaves file consists of a list of hosts, one per line, naming the machines that run data nodes.

27. What does the masters file consist of?

The masters file is also a list of hosts, one per line, naming the machine(s) that run the Secondary Namenode.
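For a small hypothetical cluster, the two files might look like this (the hostnames are made up for illustration):

```sh
# conf/slaves -- one data-node host per line
datanode1.example.com
datanode2.example.com
datanode3.example.com

# conf/masters -- the Secondary Namenode host
secondary.example.com
```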

28. What is hadoop-env.sh used for?

hadoop-env.sh provides JAVA_HOME for Hadoop's runtime environment.
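For example (the JDK path below is an assumption; point it at whatever JDK your system actually has):

```sh
# conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-sun  # example path; use your own JDK location
```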
