Recently I've wanted to get a quick grasp of Hadoop Streaming, mainly with Python. An earlier post on this blog already covered an exception I hit with Python + Hadoop, which was interesting. If I get the chance, I'll try C++ as well.
Here are some pages I've come across, recorded as a memo:
http://hadoop.apache.org/docs/r0.19.2/cn/streaming.html#Hadoop+Streaming (in Chinese, although the version is fairly old)
Latest version: http://hadoop.apache.org/docs/stable/streaming.html
http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
http://dongxicheng.org/mapreduce/hadoop-streaming-programming/
http://dongxicheng.org/mapreduce/hadoop-streaming-advanced-programming/
I haven't read these two yet: http://www.cnblogs.com/luchen927/archive/2012/01/16/2323448.html
http://blog.csdn.net/xiaotom5/article/details/8092035
http://apc999.blogspot.com/2010/03/hadoop-related.html (covers a lot of configuration and is quite good)
I haven't tried XML processing yet.
http://blog.sina.com.cn/s/blog_3e48b19f0100zu8r.html
http://dongyajun.iteye.com/blog/1315185
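
As a reminder of what the Python tutorials linked above are about, here is a minimal word-count sketch in the streaming style: the mapper reads lines and emits tab-separated key/value pairs, and the reducer sums the values for each key, assuming its input arrives sorted by key. The function names, the single-file layout, and the "map"/"reduce" command-line argument are my own choices for illustration, not taken from any of the linked pages.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(lines):
    """Sum the counts per word; input must already be sorted by word."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical entry point: run only when a role ("map" or "reduce")
# is given as the first argument; records come from stdin.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = mapper if sys.argv[1] == "map" else reducer
    for record in step(sys.stdin):
        print(record)
```

In a real job the two roles would be passed to the streaming jar through its -mapper and -reducer options, together with -input and -output. The jar's location depends on the installation, so check the official streaming page linked above for the exact command.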