Hadoop streaming running Python program, custom module import

Source: Internet
Author: User
Tags python script

Today in the code refactoring, all Python files were put into a folder, uploaded to Hadoop run, no problem, but as the complexity of the task increased, it feels so unreasonable, so did a refactoring, built several packages to store different functions of Python files, the course is as follows:

1. At first, in the IDE, click Run, right, very praise;

2. Then move on to the server, and there's this problem:

Importerror:no module named XXX

Ah, it seems that the package refers to the path is not correct, so find the article to solve:

In Python, each py file is called a module, and each directory with a __init__.py file is called a package. As long as the mold
Block or the directory where the package resides in Sys.path, you can use the Import module or the import package to use the
If you want to use the module (PY file) and the current module in the same directory, just import the corresponding filename is good, than
For example, use b.py in a.py:
Import b

But what if you want to import a file of a different directory (for example, b.py)?
You first need to use the Sys.path.append method to add the directory where b.py resides in the search directory. The import is then
Can, for example

Import Sys
Import OS
Curpath = Os.path.abspath (Os.path.dirname (__file__))
RootPath = Os.path.split (Curpath) [0]
Sys.path.append (RootPath)

First problem solved, happy!

3. Then try to run the program on the hadoop-streaming, the amount, has been the error:

Importerror:no module named XXX

Thought is also because of this path problem, tried a lot of methods:

Later in the stackoverflow discovery someone asked the same question, and I solved it with one of the solutions:

When Hadoop-streaming startsThe Python scripts, your PythonScript ' s pathis where the script file really is. However, Hadoop starts them at './', and your lib.py ( it ' s a symlink) is at './', too. So, try to add ' sys.path.append ( "./") ' before you import lib.py like This:import syssys.path.append ('./') Import Lib 
When Hadoop-streaming starts the Python script, the path to your Python script is the actual location of the script file. However, Hadoop begins with './', and lib.py (which is a symbolic link) is also in the './'. Therefore, before you import lib.py, try adding "sys.path.append ("./")" ".   Import sys sys.path.append ('./') Import Lib

and when importing modules and packages, it is not possible to use the From XXX import yyy, you must use Import XXX, use yyy time, want to use xxx.yyy to call; repeated attempts finally found this. It's not a waste of time.


Hadoop streaming running Python program, custom module import

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.