Data analysis and machine learning environment configuration (Docker minimalist Getting Started guide)

Source: Internet
Author: User
Tags port number virtual environment jupyter jupyter notebook docker run docker toolbox xgboost

Do data science generally need to use similar xgboost, tensorflow, such as libraries, these libraries in win is not so good installation, but many people need them, how to do it, the simplest is to use Docker, not only a Linux virtual environment, You can also use Windows at the same time.

It is actually a fairly easy to use software, this article does not teach too many commands, because I will not, will only speak a few basic commands. This article will talk about how to install using Docker one under Win10 : what is Docker.

What Docker is, the official says it's called a container, but it's really hard to understand, getting started with it as a lightweight virtual machine is good second: why use Docker.

Some users who use Windows systems often encounter installation or compilation problems when installing Python libraries, tensorflow, xgboost, etc.

No need to study how to install Linux, get the Linux environment directly under win, use the powerful Linux shell

Solving Python's environmental problems

Easy to keep various packages and libraries up to date, from manual updates to Docker mirror market updates

Easy to reproduce, as long as you specify the same image version, then each machine operating environment is the same, there will be no program sent to others, but others can not run the problem. Three: Docker's easy Getting Started tutorial 1. Download and install Docker

First of all, of course, to download the official website, into the Docker, click on the icon on the graph, we can see if you have to install the Professional version or the Ultimate version of Win10, the family version of the WIN10 can only be tragic to win7 like the installation of Docker Toolbox, here does not start speaking. Default you are WIN10 Pro version, if you are not, then you become a ... 2. Download and install Docker

This is generally the point to keep the next step, over ...

If prompted, you may need to turn on Hyper-V or turn on virtualization in the BIOS, then follow the prompts to go 3. Start Docker

Double click on this icon to run, the lower right corner as shown in Figure 4. Pull Image

Press Win+r and enter CMD, enter the following command after you open cmd to pull Kaggle official production of a mirror, which encapsulates the xgboost, Anaconda, TensorFlow and other commonly used libraries and software, and Kaggle will continue to update, Province's own to update. The Docker market also has a variety of images, such as MySQL, Ubuntu and so on, as you choose.

Docker Pull Kaggle/python

To download a few g, peace of mind and so on, if not then go to Daocloud register an account to get an acceleration.
5. Create a folder to Exchange files

Here we set up a Kaggle folder in the D disk to interact with the virtual machine file, continue to enter the following command in CMD into the D disk, and then create a new folder called Kaggle

cd/d d:
mkdir Kaggle

Then we need to interact with the folder of the Qin positioning "D:/kaggle", later in Linux can directly access the Kaggle folder under the win 6. Modify Docker Settings

Right-click on the Docker icon and select Settings. In advanced, you can allocate more resources to Docker, select D in the shared drives, click Apply, enter the WIN10 account password, and wait for the Docker reboot to complete. 7. Run the image

Then create a container from the image to run, continue typing

Docker run--name Kaggle-  v d:/kaggle:/tmp/working/kaggle  -w=/tmp/working-p 8888:8888-d  -it kaggle/ Python  jupyter notebook--no-browser--ip= "0.0.0.0"--notebook-dir=/tmp/working

After running the results as shown, if there is no error on behalf of success. A brief explanation of-name Kaggle on behalf of us to name it kaggle; Specify a swap directory to map the d:/kaggle of win to the/tmp/working/directory under Linux; The port number is set to 8888;-d to run in the background Jupyter Notebook-no-browser represents running notebook without a browser, because we use the browser under WIN10. 8. Enter the container to find tokens

Now that notebook has a security verification that needs token to use, we continue to enter

Docker exec-it Kaggle Bash

This goes into the Linux bash, you can enter some shell commands, such as APT,LS,PIP and so on,

This time we enter

Jupyter Notebook List

Copy a string of characters "512bc.....4ed0" after token= to get token input

Exit

Exit Bash 9. Run Notebook

You can use Jupyter notebook at this time, enter the address in the browser

localhost:8888

It was the notebook in Docker, the token we just copied, the new notebook, and then the test of whether the import library was successful.

Perfect ~ ~ ~ 10. Stop Container

If we don't, we can stop the container.

Docker Stop Kaggle
11. Re-enable the container

If we want to run the previous container just enter

Docker Start Kaggle

That is, as long as the first time to complete, then just 11-12 steps to enable the closure of the container, is not very simple. 12. Update docker (optional)

If Kaggle updates the mirror, it only needs to

Docker Pull Kaggle/python

will be able to use the latest package they provide, of course, it will need to re-execute 8-10 steps, and use Docker rmi xxx to remove the outdated image. PS

1. Test through WIN10 Professional Edition V1607+docker V1.13.1

2. Part of the small C-drive, you can change the location of the image under the Advanced tab of step 7th

3. This article uses markdown here rendering complete, a little ugly

4. Recommended to use KITMATEIC to manage containers, very intuitive and beautiful

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.