Why are machine learning frameworks biased towards Python?

Source: Internet
Author: User
Tags: svm theano numba
What features of Python make scientific computing developers so fond of it?

Reply content:

Summary: easy to write, well supported, easy to debug and tune, and not slow.

1. Python is an interpreted language, which makes programs easier to write. For example, to write a matrix multiplication in a compiled language such as C, you need to allocate memory for the operands (the matrices), allocate memory for the result, manually call the BLAS GEMM interface, and finally, if you are not using smart pointers, free the memory by hand. In Python it takes little more than two statements: import numpy and numpy.dot.
Update (2015-5-7): Of course, many C++ libraries now support managed memory, which makes development much easier, but an interpreted language still has a natural advantage: there is no compile time. For machine learning, a research direction that requires a lot of prototyping and iteration, this is very useful and efficient.
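
To make the contrast in point 1 concrete, here is a minimal sketch of the NumPy version. The matrix values and the variable names (a, b, product) are made up for illustration; the point is only that the result array is allocated for you and nothing has to be freed by hand.

```python
import numpy as np

# Two small example matrices (values chosen only for illustration).
a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# numpy.dot allocates the result and computes the matrix product;
# there is no manual allocation or cleanup.
product = np.dot(a, b)
print(product)
```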

2. Python's ecosystem is mature, and there are many useful libraries. In addition to NumPy mentioned above, there are SciPy, NLTK, os (built in), and so on. Python's flexible syntax also makes many useful features, including text manipulation, list/dict comprehensions, and lambda, efficient both to write and to run. This is one of the main reasons behind Python's healthy ecosystem. In contrast, Lua is also an interpreted language, and it even has a remarkable tool in LuaJIT, but it has struggled to reach Python's position, partly because Python got there first and took the market share, and partly because of its own counterintuitive design choices (such as variables being global by default). Still, riding on Lua-Python bridges and the momentum of Torch, Lua seems to be on the rise as well.
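
As a small illustration of the syntax mentioned in point 2, the sketch below uses text manipulation, a dict comprehension, a list comprehension, and a lambda on a made-up sentence. The sentence and the variable names are assumptions chosen for the example only.

```python
text = "the quick brown fox jumps over the lazy dog the end"

# Text manipulation: split a sentence into words.
words = text.split()

# Dict comprehension: count how often each distinct word occurs.
counts = {word: words.count(word) for word in set(words)}

# List comprehension: keep only the words longer than three letters.
long_words = [word for word in words if len(word) > 3]

# lambda as a sort key: order by count (descending), then alphabetically.
ranked = sorted(counts, key=lambda word: (-counts[word], word))

print(counts["the"], long_words, ranked[:3])
```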

3. Being able to write programs quickly is very important for people who do machine learning. A model often needs all kinds of changes; in a compiled language such a change is likely to ripple through the whole program, whereas in Python it can usually be implemented in very little time.

4. Python's efficiency is not bad. Interpreted languages have developed far beyond what many people imagine. Many pieces of syntactic sugar, such as list comprehensions, are implemented close to the interpreter core. In addition to JIT [1], there is Cython, which can significantly improve run-time efficiency. Finally, thanks to Python's C interface, many highly efficient, Python-friendly libraries such as gnumpy and Theano can speed up a program, and with strong teams behind them, these libraries may be more efficient than what an unskilled programmer achieves after a month of tuning in C.

[1] Native Python has no built-in JIT; readers who want a JIT can look at PyPy and Numba. (Thanks to Liu Shenxiu in the comments for pointing out that there is no need to whitewash Python's performance. It really is poor: in web frameworks, Python is probably ten to several dozen times slower than C++ or Java.)
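
The claim in point 4, that C-backed libraries do the heavy lifting, can be checked with a rough sketch like the one below. It compares a pure-Python loop with the same computation handed to NumPy. The function names and the array size are assumptions for illustration, and timings vary by machine, so no figures are claimed.

```python
import timeit

import numpy as np

values = list(range(100_000))
array = np.arange(100_000, dtype=np.int64)


def sum_of_squares_loop(xs):
    """Pure-Python loop, executed by the interpreter one element at a time."""
    total = 0
    for x in xs:
        total += x * x
    return total


def sum_of_squares_numpy(arr):
    """Same computation delegated to NumPy's compiled code."""
    return int(np.dot(arr, arr))


# Both versions compute the same number.
assert sum_of_squares_loop(values) == sum_of_squares_numpy(array)

# Timings depend on the machine; run it to see the gap on yours.
loop_time = timeit.timeit(lambda: sum_of_squares_loop(values), number=10)
numpy_time = timeit.timeit(lambda: sum_of_squares_numpy(array), number=10)
print(f"loop: {loop_time:.4f}s  numpy: {numpy_time:.4f}s")
```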

However, Python can still be popular because it is very friendly for solving problems.

This includes both the ease of use of Python itself and its powerful collection of libraries.

For example:

In security work, when you need to send a probe packet, requests fully encapsulates the protocol stack operations. The user only has to care about what data to send and to whom, which is actually very efficient.
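
A minimal sketch of that idea, assuming the requests library is installed. The URL, query parameter, and User-Agent value are hypothetical; the request is only prepared, not sent, so the example runs without network access.

```python
import requests

# Hypothetical probe target and payload, used only for illustration.
probe = requests.Request(
    method="GET",
    url="https://example.com/probe",
    params={"id": "42"},
    headers={"User-Agent": "probe-sketch"},
)

# prepare() builds the final URL and headers without opening a connection.
prepared = probe.prepare()
print(prepared.method, prepared.url)

# Actually sending it would be one more step, for example:
#   with requests.Session() as session:
#       response = session.send(prepared, timeout=5)
```

The caller states only the destination and the data; connection handling and HTTP encoding are left to the library.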

Text processing likewise saves a lot of work that does not need to be repeated.

This is very tempting. Having more skills never hurts, but when you are doing research and exploration, time and energy are truly valuable. Python saves time so that we can think about the real problem we need to solve.

In the final analysis, a programming language is just a tool. So-called proficiency in a programming language is proficiency in using a tool; knowing how to tackle a given kind of problem is another matter.

Machine learning research is another matter. For the coding side, the choice of language hardly matters: use whatever saves unnecessary effort, because there are more important things to do. Agonizing over how the programming language works underneath is not the concern here; it is unnecessary and arguably a distraction. MATLAB is the best language for machine learning research, almost without rival. However, it is not cheap. As a result, scholars turned their eyes to Python.

Imagine a scholar who does machine learning but has no computer science background. His immediate need is to turn his idea from a formula into a computer language and run it. The extra effort required in between should be as small as possible.

First, implementing an algorithm means following the rules of the programming language and avoiding its pitfalls. Python itself is designed to help users avoid many traps: there is no need to declare variables, free memory, or do other things that non-computer professionals consider "trivial".

Second, converting mathematical notation into a computer language also takes effort. If you can give the researcher the illusion that "writing a program is just writing the formula again in another language", that is perfect. Python provides this illusion to a higher degree than other languages, relying mainly on the characteristics of the language itself and on some open-source algorithm libraries. First, vectors are provided, and the scholar hardly needs to blink to understand generics. Then, something like a universal quantifier can be expressed with a for-in; it is a bit odd, but acceptable. Finally, there is a large wave of libraries, SciPy, NumPy, and so on, which make writing a program as satisfying as writing a formula.
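
As a rough illustration of "writing the formula again in another language", the sketch below transcribes two pieces of notation: the mean squared error, MSE = (1/n) * sum_i (y_i - yhat_i)^2, and the quantified statements "for all x in S, x >= 0" and "there exists x in S with x > 3". The data values and variable names are hypothetical.

```python
# Hypothetical data, chosen only to illustrate the notation.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.5, 2.5, 4.0]
n = len(y_true)

# MSE = (1/n) * sum over i of (y_i - yhat_i)^2
mse = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n

# "for all x in S: x >= 0" becomes all(...) over a for-in.
all_non_negative = all(x >= 0 for x in y_true)

# "there exists x in S: x > 3" becomes any(...).
exists_above_three = any(x > 3 for x in y_true)

print(mse, all_non_negative, exists_above_three)
```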

By now the scholar can hardly sit still. This looks good, he thinks, but first let's check the price. Guess what: free! The scholar slapped his thigh at once: this is the one. Then there was a sudden flash of light, and the Python goddess appeared before him: "Thank you for choosing Python. We also have a super gift package for you."

"First of all, to publish a paper you need figures, right? Here, take matplotlib. The figures it draws are especially cute."

"You also need to take notes when doing research, right? Here, take IPython Notebook; your notes will be adorable too. Oh, I forgot to tell you: with a little conversion they can be used as a slideshow. That way, when you go to a conference, your presentation will be adorable as well."

"And we also provide web spiders, lambda, and functional programming. Whatever you need, we provide it, and it is free!"

"I hope you enjoy it!"

At this point, behind the thick lenses of his glasses, the scholar's eyes were full of tears.

----
The above is an answer based on my own understanding. There may be many places that are not rigorous; please bear with me. I originally wanted to answer seriously, but the answer got less and less serious as it went on.

"Python is great" is a consensus among many machine learning programmers, and the answers above have already covered the language features. Here I will add a comparison of Python and MATLAB support for deep learning.

Theano: a Python-based open-source library whose main advantage (compared with the various MATLAB toolboxes) is that it can leverage GPU acceleration. By comparison, MATLAB's GPU support is not sufficient, which limits its prospects in deep learning.

Caffe: currently the hottest deep learning framework. It is written in C++ but provides Python and MATLAB interfaces. Anyone who has used them knows that the MATLAB interface is mediocre, while the Python interface is very well done and provides a wide range of operations. This also leads most people to use the Python interface, especially after the hypercolumn concept was introduced.

In fact, the reason is that most people doing machine learning cannot really write code, and only a few of them use Python well. It comes down to this: Python is easy to learn, highly productive, and well suited to research and prototyping.

Isn't it the same with PhD students, who are most willing to use MATLAB?

As for language features, there are just two points.
1. Python is an interpreted language; it is relatively easy to get started with, and its syntax is more elegant.
2. Development in Python is faster. In machine learning, the key is that once you have an idea you can implement it in code immediately to verify the feasibility and superiority of the algorithm. Development in efficiency-oriented languages such as C++ is slow, and the code is hard to maintain.
----
I personally think the most important reason for choosing Python is not its language features, but that the Python community is more diverse, and that scientific computing as a whole is now moving toward Python.
Python has many mature machine learning packages such as scikit-learn, as well as NumPy, SciPy, pandas, Matplotlib, Numba, PyPy, and others that support or speed up scientific computing, making it easy for developers to build machine learning algorithms quickly. And when you run into problems, you can ask the experts in the Python community for help.
----
One last thing: Python is great.
----
Really the last thing: life is short, I use Python. To give just one example, calling an SVM in scikit-learn to make predictions:
from sklearn import svm
# train_x, train_y, and new_x are assumed to be arrays prepared by the caller
# train
clf = svm.SVC()
clf.fit(train_x, train_y)
# make predictions for new data
clf.predict(new_x)

Is there a more foolproof machine learning package than this?

Python is suitable for simple data preprocessing, or for simple jobs that run only once or twice, but for core numerical computation it is far less efficient than C++ (Cython can speed things up slightly, but there are many pitfalls along the way; writing a fast Cython program is harder than simply writing C++).

For ML, the main program is run at least dozens of times, or even hundreds of times. The little convenience saved while writing the program turns into tears shed while running it, tuning parameters, and waiting for results...

With the new features of C++11/14, writing C++ is no more trouble than writing Python, and multithreading is much more convenient than in Python.

I used to do machine learning homework for class in MATLAB and Python; for the first assignment I wrote an SVM, and it felt great. I once wrote a Jacobian in C, and it nearly killed me.

As for plotting, how simple MATLAB and matplotlib are! In C++? First set up Qt, then use QGLViewer to call OpenGL to draw for you.

I will not even mention how much trouble BLAS and LAPACK are, let alone CSparse, heh...