Ten fallacies about Python in enterprise applications

English Original: https://www.paypal-engineering.com/2014/12/10/10-myths-of-enterprise-python/

Translated Original: Http://www.oschina.net/translate/10-myths-of-enterprise-python?p=3#comments

Linguistic diversity is an important part of PayPal's programming culture. C++ and Java have long been popular, a growing number of teams are choosing JavaScript and Scala, and the Braintree acquisition brought in a sophisticated Ruby community.

One language in particular has both a long history at eBay and PayPal and growing popularity among developers: Python.

Python has enjoyed years of grassroots use and support from developers across eBay. Even before management officially supported it, engineers were already using Python. I joined PayPal a few years ago and chose Python for my internal projects, and I have personally found Python code at PayPal dating back nearly 15 years.

Today, Python powers more than 50 projects, including:

    • Features and products, such as eBay Now and RedLaser
    • Operations and infrastructure, from OpenStack to proprietary systems
    • Mid-tier services and applications, such as the one PayPal uses to set prices and check which features a customer is eligible for
    • Monitoring agents and interfaces, used for several deployment and security use cases
    • Batch jobs for data import, price adjustment, and more
    • And more developer tools than can be counted

In the coming series of posts I will detail the technologies and initiatives that grew the eBay and PayPal Python community from just under 25 engineers in 2011 to over 260 in 2014. In this introductory post, I will focus on the 10 fallacies I have had to debunk most often in eBay and PayPal's enterprise environments.

Fallacy #1: Python is a new language

With so many startups using it and children learning it these days, it is easy to see why this fallacy persists. In fact, Python is over 23 years old: it was first released in 1991, 5 years before HTTP 1.0 and 4 years before Java. A now-famous early use of Python dates to 1996: Google's first successful web crawler.

If you're curious about Python's long history, its creator, Guido van Rossum, has taken care to tell the whole story.

Fallacy #2: Python is not compiled

Although it does not require a separate compiler toolchain the way C++ does, Python is in fact compiled to bytecode, much like Java and many other compiled languages. Any further compilation steps are at the discretion of the runtime, be it CPython, PyPy, Jython/JVM, IronPython/CLR, or some other process virtual machine. See Fallacy #6 for more.
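The bytecode compilation step can be observed directly from the standard library. The following is a minimal sketch, assuming CPython; the source string and the function name add are made up for illustration, and the exact instruction names vary between CPython versions.

```python
import dis

# Hypothetical source text, used only for illustration.
source = "def add(a, b):\n    return a + b\n"

# compile() turns source text into a code object holding bytecode.
module_code = compile(source, "<example>", "exec")

# Executing the module-level code object defines the function.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The function's compiled bytecode is stored as raw bytes.
bytecode = add.__code__.co_code

# dis decodes those bytes into named virtual-machine instructions.
instructions = [ins.opname for ins in dis.get_instructions(add)]

print(type(bytecode).__name__, len(bytecode))
print(instructions)
```

Calling dis.dis(add) prints the same instructions in a human-readable listing.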

A general principle at PayPal and elsewhere is that security must not depend on whether code is compiled. Securing the runtime environment matters far more, because virtually every language has a decompiler, or can be intercepted to dump protected state. See the next fallacy for more on Python's security implications.

Fallacy #3: Python is unsafe

Python's affinity for the lightweight may make it seem less than formidable, but that intuition is misleading. A central tenet of security is to present as small a target as possible. Large systems work against security because they tend to over-centralize behavior and undermine developer comprehension. Python keeps these problems at bay by encouraging simplicity. What's more, CPython addresses them by being a simple, stable, and easily audited virtual machine. In fact, a recent analysis by Coverity gave CPython its highest quality rating.

Python also has an extensive array of open-source, industry-standard security libraries. At PayPal, where we take security and trust very seriously, we find that a combination of hashlib, PyCrypto, and OpenSSL, via PyOpenSSL and our own custom bindings, covers all of PayPal's diverse security and performance needs.
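As a small illustration of the standard-library end of that toolkit, here is a minimal sketch that uses hashlib together with the hmac module to sign and verify a message. The key, the message, and the function names sign and verify are assumptions made for this example; this is not PayPal code.

```python
import hashlib
import hmac


def sign(key, message):
    """Return a hex HMAC-SHA256 signature for message (both are bytes)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key, message, signature):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), signature)


# Hypothetical key and message, for illustration only.
key = b"example-key-not-a-real-secret"
message = b"amount=10.00&currency=USD"

signature = sign(key, message)
print(verify(key, message, signature))
print(verify(key, b"amount=99.00&currency=USD", signature))
```

The first check passes and the second fails, because changing the message changes the expected signature. hmac.compare_digest is used instead of == to avoid leaking timing information.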

For these reasons and more, Python has seen some of its fastest adoption at PayPal (and eBay) within the application security group. Here are a few security applications that use Python in PayPal's security-first environment:

    • Creating security agents that facilitate key rotation and consolidate cryptographic implementations
    • Integrating with industry-leading HSM technologies
    • Building TLS-secured wrapper proxies for less-compliant technology stacks
    • Generating keys and certificates for our in-house mutual-authentication schemes
    • Developing active vulnerability scanners

In addition, there are countless Python-built, operations-oriented systems with security implications, such as firewall and connection management. In the future we will certainly look at putting together a deep dive on the particulars of Python security at PayPal.

Fallacy #4: Python is a scripting language

Python can indeed be used for scripting, and it is one of the forerunners in that field thanks to its simple syntax, cross-platform support, and ubiquity on Linux, Macs, and other Unix machines.

In fact, Python may be one of the most flexible technologies among general-purpose programming languages. Here are some examples:

    1. Telecommunications infrastructure (Twilio)
    2. Payment systems (PayPal, Balanced Payments)
    3. Neuroscience and psychology (many, many examples)
    4. Numerical analysis and engineering (NumPy, Numba, and many more)
    5. Animation (LucasArts, Disney, DreamWorks)
    6. Game backends (EVE Online, Second Life, Battlefield, and many others)
    7. Email infrastructure (Mailman, Mailgun)
    8. Media storage and processing (YouTube, Instagram, Dropbox)
    9. Operations and systems management (Rackspace, OpenStack)
    10. Natural language processing (NLTK)
    11. Machine learning and computer vision (scikit-learn, Orange, SimpleCV)
    12. Security and penetration testing (many, many examples, and eBay/PayPal)
    13. Big Data (Disco, Hadoop support)
    14. Calendaring (Calendar Server, which powers Apple iCal)
    15. Search systems (ITA, Ultraseek, and Google)
    16. Internet infrastructure (DNS) (BIND 10)

Not to mention websites and web services aplenty. In fact, PayPal engineers seem to have a penchant for going on to found Python-based web properties, such as YouTube and Yelp. If you're interested in a longer list of Python success stories, take a look at the official list.

Fallacy #5: Python is weakly typed

Python's type system is characterized by strong, dynamic typing. Wikipedia explains this in more detail.

Not that it is a competition, but as a fun fact, Python is more strongly typed than Java. Java has a split type system for primitives and objects, with null sitting in a gray area. Modern Python, on the other hand, has a unified, strong type system in which the type of None is well specified. Furthermore, the JVM itself is dynamically typed, as it traces its roots back to an implementation of a Smalltalk VM acquired by Sun.
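Both halves of "strong, dynamic typing" can be seen in a few lines. This is a minimal sketch; the helper names try_add and type_name are invented for the example.

```python
def try_add(a, b):
    """Add two values, reporting a type error instead of raising it."""
    try:
        return a + b
    except TypeError as exc:
        return "TypeError: {}".format(exc)


def type_name(value):
    """Return the name of the value's type."""
    return type(value).__name__


# Strong typing: Python refuses to silently coerce a str and an int.
print(try_add(1, 2))
print(try_add("1", 2))

# Dynamic typing: a name can be rebound to values of different types,
# but every value always carries one definite type.
item = 42
first_type = type_name(item)
item = "forty-two"
second_type = type_name(item)
print(first_type, second_type)

# Unified type system: None is an ordinary object with its own type.
print(type_name(None), isinstance(None, object))
```

The mixed str and int addition fails with a TypeError rather than producing "12" or 3, which is what distinguishes strong typing from weak typing.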

Python's type system is very nice, but for enterprise use there are much bigger concerns at hand.

Fallacy #6: Python is slow

First, a critical distinction: Python is a programming language, not a runtime. There are several Python implementations:

    1. CPython is the reference implementation, and also the most widely distributed and used.
    2. Jython is a mature implementation of Python for the JVM.
    3. IronPython is Microsoft's Python for the Common Language Runtime, also known as .NET.
    4. PyPy is an up-and-coming Python implementation with advanced features such as JIT compilation and incremental garbage collection.

Each runtime has its own performance characteristics, and none of them is slow per se. The more important point is that it is a mistake to assign a performance assessment to a programming language. Always assess an application runtime, preferably against a specific use case.

With that cleared up, here is a small selection of cases where Python has offered significant performance advantages:

    1. Using NumPy as an interface to Intel's MKL SIMD
    2. PyPy's JIT compilation achieves faster-than-C performance
    3. Disqus scaled from 250 million to 500 million users on the same 100 boxes

Admittedly, these are not the newest examples, just my favorites. It would be easy to get sidetracked into the wide world of high-performance Python and the unique offerings of each runtime. Instead of dwelling on individual special cases, we should focus on the general impact of developer productivity on end-product performance, especially in an enterprise setting.

Figure: C++ vs. Python. A comparison of the two languages producing the same output.

Given enough time, a disciplined developer can follow the only proven approach to producing accurate and efficient software:

    1. Engineer for correct behavior, including developing the corresponding tests
    2. Profile and measure performance, identifying bottlenecks
    3. Optimize, with proper respect for the test suite and Amdahl's Law, and taking advantage of Python's strong roots in C
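The three steps above can be sketched with the standard library alone. This is an illustrative example, not code from eBay or PayPal: the workload (a sum of squares) and every function name are assumptions. It uses cProfile for step 2 and a small helper that evaluates Amdahl's Law for step 3.

```python
import cProfile
import pstats


def slow_sum_of_squares(n):
    """Step 1: a straightforward version that is easy to verify."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n):
    """Step 3: an optimized version using the closed-form formula."""
    return (n - 1) * n * (2 * n - 1) // 6


def profiled_seconds(func, *args):
    """Step 2: run func under cProfile; return (result, seconds spent)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    return result, stats.total_tt


def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the runtime runs `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# The test written in step 1 guards the optimization made in step 3.
assert slow_sum_of_squares(1000) == fast_sum_of_squares(1000)

slow_result, slow_time = profiled_seconds(slow_sum_of_squares, 200000)
fast_result, fast_time = profiled_seconds(fast_sum_of_squares, 200000)
print("slow: {:.6f}s  fast: {:.6f}s".format(slow_time, fast_time))

# Amdahl's Law: making half of a program twice as fast gives about 1.33x overall.
print("{:.2f}".format(amdahl_speedup(0.5, 2)))
```

Amdahl's Law is the reason step 2 comes before step 3: optimizing code that accounts for a small fraction of the runtime cannot yield a large overall speedup, however fast that code becomes.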

Although this sounds simple, it can be a very time-consuming process even for seasoned engineers. Python was designed from the ground up with this development process in mind. In our experience, it is not uncommon for a Python project to go through three or more iterations in the time it takes a C++ or Java project to complete one. Today, PayPal and eBay have multiple success stories in which Python projects outperformed their C++ and Java counterparts with less code, thanks to fast development that left time for careful tailoring and optimization.

Fallacy #7: Python does not scale

Scale has many definitions, but by any of them YouTube is a website at scale. More than 1 billion unique visitors per month, over 100 hours of video uploaded per minute, and approaching 20% of peak Internet bandwidth, all with Python as a core technology. Dropbox, Disqus, Eventbrite, Reddit, Twilio, Instagram, Yelp, EVE Online, Second Life, and, yes, eBay and PayPal all have Python scaling stories that prove scale is more than just possible: it's a pattern.

The key to success is simplicity and consistency. CPython, Python's primary virtual machine, maximizes these characteristics, which in turn makes for a very predictable runtime. One would be hard pressed to find Python programmers worried about garbage collection pauses or application startup time. With strong platform and networking support, Python lends itself naturally to smart horizontal scaling, as systems like BitTorrent demonstrate.

In addition, scaling is all about measurement and iteration. Python is built with profiling and optimization in mind. See Fallacy #6 for more details on how to scale Python vertically.

Fallacy #8: Python lacks good concurrency support

Occasionally, while debunking performance and scaling fallacies, someone tries to get technical: "Python lacks concurrency," or, "What about the GIL?" If dozens of counterexamples are not enough to build confidence in Python's ability to scale vertically and horizontally, then an extended explanation of a CPython implementation detail probably won't help either, so I'll be brief.

Python has great concurrency primitives, including generators, greenlets, Deferreds, and futures. Python has great concurrency frameworks, including eventlet, gevent, and Twisted. Python has had a staggering amount of work put into customizing runtimes for concurrency, including Stackless and PyPy. All of these and more show that there is no shortage of engineers using Python effectively for concurrent programming. All of them are also officially supported or used in enterprise production environments. For examples, see Fallacy #7.
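Two of the primitives named above, generators and futures, ship with the standard library. The following is a minimal sketch under stated assumptions: fake_io_request stands in for a real network call, and all names are invented for the example. It does not show greenlets, Deferreds, or any of the third-party frameworks.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_request(request_id, delay=0.05):
    """Stand-in for a network call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return request_id * 2


def run_concurrently(request_ids, workers=8):
    """Issue all requests on a thread pool and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_io_request, request_ids))


def countdown(n):
    """A generator: execution pauses at each yield and resumes on demand."""
    while n > 0:
        yield n
        n -= 1


start = time.perf_counter()
results = run_concurrently(range(8))
elapsed = time.perf_counter() - start

print(results)
print("8 requests of 0.05s each took {:.2f}s".format(elapsed))
print(list(countdown(3)))
```

Because the threads spend their time waiting rather than computing, the eight simulated requests overlap and typically finish in far less than the 0.4 seconds they would take one after another.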

The Global Interpreter Lock, or GIL, is a performance optimization for most Python use cases, and an ease-of-development optimization for virtually all CPython code. The GIL makes it much easier to use operating system threads or green threads (usually greenlets), and it does not affect the use of multiple processes. For more information, see the Q&A list on the topic, as well as the overview in the Python documentation.

At PayPal, a typical service deployment involves multiple machines, multiple processes, multiple threads, and a very large number of greenlets, amounting to a very robust and scalable concurrent environment (see the figure described below). In most enterprise environments, teams tend to prefer a fairly high degree of overprovisioning, for general prudence and disaster recovery. Nevertheless, in some cases Python services still handle millions of requests per machine per day with ease.

Figure: A rough sketch of a single worker in our coroutine-based asynchronous architecture. The outermost box is the process, the next level is threads, and within those threads are green threads. The operating system handles preemption between threads, whereas I/O coroutines are cooperative.

Fallacy #9: Python programmers are scarce

There is some truth to this fallacy: there are not as many Python web developers as PHP or Java web developers. This is probably due mostly to the interaction between industry demand and education, though trends in education (the programming languages used in teaching) suggest this may change.

That said, Python developers are far from scarce. There are millions of them worldwide, as evidenced by the dozens of Python conferences, the thousands of Python questions and answers on StackOverflow, and companies such as YouTube, Bank of America, and LucasArts/DreamWorks that employ Python developers in large numbers. At eBay and PayPal we have hundreds of developers who use Python regularly, so what is the trick?

Well, why scavenge when you can create? Python is exceptionally easy to learn, and it is a first programming language for children, university students, and professionals alike. At eBay, a new Python programmer can show real results in just one week, and they often start to shine within 2-3 months, all made possible by the Internet's rich cache of interactive tutorials, books, documentation, and open-source codebases.

Another important consideration is that projects using Python simply do not need as many developers as other projects. As mentioned in Fallacy #6 and Fallacy #9, lean, effective teams like Instagram's are a common trope in Python projects, and this has certainly been our experience at eBay and PayPal.

Fallacy #10: Python is not fit for large projects

Fallacy #7 discussed running Python projects at scale, but what about developing Python projects at scale? As mentioned in Fallacy #9, most Python projects tend not to be people-hungry. When Instagram was acquired for a billion dollars it was serving tens of millions of hits a day, yet the whole company was only a dozen or so people. Dropbox had only 70 engineers in 2011, and other teams were similarly lean. So can Python scale to large teams?

Bank of America actually has more than 5,000 Python developers, with more than 10 million lines of Python code in a single project. JP Morgan has undergone a similar transformation. YouTube also has engineers in the thousands and lines of code in the millions. Large products and large teams use Python every day, and while it has excellent modularity and packaging characteristics, beyond a certain point most general advice on scaling development stays the same. Tooling, strong conventions, and code review are what make large projects a manageable reality.

Fortunately, Python starts from a good baseline on those fronts as well. We use PyFlakes and other tools to perform static analysis of Python code before it is checked in, and we adhere to PEP8, Python's language-wide base style guide.
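To give a feel for what this kind of static analysis involves, here is a deliberately simplified sketch of one check that tools in this category perform: reporting imports that are never used. It is built on the standard-library ast module and is not PyFlakes' implementation or API; the function name find_unused_imports and the sample source are assumptions made for this example.

```python
import ast


def find_unused_imports(source):
    """Return the sorted names that are imported but never referenced.

    The code is only parsed, never executed, which is what makes the
    analysis static. Real tools also handle scopes, __all__, and more.
    """
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import os.path" binds the name "os".
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


# Hypothetical module text to analyze.
sample = (
    "import os\n"
    "import sys\n"
    "from json import dumps\n"
    "print(sys.argv)\n"
)

print(find_unused_imports(sample))
```

For the sample above the check reports dumps and os, since only sys is ever referenced.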

Finally, it should be noted that, in addition to the scheduling speedups mentioned in Fallacy #6 and elsewhere above, projects that use Python generally require fewer developers as well. Our most common success story starts with a Java or C++ project slated to take a team of 3-5 developers 2-6 months, and ends with a single motivated developer completing the project in 2-6 weeks (or hours, for that matter).

A miracle to some, but a fact of modern development, and often a necessity for a competitive business.

A clean slate

Mythology can be a fun pastime. Discussions of these fallacies remain some of the most active and educational, both internally and externally, because implied in every fallacy is a recognition of Python's strengths. Also, remember that the appearance of these seemingly tedious and troublesome concerns is a sign of steadily growing interest, and with a steady influx of interested parties comes the constant job of education. I hope this post manages to extinguish a flame war or two and makes room to talk about the real work that can be achieved with Python.

Keep an eye out for future posts, where I will dive deeper into the details touched on in this overview. If you need details before then, or have corrections or comments, send me an email at [email protected]. Until then, happy coding!
