For a long time, no matter what program I wrote, I used C++, because C++ is very powerful and can handle everything from the lowest level to the highest. At the low level, C++ is compatible with C and supports inline assembly, so it can be used for embedded programs, drivers, and operating systems. At the upper level, C++ supports object-oriented programming and has a very rich set of libraries, enough to develop almost any program. With tools such as Qt and the Android NDK, C++ can build mobile apps, and with tools such as Cgicc it can build web applications. But using C++ for mobile apps and web programs is not a good choice, and not many people do it. C++ can do anything, but that does not mean it can do everything conveniently. Choosing the right language for each kind of program, letting each language play to its strengths in its own area, and dividing the work among languages is the way to go.
When I first got into programming, I chose C++ as my first language because I was interested in low-level things and planned to work on embedded systems, drivers, virtual machines, operating systems, or servers. While developing in C++, it was very satisfying to work with the system APIs directly, or even to get around their limits and implement functionality that the operating system does not expose publicly. Later, for a while, I used C# to rewrite some of my C++ programs. Because .NET did not provide the features I wanted (after all, C# is the language Microsoft uses to compete with Java, and what .NET wraps most thoroughly are the libraries for upper-level application development), I needed a lot of calls to the system APIs. What took a single statement in C++ needed a pile of declarations in C#, plus type conversions and pointer handling, and the resulting code was very ugly. Since then I have had little affection for C#. Still, C# left me with one good impression: its upper-level libraries are very rich and simple to call. Earlier, when I wrote some "XX management system" projects with WinForms, I just dragged in controls, changed a few properties, and bound them to the database; with almost no code written, the data in the database showed up on screen as a report. But I am not very interested in upper-level application development such as these XX management systems, enterprise applications, and web programs, so I did not study C# in depth.
In other words, although C++ is very strong at low-level development, it is not a language with high development efficiency. Runtime efficiency and development efficiency cannot both be maximized; nothing is perfect in every respect, and every gain comes with a loss. So when writing upper-level functionality in C++, I felt tired and found it hard going, and I kept wanting to find a language to use alongside C/C++. So I opened a recent programming language ranking and went through it from top to bottom to see which language would be best as my second. First came Java and C#. Since I have little interest in mobile, enterprise, or web development, I rejected them, and there also seem to be few areas where Java or C# needs to cooperate with C/C++. Next was Objective-C. I had no plans for mobile app development, iOS included, so it was rejected too. Then came Python, JavaScript, PHP, VB.NET, and VB. VB needs no discussion, VB.NET is not as good as C#, and PHP and JavaScript are mainly used for web development, so they were all rejected, which left Python as the only candidate in the top 10. The languages after the top 10 were rejected because not many people use them (they are not popular), because they are domain-specific, or because I simply did not know them.
A cursory look showed that Python, a scripting language, offers much higher development efficiency than C# or Java and combines well with C/C++, so it was clearly the best choice. I had heard many people recommend Python before, but I always thought: isn't it just another scripting language? There are so many scripting languages; how great could Python be? And when I read some Python code, I found its syntax differed from C syntax in many ways, it felt strange, and I did not look into it much. Later I found a few books, studied them, and wrote some code, and I was astonished: programs can be written this simply and conveniently.
So I got started right away. I wrote a small program to crawl data from the web and compared it with a program with the same function that I had written in C++. I gasped; frankly, I was scared stiff. The Python code is posted here: in just over 10 lines, it crawls the "Love password" site for the latest daily shared Thunder (Xunlei) member accounts and passwords.
import urllib.request
import re

def get_member():
    # Fetch the portal page; the site serves GBK-encoded text.
    response = urllib.request.urlopen("http://521xunlei.com/portal.php")
    html = str(response.read(), 'gbk')
    # Find the link to the newest thread in the portal block.
    result = re.search(r'<div id="portal_block_62_content"[\s\S]+?<a href="(thread.+?)"[\s\S]+?</li>', html)
    if result is None:
        return None
    url = r'http://521xunlei.com/' + result.group(1)
    response = urllib.request.urlopen(url)
    html = str(response.read(), 'gbk')
    # Group 1 is the account, group 2 is the password.
    # The label alternatives appear here in translated form; the page itself
    # is GBK-encoded Chinese, so a working pattern needs the page's own labels.
    pattern = re.compile(r'(?:Account Sharing Thunder member account Sharing|Thunderbolt number|Thunderbolt Account|Thunder Share number|Thunderbolt member account|shared account)([a-zA-Z0-9]+?:[12]).*?(?:share password|password share|initial password)(.+?)(?:<br|</font>)')
    result = pattern.findall(html)
    if len(result) == 0:
        return None
    else:
        return result
I had also written a web crawler in C++, and it took a great deal of effort. First I needed a library that speaks HTTP to access websites, and neither the C++ standard library nor Boost has one. Searching online, I found libcurl, which is open source and cross-platform, but as with most third-party C++ libraries, there were problems compiling it with the VC compiler. After struggling for most of a day I finally got it to compile. Then I looked at the documentation: all in English, and not very detailed. The libcurl API also did not seem to support Unicode. In short, it was uncomfortable to use in all sorts of ways. Later I switched to WinINet and WinHTTP, which Windows provides. They felt better to use, but they are still C APIs with no easy-to-use interface: sending a single GET request takes more than 10 lines of code. I spent a few days wrapping WinHTTP into a class with an easy interface for sending GET/POST requests, and only then did it feel convenient. But once the HTML text of a page has been fetched, how do you extract the data from it? My first thought was an HTML parser, so I tried third-party HTML parsing libraries and Microsoft's MSHTML. I will not repeat all the unpleasant things I ran into; I have covered them in another article. To use MSHTML I had to learn the basics of COM and ATL, and I cannot even say how long that took. Then I spent a few more days wrapping the basic functionality of MSHTML into a class that can read the content of an HTML element in one short statement, much like JavaScript. In total, to write a web crawler in C++, I wrote two classes wrapping WinHTTP and MSHTML, plus some convenient character set conversion functions and small utilities: more than 3,000 lines of code altogether. For this preparation I consulted a lot of material, learned a lot of techniques, and spent a very long time.
After that I could write web crawlers fairly pleasantly, but by then I was worn out...
Later I saw others write web crawlers with Python and regular expressions, so I tried it myself, and it was simple and enjoyable. There are three main reasons it is so pleasant to write:
1. Python's syntax is concise, so the code is short.
2. Python's standard library is very powerful and has modules for all kinds of tasks; web access and text processing in particular are its strengths.
3. For a targeted crawler, an HTML parser is not necessary. Regular expressions do the job with little code, and their group capture feature is especially useful: it pulls the desired data out of the whole HTML text in one pass.
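The group capture point can be shown with a small sketch. The HTML fragment, the "account:" and "password:" labels, and the values below are made up for illustration; the real site's markup and labels are different. Only the shape of the pattern (non-capturing label groups around two capturing groups) follows get_member().

```python
import re

# Hypothetical page fragment; not taken from the real site.
SAMPLE_HTML = (
    '<div id="post">'
    'account: user01:1 password: abc123<br>'
    'account: user02:2 password: xyz789<br>'
    '</div>'
)

# Two capturing groups: (account) and (password).
# The (?:...) groups only match the surrounding labels and are not returned.
PATTERN = re.compile(
    r'(?:account: )([a-zA-Z0-9]+?:[12]).*?(?:password: )(.+?)(?:<br|</font>)'
)

def extract_members(html):
    """Return a list of (account, password) tuples found in html."""
    return PATTERN.findall(html)

if __name__ == '__main__':
    for account, password in extract_members(SAMPLE_HTML):
        print(account + '\t' + password)
```

Because the pattern has two capturing groups, findall() returns one tuple per match, which is why get_member() can hand back a ready-made list of account and password pairs without any parsing step.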
What surprised me most about Python is how much its standard library provides. It is comprehensive, it has nearly everything I want, and the libraries are simple to call: just import them. Third-party libraries can be installed simply with easy_install. C/C++ has even more third-party libraries, but compiling and integrating them is troublesome, especially with the VC compiler on Windows, where a pile of compilation errors is very likely to appear. For novices like us that is a headache, even a disaster. Libraries also like to define their own data types: STL, MFC, Qt, and others each have their own string type. Then there are character set problems: some libraries do not support Unicode, some support only UTF-8, and so on. Integrating several libraries into one program is a lot of trouble. With Python you do not need to deal with these unpleasant things; the messy integration work has already been done by others, and you just take the result and use it.
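As a small illustration of the character set point, here is a sketch of converting GBK bytes to UTF-8 using only built-in types. The function name gbk_to_utf8 and the sample text are my own choices for the example; the str(data, 'gbk') call is the same style used in get_member().

```python
def gbk_to_utf8(data):
    """Decode GBK-encoded bytes to str, then re-encode them as UTF-8 bytes."""
    text = str(data, 'gbk')       # bytes -> str, same call style as get_member()
    return text.encode('utf-8')   # str -> bytes in another encoding

if __name__ == '__main__':
    # Sample text held as GBK bytes, as a Chinese web server might send it.
    raw = '迅雷会员'.encode('gbk')
    converted = gbk_to_utf8(raw)
    print(len(raw), len(converted))
    print(converted.decode('utf-8'))
```

There is one text type (str) and one byte type (bytes), and the codecs ship with the interpreter, so no conversion helpers or extra libraries are needed.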
As I said before, no language is omnipotent, and Python is not good at everything, GUI programs for example. The 10-odd lines of code above do crawl the Thunder accounts, but what if the program needs a graphical interface? Python is not so handy there. Most of Python's GUI libraries are wrappers around GUI libraries written in other languages. Python can, for instance, call the Win32 API and MFC to build a GUI, but that is certainly less convenient than calling them natively from C/C++. Python also does not seem to have an easy-to-use visual GUI design tool. On top of that, I had only just started with Python and was not familiar with its GUI development. So I decided to build the GUI with MFC, which I know well, and let C++ call the Python code above to make a "Thunder account grabber", learning how C++ and Python interact along the way.
I will not post that program's code, since this article is mainly about my impressions of Python. After I finished calling the Python program from C++, I found something very exciting: once the MFC GUI has been written, the EXE never needs to change. The MFC program just calls a Python function, the function returns an array of accounts and passwords, and the MFC program shows them in a list box. As long as the interface that the Python code offers to the MFC program stays the same, the MFC program never needs to be modified or recompiled. Since the result is always going to be an array of account and password pairs, the interface is guaranteed not to change. If the "Love password" site is redesigned and the scraping code stops working (which happens very easily), or if I stop collecting data from "Love password" and switch to another site, only the Python code has to change; the MFC program is unaffected. This is the advantage of modular programming. When people hear "module", many think of a DLL. For this program, if the data collection were implemented in C++ and packaged in a DLL for the MFC program to call, then changing the collection code later would only require recompiling the DLL, without touching the MFC EXE. But scripts can serve as modules too: script modules. The advantages of a script module are even more obvious. To modify one, you open it in a text editor, make the change, and save. And when your software needs an update, shipping a text script module to users is much easier than shipping a binary DLL module.
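The script module idea can be sketched in Python itself. This is not the MFC program: the host here is a stand-in written in Python, and the names load_script_module, fetch_members, and thunder_script.py are hypothetical. The only thing the host relies on is the contract described above, namely that the script defines get_member() returning a list of (account, password) tuples or None.

```python
import importlib.util
import os
import tempfile

def load_script_module(path, name='thunder_script'):
    """Load a Python source file from path as a module object."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

def fetch_members(script_path):
    """Host side: depends only on get_member() and its return shape."""
    module = load_script_module(script_path)
    return module.get_member() or []

if __name__ == '__main__':
    # Stand-in script with made-up data; a real one would scrape a site.
    script = "def get_member():\n    return [('user01:1', 'abc123')]\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'thunder_script.py')
        with open(path, 'w', encoding='utf-8') as f:
            f.write(script)
        for account, password in fetch_members(path):
            print(account + '\t' + password)
```

Replacing the script file with a new version changes where the data comes from, while fetch_members() and everything built on it stay untouched.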
As for how C++ and Python interact, the most basic approach is to call the C API provided by the Python virtual machine. Because the Python virtual machine is written in C, it officially provides a rich API that gives C/C++ access to almost everything in Python. But this API is not easy to call. Take calling get_member() from the Python code above to get the Thunder accounts: the return value of get_member() is a list, each element of the list is a tuple, and each tuple holds two strings, one the account and the other the password. The calling code is as follows:
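// Assumes Py_Initialize() has been called and error checks are omitted.
PyObject *module_name = Py_BuildValue("s", "Thunder");
PyObject *module = PyImport_Import(module_name);
PyObject *function_dict = PyModule_GetDict(module);                               // borrowed reference
PyObject *function = PyDict_GetItemString(function_dict, "get_member");  // borrowed reference
PyObject *result_list = PyObject_CallObject(function, NULL);
int result_num = PyList_Size(result_list);
for (int i = 0; i < result_num; i++)
{
    PyObject *member_tuple = PyList_GetItem(result_list, i);  // borrowed reference
    // PyUnicode_AsUnicode has been deprecated and later removed in recent CPython releases.
    wcout << PyUnicode_AsUnicode(PyTuple_GetItem(member_tuple, 0)) << L"\t"
          << PyUnicode_AsUnicode(PyTuple_GetItem(member_tuple, 1)) << L"\n";
}
// Release every new reference obtained above.
Py_DECREF(result_list);
Py_DECREF(module);
Py_DECREF(module_name);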
It is cumbersome to call, and you also have to pay attention to releasing resources. Calling Python from Python is, of course, very concise (which goes without saying):
import Thunder

result = Thunder.get_member()
for member in result:
    print(member[0] + '\t' + member[1] + '\n')
But the exciting thing is that Boost.Python changes the situation: it makes it much easier for C++ to call Python, or for Python to call C++. With Boost.Python, the call looks like this:
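// Assumes #include <boost/python.hpp>, using namespace boost::python and std,
// and an already initialized interpreter.
object result = import("Thunder").attr("get_member")();
for (int i = 0; i < len(result); i++)
{
    cout << string(extract<string>(result[i][0])) << "\t"
         << string(extract<string>(result[i][1])) << "\n";
}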
As you can see, Boost.Python calls the Python function get_member() in a single line of code, and because it uses smart pointer techniques, you do not need to manage the release of resources yourself. The code is much simpler to write. In fact, many Boost libraries make C++ code more concise. This is thanks to C++'s powerful template mechanism (known as compile-time polymorphism) and to the great skill of the Boost developers, who use templates superbly; if you read the Boost source code, you will be full of admiration. C++ is inherently a strongly typed, static language, but Boost uses templates and operator overloading so that, with libraries such as Boost.Python, C++ feels like a weakly typed, dynamic language. For example, object in Boost.Python wraps everything in Python: an object can be a Python module, a function, a string, an integer, a list, a tuple, a dict, or an instance of any other type. All types are erased into a single type: object. Boost also overloads operator() and operator[] on object, so an object can be indexed like an array, list, tuple, or dict, or called like a function, and at compile time nobody checks whether it really is a list or a function; everything is decided at run time. Syntactically it is also close to a scripting language and very easy to use. It really is a wonderful tool: with the help of Boost.Python, cooperative development in C++ and Python becomes very convenient.
A preliminary study of Python