Mastering Python crawlers: from Scrapy to mobile apps

I can hear people screaming: "What does Appery.io, a platform dedicated to mobile apps, have to do with Scrapy?" Well, seeing is believing. A few years ago, showing scraped data in an Excel spreadsheet might have impressed a person (friend, manager, or customer). Nowadays, unless your audience is very sophisticated, their expectations are likely to be quite different. In the next few pages, you'll see a simple mobile app, a minimal visualization that can be created with just a few clicks, that conveys the power of the extracted data to stakeholders and gives value back to the ecosystem in the form of web traffic to the source website.

I will try to keep the motivating examples short; here they show how to make the most of your data.

1.1 Choosing a mobile app framework

With the right tools, feeding data into a mobile app is very easy. There are many excellent cross-platform mobile application development frameworks, such as PhoneGap, Appcelerator with Appcelerator Cloud Services, jQuery Mobile, and Sencha Touch.

This article uses Appery.io because it lets us quickly create iOS, Android, Windows Phone, and HTML5 mobile apps using PhoneGap and jQuery Mobile. Neither Scrapy nor I have any vested interest in Appery.io. I encourage you to investigate it independently and check whether it fits your needs beyond the features presented in this article. Please note that this is a paid service with a 14-day trial, but in my opinion its price makes it a no-brainer for quick prototyping, especially for those who are not web experts. The main reason I chose this service is that it provides both mobile and backend services, meaning we don't need to configure databases, write REST APIs, or use yet another language for the server and the mobile app. As you will see, we won't have to write a single line of code! We will use their online tools, and at any time you can download the app as a PhoneGap project and use the full range of PhoneGap features.

You need an active Internet connection to use Appery.io in this article. Also note that the layout of the site may change in the future. Please treat our screenshots as a reference, and don't be surprised if the site doesn't look exactly the same.

1.2 Creating databases and collections

The first step is to sign up for the free Appery.io plan by clicking the Sign-up button on the Appery.io website and choosing the free option. Provide a username, email address, and password, and your new account will be created. Wait a few seconds for the account to be activated. You can then log in to the Appery.io dashboard. Now you are ready to create a new database and a collection, as shown in Figure 1.1.

Figure 1.1 Creating a new database and collection using Appery.io

In order to complete this operation, you need to follow the steps below.

1. Click the Databases tab (1).

2. Then click the green Create New Database (2) button. Name the new database Scrapy (3).

3. Now click the Create button (4). The dashboard for the Scrapy database is automatically opened, where you can create a new collection.

In Appery.io terminology, a database is made up of a set of collections. Roughly speaking, an app uses a single database (at least initially), and each database contains multiple collections, such as users, properties, messages, and so on. Appery.io provides a Users collection by default, which holds usernames and passwords (Appery.io has a lot of built-in functionality for them). Figure 1.2 shows the process of creating a collection.

Figure 1.2 Creating a new database and collection using Appery.io

Now, let's add a user with the username root and the password pass. Of course, you can choose a more secure username and password. To do this, click the Users collection (1) in the sidebar, and then click +row to add a user/row (2). Enter the username and password (3) and (4) in the two fields that appear.

We also need to create a new collection, named properties, to store the property data crawled by Scrapy. Create it by clicking the green Create New Collection button (5), naming it properties (6), and then clicking the Add button (7). Next, we have to customize the collection a little. Click +col to add a database column (8). Each database column has a type, which is used to validate its values. Most fields are simple strings, except for the price, which is a number. For each column, click +col (8), fill in the column name (9), select the type (10) if it is not a string, and click the Create Column button (11). Repeat this procedure five times to create the columns shown in Table 1.1.

Table 1.1

Once the collection is set up, you should have all the required columns, as shown in Table 1.1. Now you are ready to import some data from Scrapy.
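To illustrate what "each column has a type that is used to validate the value" means in practice, here is a minimal Python sketch of the same idea. The column names below are hypothetical (the text only states that there are five columns and that the price is numeric while the rest are strings), and the check is only an approximation of whatever validation Appery.io performs internally.

```python
# Hypothetical column names and types, chosen only for illustration.
COLUMN_TYPES = {
    "title": str,
    "price": (int, float),   # the one numeric column
    "description": str,
    "url": str,
    "image_urls": str,
}


def validate_row(row):
    """Return a list of problems; an empty list means the row fits the columns."""
    problems = []
    for name, value in row.items():
        expected = COLUMN_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown column: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


print(validate_row({"title": "Flat in London", "price": 1200.0}))  # []
print(validate_row({"title": "Flat in London", "price": "1200"}))
# ['wrong type for price: str']
```

A check like this can be handy on the Scrapy side too, to catch a price that was scraped as text before it is sent to a numeric column.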

1.3 Populating the database with Scrapy

First, we need an API key. We can find it in the Settings tab (1). Copy the value (2), and then click the Collections tab (3) to return to the properties collection, as shown in Figure 1.3.

Figure 1.3 Creating a new database and collection using Appery.io

Very good! Now you need to import the data into Appery.io. We will first copy the project and the spider named easy (easy.py) into a new spider named tomobile (tomobile.py). We also edit the file and set its name to tomobile.

One thing you may notice here is that we no longer use the web server (http://web:9312) from the previous section. Instead, we use a publicly available copy of the site, which I keep at http://scrapybook.s3.amazonaws.com. We use it in this article because it makes both the images and the URLs publicly available, so the app is easy to share.

We will use the Appery.io pipeline to insert the data. A Scrapy pipeline is typically a small Python class that can post-process, clean, and store Scrapy items. For now, you can install the pipeline with easy_install or pip, but if you are using our Vagrant dev machine, there is nothing to do because it is already installed.
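To make the idea of an item pipeline more concrete, the following is a minimal sketch of such a class. It is not the Appery.io pipeline; the class name and the cleaning rules are hypothetical. The only part taken from Scrapy is the method interface (open_spider and process_item), which needs no imports because a pipeline is a plain Python class.

```python
class CleanAndCollectPipeline:
    """Hypothetical pipeline: tidies each item and keeps it in memory."""

    def open_spider(self, spider):
        # Called once when the spider starts; prepare the "storage".
        self.items = []

    def process_item(self, item, spider):
        # Post-process: strip surrounding whitespace from string fields.
        cleaned = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in dict(item).items()
        }
        # Clean: turn a price such as "1,200.50" into a float.
        if isinstance(cleaned.get("price"), str):
            cleaned["price"] = float(cleaned["price"].replace(",", ""))
        # Store: a real pipeline would write to a database or a remote API here.
        self.items.append(cleaned)
        return cleaned


pipeline = CleanAndCollectPipeline()
pipeline.open_spider(spider=None)
print(pipeline.process_item({"title": " Flat ", "price": "1,200.50"}, spider=None))
# {'title': 'Flat', 'price': 1200.5}
```

A pipeline that stores items remotely follows the same shape, with the in-memory list replaced by a network call.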

At this point, you need to make some minor changes to Scrapy's main settings file to add the API key copied earlier. For now, all we have to do is append the pipeline's settings to the properties/settings.py file.

Don't forget to replace the value of APPERYIO_DB_ID with your API key. Also, make sure that the username and password in the settings match the ones you used when you created the database user in Appery.io. To populate the Appery.io database with data, run scrapy crawl as you normally would.
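As a rough sketch, the additions to properties/settings.py might look like the fragment below. Only the APPERYIO_DB_ID setting and the ApperyIoPipeline class name are mentioned in this article; the module path and the other setting names are assumptions made for illustration, so check the documentation of the pipeline package you installed before copying them.

```python
# properties/settings.py (fragment)

# Enable the pipeline. The dotted module path is an assumption.
ITEM_PIPELINES = {"scrapyapperyio.ApperyIoPipeline": 300}

# Paste the API key copied from the Settings tab of the Scrapy database.
APPERYIO_DB_ID = "<<Your API KEY here>>"

# Assumed setting names; the values match the user and collection created earlier.
APPERYIO_USERNAME = "root"
APPERYIO_PASSWORD = "pass"
APPERYIO_COLLECTION_NAME = "properties"
```

The number 300 is the pipeline's order value; Scrapy runs pipelines with lower values first.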


This time the output will be somewhat different. In the first few lines, you will see one line showing that the ApperyIoPipeline item pipeline has been enabled. Most noticeably, although 100 items are scraped, there are 200 requests/responses. This is because the Appery.io pipeline performs one additional request to the Appery.io server to write each item. These requests, with api.appery.io URLs, also appear in the log.

When you return to Appery.io, you can see that the properties collection (1) is now populated with data (2), as shown in Figure 1.4.
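The doubling of requests can be sketched as follows. The code builds, but never sends, one write request per item and then adds up the totals. The endpoint layout and header names are assumptions about the Appery.io REST API, and the database ID and token are placeholders; index-page requests are ignored to keep the arithmetic simple.

```python
import json
from urllib.request import Request

# Assumed endpoint layout; verify against the Appery.io REST documentation.
API_BASE = "https://api.appery.io/rest/1/db/collections"


def build_write_request(item, collection, db_id, session_token):
    """Build (but do not send) the extra POST that stores one item."""
    return Request(
        f"{API_BASE}/{collection}",
        data=json.dumps(item).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Appery-Database-Id": db_id,            # assumed header name
            "X-Appery-Session-Token": session_token,  # assumed header name
        },
        method="POST",
    )


items = [{"title": f"Property {i}", "price": 1000.0 + i} for i in range(100)]
page_requests = len(items)  # one page download per scraped item
write_requests = [
    build_write_request(item, "properties", "my-db-id", "my-token")
    for item in items
]
total_requests = page_requests + len(write_requests)
print(total_requests)  # 200
```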

Figure 1.4 Populating the Properties collection with data

1.4 Creating a mobile app

Creating a new mobile app is easy. We just click the Apps tab (1) and then the green Create New App button (2). Fill in properties as the application name (3) and click the Create button to create the app, as shown in Figure 1.5.

Figure 1.5 Creating a new phone app and database collection

1.4.1 Creating a database access service

There are a lot of options when creating a new app. With the Appery.io app editor you can write complex applications, but we'll keep things as simple as possible. The first thing we need is a service that gives the app access to the Scrapy database. To achieve this, click the square green CREATE NEW button (5) and select Database Services (6). A new dialog box pops up, allowing us to select the database we want to connect to. Select the Scrapy database (7). Most of the options in this menu will not be used; for now, just click to expand the properties area (8) and select List (9). Behind the scenes, this writes code for us so that the data we crawled with Scrapy becomes available on the web. Finally, click the Import Selected Services button to finish (10).

1.4.2 Creating the user interface

Next, we start creating all of the visual elements of the app, using the Design tab in the editor, as shown in Figure 1.6.

Figure 1.6 Creating the user interface

From the tree on the left side of the page, expand the Pages folder (1), and then click Startscreen (2). The UI editor opens the page, where we can add some controls. Let's edit the title first to get familiar with the editor. Click the header caption (3), and you'll notice that the properties panel on the right side of the screen changes to show the caption's properties, including a Text property. Change its value to Scrapy App, and the title in the middle of the screen updates accordingly.

Then, add a grid component by dragging the Grid control from the left panel (5). The control has two rows, but we only need one. Select the grid you just added. You know the grid is selected when it is grayed in the thumbnail area (6) at the top of the phone view. If it is not selected, click the grid to select it. The properties panel on the right then updates to show the grid's properties. Set the Rows property to 1 (7), then click Apply (8). The grid is now updated to have only one row.

Finally, drag some other controls into the grid. First add an image control (9) on the left of the grid, then a link (10) on the right of the grid, and finally a label (11) underneath the link.

As far as layout is concerned, this is sufficient. The next step is to feed data from the database into the user interface.

1.4.3 Mapping data to the user interface

So far, we've spent a lot of time in the Design tab building the visual side of the app. To link the available data to these controls, switch to the Data tab (1), as shown in Figure 1.7.

Figure 1.7 Mapping data to the user interface

Select Service (2) as the data source type. Because the service you created earlier is the only one available, it is selected automatically. Then click the Add button (3), and the service properties will be listed below. As soon as you press the Add button, you'll see events such as Before send and Success. We can customize what happens after a successful call to the service by clicking the Mapping button next to Success.

This opens the Mapping Action Editor, where we wire things together. The editor has two sides. On the left are the fields available in the service response, and on the right are the properties of the UI controls that you added in the previous step. There is an Expand all link on both sides, which you can click to see all available data and controls. Next, drag from the left side to the right side to create the five mappings given in Table 1.2.

Table 1.2

1.4.4 Mapping between database fields and user interface controls

The numbers in Table 1.2 may be slightly different in your case, but because there is only one control of each type, the chance of getting something wrong is very small. By setting these mappings, we tell Appery.io to write all the code behind the scenes that loads the controls with values from the database when the database query succeeds. You can then click the Save and Return button (6) to continue.
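Conceptually, the generated code does something like the following Python sketch: for every record in the service response, it copies each mapped field into a property of a UI control. The field names, control names, and property names are all hypothetical, and Appery.io's real generated code is JavaScript rather than Python.

```python
# Hypothetical (response field, control, control property) triples.
MAPPINGS = [
    ("image_urls", "image", "src"),
    ("url", "link", "url"),
    ("title", "link", "text"),
    ("price", "label", "text"),
]


def fill_controls(records):
    """Return one dict of control properties per record (one grid row each)."""
    rows = []
    for record in records:  # mapping the record list to the grid repeats the row
        controls = {}
        for field, control, prop in MAPPINGS:
            controls.setdefault(control, {})[prop] = record.get(field)
        rows.append(controls)
    return rows


sample = [{
    "title": "Flat",
    "price": 1200.0,
    "url": "http://example.com/1",
    "image_urls": "http://example.com/1.jpg",
}]
rows = fill_controls(sample)
print(rows[0]["link"])  # {'url': 'http://example.com/1', 'text': 'Flat'}
```

The four per-field triples plus the loop over records correspond to five mappings in total: four for individual fields and one that ties the list of records to the repeating grid row.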

We are now back in the Data tab, as shown in Figure 1.7. Because we also need to return to the UI editor, click the Design tab (7). At the bottom of the screen, you will find an events area (8) that has just expanded, although it was there all along. In the events area, we tell Appery.io what to do in response to UI events. This is the last step we need to take. It makes the app invoke the service to retrieve the data as soon as the UI is loaded. To achieve this, select Startscreen as the component and keep the event as the default Load option. Then select Invoke Service as the action, and keep the datasource as the default Restservice1 option (9). Finally, click Save (10), and that is all we need to do to create this mobile app.
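The event wiring described above can be pictured with a small Python sketch: a page keeps a list of actions per event, and firing the load event runs the action that invokes the service. The Page class, the invoke_service function, and the canned record are hypothetical stand-ins, not Appery.io code.

```python
class Page:
    """Hypothetical stand-in for a page such as the start screen."""

    def __init__(self):
        self.handlers = {}

    def on(self, event, action):
        # Register an action to run when the event fires.
        self.handlers.setdefault(event, []).append(action)

    def fire(self, event):
        # Run every registered action and collect the results.
        return [action() for action in self.handlers.get(event, [])]


def invoke_service():
    # Placeholder for the database list service; returns a canned record.
    return [{"title": "Sample flat", "price": 1200.0}]


start_screen = Page()
start_screen.on("load", invoke_service)  # component + event + action
records = start_screen.fire("load")[0]
print(records)  # [{'title': 'Sample flat', 'price': 1200.0}]
```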

1.4.5 Testing, sharing, and exporting your mobile app

Now, you can test the application. All we need to do is click the Test button (1) at the top of the UI builder, as shown in Figure 1.8.

Figure 1.8 Mobile app running in your browser

The mobile app runs in the browser. The links work (2) and can be followed. You can preview different phone screen sizes and device orientations, or you can click the View on Phone button to display a QR code that you can scan with your mobile device to preview the app there. To let others try the app in their browsers, you just need to share the link that is generated.

With just a few clicks, we can organize the data captured by Scrapy and display it in a mobile app. If you need to customize the app further, you can refer to the tutorials provided by Appery.io at http://devcenter.appery.io/tutorials/. When everything is ready, the app can be exported via the Export button, and Appery.io offers a rich set of export options, as shown in Figure 1.9.

Figure 1.9 You can export your app to most major mobile platforms

You can export the project files and develop them further in your favorite IDE, or you can get binaries and publish them to each platform's app store.

1.5 Summary

Using Scrapy and Appery.io, we now have a system that crawls a website and inserts the data into a database. In addition, we got a RESTful API and a simple mobile app that can be used on Android and iOS. For advanced features and further development, you can dig deeper into these platforms, outsource part of the development to domain experts, or research alternatives. With a minimal amount of coding, you now have a minimal product that demonstrates the application concept.

You'll notice that our app looks good for such a short development time. This is because it uses real data, not placeholders, and all links work and are meaningful. We have successfully created a minimum viable product that respects its ecosystem (the source site) and feeds value back to the source site in the form of traffic.

Now, we can start to learn how to use Scrapy spiders to extract data in more complex scenarios.

This excerpt is from the book "Mastering the Python Crawler Framework Scrapy" by Dimitrios Kouzis-Loukas (USA).