MATLAB crawler: Web scraping with MATLAB 01 -- understanding the basic function webread

Source: Internet
Author: User


Note: The following sections are adapted from the MathWorks documentation for webread.





Read content from a RESTful Web service





1. RESTful


REST (Representational State Transfer) is a common architectural style for web services. A RESTful interface exposes standard HTTP methods such as GET, PUT, POST, and DELETE.



As REST becomes the default choice for most Web and Mobile applications, it's essential to understand its rationale.



Today, more than 10 years after it was introduced, REST has become one of the most important web application technologies. As technologies move toward APIs, its importance is likely to keep growing rapidly. Every major programming language now has a framework for building RESTful web services, so web developers and architects need a clear understanding of REST and RESTful services. This tutorial explains the REST architecture and then looks at how to use it for a common API-based task.


1.1 What is REST


REST stands for Representational State Transfer, an architectural style for networked hypermedia applications. It is primarily used to build lightweight, maintainable, and scalable web services. Services based on REST are called RESTful services. REST does not depend on any particular protocol, but almost every RESTful service uses HTTP as the underlying protocol.



A RESTful service uses HTTP methods to implement CRUD (create, read, update, delete) operations: POST or PUT to create and update data, GET to read data, and DELETE to delete data.


1.2 RESTful service features:


Every system uses resources. These resources can be images, video files, web pages, business information, or anything else that can be represented in a computer-based system. The purpose of a service is to give clients a window through which they can access these resources. Service architects and developers want such services to be easy to implement, maintain, extend, and scale. A RESTful architecture allows this and more. In general, RESTful services should have the following properties and features:


    • Representations
    • Messages
    • URIs
    • Uniform interface
    • Stateless
    • Links between resources
    • Caching

2. Syntax

data = webread(url)
data = webread(url,QueryName1,QueryValue1,...,QueryNameN,QueryValueN)
data = webread(___,options)
[data,colormap,alpha] = webread(___)
[data,Fs] = webread(___)


3. Description





data = webread(url) reads content from the web service specified by url and returns the content in data.



The web service provides a RESTful interface that returns data formatted as an internet media type, such as JSON, XML, image, or text.






data = webread(url,QueryName1,QueryValue1,...,QueryNameN,QueryValueN) appends query parameters to url, as specified by one or more pairs of name-value arguments. To put a query into the body of the message, use webwrite. The web service defines the query parameters.






data = webread(___,options) adds other HTTP request options, specified by the weboptions object options. You can use this syntax with any of the input arguments of the previous syntaxes.



To return data as a specific output type, specify the ContentType property of options.



To read content with a function, specify the ContentReader property of options as a handle to the function. webread downloads the data from the web service and reads it with the specified function:


    • If you specify a handle to a function that returns multiple output arguments, webread returns all of them.

    • If you specify a handle to a function that returns no output argument (such as the Image Processing Toolbox function @implay for video files), webread returns no output argument.





[data,colormap,alpha] = webread(___) reads an image from the web service specified by url and returns the image in data. You can use the previous syntaxes to return only the image. Use this syntax to also return the colormap and alpha channel associated with the image.



webread returns an image if the HTTP response has a Content-Type header field that specifies an image media type and imread supports the image format. For supported image formats, see the supported import and export file formats.



[data,Fs] = webread(___) reads audio data from the web service specified by url and returns the data in data. You can use the previous syntaxes to return only the audio data. Use this syntax to also return the sample rate of the audio data in hertz.



webread returns audio data if the HTTP response has a Content-Type header field that specifies an audio media type and audioread supports the audio format. For supported audio formats, see the supported import and export file formats.


4. Examples


4.1 Reading images from a website


Read the Jupiter image from the Hubble Heritage website and display the image.


url = 'http://heritage.stsci.edu/2007/14/images/p0714aa.jpg';
rgb = webread(url);
whos rgb
  Name         Size                 Bytes  Class    Attributes

  rgb       1000x800x3            2400000  uint8   


Resize and display the image.


rgb = imresize(rgb,0.6);
imshow(rgb)





Jupiter image source: NASA, ESA, and the Hubble Heritage Team (STScI/AURA). (For terms of use, see the Hubble Heritage Information Center.)


4.2 Reading data from a web service API


Read US temperature data from the World Bank Climate Data API and plot the average temperature for the years 1901–2012.



Read data from the World Bank. This API returns data in the form of a JSON object.


api = 'http://climatedataapi.worldbank.org/climateweb/rest/v1/';
url = [api 'country/cru/tas/year/USA'];
S = webread(url)
S = 

112x1 struct array with fields:

    year
    data


webread converts the JSON objects to a struct array. Each structure contains a year and the average temperature for that year in the United States (in degrees Celsius).
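
The conversion can be tried offline without calling the web service. The following is only a sketch: it assumes that jsondecode produces a comparable result for the same kind of JSON text, and the sample text is the short excerpt of the response shown in this article, not live data.

```matlab
% Sample JSON text in the same shape as the World Bank response
% (two entries copied from the excerpt in this article; not live data).
jsonText = '[{"year":1901,"data":6.6187487},{"year":1902,"data":6.4643273}]';

% A JSON array of objects with identical fields decodes to a struct array.
S = jsondecode(jsonText);

disp(size(S))      % dimensions of the struct array
disp(S(1).year)    % year stored in the first structure
disp(S(1).data)    % average temperature stored in the first structure
```

Each element of S carries the same two fields, year and data, which is why indexing such as S(1) works on the result that webread returns.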



Display the temperature for the first year.


S(1)
ans = 

    year: 1901
    data: 6.6187


Plot the average temperatures. Concatenate S.year and S.data into arrays and plot them.


year = [S.year];
data = [S.data];
plot(year,data)
xlabel('Year');
ylabel('Temperature (Celsius)');
title('USA Average Temperatures')
axis tight





APIs and data provided by the World Bank: Climate Data API. (See World Bank: Climate Data API for more information on the API, and World Bank: Terms of Use.)


4.3 Specifying web service query parameters


Search File Exchange for files that contain the word simulink and were uploaded in the last seven days.



Specify the query parameters. webread appends the query parameter names and values to the URL. The File Exchange web service, not the webread function, defines the term and duration query parameters.


url = 'http://www.mathworks.com/matlabcentral/fileexchange/';
data = webread(url,'term','simulink','duration',7);


webread returns the HTML of the search results page as a character array.
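
To make "appends to the URL" concrete, the request URL for this example can be assembled by hand. This is a minimal sketch for illustration only: it assumes the simple case where no value needs percent-encoding, and the variable fullUrl is a name introduced here, not something webread exposes.

```matlab
% Base URL and the name-value pairs used in the File Exchange example.
url = 'http://www.mathworks.com/matlabcentral/fileexchange/';
term = 'simulink';
duration = 7;

% Join the pairs as name=value, separated by &, after a leading ?.
% Values with spaces or reserved characters would need percent-encoding,
% which this sketch does not handle.
fullUrl = sprintf('%s?term=%s&duration=%d', url, term, duration);
disp(fullUrl)
```

Pasting the resulting URL into a browser is a quick way to check that the web service accepts the parameter names before scripting the request.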


4.4 Specifying request options


Specify an additional request option to read data from the World Bank Climate Data API as a character array.



Create a weboptions object and set its ContentType to 'text'. The webread function then returns the JSON objects as a character array instead of converting them to a struct array. Display the beginning of the character array.


api = 'http://climatedataapi.worldbank.org/climateweb/rest/v1/';
url = [api 'country/cru/tas/year/USA'];
options = weboptions('ContentType','text');
data = webread(url,options);
data(1:62)
ans =

[{"year":1901,"data":6.6187487},{"year":1902,"data":6.4643273}


APIs and data provided by the World Bank: Climate Data API. (See World Bank: Climate Data API for more information on the API, and World Bank: Terms of Use.)


4.5 Reading data using a POST request


Send an HTTP POST request to search File Exchange for files that contain the word simulink and were uploaded in the last seven days.


url = 'http://www.mathworks.com/matlabcentral/fileexchange/';
options = weboptions('RequestMethod','post');
data = webread(url,'term','simulink','duration',7,options);


Many web services provide a POST method for requesting data in addition to the GET method.


4.6 Specifying a date and time as a query parameter


Read the December 2004 "Blue Marble: Next Generation" image from the NASA Earth Observations (NEO) Web Mapping Service.



Use a datetime object to specify the date of the requested image. Set the Format property of D to match the format that the web service requires.


url = 'http://neowms.sci.gsfc.nasa.gov/wms/wms';
D = datetime(2004,12,01,'Format','yyyy-MM-dd');
rgb = webread(url,'Time',D, ...
     'Service','WMS','Layers','BlueMarbleNG-TB','CRS','CRS:84', ...
     'Format','image/jpeg','Height',256,'Width',512, ...
     'BBOX','-180.0,-90.0,180.0,90.0','Version','1.3.0','Request','GetMap');
imshow(rgb)





webread converts the datetime object so that it can be used as the value of a web service query parameter. All of the name-value pairs in this example are query parameters defined by the NEO Web Mapping Service.
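
The effect of the Format property can be checked offline before sending a request. This sketch assumes that the text webread places in the query matches what char returns for the datetime; the variable names timeValue and queryPart are introduced here for illustration only.

```matlab
% Same datetime as in the example, formatted as year-month-day.
D = datetime(2004,12,1,'Format','yyyy-MM-dd');

% char renders the datetime using its Format property.
timeValue = char(D);
disp(timeValue)

% How the Time parameter would look as a name=value pair in a query.
queryPart = ['Time=' timeValue];
disp(queryPart)
```

If the displayed text does not match what the web service documents for its date parameter, adjust Format on the datetime rather than editing the text by hand.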



Blue Marble: Next Generation + Topography and Bathymetry images are provided by NASA Earth Observatory. The NEO Web Mapping Service (WMS) provides access to images and services. (See NASA Earth Observations for a list of acknowledgements and terms of use. For WMS query parameters, see the NASA Earth Observations website at https://neo.sci.gsfc.nasa.gov/about/wms.php.)


5. Input parameters


5.1 url - URL of the web service
character array


The URL of the Web service, specified as a character array. The Web service implements a RESTful interface. For more information, see RESTful.



Example: webread('http://www.mathworks.com/matlabcentral') reads a web page and returns its HTML as a character array.


5.2 QueryName1,QueryValue1,...,QueryNameN,QueryValueN - Web service query parameters
name-value pairs


Web service query parameters, specified as one or more pairs of name-value arguments. Each QueryName argument must specify the name of a query parameter. Each QueryValue argument must be a character array, or a numeric, logical, or datetime value, that specifies the value of the query parameter. Numeric, logical, and datetime values can be arrays. The web service defines the name-value pairs that it accepts as part of a request.



When you specify QueryValue as a datetime object, you must set its Format property so that it is consistent with the format required by the web service. If the Format property includes a time zone or offset and the datetime object has no time zone set, webread uses 'Local' as the time zone.



When QueryValue contains multiple values in an array, you might need to specify the ArrayFormat property of a weboptions object so that the array is encoded in the form the web service expects.
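
As a rough illustration of what these encodings look like, the sketch below builds the query text by hand for the four ArrayFormat values described in the weboptions documentation ('csv', 'json', 'repeating', and 'php'). The parameter name id and its values are hypothetical, and the strings are shown before any percent-encoding that an actual request would apply.

```matlab
% Hypothetical query parameter with three values.
name = 'id';
values = [1 2 3];

% Convert each value to text: {'1','2','3'}.
strs = arrayfun(@(v) sprintf('%d', v), values, 'UniformOutput', false);

% 'csv': one parameter, comma-separated values.
csvForm = [name '=' strjoin(strs, ',')];

% 'json': one parameter, values wrapped in square brackets.
jsonForm = [name '=[' strjoin(strs, ',') ']'];

% 'repeating': the parameter name is repeated for every value.
repeatingForm = strjoin(strcat(name, '=', strs), '&');

% 'php': like 'repeating', with [] appended to the name.
phpForm = strjoin(strcat(name, '[]=', strs), '&');

disp(csvForm)
disp(jsonForm)
disp(repeatingForm)
disp(phpForm)
```

Which form is correct depends entirely on the web service, so check its API documentation before choosing an ArrayFormat value.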



Example: webread('http://www.mathworks.com/matlabcentral/fileexchange/','term','webread') retrieves a list of files uploaded to File Exchange that contain the word webread.


5.3 options - Additional HTTP request options
weboptions object


Additional HTTP request options, specified as a weboptions object.



You can set the ContentType property of the weboptions object and pass the object as an input argument to webread. webread then returns the output data as that type. The following table lists the valid content types that you can specify in a weboptions object.


ContentType specifier / Output type

'auto' (default)
    Output type is determined automatically from the content type.

'text'
    Character vector for content types:
        text/plain
        text/html
        text/xml
        application/xml
        application/javascript
        application/x-javascript
        application/x-www-form-urlencoded
    If a web service returns a MATLAB file with a .m extension, the function returns its contents as a character vector.

'image'
    Numeric or logical matrix for image/format content. If the first output argument is an indexed image, the second output argument is the colormap and the third output argument is the alpha channel.
    For supported image formats, see the supported import and export file formats.

'audio'
    Numeric matrix for audio/format content, with a numeric scalar sample rate as the second output argument.
    For supported audio formats, see the supported import and export file formats.

'binary'
    uint8 column vector for binary content, that is, content that is not to be treated as type char.

'table'
    Scalar table object for spreadsheet and CSV (text/csv) content.

'json'
    char, numeric, logical, structure, or cell array for application/json content.

'xmldom'
    Java Document Object Model (DOM) node for text/xml or application/xml content. If ContentType is not specified, the function returns XML content as a character vector.

'raw'
    char column vector for 'text', 'xmldom', and 'json' content. The function returns all other content types as a uint8 column vector.


For all request options, which are properties of weboptions, see weboptions.
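
A weboptions object can be created and inspected without sending any request. The following is a minimal sketch: the 15-second Timeout is an arbitrary value chosen for illustration, not a recommendation from the source.

```matlab
% Create an options object that asks for text output and allows up to
% 15 seconds for the connection (15 is an illustrative assumption).
options = weboptions('ContentType','text','Timeout',15);

% Properties can be read back before the object is passed to webread.
disp(options.ContentType)
disp(options.Timeout)

% Properties can also be changed after creation.
options.RequestMethod = 'post';
disp(options.RequestMethod)
```

Building the options object once and reusing it across several webread calls keeps the request settings in one place.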


6. Output parameters




6.1 data - Content read from the web service
scalar | array | structure | table


The content read from the Web service is returned as a scalar, array, struct, or table.


6.2 colormap - Colormap associated with an indexed image
numeric array


The colormap associated with the indexed image, returned as a numeric array.


6.3 alpha - Alpha channel associated with an indexed image
numeric array


The alpha channel associated with the indexed image, returned as a numeric array.


6.4 Fs - Sample rate of audio data (in hertz)
positive numeric scalar


The sample rate (in hertz) of the audio data, returned as a positive numeric scalar.





7. Tips
    • For functionality not supported by the RESTful web service functions, see the HTTP interface.

    • webread supports the HTTP GET and POST methods. Many web services provide both GET and POST methods for requesting data. To send an HTTP POST request, set the RequestMethod property of options to 'post'. However, webread puts the query into the URL, not into the body of the request message. To put a query into the body, use webwrite.

    • For HTTP POST requests, the webread function supports only the application/x-www-form-urlencoded media type. To send a POST request with content of any other internet media type, use webwrite.

