Using Regular Expressions to Turn HTML Web Page Data into a Web Service

The idea here is simple. The Bank of China has a page for checking the day's exchange rates (http://www.bank-of-china.com/info/qpindex.shtml), but it is published only as traditional HTML; there is no XML version and no WebService query. If other information systems need to read this data at any time, the most convenient arrangement would be for the bank to provide a WebService interface for everyone to call, which would be a typical, safe WebService application. Unfortunately the bank has not done so. Can we do it ourselves? Of course: once an application has parsed the HTML page, reading the data is easy. And text analysis is a job for regular expressions (in fact, applying regular expressions is the real purpose of writing this program).

The page of the bank is similar to the following:

Date: 2004/09/30, valid until 2004/10/07

Currency name         Spot buying rate  Cash buying rate  Selling rate  Base price
GBP                          1488.1700         1453.1500     1492.6400
Hong Kong Dollar              105.9700          105.3300      106.2900    106.1100
USD                           826.4200          821.4500      828.9000    827.6600
Swiss Franc                   655.9300          641.1400      659.2200
Singapore Dollar              488.7600          477.2600      490.2300
Swedish krona                 112.4900          109.8400      112.8300
Danish krone                  136.5900          133.3700      137.0000
Norwegian krone               121.9500          119.0800      122.3100
Japanese Yen                    7.4344            7.3785        7.4717      7.4519
Canadian Dollar               650.8000          635.4800      652.7600
Australian dollar             591.9900          578.6400      594.9600
EUR                          1019.6400         1010.9600     1022.7000   1019.7000
Macau Dollar                  103.2200          102.6000      103.5300
Philippine Peso                14.6700           14.3300       14.7200
Thai Baht                      19.9000           19.4300       19.9600
New Zealand dollar            553.7000                        555.3600

After analyzing the page's HTML, I arrived at the regular expression below. It is certainly not perfect, but for the BOC's relatively fixed rate page it works without problems for now.

@"<tr bgcolor='#\w+'><td height='>(?<Currency>.*)</td>\s*" +
@"<td height='><p align='right'>(?<BankBuyTT>\d*.?\d*)(&nbsp;)+.?</td>\s*" +
@"<td height='><p align='right'>(?<BuyNotes>\d*.?\d*)(&nbsp;)+.?</td>\s*" +
@"<td height='><p align='right'>(?<Sell>\d*.?\d*)(&nbsp;)+.?</td>\s*" +
@"<td height='><p align='right'>(?<Base>\d*.?\d*)(&nbsp;)+.?</td>\s*"
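
To see how named groups in a pattern of this kind are read back, here is a minimal sketch. It does not use the bank's real markup: the HTML snippet, the simplified two-column pattern, and the names RateRowDemo and Extract are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RateRowDemo
{
    // Simplified pattern for the hypothetical markup in Main, not the bank's real HTML.
    static readonly Regex RowPattern = new Regex(
        @"<tr>\s*<td>(?<Currency>[^<]*)</td>\s*" +
        @"<td>(?<Sell>\d*\.?\d*)</td>\s*</tr>");

    // Returns one "currency=sell" string per matched table row.
    public static List<string> Extract(string html)
    {
        List<string> rows = new List<string>();
        for (Match m = RowPattern.Match(html); m.Success; m = m.NextMatch())
        {
            // Named groups are looked up by the name given in (?<Name>...).
            rows.Add(m.Groups["Currency"].Value + "=" + m.Groups["Sell"].Value);
        }
        return rows;
    }

    public static void Main()
    {
        // Hypothetical two-row table; the second row has an empty rate cell.
        string html =
            "<tr><td>USD</td><td>828.9000</td></tr>\n" +
            "<tr><td>NZD</td><td></td></tr>";
        foreach (string row in Extract(html))
        {
            Console.WriteLine(row);
        }
    }
}
```

A row whose rate cell is empty still matches, with an empty Sell capture. That is why the service code checks the length of each captured string before converting it to a number.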


With that, the rest is simple. I have always thought that code is its own best description, especially in an elegant language, so I will not say much more; the code follows.

This is the code for the WebService page ForeignExchange.asmx:

using System;
using System.Collections;
using System.ComponentModel;
using System.Data;
using System.Diagnostics;
using System.Web;
using System.Net;
using System.Web.Services;
using System.Xml;
using System.Text;
using System.Text.RegularExpressions;
using System.IO;

namespace ChinaBank
{
    /// <summary>
    /// Summary description for ForeignExchange.
    /// </summary>
    [WebService(Namespace="http://dancefires.com/ChinaBank/")]
    public class ForeignExchange : System.Web.Services.WebService
    {
        public ForeignExchange()
        {
            //CODEGEN: This call is required by the ASP.NET Web Services Designer
            InitializeComponent();
        }

        #region Component Designer generated code

        //Required by the Web Services Designer
        private IContainer components = null;

        /// <summary>
        /// Required method for Designer support - do not modify
        /// the contents of this method with the code editor.
        /// </summary>
        private void InitializeComponent()
        {
        }

        /// <summary>
        /// Clean up any resources being used.
        /// </summary>
        protected override void Dispose(bool disposing)
        {
            if (disposing && components != null)
            {
                components.Dispose();
            }
            base.Dispose(disposing);
        }

        #endregion

        [WebMethod]
        public XmlDataDocument GetForeignExchangeRates()
        {
            return GetXmlDoc();
        }
        [WebMethod]
        public DataSet GetForeignExchangeRatesDataSet()
        {
            return GetXmlDoc().DataSet;
        }
        [WebMethod]
        public string GetBankPage()
        {
            return GetWebContent("http://www.bank-of-china.com/info/whjrpj.html");
        }
        // Private methods
        private string GetWebContent(string url)
        {
            using (WebClient client = new WebClient())
            {
                // The bank page is served in GB2312, so decode the raw bytes explicitly.
                byte[] buffer = client.DownloadData(url);
                string str = Encoding.GetEncoding("GB2312").GetString(buffer, 0, buffer.Length);
                return str;
            }
        }
        private XmlDataDocument GetXmlDoc()
        {
            string webContent = GetWebContent("http://www.bank-of-china.com/info/whjrpj.html");

            // Prepare the DataSet
            DataSet ds = new DataSet("Exchange");
            DataTable dt = new DataTable("ForeignExchange");
            ds.Tables.Add(dt);
            dt.Columns.Add("Currency", typeof(string));
            dt.Columns.Add("BankBuyTT", typeof(double));
            dt.Columns.Add("BankBuyNotes", typeof(double));
            dt.Columns.Add("BankSell", typeof(double));
            dt.Columns.Add("Baseline", typeof(double));
            XmlDataDocument xmlDoc = new XmlDataDocument(ds);

            Regex expr = new Regex(
                @"<tr bgcolor='#\w+'><td height='>(?<Currency>.*)</td>\s*" +
                @"<td height='><p align='right'>(?<BankBuyTT>\d*.?\d*)(&nbsp;)+.?</td>\s*" +
                @"<td height='><p align='right'>(?<BuyNotes>\d*.?\d*)(&nbsp;)+.?</td>\s*" +
                @"<td height='><p align='right'>(?<Sell>\d*.?\d*)(&nbsp;)+.?</td>\s*" +
                @"<td height='><p align='right'>(?<Base>\d*.?\d*)(&nbsp;)+.?</td>\s*"
                , RegexOptions.Compiled);
            for (Match m = expr.Match(webContent); m.Success; m = m.NextMatch())
            {
                // An empty capture means the page shows no value; store 0 in that case.
                string key;
                DataRow row = dt.NewRow();
                row["Currency"] = m.Groups["Currency"].Value;
                key = m.Groups["BankBuyTT"].ToString();
                row["BankBuyTT"] = key.Length > 0 ? Convert.ToDouble(key) / 100 : 0;
                key = m.Groups["BuyNotes"].ToString();
                row["BankBuyNotes"] = key.Length > 0 ? Convert.ToDouble(key) / 100 : 0;
                key = m.Groups["Sell"].ToString();
                row["BankSell"] = key.Length > 0 ? Convert.ToDouble(key) / 100 : 0;
                key = m.Groups["Base"].ToString();
                row["Baseline"] = key.Length > 0 ? Convert.ToDouble(key) / 100 : 0;
                dt.Rows.Add(row);
            }
            return xmlDoc;
        }
    }
}
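
The conversion step in the loop above (an empty capture becomes 0, anything else is divided by 100) and the way a DataSet turns into XML can be tried in isolation. The following is only a sketch under stated assumptions: RateTableDemo, ParseRate and BuildSample are hypothetical names, the two sample rows are made up from the table shown earlier, and parsing with InvariantCulture is an addition that the original service code does not make.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class RateTableDemo
{
    // Empty capture -> 0; otherwise the quoted figure divided by 100,
    // mirroring the conversion used in the service code.
    public static double ParseRate(string text)
    {
        if (string.IsNullOrEmpty(text)) return 0;
        // Assumption: InvariantCulture, so "." is always read as the decimal point.
        return double.Parse(text, CultureInfo.InvariantCulture) / 100;
    }

    // Builds a small DataSet with the same set and table names as the service.
    public static DataSet BuildSample()
    {
        DataSet ds = new DataSet("Exchange");
        DataTable dt = ds.Tables.Add("ForeignExchange");
        dt.Columns.Add("Currency", typeof(string));
        dt.Columns.Add("BankSell", typeof(double));
        dt.Rows.Add("USD", ParseRate("828.9000"));
        dt.Rows.Add("NZD", ParseRate(""));
        return ds;
    }

    public static void Main()
    {
        // GetXml returns the rows as an XML string, one element per table row.
        Console.WriteLine(BuildSample().GetXml());
    }
}
```

The sketch uses DataSet.GetXml rather than XmlDataDocument only to keep it short; the service itself returns the XmlDataDocument or its DataSet directly.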

Calling the service from a client is just as easy once the corresponding WebService proxy has been generated from the WSDL. Because the server returns a DataSet, the client simply displays it in a DataGrid. There is nothing technically tricky on the client side.

using System;
using System.Threading;
using System.Drawing;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;

namespace BankDataClient
{
    /// <summary>
    /// Summary description for FrmMainBankRates.
    /// </summary>
    public class FrmMainBankRates : System.Windows.Forms.Form
    {
        private System.Windows.Forms.DataGrid dataGrid1;
        private System.Windows.Forms.Button btnConnect;
        private System.Data.DataSet ds;
        private BankDataClient.com.dancefires.www.ForeignExchange proxy = new BankDataClient.com.dancefires.www.ForeignExchange();
        private System.Windows.Forms.TextBox txtUrl;
        /// <summary>
        /// Required designer variable.
        /// </summary>
        private System.ComponentModel.Container components = null;

        public FrmMainBankRates()
        {
            //
            // Required for Windows Form Designer support
            //
            InitializeComponent();
            try
            {
                // Read the service URL from the application configuration file.
                txtUrl.Text = System.Configuration.ConfigurationSettings.AppSettings["url"];
                proxy.Url = txtUrl.Text;
            }
            catch (Exception)
            {
                // Fall back to the default service address.
                proxy.Url = "http://www.dancefires.com/ChinaBank/ForeignExchange.asmx";
                txtUrl.Text = proxy.Url;
            }
        }

        /// <summary>
        /// Clean up any resources being used.
        /// </summary>
        protected override void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (components != null)
                {
                    components.Dispose();
                }
            }
            base.Dispose(disposing);
        }

        #region Windows Form Designer generated code
        /// <summary>
        /// Required method for Designer support - do not modify
        /// the contents of this method with the code editor.
        /// </summary>
        private void InitializeComponent()
        {
            this.dataGrid1 = new System.Windows.Forms.DataGrid();
            this.ds = new System.Data.DataSet();
            this.btnConnect = new System.Windows.Forms.Button();
            this.txtUrl = new System.Windows.Forms.TextBox();
            ((System.ComponentModel.ISupportInitialize)(this.dataGrid1)).BeginInit();
            ((System.ComponentModel.ISupportInitialize)(this.ds)).BeginInit();
            this.SuspendLayout();
            //
            // dataGrid1
            //
            this.dataGrid1.DataMember = "";
            this.dataGrid1.DataSource = this.ds;
            this.dataGrid1.HeaderForeColor = System.Drawing.SystemColors.ControlText;
            this.dataGrid1.Location = new System.Drawing.Point(32, 48);
            this.dataGrid1.Name = "dataGrid1";
            this.dataGrid1.Size = new System.Drawing.Size(480, 256);
            this.dataGrid1.TabIndex = 0;
            //
            // ds
            //
            this.ds.DataSetName = "Exchange";
            this.ds.Locale = new System.Globalization.CultureInfo("zh-CN");
            //
            // btnConnect
            //
            this.btnConnect.Location = new System.Drawing.Point(432, 16);
            this.btnConnect.Name = "btnConnect";
            this.btnConnect.TabIndex = 1;
            this.btnConnect.Text = "Connection";
            this.btnConnect.Click += new System.EventHandler(this.btnConnect_Click);
            //
            // txtUrl
            //
            this.txtUrl.Location = new System.Drawing.Point(32, 16);
            this.txtUrl.Name = "txtUrl";
            this.txtUrl.Size = new System.Drawing.Size(384, 20);
            this.txtUrl.TabIndex = 2;
            this.txtUrl.Text = "";
            //
            // FrmMainBankRates
            //
            this.AutoScaleBaseSize = new System.Drawing.Size(5, 13);
            this.ClientSize = new System.Drawing.Size(544, 318);
            this.Controls.Add(this.txtUrl);
            this.Controls.Add(this.btnConnect);
            this.Controls.Add(this.dataGrid1);
            this.Name = "FrmMainBankRates";
            this.Text = "Foreign Exchange Rates of Bank of";
            ((System.ComponentModel.ISupportInitialize)(this.dataGrid1)).EndInit();
            ((System.ComponentModel.ISupportInitialize)(this.ds)).EndInit();
            this.ResumeLayout(false);

        }
        #endregion

        private void btnConnect_Click(object sender, System.EventArgs e)
        {
            UpdateDataGrid();
        }
        private void UpdateDataGrid()
        {
            try
            {
                // Lock the controls while the service call is in progress.
                btnConnect.Enabled = false;
                txtUrl.ReadOnly = true;
                proxy.Url = txtUrl.Text;
                ds = proxy.GetForeignExchangeRatesDataSet();
                dataGrid1.SetDataBinding(ds, "ForeignExchange");
                dataGrid1.Update();
            }
            catch (Exception err)
            {
                MessageBox.Show(err.Message);
            }
            finally
            {
                txtUrl.ReadOnly = false;
                btnConnect.Enabled = true;
            }
        }
        [STAThread]
        static void Main(string[] args)
        {
            Application.Run(new FrmMainBankRates());
        }
    }
}
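
The client above never parses XML itself; the generated proxy hands it a ready-made DataSet. As a rough stand-alone analogy (not the proxy's actual implementation), the sketch below writes a DataSet to XML together with its schema and reads it back, so the typed columns survive the trip. DataSetRoundTripDemo, RoundTrip and the sample row are assumptions for illustration.

```csharp
using System;
using System.Data;
using System.IO;

public static class DataSetRoundTripDemo
{
    // Serializes the DataSet with its schema, then rebuilds a copy from that XML.
    public static DataSet RoundTrip(DataSet source)
    {
        StringWriter writer = new StringWriter();
        source.WriteXml(writer, XmlWriteMode.WriteSchema);

        DataSet copy = new DataSet();
        copy.ReadXml(new StringReader(writer.ToString()), XmlReadMode.ReadSchema);
        return copy;
    }

    public static void Main()
    {
        // Hypothetical one-row table shaped like the service's result.
        DataSet ds = new DataSet("Exchange");
        DataTable dt = ds.Tables.Add("ForeignExchange");
        dt.Columns.Add("Currency", typeof(string));
        dt.Columns.Add("BankSell", typeof(double));
        dt.Rows.Add("USD", 8.289);

        DataSet copy = RoundTrip(ds);
        foreach (DataRow row in copy.Tables["ForeignExchange"].Rows)
        {
            Console.WriteLine("{0}: {1}", row["Currency"], row["BankSell"]);
        }
    }
}
```

Because the schema travels with the data, the table name "ForeignExchange" that the client passes to SetDataBinding is still present in the copy.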

With this example, you should be able to learn the basics of XML, WebService, regular expressions, DataSet, and DataGrid.

All of the code for the software, along with related screenshots, can be obtained from the following link:

http://www.dancefires.com/ChinaBank/
