The source code is as follows:
Assume the web page is test.html. The Part Information section in the last table is not fixed; it may contain one row or several.
How can I capture the values shown in blue? I am looking for a solution.
Reply to discussion (solution)
Loop over the table's tr rows and read the td values directly.
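The row-by-row approach could look roughly like the sketch below. It uses PHP's DOMDocument rather than a regular expression. The `$html` string is a hypothetical stand-in, since the real markup of test.html is not shown in the thread.

```php
<?php
// Hypothetical stand-in for test.html; the real page markup is not shown in the thread.
$html = '<table>
  <tr><td>Serial Number</td><td>Part number</td><td>Demand quantity</td></tr>
  <tr><td>1</td><td>12647212</td><td>60</td></tr>
  <tr><td>2</td><td>12654172</td><td>615</td></tr>
</table>';

$doc = new DOMDocument();
// Real pages are rarely valid HTML, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$rows = array();
foreach ($doc->getElementsByTagName('tr') as $tr) {
    $cells = array();
    foreach ($tr->getElementsByTagName('td') as $td) {
        // textContent drops any nested tags such as <font> or <b>.
        $cells[] = trim($td->textContent);
    }
    $rows[] = $cells;
}
print_r($rows);
```

To read the page from disk, `$doc->loadHTMLFile('test.html')` can replace the `loadHTML` call. Because each row is collected separately, a table with a varying number of rows needs no special handling.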
Is the blue marking present in the data that the page itself returns? If so, then:
|
Aaaaaa Aaaaaa |
Aaaaaa xxxx (aaaaaa) |
Aaaaaa xxxx |
Adress aaaaaa adress |
|
|
Delivery Schedule |
Planned arrival time |
PUS No. 770266110 version 00 |
Customer |
* DYNP-770266110-00 * |
|
|
Delivery Information |
Factory Plant |
Xxxxxx |
Pickup time Pick Up Time |
|
Supplier Feedback required Need Duns Response |
N |
Delivery date Delivery Date |
2013-09-16 |
Window time Window Time |
16:30 |
Unloading port Dock |
CC-70D |
Unloading port owner Dock Incharger |
Kkk |
Unloading port number Dock Tel |
011-1111 |
Unloading port address Dock Address |
Adress |
Delivery location Delivery Place |
|
Scheduler tracker Follow Up |
Kkkk |
Planned tracker Phone/Fax FollowUp Tel/Fax |
011-1111 |
Delivery instructions Delivery Note |
|
Part Information Part list |
Serial Number | Part number | Part description | Demand quantity | Promised quantity | Number of received instances | Number of packages | Number of bins | Bin No. | Real-Time bin No. | Number of real-time bins | Actual receiving bin No. | Number of Real receiving bins | Remarks |
1 | 12647212 | | 60 | 60 | | 15 | 4 | P000000D | | | | | |
2 | 12654172 | | 615 | 615 | | 15 | 41 | P000000D | | | | | |
';
$result = array();
preg_match_all('#(.*)#iUus', $string, $result);
print_r($result[1]);
If there is no blue marker (an id, a class, and so on), then the only option is to match all cells with a regular expression and pick out the values according to the page structure.
The page itself has no color distinction; I added the blue marking myself to show which values I want.
$s is your page content:
preg_match_all('#
#isU', $s, $r);
$r = array_map('trim', array_map('strip_tags', $r[0]));
print_r($r);
Array
(
    [0] =>
    [1] => aaaaaa
    [2] => aaaaaa xxxx (aaaaaa)
    [3] => aaaaaa xxxx
    [4] => adress aaaaaa adress
    [5] =>
    [6] => delivery schedule
    [7] => planned arrival time
    [8] => PUS No. 770266110 00
    [9] => Customer
    [10] => * DYNP-770266110-00 *
    [11] =>
    [12] => Delivery Information
    [13] => factory Plant
    [14] => xxxxxx
    [15] => pickup Time Pick Up Time
    [16] => 2013-09-09
    [17] => supplier feedback Need Duns Response
    [18] => N
    [19] => Delivery Date
    [20] => 2013-09-16
    [21] => Window Time
    [22] =>
    [23] => unloading port dock
    [24] => CC-70D
    [25] => unloading port owner Dock Incharger
    [26] => kkk
    [27] => unloading port telephone Dock Tel
    [28] => 011-1111
    [29] => unloading port Address Dock Address
    [30] => adress
    [31] => Delivery location Delivery Place
    [32] =>
    [33] => scheduler tracker Follow up
    [34] => KKKKK
    [35] => scheduler tracker Phone/Fax FollowUp Tel/Fax
    [36] => 011-1111
    [37] => Delivery instructions Delivery Note
    [38] =>
    [39] => Part Information Part list
    [40] => Serial Number
    [41] => Part number
    [42] => Part description
    [43] => Demand quantity
    [44] => Promised quantity
    [45] => Number of received instances
    [46] => Number of packages
    [47] => Number of bins
    [48] => Bin No.
    [49] => Real-Time bin No.
    [50] => Number of real-time bins
    [51] => Actual receiving bin No.
    [52] => Number of Real receiving bins
    [53] => Remarks
    [54] => 1
    [55] => 12647212
    [56] =>
    [57] => 60
    [58] => 60
    [59] =>
    [60] => 15
    [61] => 4
    [62] => P000000D
    [63] =>
    [64] =>
    [65] =>
    [66] =>
    [67] =>
    [68] => 2
    [69] => 12654172
    [70] =>
    [71] => 615
    [72] => 615
    [73] =>
    [74] => 15
    [75] => 41
    [76] => P000000D
    [77] =>
    [78] =>
    [79] =>
    [80] =>
    [81] =>
)
With that array, reading a particular item is not difficult, is it?
// The second table starts at index 40 and has 14 columns
$t = array_chunk(array_slice($r, 40), 14);
for ($i = 1; $i
Array
(
    [0] => Array
        (
            [Serial Number] => 1
            [Part number] => 12647212
            [Part description] =>
            [Demand quantity] => 60
            [Promised quantity] => 60
            [Number of received instances] =>
            [Number of packages] => 15
            [Number of bins] => 4
            [Bin No.] => P000000D
            [Real-Time bin No.] =>
            [Number of real-time bins] =>
            [Actual receiving bin No.] =>
            [Number of Real receiving bins] =>
            [Remarks] =>
        )
    [1] => Array
        (
            [Serial Number] => 2
            [Part number] => 12654172
            [Part description] =>
            [Demand quantity] => 615
            [Promised quantity] => 615
            [Number of received instances] =>
            [Number of packages] => 15
            [Number of bins] => 41
            [Bin No.] => P000000D
            [Real-Time bin No.] =>
            [Number of real-time bins] =>
            [Actual receiving bin No.] =>
            [Number of Real receiving bins] =>
            [Remarks] =>
        )
)
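The loop in the reply above is cut off after `$i`. Judging from the printed result and the later mention of array_combine, it presumably uses the header row as the keys for each data row. The sketch below shows that idea on a shortened three-column table. The sample data, `$offset`, `$columns` and `$rows` are assumptions for illustration, not the original code.

```php
<?php
// Flattened cells as produced by the cell-matching step, shortened to 3 columns.
$r = array(
    'Part Information Part list',
    'Serial Number', 'Part number', 'Demand quantity',
    '1', '12647212', '60',
    '2', '12654172', '615',
);

$offset  = 1; // index of the first header cell (40 in the thread's data)
$columns = 3; // cells per row (14 in the thread's data)

// Cut off everything before the header, then split the rest into rows.
$t = array_chunk(array_slice($r, $offset), $columns);

$rows = array();
for ($i = 1; $i < count($t); $i++) {
    // $t[0] is the header row; use its cells as the keys of every data row.
    $rows[] = array_combine($t[0], $t[$i]);
}
print_r($rows);
```

Because the loop runs over however many chunks exist, it copes with one part row or many.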
preg_match_all('#
#isU', $s, $r);
What does this regular expression do? Thank you!
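The markup between the `#` delimiters of the quoted pattern did not survive on this page, so its exact form is unknown. Assuming it matched table cells, for example `#<td.*</td>#isU`, the modifiers would work as follows: `i` ignores case, `s` lets `.` match newlines, and `U` makes `.*` lazy so each match stops at the first closing tag. A small sketch under that assumption:

```php
<?php
// Assumed pattern; the original markup between the # delimiters is not shown in the thread.
$s = "<TR><TD>Customer</TD><td>\n* DYNP-770266110-00 *\n</td></TR>";

// i: <td also matches <TD; s: "." also matches newlines; U: ".*" becomes lazy.
preg_match_all('#<td.*</td>#isU', $s, $r);
// $r[0] holds the full matches; strip the tags and surrounding whitespace.
$cells = array_map('trim', array_map('strip_tags', $r[0]));
print_r($cells);

// Without U the greedy ".*" runs on to the last </td>, so both cells collapse into one match.
preg_match_all('#<td.*</td>#is', $s, $greedy);
echo count($greedy[0]), "\n";
```

With `U` the two cells come back separately; without it there is a single match spanning both.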
If an item's position differs from page to page, how can I find it?
For example, * DYNP-770266110-00 * is sometimes at [10] and sometimes at [12].
The item in front of it always has the same value, though; only its key differs. For example, [9] => Customer.
That part is for you to work out.
Generally, labels and data come in pairs: the descriptive text comes first and the data follows it.
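One way to use that pairing is to search for the label and take the cell right after it, so the numeric index no longer matters. This is only a sketch: `valueAfterLabel` is a hypothetical helper name and the two arrays are made-up page variants.

```php
<?php
// Returns the cell that follows the first cell equal to $label, or null if there is none.
// valueAfterLabel is a hypothetical helper, not something from the thread.
function valueAfterLabel(array $cells, $label)
{
    $pos = array_search($label, $cells, true);
    if ($pos === false || !isset($cells[$pos + 1])) {
        return null;
    }
    return $cells[$pos + 1];
}

// Two page variants in which the same pair sits at different indexes.
$pageA = array('', 'Delivery Schedule', 'Customer', '* DYNP-770266110-00 *');
$pageB = array('', '', '', 'Delivery Schedule', 'Customer', '* DYNP-770266110-00 *');

echo valueAfterLabel($pageA, 'Customer'), "\n";
echo valueAfterLabel($pageB, 'Customer'), "\n";
```

This relies on the label text being unique and identical across pages; if a label can appear twice, the search would need to start from a known position.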
If the page from the first post contains one more table, like the one below, array_combine gives a warning.
Supplier Signature Carrier Signature Supplier Signature _____________ CarrierSignature _____________ |
Supplier Confirm Time Supplier confirmation time 13-09-10 |
Receiver Signature Receiver signature _______________ |
Date ______________ |
*** End of page *** |
How can I filter out that table's information?
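array_combine needs both arrays to have the same number of elements. Cells from an extra table can leave a chunk of the wrong length, which produces a warning before PHP 8 and a ValueError from PHP 8 on. One possible filter, sketched below on a shortened three-column table, stops at the first chunk that does not look like a part row. The sample data and the rule that a part row starts with a numeric serial number are assumptions.

```php
<?php
$r = array(
    'Serial Number', 'Part number', 'Demand quantity', // header row
    '1', '12647212', '60',
    '2', '12654172', '615',
    // cells of the extra table that should be ignored
    'Supplier Signature', 'Supplier Confirm Time', 'Receiver Signature', 'Date',
);

$t = array_chunk($r, 3);
$header = array_shift($t);

$rows = array();
foreach ($t as $chunk) {
    // Stop at the first chunk that is not a part row:
    // wrong number of cells, or a first cell that is not a numeric serial number.
    if (count($chunk) !== count($header) || !ctype_digit($chunk[0])) {
        break;
    }
    $rows[] = array_combine($header, $chunk);
}
print_r($rows);
```

The length check alone is not enough, because the extra table may happen to fill a whole chunk; the serial-number check covers that case.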