PHP: Building a Baidu Dictionary Word Query Collector

This article shows how to build a word query collector for Baidu Dictionary (Baidu dict) in PHP.

A sample collector for Baidu dict

The class collects all of the result data that Baidu dict returns when it translates a word. The project also ships with a word list of about 135,000 words (13.5w) and a simple collection example. The main class, dict.class.php, is published here. The project address is http://github.com/widuu/baidu_dict; if you need it, you can fork it directly. Few people have a use for a tool like this, so anyone who finds it useful is welcome to take it.

<?php
/**
 * dict.class.php: collects translation content from Baidu Dictionary
 *
 * @copyright (C) 2014 widuu
 * @license   http://www.widuu.com
 * @lastmodify 2-2-15
 */

header("Content-Type: text/html; charset=utf-8");

class Dict {

    private $word;

    // Number of items to display
    private static $num = 10;

    public function __construct() {}

    /**
     * Public method that returns all data collected from Baidu for a word
     *
     * @param string $word an English word
     * @return array(
     *     "symbol"  => phonetic symbols
     *     "pro"     => pronunciation
     *     "example" => example sentences
     *     "explain" => concise definitions
     *     "synonym" => synonyms and antonyms
     *     "phrase"  => phrase array
     * )
     */
    public function content($word) {
        $this->word = $word;
        $symbol  = $this->Pronounced();
        $pro     = $this->getSay();
        $example = $this->getExample();
        $explain = $this->getExplain();
        $synonym = $this->getSynonym();
        $phrase  = $this->getPhrase();
        $result = array(
            "symbol"  => $symbol,   // phonetic symbols
            "pro"     => $pro,      // pronunciation
            "example" => $example,  // example sentences
            "explain" => $explain,  // concise definitions
            "synonym" => $synonym,  // synonyms and antonyms
            "phrase"  => $phrase    // phrase array
        );
        return $result;
    }

    /**
     * Fetch the Baidu translation page remotely, using cURL
     *
     * @return string
     */
    private function getContent() {
        $useragent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0";
        $ch  = curl_init();
        $url = "http://dict.baidu.com/s?wd=" . $this->word;
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_HTTPGET, 1);
        curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        $result = curl_exec($ch);
        // Report any transfer error on the same handle that was executed
        if (curl_errno($ch)) {
            echo 'errno' . curl_error($ch);
        }
        curl_close($ch);
        return $result;
    }

    /**
     * Get the phonetic symbols from the Baidu translation page
     *
     * @return array (British, American)
     */
    private function Pronounced() {
        $data = $this->getContent();
        preg_match_all("/\"EN\-US\"\>(.*)\<\/b\>/Ui", $data, $pronounced);
        return array(
            'en' => $pronounced[1][0],
            'us' => $pronounced[1][1]
        );
    }

    /**
     * Get the pronunciation (audio URLs) from the Baidu translation page
     *
     * @return array (British, American)
     */
    private function getSay() {
        $data = $this->getContent();
        preg_match_all("/url=\"(.*)\"/Ui", $data, $pronounced);
        return array(
            'en' => $pronounced[1][0],
            'us' => $pronounced[1][1]
        );
    }

    /**
     * Get the example sentences from the Baidu translation page
     *
     * @return array multi-dimensional array of examples
     */
    private function getExample() {
        $str  = "";
        $data = $this->getContent();
        preg_match_all("/var example_data = (.*)\]\;/Us", $data, $example);
        $data1 = "[[[" . ltrim($example[1][0], "[");
        $data2 = explode("[[", $data1);
        $num   = count(array_filter($data2));
        foreach ($data2 as $key => $value) {
            $data3 = explode("[[", "[" . $value);
            foreach ($data3 as $k => $v) {
                preg_match_all("/\[\"(.*)\",/Us", "[" . $v, $match);
                if (!empty($match[1])) {
                    // implode() takes the separator first
                    $str .= implode("", $match[1]) . "@";
                }
            }
        }
        $data4  = trim($str, "@");
        $data5  = explode("@", $data4);
        // Pair each sentence with its translation
        $result = array_chunk($data5, 2);
        return $result;
    }

    /**
     * Get the concise definitions
     *
     * Note: HTML tags inside the patterns were stripped from this listing;
     * (?#tag lost) marks each gap.
     *
     * @return array (x => "part of speech", b => "related forms")
     */
    private function getExplain() {
        $data = $this->getContent();
        preg_match_all("/id\=\"en\-simple\-means\"\>(.*)(?#tag lost)/Us", $data, $explain);
        $r_data = $explain[1][0];
        preg_match_all("/(?#tag lost)(?#tag lost)(?P<adj>.*)\<\/strong\>(?#tag lost)(?P<name>.*)\<\/span\>\<\/p\>/Us", $r_data, $a_data);
        preg_match_all("/(?#tag lost)(?P<tag>[^\>]+)\:(?#tag lost)(?P<word>.*)\<\/a\>\<\/span\>/Us", $r_data, $b_data);
        $result = array();
        foreach ($a_data["adj"] as $key => $value) {
            $result[$value] = $a_data["name"][$key];
        }
        $word_b = array();
        foreach ($b_data["tag"] as $key => $value) {
            $word_b[$value] = strip_tags($b_data["word"][$key]);
        }
        $result_data = array("x" => $result, "b" => $word_b);
        return $result_data;
    }

    /**
     * Get the synonyms and antonyms
     *
     * Note: HTML tags inside the patterns and the explode() delimiter were
     * stripped from this listing; (?#tag lost) and "" mark the gaps.
     *
     * @return array (0 => synonyms, 1 => antonyms), generally a multi-dimensional array
     */
    private function getSynonym() {
        $data = $this->getContent();
        preg_match_all("/id=\"en\-syn\-ant\"\>(.*)(?#tag lost)/Us", $data, $synonym);
        $content = $synonym[1][0];
        $data1   = explode("", $content);
        $result  = array();
        $data2   = array();
        foreach ($data1 as $key => $value) {
            preg_match_all("/(?#tag lost)(?P<adj>.*)\;\<\/strong\>\<\/p\>(?#tag lost)(?#tag lost)(?P<content>.*)\<\/ul\>/Us", $value, $r_data);
            $data2[$key]["adj"]     = $r_data["adj"];
            $data2[$key]["content"] = $r_data["content"];
        }
        foreach ($data2 as $key => $value) {
            foreach ($value["content"] as $k => $v) {
                if (!empty($v)) {
                    preg_match_all("/(?#tag lost)(?#tag lost)(?P<title>.*)\<\/p\>(?P<value>.*)\<\/li>/Us", $v, $v_data);
                    foreach ($v_data['title'] as $m => $d) {
                        // The body of the "<>" pattern is also missing from this listing
                        $data = strip_tags(preg_replace("<>", "", $v_data["value"][$m]));
                        $result[$key][$value["adj"][$k]][$d] = $data;
                    }
                }
            }
        }
        return $result;
    }

    /**
     * Get the phrases
     *
     * Note: HTML tags used in the pattern, as explode() delimiters, and as
     * str_replace() search strings were stripped from this listing;
     * (?#tag lost) and "" mark the gaps.
     *
     * @return array (key => value) one-dimensional or multi-dimensional array
     */
    private function getPhrase() {
        $num  = self::$num;
        $data = $this->getContent();
        preg_match_all("/id=\"en\-phrase\"\>(.*)(?#tag lost)/Us", $data, $phrase);
        $data   = explode("", $phrase[1][0]);
        $data1  = array_slice($data, 0, $num);
        $result = array();
        foreach ($data1 as $key => $value) {
            $data2 = explode("", $value);
            $n = count($data2);
            if ($n <= 3) {
                $result[str_replace("", "", strip_tags($data2[0]))] = strip_tags($data2[1]);
            } else {
                $data3 = array_slice($data2, 0, $n - 1);
                $data4 = array_slice($data2, 0, 2);
                $res   = array_diff($data3, $data4);
                $data5 = array_chunk($res, 2);
                $key_value = trim(str_replace("", "", strip_tags($data4[0])));
                $result[$key_value] = strip_tags($data4[1]);
                foreach ($data5 as $key => $value) {
                    foreach ($value as $k => $v) {
                        $value[$k] = strip_tags($v);
                    }
                    $array = array($result[$key_value], $value);
                    if (array_key_exists($key_value, $result)) {
                        $result[$key_value] = $array;
                    }
                }
            }
        }
        return $result;
    }

    /**
     * Convert an array to a string
     *
     * @param array $data       the array to convert
     * @param bool  $isformdata if 0, new_stripslashes is not applied; optional, defaults to 1
     * @return string the exported string, or an empty string if $data is empty
     */
    private function array2string($data, $isformdata = 1) {
        // Compare, do not assign
        if ($data == '') return '';
        if ($isformdata) $data = $this->new_stripslashes($data);
        return addslashes(var_export($data, true));
    }

    /**
     * Return a string or array with stripslashes applied recursively
     *
     * @param mixed $string the string or array to process
     * @return mixed
     */
    private function new_stripslashes($string) {
        if (!is_array($string)) return stripslashes($string);
        foreach ($string as $key => $val) $string[$key] = $this->new_stripslashes($val);
        return $string;
    }

}

// Usage: the constructor takes no arguments; the word is passed to content()
// $word = new Dict();
// $word->content("express");
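
Most of the getters in the class follow the same pattern: fetch the page, then pull the fields out with preg_match_all and an ungreedy (U) pattern. The sketch below illustrates that pattern on an inline string, so it runs without a network connection. The sample markup and the helper name extractPhonetics are made up for illustration; the real Baidu page may be structured differently.

```php
<?php
// Hypothetical sample markup for illustration only; not copied from Baidu.
$html = '<span>UK <b lang="EN-US">[ɪkˈspres]</b></span>'
      . '<span>US <b lang="EN-US">[ɪkˈsprɛs]</b></span>';

/**
 * Extract the two phonetic transcriptions from a page fragment.
 * extractPhonetics is an illustrative helper, not part of the original class.
 */
function extractPhonetics(string $html): array
{
    // U makes (.*) ungreedy, so each match stops at the nearest </b>;
    // i makes the match case-insensitive.
    preg_match_all('/"EN-US">(.*)<\/b>/Ui', $html, $m);

    // Fall back to null instead of reading an undefined offset.
    return array(
        'en' => $m[1][0] ?? null,
        'us' => $m[1][1] ?? null,
    );
}

$phonetics = extractPhonetics($html);
print_r($phonetics);
```

The null fallback matters because the class reads $pronounced[1][0] directly, which raises a warning whenever the page layout changes and the pattern stops matching.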

That is the whole class. It is quite practical, and I hope you find it useful.
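
One thing to keep in mind when reusing the class: each of the six getters calls getContent(), so a single content() call downloads the same page six times, and the word is appended to the URL without encoding. Here is a minimal sketch of one way to avoid both, assuming a hypothetical CachedPage class with an injectable fetch callback (neither is part of the original project):

```php
<?php
/**
 * Illustrative sketch only: CachedPage and its $fetch callback are
 * hypothetical names, not part of the original baidu_dict project.
 */
class CachedPage
{
    private $fetch;            // callable(string $url): string
    private $cache = array();  // word => page body

    public function __construct(callable $fetch)
    {
        $this->fetch = $fetch;
    }

    /** Return the page for $word, fetching it at most once. */
    public function get(string $word): string
    {
        if (!isset($this->cache[$word])) {
            // urlencode() keeps spaces and non-ASCII characters valid in the query string.
            $url = "http://dict.baidu.com/s?wd=" . urlencode($word);
            $this->cache[$word] = call_user_func($this->fetch, $url);
        }
        return $this->cache[$word];
    }
}

// Stand-in fetcher that records the requested URLs instead of using the network.
$requested = array();
$page = new CachedPage(function (string $url) use (&$requested): string {
    $requested[] = $url;
    return "<html>stub for $url</html>";
});

$first  = $page->get("ice cream");
$second = $page->get("ice cream");
```

In the real class, the callback would wrap the cURL code from getContent(), and each getter would read from the cached page rather than fetching again.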
