Tutorial on using Mechanize in Ruby
In Ruby, web scraping is usually done with Mechanize, which is very simple to use.
Install
sudo gem install mechanize
Fetch web pages
require 'rubygems'
require 'mechanize'
agent = Mechanize.new
page = agent.get('http://google.com/')
Simulate click events
page = agent.page.link_with(:text => 'News').click
Simulate form submission
google_form = page.form('f')
google_form["q"] = 'ruby mechanize'
page = agent.submit(google_form, google_form.buttons.first)
pp page
Parse the page
Mechanize uses Nokogiri to parse web pages, so you can refer to the Nokogiri documentation.
table = page.search('a')
text = table.inner_text
puts text
Note: if the page requires a login, you can log in to the website in a browser first, copy the JSESSIONID cookie value, and assign it to the agent.
cookie = Mechanize::Cookie.new("JSESSIONID", "BA58528B76124698AD033EE6DF12B986:-1")
cookie.domain = "datamirror.csdb.cn"
cookie.path = "/"
agent.cookie_jar.add!(cookie)
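When the session cookie is copied from a browser, it often arrives as a whole Cookie header rather than a single value. The following is a minimal sketch, using only plain Ruby, of how such a header could be split into name/value pairs before building the cookie. The helper name parse_cookie_header and the extra lang=en pair are assumptions made for illustration and are not part of Mechanize.

```ruby
# Hypothetical helper (not part of Mechanize): split the value of a browser
# "Cookie:" header into a Hash of name => value pairs.
def parse_cookie_header(header)
  header.split(';').each_with_object({}) do |pair, cookies|
    # Split on the first "=" only, since cookie values may contain "=".
    name, value = pair.strip.split('=', 2)
    next if name.nil? || name.empty? || value.nil?
    cookies[name] = value
  end
end

# Illustrative header; "lang=en" is made up for the example.
header = 'JSESSIONID=BA58528B76124698AD033EE6DF12B986:-1; lang=en'
cookies = parse_cookie_header(header)
puts cookies['JSESSIONID']
```

The extracted JSESSIONID value can then be passed to Mechanize::Cookie.new in the same way as the hard-coded string in the snippet above.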
If you need to save the web page, use save_as (or save; I have not tried it). For example:
agent.get("http://google.com").save_as
Tips
puts Mechanize::AGENT_ALIASES prints all available user agent aliases.
puts Mechanize.instance_methods(false) prints the methods defined by the Mechanize class itself.
puts Mechanize.instance_methods prints all methods of the Mechanize class, including those inherited from its ancestors.
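The difference between the two instance_methods calls is easier to see on a small class hierarchy. The sketch below uses two hypothetical classes, BaseClient and Crawler, invented purely for illustration. It does not need Mechanize to run.

```ruby
# Hypothetical classes, used only to illustrate the two calls.
class BaseClient
  def connect; end
end

class Crawler < BaseClient
  def fetch; end
  def parse; end
end

# Passing false limits the list to methods defined directly on Crawler.
own = Crawler.instance_methods(false)
# Without an argument, inherited methods (BaseClient, Object, Kernel) are included too.
all = Crawler.instance_methods

puts own.sort.inspect        # => [:fetch, :parse]
puts all.include?(:connect)  # => true
puts own.include?(:connect)  # => false
```

The same applies to Mechanize: the call with false gives a short list that is useful for exploring the class, while the call without arguments also includes everything every Ruby object responds to.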