I recently used rvest to write a small program for someone that periodically scrapes Amazon prices and inventory and compares them with the previous prices. It is the first complete program I have written recently, and it involves a fair amount of error handling.
I mainly referred to the following questions and answers on StackOverflow:
- How to skip an error in a loop
- Skip to next value of loop upon error in R
For tryCatch, I looked for further information and found the following blog post: 1. Using tryCatch for simple error handling in the R language
The following are code examples:
1) Use the tryCatch function to skip past an error. (The example uses download.file.)
Look at the following code. A batch of Amazon product pages needs to be downloaded in bulk. If a product ID is wrong, or the IP has been rate-limited, the page will not open and download.file will throw an error. I use tryCatch here to capture the error message when the page cannot be opened, so that the loop carries on with the next iteration instead of stopping.
for (n in 1:length(productlink)) {
  tryCatch({
    download.file(productlink[n], paste0(getwd(), "/", productid[n, ], ".html"), cacheOK = TRUE)
  }, error = function(e) {cat("ERROR:", conditionMessage(e), "\n")})
  # Added Sys.sleep(seconds) so that each iteration of the loop pauses briefly.
  # This slows the program down, but it is a good idea for sites that restrict access.
  Sys.sleep(0.5)
}
The above example relies on two important functions, namely tryCatch and cat.
Looking up the function: tryCatch belongs to the base package and is part of R's condition system. The blog post "Using tryCatch for simple error handling in the R language" gives a simple outline of tryCatch as follows:
result = tryCatch(
  {expr},
  warning = function(w) {warning-handler-code},
  error = function(e) {error-handler-code},
  finally = {cleanup-code}
)
That is, the warning handler says what to do when a warning is raised, and the error handler says what to do when an error is raised. If no condition is signaled, the value of expr is returned. If there is a finally clause, it always runs before tryCatch returns, so anything it prints appears before the printed value of expr:
tryCatch({
  a <- "C"
  b <- "C"
  b == a
}, error = function(e) {cat("hahaha", conditionMessage(e), "\n")},
finally = {print("CCC")})
[1] "CCC"
[1] TRUE
tryCatch({
  a <- "C"
  cc == a  # cc does not exist
}, error = function(e) {cat("hahaha", conditionMessage(e), "\n")},
finally = {print("CCC")})
hahaha object 'cc' not found
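To make the return values concrete, here is a minimal sketch that needs no network access. The helper name describe and its inputs are made up for illustration; only base R functions are used.

```r
# Hypothetical helper: runs f() and reports how the evaluation ended.
describe <- function(f) {
  tryCatch(
    {
      value <- f()
      paste("value:", value)            # returned when nothing is signaled
    },
    warning = function(w) paste("warning:", conditionMessage(w)),
    error   = function(e) paste("error:", conditionMessage(e)),
    finally = cat("finally runs every time\n")
  )
}

ok   <- describe(function() 1 + 1)             # no condition
warn <- describe(function() as.integer("x"))   # raises a coercion warning
err  <- describe(function() stop("boom"))      # raises an error
print(c(ok, warn, err))
```

In each case the value of tryCatch is whatever the matching branch returns, which is why a handler that only calls cat leaves nothing useful to store.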
In the download example, if the download succeeds the file is saved as usual; if it fails, the handler error = function(e) {cat("ERROR:", conditionMessage(e), "\n")} runs instead.
Then there is the cat function. cat is an output function: here it asks R to print "ERROR:" followed by conditionMessage(e). After that, the loop moves on to the next iteration.
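The same skip-and-continue pattern can be tried without touching a website. In this sketch, fetch_one is a hypothetical stand-in for download.file that fails for IDs containing a space, and the product IDs are invented. The handler returns NA so the failed items can be listed afterwards.

```r
# Hypothetical stand-in for download.file(): fails when the ID contains a space.
fetch_one <- function(id) {
  if (grepl(" ", id, fixed = TRUE)) stop("cannot open page for '", id, "'")
  paste0(id, ".html")
}

ids <- c("B001", "B0 02", "B003")     # made-up product IDs
saved <- character(length(ids))

for (n in seq_along(ids)) {
  saved[n] <- tryCatch(
    fetch_one(ids[n]),
    error = function(e) {
      cat("ERROR:", conditionMessage(e), "\n")
      NA_character_                    # value of tryCatch() when the call fails
    }
  )
}

failed <- ids[is.na(saved)]            # IDs that need a second look
print(saved)
print(failed)
```

Recording NA for a failed item keeps the result vector the same length as the input, which makes it easy to retry only the failures later.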
In addition, there is a more interesting application in an answer by Mmann1123 on StackOverflow.
2) Use an if statement together with stop.
That is, if a condition is not met, stop the program and print the message passed to stop. I mainly use this to check whether the original product IDs were entered correctly.
if (sum(check) != length(productlink)) {
  # Clear the inputs so that nothing downstream runs on bad data.
  productlink <- NULL
  productid <- NULL
  stop("invalid ProductID double check if any space or else in, and resave the file or the script would not run")
}
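A small sketch of the same check written as a reusable function. The name check_ids and the rule that a valid ID contains no whitespace are assumptions made for this example, not the original program's actual check.

```r
# Hypothetical validator: stops with a message naming the bad IDs.
check_ids <- function(ids) {
  ok <- !grepl("[[:space:]]", ids)     # assumed rule: no whitespace in an ID
  if (sum(ok) != length(ids)) {
    stop("invalid product ID(s): ", paste(ids[!ok], collapse = ", "))
  }
  invisible(TRUE)
}

check_ids(c("B001", "B002"))           # passes silently

# The text given to stop() is what conditionMessage() returns.
msg <- tryCatch(check_ids(c("B001", "B0 02")),
                error = function(e) conditionMessage(e))
print(msg)
```

Naming the offending IDs in the message saves a search through the input file when the script refuses to run.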
3) When combining bulk-read data with data.frame, a missing element makes data.frame throw an error.
For example, if a is NULL, data.frame raises an error.
a <- NULL
b <- c("CC", "DD")
data.frame(a, b)
Error in data.frame(a, b) : arguments imply differing number of rows: 0, 2
Therefore, in the loop you need to build a separate data.frame for each item and then use rbind to combine them, and you can assign a placeholder value to anything abnormal. As in the code below, if the product name does not exist in the page I fetched, then length(productname) == 1 is FALSE and productname is set directly to "product not download or not existing". The field is then neither NULL nor 2-3 rows long but exactly 1 row, so combining it into the data.frame no longer raises an error.
data <- function(n) {
  #### The code that obtains productname/price/category is hidden here
  if (!length(productname) == 1) {
    productname <- "product not download or not existing"
  }
  if (!length(price) == 1) {
    price <- NA
    category <- "product not download or not existing"
  }
  # The data.frame is built here. If the three fields do not have the same number
  # of rows (a value is NULL, or a field has 2-3 rows), data.frame raises an error.
  # The benefit of the if checks above is that productname, price and category are
  # guaranteed to be 1 row each, so they can be combined with data.frame, and
  # outliers still produce an output row.
  data.frame(productname, price, category)
}
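Putting the pieces together, here is a hedged sketch of building one row per product and combining the rows with rbind. The helper one_row, the list pages, and its values are invented for illustration; the real program would fill them from the scraped page. Unlike the function above, this sketch checks category on its own.

```r
# Hypothetical helper: forces each field to length 1 before building the row.
one_row <- function(productname, price, category) {
  missing_text <- "product not download or not existing"
  if (length(productname) != 1) productname <- missing_text
  if (length(price) != 1)       price <- NA_real_
  if (length(category) != 1)    category <- missing_text
  data.frame(productname, price, category, stringsAsFactors = FALSE)
}

# Made-up scraped results; the second page yielded nothing (NULL fields).
pages <- list(
  list(productname = "Kettle", price = 19.99, category = "Kitchen"),
  list(productname = NULL,     price = NULL,  category = NULL)
)

rows <- lapply(pages, function(p) one_row(p$productname, p$price, p$category))
result <- do.call(rbind, rows)         # stack the one-row data frames
print(result)
```

Because every row has exactly the same three columns, rbind never sees mismatched lengths, and the placeholder row shows which product needs checking.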
Previously I did not understand the tryCatch function, so I handled errors with methods 2 and 3. Looking at it now, it seems tryCatch can do a lot more?
I am writing this down for reference when writing code later.
In addition, there are similar try/catch constructs in Java and the C family of languages. It seems that R, in the end, still cannot escape the lower-level languages.
R language: three examples of handling outliers or errors