A recent project required reading data files in R, so I have summarized the methods that proved useful and that I have verified, to help you avoid the same pitfalls next time.
1. Reading TXT files in R
R can read TXT files directly with the read.table() function; no additional packages need to be loaded.
read.table("/home/slave/test.txt", header = TRUE, na.strings = c("NA"))
Here na.strings = c("NA") means that missing values in the file are written as NA. When reading a text file, the default separator is whitespace (sep = ""), that is, one or more spaces, tabs, or newlines. The full parameter list is as follows:
read.table(file, header = FALSE, sep = "", quote = "\"'",
           dec = ".", numerals = c("allow.loss", "warn.loss", "no.loss"),
           row.names, col.names, as.is = !stringsAsFactors,
           na.strings = "NA", colClasses = NA, nrows = -1,
           skip = 0, check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#",
           allowEscapes = FALSE, flush = FALSE,
           stringsAsFactors = default.stringsAsFactors(),
           fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)
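As a minimal sketch of the call above, the following writes a small space-separated file to a temporary location and reads it back. The file contents and the column names (id, score, group) are made up for illustration, and stringsAsFactors = FALSE is passed explicitly so the result does not depend on the R version.

```r
# Write a small space-separated file to a temporary location (hypothetical data).
tmp <- tempfile(fileext = ".txt")
writeLines(c("id score group",
             "1 3.5 a",
             "2 NA b",
             "3 4.25 a"), tmp)

# header = TRUE takes the column names from the first line;
# na.strings = "NA" marks the literal text NA as a missing value.
df <- read.table(tmp, header = TRUE, na.strings = "NA",
                 stringsAsFactors = FALSE)

str(df)                     # inspect column types
print(sum(is.na(df$score))) # count missing scores

unlink(tmp)                 # remove the temporary file
```

Because sep is left at its default of "", any run of spaces or tabs between fields is treated as a single separator.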
2. Reading CSV files in R
Reading a CSV file in R is similar to reading a TXT file. It uses the read.csv() function, and most of the parameters are used the same way.
read.csv("/home/slave/test.csv", header = TRUE, na.strings = c("NA"))
When reading a CSV file, the separator is "," (which goes without saying). The full parameter list is as follows:
read.csv(file, header = TRUE, sep = ",", quote = "\"",
         dec = ".", fill = TRUE, comment.char = "", ...)
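The defaults listed above can be seen in a small sketch. The data here is hypothetical: one field contains a comma inside double quotes, and one cell holds the spreadsheet error text #DIV/0!, which is added to na.strings as an assumption about how you might want to treat it.

```r
# Write a small CSV file to a temporary location (hypothetical data).
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c('name,city,value',
             '"Smith, J",Boston,10',
             'Lee,NA,#DIV/0!'), tmp_csv)

# quote = "\"" (the default) keeps the comma inside "Smith, J" in one field.
# comment.char = "" (the default) means "#" does not start a comment,
# so "#DIV/0!" is read as a cell value and then mapped to NA by na.strings.
csv_df <- read.csv(tmp_csv, header = TRUE,
                   na.strings = c("NA", "#DIV/0!"),
                   stringsAsFactors = FALSE)

print(csv_df)
unlink(tmp_csv)  # remove the temporary file
```

With read.table() and its default comment.char = "#", the same cell would be treated as the start of a comment, which is one reason to prefer read.csv() for files exported from spreadsheets.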
3. Reading XLS and XLSX files in R
There are many ways to read XLS and XLSX files, but many of them do not work particularly well. For example, the XLS reading method in the RODBC package is not very convenient and sometimes runs into a variety of problems. After falling into a few of these pits, I found two relatively useful packages for reading XLS files, which I describe separately below.
gdata
install.packages("gdata")
library(gdata)
read.xls("/home/slave/test.xls", sheet = 1, na.strings = c("NA", "#DIV/0!"))
Here sheet = 1 means reading the contents of the first sheet, and na.strings = c("NA", "#DIV/0!") means that both "NA" and "#DIV/0!" are treated as missing data. The full parameter list of read.xls() is as follows:
read.xls(xls, sheet = 1, verbose = FALSE, pattern, na.strings = c("NA", "#DIV/0!"),
         ..., method = c("csv", "tsv", "tab"), perl = "perl")
read.xls() is only one of the functions in the gdata package. The package has other useful functions as well, such as converting XLS to CSV or XLS to TXT. Their signatures are:
xls2csv(xls, sheet = 1, verbose = FALSE, blank.lines.skip = TRUE, ..., perl = "perl")
xls2tab(xls, sheet = 1, verbose = FALSE, blank.lines.skip = TRUE, ..., perl = "perl")
xls2tsv(xls, sheet = 1, verbose = FALSE, blank.lines.skip = TRUE, ..., perl = "perl")
xls2sep(xls, sheet = 1, verbose = FALSE, blank.lines.skip = TRUE, ...,
        method = c("csv", "tsv", "tab"), perl = "perl")
The gdata package has many features, but it depends heavily on other packages, and a variety of unpredictable problems can occur. A package with fewer dependencies is described below.
readxl
install.packages("readxl")
library(readxl)
read_excel("/home/slave/test.xls", sheet = 1, na = "NA")
One thing to note is that na = "NA" is written slightly differently from the other read functions: the argument is named na rather than na.strings, and its default is an empty string. The full parameter list is as follows:
read_excel(path, sheet = 1, col_names = TRUE, col_types = NULL, na = "", skip = 0)
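To try read_excel() without a spreadsheet of your own, the sketch below uses the example workbook that ships with readxl. It assumes a readxl version that provides readxl_example() and the bundled datasets.xlsx file, and it skips the read if the package is not installed.

```r
# Guard the example so it does nothing harmful when readxl is missing.
if (requireNamespace("readxl", quietly = TRUE)) {
  # Path to an example workbook bundled with readxl (assumed to be present).
  path <- readxl::readxl_example("datasets.xlsx")

  # List the sheet names, then read the first sheet,
  # treating the literal text NA as a missing value.
  sheets <- readxl::excel_sheets(path)
  first <- readxl::read_excel(path, sheet = 1, na = "NA")

  print(sheets)
  print(head(first))
} else {
  message("readxl is not installed; run install.packages(\"readxl\") first")
}
```

The sheet argument accepts either a position, as here, or a sheet name taken from excel_sheets().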
Note: both of the methods above can read XLS and XLSX files.
That completes the introduction to the common ways of reading file data in R. Try them out for yourself. ^_^
References:
http://www.cnblogs.com/xianghang123/archive/2012/06/06/2538274.html
https://cran.r-project.org/web/packages/gdata/index.html
https://github.com/hadley/readxl