XML (Extensible Markup Language) is a markup language designed for transmitting and storing data, and it is one of the most widely used formats for exchanging data between applications. Databases such as Access, Oracle, and SQL Server provide more powerful storage and analysis capabilities, including indexing, sorting, searching, and consistency checks. XML, by contrast, only stores data. In fact, its biggest difference from other data formats is that it is extremely simple. That may seem like a small point, but it is what sets XML apart.

Data parsing
Data in XML format can be imported and processed in R with the XML package. The following cases show how.
Case 1
Define a string of markup text and parse it with the XML package.
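library(XML)

# The markup inside this string was stripped when the page was extracted.
# The tags below are a reconstruction; the element names are assumed.
tt = '<x>
  <a>text</a>
  <c>
    <d>a phrase</d>
  </c>
</x>'

# parse into an internal document and convert it to an R list
doc = xmlParse(tt)
xmlToList(doc)

# use an R-level node representation
doc = xmlTreeParse(tt)
xmlToList(doc)

Case 2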
Import and process existing XML data. This case uses the XML export of a mobile phone address book. Follow these steps:
# read the XML file and parse it
xmlfile = xmlParse(file.choose(), encoding = "UTF-8")
class(xmlfile)

# get the root node
xmltop = xmlRoot(xmlfile)
class(xmltop)                      # view the class
xmlName(xmltop)                    # view the name of the root node
xmlSize(xmltop)                    # view the number of child nodes under the root
xmlName(xmltop[[1]])               # view the name of the first child node

# view the first child node
xmltop[[1]]
# view the second child node
xmltop[[2]]

# nodes inside the first child node
xmlSize(xmltop[[1]])               # number of nodes
xmlSApply(xmltop[[1]], xmlName)    # names of the nodes
xmlSApply(xmltop[[1]], xmlAttrs)   # attributes of the nodes
xmlSApply(xmltop[[1]], xmlSize)    # size of each node

# view the first node of the first child
xmltop[[1]][[1]]
# view the second node of the first child
xmltop[[1]][[2]]
# the second child
xmltop[[2]][[1]]
xmltop[[2]][[2]]

# view a contact's phone number
xmltop[[1]][[3]][[1]][[1]]                                # method 1: by position
xmltop[['Contact']][['PhoneList']][[1]][[1]]              # method 2: by name
getNodeSet(xmltop, "//Contact/PhoneList")[[1]][[1]][[1]]  # method 3: by XPath

xmltop[[1]][[3]][[1]][[1]] = 13717232323   # change the contact's phone number
xmltop[[1]][[1]][[1]] = "zhangsan"         # change the contact's name

# save
saveXML(xmltop, file = "out.xml", encoding = "UTF-8")
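Because the steps above depend on a file chosen interactively, it can help to try the same navigation on a small inline document first. The following is a minimal sketch, not the author's data: the address book layout, the sample values, and the variable names (book, doc, top, and so on) are hypothetical, chosen only to mirror the Contact and PhoneList names used above.

```r
library(XML)

# Hypothetical address book; the layout of a real export may differ.
# Built without whitespace between tags so positional indexing is unambiguous.
book <- paste0(
  "<Contacts>",
  "<Contact><Name>zhangsan</Name><Email>zhangsan@example.com</Email>",
  "<PhoneList><Phone>13700000001</Phone></PhoneList></Contact>",
  "<Contact><Name>lisi</Name><Email>lisi@example.com</Email>",
  "<PhoneList><Phone>13700000002</Phone></PhoneList></Contact>",
  "</Contacts>"
)

doc <- xmlParse(book, asText = TRUE)
top <- xmlRoot(doc)

root_name  <- xmlName(top)                 # name of the root node
n_contacts <- xmlSize(top)                 # number of child nodes under the root
fields     <- xmlSApply(top[[1]], xmlName) # names of the first contact's children

# Positional access: first contact, third child (PhoneList), first Phone
phone_by_index <- xmlValue(top[[1]][[3]][[1]])

# XPath access: the text of every Phone element in the document
phones <- xpathSApply(doc, "//Contact/PhoneList/Phone", xmlValue)
```

Positional indexing is short but breaks if the order of the elements changes. The XPath form names the elements it wants, so it is usually the more robust choice when the file layout is not under your control.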
Convert the XML data to a data frame
Follow these steps:
# method 1: use the xmlToDataFrame() function
xmlToDataFrame(xmlfile)

# method 2: use plyr, a package dedicated to reshaping data
library("plyr")
MyContact = ldply(xmlToList(file.choose()), data.frame)  # convert to a list first, then to a data frame
View(MyContact)                                          # view the contact information
MyContact[, c("Name", "PhoneList.phone.text")]

# save
write.csv(MyContact, "MyContact.csv", row.names = FALSE)
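The XML package documents xmlToDataFrame() as intended for documents with a simple, shallow structure, which is one reason to fall back on the list-based route for nested records. The sketch below shows the shallow case on an inline document. The record layout, the sample values, and the names records, contacts, out_file, and round_trip are assumptions made for illustration, and the CSV is written to a temporary file rather than to MyContact.csv.

```r
library(XML)

# Hypothetical flat records: each child of the root becomes one row.
records <- paste0(
  "<Contacts>",
  "<Contact><Name>zhangsan</Name><Phone>13700000001</Phone></Contact>",
  "<Contact><Name>lisi</Name><Phone>13700000002</Phone></Contact>",
  "</Contacts>"
)

doc <- xmlParse(records, asText = TRUE)

# One row per Contact, one column per child element, all values as text.
contacts <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

# Save to a temporary CSV file and read it back to check the result.
out_file <- tempfile(fileext = ".csv")
write.csv(contacts, out_file, row.names = FALSE)
round_trip <- read.csv(out_file, colClasses = "character")
```

Reading the file back with colClasses = "character" keeps phone numbers as text, so leading zeros or long digit strings are not altered by numeric conversion.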
The above is ShangFR's walkthrough of importing and processing XML data in R. For more information, see the PHP Chinese website (www.php1.cn).