This article shows how to read and write .docx files in Python. It is shared here as a reference for anyone who needs it.
Python can read and write Word documents with ready-made libraries. I use python-docx here. You can install it with pip install python-docx.
PowerPoint and Excel files have similar libraries, and all of them work by reading the XML data inside the file directly. The older .doc format is not XML-based, so it needs a different library.
1. Create or open a file. This is straightforward with the docx Document class: if a path is given, that document is opened; if no path is given, a new document is created.
import docx

# Create a new document
doc_new = docx.Document()
# Open an existing document
doc = docx.Document(r'C:\1.docx')
2. Save the file. Where there is an open, there is a save. Use the Document object's save method; its argument is either the path to save to or a file stream to write into. Usually you pass a path.
doc.save(path_or_stream)
3. Object collections. python-docx exposes collections of the objects that make up a Word document.
doc.paragraphs     # collection of paragraphs
doc.tables         # collection of tables
doc.sections       # collection of sections
doc.styles         # collection of styles
doc.inline_shapes  # collection of inline shapes
# etc.
4. Insert a paragraph. The paragraph is one of the most basic objects in Word.
# Insert a paragraph with the text "first paragraph"
# By default no style is applied; the style argument can be omitted, or set to a paragraph style
doc.add_paragraph('first paragraph', style=None)
doc.add_paragraph('second paragraph', style='Heading 2')
# These are styles that ship with Word by default; you can list the available paragraph styles directly
print([s.name for s in doc.styles if s.type == 1])
5. Create a new style. The help documentation does not cover this in much detail, and it is in English. I use this in a project I have on hand and worked out the usage, as follows.
from docx import Document
from docx.shared import RGBColor  # the color class in docx

# Create a new document
doc = Document()
# Add a style (first argument: the style name; second argument: the style type,
# 1 for paragraph, 2 for character, 3 for table)
style = doc.styles.add_style('style name 1', 2)
# Set the style's properties (here the font color becomes blue; other properties can be changed the same way)
style.font.color.rgb = RGBColor(0x0, 0x0, 0xff)
6. Apply a character style. Characters live inside a paragraph, so you append text to the paragraph as runs and set the character style on each run.
# Insert an empty paragraph
p = doc.add_paragraph('')
p.add_run('123', style="Heading 1 Char")
p.add_run('456')
p.add_run('789', style="Heading 2 Char")
# The paragraph now uses two character styles; the middle "456" has no style applied
print(p.text)  # the output is '123456789', still one continuous string
7. Set the font. You can also format text directly on a run, without defining a style.
p = doc.add_paragraph('')
r = p.add_run('123')
r.font.bold = True    # bold
r.font.italic = True  # italic
# etc.
8. Table operations. Tables are another frequently used object type.
from docx.shared import Inches

# Create a 2x3 table; the style argument can be omitted
table = doc.add_table(rows=2, cols=3, style=None)
# table.rows and table.columns give the rows and columns of the table
print(len(table.rows))
print(len(table.columns))
# Iterate over the table
for row in table.rows:
    row.cells[0].text = '1'
    # print(row.cells[0].text)
# Add a row or a column (current python-docx versions require a width for add_column)
table.add_row()
table.add_column(Inches(1.0))
That covers most of the common operations on Word documents. For anything else, consult the help documentation, or use dir() and help() to inspect an object's methods and attributes.