Because handling Word and PPT files on Linux is troublesome and the file formats raise patent concerns, all of the following operations are performed on Windows.
First install the Microsoft Save as PDF add-in from the official page: http://www.microsoft.com/zh-cn/download/details.aspx?id=7
Once it is installed, you can save a document as a PDF file manually.
To do the same from Perl, the Win32::OLE module must be loaded:
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';
use Win32::OLE::Const 'Microsoft PowerPoint';
Word to PDF:
sub word2pdf {
    my $word_file = $_[0];
    # Start Word through OLE; OLE errors are reported by LastError, not $!
    my $word = Win32::OLE->new('Word.Application', 'Quit')
        or die "Unable to start Word: " . Win32::OLE->LastError();
    $word->{'Visible'} = 0;
    my $document = $word->Documents->Open($word_file)
        || die("Unable to open document $word_file");
    my $pdffile = $word_file . ".pdf";
    $document->SaveAs({FileName => $pdffile, FileFormat => wdExportFormatPDF});
    $document->Close({SaveChanges => wdDoNotSaveChanges});
    $word->Quit();
}
PPT to PDF:
sub ppt2pdf {
    my $ppt_file = $_[0];
    my $ppt = Win32::OLE->new('PowerPoint.Application', 'Quit')
        or die "Unable to start PowerPoint: " . Win32::OLE->LastError();
    $ppt->{'Visible'} = 1;    # must stay visible, see note 1
    my $presentation = $ppt->Presentations->Open($ppt_file)
        || die("Unable to open presentation $ppt_file");
    my $pdffile = $ppt_file . ".pdf";
    $presentation->SaveAs($pdffile, 32);    # 32 = ppSaveAsPDF
    $presentation->Close();    # unlike Word, Close takes no arguments here
    $ppt->Quit();
}
Note:
1. PowerPoint must be visible during the conversion. If it is hidden, that is, if 'Visible' is set to 0, the conversion fails.
2. If a full path is used, it must not contain spaces, "%", or other special characters; otherwise the document cannot be opened.
Convert files in the current folder:
use Cwd;
my $dirname = getcwd();
my @files = glob "*.doc";
foreach (@files) {
    print $dirname . '/' . $_, "\n";
    word2pdf($dirname . '/' . $_);
}
If you also want to convert the files in subfolders, traverse the directory tree and convert each file as it is found:
use File::Find;
find(sub {
    # Anchor the pattern at the end so generated files such as a.doc.pdf do not match
    word2pdf($File::Find::name) if /\.docx?$/i;
    ppt2pdf($File::Find::name)  if /\.pptx?$/i;
}, "D:/test");
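The extension check is easy to get subtly wrong: a pattern without an end anchor also matches names such as report.doc.pdf, so a second run would pick up the PDFs generated by the first. The sketch below factors the check into a helper. The name doc_type and the decision to skip Office lock files (names starting with "~$") are assumptions made for illustration, not part of the original scripts.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a bare file name by its extension.
# Returns 'word', 'ppt', or undef for anything that should be skipped.
sub doc_type {
    my ($name) = @_;
    return undef if $name =~ /^~\$/;        # Office lock file, e.g. "~$report.docx"
    return 'word' if $name =~ /\.docx?$/i;  # .doc or .docx, any letter case
    return 'ppt'  if $name =~ /\.pptx?$/i;  # .ppt or .pptx, any letter case
    return undef;
}

for my $name ('a.doc', 'b.DOCX', 'c.pptx', 'a.doc.pdf', '~$a.docx') {
    printf "%-10s => %s\n", $name, doc_type($name) // 'skip';
}
```

Inside the File::Find callback, $_ holds the bare file name, so the helper could be called as doc_type($_) before deciding which converter to run.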
To avoid starting Word once per file, you can first collect all the documents to be converted and then convert them in a single batch:
my @file_word;
find(sub {
    push(@file_word, $File::Find::name) if /\.docx?$/i;
}, "D:/test");
word2pdf(@file_word);

# Remove whitespace from the file name part of a path
sub deleteSpace {
    my $filename = $_[0];
    my @temp = split(/\//, $filename);
    my $filename_without_path = pop(@temp);
    $filename_without_path =~ s/\s+//g;
    join('/', @temp) . '/' . $filename_without_path;
}

# Convert a list of documents with a single Word instance
sub word2pdf {
    my @files = @_;
    my $word = Win32::OLE->new('Word.Application', 'Quit')
        or die "Unable to start Word: " . Win32::OLE->LastError();
    $word->{'Visible'} = 0;
    foreach (@files) {
        my $new_name = deleteSpace($_);
        if ($new_name ne $_) {
            rename($_, $new_name) or die "can not rename $_: $!";
        }
        print $new_name, "\n";
        my $document = $word->Documents->Open($new_name)
            || die "can not open document $new_name";
        my $pdffile = $new_name . ".pdf";
        $document->SaveAs({FileName => $pdffile, FileFormat => wdExportFormatPDF});
        $document->Close({SaveChanges => wdDoNotSaveChanges});
    }
    $word->Quit();
}
Alternatively, you can chdir into the subdirectory and convert the files there, which avoids failures caused by invalid characters in the directory name. Failures caused by invalid characters in the file name itself cannot be avoided that way, so every conversion above requires spaces and special characters to be removed from the file name first. deleteSpace only removes whitespace and still needs to be improved.
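One possible improvement is sketched below. It is only an illustration: the name sanitize_name, the choice of an underscore as the replacement, and the set of characters treated as unsafe (whitespace plus % & # + ; ,) are all assumptions, and the set may need extending for your files. Non-ASCII characters are deliberately left alone so that, for example, Chinese file names survive.

```perl
use strict;
use warnings;

# Hypothetical helper: replace runs of unsafe characters in the file name
# part of a path with a single underscore. The directory part is untouched.
sub sanitize_name {
    my ($path) = @_;
    my @parts = split m{/}, $path;
    my $name  = pop @parts;
    # Assumed "unsafe" set: whitespace and % & # + ; ,
    $name =~ s/[\s%&#+;,]+/_/g;
    return join('/', @parts, $name);
}

print sanitize_name('D:/test/my report 100%.doc'), "\n";
```

Replacing rather than deleting keeps the words in the name readable, but two different names can still map to the same result, so it is worth checking that the target does not already exist before calling rename.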
For more information, see http://www.cnblogs.com/xesam/
Address: http://www.cnblogs.com/xesam/archive/2012/11/06/2756222.html