Recently, while adapting an ASP.NET application for deployment on Linux, I switched the document import feature to ExcelDataReader, a library I found online that supports cross-platform use. I installed it from NuGet and tested it on my local Windows machine. It depends on SharpZipLib, the well-known .NET decompression library, presumably to unpack Excel documents. The local test read the file without any problem, but once published to Linux it never managed to read anything and always returned null, which confused me a lot. So I set out to find where the problem was.
1) I downloaded the ExcelDataReader source code and compiled it locally. The problem persisted: no data was read.
2) The Linux machine is not a development environment, so I located the error-printing code in the source. It only output the exception's Message, so I changed it to .ToString() and printed it to the console to see which line of code failed. The result was as follows:
The error is actually thrown by SharpZipLib... OK, on to downloading the SharpZipLib source code to find the problem.
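The difference between Message and ToString() in step 2 can be sketched as follows. This is only an illustration, not the ExcelDataReader code; the Decode method is a hypothetical stand-in for library code that fails a few frames down.

```csharp
using System;

class ExceptionPrintDemo
{
    // Hypothetical helper standing in for library code.
    // It throws NullReferenceException when data is null.
    static string Decode(byte[] data)
    {
        return data.Length.ToString();
    }

    static void Main()
    {
        try
        {
            Decode(null);
        }
        catch (Exception ex)
        {
            // Message carries only the short description of the error.
            Console.WriteLine(ex.Message);

            // ToString() adds the exception type and the stack trace,
            // which shows the failing method (and the line number when
            // debug symbols are available).
            Console.WriteLine(ex.ToString());
        }
    }
}
```

With only Message in the log, there is no way to tell that the failure originates in a dependency rather than in the calling library.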
3) With the complete SharpZipLib source code downloaded, I navigated to the SharpZipLib.Zip.ZipConstants.ConvertToString(System.Byte[] data, Int32 count) method:
The code here is very simple. The error says an object could not be found. The if statement only checks whether data is null, so the error cannot come from inside it. That leaves the conversion from byte array to string below it. Looking closely, I did not understand what the int DefaultCodePage was supposed to be. I usually write Encoding.GetEncoding("utf-8").GetString(data, 0, count). So I went to find the definition of DefaultCodePage:
Then I looked up what a code page is. It is also called an internal code table: an integer that identifies a character encoding.
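Baidu Encyclopedia says:
CodePage: readable/writable. Integer. Defines the code page used to display page content in the browser. A code page is the numeric value of a character set, and different languages use different code pages. For example, the ANSI code page is 1252, the Japanese code page is 932, and the Simplified Chinese code page is 936. In general, when you upload to an overseas web host or retrieve database records and get garbled text, this is the way to fix it.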
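To make the idea of a code page concrete, here is a small sketch that decodes the same four bytes with two different encodings. The class name is made up for illustration. Note the assumption that the runtime can supply code page 936: Encoding.GetEncoding(936) can throw on runtimes where that code page is not installed or registered.

```csharp
using System;
using System.Text;

class CodePageDemo
{
    static void Main()
    {
        // The two characters "中文" encoded in GBK (code page 936).
        byte[] gbkBytes = { 0xD6, 0xD0, 0xCE, 0xC4 };

        // Look the encoding up by its integer code page.
        // Assumption: code page 936 is available on this runtime.
        Encoding gbk = Encoding.GetEncoding(936);
        Console.WriteLine("{0} -> {1}", gbk.WebName, gbk.GetString(gbkBytes, 0, gbkBytes.Length));

        // The same bytes are not valid UTF-8 (code page 65001),
        // so decoding them this way produces replacement characters.
        Encoding utf8 = Encoding.GetEncoding(65001);
        Console.WriteLine("{0} -> {1}", utf8.WebName, utf8.GetString(gbkBytes, 0, gbkBytes.Length));
    }
}
```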
So Simplified Chinese is 936. Is the problem that my document is in Chinese? But changing the content to English did not help either. Let me print what DefaultCodePage is at runtime:
What on earth is OEMCodePage?
I found this introduction: http://dcx.sap.com/1200/zh/dbadmin/win-collation-natlang.html. It says this is the encoding used in the old DOS environment. I had thought it was the encoding of the region where the server's hardware manufacturer is located...
Hmm, so the code page is right...
I don't understand it, so let's replace it.
return Encoding.GetEncoding(Encoding.Default.CodePage).GetString(data, 0, count);
I still use a code page, but now it comes from the .NET environment. I ran the test and it worked without any problem:
Document content:
Parsing code:
using System;
using System.Data;
using System.IO;
using Excel; // namespace of the ExcelDataReader version used here (assumption)

using (FileStream stream = File.Open("articletemplate.xlsx", FileMode.Open, FileAccess.Read))
{
    IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
    Console.WriteLine("");
    excelReader.IsFirstRowAsColumnNames = true;
    DataSet ds = excelReader.AsDataSet();
    if (ds == null)
    {
        Console.WriteLine("DS is Null");
    }
    else
    {
        Console.WriteLine("Read OK");
        // Print every cell of every sheet; only reached when the read succeeded.
        foreach (DataTable dt in ds.Tables)
        {
            foreach (DataRow dr in dt.Rows)
            {
                for (int i = 0; i < dt.Columns.Count; i++)
                {
                    Console.WriteLine(dr[i]);
                }
            }
        }
    }
}
(。_。) Well, that solved the problem. So what code page does the .NET environment report on Mono?
The query returns 65001, which is UTF-8.
And what does my Windows environment report?
Both values are 936, Simplified Chinese, which is GBK encoding.
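A check like the one described above can be sketched as follows. It prints the default encoding and the OEM code page, then tests whether code page 936 can actually be obtained. Reading the OEM value from the current culture's TextInfo is an assumption about where a DefaultCodePage-style value comes from, based on the post's mention of OEMCodePage. The class name is made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

class DefaultCodePageDemo
{
    static void Main()
    {
        // The encoding the runtime considers the default on this system.
        Encoding def = Encoding.Default;
        Console.WriteLine("Encoding.Default: {0} ({1})", def.CodePage, def.WebName);

        // OEM code page of the current culture (assumed source of DefaultCodePage).
        int oem = CultureInfo.CurrentCulture.TextInfo.OEMCodePage;
        Console.WriteLine("OEMCodePage: {0}", oem);

        // Reporting a code page and being able to load it are separate things,
        // so try to obtain the encoding and report what happens.
        try
        {
            Encoding gbk = Encoding.GetEncoding(936);
            Console.WriteLine("Code page 936 is available: {0}", gbk.WebName);
        }
        catch (Exception ex)
        {
            Console.WriteLine("Code page 936 is not available: {0}", ex.GetType().Name);
        }
    }
}
```

Running the same program on both Windows and the Linux server gives a direct side-by-side comparison of the two environments.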
So the Linux environment reports GBK as its code page, yet that encoding cannot be used. Is my Linux environment not GBK-encoded after all? Running locale shows the encoding the system currently uses:
Sure enough, it is UTF-8, not GBK. The locale -a command lists the supported encodings, and its output shows that GBK is supported.
So I can only guess that when Mono runs a .NET program, encoding conversion only works with the encoding currently in use?
For now, switching to Encoding.Default fixes the encoding error, but whether the real cause is what I guessed, I do not know...
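A more defensive variant of this change might try the preferred code page first and fall back to Encoding.Default only when the runtime cannot supply it. The sketch below is an assumption about how that could look, not the SharpZipLib code; SafeDecoder and its preferredCodePage parameter are hypothetical names.

```csharp
using System;
using System.Text;

static class SafeDecoder
{
    // Hypothetical helper, not part of SharpZipLib.
    public static string ConvertToString(byte[] data, int count, int preferredCodePage)
    {
        if (data == null)
        {
            return string.Empty;
        }

        Encoding encoding;
        try
        {
            // May throw when the code page is unknown or unavailable on this runtime.
            encoding = Encoding.GetEncoding(preferredCodePage);
        }
        catch (ArgumentException)
        {
            encoding = Encoding.Default;
        }
        catch (NotSupportedException)
        {
            encoding = Encoding.Default;
        }

        return encoding.GetString(data, 0, count);
    }
}

class Program
{
    static void Main()
    {
        // An ASCII-only name, which decodes the same way under GBK and UTF-8.
        byte[] bytes = Encoding.UTF8.GetBytes("sheet1.xml");
        Console.WriteLine(SafeDecoder.ConvertToString(bytes, bytes.Length, 936));
    }
}
```

The trade-off is that a fallback can silently decode non-ASCII names with the wrong encoding, so logging which branch was taken would be worth adding in real use.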
A troubleshooting record of an encoding error when parsing Excel documents on Mono