Computers store and process only binary values; they have no built-in notion of characters. When a text file is saved, each character in it is mapped to a binary value, and those binary values are what is actually written to the hard disk. When a program later opens the file, it reads the binary values and maps them back to the original readable characters. This "save and open" round trip works well as long as every program that accesses the file "understands" its encoding, that is, the mapping between binary values and characters.
If different programs use different encodings to process the same file, special characters in the file will not display correctly. "Special characters" here means non-English characters such as accented letters (e.g., Á, ü).
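To make the mismatch concrete, here is a minimal sketch, assuming a system where iconv and od are installed. It writes "ü" as its two UTF-8 bytes, dumps them, and then deliberately decodes those bytes as ISO-8859-1. The variable names are only illustrative.

```shell
# "ü" as its two UTF-8 bytes (octal escapes keep this script ASCII-only).
utf8_bytes=$(printf '\303\274' | od -An -tx1 | tr -d ' \n')

# The same character re-encoded as ISO-8859-1 is a single, different byte.
latin1_bytes=$(printf '\303\274' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

# Misreading the UTF-8 bytes as ISO-8859-1 turns each byte into its own
# character, which is the typical "garbled accents" symptom.
mojibake=$(printf '\303\274' | iconv -f ISO-8859-1 -t UTF-8)

echo "UTF-8 bytes:            $utf8_bytes"     # c3bc
echo "ISO-8859-1 bytes:       $latin1_bytes"   # fc
echo "Misread as ISO-8859-1:  $mojibake"       # Ã¼
```

The character is the same in both files; only the bytes differ, which is why a program that assumes the wrong encoding shows the wrong text.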
This raises two questions:
1. How do we determine which character encoding a given text file uses?
2. How do we convert the file to an encoding of our choice?
Step One
To determine the character encoding of a file, we use a command-line tool called "file". Because file is a standard UNIX program, it is available in all modern Linux distributions.
Run the following command:
$ file --mime-encoding filename
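As an illustration, the following sketch creates two small sample files (the file names are made up for this example) that contain the word "café" in different encodings, then asks file to identify each one. Keep in mind that file guesses the encoding from the file's content, so treat its answer as a strong hint rather than a guarantee.

```shell
# Two sample files holding "café" in different encodings.
printf 'caf\303\251\n' > sample_utf8.txt     # é stored as UTF-8 bytes C3 A9
printf 'caf\351\n'     > sample_latin1.txt   # é stored as ISO-8859-1 byte E9

# -b (brief) omits the leading "filename:" so only the encoding is printed.
utf8_enc=$(file -b --mime-encoding sample_utf8.txt)
latin1_enc=$(file -b --mime-encoding sample_latin1.txt)

echo "sample_utf8.txt:   $utf8_enc"     # utf-8
echo "sample_latin1.txt: $latin1_enc"   # iso-8859-1
```

A file that contains only plain English text is reported as us-ascii, because its bytes are identical in both of these encodings.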
Step Two
The next step is to see which encodings your Linux system supports. To do this, we use the iconv tool with the "-l" option (a lowercase L), which lists all currently supported encodings.
The code is as follows:
$ iconv -l
The iconv tool is part of the GNU C Library (glibc), so it is available out of the box on almost every Linux distribution.
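The full list is long, so it can be handy to test a single name instead of reading through it. The sketch below is one possible way to do that, not an official iconv feature: the helper name encoding_supported is invented for this example, and it relies on iconv exiting with an error when it is given an encoding name it does not know.

```shell
# Succeeds if this system's iconv recognises the given encoding name.
# It converts empty input from the encoding to itself; iconv exits with
# an error when the name is unknown, so the listing never has to be parsed.
encoding_supported() {
    iconv -f "$1" -t "$1" < /dev/null > /dev/null 2>&1
}

# NO-SUCH-ENCODING is a deliberately bogus name used to show the failure case.
for enc in UTF-8 ISO-8859-1 NO-SUCH-ENCODING; do
    if encoding_supported "$enc"; then
        echo "$enc: supported"
    else
        echo "$enc: not supported"
    fi
done
```

A quick alternative is to filter the listing itself, for example with "iconv -l | grep -i utf".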
Step Three
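Once we have chosen a target encoding from those our Linux system supports, we run the following command to perform the conversion. Note that iconv writes the converted text to standard output; redirect it to a new file (or use the "-o" option) to save the result: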
$ iconv -f old_encoding -t new_encoding filename
For example, to convert a file from ISO-8859-1 to UTF-8:
$ iconv -f iso-8859-1 -t utf-8 input.txt
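The conversion can be tried end to end with a throwaway file. This is a minimal sketch: the sample content and the output file name output.txt are assumptions made for the example, and the result is written to a separate file so the input is never overwritten while it is being read.

```shell
# Sample ISO-8859-1 input: "café" with é stored as the single byte E9.
printf 'caf\351\n' > input.txt

# iconv prints the converted text to standard output, so redirect it
# into a new file rather than back into input.txt.
iconv -f ISO-8859-1 -t UTF-8 input.txt > output.txt

# Dump the bytes of the result: é is now the two bytes c3 a9.
od -An -tx1 output.txt
```

If the source encoding passed with "-f" is wrong, iconv will usually still produce output, but the accented characters in it will be wrong, so it is worth checking the result before replacing the original file.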
Once you've learned how to use these tools, you can fix a damaged subtitle file like the following: