The big-endian/little-endian problem in Java
1. Solve the endian problem: a summary
Everything in Java binary files is stored big-endian, which is sometimes called network order. This is good news if you use only Java: all files are handled the same way on all platforms (Mac, PC, Solaris, etc.). You can freely exchange binary data, electronically over the Internet or on floppy disk, without worrying about endianness. The trouble starts when you exchange data files with programs that are not written in Java, because those programs may use little-endian order, as C programs on the PC usually do. Some platforms use big-endian byte order internally (Mac, IBM 390) and some use little-endian byte order (Intel). Java hides that internal endianness from you.
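To make the big-endian claim concrete, here is a minimal sketch that writes one int with java.io.DataOutputStream and prints the resulting bytes. The class name BigEndianDemo and its helper methods are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class BigEndianDemo {
    // Serialise one int with DataOutputStream and return the raw bytes.
    static byte[] writeInt(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Format bytes as two-digit hex values separated by spaces.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Most significant byte comes first: prints "12 34 56 78"
        System.out.println(toHex(writeInt(0x12345678)));
    }
}
```

The output is the same on every platform, whatever byte order the underlying CPU uses.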
In binary files there are no delimiters between fields, and the contents are raw binary, not readable ASCII. If the data you want to read is not in the standard format, it was usually prepared by a non-Java program. You can choose from four options:
1. Rewrite the output program that provides the input file. It can directly output a big-endian DataOutputStream byte stream or a character DataOutputStream format.
2. Write a separate translation program that reads the file and rearranges the bytes. It can be written in any language.
3. Read the data as bytes and rearrange them on the fly.
4. The easiest way is to use the LEDataInputStream, LEDataOutputStream and LERandomAccessFile classes that I have written to emulate DataInputStream, DataOutputStream and RandomAccessFile; they work with little-endian byte streams. You can read about LEDataStream. You can download the code and source free. You can get help from the File I/O Amanuensis, which will show you how to use the classes. Just tell it you have little-endian binary data.
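Option 3 can also be sketched with the standard java.nio.ByteBuffer class, which rearranges bytes for you once you set its byte order. This is only an illustration and not the LEDataInputStream code; the record layout (a 4-byte int followed by a 2-byte short) and the class name LittleEndianRecord are assumptions invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianRecord {
    // Hypothetical record layout, assumed for illustration only:
    // a 4-byte int id followed by a 2-byte short count, both little-endian.
    static int[] decode(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        int id = buf.getInt();        // consumes bytes 0..3
        short count = buf.getShort(); // consumes bytes 4..5
        return new int[] { id, count };
    }

    public static void main(String[] args) {
        // Bytes as a little-endian C program might have written them.
        byte[] raw = { 0x78, 0x56, 0x34, 0x12, (byte) 0xCD, (byte) 0xAB };
        int[] fields = decode(raw);
        System.out.println("id=0x" + Integer.toHexString(fields[0])
                + " count=" + fields[1]);
    }
}
```

If the values have already been read with DataInputStream, the standard methods Integer.reverseBytes, Short.reverseBytes and Long.reverseBytes offer another way to swap them after the fact.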
2. You may not even have any problems.
Many Java newcomers coming from C think they need to worry about whether the platform they run on uses big-endian or little-endian order internally. This is not a problem in Java. Further, you cannot even find out how values are stored internally without resorting to native classes. Java has no struct I/O, no unions and none of the other endian-sensitive language constructs.
Endian issues need to be considered only when communicating with legacy C or C++ applications. The following code will produce the same result on a big-endian or a little-endian machine:
// Take a 16-bit short apart into two 8-bit bytes.
short x = (short) 0xABCD;
byte high = (byte) (x >>> 8);
byte low = (byte) x; /* cast implies & 0xFF */
System.out.println("x=" + x + " high=" + high + " low=" + low);
3. Read Little-endian Binary Files
The most common problem is dealing with files stored in Little-endian format.
I had to implement routines parallel to those in java.io.DataInputStream, which reads raw binary, in my LEDataInputStream and LEDataOutputStream classes. Don't confuse this with the io.DataInput human-readable character-based file-interchange format.
If you want to do it yourself, without the overhead of the full LEDataInputStream and LEDataOutputStream classes, here is the basic technique.
Presuming your integers are in 2's complement little-endian format, shorts are pretty easy to handle: read the low byte first, then the high byte, and combine the two.
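A minimal sketch of that technique follows. It is a reconstruction under stated assumptions, not the actual LEDataInputStream source; the class name ReadLittleEndian and both method names are chosen for this illustration. It relies only on the standard DataInputStream.readUnsignedByte method.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ReadLittleEndian {
    // Read a 16-bit little-endian value: the low byte arrives first.
    static short readShortLittleEndian(DataInputStream in) throws IOException {
        int low = in.readUnsignedByte();
        int high = in.readUnsignedByte();
        return (short) (high << 8 | low);
    }

    // Read a 32-bit little-endian value, least significant byte first.
    static int readIntLittleEndian(DataInputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            result |= in.readUnsignedByte() << shift;
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // A short 0xABCD followed by an int 0x12345678, both little-endian.
        byte[] raw = { (byte) 0xCD, (byte) 0xAB, 0x78, 0x56, 0x34, 0x12 };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        System.out.println(readShortLittleEndian(in));
        System.out.println(Integer.toHexString(readIntLittleEndian(in)));
    }
}
```

Reading each byte as an unsigned value matters: a plain readByte would sign-extend bytes of 0x80 and above and corrupt the higher bits when the pieces are combined.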
4. History
In Gulliver's Travels the Lilliputians liked to break their eggs on the small end and the Blefuscudians on the big end. They fought wars over this. There is a computer analogy. Should numbers be stored most or least significant byte first? This is sometimes referred to as byte sex.
Those in the big-endian camp (most significant byte stored first) include the Java VM virtual computer, the Java binary file format, the IBM 360 and follow-on mainframes such as the 390, the Motorola 68K and most mainframes. The Power PC is endian-agnostic.
Blefuscudians (big-endians) assert this is the way God intended integers to be stored, most important part first. At the assembler level, fields of mixed positive integers and text can be sorted as if they were one big text field key. Real programmers read hex dumps, and big-endian is a lot easier to comprehend.
In the little-endian camp (least significant byte first) are the Intel 8080, 8086, 80286, Pentium and follow-ons, and the MOS Technology 6502 popularised by the Apple ][.
Lilliputians (little-endians) assert that putting the low-order part first is more natural, because when you do arithmetic manually you start at the least significant part and work toward the most significant part. This ordering makes writing multi-precision arithmetic easier, since you work upward from the lowest address. It made implementing 8-bit microprocessors easier. At the assembler level (not in Java) it also lets you cheat and pass the address of a 32-bit positive int to a routine expecting only a 16-bit parameter and still have it work. Real programmers read hex dumps, and little-endian is more of a stimulating challenge.
If a machine is word-addressable, with no finer addressing supported, the concept of endianness means nothing, since whole words are fetched from RAM in parallel, both ends at once.
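The narrow-read cheat can be simulated in Java, even though Java itself has no pointers to cheat with. The sketch below uses a java.nio.ByteBuffer as a stand-in for raw memory; the class name NarrowReadDemo and its method are invented for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class NarrowReadDemo {
    // Store a 32-bit int in the given byte order, then read only 16 bits
    // starting at the same address (offset 0 of the simulated memory).
    static short storeIntReadShort(int value, ByteOrder order) {
        ByteBuffer memory = ByteBuffer.allocate(4).order(order);
        memory.putInt(0, value);
        return memory.getShort(0);
    }

    public static void main(String[] args) {
        // Little-endian: the low-order half sits at the starting address.
        System.out.println(storeIntReadShort(1234, ByteOrder.LITTLE_ENDIAN)); // 1234
        // Big-endian: the starting address holds the high-order half.
        System.out.println(storeIntReadShort(1234, ByteOrder.BIG_ENDIAN));    // 0
    }
}
```

The trick only works for small positive values that fit in the narrower type, which is why the text restricts it to positive ints.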
5. What Sex is Your CPU?
Byte Sex Endianness of CPUs

| CPU | Endianness | Notes |
| --- | --- | --- |
| MOS Technology 6502; AMD Duron, Athlon, Thunderbird | Little | The 6502 was used in the Apple ][; the Duron, Athlon and Thunderbird in Windows 95/98/ME/NT/2000/XP. |
| Apple ][ 6502 | Little | |
| Apple Mac 68000 | Big | Uses Motorola 68000. |
| Apple Power PC | Big | CPU is bisexual but stays big-endian in the Mac OS. |
| Burroughs 1700, 1800, 1900 | ? | Bit-addressable. Used a different interpreter firmware instruction set for each language. |
| Burroughs 7800 | ? | Algol machine. |
| CDC LGP-30 | Word-addressable only, hence no endianness | 31½-bit words. The low-order bit must be 0 on the drum, but can be 1 in the accumulator. |
| CDC 3300, 6600 | Word-addressable | ? |
| DEC PDP, Vax | Little | |
| IBM 360, 370, 380, 390 | Big | |
| IBM 7044, 7090 | Word-addressable | 36-bit words. |
| IBM AS-400 | Big | ? |
| Power PC | Either | The endian-agnostic Power PCs have a foot in both camps. They are bisexual, but the OS usually imposes one convention or the other, e.g. Mac Power PCs are big-endian. |
| Intel 8080, 8086, 80286, 80386, 80486, Pentium I, II, III, IV | Little | Chips used in PCs. |
| Intel 8051 | Big | |
| MIPS R4000, R5000, R10000 | Big | Used in Silicon Graphics IRIX. |
| Motorola 6800, 6809, 680x0, 68HC11 | Big | Early Macs and the Amiga used the 68000. |
| NCR 8500 | Big | |
| NCR Century | Big | |
| Sun Sparc and UltraSparc | Big | Sun's Solaris. Normally used as big-endian, but also has support for operating in little-endian mode, including being able to switch endianness under program control for particular loads and stores. |
| Univac 1100 | Word-addressable | 36-bit words. |
| Univac 90/30 | Big | IBM 370 clone. |
| Zilog Z80 | Little | Used in CP/M machines. |
If you know the endianness of other CPUs/OSes/platforms, please email me at roedy@mindprod.com.
In theory data can have two different byte sexes but CPUs can have four. Let us give thanks, in this world of mixed left-hand and right-hand drive, that there are no real CPUs with all four sexes to contend with.
The Four Possible Byte Sexes for CPUs

| Which byte is stored in the lower-numbered address? | Which byte is addressed? | Used in |
| --- | --- | --- |
| LSB | LSB | Intel, AMD, Power PC, DEC. |
| LSB | MSB | None that I know of. |
| MSB | LSB | Perhaps one of the old word mark architecture machines. |