Skipping and Seeking; strange Java API discrepancy of the day
# 2003-03-18 16:57:40 -0500 | Java | 2 Comments

There is a common operating system problem when trying to read very large
files; if your OS only supports 32 bit file addressing, you can’t “seek” into a file
past 4 GB (4 GB = 2^32/1024/1024/1024).
But this is not a problem for most operating systems nowadays, because you can “skip” past 4 GB. For example, you “seek” to just under 4 GB, then “skip” forward in leaps of up to 4 GB. (If you think of a file as a linked list of inodes, then you can see how this skip works for very large files.)
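The seek-then-skip idea can be modelled with plain arithmetic. The following is only a sketch: `LeapPlan` and `plan` are hypothetical names, no operating system call is made, and it assumes an unsigned 32-bit offset limit of 2^32 - 1 bytes per step.

```java
import java.util.ArrayList;
import java.util.List;

class LeapPlan {
    // Largest offset an unsigned 32-bit file pointer can express: 2^32 - 1.
    static final long MAX_32BIT_OFFSET = (1L << 32) - 1;

    // Splits a 64-bit target offset into steps of at most MAX_32BIT_OFFSET bytes.
    // The first step stands for the initial seek, the rest for relative skips.
    static List<Long> plan(long target) {
        if (target < 0) {
            throw new IllegalArgumentException("negative offset: " + target);
        }
        List<Long> steps = new ArrayList<>();
        long remaining = target;
        while (remaining > 0) {
            long step = Math.min(remaining, MAX_32BIT_OFFSET);
            steps.add(step);
            remaining -= step;
        }
        return steps;
    }

    public static void main(String[] args) {
        long tenGb = 10L * 1024 * 1024 * 1024;
        // Prints the individual step sizes needed to reach the 10 GB mark.
        System.out.println(plan(tenGb));
    }
}
```

The steps always sum to the requested offset, so any position a 64-bit number can name is reachable in a finite number of 32-bit-sized moves.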
And Java has support for skipping and seeking too; java.io.InputStream supports skip, and
java.io.RandomAccessFile supports seek. Both these methods take the number of bytes to skip/seek as a long; and make no mistake, that means up to 8,589,934,591 GB (= (2^63 - 1)/1024/1024/1024, rounded down). (On 32 bit OSes, I assume these methods must be implemented with multiple skips.)
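For comparison, here is a minimal sketch of both approaches reading the byte at a given offset. The class and method names are made up for illustration. Note that InputStream.skip is documented as possibly skipping fewer bytes than requested, so the skipping version loops.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

class SeekAndSkip {
    // Reads the byte at 'offset' by jumping to an absolute position.
    static int readAtWithSeek(File file, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);
            return raf.read(); // -1 if offset is at or past end of file
        }
    }

    // Reads the byte at 'offset' by skipping forward from the start.
    static int readAtWithSkip(File file, long offset) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    // skip() made no progress: consume one byte instead,
                    // which also tells us whether we hit end of file.
                    if (in.read() < 0) {
                        return -1;
                    }
                    skipped = 1;
                }
                remaining -= skipped;
            }
            return in.read();
        }
    }
}
```

Both methods accept a long offset, so neither is limited to the 2 GB range of an int.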
Interestingly, RandomAccessFile implements java.io.DataInput and java.io.DataOutput, but extends neither java.io.InputStream nor java.io.OutputStream; I’ve always found that curious.
But I find the discrepancy between the following two methods more than curious; it’s downright strange:
- The java.io.InputStream class has a skip method like this: public long skip(long n) throws IOException.
- The java.io.DataInput interface has a skip method like this: public int skipBytes(int n) throws IOException.
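One way to live with the int-only signature is to call skipBytes repeatedly. This is a sketch, not anything the JDK provides: `DataInputSkipper` and `skipLong` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

class DataInputSkipper {
    // Skips up to n bytes on any DataInput by calling skipBytes(int)
    // in chunks of at most Integer.MAX_VALUE bytes.
    // Returns the number of bytes actually skipped, which may be less than n.
    static long skipLong(DataInput in, long n) throws IOException {
        long total = 0;
        while (total < n) {
            int chunk = (int) Math.min(n - total, Integer.MAX_VALUE);
            int skipped = in.skipBytes(chunk);
            if (skipped <= 0) {
                break; // no progress: end of input, or nothing could be skipped
            }
            total += skipped;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        DataInput in = new DataInputStream(new ByteArrayInputStream(new byte[100]));
        System.out.println(skipLong(in, 40L));
    }
}
```

Because RandomAccessFile implements DataInput, the same helper works on it as well, although seek is the more direct route there.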
For starters, they are named differently. But more importantly why does one take an int
and the other a long? I mean, are you less likely to want to skip 8,589,934,591 GB
ahead in a file if you are using DataInput as opposed to InputStream?
*smirk*
Java doesn’t support multiple inheritance, so RandomAccessFile can’t extend both InputStream and OutputStream. DataInput and DataOutput are interfaces, so this is what they have had to use.
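If an InputStream view of a RandomAccessFile is needed anyway, one option is to go through its file channel. This is a sketch under the assumption that java.nio is available (Java 1.4 or later); `RandomAccessStreams` and `asInputStream` are hypothetical names.

```java
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.channels.Channels;

class RandomAccessStreams {
    // Wraps the file's channel in an InputStream. Reads start at the
    // file's current position, so a prior seek() decides where the
    // stream begins. Closing the stream also closes the channel.
    static InputStream asInputStream(RandomAccessFile raf) {
        return Channels.newInputStream(raf.getChannel());
    }
}
```

A caller would typically seek on the RandomAccessFile first and then hand the returned stream to code that only accepts an InputStream.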
Some Windows File API calls use a LARGE_INTEGER (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winprog/winprog/largeinteger_str.asp), which is essentially a LONGLONG (signed 64-bit integer).
I’m not promising that all calls use it but…