Skipping and Seeking; strange Java API discrepancy of the day

There is a common operating system problem when trying to read very large files: if your OS only supports 32-bit file addressing, you can’t “seek” into a file past 4 GB (4 GB = 2^32/1024/1024/1024).

But this is not a problem for most operating systems nowadays, because you can “skip” past 4 GB. For example, you “seek” just up to 4 GB, then “skip” forward in 4 GB leaps. (If you think of a file as a linked list of inodes, then you can see how this skip works for very large files.)

And Java has support for skipping and seeking too: java.io.InputStream supports skip, and java.io.RandomAccessFile supports seek. Both methods take a long argument (a byte count for skip, an absolute byte offset for seek); and make no mistake, that means up to 8,589,934,591 GB (= (2^63 - 1)/1024/1024/1024, rounded down). (On 32-bit OSes, I assume these methods must be implemented with multiple skips.)
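To make the two calls concrete, here is a minimal sketch of reading one byte at a given offset both ways. The class and method names (SkipSeekDemo, byteAtViaSeek, byteAtViaSkip, maxLongOffsetInGiB) are made up for illustration; only seek, skip and read come from the JDK. Note that skip is allowed to skip fewer bytes than requested, so the sketch loops until the full distance is covered.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

// Hypothetical helper class, for illustration only.
final class SkipSeekDemo {

    // Largest long offset expressed in whole GB (integer division rounds down).
    static long maxLongOffsetInGiB() {
        return Long.MAX_VALUE / (1024L * 1024L * 1024L);
    }

    // Jump straight to an absolute offset with RandomAccessFile.seek(long),
    // then read one byte. Returns -1 if the offset is at or past end of file.
    static int byteAtViaSeek(File file, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);
            return raf.read();
        }
    }

    // Move forward from the current position with InputStream.skip(long),
    // then read one byte. Returns -1 if the stream ends first.
    static int byteAtViaSkip(InputStream in, long offset) throws IOException {
        long remaining = offset;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                // skip() made no progress: consume one byte to detect end of stream.
                if (in.read() < 0) {
                    return -1;
                }
                skipped = 1;
            }
            remaining -= skipped;
        }
        return in.read();
    }
}
```

The difference in shape is worth noticing: seek takes an absolute position in the file, while skip is always relative to wherever the stream currently is.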

Interestingly, RandomAccessFile implements java.io.DataInput and java.io.DataOutput, but extends neither java.io.InputStream nor java.io.OutputStream; I’ve always found that curious.

But I find the discrepancy between the following methods more than curious; it’s downright strange:

  • The java.io.InputStream class has a skip method like this:
    public long skip(long n) throws IOException
  • The java.io.DataInput interface has a skip method like this:
    public int skipBytes(int n) throws IOException

For starters, they are named differently. But more importantly, why does one take an int and the other a long? I mean, are you less likely to want to skip 8,589,934,591 GB ahead in a file if you are using DataInput as opposed to InputStream?

*smirk*
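If you really do need to skip a long distance through a DataInput, the obvious workaround is to call skipBytes repeatedly in int-sized chunks. A rough sketch follows; DataInputSkipper and skipLong are names I made up, not part of the JDK. It relies on the documented behaviour that skipBytes may skip fewer bytes than asked and returns the number actually skipped, rather than throwing at end of input.

```java
import java.io.DataInput;
import java.io.IOException;

// Hypothetical helper class, for illustration only.
final class DataInputSkipper {

    // Skips up to n bytes through DataInput.skipBytes(int), at most
    // Integer.MAX_VALUE bytes per call. Returns the number of bytes
    // actually skipped, which is less than n if the input ends early.
    static long skipLong(DataInput in, long n) throws IOException {
        long total = 0;
        while (total < n) {
            int chunk = (int) Math.min(Integer.MAX_VALUE, n - total);
            int skipped = in.skipBytes(chunk);
            if (skipped <= 0) {
                break; // end of input, or no progress could be made
            }
            total += skipped;
        }
        return total;
    }
}
```

With a RandomAccessFile you would not bother with this, since seek(getFilePointer() + n) does the same job in one call; the loop only matters when all you have is the DataInput interface.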

2 Comments

  1. Rich Dougherty
    Posted March 18, 2003 at 10:17 pm

    Java doesn’t support multiple inheritance, so RandomAccessFile can’t extend both InputStream and OutputStream. DataInput and DataOutput are interfaces, so this is what they have had to use.

  2. Ivan
    Posted March 19, 2003 at 6:15 am

    Some Windows File API calls use a LARGE_INTEGER (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winprog/winprog/largeinteger_str.asp), which is essentially a LONGLONG (signed 64-bit integer).

    I’m not promising that all calls use it but…
