I have been working on a new archive format that supports files over 2 GB without any nonportable code. As a demo, I created an archiver that stores both uncompressed files and files compressed with lpaq1 7 (needs 387 MB of memory):
http://cs.fit.edu/~mmahoney/compression/#lpq1
To keep the code portable, with nothing specific to Windows or Linux, all files have to be accessed sequentially without using fseek() (its offset is a long, which is only 32 bits on many platforms). Files are divided into blocks (about 64 KB to 1 MB) with the format:
"lPq" 1 [filename [0 mode oldsize newsize data]...]...
The first 4 bytes identify the archive format and version number. The files are then concatenated, each one stored as a filename followed by a sequence of blocks. Each block has a mode ('c' for compressed or 's' for stored), the uncompressed size (4 bytes), the compressed size (4 bytes), and the compressed data.
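As a rough illustration of the layout described above, here is a minimal Python sketch that writes one file as stored ('s') blocks. Several details are assumptions that the description does not settle: the two 4-byte sizes are written big-endian, the 0 byte in front of each block is taken to double as the filename terminator, filenames are encoded as UTF-8, and the helper name `write_stored_file` is hypothetical.

```python
import io
import struct

MAGIC = b"lPq\x01"  # "lPq" followed by the version byte 1


def write_stored_file(out, name, data, block_size=1 << 16):
    """Append one file as a filename plus a sequence of stored blocks."""
    # Assumption: the name is UTF-8 and is ended by the first block's 0 byte.
    out.write(name.encode("utf-8"))
    # Emit at least one block so an empty file still gets a terminated name.
    for start in range(0, max(len(data), 1), block_size):
        chunk = data[start:start + block_size]
        # 0 marker, mode 's', uncompressed size, stored size.
        # Assumption: both sizes are 4-byte big-endian integers.
        out.write(b"\x00s" + struct.pack(">II", len(chunk), len(chunk)))
        out.write(chunk)


archive = io.BytesIO()
archive.write(MAGIC)
write_stored_file(archive, "a.txt", b"hello")
```

For a stored block the two sizes are equal; a compressed ('c') block would carry the original size first and the compressed size second.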
To list the contents, the program has to read the whole file, skipping over the data, but there is no other easy way to do this portably with large files. It is fast enough, though. Because the compressor doesn't know the compressed size until it is done, it has to buffer the compressed data before it can write the block headers.
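Listing by reading straight through could look roughly like the sketch below. It never seeks; block data is discarded by reading it. As assumptions not confirmed by the description, the sizes are parsed as big-endian, a 0 byte both ends the filename and starts each block, and the function names (`read_exact`, `skip`, `list_archive`) are made up for the example.

```python
import io
import struct


def read_exact(f, n):
    """Read exactly n bytes using only sequential reads."""
    buf = f.read(n)
    if len(buf) != n:
        raise EOFError("truncated archive")
    return buf


def skip(f, n, chunk=1 << 16):
    """Discard n bytes by reading them, since seeking is not used."""
    while n > 0:
        got = len(f.read(min(n, chunk)))
        if got == 0:
            raise EOFError("truncated archive")
        n -= got


def list_archive(f):
    """Return (filename, uncompressed size, stored size) for each file."""
    if read_exact(f, 4) != b"lPq\x01":
        raise ValueError("not an lPq version 1 archive")
    entries = []
    b = f.read(1)
    while b:
        name = bytearray()
        while b and b != b"\x00":  # the filename runs up to the first 0 byte
            name += b
            b = f.read(1)
        if not b:
            raise EOFError("archive ends inside a filename")
        old_total = new_total = 0
        while b == b"\x00":  # assumption: every block starts with a 0 byte
            mode = read_exact(f, 1)
            if mode not in (b"c", b"s"):
                raise ValueError("unknown block mode")
            # Assumption: sizes are 4-byte big-endian integers.
            oldsize, newsize = struct.unpack(">II", read_exact(f, 8))
            skip(f, newsize)  # step over the block data without seeking
            old_total += oldsize
            new_total += newsize
            b = f.read(1)  # 0 means another block, anything else a new name
        entries.append((name.decode("utf-8"), old_total, new_total))
    return entries
```

Since each block size fits in 4 bytes and the totals are ordinary Python integers, a file over 2 GB is just a longer run of blocks.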
The archiver uses commands similar to 7zip and rar (just a, x, and l). To keep it simple, it only supports solid archives that can't be updated. (The compressor is not reset between files, so compression is better.) Also, it will never clobber files; it just skips over existing files and keeps going. It lets you rename files when you extract them; not being able to do that is an annoying problem with some archivers. It doesn't create directories (which would not be portable).
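The "never clobber" rule can be sketched in a few lines. This is only an illustration using Python's exclusive-create file mode as a stand-in; it is not necessarily how the archiver itself checks for existing files, and `write_if_absent` is a hypothetical helper name.

```python
import os
import tempfile


def write_if_absent(path, data):
    """Write data to path unless it already exists; return True if written."""
    try:
        with open(path, "xb") as out:  # 'x' fails instead of overwriting
            out.write(data)
    except FileExistsError:
        return False  # skip the existing file and let the caller keep going
    return True


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "a.txt")
    first = write_if_absent(target, b"new")     # file is created
    second = write_if_absent(target, b"other")  # file exists, so it is skipped
    with open(target, "rb") as f:
        kept = f.read()
```

Renaming on extraction then amounts to passing a different `path` than the name stored in the archive.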
I will probably use this new format in my upcoming paq9.

I'm looking forward to updates of this, and the upcoming paq9.

