Discussion:
Streamable compressed archive format?
g***@spamgourmet.com
2006-01-19 10:06:58 UTC
Hello,

I know of the TAR format, which by itself does not easily handle long
NTFS paths or compression, and I have found some discussion on the
"Duplicity" website about an alternative.

Are you aware of any other streamable archive formats? So you can
compress your huge directory tree to stdout on one machine, forward the
stream on the fly to another machine, the latter decompressing the
stream on the fly?

ZIP and others contain forward pointers that must be patched after
compression; to be streamable you cannot change a buffer after it has
been sent. Gzip is streamable, but not really an archive.

TIA

Rasmus Møller
Mark Adler
2006-01-19 15:01:58 UTC
Post by g***@spamgourmet.com
ZIP and others contain forward pointers that must be patched after
compression; to be streamable you cannot change a buffer after it has
been sent.
No, zip is streamable. There is an option for local headers to not
contain the length and crc information, and instead for that
information to follow the compressed data. We convinced PK to put that
in many many years ago when we noticed that the deflate compressed data
format was itself streamable (self-terminating).
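
A minimal sketch of that option, assuming Python's standard zipfile module as the writer (nothing in this thread used it). When the output object cannot seek, zipfile sets general-purpose flag bit 3 in the local header and writes the CRC and sizes after the compressed data. ForwardOnly and zip_to_stream are hypothetical names for illustration; ForwardOnly stands in for a pipe or socket.

```python
import io
import struct
import zipfile


class ForwardOnly:
    """Hypothetical stand-in for a pipe or socket: write-only, no seek/tell."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass


def zip_to_stream(out, members):
    """Write (name, bytes) pairs as a ZIP archive to a forward-only stream."""
    with zipfile.ZipFile(out, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, payload in members:
            zf.writestr(name, payload)


sink = ForwardOnly()
zip_to_stream(sink, [("hello.txt", b"hello, stream\n" * 100)])
archive = b"".join(sink.chunks)

# Bytes 6-7 of the first local file header hold the general-purpose flags;
# bit 3 (0x08) means "CRC and sizes follow the compressed data".
flags = struct.unpack_from("<H", archive, 6)[0]
uses_data_descriptor = bool(flags & 0x08)

# An ordinary (seeking) reader can still open the result.
roundtrip = zipfile.ZipFile(io.BytesIO(archive)).read("hello.txt")
```

Nothing already written is patched afterwards, so each chunk could be sent as soon as write() receives it.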

mark
Aslan
2006-01-19 15:37:06 UTC
[Are you aware of any other streamable archive formats? So you can
compress your huge directory tree to stdout on one machine, forward the
stream on the fly to another machine, the latter decompressing the
stream on the fly?]

Seems like "cpio" in copy-out mode on one side and in copy-in mode on the
other side may work for you. Both modes can copy stdin to stdout.
So if you could transfer stdout of copy-out to the stdin of copy-in on
another machine, it may work.
Jasen Betts
2006-01-20 06:42:25 UTC
Post by g***@spamgourmet.com
Hello,
I know of the TAR format, which in itself does not lend itself easily
to long NTFS paths and compression & I have found some discussion on
the "Duplicity" website about an alternative.
how long is long? use a larger block size?
Post by g***@spamgourmet.com
Are you aware of any other streamable archive formats? So you can
compress your huge directory tree to stdout on one machine, forward the
stream on the fly to another machine, the latter decompressing the
stream on the fly?
cpio (also from Unix). It will probably need some modification if you
want to back up NTFS file attributes, but GPL (or possibly BSD) source
is available. Like TAR, it has no compression built in.

I used it (with ssh and gzip) to transfer a large number of files across
town: archiving, compressing, sending, receiving, decompressing,
and extracting, all on the fly.

ZOO (from Rahul Dhesi iirc) was IIRC streamable
(also open source, multi-platform)
JAR (from Robert Jung, author of ARJ) might be too. (but closed source)

IIRC
ZIP is streamable, but somewhat lumpy (files must be compressed before the
header for the file can be written) and most unarchivers want to see the
directory at the end of the file (but the info in the directory is present
before each file in the archive so a streaming unarchiver could be done.)
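
A rough sketch of such a streaming unarchiver in Python, reading only the local headers and never seeking. It assumes each local header already carries the CRC and sizes (flag bit 3 clear), no ZIP64, and no encryption; iter_zip_stream and read_exact are hypothetical names, not part of any tool mentioned here.

```python
import io
import struct
import zipfile
import zlib

LOCAL_SIG = b"PK\x03\x04"
LOCAL_FMT = "<4sHHHHHIIIHH"  # 30-byte fixed part of a local file header
LOCAL_LEN = struct.calcsize(LOCAL_FMT)


def read_exact(stream, n):
    """Read exactly n bytes from a forward-only stream or raise EOFError."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream ended early")
        buf += chunk
    return buf


def iter_zip_stream(stream):
    """Yield (name, data) per entry using only local headers, never seeking."""
    while True:
        try:
            fixed = read_exact(stream, LOCAL_LEN)
        except EOFError:
            return
        if fixed[:4] != LOCAL_SIG:
            return  # reached the central directory: no more entries
        (_, _, flags, method, _, _, crc, csize, _usize,
         name_len, extra_len) = struct.unpack(LOCAL_FMT, fixed)
        if flags & 0x08:
            raise ValueError("sizes follow the data; not handled in this sketch")
        name = read_exact(stream, name_len).decode("utf-8", "replace")
        read_exact(stream, extra_len)  # skip the extra field
        raw = read_exact(stream, csize)
        if method == 8:
            data = zlib.decompress(raw, -15)  # raw deflate, no zlib wrapper
        elif method == 0:
            data = raw  # stored, no compression
        else:
            raise ValueError(f"unsupported method {method}")
        if zlib.crc32(data) != crc:
            raise ValueError(f"CRC mismatch in {name}")
        yield name, data


# Build a small archive in memory, then read it back front to back.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", b"alpha" * 50)
    zf.writestr("b.bin", b"\x00\x01\x02", compress_type=zipfile.ZIP_STORED)
buf.seek(0)
entries = list(iter_zip_stream(buf))
```

The reader stops at the first non-local-header signature, so the directory at the end is simply ignored.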


Bye.
Jasen
Aslan
2006-01-20 09:37:26 UTC
Post by Jasen Betts
Post by g***@spamgourmet.com
Hello,
I know of the TAR format, which in itself does not lend itself easily
to long NTFS paths and compression & I have found some discussion on
the "Duplicity" website about an alternative.
how long is long? use a larger block size?
Post by g***@spamgourmet.com
Are you aware of any other streamable archive formats? So you can
compress your huge directory tree to stdout on one machine, forward the
stream on the fly to another machine, the latter decompressing the
stream on the fly?
cpio (also from Unix). It will probably need some modification if you
want to back up NTFS file attributes, but GPL (or possibly BSD) source
is available. Like TAR, it has no compression built in.
I used it (with ssh and gzip) to transfer a large number of files across
town: archiving, compressing, sending, receiving, decompressing,
and extracting, all on the fly.
ZOO (from Rahul Dhesi iirc) was IIRC streamable
(also open source, multi-platform)
JAR (from Robert Jung, author of ARJ) might be too. (but closed source)
IIRC
ZIP is streamable, but somewhat lumpy (files must be compressed before the
header for the file can be written) and most unarchivers want to see the
Not necessarily. Method 0 (STORE) can be used. But if you are sending the
compressed data over a network, DEFLATE should be fast enough in most
cases to keep up with the connection speed.
Post by Jasen Betts
directory at the end of the file (but the info in the directory is present
before each file in the archive so a streaming unarchiver could be done.)
Bye.
Jasen
Rasmus Møller
2006-01-20 10:38:49 UTC
Thanks for responses so far;

if I - for the moment - limit myself to already compiled and available
utilities, my next question is: has anybody tried this
Windows-to-Windows?

I have found UNXUTILS compilations for Windows. I will try out the
following setup ASAP:

TAR -> GZIP -> NetCat -> (network) -> NetCat ->(un)GZIP -> (un)TAR
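
For comparison, the TAR -> GZIP and (un)GZIP -> (un)TAR ends of that chain can be sketched with Python's standard tarfile module in its stream modes ("w|gz" and "r|gz"), which never seek. This is only an illustration, not the UNXUTILS setup; pack_tree and unpack_tree are hypothetical names, and the in-memory buffer stands in for the network hop.

```python
import io
import os
import tarfile
import tempfile


def pack_tree(src_dir, out_stream):
    """Tar and gzip src_dir onto a forward-only stream (mode 'w|gz')."""
    with tarfile.open(fileobj=out_stream, mode="w|gz") as tar:
        tar.add(src_dir, arcname=".")


def unpack_tree(in_stream, dest_dir):
    """Gunzip and untar from a forward-only stream (mode 'r|gz')."""
    with tarfile.open(fileobj=in_stream, mode="r|gz") as tar:
        tar.extractall(dest_dir)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    dst = os.path.join(tmp, "dst")
    os.makedirs(src)
    os.makedirs(dst)
    with open(os.path.join(src, "note.txt"), "w") as f:
        f.write("streamed\n")

    pipe = io.BytesIO()  # stands in for the network connection
    pack_tree(src, pipe)
    pipe.seek(0)  # rewinding is only needed because this is one buffer
    unpack_tree(pipe, dst)

    with open(os.path.join(dst, "note.txt")) as f:
        restored = f.read()
```

In a real transfer, out_stream and in_stream would be the two ends of a socket or pipe rather than one buffer.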

I'll be back with my results later.

Rasmus
Rasmus Møller
2006-02-07 18:22:03 UTC
Now I am back!

Cygwin TAR works nicely under Windows, but does NOT include extended
attributes. I thought I did not need them, but I was wrong. NTTAR works
with extended attributes, though I had to recompile it to make it work
flawlessly (and faster, thanks to a bigger buffer size). Reply to this
thread if you are interested in the changes.

Gzip or lzop both work fine for the purpose.

Netcat for Windows (nc.exe) worked, but was hopelessly slow; a quick
hack with endless loops transferring 64KB blocks between stdin/stdout
and a TCP connection improved the speed by a factor of 10-20. Reply if
you are interested.
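
The hack itself was not posted, so the following is only a guess at the same idea in Python: plain loops that move 64KB blocks between a binary stream and a connected socket. pump_to_socket and pump_from_socket are hypothetical names, and a local socket pair stands in for the TCP connection.

```python
import io
import socket
import threading

BLOCK_SIZE = 64 * 1024  # block size mentioned in the post


def pump_to_socket(src, sock, block_size=BLOCK_SIZE):
    """Copy a binary stream to a connected socket in fixed-size blocks."""
    total = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        sock.sendall(block)
        total += len(block)
    return total


def pump_from_socket(sock, dst, block_size=BLOCK_SIZE):
    """Copy from a connected socket to a binary stream until the peer closes."""
    total = 0
    while True:
        block = sock.recv(block_size)
        if not block:
            break
        dst.write(block)
        total += len(block)
    return total


payload = bytes(range(256)) * 1024  # 256 KiB of test data
a, b = socket.socketpair()
received = io.BytesIO()


def send():
    pump_to_socket(io.BytesIO(payload), a)
    a.close()  # closing the sender signals end of stream to the receiver


sender = threading.Thread(target=send)
sender.start()
count = pump_from_socket(b, received)
sender.join()
b.close()
```

In a pipeline, src would be sys.stdin.buffer on the sending machine and dst would be sys.stdout.buffer on the receiving one.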

Sorry if parts of this were slightly off-topic, but there have been a
few similar questions in the past. I am very happy that it now works
fine, every night streaming 120GB in just above 1 hour (through an
intermediate PC running rinetd). I hope to shave a little off in the
future.
Claudio Grondi
2006-02-08 01:02:57 UTC
Post by Rasmus Møller
Now I am back!
Cygwin TAR works nicely under Windows, but does NOT include extended
attributes. I thought I did not need them, but I was wrong. NTTAR works
with extended attribs; though I had to recompile to make it work
flawlessly (as well as faster because of bigger buffersize).
Reply to
thread if you are interested in the changes.
I am.

Claudio
Post by Rasmus Møller
Gzip or lzop both work fine for the purpose.
Netcat for Windows (nc.exe) worked, but was hopelessly slow; a quick
hack with endless loops transferring 64KB blocks between stdin/stdout
and a TCP connection improved the speed by a factor of 10-20. Reply if
you are interested.
Sorry if parts of this were slightly off-topic, but there have been a
few similar questions in the past. I am very happy that it now works
fine, every night streaming 120GB in just above 1 hour (through an
intermediate PC running rinetd). I hope to shave a little off in the
future.
Rasmus Møller
2006-02-08 11:37:17 UTC
Ok,

I use a spamgourmet address.
Please mail me at:

***@spamgourmet.com

and I'll send you the sourcecode (a devcpp project + main.c) and an
exe, if you want. Devcpp is free.

NTTAR is GPL, and I am a little wary of the red tape involved in
redistribution.
My intention is, of course, to comply to the extent possible (without
too much work).

Rasmus Møller
