This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Telnet / SSH connection timeout on LAN


On Jul 13, 2015, at 10:34 AM, Andrey Repin <anrdaemon@yandex.ru> wrote:
> 
> In my environment, a small touch to the original file cause changes throughout
> the entirety of its stored image. ('cause storage format is actually an
> archive, and a small change here and there in the source file cause massive
> shifts in the resulting image.)

Unless those files are written using either whole-archive compression or whole-archive encryption, rsync should still be able to find substantial savings in the transfer with its rolling checksums.  rsync won't be confused by simple changes like a new byte added to the middle of a file, shifting all subsequent bytes down by one.
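
To illustrate why a shifted byte does not defeat a rolling checksum, here is a simplified sketch in Python.  It is not rsync's actual code: the function names, the 64-byte block size, and the direct byte comparison (standing in for rsync's strong hash) are all assumptions made for illustration.

```python
import random

M = 1 << 16  # both checksum halves are kept modulo 2**16


def weak_checksum(block: bytes):
    """Adler-32-style weak checksum of a whole block, computed from scratch."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide an n-byte window forward by one byte in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def find_matching_blocks(old: bytes, new: bytes, block_size: int) -> set:
    """Return indices of old's fixed-size blocks found at ANY offset in new."""
    table = {}
    for start in range(0, len(old) - block_size + 1, block_size):
        key = weak_checksum(old[start:start + block_size])
        table.setdefault(key, []).append(start // block_size)

    found = set()
    if len(new) < block_size:
        return found

    a, b = weak_checksum(new[:block_size])
    pos = 0
    while True:
        for idx in table.get((a, b), []):
            start = idx * block_size
            # Assumption: compare bytes directly instead of using a strong hash.
            if old[start:start + block_size] == new[pos:pos + block_size]:
                found.add(idx)
        if pos + block_size >= len(new):
            break
        a, b = roll(a, b, new[pos], new[pos + block_size], block_size)
        pos += 1
    return found


if __name__ == "__main__":
    rng = random.Random(0)
    old = bytes(rng.getrandbits(8) for _ in range(1024))
    new = old[:500] + b"X" + old[500:]  # one byte inserted mid-file
    print(sorted(find_matching_blocks(old, new, 64)))
```

With the demo data, every 64-byte block except the one the inserted byte lands in should still be found in the new file, because the window slides one byte at a time rather than in block-sized steps.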

Some "archive" formats do use compression, but in a piecewise fashion, so that changing one byte of one piece of the archive may cause that entire chunk to change, but it might not affect any of the others.  An example of this is the Fossil database format.

You can figure out if your archive files work this way by adding -v to your rsync command.  It reports a ratio of the on-disk data size to the transfer size as "speedup is N", where N > 1.0 means it is not re-sending the entire file.  The output of --stats gives similar info, more verbosely.
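
As a rough sketch of how to read that number, the Python below computes the same kind of ratio and pulls the figure out of text containing the "speedup is N" phrase.  The function names are hypothetical, and the sketch assumes the ratio is total file size divided by bytes sent plus bytes received, with a "." decimal point and optional "," thousands separators in the printed value.

```python
import re


def speedup(total_size: int, bytes_sent: int, bytes_received: int) -> float:
    """Ratio of on-disk data size to bytes that actually crossed the wire."""
    transferred = bytes_sent + bytes_received
    if transferred <= 0:
        raise ValueError("nothing was transferred")
    return total_size / transferred


# Matches the phrase quoted above; commas are treated as thousands separators.
_SPEEDUP_RE = re.compile(r"speedup is ([0-9][0-9,]*(?:\.[0-9]+)?)")


def parse_speedup(summary_text: str):
    """Return the reported speedup as a float, or None if the phrase is absent."""
    match = _SPEEDUP_RE.search(summary_text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))
```

A value near 1.0 would suggest the whole file is being re-sent on each run; a much larger value suggests the delta algorithm is finding unchanged blocks.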

The point I made in the original post, however, is that all this work to save network bandwidth comes at a disk I/O and CPU cost in the case of rsync, because it doesn't have a daemon that can sit around watching for filesystem change events.  The larger the files are with respect to the change sizes, the greater the waste.

Always-running software like Dropbox avoids much of this cost because it can watch for those events, and thus only do work when the OS tells it that a particular file has changed.
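
For contrast, a tool that only runs on demand has to discover changes by scanning.  The Python sketch below shows the idea under stated assumptions: it compares each file's size and modification time between two scans, which is similar in spirit to a quick check but is not taken from any particular tool, and the names snapshot and changed_paths are hypothetical.

```python
import os


def snapshot(root: str) -> dict:
    """Map each file path (relative to root) to its (size, mtime_ns).

    Every call stats every file under root, whether or not it changed.
    """
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            state[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return state


def changed_paths(before: dict, after: dict) -> set:
    """Paths added, removed, or whose size or mtime differs between scans."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }
```

The scan cost grows with the size of the tree rather than with the number of changes, and any file flagged as changed still has to be read in full to work out which parts differ.  An event-driven tool is instead handed the changed paths by the OS.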

I have also left out another disadvantage of rsync: it's basically a one-way operation.  If you ever need two-way (or N-way) syncing, you're better off moving to one of the many alternatives that know how to do this correctly.  Multilateral syncing is surprisingly hard to get right.

I don't mean to advertise for Dropbox, just to give it as an example that everyone can relate to.

An alternative that's open source, more secure, and definitely does pay attention to the OS's filesystem event API is SpiderOak.  You can see from their GitHub contents that they've got OS-specific file change notifiers:

  https://github.com/SpiderOak

Now contrast Syncthing, which has many of the same virtues, but currently doesn't have file change notification built in, which has led a third party to write a helper for Syncthing to fill the gap:

  https://syncthing.net/
  https://github.com/syncthing/syncthing-inotify/

These tables may be helpful:

  https://en.wikipedia.org/wiki/Comparison_of_file_synchronization_software

