Discussion:
[Duplicity-talk] Duplicity - hang/choke near the end - on a large directory
Dirk-Willem van Gulik
2016-05-03 13:20:55 UTC
Permalink
Folks,

I have a setup which has been running well for some 9 years; most recent OS and duplicity change from about 6 months ago. Daily incrementals with some striping; with every 2 months overlapping full backups. Backend is an sFTP site with PKCS#11/#15.

This months quarterly full backup ‘hung’ on what is very likely its last partial upload. No sign of any IO issues, diskspace issues or exhaustion of some resource.

The only thing special about that last block may be the fact that it has a much larger number of files than any of the earlier blocks and that this last block contains a single large file. A restart (after clearing the lock) and a restart (after clearing lock and cache) did not make things work again.

The sanitised log file is below. I.e. we see a lot of

A /some/file

getting added; then a

AsyncScheduler: task execution done (success: True)

followed by the

A /som/file
<hang>

and then a hang (> 12 hours for a process that should complete in about 3 normally)

The "AsyncScheduler: task execution done (success: True)” prior to the hang is the 953rth — i.e. corresponding to a "Writing duplicity-full.20160503T111720Z.vol953.difftar.gpg”.

There is no sign of a 954 starting for the files which are (still) added during the hang - so not sure what the task execution refers to.

Attaching (d)truss/ktrace/strace to the hanging process does not give a peep (nor does it rack up any cpu cycles); lsof does show an open pipe to a gpg process; no open files (for writing, nor any temp files).

The attached gpg process spins in a syscall (very likely a read); has a temp files open, twice, with what seems to be a gpg encrupted payload - as well as a single file from the ‘A’ emitted a few 100 files earlier*. This file happens to be large (14Mb); while all other files in this ‘part’ are in the the kBytes or smaller.

A kill (or kill -9) does not yield any response either. The machine has done at least 1, possibly 2 full backups cycles with the exact same kernel/python/etc.

What is a good way to dive deeper into this ? Or does this ring a bell with anyone ? Or what debugging options do I have left ?

Dw.


*: I am a bit confused about it - would have assumed that gpg was just processing a stream blindly — why does it see the actual file names ?


Using archive dir: /var/db/duplicity/e39ca70….894946c1
Using backup name: e39ca7...94946c0
...
Import of duplicity.backends.rsyncbackend Succeeded
Import of duplicity.backends.ssh_paramiko_backend Succeeded
Import of duplicity.backends.ssh_pexpect_backend Succeeded
Main action: full
================================================================================
duplicity 0.7.05 (September 15, 2015)
Args: /usr/local/bin/duplicity full --asynchronous-upload --ssh-options -oIdentityFile=/some/cnfdir --archive-dir /var/db/duplicity --hidden-encrypt-key XX--hidden-encrypt-key XXX --hidden-encrypt-key XXX --sign-key XXXX --include /etc —include/stuff --exclude ** -v 7 / pexpect+sftp://***@somehost.com
FreeBSD xxxx.webweaving.org 10.2-RELEASE-p9 FreeBSD 10.2-RELEASE-p9 #0: Thu Jan 14 01:32:46 UTC 2016 ***@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64 amd64
/usr/local/bin/python2.7 2.7.11 (default, Jan 16 2016, 01:23:17)
[GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final 208032)]
================================================================================
Using temporary directory /tmp/duplicity-HC6yqb-tempdir
Temp has 12101220048384 available, backup will use approx 60293120.
Running 'sftp -oIdentityFile=/some/cnfdir -oServerAliveInterval=15 -oServerAliveCountMax=2 ***@somehost.com'
sftp command: 'mkdir "."'
sftp command: 'cd "."'
sftp command: 'ls -1'
Synchronizing remote metadata to local cache...
Copying duplicity-full-signatures.20160101T011501Z.sigtar.gpg to local cache.
Running 'sftp -oIdentityFile=/some/cnfdir -oServerAliveInterval=15 -oServerAliveCountMax=2 ***@somehost.com'
sftp command: 'get "./duplicity-full-signatures.20160101T011501Z.sigtar.gpg" "/tmp/duplicity-HC6yqb-tempdir/mktemp-wGp1rN-2"'
Deleting /tmp/duplicity-HC6yqb-tempdir/mktemp-wGp1rN-2
Deleting /tmp/duplicity-HC6yqb-tempdir/mktemp-9EPrVs-3
Copying duplicity-full.20160101T011501Z.manifest.gpg to local cache.

…<repeated for all other files…>

Running 'sftp -oIdentityFile=/some/cnfdir -oServerAliveInterval=15 -oServerAliveCountMax=2 ***@somehost.com'
sftp command: 'get "./duplicity-new-signatures.20160429T002400Z.to.20160430T002400Z.sigtar.gpg" "/tmp/duplicity-HC6yqb-tempdir/mktemp-U_FaFN-472"'
Deleting /tmp/duplicity-HC6yqb-tempdir/mktemp-U_FaFN-472
Deleting /tmp/duplicity-HC6yqb-tempdir/mktemp-iq9s8E-473
Running 'sftp -oIdentityFile=/some/cnfdir -oServerAliveInterval=15 -oServerAliveCountMax=2 ***@somehost.com'
sftp command: 'mkdir "."'
sftp command: 'cd "."'
sftp command: 'ls -1'
Added incremental Backupset (start_time: Fri Jan 1 02:15:01 2016 / end_time: Tue Jan 5 02:24:00 2016)
Added incremental Backupset (start_time: Tue Jan 5 02:24:00 2016 / end_time: Wed Jan 6 02:24:00 2016)

….<repeated until today>….

Added incremental Backupset (start_time: Fri Apr 29 02:24:00 2016 / end_time: Sat Apr 30 02:24:00 2016)
Warning, found incomplete backup sets, probably left from aborted session
Last full backup date: Fri Jan 1 02:15:01 2016
Using temporary directory /var/db/duplicity/e39ca701908673c2e81e6c59894946c1/duplicity-QDitlY-tempdir
Using temporary directory /var/db/duplicity/e39ca701908673c2e81e6c59894946c1/duplicity-N3YtEw-tempdir
AsyncScheduler: instantiating at concurrency 1

A ...snipped add of 2390 files ..

AsyncScheduler: scheduling task for asynchronous execution
Processed volume 1
Writing duplicity-full.20160503T111720Z.vol1.difftar.gpg
Running 'sftp -oIdentityFile=/some/cnfdir -oServerAliveInterval=15 -oServerAliveCountMax=2 ***@somehost.com'
sftp command: 'put "/tmp/duplicity-HC6yqb-tempdir/mktemp-WcmH1t-474" "./.duplicity-full.20160503T111720Z.vol1.difftar.gpg.part”’

A ...snipped add of 25 files ..

sftp command: 'rename "./.duplicity-full.20160503T111720Z.vol1.difftar.gpg.part" "./duplicity-full.20160503T111720Z.vol1.difftar.gpg”’

… all well for the next 950 ish difftar uploads…...

sftp command: 'rename "./.duplicity-full.20160503T111720Z.vol953.difftar.gpg.part" "./duplicity-full.20160503T111720Z.vol953.difftar.gpg"'
A ...snipped add of 197 files ..

Deleting /tmp/duplicity-HC6yqb-tempdir/mktemp-FNlhPm-1426
AsyncScheduler: task execution done (success: True)

A ...snipped add of 852 files ..
— hang —

Loading...