Docker Issues and Tips (aufs/overlay/btrfs..)

October 10, 2017

Picked up and categorized subjectively from https://github.com/docker/docker/issues. Comments and pull requests are welcome.

:white_large_square: = Open (may not be up to date; please check the link yourself!)

:white_square_button: = Mostly resolved (ditto, plus subjective)

:white_check_mark: = Resolved

Storage Drivers

AUFS

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_check_mark: #783 | Cannot access a directory due to a permission error | :neutral_face: Medium | :smiley: Easy | Expected AUFS behavior. The dirperm1 mount option fixes this issue. | Update the kernel (AUFS >= 2008xxxx?) and Docker daemon (>= 1.7) | Confirm: `docker info` |
| :white_check_mark: #18180 | A process becomes a zombie and hangs up | :scream: High | :scream: Hard (multiprocessor), :smiley: Easy (uniprocessor) | Compatibility between the kernel and AUFS | Update the kernel (AUFS >= 20160111) | Java apps and MongoDB are known to be affected |
| :white_check_mark: #20199 | fcntl(F_SETFL, O_APPEND) is ignored and hence data can be corrupted | :scream: High | :smiley: Easy | AUFS bug | Update the kernel (AUFS >= 20160301) | Dovecot is known to be affected |
| :white_check_mark: #20240 | Weird permission even though dirperm1 is enabled | :neutral_face: Medium | :scream: Hard | AUFS bug | Update the kernel (AUFS >= 20160905) | |
| :white_large_square: AUFS ML 2016-03-08 | Hang up related to O_DIRECT | :scream: High | :smiley: Easy | Unanalyzed | None | Percona is known to be affected |
| :white_large_square: #24309 | Unable to remove files previously committed | :scream: High | :smiley: Easy | Unanalyzed | | This article seems related, but perhaps slightly different (Japanese) |
| :white_square_button: #34361 | AUFS + XFS hangs up | :scream: High | :smiley: Easy | AUFS bug | Update AUFS | |
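
The #20199 row says that `fcntl(F_SETFL, O_APPEND)` is ignored on affected AUFS versions. A minimal probe for that symptom might look like the sketch below. It is not the reproducer from the issue, and the function name `append_flag_is_honored` is hypothetical. It sets `O_APPEND` on an already-open file, seeks back to the start, writes, and checks where the bytes landed.

```python
import fcntl
import os
import tempfile


def append_flag_is_honored(directory):
    """Return True if O_APPEND set via fcntl(F_SETFL) forces writes to the end of the file."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"AAAA")
        # Turn on O_APPEND after the file is already open.
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_APPEND)
        # Seek back to the start; with O_APPEND the next write must still go to the end.
        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, b"BB")
        with open(path, "rb") as f:
            return f.read() == b"AAAABB"
    finally:
        os.close(fd)
        os.unlink(path)
```

Run it against a directory inside the container's root filesystem (not a volume) so that the storage driver is actually exercised.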

Non-bug issues:

Overlay

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_check_mark: #10180 | RPMDB corruption | :scream: High | :neutral_face: Medium | Expected overlay behavior | Use yum-{utils,plugins-ovl}-1.1.31-33.el7 (included in RHEL 7.2) or later. Kernel patch is also available. | Linux 4.6 or later prints human-friendly dmesg |
| :white_check_mark: #12080 | Cannot use UNIX domain sockets | :neutral_face: Medium | :smiley: Easy | Overlay Bug | Use Linux 4.7-rc4 or later | |
| :white_check_mark: #12327 | pip fails | :scream: High | :smiley: Easy | Overlay Bug | Use Linux 4.5 or later | |
| :white_check_mark: #19082 | Weird behavior after removing the current directory | :smiley: Low | :smiley: Easy | Overlay Bug | Use Linux 4.5 or later | |
| :white_square_button: #19647, coreos/bugs#1095 | Untar fails intermittently | :scream: High | :scream: Hard | Overlay Bug | Use Linux 4.13 with OVERLAY_FS_INDEX=y | Analysis is in progress in coreos/bugs#1095 |
| :white_large_square: #20640 | Container cannot be started | :neutral_face: Medium | :scream: Hard | Unanalyzed | None | Possibly identical to #16902 |
| :white_check_mark: #20950 | /dev/console: operation not permitted | :scream: High | :smiley: Easy | Kernel Bug | Use recent Linux kernels | |
| :white_check_mark: #21555 | docker build fails intermittently (overlay1) | :scream: High | :scream: Hard | DiffDriver bug | Use Docker 1.13 or later | Overlay2 doesn't have this issue by design |
| :white_check_mark: #24913 | permissions broken after chown | :neutral_face: Medium | :smiley: Easy | Overlay Bug | Use Linux 4.6 or later | The overlay2 issue #28391 is due to the identical bug |
| :white_check_mark: #25244 | opaque flag not reset after directory copy up | :neutral_face: Medium | :smiley: Easy | Overlay Bug | Resolved in Linux 4.8 and backported to 4.4.21 and 4.7.4 | npm is known to be affected |
| :white_check_mark: machine#3327 | chmod fails with EPERM | :smiley: Low | :smiley: Easy | Overlay Bug | Use Linux 4.5 or later | |
| :white_check_mark: #27358 | file removal weird on overlay + XFS (ftype=0) | :scream: High | :smiley: Easy | Expected behavior | Format xfs with ftype=1 | |
| :white_check_mark: #34320 | docker build produces weird images with CONFIG_OVERLAY_FS_REDIRECT_DIR=y | :scream: High | :smiley: Easy | DiffDriver issue | Apply #34342 (Docker 17.08?) | |
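
Many of the overlay fixes above are stated as "Use Linux X.Y or later". A rough way to check a host against such a threshold is sketched below. The helper names `parse_kernel_release` and `kernel_at_least` are hypothetical. The check is only a hint, since distribution kernels often backport fixes without changing the upstream version number, and a release candidate such as `4.7.0-rc4` is treated here as plain `4.7.0`.

```python
import platform
import re


def parse_kernel_release(release):
    """Turn a release string such as '4.4.0-96-generic' into (4, 4, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def kernel_at_least(minimum, release=None):
    """Return True if the kernel release is at least `minimum`, e.g. (4, 5)."""
    if release is None:
        # Default to the kernel this process is running on.
        release = platform.release()
    current = parse_kernel_release(release)
    # Pad the minimum to three components so (4, 5) compares as (4, 5, 0).
    padded = tuple(minimum) + (0,) * (3 - len(minimum))
    return current >= padded
```

For example, `kernel_at_least((4, 5))` asks whether the running kernel meets the threshold given for #12327, and `kernel_at_least((4, 4, 21), "4.4.0-96-generic")` evaluates an explicit release string.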

Non-bug issues:

AUFS / Overlay common

Non-bug issue: rename(2) is not fully supported #25409

Real-world reports of the incompatible behavior of rename(2):

| Software | Report |
|---|---|
| Apache Kudu | https://issues.apache.org/jira/browse/KUDU-1419 |
| CernVM-FS | https://sft.its.cern.ch/jira/browse/CVM-651 |
| GPG | https://github.com/docker/docker/issues/26317 |
| NPM | https://github.com/npm/npm/issues/9863 |
| Samba | https://bugzilla.samba.org/show_bug.cgi?id=9966 |
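
One widely documented form of this incompatibility is rename(2) failing with `EXDEV` when a directory that lives in a lower layer is renamed. Assuming that is the failure an application hits, a defensive fallback could be sketched as follows. The function name `rename_with_fallback` is hypothetical, and the fallback path is not atomic, which is exactly why software that relies on atomic renames is affected.

```python
import errno
import os
import shutil


def rename_with_fallback(src, dst):
    """Rename src to dst, copying and deleting if rename(2) reports EXDEV."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
        # Not atomic: the copy can be observed half-done, unlike a real rename.
        if os.path.isdir(src) and not os.path.islink(src):
            shutil.copytree(src, dst, symlinks=True)
            shutil.rmtree(src)
        else:
            shutil.copy2(src, dst, follow_symlinks=False)
            os.unlink(src)
```

The standard library's `shutil.move` applies a similar strategy; the explicit version is shown only to make the `EXDEV` branch visible.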

BtrFS

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_check_mark: #19073 | sendfile(2) can be unkillable | :smiley: Low | :smiley: Easy | BtrFS bug | None | Not likely to happen in production, but needs consideration for public PaaS |
| :white_large_square: #20080 | cgroups kmem limit leads to crash and data corruption | :scream: High | :smiley: Easy? | Btrfs bug | Avoid kmem limit configuration? | |

Non-bug issues:

ZFS

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_check_mark: #20153 | Some operations fail due to EBUSY | :neutral_face: Medium | :neutral_face: Medium | Daemon bug | Update Docker daemon | |

Non-bug issues:

DeviceMapper

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_check_mark: #4036 | Mount fails | :scream: High | :smiley: Easy | udev sync disabled | Use a Docker daemon binary which supports udev sync | Confirm: `docker info` |
| :white_large_square: #20401 | Infinite “mount/remount” loop, which makes the system unresponsive | :scream: High | :scream: High | Unanalyzed (perhaps related to XFS) | None | |

Non-bug issues:

Storage driver test tool

So which storage driver should I use?

It totally depends on your workload, but Docker, Inc. says AUFS and Devicemapper (direct-lvm) are "production-ready".

https://github.com/docker/docker/blob/master/docs/userguide/storagedriver/selectadriver.md#future-proofing

driver-pros-and-cons.png

Although not listed in the above table, the VFS driver is also attractive for its robustness.

Links:

Anyway...

You know, containers should be "immutable" and "disposable".

For persistent data and some special temporary data, you should consider using an external volume (`docker run -v`).
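
As a small illustration of that advice, the sketch below builds a `docker run` command line that bind-mounts a host directory into the container, so that data written there bypasses the storage driver. The helper name `docker_run_with_volume`, the image name, and the paths are all hypothetical.

```python
def docker_run_with_volume(image, host_path, container_path, command=()):
    """Build an argv list for `docker run` that bind-mounts host_path at container_path."""
    # --rm removes the container on exit; the data survives in host_path.
    return ["docker", "run", "--rm", "-v", f"{host_path}:{container_path}", image, *command]
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a host where Docker is installed, for example `docker_run_with_volume("example/db", "/srv/db-data", "/var/lib/db")`.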

Links:

Network

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_square_button: #5618 | hang up with `unregister_netdevice: waiting for lo to become free` | :scream: High | :scream: Hard | Kernel bug | Use Linux 4.8 or later | The patch will be backported to old kernels in major distros |
| :white_check_mark: #18776 | TCP checksums are ignored | :scream: High | :scream: Hard | Kernel bug | Use Linux 4.4 or later | blog |

Logging

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_check_mark: #19209 | GELF driver saturates CPU | :scream: High | :smiley: Easy | Compression | Disable compression | |
| :white_check_mark: #18057, #20600 | `cat /dev/zero` leads to out of memory | :scream: High | :smiley: Easy | logger's stdio handling issue | Use Docker 1.13 or later (or just disable the logging) | Related: #21181 |
| :white_large_square: #22497 | container cannot be stopped if many logs are being printed | :scream: High | :scream: Hard | logger's stdio handling issue | | |
| :white_check_mark: #22502 | logging blocks the container | :scream: High | :smiley: Easy | logger's stdio handling issue | Use Docker 1.11 or later | affected versions: 1.10.0 |
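
The `cat /dev/zero` case is a stream that never emits a newline, so a reader that buffers whole lines can grow without bound. The sketch below is a generic illustration of capping that buffer. It is not Docker's logger code or its actual fix, neither of which is described here, and the name `iter_bounded_lines` is hypothetical.

```python
import io


def iter_bounded_lines(stream, max_len=16 * 1024):
    """Yield lines from a binary stream, splitting any line longer than max_len bytes."""
    while True:
        # readline(size) returns at most `size` bytes even if no newline was seen.
        chunk = stream.readline(max_len)
        if not chunk:
            return
        yield chunk
```

With this shape, memory use per record is bounded by `max_len` regardless of what the container writes to stdout.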

Others

| Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
|---|---|---|---|---|---|---|
| :white_check_mark: #17720 | Docker daemon 1.9 serious performance issue | :scream: High | :scream: Hard | ? | Use Docker 1.10 | |
| :white_large_square: #19758 | soft lockup related to show_mountinfo(), after frequent `docker run` | :scream: High | :scream: Hard | Unanalyzed (Kernel bug related to the number of processors?) | None | |
| :white_check_mark: #20670 | /dev/pts unmounted on the HOST when you are using `-v /dev:/dev` (after that you can no longer open SSH or xterm) | :scream: High | :smiley: Easy | daemon bug related to mount namespace | Use Docker 1.11.1 (or spawn the docker daemon from systemd, or do not use `-v /dev:/dev`) | |
| :white_check_mark: #20836 | Daemon hangs up after frequent `docker run` | :scream: High | :scream: Hard | Daemon bug | Use Docker 1.11.1 | |
| :white_check_mark: #28936 | Strange permission issues with named containers on 1.12.3 | :scream: High | :smiley: Easy | Daemon bug related to SELinux | Use Docker 1.12.4 | |
| :white_check_mark: Ubuntu linux-azure #1719045 | `fatal error: unaligned sysUnused` on Azure | :scream: High | ? | Ubuntu linux-azure kernel bug | Use linux-azure 4.11.0-1013.13 or later | |

Non-bug issues:

  • docker ps is sometimes slow due to lock: #19328 (mitigated in Docker 17.07, #31273)
  • EBUSY on docker rm in Linux < 3.19: #26510