Failure-Atomic File Updates for Linux
Atomically updating previously written data has always been a problem in file systems.
At the whole-file level, such atomic updates can be done using careful combinations of the rename and fsync system calls, but safely updating portions of a file has been impossible when relying only on guaranteed behavior. Lacking atomic file updates, applications that care about data integrity (e.g. databases) must implement their own logging schemes, which increases write amplification, instead of relying on the file system or storage device.
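The whole-file technique mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the helper name `atomic_replace` is hypothetical, and it assumes a POSIX system such as Linux, where rename replaces an existing target atomically and a directory can be opened and passed to fsync.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace the whole file at `path` with `data` (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a temporary file in the same directory,
    # so that the final rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make the new data durable before renaming
        os.rename(tmp_path, path)   # switch the name over to the new file
    except BaseException:
        os.unlink(tmp_path)         # clean up the temporary file on failure
        raise
    # fsync the containing directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

After a crash, a reader of `path` should see either the old contents or the new ones. The cost is that the entire file is rewritten even for a small change, and the sketch does not preserve the original file's ownership or permissions. It also offers nothing for updating only part of a file in place.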
This talk describes a proposed extension that allows failure-atomic updates to files and its implementation for the XFS file system and modern SSDs, as well as the earlier research from HP Labs on which these implementations are based.
Christoph Hellwig has been working with and on Linux for the last ten years, dealing with kernel-related issues much of the time. In addition, he is or was involved with various other open source projects. After a number of smaller network administration and programming contracts, he worked for Caldera's German development subsidiary on various kernel and user-level aspects of the OpenLinux distribution. Since 2004 he has been running his own business, focusing on consulting, training and contracting work in the open source world. Specializing in Linux filesystems and storage, he is also active in neighboring areas such as virtualization and networking. He has worked for well-known customers such as Dell, SGI, IBM, Red Hat and various startups. He has worked in a leading role on the Virtual Filesystem Switch (VFS), the XFS local filesystem, the block layer including support for SCSI and NVMe, the NFS server and client, as well as other aspects of the Linux kernel, and did a major rewrite of the storage subsystem of the QEMU project, which is used by the KVM and Xen projects for system emulation.