qemu-kvm with cache=none fails on ext4 filesystem with journal_data option
KVM has become one of the major virtualization technologies in recent years. For Red Hat Linux it has even become the default virtualization solution. With the default options, KVM's I/O performance is hardly competitive with other virtualization solutions. Especially with qcow2 images, the I/O performance of kvm/qemu can be greatly improved by bypassing the page cache of the underlying host filesystem. This can be done by starting kvm with the cache=none option, for example with
-drive file=my_image.qcow2,index=0,media=disk,cache=none
instead of just supplying the image file with -hda my_image.qcow2. The image file is then opened with the O_DIRECT flag, which bypasses the host page cache. If the underlying filesystem does not support the O_DIRECT flag, this fails with the error message:
could not open disk image my_image.qcow2: Invalid argument
This is the case for an ext4 filesystem with full journaling enabled. One can easily test if the O_DIRECT flag is supported by the underlying filesystem with a simple dd command on the host:
dd if=some_file of=/dev/null iflag=direct
If the O_DIRECT flag is not supported it results in the following error:
dd: opening `some_file': Invalid argument
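The dd test above can be wrapped in a small helper that probes a whole directory, for example the one holding the disk images. This is only a sketch: the function name check_o_direct and the probe file name are made up for illustration, and it assumes GNU dd (for iflag=direct) and mktemp are available on the host.

```shell
#!/bin/sh
# Sketch: report whether files in a directory can be read with O_DIRECT.
# check_o_direct and the probe file name are hypothetical, not part of any tool.
check_o_direct() {
    dir="${1:-.}"
    # Create a scratch file inside the directory that should be tested.
    probe="$(mktemp "$dir/o_direct_probe.XXXXXX")" || return 2
    # Write one 4 KiB block so the direct read has aligned data to fetch.
    dd if=/dev/zero of="$probe" bs=4096 count=1 2>/dev/null
    # Same test as in the text: dd fails if O_DIRECT is rejected.
    if dd if="$probe" of=/dev/null bs=4096 count=1 iflag=direct 2>/dev/null; then
        echo "supported"
    else
        echo "unsupported"
    fi
    rm -f "$probe"
}
```

Calling check_o_direct with the image directory as its argument prints "supported" or "unsupported" for the filesystem that directory lives on.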
Thus, if the additional safety of full journaling is not required, it is better to turn it off, which also increases performance. The journaling mode can be set either in /etc/fstab or in the filesystem itself. In the fstab case, the data=journal option (together with the comma in front of it) has to be removed from an entry like the following:
/dev/sda7 / ext4 defaults,noatime,nodiratime,async,data=journal 0 1
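Removing the option by hand is easy enough, but the edit can also be sketched as a sed filter. The function name strip_data_journal is hypothetical, and the sketch only handles data=journal as one of several comma-separated options; an entry whose only option is data=journal is left untouched and needs a manual edit. It reads fstab lines on standard input and prints the result, so nothing is changed in place.

```shell
#!/bin/sh
# Sketch: drop data=journal from the option list of fstab lines read on stdin.
# strip_data_journal is a hypothetical helper name.
strip_data_journal() {
    # First expression: option at the start or in the middle of the list.
    # Second expression: option at the end of the list.
    sed -e 's/data=journal,//' -e 's/,data=journal//'
}
```

For a preview, run strip_data_journal < /etc/fstab and compare the output with the original file before replacing anything.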
If the journaling option is set in the filesystem, it can be shown and edited with the tune2fs command. For example, tune2fs -l /dev/sda7 displays information on the filesystem on /dev/sda7. If full journaling is enabled, the output contains the journal_data mount option:
Default mount options: journal_data
The option can be removed with tune2fs -o^journal_data /dev/sda7. Afterwards the output of tune2fs -l does not contain the journal_data mount option any more:
Default mount options: (none)
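The check of the tune2fs output can be scripted as well. The following sketch reads tune2fs -l output on standard input and succeeds only if journal_data is among the default mount options; the function name has_journal_data_default is an assumption made for this example.

```shell
#!/bin/sh
# Sketch: succeed if "journal_data" appears in the "Default mount options" line.
# has_journal_data_default is a hypothetical helper name.
has_journal_data_default() {
    # -w matches journal_data as a whole word, so journal_data_ordered and
    # journal_data_writeback are not mistaken for full journaling.
    grep '^Default mount options:' | grep -qw 'journal_data'
}
```

It could be used as tune2fs -l /dev/sda7 | has_journal_data_default && echo "full journaling is the default".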
In both cases the filesystem has to be remounted to activate the changes. Afterwards qemu-kvm works with the cache=none option, as described above, and with increased IO performance.
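To confirm that the change is really active after the remount, the mount table can be inspected. The sketch below assumes that the kernel lists data=journal among the options in /proc/mounts when full journaling is in effect; the function name mount_uses_data_journal and its optional second argument (an alternative mounts file) are made up for this example.

```shell
#!/bin/sh
# Sketch: succeed if the given mount point is listed with data=journal.
# $1 = mount point, $2 = mounts table to read (defaults to /proc/mounts).
# mount_uses_data_journal is a hypothetical helper name.
mount_uses_data_journal() {
    # Field 2 is the mount point, field 4 the comma-separated option list.
    awk -v mp="$1" '
        $2 == mp && $4 ~ /(^|,)data=journal(,|$)/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}
```

For instance, mount_uses_data_journal / || echo "full journaling is off for /".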
Jürgen
March 23rd, 2014 at 2:09 pm
I don’t know whether this has changed since 2012, but deactivating the journal, or even creating a partition without a journal, is not necessary anymore.
You can use cache= with all available options (Qemu 1.1.2):
writethrough – default, host page cache
writeback – like writethrough, but report data writes when completed at host (recommended for qcow2)
none – no host page cache, IO directly to the guest’s memory
unsafe – keep things in guest cache, default with -snapshot
directsync – like “unsafe” but write notifications are sent to the guest
March 23rd, 2014 at 8:23 pm
I stand corrected: Of course you’ll need a host file system without journaling or else it will re-read the Qemu image all the time.