[Users] Zimbra 8.8 Swap File Usage
L Mark Stone
lmstone at lmstone.com
Wed Jan 16 17:59:42 CET 2019
If you are running Zimbra 8.8 and seeing unexpected swap file usage, I have a suggestion you may want to try.
After doing a number of migrations of older Zimbra systems that never touched the swap file, I started seeing that these systems were now using the swap file a fair amount -- even with vm.swappiness set to 0. A 16GB server with under 400 mailboxes after just a few minutes would start using the swap file, and after a day or so would be using several GB of swap.
At first I suspected something in Zimbra/ZeXtras. But after I rebooted two small servers, increasing their instance sizes from 16GB to 64GB of RAM, and they too started swapping within a few minutes, I felt this wasn't likely to be a Zimbra/ZeXtras issue. Even when these systems had 16GB of RAM and were using a lot of swap, none of the users were complaining about reduced performance. The systems remained quite snappy at both RAM sizes.
After a week or so of research, last night I ran a test on a few systems, and this morning those systems' swap file usage is about half or less of what it has historically been.
So I wanted to share what I've done, and ask for others who know more about this to provide feedback as to whether what I've done is optimal/appropriate -- or not.
By way of background, modern distros/kernels now make greater use of what top reports as "buff(ers)/cache". /var/run, for example, is now a symlink to /run on tmpfs, and this lives in that memory space. If you run a RAM disk for Amavis's tmp directory as I do, that memory, as I understand it, also lives in what top reports as buff/cache.
But on the Zimbra systems that were swapping heavily, buff/cache was unexpectedly large. Even a 32GB server with under 500 users would report 20GB or more as buff/cache, and maybe 5-10GB of swap file usage. A 500-mailbox server should easily be performant with 12GB-16GB of RAM.
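For anyone who wants to put numbers on this for their own server, here is a minimal sketch that pulls buff/cache and swap usage out of /proc/meminfo. The function name mem_summary is just illustrative, and the formula is an assumption: it treats buff/cache as Buffers + Cached + SReclaimable, which is how I understand recent versions of free and top to compute it.

```shell
#!/bin/sh
# Summarize buff/cache and swap usage from a meminfo-format file
# (default: /proc/meminfo). Values are in kB, as in /proc/meminfo.
# Assumption: buff/cache = Buffers + Cached + SReclaimable.
mem_summary() {
    awk '
        /^Buffers:/      { buffers = $2 }
        /^Cached:/       { cached  = $2 }
        /^SReclaimable:/ { srecl   = $2 }
        /^SwapTotal:/    { stotal  = $2 }
        /^SwapFree:/     { sfree   = $2 }
        END {
            printf "buff_cache_kb=%d\n", buffers + cached + srecl
            printf "swap_used_kb=%d\n", stotal - sfree
        }
    ' "${1:-/proc/meminfo}"
}

# Only report when running on a system that exposes /proc/meminfo.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```

Running it before and after a change gives two comparable snapshots, which is easier to share on the list than screenshots of top.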
What I came across was /proc/sys/vm/vfs_cache_pressure
This setting, which has a default of 100 and which I understand to range from 0 to 200, controls how much "pressure" is put on the kernel to reclaim memory used for caching directory and inode objects (the dentry and inode caches, which count toward buff/cache). The higher the number, the more aggressively the kernel is "pressed" to release that memory.
This to me seems not so dissimilar from vm.swappiness, which controls how aggressively the kernel is "pressed" to move data out of RAM and into swap. The default there is 60 (lots of swap file usage) and Zimbra's best practice is to set this to zero.
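Before changing anything, it is worth recording what both settings currently are. A minimal sketch follows; the function name show_vm_knobs is hypothetical, and the optional directory argument exists only so the function can be pointed at a test directory instead of /proc/sys/vm.

```shell
#!/bin/sh
# Print the current values of vm.swappiness and vm.vfs_cache_pressure.
# $1 (optional) = directory to read from; defaults to /proc/sys/vm.
show_vm_knobs() {
    vm_dir="${1:-/proc/sys/vm}"
    for knob in swappiness vfs_cache_pressure; do
        if [ -r "$vm_dir/$knob" ]; then
            printf 'vm.%s = %s\n' "$knob" "$(cat "$vm_dir/$knob")"
        else
            printf 'vm.%s = (not readable)\n' "$knob"
        fi
    done
}

show_vm_knobs
```

The same values can be read with "sysctl vm.swappiness vm.vfs_cache_pressure" if you prefer.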
So last night I ran as root "sysctl -w vm.vfs_cache_pressure=150" on a few systems, followed by "swapoff -a && swapon -a"
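Note that "sysctl -w" only lasts until the next reboot. A rough sketch of how the two commands above could be wrapped up, and the value persisted, is below. The function names and the file name 99-vfs-cache-pressure.conf are my own illustrative choices, and /etc/sysctl.d is assumed to be where your distro reads sysctl drop-in files from; adjust as needed.

```shell
#!/bin/sh
# Sketch: apply vm.vfs_cache_pressure at runtime and persist it across reboots.
# CONF_DIR and CONF_FILE are assumptions, not anything Zimbra-specific.
CONF_DIR="${CONF_DIR:-/etc/sysctl.d}"
CONF_FILE="99-vfs-cache-pressure.conf"   # hypothetical file name

persist_cache_pressure() {
    # $1 = desired value, e.g. 150. Writes a one-line sysctl drop-in file.
    printf 'vm.vfs_cache_pressure = %s\n' "$1" > "$CONF_DIR/$CONF_FILE"
}

apply_cache_pressure() {
    # $1 = desired value. Must run as root.
    # Sets the value now, then cycles swap so that pages already swapped
    # out are pulled back into RAM.
    sysctl -w "vm.vfs_cache_pressure=$1" &&
    swapoff -a && swapon -a
}

# Usage (as root):
#   apply_cache_pressure 150 && persist_cache_pressure 150
```

One caution: "swapoff -a" has to move everything currently in swap back into RAM, so make sure there is enough free memory before running it on a busy server.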
This morning, swap file usage was at most half of what it has been of late on all the systems on which I made this change. More importantly, end users continue to report the systems remain highly performant.
Anecdotally, I've noticed on one system that an IMAPSYNC that had been running for a few days, and which had averaged ~3.75-4.0 messages per second through most mailboxes, has been running this morning through the remaining mailboxes at more than 5 messages per second.
If your Zimbra system(s) are using the swap file more heavily than expected, I would be grateful if you would try this adjustment and report back. If you have experience with this setting, either with Zimbra or with other applications, please share -- especially if you feel that, just as we set vm.swappiness to zero, we should set vfs_cache_pressure to its max value of 200.
On next week's Zeta Alliance call I intend to add this to the agenda (and if you'd like an invite to that call, send me an email to mark.stone at MissionCriticalEmail.com). So if you are seeing unexpected swap file usage and want to try this, you can share your results on the call next week and/or on this list.
Another Message From... L. Mark Stone