sqlite caching with vmtouch
I have a script that queries a sqlite database file with ~8M rows in the main table and measures ~1.3GiB on disk. With a cold start, the script takes around 6 minutes to execute:
$ # clear the cache: https://www.kernel.org/doc/Documentation/sysctl/vm.txt
$ sudo sync
$ echo 3 | sudo tee /proc/sys/vm/drop_caches
$ time ./target/release/app
...
...
real 6m7.636s
user 0m14.322s
sys 0m15.043s
From time
’s output, we can infer that the script is spending most of its time on I/O.
Since the db file is not large, bringing the entire file into the disk/page cache should speed things up.
To do this, I found a small utility called vmtouch
.
The code
it uses to preload is straightforward: memory map the file and then read it with a stride of PAGESIZE.
The total runtime (including time to cache the file) reduces to ~24 seconds giving a speed up of ~15x:
$ # clear the cache
$ sudo sync
$ echo 3 | sudo tee /proc/sys/vm/drop_caches
$ vmtouch -vt db.sqlite
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 319313/319313
Files: 1
Directories: 0
Touched Pages: 319313 (1G)
Elapsed: 9.6319 seconds
real 0m9.674s
user 0m0.084s
sys 0m0.569s
$ time ./target/release/app
...
...
real 0m14.051s
user 0m7.056s
sys 0m6.796s