Tiitu wrote: ↑Mon Dec 20, 2021 8:33 am
Somebody recommended using a cache size that is twice the RAM size. But why not use a much bigger cache and a long latency, if you can afford it and you have a power supply that is as fail-safe as a UPS?
I don't know who would recommend twice the RAM size as cache, or why. That's IMHO nonsense. What you have to take care of is the block size of your drives, the block size of your cache and how much RAM the cache index takes up.
Example:
If you have formatted your HDD with 512 Byte blocks and use 512 Byte cache blocks, you would need at least 8 times more RAM for the index than with 4 KByte blocks for HDD formatting and caching. In fact, I use 64 KByte blocks for formatting and cache, resulting in 16 times lower RAM usage for the index than with 4 KByte blocks, or 128 times lower RAM usage than with 512 Byte blocks.
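To make that scaling tangible, here is a minimal Python sketch. The 32 bytes per index entry and the 2 TByte cache size are assumptions for illustration only (PrimoCache does not publish the exact per-block overhead); the ratios between block sizes hold regardless of those values.

```python
# Back-of-the-envelope sketch of how the cache block size drives index RAM
# usage. The 32 bytes per index entry and the 2 TByte cache size are assumed
# example values, not documented PrimoCache figures; only the ratios between
# block sizes matter here.

CACHE_SIZE = 2 * 1024**4          # example: 2 TByte L2 cache
BYTES_PER_INDEX_ENTRY = 32        # assumption, purely illustrative

def index_ram(block_size: int) -> int:
    """RAM needed for the cache index at a given cache block size."""
    return (CACHE_SIZE // block_size) * BYTES_PER_INDEX_ENTRY

for label, size in [("512 Byte", 512), ("4 KByte", 4 * 1024), ("64 KByte", 64 * 1024)]:
    print(f"{label:>9} blocks: {index_ram(size) / 1024**2:9.1f} MByte index RAM")

# -> 8x between 512 Byte and 4 KByte, 16x between 4 KByte and 64 KByte,
#    128x between 512 Byte and 64 KByte, whatever the per-entry size is.
```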
Using cache blocks that are bigger than the HDD blocks makes no sense either; at best, both match 1:1. It is worth reformatting if needed.
However, choosing the right formatting block size depends strongly on your overall drive size as well as on the specific kind of files you are going to store on it.
Example:
If you work mainly with large video, sound and image files, a 64 KByte block size will benefit you with faster reads and less I/O overhead. This is really where PrimoCache shines.
Vice versa, if you only work with small 1-10 KByte text files, like Word documents of a few pages, you might lose a lot of available disk space, because 64 KByte blocks are excessively larger than required. A compromise might be to use file system compression in that case, but I would assume that affects performance; I never really tested it. After all, working with many small files doesn't sound like a workload that needs a fast, NVMe-SSD-cached device at all, unless you are going to process millions of those files in minutes.
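Just to put a number on that slack, here is a small sketch; the file and cluster sizes are arbitrary example values, not measurements.

```python
import math

# Sketch of allocation slack: how much disk space small files waste when the
# cluster (block) size is much larger than the files themselves. The file
# sizes are made-up examples of "a few page" documents.

def allocated(file_size: int, cluster: int) -> int:
    """Space consumed on disk: the file size rounded up to whole clusters."""
    return math.ceil(file_size / cluster) * cluster

files = [1_500, 4_200, 9_800]      # hypothetical small document files, in bytes

for cluster in (4 * 1024, 64 * 1024):
    used = sum(allocated(f, cluster) for f in files)
    data = sum(files)
    print(f"{cluster // 1024:>2} KByte clusters: {used:>7} bytes allocated "
          f"for {data} bytes of data ({used / data:.1f}x overhead)")
```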
In my old company we used PrimoCache Server to cache a 64 TByte volume with a 2 TByte cache for video editing etc. over 10 GBit Ethernet links. The cache was a partition on a 4-drive SATA SSD array, as the server had no NVMe ports yet. The server was connected to the switch with 40 GBit, and a whole render farm was able to read files from the server at almost full speed. Especially in a render farm scenario, where hundreds of CPU cores access the very same source files, a PrimoCache-enhanced server shines to its full potential. Cached files fly in from the network instantly, with almost no performance hit. Even our RAID ran beyond 2 GByte/s, but the SSDs were simply faster due to access times, which are orders of magnitude better on SSDs than on HDDs/RAID controllers.
I can only point you to my previous recommendation (see a few messages above) to measure how much data you really access in a single day, double that, and you should be fine for the cache size. If you have free RAM that you can spend on an L1 RAM cache, fine, but what matters is the L2 cache size in relation to how much data is accessed during a single day. Measuring this is, after all, possible with PrimoCache itself, as explained earlier.
Using a long latency is fine, but carries some risk. If you have a UPS and redundant power supplies in your machine, go for it. If not, reconsider. For a read cache: no problem at all. For a write cache: beware! It is also important that write data that has already been flushed to disk is not immediately removed from the cache. The point of the latency is that you set a maximum time period to collect write data and then write it off to the SSD at once, to reduce wear and tear on the cache drives.
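If it helps to picture the risk, here is a rough, hypothetical Python sketch of the deferred-write idea. It is not how PrimoCache is actually implemented, only an illustration of why writes still buffered within the latency window are lost on power failure.

```python
import time

# Hypothetical sketch of the deferred-write idea: collect writes in RAM for a
# latency window and flush them in one batch. NOT PrimoCache's actual
# implementation; a real cache would also flush from a background timer,
# while this sketch only checks the window when a new write arrives.

class DeferredWriteBuffer:
    def __init__(self, latency_seconds: float, flush_to_disk):
        self.latency = latency_seconds
        self.flush_to_disk = flush_to_disk    # callable that persists a batch
        self.pending = []                     # writes held only in volatile RAM
        self.window_start = 0.0

    def write(self, block_id: int, data: bytes) -> None:
        if not self.pending:
            self.window_start = time.monotonic()
        self.pending.append((block_id, data))
        # Once the latency window has elapsed, write everything out at once.
        if time.monotonic() - self.window_start >= self.latency:
            self.flush()

    def flush(self) -> None:
        # Anything still in self.pending is lost on power failure; after this
        # call the data is safe on the backing drive.
        if self.pending:
            self.flush_to_disk(self.pending)  # one large, batched write
            self.pending = []

# Example: 60 second latency, flushing via a stand-in function.
buf = DeferredWriteBuffer(60.0, lambda batch: print(f"flushing {len(batch)} blocks"))
buf.write(0, b"some data")
buf.flush()   # an orderly shutdown must flush whatever is still pending
```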