Caching Policy/algorithm suggestion:

Suggestions around PrimoCache
Logic
Level 5
Posts: 47
Joined: Mon Oct 29, 2018 2:12 pm

Caching Policy/algorithm suggestion:

Post by Logic »

From research paper:
Differentiated Storage Services
by Intel Labs

"...We associate caching policies with various classes (e.g., large files shall be evicted before metadata and small files), and we show that end-to-end file system performance can be improved by over a factor of two, relative to conventional caches like LRU..."
https://www.researchgate.net/publicatio ... e_services

This makes sense, as HDDs are only ~4x slower than SATA SSDs at large sequential I/O,
but
~80x slower at small random I/O.

So I'm guessing an LFU algorithm, but one that evicts large I/O first.
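A minimal sketch of such a policy, assuming an LFU cache keyed by block ID with byte sizes (the class, its knobs and the capacity unit are all made up for illustration, not PrimoCache internals):

```python
from collections import defaultdict

class SizeAwareLFU:
    """LFU cache that evicts the largest entry among the least-frequently-used.

    Illustrative sketch only: block keys, sizes and the byte-based capacity
    are hypothetical, not how any real block cache is keyed.
    """

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.sizes = {}              # key -> size in bytes
        self.freq = defaultdict(int)

    def _evict_one(self):
        # Among the minimum-frequency entries, drop the largest first,
        # mirroring the "evict large I/O before small" idea.
        min_freq = min(self.freq[k] for k in self.sizes)
        victim = max((k for k in self.sizes if self.freq[k] == min_freq),
                     key=lambda k: self.sizes[k])
        self.used -= self.sizes.pop(victim)
        del self.freq[victim]

    def access(self, key, size):
        if key in self.sizes:
            self.freq[key] += 1
            return True              # cache hit
        while self.used + size > self.capacity and self.sizes:
            self._evict_one()
        if size <= self.capacity:
            self.sizes[key] = size
            self.used += size
            self.freq[key] = 1
        return False                 # cache miss
```

With this bias, one large cold block is sacrificed before many small cold blocks, which is where the HDD penalty is smallest.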

Also good would be parallel I/O:
read large-to-small from the HDD
in parallel with
small-to-large from the L2 cache.
The two reads will meet in the middle somewhere, at which point the request completes faster than from the L2 cache alone..!?
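The meet-in-the-middle idea above can be sketched with two threads walking the same block range from opposite ends; `hdd_read` and `ssd_read` are hypothetical per-block callables standing in for the two tiers (real cache software works on raw block I/O, not Python callbacks):

```python
import threading

def parallel_read(hdd_read, ssd_read, num_blocks):
    """Serve one request from both tiers at once (illustrative sketch).

    The HDD reader walks blocks 0..n-1 upward while the cache reader
    walks n-1..0 downward; whichever side claims a block first reads it.
    When a reader hits an already-claimed block, every remaining block in
    its direction was claimed by the other side, so it can stop.
    """
    result = [None] * num_blocks
    claimed = [False] * num_blocks
    lock = threading.Lock()

    def worker(reader, indices):
        for i in indices:
            with lock:
                if claimed[i]:
                    return        # the two readers have met in the middle
                claimed[i] = True
            result[i] = reader(i)

    t1 = threading.Thread(target=worker, args=(hdd_read, range(num_blocks)))
    t2 = threading.Thread(target=worker, args=(ssd_read, reversed(range(num_blocks))))
    t1.start(); t2.start()
    t1.join(); t2.join()
    return result
```

The request finishes when the slower of the two partial reads finishes, so the best case approaches the combined bandwidth of both devices.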
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Caching Policy/algorithm suggestion:

Post by Support »

Thanks, we also have such ideas on improving cache performance.
Logic
Level 5
Posts: 47
Joined: Mon Oct 29, 2018 2:12 pm

Re: Caching Policy/algorithm suggestion:

Post by Logic »

Support wrote: Mon Jul 05, 2021 2:01 am Thanks, we also have such ideas on improving cache performance.
Yes, I mentioned it before. :)

Assuming namespace data cannot be read per the NVMe spec 1.4 etc.:
https://nvmexpress.org/wp-content/uploa ... tified.pdf
https://www.anandtech.com/show/14543/nv ... -published

Here is code to determine an SSD's (clustered) erase block, page, buffer size etc. to speed up I/O:
http://csl.skku.edu/papers/mascots09.pdf
InquiringMind
Level SS
Posts: 477
Joined: Wed Oct 06, 2010 11:10 pm

Re: Caching Policy/algorithm suggestion:

Post by InquiringMind »

Logic wrote: Sat Jul 03, 2021 11:55 pm From research paper:
Differentiated Storage Services
by Intel Labs...So I'm guessing a LFU algo, but evict large I/O 1st algo
It's an interesting link and highlights an issue with how large files can have a disproportionate effect on caching. The problem for PrimoCache at the moment is that it runs at block level, so it isn't aware of which file each block belongs to (Windows' own file cache, on the other hand, does run at file level).

Being able to identify such files and exclude them from caching could be useful - perhaps working by file extension, with "media files" like .mp4, .wmv, etc being excluded by default, or excluding all files over a certain size (both options being user configurable). The question then is how much impact adding these features would have on PrimoCache's performance given the extra housekeeping needed (creating and maintaining a block exclusion list, checking every access against that list, etc).
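The admission check described above could look something like this sketch; the extension list, size threshold and function name are hypothetical defaults, not a PrimoCache feature:

```python
import os

# Hypothetical user-configurable policy of the kind described above:
# exclude common media extensions and anything over a size threshold.
EXCLUDED_EXTENSIONS = {".mp4", ".wmv", ".mkv", ".avi"}
MAX_CACHEABLE_BYTES = 256 * 1024 * 1024   # 256 MB, arbitrary default

def should_cache(path, size_bytes):
    """Return True if a file is worth admitting to the cache."""
    ext = os.path.splitext(path)[1].lower()
    if ext in EXCLUDED_EXTENSIONS:
        return False
    return size_bytes <= MAX_CACHEABLE_BYTES
```

The check itself is cheap; as noted, the real cost for a block-level cache would be mapping blocks back to files so the check can be applied at all.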
Logic wrote: Sat Jul 03, 2021 11:55 pm Also good would be parallel I/O:
Read large to small from the HDD
in parallel with
Small to Large from the L2 cache
The 2 reads will meet in the middle somewhere, at which point the request/s is complete at faster than L2 cache speed..!?
This I think would be "interesting" rather than "practical". On a heavily-used system, the HDD will already be busy with write traffic (copying "dirty" L2 cache data to HDD) and adding extra read requests for the HDD would reduce performance disproportionately (due to its seek time). Also SSDs now can saturate SATA bandwidth so an HDD on the same controller as the SSD might not offer any performance benefit at all. Finally, applications making read/write requests specify a start cluster and the number of clusters to read for a specific file (you can monitor this with software like Process Monitor) - if a process chooses to read a large amount of data in small chunks then parallel I/O isn't going to be much help.
Logic
Level 5
Posts: 47
Joined: Mon Oct 29, 2018 2:12 pm

Re: Caching Policy/algorithm suggestion:

Post by Logic »

InquiringMind wrote: Mon Aug 09, 2021 9:28 am It's an interesting link and highlights an issue with how large files can have a disproportionate effect on caching. The problem for PrimoCache at the moment is that it runs at block-level, so isn't aware of what file each block belongs to (Windows' own file cache on the other hand, does run at file level).

Being able to identify such files and exclude them from caching could be useful - perhaps working by file extension, with "media files" like .mp4, .wmv, etc being excluded by default, or excluding all files over a certain size (both options being user configurable). The question then is how much impact adding these features would have on PrimoCache's performance given the extra housekeeping needed (creating and maintaining a block exclusion list, checking every access against that list, etc).
IIRC the paper does in fact discuss file-unaware block caching.
As I understand it, it's simply an LRU algorithm, modified to have a propensity for demoting large blocks, or perhaps collections of sequential blocks, first.
That being the case, one is already halfway to "small blocks/random from cache" and "large blocks/sequential from HDD".
All that is required is parallel/simultaneous reading from both sources.
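The size-biased LRU described above might be sketched like this, assuming a cache keyed by block with byte sizes; the class and its `WINDOW` knob are invented for illustration, not taken from the paper or from PrimoCache:

```python
from collections import OrderedDict

class SizeBiasedLRU:
    """LRU variant with a propensity to demote large entries first.

    Sketch only: on eviction it scans the oldest WINDOW entries and drops
    the largest, so big sequential runs leave the cache before small
    random blocks of similar age. WINDOW is a made-up tuning knob.
    """
    WINDOW = 4

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()   # key -> size, oldest first

    def _evict_one(self):
        oldest = list(self.entries)[: self.WINDOW]
        victim = max(oldest, key=lambda k: self.entries[k])
        self.used -= self.entries.pop(victim)

    def access(self, key, size):
        if key in self.entries:
            self.entries.move_to_end(key)   # refresh recency on a hit
            return True
        while self.used + size > self.capacity and self.entries:
            self._evict_one()
        if size <= self.capacity:
            self.entries[key] = size
            self.used += size
        return False
```

Setting `WINDOW = 1` recovers plain LRU; widening it strengthens the bias toward demoting large blocks.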

If the caching is size-aware, the below should also be possible without file awareness.
Now I'm no dev. All I know, from the paper, is that it's possible.
InquiringMind wrote: Mon Aug 09, 2021 9:28 am This I think would be "interesting" rather than "practical". On a heavily-used system, the HDD will already be busy with write traffic (copying "dirty" L2 cache data to HDD) and adding extra read requests for the HDD would reduce performance disproportionately (due to its seek time). Also SSDs now can saturate SATA bandwidth so an HDD on the same controller as the SSD might not offer any performance benefit at all. Finally, applications making read/write requests specify a start cluster and the number of clusters to read for a specific file (you can monitor this with software like Process Monitor) - if a process chooses to read a large amount of data in small chunks then parallel I/O isn't going to be much help.
As I understand things, PrimoCache promotes/demotes data to/from the cache when I/O is otherwise idle.
During heavy use, data will be fetched directly from whichever media it resides on anyway.
If L2 is full and data needs to be demoted, it simply does not make sense to bother during heavy use, as that would further slow things down.

I cannot speak to the internals of all the various SATA controllers, but IIRC, as long as each SATA drive is on its own cable, simultaneous I/O with no loss in speed is the norm.
That's how RAID works...

On an HDD (and even an SSD, to a far lesser extent), a large amount of data read in small chunks is still sequential, as long as the drive is defragmented and said data is physically sequential.
(That's one of the reasons I feel a good defragmenter should be integrated into PrimoCache.
MyDefrag, for instance, is open source and gives I/O around ~15% faster than Windows' defragmenter.)
i.e. an I/O acceleration system that speeds up ALL data.


(You may also find my setup interesting:)
I cache QD1 R4K to an 800P Optane (fastest of all Optanes at QD1 R4K) using ReadyBoost.
(SuperFetch's 'filtered for R4K' L3 cache, if you will.
It will also contain a 2nd pagefile.)

The cache software overhead is there, but I get well over 120 MB/s of R4K and
SIMULTANEOUS
large sequential reads from the PrimoCache cache (a Corsair MP600 NVMe 4 drive).

My L1 cache is write-only, as I believe Windows does a good enough job of FILE-AWARE read caching,
and that,
with the help of ReadyBoost, this file-aware (pre)caching indirectly makes PrimoCache somewhat file aware...

ReadyBoost is not ideal, as it's a write-through cache and not persistent through reboots, but it's all I can find with the above desired properties.

I will also work on decreasing both caching apps' software overhead by:
Ensuring Message Signaled Interrupts (MSI) are used.
Affining the processes and interrupts to one or two (close) core complexes (NUMA-ish), and perhaps increasing the priority of both the processes and interrupts.

My bottom tier will consist of 2x EXPRESS4K Compacted FireCuda 3.5" SSHDs (no SMR on the 3.5" drives; 2x 8 GB of NAND cache).
I think I may be able to put them in RAID 0 successfully by disabling write-cache buffer flushing, giving me a 4th, 16+ GB tier of caching, just for fast boots and fun! :)
(Double caching should subside as files 'now read' from PrimoCache are demoted.)

Naturally there's an oft used backup drive. :)

I will write all this up upon my return from Africa.