Compression (with an option to turn it on/off) for cache data

npelov
Level 5
Posts: 55
Joined: Thu Jun 30, 2016 3:01 pm

Compression (with an option to turn it on/off) for cache data

Post by npelov »

I want to request compression (with an option to turn it on/off and choose the algorithm) for cache data (not metadata). ZFS compresses its cache with near-zero impact on performance and CPU usage by using LZ4, a really high-speed algorithm. I don't know what its license is, but you can check it out. LZO is another fast algorithm. There are libraries that you can port from Linux.

Memory is expensive. You can sometimes double it using compression.
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Compression (with an option to turn it on/off) for cache data

Post by Support »

Thank you for your suggestion. We considered this before. It is not easy to implement in the current program design.
npelov
Level 5
Posts: 55
Joined: Thu Jun 30, 2016 3:01 pm

Re: Compression (with an option to turn it on/off) for cache data

Post by npelov »

I agree it's not easy, but it does take effort to make something good. Currently the L1 cache does not offer much more than the Windows cache. Compression would be a big advantage. There are open source libraries for most compression algorithms.
You could add a compression type attribute to blocks, e.g. 0 = no compression, 1 = gzip, etc. Then just store blocks uncompressed at first, and another thread (or multiple threads) can compress the blocks queued for compression.
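
To make the idea concrete, here is a minimal Python sketch of a per-block compression type attribute with a background compression thread. It is only an illustration under assumptions: the names `CacheBlock`, `Cache`, `COMP_NONE` and `COMP_DEFLATE` are made up for this example, and zlib (DEFLATE) from the standard library stands in for whichever algorithm (gzip, LZ4, LZO) would actually be chosen. It says nothing about how PrimoCache is really built.

```python
import queue
import threading
import zlib
from dataclasses import dataclass

# Hypothetical type codes, following the "0 = no compression, 1 = gzip" idea.
COMP_NONE = 0
COMP_DEFLATE = 1  # zlib/DEFLATE is a stand-in for gzip, LZ4, LZO, ...


@dataclass
class CacheBlock:
    data: bytes          # stored bytes (raw or compressed)
    comp_type: int       # how `data` is encoded
    raw_size: int        # size of the uncompressed block


class Cache:
    def __init__(self):
        self.blocks = {}                 # block number -> CacheBlock
        self.lock = threading.Lock()
        self.pending = queue.Queue()     # block numbers waiting for compression
        worker = threading.Thread(target=self._compress_loop, daemon=True)
        worker.start()

    def write(self, block_no, data):
        # Writes are stored uncompressed, so the write path stays fast.
        with self.lock:
            self.blocks[block_no] = CacheBlock(data, COMP_NONE, len(data))
        self.pending.put(block_no)

    def read(self, block_no):
        with self.lock:
            blk = self.blocks[block_no]
            data, comp_type = blk.data, blk.comp_type
        if comp_type == COMP_DEFLATE:
            return zlib.decompress(data)
        return data

    def _compress_loop(self):
        while True:
            block_no = self.pending.get()
            try:
                with self.lock:
                    blk = self.blocks.get(block_no)
                    raw = blk.data if blk and blk.comp_type == COMP_NONE else None
                if raw is None:
                    continue
                packed = zlib.compress(raw, 1)   # level 1 = fastest
                with self.lock:
                    # Swap only if the block was not rewritten meanwhile
                    # and compression actually saved space.
                    if self.blocks.get(block_no) is blk and len(packed) < len(raw):
                        blk.data = packed
                        blk.comp_type = COMP_DEFLATE
            finally:
                self.pending.task_done()
```

Blocks that do not shrink simply keep type 0, so incompressible data costs one wasted compression attempt in the background and nothing on later reads.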

I can see that RAM is allocated even when the cache is empty. You probably allocate the whole cache at once and then split it. What you could do instead is allocate smaller chunks (e.g. 128 MiB), let's call them sectors, and only allocate a new sector when the last one is filled.

So when a block is written, the cache searches for empty space in existing sectors before allocating more. When a block is compressed, it goes to a new place where it is stored as a compressed block, and the uncompressed block is freed to allow more blocks to be written.

Of course this will fragment the memory because compressed blocks have different sizes, but you could run a scheduled defragmentation on sectors by moving blocks to the beginning. Or, better, you could copy the blocks to a new sector and release the old one.
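
A rough Python sketch of the sector scheme described above, assuming a simple append-only layout inside each sector and a "copy live blocks to fresh sectors" compaction step. `Sector`, `SectorStore` and their methods are hypothetical names for illustration, not anything from PrimoCache.

```python
SECTOR_SIZE = 128 * 1024 * 1024  # 128 MiB, as suggested above


class Sector:
    def __init__(self, size):
        self.buf = bytearray(size)
        self.used = 0      # next free offset (blocks are appended)
        self.live = {}     # block number -> (offset, length)

    def free_tail(self):
        return len(self.buf) - self.used

    def append(self, block_no, data):
        off = self.used
        self.buf[off:off + len(data)] = data
        self.live[block_no] = (off, len(data))
        self.used += len(data)


class SectorStore:
    def __init__(self, sector_size=SECTOR_SIZE):
        self.sector_size = sector_size
        self.sectors = []  # allocated lazily, one sector at a time
        self.index = {}    # block number -> Sector holding it

    def put(self, block_no, data):
        if len(data) > self.sector_size:
            raise ValueError("block larger than a sector")
        self.delete(block_no)
        # Look for room in existing sectors before allocating a new one.
        for sector in self.sectors:
            if sector.free_tail() >= len(data):
                break
        else:
            sector = Sector(self.sector_size)
            self.sectors.append(sector)
        sector.append(block_no, data)
        self.index[block_no] = sector

    def get(self, block_no):
        sector = self.index[block_no]
        off, length = sector.live[block_no]
        return bytes(sector.buf[off:off + length])

    def delete(self, block_no):
        sector = self.index.pop(block_no, None)
        if sector is not None:
            del sector.live[block_no]  # leaves a hole until compaction

    def compact(self):
        # Copy live blocks into fresh sectors and drop the old ones.
        old_sectors = self.sectors
        self.sectors = []
        self.index = {}
        for sector in old_sectors:
            for block_no, (off, length) in sector.live.items():
                self.put(block_no, bytes(sector.buf[off:off + length]))
```

This version only reuses space at the tail of a sector, so holes left by freed blocks are reclaimed by `compact()` rather than by an in-place free list. A real driver would also need locking and a bound on how much extra memory compaction may use.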
Jaga
Contributor
Posts: 692
Joined: Sat Jan 25, 2014 1:11 am

Re: Compression (with an option to turn it on/off) for cache data

Post by Jaga »

The overhead for compression on a small 4k block might not even be worthwhile, if the resulting ratio is perhaps not even 50% reduction. I could however see compression as useful for large block sizes (32K or larger), and with large L1 caches.
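
One way to check this on real data is to compress a sample in independent blocks of each size and compare the totals. The sketch below is a hedged illustration: it uses zlib from the Python standard library rather than LZ4, and `make_sample` produces synthetic, log-like text that is purely an assumption, so the numbers it gives say nothing about any particular workload.

```python
import zlib


def make_sample(lines=20000):
    # Hypothetical, text-like test data; replace with bytes from a real volume.
    return b"".join(
        f"event {i} status=ok user={i % 97}\n".encode() for i in range(lines)
    )


def compressed_fraction(data, block_size, level=1):
    """Return stored size / original size when each block is compressed alone."""
    if not data:
        raise ValueError("need some data to measure")
    total_in = 0
    total_out = 0
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        packed = zlib.compress(block, level)
        total_in += len(block)
        # A cache would keep the raw block when compression does not help.
        total_out += min(len(packed), len(block))
    return total_out / total_in


if __name__ == "__main__":
    sample = make_sample()
    for size in (4096, 32768):
        print(size, round(compressed_fraction(sample, size), 3))
```

A fraction of 0.5 corresponds to the "50% reduction" mentioned above. Running this against a dump of actual cached data would show whether 4K blocks get anywhere near that.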

Again however, with most boot drives being sold today as either 3.5"/2.5" SSDs (or NVMes), the L1 performance with compression might make IOPS suffer too much when compared to these drives, making compression an unwanted characteristic.

The place where compression might really shine is on data/archive arrays where the underlying drives are still spinners. With large block sizes and server-sized Cache Tasks (think 64GB or larger), you could effectively cache a decent sized spinner drive or array of drives.

Still, it's a very niche use, given that most drives you'd put an L1 cache task on are going to be SSD/NVMe drives.
InquiringMind
Level SS
Posts: 477
Joined: Wed Oct 06, 2010 11:10 pm

Re: Compression (with an option to turn it on/off) for cache data

Post by InquiringMind »

Jaga wrote: Wed Nov 25, 2020 10:51 pm The overhead for compression on a small 4k block might not even be worthwhile, if the resulting ratio is perhaps not even 50% reduction. I could however see compression as useful for large block sizes (32K or larger), and with large L1 caches.
A 50% reduction would effectively double the size of L1 cache which should make it a desirable feature - larger blocks would typically compress better, but a 50% rate is likely to be optimistic.

Also, just as "whole disk compression" products like Stacker and SuperStor made most of their savings by reducing slack space (unused space at the end of clusters or sectors), it seems quite possible that PrimoCache could provide more efficient caching, since such slack space should compress very well.

There would be a cost in higher CPU usage, but compression/decompression could be easily split across multiple cores. The biggest reason against is really a programming one - compressed blocks will take up a varying amount of space making retrieval of individual cached blocks far harder and finding space for new blocks would also become more complex (consider the case of a new block that compresses poorly requiring the flushing of 3-4 other blocks that compressed well).
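
The space-management problem can be sketched with a byte-budgeted LRU, where entries have different stored sizes and one insert may push out several older entries. This is a simplified Python model with hypothetical names (`ByteBudgetLRU`, `put`, `get`). It ignores fragmentation and assumes plain LRU eviction, which may not match what PrimoCache actually does.

```python
from collections import OrderedDict


class ByteBudgetLRU:
    """LRU cache limited by total stored bytes instead of entry count."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()  # block number -> stored bytes, oldest first

    def get(self, block_no):
        data = self.entries.get(block_no)
        if data is not None:
            self.entries.move_to_end(block_no)  # mark as most recently used
        return data

    def put(self, block_no, stored):
        """Insert a (possibly compressed) block; return the evicted block numbers."""
        if len(stored) > self.capacity:
            raise ValueError("block does not fit in the cache at all")
        old = self.entries.pop(block_no, None)
        if old is not None:
            self.used -= len(old)
        evicted = []
        # A poorly compressed block may need the space of several small ones.
        while self.used + len(stored) > self.capacity:
            victim, data = self.entries.popitem(last=False)
            self.used -= len(data)
            evicted.append(victim)
        self.entries[block_no] = stored
        self.used += len(stored)
        return evicted
```

For example, with a 4096-byte budget holding four blocks that compressed to 1000 bytes each, inserting a new block that only compressed to 3000 bytes evicts the three oldest entries.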
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Compression (with an option to turn it on/off) for cache data

Post by Support »

InquiringMind wrote: Thu Jan 07, 2021 1:56 pm There would be a cost in higher CPU usage, but compression/decompression could be easily split across multiple cores. The biggest reason against is really a programming one - compressed blocks will take up a varying amount of space making retrieval of individual cached blocks far harder and finding space for new blocks would also become more complex (consider the case of a new block that compresses poorly requiring the flushing of 3-4 other blocks that compressed well).
:thumbup:
Toyzrme
Level 1
Posts: 3
Joined: Thu Mar 25, 2021 7:59 pm

Re: Compression (with an option to turn it on/off) for cache data

Post by Toyzrme »

I can understand compression for L2, but not L1.

L1 is all about speed - which includes IOPS, especially for random small blocks, since that's how programs generally access data (i.e. lots of little bites, even if sequential). Unless you're just caching text files which compress well, you might only get a 2x improvement at best. Worst case for video and audio, you'd get no space improvement (incompressible), and only slow it down. Adding more code in that path, more work, and vastly more complex cache management doesn't make much sense - the ratio of improvement from more space vs the slowdown from CPU load & latency would be a wash - or worse.

REFERENCE: PCIe 4.0 (i.e. NVMe) throughput is ~8GB/s. DDR4 is ~20-25GB/s. So the ratio is ~3:1 on current tech, meaning you can't add much code to the path before you negatively impact overall application performance.

L2, however, has a much higher ratio between device speeds. So a little bit of CPU time is more worthwhile to speed up very slow access (depending on hit ratios, i.e. cache to HDD size ratio). This was another Stacker advantage (and I owned one of their early ISA cards!): even using a slow CPU (80286), the time to read 2 sectors from the HDD was so much longer than the time to read a single 2x compressed sector and decompress it that it netted an overall speed improvement, AND doubled your drive's capacity.

Of course the numbers are different than in the 286 days, but I imagine the ratios aren't that much different. So I would think this might be an interesting advantage for L2 (especially if Romex increases the max cache size).

REFERENCE: HDD throughput is ~200MB/s, so roughly 40 times slower than PCIe 4.0.
The latency gap is even larger: NVMe = ~100 MICRO seconds at 50k IOPS, while HDDs are ~10 *MILLI*seconds (~100x slower).
IOPS is even more skewed: typical SATA 7,200rpm HDDs do less than 100 IOPS; the Samsung 980 Pro PCIe 4.0 NVMe claims 1,000,000, or about 10,000x higher.

So in the case of large HDD to L2 size ratios, I would think the higher cache hit rate and time saved by having 2x the data in cache would more than pay the penalty for compression - but I'd still want it as an option to decide based on my workload, HDD and L2 size, etc.
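
As a back-of-the-envelope model of that trade-off, the expected time per read can be written as hit rate times (cache latency plus decompression time) plus miss rate times backing-device latency. The Python sketch below uses the ~100 microsecond NVMe and ~10 millisecond HDD figures from above. The hit rates (50% without compression, 70% with) and the 20 microsecond decompression cost are pure assumptions for illustration, and `avg_access_us` is a made-up helper name.

```python
def avg_access_us(hit_rate, cache_us, backing_us, decompress_us=0.0):
    """Expected microseconds per read for a simple one-level cache model.

    Hits pay the cache latency plus any decompression time;
    misses pay only the backing device latency.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * (cache_us + decompress_us) + (1.0 - hit_rate) * backing_us


NVME_US = 100.0     # L2 device latency, from the figures above
HDD_US = 10_000.0   # backing HDD latency, from the figures above

# Assumed numbers: compression lifts the hit rate from 50% to 70%
# and adds 20 microseconds of CPU time to every hit.
plain = avg_access_us(0.50, NVME_US, HDD_US)
packed = avg_access_us(0.70, NVME_US, HDD_US, decompress_us=20.0)

if __name__ == "__main__":
    print(f"no compression:   {plain:.0f} us per read")
    print(f"with compression: {packed:.0f} us per read")
```

With these assumed inputs the average drops from roughly 5050 to roughly 3084 microseconds per read, because the extra hits avoid 10 ms HDD trips. Plugging in RAM-versus-NVMe latencies instead shows how much less room there is for added CPU time at L1.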


That said, I get that managing variable-length blocks due to varying compression rates is a nightmare! (but sounds like a fun problem to at least model, if not solve ;)