Enlightening findings on file compression and PrimoCache

FAQ, getting help, user experience about PrimoCache
RobF99
Level 8
Posts: 113
Joined: Fri Sep 19, 2014 5:14 am

Enlightening findings on file compression and PrimoCache

Post by RobF99 »

I noticed a few users asking for a compression feature within PrimoCache. I am sure it would be a difficult feature to implement and could introduce data-integrity issues and unnecessary overhead. Instead, just compress the data on your drive using the excellent new compression options in Windows 10 and 11. These are not available in earlier versions of Windows (apart from classic NTFS compression).

I ran some performance tests with PrimoCache and the various compression options in Windows (XPRESS4K, XPRESS8K, XPRESS16K, LZX and NTFS) and obtained some very interesting results.

Two findings that I want to mention up front are:

1. PrimoCache's L1 cache reads uncompressed data about 23% faster than the Windows cache!

2. You can get the best overall average performance and store about DOUBLE the file data in L1 and L2 (effectively doubling their size) with XPRESS16K compression.

CAUTION: Please be sure to know what you are doing if you are going to compress your data. Details of my disasters (and some advice) are below.

The image below contains the details and results of the performance tests; some more discussion follows it.

[Image: performance test results]

Further Discussion
If you want to calculate the actual read throughput, just divide the 5.38 GB test size by the read times in the table (converting the milliseconds to seconds).
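For example (hypothetical numbers, since the actual times are in the image above): a 2,000 ms read of the 5.38 GB test set works out to 5.38 GB / 2.0 s ≈ 2.7 GB/s.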

When you consider the very slow read time of the uncompressed data on the spinner, you can understand why Windows 10 and 11 drag so much on a regular HDD system. Even if you are not using any caching software, you can roughly double your Windows read performance just by compressing your data with the XPRESS16K algorithm.
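If you want to script this, here is a minimal sketch (Python, just wrapping the built-in compact.exe tool) of applying XPRESS16K to a folder. The target path is a made-up example; run from an elevated prompt if the folder needs admin rights.

Code:

# Minimal sketch: apply XPRESS16K to a folder with the compact.exe tool
# built into Windows 10/11. The target path is a hypothetical example.
import subprocess

target = r"D:\Archive\OldProjects"  # hypothetical folder of rarely-modified data

# /C = compress, /S:<dir> = recurse into the folder and its subfolders,
# /EXE:<alg> = pick the algorithm (XPRESS4K, XPRESS8K, XPRESS16K or LZX).
subprocess.run(
    ["compact.exe", "/C", "/EXE:XPRESS16K", f"/S:{target}"],
    check=True,
)

Running compact.exe with no options in that folder afterwards reports the compression state of the files.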

Even though uncompressed data reads from L1 around five times faster than XPRESS16K data, the large time difference in the table is for reading the full 5.38 GB. In general PC use, typical reads and writes (e.g. loading a program) are in the order of a few hundred MB, so you do not generally perceive any slower performance between uncompressed and XPRESS16K data reading from L1. It feels about the same, and you get the benefit of double the data in your cache.

This means that over long-term use you will hit the L1 and L2 caches about 10 to 15% more often than with uncompressed data. I typically sit at around a 93% cache hit rate (L1 and L2), but I do dedicate 24 GB of my 32 GB of RAM to L1. In doing so, I effectively run a super-lightning-fast 8 GB system, and if I get low on memory (very rare), anything that would be paged to disk is paged to L1 at L1 speeds.

The beauty of the new compression algorithms compared to NTFS compression is that they do not try to compress incompressible files; those files simply remain uncompressed on the disk, so Windows never has to decompress them on the fly, as it does with NTFS compression, which tries to compress all data. NTFS compression is also subject to severe fragmentation: it is not uncommon to end up with 50,000 fragments after compressing a 2 or 3 GB file. The new algorithms generally do not fragment files when you compress them; you might get a few fragments here and there.

Also understand that the new algorithms (XPRESS4K, XPRESS8K, XPRESS16K, LZX) are intended for data that is never modified, e.g. .DLL and .EXE files. A regular data file will compress, but as soon as you modify and save it, it is written back uncompressed (see the sketch below). That being said, if you have older data files that you no longer access or modify, an algorithm such as XPRESS16K is a great way to keep them ready to read in native Windows format while occupying about half the space on your drive.
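To see that last behaviour for yourself, here is a rough demo sketch (Windows only, Python; the file path is a made-up example) that compresses a file, modifies it, and compares the on-disk size before and after. It uses the GetCompressedFileSizeW API, which should report the physical space the file occupies:

Code:

# Rough demo: a file compressed with the new algorithms is written back
# uncompressed once it is modified. The path below is a hypothetical example.
import ctypes
import os
import subprocess
from ctypes import wintypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.GetCompressedFileSizeW.restype = wintypes.DWORD
kernel32.GetCompressedFileSizeW.argtypes = [
    wintypes.LPCWSTR, ctypes.POINTER(wintypes.DWORD)]

def size_on_disk(path: str) -> int:
    """Physical size of a file; for compressed files this shows the savings."""
    high = wintypes.DWORD(0)
    low = kernel32.GetCompressedFileSizeW(path, ctypes.byref(high))
    if low == 0xFFFFFFFF and ctypes.get_last_error():
        raise ctypes.WinError(ctypes.get_last_error())
    return (high.value << 32) | low

path = r"D:\Archive\old_report.dat"  # hypothetical, rarely-used data file

subprocess.run(["compact.exe", "/C", "/EXE:XPRESS16K", path], check=True)
print("logical:", os.path.getsize(path), "on disk:", size_on_disk(path))

with open(path, "ab") as f:   # modify and save the file...
    f.write(b"one more line\n")
print("on disk after modifying:", size_on_disk(path))  # ...savings are gone

If the second on-disk number jumps back to roughly the logical size after the modification, you have seen the write-back decompression in action.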

Hopefully this shows that you can take advantage of the compression features already built into Windows 10 and 11 and get double the benefit from PrimoCache. It should also take the pressure off Romex Software regarding the daunting and delicate task of implementing compression inside PrimoCache.
Elrichal
Level 1
Posts: 4
Joined: Fri Feb 07, 2020 11:43 pm

Re: Enlightening findings on file compression and PrimoCache

Post by Elrichal »

I've found that Win7 NTFS compression writes the uncompressed file first, then compresses it into a new file, and finally deletes the original uncompressed file. Does Win10/11 work the same way, or has M$ finally managed to compress the data on the fly using RAM?
RobF99
Level 8
Posts: 113
Joined: Fri Sep 19, 2014 5:14 am

Re: Enlightening findings on file compression and PrimoCache

Post by RobF99 »

Yes, these methods (XPRESS4K, XPRESS8K, XPRESS16K, LZX) compress on the fly, quite unlike NTFS compression, which works the way you described. NTFS compression in Windows 10 still works the same classic way, but the new methods are very different and much faster too. They also are not subject to fragmentation like NTFS compression is.

Note that these methods only work as a read process: compression is a one-time step, and the algorithms are designed for files that aren't modified. If you compress a regular data file, then modify and save it, it will be saved uncompressed. I hope in the future we will be able to save files this way too.
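For the technically curious: under the hood this scheme is the Windows Overlay Filter (WOF), and compact.exe applies it by sending FSCTL_SET_EXTERNAL_BACKING with the WOF_EXTERNAL_INFO and FILE_PROVIDER_EXTERNAL_INFO_V1 structures (the same structures documented at the Microsoft link Logic posts below). Here is a rough Python/ctypes illustration; the constants are from winioctl.h, and details such as access rights and error cases are glossed over:

Code:

# Rough sketch of what compact.exe /EXE does: mark a file as WOF-compressed
# via FSCTL_SET_EXTERNAL_BACKING. Illustration only, not a polished tool.
import ctypes
import struct
from ctypes import wintypes

FSCTL_SET_EXTERNAL_BACKING = 0x9030C  # CTL_CODE(FILE_DEVICE_FILE_SYSTEM, 195, ...)
WOF_PROVIDER_FILE = 2
XPRESS4K, LZX, XPRESS8K, XPRESS16K = 0, 1, 2, 3  # FILE_PROVIDER_COMPRESSION_*

GENERIC_READ_WRITE = 0xC0000000
OPEN_EXISTING = 3
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.CreateFileW.restype = wintypes.HANDLE
kernel32.DeviceIoControl.argtypes = [
    wintypes.HANDLE, wintypes.DWORD, wintypes.LPVOID, wintypes.DWORD,
    wintypes.LPVOID, wintypes.DWORD, ctypes.POINTER(wintypes.DWORD),
    wintypes.LPVOID]

def wof_compress(path: str, algorithm: int = XPRESS16K) -> None:
    handle = kernel32.CreateFileW(path, GENERIC_READ_WRITE, 0, None,
                                  OPEN_EXISTING, 0, None)
    if handle == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # WOF_EXTERNAL_INFO {Version=1, Provider} followed by
        # FILE_PROVIDER_EXTERNAL_INFO_V1 {Version=1, Algorithm, Flags=0}
        info = struct.pack("<5L", 1, WOF_PROVIDER_FILE, 1, algorithm, 0)
        returned = wintypes.DWORD(0)
        if not kernel32.DeviceIoControl(
                handle, FSCTL_SET_EXTERNAL_BACKING, info, len(info),
                None, 0, ctypes.byref(returned), None):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        kernel32.CloseHandle(handle)

# wof_compress(r"D:\Archive\old_report.dat")  # hypothetical path

Again, just an illustration; in practice compact.exe or a GUI like Compactor is the sensible way to do this.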
Support
Support Team
Posts: 3201
Joined: Sun Dec 21, 2008 2:42 am

Re: Enlightening findings on file compression and PrimoCache

Post by Support »

@RobF99, Thank you very much for your work! Very interesting results. :thumbup:
A relevant thread for other readers: viewtopic.php?t=5327
Elrichal
Level 1
Posts: 4
Joined: Fri Feb 07, 2020 11:43 pm

Re: Enlightening findings on file compression and PrimoCache

Post by Elrichal »

Thank you both, Rob and Support, for your studies and results; they are very interesting and informative reads.
Logic
Level 4
Posts: 35
Joined: Mon Oct 29, 2018 2:12 pm

Re: Enlightening findings on file compression and PrimoCache

Post by Logic »

Thx for the tests RobF99.

As:
Xpress4K compresses data in 4KB chunks.
Xpress8K compresses data in 8KB chunks.
Xpress16K compresses data in 16KB chunks.
https://docs.microsoft.com/en-us/window ... on_info_v1

Some questions:
Did you use the std 4K cluster size on the HDD?
What size was your PrimoCache Cache Block Size during tests?

ie:
Matching/aligning HDD cluster size and PrimoCache's cache block size to the Xpress 4/8/16 chunk size may well give different/better results..?

Some ideas/suggestions:
66% of Windows I/O is random 4K at a queue depth of 1 (R4K QD1).
Less than 1% of Windows I/O is the large sequential numbers advertisers like to wave around like burning flags..!
https://www.thessdreview.com/ssd-guides ... ers-bluff/

Newer games are doing more 16KB and 32KB reads:
https://www.gamersnexus.net/guides/1577 ... m-relevant
(But also NB; lots of 4KB writes)

RAID 0:
It is good for large sequential transfers, but as it adds processing/latency to the I/O stack, random I/O suffers and is generally slower than that of a single SSD in AHCI mode.

Windows typically does a 75/25% mix of simultaneous reads/writes (see the 70/30 Mix option in CrystalDiskMark).
This I/O mix typically slows I/O to around 30% of the advertised speeds..!
See the U shaped graph/s here for eg:
https://www.anandtech.com/show/16012/th ... d-review/6

ie:
You may well be better off splitting your SSD RAID 0 array:
Set 1 drive to cache reads and the other (50% overprovisioned) to cache writes. (Possibly with the accompanying delayed write L1 RAM caching)

This should give you the advertised x300 read and write speeds, along with (most important) faster Random ~4K speeds.

Everyone, NB Compactor by Freaky, a GUI for Windows 10's compact command:
https://github.com/Freaky/Compactor
I prefer this to CompactGUI because:

"...Compactor performs a statistical compressibility check on larger files before passing them off to Windows for compaction. A large incompressible file can be skipped in less than a second instead of tying up your disk for minutes for zero benefit..."

"...Compactor can skip over files that have been previously found to be incompressible, making re-running Compactor on a previously compressed folder much quicker..."
RobF99
Level 8
Posts: 113
Joined: Fri Sep 19, 2014 5:14 am

Re: Enlightening findings on file compression and PrimoCache

Post by RobF99 »

Logic wrote: Tue Jan 18, 2022 2:25 pm
Thx for the tests RobF99.

As:
Xpress4K compresses data in 4KB chunks.
Xpress8K compresses data in 8KB chunks.
Xpress16K compresses data in 16KB chunk
https://docs.microsoft.com/en-us/window ... on_info_v1

Some questions:
Did you use the std 4K cluster size on the HDD?
What size was your PrimoCache Cache Block Size during tests?
The HDD cluster size was 4K, and the PrimoCache block size was 32K. As far as I understand, the block size in PrimoCache doesn't affect performance at all; instead, it determines how much data surrounding an accessed block gets cached. I confirmed this by lining up the files from the folder consecutively on the drive: with a 512 KB block size it cached a fraction above the amount of data in the folder. When the files were scattered randomly about the drive, it cached almost twice that amount, because it was picking up many small blocks along with each of their surrounding 512 KB data blocks.

I tested block sizes from 512 KB down to 8 KB and it made no difference to performance, regardless of compression method and block size. It so happens that for my processor the sweet spot for transfer rates and compression is at 16 KB with XPRESS16K.
Logic wrote:
Some ideas/suggestions:
66% of windows I/O is Random 4K at a Que Depth of 1. (R4K QD1)
Less than 1% of windows I/O is the large sequential #s advertisers like to wave around like burning flags..!
https://www.thessdreview.com/ssd-guides ... ers-bluff/
Yes, I am familiar with this statistic. Pareto's Law applies to file access too!
Logic wrote:
Windows typically does a 75/25% mix of simultaneous reads/writes. (see the 70/30 Mixed option in Crystal Diskmark)
This I/O mix typically slows I/O to around 30% of the advertised speeds..!
See the U shaped graph/s here for eg:
https://www.anandtech.com/show/16012/th ... d-review/6
Yes, these are interesting curves.
Logic wrote:
ie:
You may well be better off splitting your SSD RAID 0 array:
Set 1 drive to cache reads and the other (50% overprovisioned) to cache writes. (Possibly with the accompanying delayed write L1 RAM caching)
How do I do this, since you can only have one L2 cache for each drive you are caching? Or am I missing some neat trick in PrimoCache?
Logic wrote:
Everyone NB the GUI for Windows 10 compact; Compactor by Freaky:
https://github.com/Freaky/Compactor
I prefer this to CompactGUI because:
Yes, I discovered this program just a few days ago. It does what it does very well!