support wrote:
Axel Mertes wrote:
1. Implement a caching strategy where any written block is automatically written to the L2 cache too (rather than the L1 cache only, as it is now).
Reason: We create far more soon-to-be-re-read data in a short time than the L1 cache can hold. Moving these blocks into a *large* L2 cache would have a strong impact on read speeds.
When the write-data is re-read, it will be stored in the level-2 cache.
Yes, I understand that re-reading data from the HDD will cause it to be moved into the L2 read cache.
However, it is a HUGE waste of performance in our scenario.
Example:
When I have to render a "4K" 4096x2160 pixel image sequence in uncompressed 32 bit/channel OpenEXR format, each single frame is ~138 MByte in size. For a 40-second sequence at 24 frames/second that is 40 seconds * 24 frames/second * ~138 MByte/frame = ~132,480 MBytes = ~132.5 GBytes, for just 40 seconds of 4K! The renderfarm can produce that data in just a few minutes.
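Just to double-check my own numbers, a quick back-of-the-envelope calculation (assuming 4 channels of 32-bit float, uncompressed; actual OpenEXR sizes vary a bit with channel layout and headers, which is why I quote ~138 MByte above):

```python
# Ballpark check of the figures above (assumes 4 channels of 32-bit float,
# uncompressed; real OpenEXR files vary with channel layout and headers)
width, height = 4096, 2160
channels, bytes_per_channel = 4, 4            # RGBA, 32-bit float

frame_bytes = width * height * channels * bytes_per_channel
frames = 40 * 24                              # 40 seconds at 24 fps
sequence_bytes = frame_bytes * frames

print(f"per frame: {frame_bytes / 1e6:.0f} MByte")     # ~142 MByte
print(f"sequence:  {sequence_bytes / 1e9:.1f} GByte")  # ~136 GByte
```

Either way, the sequence lands at well over 100 GBytes, which is the point that matters here.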
The size of this sequence alone is about 10 times the size of the L1 cache. So writing the data to the L1 cache has no effect at all, because it is wiped out long before the rendering of the sequence is finished. Then the user reviews the sequence and it has to be re-read from the HDD RAID. It would be better to write the data to the HDD and the L1 cache, and to make deferred writes from L1 to the L2 cache as well (deferred to reduce wear on the L2 cache SSD). This off-loads the L1 cache and still gets the data into the L2 cache extremely fast.
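To make concrete what I mean by "write to HDD and L1, defer the copy to L2", here is a minimal sketch of the write path I have in mind. This is purely illustrative pseudocode on my side, not your implementation; l1, l2, hdd_write and the queue are all made-up names:

```python
import queue
import threading

# Hypothetical write path: write-through to HDD + L1, deferred copy to L2.
# l1 / l2 are stand-ins (plain dicts) for the real caches; hdd_write is a stub.
l1, l2 = {}, {}
l2_queue = queue.Queue()

def hdd_write(block_id, data):
    pass  # stand-in for the real write to the HDD RAID

def write_block(block_id, data):
    hdd_write(block_id, data)        # write-through to the target disks
    l1[block_id] = data              # keep a hot copy in RAM (L1)
    l2_queue.put((block_id, data))   # defer the copy to the SSD (L2)

def l2_writer():
    # background thread: drain the queue so the SSD sees batched
    # writes instead of one write per block, reducing wear
    while True:
        block_id, data = l2_queue.get()
        l2[block_id] = data
        l2_queue.task_done()

threading.Thread(target=l2_writer, daemon=True).start()
```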
If you would allow storing write data in the L2 cache too, right at the moment it is written to the HDD, then we would not need to read from the HDD RAID at all, provided our L2 cache is big enough. Our L2 cache is currently 2 TByte. We would make it even larger if you implement this at some point, but 2 TByte is OK for a single drive for the moment.
Most of the time our users work like this:
Render, review, make changes, re-render, review, make changes, render, and so on. It can take a few dozen or even a hundred iterations until the user is satisfied with the result. So it is the very same sequence, over and over again. The writes would still go to the HDD, but the reads might come 100% from the L2 cache, freeing a lot of performance on the HDD RAID. This would significantly reduce wear and tear on the HDDs and improve the overall performance of the HDD RAID (because the sequences no longer have to be read from disk all the time).
I have made a huge effort to analyse the amount of data "touched" by our users on a single day. I found that we usually access around 0.5 to 2 TByte of data per working day. But in truth we access far more: when we sum up the reads and the over-writes of the very same files, it can be a few TBytes per day. So if the L2 cache is bigger than the total size of the files, we could work almost entirely from the cache. When we over-write data blocks with fresh versions, those stay in the cache too.
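For what it's worth, the accounting I did looks roughly like the sketch below; the trace format of (operation, path, bytes) tuples is just my own assumption of how such a log could look:

```python
from collections import defaultdict

def summarize_day(trace):
    """trace: iterable of (op, path, nbytes) tuples,
    e.g. ('read', 'shot_a/frame_0001.exr', 141557760)."""
    touched = 0                      # every read and over-write counted
    largest = defaultdict(int)       # each file counted only once
    for op, path, nbytes in trace:
        touched += nbytes
        largest[path] = max(largest[path], nbytes)
    unique = sum(largest.values())   # the actual working set per day
    return touched, unique
```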
support wrote:
Axel Mertes wrote:
2. Implement a cache strategy that provides a pure read cache and writes blocks straight to the target disks, but which also copies written blocks immediately to the L1 and L2 cache for imminent re-reads.
If you choose cache strategy "Read & Write" and without Defer-Write enabled, write-data will be written through to the target disks and also copied to L1 cache.
See example above.
I stumble over:
"...and without Defer-Write enabled, write-data will be written through to the target disks and also copied to L1 cache."
So this copy to the L1 cache will NOT happen when Defer-Write is enabled?
I always thought:
The L1 cache is used for the defer-write process: blocks are collected in the L1 cache and then written "at once", in combined, potentially sequential, write operations to the HDD.
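In other words, this is the mental model I had of Defer-Write; it is only my assumption of how it works, not your actual code:

```python
# My assumed model: dirty blocks collect in L1 (RAM) and are flushed
# in block order, so the HDD RAID sees fewer, larger, mostly sequential writes.
dirty = {}  # block_id -> data

def deferred_write(block_id, data):
    dirty[block_id] = data           # acknowledge now, write to disk later

def flush(hdd_write):
    for block_id in sorted(dirty):   # ascending block order
        hdd_write(block_id, dirty[block_id])
    dirty.clear()
```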
Now you wrote the exact opposite. Is that a mistake?
I am sure a lot of customers in my business would pay even more than the current Server version price for this, as it would save quite a lot of time and money in the long run, as you can easily demonstrate.
support wrote:
Axel Mertes wrote:
4. Implement an automatic pre-read of a user-customizable number of blocks into the L1 / L2 cache for any *read* block.
Thanks. This "Read-Ahead" feature might be implemented in future.
Again, thank you very much for your kind suggestions!
This is welcome news. Having a version that also outputs the cache operations (a log file...) would help a lot in simulating the results. If you provide me a version that outputs such a log, I can do the simulation...
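To show what I would do with such a log: a simple replay along these lines would already tell us the expected L2 hit rate. The "op block_id" line format is just an assumption; I would adapt it to whatever the log actually contains:

```python
from collections import OrderedDict

def simulate_l2(log_lines, capacity_blocks):
    """Replay lines like 'read 1234' / 'write 1234' through a simple
    LRU model of the L2 cache and report the read hit rate."""
    cache = OrderedDict()
    hits = reads = 0
    for line in log_lines:
        op, block_id = line.split()
        if op == "read":
            reads += 1
            if block_id in cache:
                hits += 1
                cache.move_to_end(block_id)
                continue
        cache[block_id] = True           # miss or write: block enters the cache
        cache.move_to_end(block_id)
        if len(cache) > capacity_blocks:
            cache.popitem(last=False)    # evict the least recently used block
    return hits / reads if reads else 0.0
```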
Btw, could you integrate a simple timer into the cache statistics, so we can see for how long the cache has been running to produce the given statistics?
And whenever we reset the statistics, the timer should be reset too, so we get an idea of how much data is processed in how much time.
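Something as small as this on the statistics side would already do (just to illustrate the idea, not a request for a specific API):

```python
import time

class CacheStats:
    """Counters plus a start timestamp, so throughput per interval can be derived."""
    def __init__(self):
        self.reset()

    def reset(self):
        self.started = time.time()   # resetting the counters restarts the timer
        self.bytes_read = 0
        self.bytes_written = 0

    def elapsed_seconds(self):
        return time.time() - self.started
```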