repopulate/proactive level 1 cache (Topic is solved)

Report bugs or suggestions around FancyCache
nosiya
Guest

repopulate/proactive level 1 cache

Post by nosiya »

Hello,

a quick suggestion:
To preserve the optimization, repopulate the level 1 cache after a boot, reboot, crash, login, etc. (for example, by flushing the cache data to a file beforehand). Otherwise all the optimizations are lost.
Support
Support Team
Posts: 3627
Joined: Sun Dec 21, 2008 2:42 am

Re: repopulate/proactive level 1 cache

Post by Support »

Thanks for the suggestion.

The idea is great; however, there are many potential issues to consider, such as the scenario where disk data is changed offline. That is why we have not yet enabled the "persistent cache" feature.
manus
Level 4
Posts: 28
Joined: Fri Nov 18, 2011 6:03 pm

Re: repopulate/proactive level 1 cache

Post by manus »

Maybe you could save only which blocks (addresses) are used, not the blocks themselves, and at startup reload those blocks directly from the hard drive. With this system you are sure that the data is correct and fresh.
It's not a persistent cache, but it keeps the block-usage optimization.
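
A minimal sketch of this idea, not FancyCache's actual behavior: only the block numbers are written to a file, and at startup the data is read again from the disk. The names `save_hot_addresses` and `warm_cache`, the JSON index file, and the 4096-byte block size are all assumptions for illustration, and an ordinary file stands in for the disk.

```python
import json
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size in bytes


def save_hot_addresses(cache, path):
    """Persist only the block numbers held in the cache, not their data."""
    with open(path, "w") as f:
        json.dump(sorted(cache), f)


def warm_cache(device_path, path):
    """Rebuild the cache by re-reading each saved block from the disk itself."""
    with open(path) as f:
        addresses = json.load(f)
    cache = {}
    with open(device_path, "rb") as dev:
        for block_no in addresses:
            dev.seek(block_no * BLOCK_SIZE)
            cache[block_no] = dev.read(BLOCK_SIZE)
    return cache


# Demo: an ordinary file stands in for the disk.
workdir = tempfile.mkdtemp()
disk = os.path.join(workdir, "disk.img")
index = os.path.join(workdir, "hot_blocks.json")
with open(disk, "wb") as f:
    for i in range(8):
        f.write(bytes([i]) * BLOCK_SIZE)

old_cache = {2: b"stale", 5: b"stale"}  # cache contents before shutdown
save_hot_addresses(old_cache, index)

# Simulate an offline change to block 5 while the cache was not running.
with open(disk, "r+b") as f:
    f.seek(5 * BLOCK_SIZE)
    f.write(b"\xff" * BLOCK_SIZE)

new_cache = warm_cache(disk, index)  # block 5 now holds the changed data
```

Because the data always comes from the disk, an offline change cannot leave stale blocks in the cache; the cost is that warming the cache takes as long as reading those blocks.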
JimF
Level 4
Posts: 36
Joined: Sun May 01, 2011 10:13 pm

Re: repopulate/proactive level 1 cache

Post by JimF »

support wrote: Thanks for the suggestion.

The idea is great; however, there are many potential issues to consider, such as the scenario where disk data is changed offline. That is why we have not yet enabled the "persistent cache" feature.
I would like the option of "infinite latency", even if it is not repopulated after a reboot. My data is all distributed computing work that just overwrites old data with new data having the same set of file names, and I normally don't need to flush it at all.

If you want to also repopulate the cache after a reboot, that would be fine, but could be a separate option.
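
To illustrate why this kind of workload benefits, here is a rough sketch of a write cache that never flushes on its own. The class `DeferredWriteCache` and its counters are hypothetical, a dict stands in for the disk, and this is not how FancyCache implements deferred writes.

```python
class DeferredWriteCache:
    """Holds writes in memory until flush() is called explicitly."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the disk
        self.dirty = {}         # pending writes, keyed by block number
        self.disk_writes = 0

    def write(self, block_no, data):
        # A later write to the same block replaces the pending one.
        self.dirty[block_no] = data

    def read(self, block_no):
        # Pending data wins over whatever is on the disk.
        return self.dirty.get(block_no, self.backing.get(block_no))

    def flush(self):
        for block_no, data in self.dirty.items():
            self.backing[block_no] = data
            self.disk_writes += 1
        self.dirty.clear()


disk = {}
cache = DeferredWriteCache(disk)
for round_no in range(1000):      # the job rewrites the same 4 blocks
    for block_no in range(4):
        cache.write(block_no, f"result {round_no}".encode())
cache.flush()                     # e.g. at shutdown
```

Here 4000 application writes turn into 4 disk writes, because only the last version of each block reaches the disk. Anything still pending is lost if the machine crashes before `flush()`.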
Mradr
Level 7
Posts: 87
Joined: Sun Mar 25, 2012 1:36 pm

Re: repopulate/proactive level 1 cache

Post by Mradr »

Support, I think he means the cache statistics that are created when FC prioritizes the data by what is used the most. Think of the algorithms: LRU and LFU.

Atm, FC "seems" to recreate the cache statistics on every reboot instead of reloading what was used before. I guess you can look at this as good or bad.
support wrote: Thanks for the suggestion.

The idea is great; however, there are many potential issues to consider, such as the scenario where disk data is changed offline. That is why we have not yet enabled the "persistent cache" feature.
No, if you keep only the cache statistics themselves, it would not cause the issue above, as the data still has to be reloaded into memory. It just wouldn't have to rebuild the cache statistics, as it would already know what to load (and when, if it used my future load idea alongside it).
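
A small sketch of keeping just the statistics across a reboot, assuming a simple per-block access counter (an LFU-style ranking). The `AccessStats` class and the JSON file are made up for illustration; how FC actually tracks usage is not documented here.

```python
import json
import os
import tempfile
from collections import Counter


class AccessStats:
    """Per-block access counts, kept separately from the cached data."""

    def __init__(self):
        self.counts = Counter()

    def record(self, block_no):
        self.counts[block_no] += 1

    def hottest(self, n):
        """Return the n most frequently accessed block numbers."""
        return [block for block, _ in self.counts.most_common(n)]

    def save(self, path):
        # JSON keys must be strings, so convert block numbers on the way out.
        with open(path, "w") as f:
            json.dump({str(k): v for k, v in self.counts.items()}, f)

    @classmethod
    def load(cls, path):
        stats = cls()
        with open(path) as f:
            stats.counts = Counter({int(k): v for k, v in json.load(f).items()})
        return stats


stats = AccessStats()
for block_no in (7, 7, 7, 3, 3, 9):
    stats.record(block_no)

path = os.path.join(tempfile.mkdtemp(), "stats.json")
stats.save(path)                   # before shutdown

restored = AccessStats.load(path)  # after reboot
to_preload = restored.hottest(2)   # blocks worth reading from disk first
```

Only counts are stored, so nothing in the file can go stale; the blocks named in `to_preload` would still be read fresh from the disk.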


As for the "persistent cache" part of it:
Even if the data changes, why not check the data's MD5 hash to make sure it is current? This would increase the program's startup time, so the cache wouldn't be much use right at startup, but it would make offline changes "safer" to handle.
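
Roughly, the MD5 check could look like the sketch below. The layout of the saved entries and the `validate_entries` helper are assumptions, and a dict stands in for the disk.

```python
import hashlib


def md5_of(data):
    return hashlib.md5(data).hexdigest()


def validate_entries(saved, read_block):
    """Keep only saved entries whose checksum still matches the disk.

    saved: {block_no: (data, md5_hex_at_save_time)}
    read_block: callable returning the current on-disk bytes of a block
    """
    valid, stale = {}, []
    for block_no, (data, digest) in saved.items():
        if md5_of(read_block(block_no)) == digest:
            valid[block_no] = data
        else:
            stale.append(block_no)
    return valid, stale


disk = {1: b"alpha", 2: b"beta"}
saved = {n: (d, md5_of(d)) for n, d in disk.items()}  # written at shutdown
disk[2] = b"changed offline"                          # modified while FC was not running
valid, stale = validate_entries(saved, disk.__getitem__)
```

Note that hashing the current block means reading it from the disk anyway, which is where the extra startup time comes from.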

Then again... what data is being changed offline? Most users won't be booting into Linux and changing data on a Windows partition for fun. Even if they are, they would already know the risk of doing so; if they did not, that would be a user failure in how they use their system. Even so, the file system should also be updating the modified time or the accessed time on the file. Wouldn't that give yet another option for a "safer" offline experience?
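
The timestamp idea might be sketched like this at the file level. `snapshot_mtimes` and `changed_since` are hypothetical helpers; a block-level cache sits below the file system, so mapping files to blocks is left out here.

```python
import os
import tempfile


def snapshot_mtimes(paths):
    """Record each file's modification time in nanoseconds."""
    return {p: os.stat(p).st_mtime_ns for p in paths}


def changed_since(snapshot):
    """Return paths whose modification time differs, or that no longer exist."""
    changed = []
    for path, old_mtime in snapshot.items():
        try:
            if os.stat(path).st_mtime_ns != old_mtime:
                changed.append(path)
        except FileNotFoundError:
            changed.append(path)
    return changed


workdir = tempfile.mkdtemp()
paths = []
for name in ("a.dat", "b.dat"):
    p = os.path.join(workdir, name)
    with open(p, "wb") as f:
        f.write(b"data")
    paths.append(p)

snapshot = snapshot_mtimes(paths)  # taken at shutdown

# Simulate an offline edit by moving b.dat's modification time forward.
st = os.stat(paths[1])
os.utime(paths[1], ns=(st.st_atime_ns, st.st_mtime_ns + 10_000_000_000))

changed = changed_since(snapshot)  # checked at next startup
```

This is cheaper than hashing, but it trusts the timestamps: a tool that writes to the partition without updating them would go unnoticed.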

I'm just asking myself :)
mabellon
Level 3
Posts: 10
Joined: Fri May 25, 2012 5:32 pm

Re: repopulate/proactive level 1 cache

Post by mabellon »

JimF wrote: I would like the option of "infinite latency", even if it is not repopulated after a reboot. My data is all distributed computing work that just overwrites old data with new data having the same set of file names, and I normally don't need to flush it at all.
You have a very interesting workload. If you have enough free RAM to keep everything cached, why not just fix your workload to run in-memory and save to disk when done? If you run out of RAM, the data is paged to disk. Put the page file on an SSD.

If you want "infinite latency", you don't really care about data integrity in the event of a crash. So why is your system constantly writing to disk anyway? I'm guessing you can't change whatever it is you are running?
Mradr
Level 7
Posts: 87
Joined: Sun Mar 25, 2012 1:36 pm

Re: repopulate/proactive level 1 cache

Post by Mradr »

mabellon wrote:
JimF wrote: I would like the option of "infinite latency", even if it is not repopulated after a reboot. My data is all distributed computing work that just overwrites old data with new data having the same set of file names, and I normally don't need to flush it at all.
You have a very interesting workload. If you have enough free RAM to keep everything cached, why not just fix your workload to run in-memory and save to disk when done? If you run out of RAM, the data is paged to disk. Put the page file on an SSD.

If you want "infinite latency", you don't really care about data integrity in the event of a crash. So why is your system constantly writing to disk anyway? I'm guessing you can't change whatever it is you are running?
You can't always control how an application works. Your point that "[y]ou have enough free ram to keep everything cached" would be good advice if he could edit how the program works, but most applications do not offer that. What he is trying to do with FC is exactly what you suggested: "just fix your workload to run in-memory and save to disk when done."

Not everything is about data integrity either, my friend ^^; sometimes getting the job done > data integrity, as the data might be replaceable (by simply rerunning the task). Then again, I would use a RAM disk instead, as you would get that "infinite latency" automagically without risking your OS data or other data that is also being written during that time.
JimF
Level 4
Posts: 36
Joined: Sun May 01, 2011 10:13 pm

Re: repopulate/proactive level 1 cache

Post by JimF »

mabellon wrote:
JimF wrote: I would like the option of "infinite latency", even if it is not repopulated after a reboot. My data is all distributed computing work that just overwrites old data with new data having the same set of file names, and I normally don't need to flush it at all.
You have a very interesting workload. If you have enough free RAM to keep everything cached, why not just fix your workload to run in-memory and save to disk when done? If you run out of RAM, the data is paged to disk. Put the page file on an SSD.

If you want "infinite latency", you don't really care about data integrity in the event of a crash. So why is your system constantly writing to disk anyway? I'm guessing you can't change whatever it is you are running?
As Mradr points out, I have no direct control over the workload. The World Community Grid science application is downloaded, along with the data, and run on my PC. Then the results are uploaded to a collection server. I don't do anything except install the BOINC control program that governs the uploads and downloads. But it does not follow that "infinite latency" implies that data integrity does not matter. In the event of a crash, I would lose several hours of work. That is why I have a backup power supply, as I noted elsewhere.

However, a Ramdisk might work better for me, with less risk to the other programs I run and the OS itself in case of a crash. At the moment, I am trialing Primo Ramdisk, and it is working quite well. While the BOINC control program itself is installed on my SSD, the data folder that contains the data and algorithms can be placed on the Ramdisk. That saves a large number of writes to my SSD, which can be several hundred GB per day for some of the projects. In effect, it gives me the long latency I need for retaining the data until it is ready for upload, and the data never needs to be copied to the SSD except when shutting down the PC. I think that will be the better solution for me.