Risk of Data Loss with Defer-Write on SSD cache?

HolyFire
Level 1
Posts: 2
Joined: Sat Feb 15, 2020 11:51 pm

Risk of Data Loss with Defer-Write on SSD cache?

Post by HolyFire »

Hi.

The documentation describes defer-write as a feature to accelerate writes, but warns that there is a risk of data loss if there is an outage before writing to the main disk is complete.

This is perfectly clear in the case of RAM cache. But if the cache is an SSD, it's not so clear - SSD is nonvolatile, so even if there is an outage before the data is written to the main HDD, the data should be safe as long as it has finished being written to the SSD cache.

I expected the documentation to specify this - that the L2 SSD cache is exempt from the data loss risk - but could not find such a provision.

So my primary question is: if defer-write is used and there is an outage after the data has finished being written to the SSD but before it has been written to the HDD, can PrimoCache seamlessly recover the data (and finish writing it to the HDD when the system is back up)?

More concretely, assume I have a main HDD, an SSD I'm using as L2 cache, and optionally also an L1 RAM cache. I want the data to be safe once written to the SSD, and I don't want to wait for HDD speeds to complete the write.

Ideally, what I'd like is the following:

1. When there is a write task, writing to the SSD begins immediately.
2. The write will not be considered complete by the OS until the data is written to SSD.
3. Writing to HDD will be done at leisure.
4. If there is an outage before writing to HDD, the data will be safe and be recovered seamlessly, as it has already been written to SSD.

Is the above arrangement possible?
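
To make the four steps above concrete, here is a minimal Python sketch of the requested semantics. It is purely illustrative and says nothing about how PrimoCache works internally: the class `SsdWriteBackLog`, the file paths and the JSON record format are all hypothetical, with one file standing in for the SSD cache and another for the HDD.

```python
import json
import os


class SsdWriteBackLog:
    """Toy model (hypothetical): a write is acknowledged only once it is durable on the 'SSD' log."""

    def __init__(self, ssd_log_path, hdd_path):
        self.ssd_log_path = ssd_log_path  # stands in for the L2 SSD cache
        self.hdd_path = hdd_path          # stands in for the slow main HDD

    def write(self, key, value):
        # Steps 1-2: append the record and force it to stable storage before returning.
        with open(self.ssd_log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def pending(self):
        # Records that reached the SSD log but have not yet been applied to the HDD.
        if not os.path.exists(self.ssd_log_path):
            return []
        records = []
        with open(self.ssd_log_path, encoding="utf-8") as log:
            for line in log:
                try:
                    records.append(json.loads(line))
                except json.JSONDecodeError:
                    break  # torn final record from a crash mid-append: ignore it
        return records

    def flush_to_hdd(self):
        # Steps 3-4: apply pending records to the HDD copy, then empty the log.
        # The same routine doubles as crash recovery, since replaying a record twice is harmless.
        data = {}
        if os.path.exists(self.hdd_path):
            with open(self.hdd_path, encoding="utf-8") as f:
                data = json.load(f)
        for record in self.pending():
            data[record["key"]] = record["value"]
        tmp_path = self.hdd_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.hdd_path)  # atomic swap of the HDD copy
        open(self.ssd_log_path, "w").close()  # truncate the log only after the HDD copy is safe
        return data
```

The point of the sketch is that recovery only works if the record of what is pending (here, the log itself) is written durably together with the data. If that bookkeeping can be left inconsistent by a crash, the data on the SSD cannot be trusted afterwards.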

Slightly less ideal but also satisfactory is the following:

1. When there is a write, it is first written to L1 RAM cache.
2. The write is considered complete once written to RAM.
3. Quickly afterwards (~1 second), the data starts being written to SSD. (Writing to HDD will either be concurrent, or deferred to a future leisure time).
4. If there is an outage, the data will be lost if it was not yet written to SSD, even though the OS considered the write complete. However, if the write to SSD happened to be finished already, the data will be safe, regardless of whether writing to the HDD is complete.

Is this possible?
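
The second arrangement can be modelled in a few lines of Python as well. Again, this is only a sketch of the behaviour being asked about, under the assumption that whatever has reached the SSD survives an outage; `TwoTierCacheModel` and its method names are invented for illustration.

```python
class TwoTierCacheModel:
    """Toy model (hypothetical): writes are acknowledged from RAM and copied to SSD shortly after."""

    def __init__(self):
        self.ram = {}  # L1: volatile, lost on an outage
        self.ssd = {}  # L2: assumed to survive an outage in this model

    def write(self, key, value):
        # Steps 1-2: the write counts as complete as soon as it is in RAM.
        self.ram[key] = value

    def flush_ram_to_ssd(self):
        # Step 3: called shortly after a write (about 1 second in the description above).
        self.ssd.update(self.ram)
        self.ram.clear()

    def crash(self):
        # Step 4: anything still only in RAM is lost; return the affected keys.
        lost = sorted(self.ram)
        self.ram.clear()
        return lost


model = TwoTierCacheModel()
model.write("report.doc", "v1")
model.flush_ram_to_ssd()          # "report.doc" is now on the SSD
model.write("notes.txt", "v1")    # acknowledged, but still only in RAM
lost_keys = model.crash()         # only "notes.txt" is lost
```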

In both cases, how should I configure PrimoCache to achieve the desired effect?

Note - I have not yet installed PrimoCache. I'm considering using it as part of the setup for my next system. I'm trying to figure out whether it's suitable for my needs.

Thanks.
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by Support »

So far the risk of data loss exists whether using RAM or SSD cache when defer-write is enabled. The cache index database might not be correctly updated during an ungraceful shutdown, so even using SSD cache PrimoCache still cannot recover data. But we are working on this feature and we do hope we can find a solution for this. Thanks.
HolyFire
Level 1
Posts: 2
Joined: Sat Feb 15, 2020 11:51 pm

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by HolyFire »

Ok, thanks for the answer. I hope you work this out soon.
Jaga
Contributor
Posts: 692
Joined: Sat Jan 25, 2014 1:11 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by Jaga »

The easy way to avoid this currently is to have a UPS unit on the machine using deferred writes. There are still cases of ungraceful shutdown (though rare), but for power loss situations, you can avoid them with the additional hardware.
phat
Level 2
Posts: 5
Joined: Tue Jul 14, 2020 12:24 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by phat »

I love the performance boost from deferred writes, but am also worried about the risk of data loss. A UPS can protect from power outages, but nothing can protect from the occasional crash (and crashes are more frequent than power outages in my area, anyway).

So, is it the case that if PrimoCache isn't shut down gracefully, then on the next startup it always assumes that the SSD cache is corrupt and discards it, along with any deferred writes still in it?

Or is the vulnerability finer-grained than that? For example, perhaps the risk of data loss occurs only if there's a crash in the middle of an update to the cache index, and PrimoCache only discards data if the index is left in an inconsistent state? If so, how long are the critical sections of such operations?

Any information that would help me quantify the risk would be appreciated. Thanks.
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by Support »

@phat, I'm sorry for the late reply. So far PrimoCache will discard any deferred writes when booting after an ungraceful shutdown.
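
The all-or-nothing behaviour described here can be pictured with a small Python model. The thread only states the observable result (pending deferred writes are discarded after an ungraceful shutdown); the `session_open` marker and the class `DeferredWriteModel` below are assumptions made for illustration, not a description of PrimoCache's actual mechanism.

```python
class DeferredWriteModel:
    """Toy model (hypothetical): deferred writes are dropped if the last session did not end cleanly."""

    def __init__(self):
        self.disk = {}             # data that has reached the target disk
        self.deferred = {}         # writes accepted by the cache but not yet flushed
        self.session_open = False  # assumed marker: still True at boot means an unclean shutdown

    def boot(self):
        discarded = 0
        if self.session_open:
            # The previous session never shut down cleanly: trust nothing that was pending.
            discarded = len(self.deferred)
            self.deferred.clear()
        self.session_open = True
        return discarded

    def write(self, key, value):
        self.deferred[key] = value

    def shutdown(self):
        # Graceful path: flush everything, then clear the marker.
        self.disk.update(self.deferred)
        self.deferred.clear()
        self.session_open = False


model = DeferredWriteModel()
model.boot()
model.write("a", 1)
model.shutdown()            # graceful: "a" reaches the disk
model.boot()
model.write("b", 2)
# crash here: shutdown() is never called
discarded = model.boot()    # the pending write of "b" is dropped
```

Under this model the amount of data at risk at any moment is simply whatever is sitting in `deferred`, which is why shortening the defer-write latency reduces exposure but does not remove it.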
InquiringMind
Level SS
Posts: 477
Joined: Wed Oct 06, 2010 11:10 pm

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by InquiringMind »

HolyFire wrote: Sun Feb 16, 2020 12:11 am
Slightly less ideal but also satisfactory is the following:

1. When there is a write, it is first written to L1 RAM cache.
2. The write is considered complete once written to RAM.
3. Quickly afterwards (~1 second), the data starts being written to SSD. (Writing to HDD will either be concurrent, or deferred to a future leisure time).
4. If there is an outage, the data will be lost if it was not yet written to SSD, even though the OS considered the write complete. However, if the write to SSD happened to be finished already, the data will be safe, regardless of whether writing to the HDD is complete.

You might want to consider Primo Ramdisk for this situation, since it provides a timed image-file save option (where data in the RAMdisk is written out to SSD or HDD at intervals you specify). You will need to spend more time setting things up (specifically, creating NTFS junctions so that folders on the HDD/SSD are instead linked to a copy on the RAMdisk; check out Link Shell Extension to simplify this process), but it also means you can be specific about what data gets stored on the RAMdisk and is therefore subject to the increased risk and speed.

If you've not already done so, also take a look at your backup strategy, which should ideally combine (manual) full image backups with automatic file versioning.
klepp0906
Level 4
Posts: 25
Joined: Wed Feb 23, 2022 10:30 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by klepp0906 »

Support wrote: Mon Feb 17, 2020 2:48 am
So far the risk of data loss exists whether using RAM or SSD cache when defer-write is enabled. The cache index database might not be correctly updated during an ungraceful shutdown, so even using SSD cache PrimoCache still cannot recover data. But we are working on this feature and we do hope we can find a solution for this. Thanks.

Any updates on this front? With some 5900 rpm SMR spinners I really need the performance this offers, but not at the expense of data loss or corruption.

I know the DrivePool SSD Optimizer plugin will do this by writing files to the SSD and then moving them to disk at a later point, but I got PrimoCache specifically because I didn't want to use the plugin. It's limited in basically every other way. I don't think stacking them is beneficial in any way either.
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by Support »

I'm sorry, but this is not easy with the current program architecture. PrimoCache was first designed about 10 years ago. At that time the L2 (SSD) cache had not even been introduced, so the original architecture did not cover the need to recover the cache from L2. We would have to change and test a lot, and I'm afraid it may not come out in the near future. Sorry.