Risk of Data Loss with Defer-Write on SSD cache?

HolyFire
Level 1
Posts: 2
Joined: Sat Feb 15, 2020 11:51 pm

Risk of Data Loss with Defer-Write on SSD cache?

Post by HolyFire »

Hi.

The documentation describes defer-write as a feature to accelerate writes, but warns that there is a risk of data loss if there is an outage before writing to the main disk is complete.

This is perfectly clear in the case of RAM cache. But if the cache is an SSD, it's not so clear - SSD is nonvolatile, so even if there is an outage before the data is written to the main HDD, the data should be safe as long as it has finished being written to the SSD cache.

I expected the documentation to specify this - that the L2 SSD cache is exempt from the data loss risk - but could not find such a provision.

So my primary question is - if defer-write is used, and there is an outage after the data has finished being written to the SSD but before it has been written to the HDD, can PrimoCache seamlessly recover the data (and finish writing it to the HDD when the system is back up)?

More concretely - assume that I have a main HDD, an SSD I'm using as L2 cache, and optionally also an L1 RAM cache. I want the data to be safe once written to the SSD, and I don't want to wait for HDD speeds to complete the write.

Ideally, what I'd like is the following:

1. When there is a write task, writing to the SSD begins immediately.
2. The write will not be considered complete by the OS until the data is written to SSD.
3. Writing to HDD will be done at leisure.
4. If there is an outage before writing to HDD, the data will be safe and be recovered seamlessly, as it has already been written to SSD.

Is the above arrangement possible?
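
To make the four steps above concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour being asked for: the class and method names are hypothetical, nothing here is a PrimoCache API, and it assumes the SSD (data plus cache index) is always left in a consistent state by an outage.

```python
class ToyWriteBackCache:
    """Toy model of the desired behaviour; not how PrimoCache works."""

    def __init__(self):
        self.ssd = {}  # block -> data; stands in for the non-volatile L2 cache
        self.hdd = {}  # block -> data; stands in for the slow main disk

    def write(self, block, data):
        # Steps 1-2: the write is acknowledged only once it is on the SSD.
        self.ssd[block] = data
        return True

    def flush_one(self):
        # Step 3: move one pending block to the HDD "at leisure".
        if self.ssd:
            block, data = self.ssd.popitem()
            self.hdd[block] = data

    def recover(self):
        # Step 4: after an outage, replay whatever is still pending on the SSD.
        while self.ssd:
            self.flush_one()


cache = ToyWriteBackCache()
cache.write(1, b"a")
cache.write(2, b"b")
cache.flush_one()   # one block reaches the HDD, then the outage happens
cache.recover()     # on the next boot, the remaining block is replayed
```

In this model nothing acknowledged is ever lost, because the SSD dictionary is assumed to survive the outage intact.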

Slightly less ideal but also satisfactory is the following:

1. When there is a write, it is first written to L1 RAM cache.
2. The write is considered complete once written to RAM.
3. Quickly afterwards (~1 second), the data starts being written to SSD. (Writing to HDD will either be concurrent, or deferred to a future leisure time).
4. If there is an outage, the data will be lost if it was not yet written to SSD, even though the OS considered the write complete. However, if the write to SSD happened to be finished already, the data will be safe, regardless of whether writing to the HDD is complete.

Is this possible?

In both cases, how should I configure PrimoCache to achieve the desired effect?

Note - I have not yet installed PrimoCache. I'm considering using it as part of the setup for my next system. I'm trying to figure out whether it's suitable for my needs.

Thanks.
Support
Support Team
Posts: 2802
Joined: Sun Dec 21, 2008 2:42 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by Support »

So far, the risk of data loss exists with either RAM or SSD cache when defer-write is enabled. The cache index database might not be correctly updated during an ungraceful shutdown, so even with an SSD cache PrimoCache cannot yet recover the data. We are working on this feature, and we hope to find a solution. Thanks.
Primo Ramdisk | PrimoCache
Romex Software Support
HolyFire
Level 1
Posts: 2
Joined: Sat Feb 15, 2020 11:51 pm

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by HolyFire »

Ok, thanks for the answer. I hope you work this out soon.
Jaga
Level SS
Posts: 546
Joined: Sat Jan 25, 2014 1:11 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by Jaga »

The easy way to avoid this currently is to have a UPS unit on the machine using deferred writes. There are still cases of ungraceful shutdown (though rare), but for power loss situations, you can avoid them with the additional hardware.
phat
Level 1
Posts: 4
Joined: Tue Jul 14, 2020 12:24 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by phat »

I love the performance boost from deferred writes, but am also worried about the risk of data loss. A UPS can protect against power outages, but nothing can protect against the occasional crash (and crashes are more frequent than power outages in my area, anyway).

So, is it the case that if PrimoCache isn't shut down gracefully, then on next startup it always assumes that the SSD cache is corrupt and discards it, along with any deferred writes still in it?

Or is the vulnerability finer-grained than that? For example, perhaps the risk of data loss occurs only if there's a crash in the middle of an update to the cache index, and PrimoCache only discards data if the index is left in an inconsistent state? If so, how long are the critical sections of such operations?

Any information that would help me quantify the risk would be appreciated. Thanks.
Support
Support Team
Posts: 2802
Joined: Sun Dec 21, 2008 2:42 am

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by Support »

@phat, I'm sorry for the late reply. So far, PrimoCache discards any deferred writes when booting after an ungraceful shutdown.
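
As a rough illustration of that rule only, the behaviour can be modelled with a single "clean shutdown" marker. This is a hypothetical sketch: the names are invented, and it says nothing about how PrimoCache actually stores its index or detects an ungraceful shutdown.

```python
class ToyDeferredCache:
    """Toy model: pending writes are discarded after an unclean shutdown."""

    def __init__(self):
        self.pending = {}           # deferred writes not yet on the target disk
        self.disk = {}              # the target disk
        self.clean_shutdown = True  # hypothetical marker set on graceful shutdown

    def write(self, block, data):
        # A deferred write is acknowledged before it reaches the disk.
        self.clean_shutdown = False
        self.pending[block] = data

    def shutdown(self):
        # Graceful shutdown: flush everything, then set the marker.
        self.disk.update(self.pending)
        self.pending.clear()
        self.clean_shutdown = True

    def boot(self):
        # If the marker is missing, drop all pending writes.
        discarded = 0
        if not self.clean_shutdown:
            discarded = len(self.pending)
            self.pending.clear()
        self.clean_shutdown = False
        return discarded
```

Calling `boot()` right after `shutdown()` discards nothing, while calling `boot()` after a `write()` with no `shutdown()` in between (a simulated crash) drops that write, whichever medium held it.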
InquiringMind
Level S
Posts: 388
Joined: Wed Oct 06, 2010 11:10 pm

Re: Risk of Data Loss with Defer-Write on SSD cache?

Post by InquiringMind »

HolyFire wrote:
Sun Feb 16, 2020 12:11 am
Slightly less ideal but also satisfactory is the following:

1. When there is a write, it is first written to L1 RAM cache.
2. The write is considered complete once written to RAM.
3. Quickly afterwards (~1 second), the data starts being written to SSD. (Writing to HDD will either be concurrent, or deferred to a future leisure time).
4. If there is an outage, the data will be lost if it was not yet written to SSD, even though the OS considered the write complete. However, if the write to SSD happened to be finished already, the data will be safe, regardless of whether writing to the HDD is complete.
You might want to consider Primo Ramdisk for this situation, since it provides a timed image-file save option (where data in the RAMdisk is written out to SSD or HDD at intervals you specify). You will need to spend more time setting things up (specifically creating NTFS junctions so that folders on HDD/SSD are instead linked to a copy on the RAMdisk - check out Link Shell Extension to simplify this process), but it also means you can be specific about what data gets stored on the RAMdisk and is therefore subject to increased risk/speed.
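
If you would rather script the junctions than use a GUI tool, a minimal Python sketch could look like the following. It relies on the Windows built-in `mklink /J`; the helper names and the example paths (`D:\Work`, `R:\Work`) are hypothetical, so substitute your own folder and RAMdisk drive letter.

```python
import subprocess


def junction_command(link, target):
    # mklink is a cmd.exe built-in, so it has to be run through "cmd /c".
    # Argument order is: mklink /J <link> <target>
    return ["cmd", "/c", "mklink", "/J", link, target]


def create_junction(link, target):
    # Windows only. The link path must not already exist, and the target
    # folder (on the RAMdisk) must already be there.
    subprocess.run(junction_command(link, target), check=True)


# Example (hypothetical paths): make D:\Work point at a folder on RAMdisk R:
# create_junction(r"D:\Work", r"R:\Work")
```

Move or copy the original folder's contents to the RAMdisk first, since `mklink` will not create the junction over an existing folder.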

If you've not already done so, also take a look at your backup strategy, which should ideally include a combination of (manual) full image backups and automatic file versioning.