Event 129 secnvme

FAQ, getting help, user experience about PrimoCache
neatchee
Level 5
Posts: 49
Joined: Tue Feb 12, 2019 8:38 pm

Re: Event 129 secnvme

Post by neatchee »

Still waiting on the replacement drive, but I've been doing some additional testing and monitoring and I am leaning towards the source of the problem being HEAT :o

The 970 EVO under full, sustained load is hitting a peak temperature of a whopping 95°!!
I can reduce that drastically (to a peak of 65°) by cranking all my fans to max, but that's not a practical solution: I already have far more airflow than a typical user, and my fan control can only key off CPU and PCH temps, not the drive temp.

With PrimoCache pushing the drive pretty hard during large reads (e.g. level loads in games) I could see sustained temps that high causing a device failure.

I'm going to keep running load tests with my fans at max over the next few days, until the replacement drive comes. It's possible there's just a heat dissipation problem on the unit I have, but it's also looking more likely that the 970 EVO just has severe heat issues :(
Jaga
Contributor
Posts: 692
Joined: Sat Jan 25, 2014 1:11 am

Re: Event 129 secnvme

Post by Jaga »

If that's 95 F it's no big deal, and not the source of the problems. Mine right now is idling at 112F / 45C. They are rated between 0-70C (32-158F).

https://www.samsung.com/semiconductor/m ... er/970evo/

So no, the 970 Evo (yours included) doesn't have heat problems.
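
Since the unit changes the answer, here is a minimal sketch of the conversion and the range check. The function names are made up for illustration, and the 0-70C limits are simply the rated range quoted above.

```python
# Rated operating range quoted in the post above (0-70C / 32-158F).
RATED_MIN_C = 0.0
RATED_MAX_C = 70.0


def f_to_c(deg_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def c_to_f(deg_c: float) -> float:
    """Convert a Celsius reading to Fahrenheit."""
    return deg_c * 9.0 / 5.0 + 32.0


def within_rating(deg_c: float) -> bool:
    """True if a Celsius reading falls inside the rated operating range."""
    return RATED_MIN_C <= deg_c <= RATED_MAX_C


if __name__ == "__main__":
    # The same "95" reading, interpreted both ways.
    for label, temp_c in [("95 read as F", f_to_c(95.0)), ("95 read as C", 95.0)]:
        print(f"{label}: {temp_c:.1f} C, within rating: {within_rating(temp_c)}")
```

A reading of 95 in Fahrenheit sits comfortably inside the range, while the same number in Celsius is well outside it.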
neatchee
Level 5
Posts: 49
Joined: Tue Feb 12, 2019 8:38 pm

Re: Event 129 secnvme

Post by neatchee »

Jaga wrote: If that's 95 F it's no big deal, and not the source of the problems. Mine right now is idling at 112F / 45C. They are rated between 0-70C (32-158F).

95°C :(

Note that this is the throttling sensor. Sensor 1 = Health, Sensor 2 = Throttling
With no active airflow, Sensor 2 idles at 60°C and hits 95°C under full load.

Sensor 1 idles like yours, around 45°C
However, its peak temp under load is a bit over 70°C.

Throttling, or even just the heat itself, seems like a likely source of my issues.

I've been continuing to run with active airflow on the device while I wait for the RMA, and haven't had any failures :\
Jaga
Contributor
Posts: 692
Joined: Sat Jan 25, 2014 1:11 am

Re: Event 129 secnvme

Post by Jaga »

Yikes - 95C might be a death sentence for that kind of hardware. That's practically boiling water in the air around it. Glad to know you have a replacement on the way.

And yeah, that's probably why you're seeing the issue crop up - the drive is throttling so as not to melt.
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Event 129 secnvme

Post by Support »

Great find! The high temperature is very likely the cause of the problem.
neatchee
Level 5
Posts: 49
Joined: Tue Feb 12, 2019 8:38 pm

Re: Event 129 secnvme

Post by neatchee »

I'm back to suspecting a problem with PrimoCache :\

I received my RMA and had the same issue.
I purchased a Samsung 970 PRO (not EVO) and am ALSO experiencing the same problem when using PrimoCache.
The 970 PRO does not get nearly as hot; temperature on all sensors stays below 70°C.

Let me know if there's more information I can gather!
neatchee
Level 5
Posts: 49
Joined: Tue Feb 12, 2019 8:38 pm

Re: Event 129 secnvme

Post by neatchee »

My latest investigation leads me to believe that the issue I am seeing is specific to the PrimoCache shim/driver compatibility with Samsung NVMe SSDs.

- I can reproduce the issue with both the Samsung 970 EVO and Samsung 970 PRO
- I can reproduce the issue with either M.2 slot on my motherboard
- I can reproduce the issue when temperatures are steady below 75°C
- I can reproduce the issue when there is no cache task, but PrimoCache is installed and running, by benchmarking the drive
- I can NOT reproduce the issue when PrimoCache is completely uninstalled
- I can NOT reproduce the issue with other non-Samsung devices
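
The checklist above can be written down as a small table of trials to see which conditions every failing run has in common. This is only a sketch: the field names and slot numbers are invented for illustration, and each row fills in details the list does not spell out (for example, that the uninstall test used a Samsung drive and that the non-Samsung test had PrimoCache installed).

```python
# One row per test from the list above; "fails" means Event 129 was reproduced.
# Slot numbers and the unlisted details in each row are assumptions.
trials = [
    {"model": "970 EVO", "samsung_drive": True, "primocache_installed": True,
     "m2_slot": 1, "cache_task_active": True, "fails": True},
    {"model": "970 PRO", "samsung_drive": True, "primocache_installed": True,
     "m2_slot": 1, "cache_task_active": True, "fails": True},
    {"model": "970 PRO", "samsung_drive": True, "primocache_installed": True,
     "m2_slot": 2, "cache_task_active": True, "fails": True},
    {"model": "970 PRO", "samsung_drive": True, "primocache_installed": True,
     "m2_slot": 1, "cache_task_active": False, "fails": True},
    {"model": "970 PRO", "samsung_drive": True, "primocache_installed": False,
     "m2_slot": 1, "cache_task_active": False, "fails": False},
    {"model": "other", "samsung_drive": False, "primocache_installed": True,
     "m2_slot": 1, "cache_task_active": True, "fails": False},
]


def common_failure_factors(trials, factors):
    """Return factors that have one value in every failing trial
    and a different value in at least one passing trial."""
    failing = [t for t in trials if t["fails"]]
    passing = [t for t in trials if not t["fails"]]
    result = {}
    for factor in factors:
        values = {t[factor] for t in failing}
        if len(values) != 1:
            continue  # the failure shows up regardless of this factor
        (value,) = values
        if any(t[factor] != value for t in passing):
            result[factor] = value
    return result


factors = ["samsung_drive", "primocache_installed", "m2_slot", "cache_task_active"]
suspects = common_failure_factors(trials, factors)
print(suspects)
```

With these rows, only the Samsung drive and the installed PrimoCache driver survive as common factors; the M.2 slot and the cache task drop out because the failure occurs either way.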
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Event 129 secnvme

Post by Support »

neatchee, thank you very much for all the testing! I will post here when we have updates.
neatchee
Level 5
Posts: 49
Joined: Tue Feb 12, 2019 8:38 pm

Re: Event 129 secnvme

Post by neatchee »

More data:
It's not just PrimoCache :(

Tried VeloSSD to see if it had the same problem, and it does.
Maybe it's a Samsung driver/hardware issue?

I've got a dedicated PCIe NVMe card on its way from Amazon, in case the issue is with the M.2 slot on my motherboard.

This is really frustrating :\
Jaga
Contributor
Posts: 692
Joined: Sat Jan 25, 2014 1:11 am

Re: Event 129 secnvme

Post by Jaga »

It's quite possible it's the NVMe driver - you specifically have to replace the Windows one with Samsung's if you want full compatibility. Go here, click on Download Files, then find the NVMe driver if you haven't already done that. I'd recommend stopping the L2 cache before installing the driver as well, if you're still using the standard driver.

It could be something else in Windows too. I don't use my Samsung NVMe as an L2, but I've never seen the reported error, and use mine as a cached boot drive (a 30GB L1 caches it). Since it happens with VeloSSD, chances are it's not a PrimoCache-specific problem, but more related to underlying software (drivers and/or Windows).

Frustrating for sure. Had you considered the possibility of a complete system rebuild (Windows and PrimoCache only, then test)?
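
For the "then test" part, one way to check whether the error has come back is to query the System event log with Windows' built-in `wevtutil` tool. The sketch below only builds and runs that query; it assumes the events are logged under a provider named `secnvme` with ID 129 (taken from the thread title), and the function names are made up for illustration.

```python
import subprocess


def event_query_command(provider="secnvme", event_id=129, max_events=20):
    """Build a wevtutil command listing the newest matching System events.

    The provider name and event ID defaults are assumptions taken from
    the thread title, not verified against a real log.
    """
    xpath = f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on provider and event ID
        f"/c:{max_events}",   # cap the number of events returned
        "/rd:true",           # newest events first
        "/f:text",            # plain-text output instead of XML
    ]


def recent_events(**kwargs):
    """Run the query (Windows only) and return wevtutil's text output."""
    result = subprocess.run(
        event_query_command(**kwargs), capture_output=True, text=True
    )
    return result.stdout
```

Calling `recent_events()` after a benchmark run on a clean install, and again after each component is added back, would show at which step the events start appearing.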