Event 129 secnvme

zeroibis
Posts: 19
Joined: Thu Oct 11, 2018 11:13 am

Post by zeroibis »

This error is produced exclusively (in my testing) when writes from a remote system running an Acronis backup land on the cache (with write caching enabled).

Not sure if anyone else has run into this error before. Note that the error does not occur when performing a local Acronis backup.

In my case I am running two 970 EVOs using the Samsung driver and the latest firmware from Samsung Magician.

Just wanted to post this in case anyone else has run into it. Please note that the backup does complete successfully, so nothing is broken, but it is still strange that this error only crops up then.
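
For anyone wanting to check whether they are seeing the same event, here is a rough Python sketch that pulls recent Event 129 entries from the System log with the built-in wevtutil tool. The provider name "secnvme" is an assumption taken from the event source in the thread title, and the function names are made up for illustration.

```python
import subprocess
import sys


def build_query(provider="secnvme", event_id=129, count=20):
    """Build a wevtutil command line for the newest matching System events.

    The provider name is an assumption; adjust it to whatever source
    Event Viewer shows for the event on your machine.
    """
    xpath = f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # XPath filter on provider and event ID
        "/f:text",       # human-readable output
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # newest events first
    ]


def recent_events(**kwargs):
    """Run the query and return wevtutil's text output (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_query(**kwargs), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling recent_events() on the server right after a remote backup should show whether new entries were logged during that window.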

support
Posts: 2699
Joined: Sun Dec 21, 2008 2:42 am

Re: Event 129 secnvme

Post by support »

I'm not familiar with this event. It seems that it was generated by the Samsung driver. I found the following link for your reference.
https://blogs.technet.microsoft.com/kev ... as-issued/
Primo Ramdisk | PrimoCache
Romex Software Support

zeroibis
Posts: 19
Joined: Thu Oct 11, 2018 11:13 am

Re: Event 129 secnvme

Post by zeroibis »

Just wanted to update that the issue was not related to the power management settings. I already have the linked setting configured in my power plan. I have also now tried turning off the Windows write cache on the SSDs to see if that does anything, and I have reinstalled the Samsung driver.

zeroibis
Posts: 19
Joined: Thu Oct 11, 2018 11:13 am

Re: Event 129 secnvme

Post by zeroibis »

So today I saw the issue as it occurred in real time. Basically the SSD shows 100% activity but with 0 read/write activity and 0 access time.

I am going to try uninstalling the Samsung drivers to see if they are the cause of the issue.

zeroibis
Posts: 19
Joined: Thu Oct 11, 2018 11:13 am

Re: Event 129 secnvme

Post by zeroibis »

Wanted to update that last night was the first time I did not get this error during a backup process. The change I made was to tell Acronis not to validate the backup right after creation and instead do it a few hours later.

support
Posts: 2699
Joined: Sun Dec 21, 2008 2:42 am

Re: Event 129 secnvme

Post by support »

Many thanks for the information and updates!
I'm sorry that so far we have no idea about this problem, but we'll keep an eye on it. We'll post here when we have any updates.

zeroibis
Posts: 19
Joined: Thu Oct 11, 2018 11:13 am

Re: Event 129 secnvme

Post by zeroibis »

Wanted to update again that my last attempt did not correct the issue. It also managed to crash the entire system.

What is interesting is that these storage controller errors occur exclusively when the backup is done over Ethernet; when a local backup is performed using the same software there is no issue.

From this I conclude that the real source of the problem is likely the NIC, and that it then manifests itself in the NVMe controller. As an attempt to fix this I compared the latest NIC drivers on the manufacturer's website to those provided by ASRock and found that much more recent drivers are available. I have since switched to these new NIC drivers and restored my NIC configuration to see if this changes anything.

Will update with results.

zeroibis
Posts: 19
Joined: Thu Oct 11, 2018 11:13 am

Re: Event 129 secnvme

Post by zeroibis »

Wanted to update with some interesting findings. The issue still occurs.

It appears there is a time delay before the issue presents itself. The system needs at least 48 hours of uptime.

The issue is exclusive to write operations performed by Acronis backup on a remote system. It does not occur when a standard SMB transfer is initiated from the same remote system.

Once the issue presents itself it will continue to occur unless an SMB transfer from a remote system happens at the same time. When that happens the issue disappears, although I am not sure yet whether it occurs again afterwards.

The system can be on for 48 hours and have multiple SMB transfers during this time, and that will not affect whether the issue presents itself. The issue only clears up if there is an SMB transfer at the same time as the issue is occurring or about to occur.
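
The 48-hour observation above could be checked against the event log with something like the following Python sketch. It takes the last boot time and a list of event timestamps and reports the uptime at each event. The function name and the example values in the comment are hypothetical; only the 48-hour threshold comes from the observation above.

```python
from datetime import datetime, timedelta

# Threshold taken from the observation in this thread.
THRESHOLD = timedelta(hours=48)


def uptime_at_events(boot_time, event_times):
    """Return (event_time, uptime, past_threshold) for each event after boot.

    Events that predate boot_time belong to an earlier session and are skipped.
    """
    rows = []
    for t in sorted(event_times):
        if t < boot_time:
            continue
        uptime = t - boot_time
        rows.append((t, uptime, uptime >= THRESHOLD))
    return rows


# Hypothetical usage with made-up timestamps:
# boot = datetime(2018, 11, 1, 8, 0)
# for t, up, late in uptime_at_events(boot, [datetime(2018, 11, 3, 9, 30)]):
#     print(t, up, "past 48h" if late else "before 48h")
```

If every logged Event 129 falls in the "past 48h" group across several boots, that would support the uptime theory; a single early event would be enough to rule it out.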

Possible solutions:
Configure the remote backup to first occur locally and then move the backup files after.
Configure the server to reboot every 48 hours.
Configure the remote system to initiate an arbitrary SMB transfer of some files at the same time the backup is scheduled to occur.
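
The third option could be sketched roughly like this in Python: the remote system writes a throwaway file to the server share while the backup runs, then deletes it. The function name, file prefix and sizes are placeholders, and whether a transfer this small is enough to clear the issue is untested.

```python
import os
import tempfile


def write_and_remove(dest_dir, size_mb=64, chunk_mb=4):
    """Write size_mb of data into dest_dir, force it to disk, then delete it.

    dest_dir would be the SMB share on the server (a UNC path on Windows).
    Returns the number of bytes written.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # one random chunk, reused
    remaining = size_mb * 1024 * 1024
    written = 0
    fd, tmp_path = tempfile.mkstemp(
        prefix="smb_nudge_", suffix=".bin", dir=os.fspath(dest_dir)
    )
    try:
        with os.fdopen(fd, "wb") as f:
            while remaining > 0:
                part = chunk[:remaining]  # last part may be shorter
                f.write(part)
                written += len(part)
                remaining -= len(part)
            f.flush()
            os.fsync(f.fileno())  # make sure the write actually goes out
    finally:
        os.remove(tmp_path)  # always clean up the throwaway file
    return written


# Hypothetical usage from the remote machine (share path is a placeholder):
# write_and_remove(r"\\server\backups", size_mb=256)
```

This would need to be scheduled on the remote system to start around the same time as the backup job, for example from the same task scheduler entry.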

Will also test some other methods of clearing the error once it is occurring, such as whether running a benchmark on the server that hits the write cache fixes it.

support
Posts: 2699
Joined: Sun Dec 21, 2008 2:42 am

Re: Event 129 secnvme

Post by support »

I found this link for your reference.
https://blogs.msdn.microsoft.com/ntdebu ... 29-errors/

Will the issue happen without PrimoCache or without Defer-Write enabled?
Is it possible for you to try another brand of SSDs? I'm wondering if the issue is hardware-related.

zeroibis
Posts: 19
Joined: Thu Oct 11, 2018 11:13 am

Re: Event 129 secnvme

Post by zeroibis »

Thanks for the link, very interesting to see some of the comments there relating the issue to various seemingly unrelated software issues such as NIC drivers and VM issues.

In my case the issue occurs exclusively when Acronis backup is running on a remote computer and that remote computer stores the backup file on my server, which is running PrimoCache.

The issue does not occur when my server uses Acronis backup on itself to back up to a drive that uses the same cache.

What is even stranger is that I do not get the issue every time, only sometimes, and only if at least 48 hours of uptime have elapsed. I will look into whether there are timeout values I can modify for the SSD to see if that changes anything.
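
One timeout that can at least be inspected is the standard Windows disk class timeout, the TimeOutValue DWORD (in seconds) under HKLM\SYSTEM\CurrentControlSet\Services\Disk. Whether the Samsung secnvme driver honors this value is an assumption that has not been verified here, so the Python sketch below only reads it and changes nothing. The function name is made up for illustration.

```python
import sys

# Standard location of the Windows disk class timeout (value in seconds).
DISK_KEY = r"SYSTEM\CurrentControlSet\Services\Disk"
VALUE_NAME = "TimeOutValue"


def read_disk_timeout():
    """Return the configured disk timeout in seconds, or None if unavailable.

    None means either the value is not set (Windows then uses its default)
    or the script is not running on Windows.
    """
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only module, imported lazily

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, DISK_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return int(value)
    except FileNotFoundError:
        return None
```

Comparing the returned value with the gap between the stall and the logged Event 129 might show whether this timeout is the one that is firing.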

I should also mention that if the Windows driver is used instead of the Samsung one, it additionally generates a stornvme error 11.

Also, as a reminder, the error is generated for a pair of 970s, one on PCIe 3.0 x4 and the other on PCIe 2.0 x4.
