PrimoCache and File Based Synchronization

Posted: Tue Jan 23, 2018 10:55 pm
by jholland
Hi,

I currently have several machines set up to use PrimoCache that use a touch-based system to synchronize some tasks. These tasks can take anywhere from a split second to ten minutes. They are not thread safe, so we create a file, and if that file is already present, we sit and wait for it to disappear. For a while this worked OK, but we've started seeing this file creation hang. We thought it might be Cygwin's touch, but we have implemented a separate system that uses Perl to create the files, and we are still running into this issue.

Could this potentially be coming from PrimoCache? We have it set up with a deferred write of 10 seconds. We are also running 2.7.0. In order to do some debugging, I have done the following to see if some part of our configuration is at fault:
  • Disabled PrimoCache entirely on two machines
  • Disabled the deferred write on two of them
  • Upgraded PrimoCache to the latest version (3.0.2) on two others
but I was also hoping someone with a bit more knowledge of PrimoCache might be able to chime in.

Thanks!

Re: PrimoCache and File Based Synchronization

Posted: Wed Jan 24, 2018 2:58 am
by Jaga
Are you write-caching on the volume(s) using a strategy other than Native? Native will completely flush the write cache when the timer expires, if it can. Intelligent is the most graceful flush algorithm I've found, but pure Native will be the most regular from a time perspective.

What did you find in your debugging with PrimoCache disabled? Windows can also be set to defer flushing the write buffer, which you don't want when Primo is running.

Any utility not running at the Windows kernel level should see any file PrimoCache has in its cache, even if it hasn't been flushed to disk yet. Therefore the file should always exist from the instant it is created, regardless of whether Primo has written it to physical disk or not. I'm not 100% sure how files that are created and then destroyed while still in the cache are handled from a physical perspective; Support might have to answer that.

The best approach to testing I've found is to change only one variable at a time, starting from as close to a "base" config as you can. Take PrimoCache out of the loop, disable anti-virus, and watch the file creation/destruction in real time. Then add Primo back for read caching only and re-test. Then enable read/write caching with a short delay and re-test. Etc.
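To make the "watch it in real time" step repeatable between configurations, something like the following could be run on the affected volume after each change. It is only a rough sketch: the function names, the file name `sync_test.lock`, the cycle count and the 0.5 second "slow" threshold are all made up for illustration and are not taken from jholland's setup.

```python
import os
import tempfile
import time


def time_create_delete(directory, cycles=100, name="sync_test.lock"):
    """Create and delete a marker file repeatedly; return seconds per cycle."""
    path = os.path.join(directory, name)
    timings = []
    for _ in range(cycles):
        start = time.perf_counter()
        # O_CREAT | O_EXCL fails if the file already exists, so a file that
        # was not fully removed in the previous cycle raises an error here
        # instead of being silently reused.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        os.remove(path)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings, slow_threshold=0.5):
    """Return (worst, mean, count of cycles slower than slow_threshold)."""
    slow = sum(1 for t in timings if t > slow_threshold)
    return max(timings), sum(timings) / len(timings), slow


if __name__ == "__main__":
    # A temporary directory is used here only as a placeholder; point this
    # at a directory on the cached volume to test the real configuration.
    with tempfile.TemporaryDirectory() as d:
        worst, mean, slow = summarize(time_create_delete(d))
        print(f"worst={worst:.4f}s mean={mean:.4f}s slow_cycles={slow}")
```

Comparing the worst-case time and the number of slow cycles across the "no PrimoCache", "read cache only" and "read/write cache" runs should show whether the hang follows a particular setting.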

Re: PrimoCache and File Based Synchronization

Posted: Wed Jan 24, 2018 3:57 am
by Support
I suggest disabling or uninstalling PrimoCache first to see whether the problem still occurs on your systems.

Re: PrimoCache and File Based Synchronization

Posted: Fri Mar 09, 2018 6:46 am
by jussssx1
What type of machines do you have? Are they multi-socket machines? We experienced some hiccups with multi-socket systems because rapid file I/O sometimes failed (file-based OLTP apps). We did not experience the issue without PrimoCache, but it may be related to the Windows kernel bug that Bruce Dawson recently found.

Re: PrimoCache and File Based Synchronization

Posted: Fri Mar 16, 2018 12:14 pm
by Axel Mertes
Has anyone been able to confirm the issue you are running into, jussssx1?

Can you elaborate on the exact behavior and how we may be able to reproduce it?
Is there a thread related only to this subject (please point me to it - if there is one)?

I'd love to verify and confirm this on my end, as I am planning new servers for use with PrimoCache and I really need to know if there are any serious obstacles ahead. Right now we run a dual-CPU system and can't really report any issues, except that on rare occasions files don't show up in time on network shares or can't be removed/modified for a while. Usually closing all open files resolves the problem; in the worst cases we need a server reboot.

So please let me know as much as possible.

Thanks
Axel

Re: PrimoCache and File Based Synchronization

Posted: Wed Mar 21, 2018 12:56 am
by InquiringMind
jholland wrote:...They are not thread safe, so we create a file and if that file is present, we sit and wait for it to disappear. For a while this worked OK, but we've started seeing this file creation hang...Could this potentially be coming from PrimoCache? We have it set up with a deferred write of 10 seconds...
PrimoCache works at block-level rather than file-level so is unlikely to be the cause here. A more probable scenario is one of your tasks preventing full file deletion (e.g. by maintaining a handle to the file) blocking subsequent file (re)creation. You may find Process Monitor a useful tool here since it can track file access/usage.

As you no doubt appreciate, creating/deleting files isn't the best method to synchronise processes - mutexes or inter-thread messaging would be a better (and faster) option. So this might be a case of a "quick and dirty" solution starting to fail as workload increases.
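If the lock file has to stay for now, one thing worth checking is whether it is created atomically. A plain touch succeeds even when the file already exists, so two tasks can both believe they own the lock. Below is a minimal Python sketch of exclusive creation. The names `acquire_lock` and `release_lock`, the 600 second default timeout (picked to match the ten-minute tasks mentioned above) and the polling interval are assumptions for illustration, not anything from the original setup.

```python
import os
import time


def acquire_lock(path, timeout=600.0, poll_interval=0.1):
    """Create the lock file exclusively; return True once it is ours.

    Returns False if the lock could not be taken before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL makes "check and create" a single atomic
            # step: the call fails if the file already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except (FileExistsError, PermissionError):
            # Assumption: on Windows a file that is deleted but still held
            # open by another handle can surface as PermissionError, so it
            # is treated as "busy" here. A genuine permissions problem would
            # therefore also end in a timeout.
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
            continue
        # Record the owner's PID to help track down stale locks.
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        return True


def release_lock(path):
    """Remove the lock file so the next waiter can proceed."""
    os.remove(path)
```

Typical use would be `if acquire_lock(path): try: ... finally: release_lock(path)`. Returning False on timeout rather than waiting forever means a hang shows up as a reportable failure, which should also make it easier to correlate with Process Monitor traces.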
jussssx1 wrote:...We experienced some hiccups with multi-socket systems because rapid file I/O sometimes failed (file-based OLTP apps). We did not experience the issue without PrimoCache, but it may be related to the Windows kernel bug that Bruce Dawson recently found
Could this be due to the IO completing too quickly? (in this case, suspending PrimoCache should stop the symptoms). ProcMon may be of help here, but it depends on whether (or if!) you were getting an error message from Windows or the app itself.

Re: PrimoCache and File Based Synchronization

Posted: Wed Mar 21, 2018 6:48 am
by Axel Mertes
We have PrimoCache 2.4 Server running on Windows Server 2012 R2 x64 on a dual-socket Xeon system with 96 GB RAM and a 2 TB SSD cache. It's an Intel X5650 dual hexacore system. I have not witnessed any race conditions as far as I can tell.

Re: PrimoCache and File Based Synchronization

Posted: Sun Mar 25, 2018 12:02 pm
by jussssx1
Axel Mertes wrote:Has anyone been able to confirm the issue you are running into, jussssx1?
Can you elaborate on the exact behavior and how we may be able to reproduce it?
Axel
Hello Axel. After applying KB4090914 we have not seen the issue anymore. It might have been related to the file-flush kernel issue; see https://randomascii.wordpress.com/2018/ ... ernel-bug/

-Jussi

Re: PrimoCache and File Based Synchronization

Posted: Sun Mar 25, 2018 6:31 pm
by Axel Mertes
I will have a look at this after the weekend.

Thanks for letting us know.

Does this fix come with regular updates as well?
Which exact OS version are you using?
We use Windows Server 2012 R2 x64 Datacenter.

Cheers