Strange Server Performance Issue

FAQ, getting help, user experience about PrimoCache
Stateless
Level 3
Posts: 10
Joined: Wed Dec 28, 2016 3:17 pm

Strange Server Performance Issue

Post by Stateless »

Hello,

I'm testing the 4.2.0 trial on a server and I'm seeing a very strange performance issue. Any cached volume on the PERC H710P controller caps at about 1.5GB/s read and write. This happens for both an HDD array and an SSD array on the PERC controller. I am testing with HDTunePro using an 8GB file test, and PrimoCache is configured with a 32GB L1 cache.

If I cache the Microsoft StorageSpaces array on the same system, I'm seeing 9GB/s write and 7GB/s read.

I have an identical server with an Adaptec controller; I installed the 4.2.0 trial on it and there is no issue on that machine.

I have tried enabling NUMA-aware since this is a dual-CPU server. I have deleted all the cache tasks and recreated them.

The cache task is configured as 32GB L1 (shared), 512KB block size, defer-write.
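For reference, a rough way to cross-check the benchmark numbers outside of HDTunePro is a small script that simply streams a large file from the cached volume and times it. This is only a sketch, not what I actually ran: the path is a placeholder, and unlike HDTunePro it does not bypass the Windows file cache, so it only gives a coarse upper bound for buffered sequential reads.

import os
import time

PATH = r"D:\cachetest.bin"   # placeholder: a large test file on the cached PERC volume
BLOCK = 512 * 1024           # 512 KB reads, matching the cache block size

size = os.path.getsize(PATH)
start = time.perf_counter()
with open(PATH, "rb", buffering=0) as f:
    while f.read(BLOCK):     # sequential reads until EOF
        pass
elapsed = time.perf_counter() - start
print(f"read {size / 2**30:.2f} GiB in {elapsed:.2f} s "
      f"-> {size / 2**30 / elapsed:.2f} GiB/s")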
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Strange Server Performance Issue

Post by Support »

Quite weird. In theory, the speed should only be related to RAM access speed. Are you running HDTunePro with administrative privileges? Or how about using another test tool like CrystalDiskMark?
Stateless
Level 3
Posts: 10
Joined: Wed Dec 28, 2016 3:17 pm

Re: Strange Server Performance Issue

Post by Stateless »

This server has the PERC H710P volume and the StorageSpaces volume on the same 32GB cache task in PrimoCache. Notice the difference on the StorageSpaces volume. I verified in Task Manager that there is no disk activity during the tests, so the tests are only hitting the cache.
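A scripted version of that Task Manager check would look something like the sketch below. It is only an illustration (it needs the psutil package, and the physical drive name is a placeholder for whichever disk backs the cached volume), but sampling the physical-disk counters around a benchmark run shows whether any reads or writes actually reached the array.

import psutil   # third-party package, assumed to be installed

DISK = "PhysicalDrive1"   # placeholder: the physical drive behind the cached volume

before = psutil.disk_io_counters(perdisk=True)[DISK]
input("Run the benchmark now, then press Enter... ")
after = psutil.disk_io_counters(perdisk=True)[DISK]

print(f"reads:  {after.read_count - before.read_count} "
      f"({(after.read_bytes - before.read_bytes) / 2**20:.1f} MiB)")
print(f"writes: {after.write_count - before.write_count} "
      f"({(after.write_bytes - before.write_bytes) / 2**20:.1f} MiB)")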
[Attachment: PERC System.png]
This is the server with the Adaptec volume on a 32GB cache task. Significantly higher numbers.
[Attachment: Adaptec System.png]
Both systems are running the 4.2.0 trial with NUMA-aware set to ON. Both are using the same cache size and cache task settings.

I've tried uninstalling and reinstalling, deleting and recreating the cache task, caching only the PERC volume, and every block size.

THEN I tried an even newer PowerEdge server with a PERC H730P; this one also has an NVMe StorageSpaces volume. Same exact cache task settings. SAME PERFORMANCE GAP! I again ensured only the cache was getting hit.
[Attachment: PERC System 2.png]
Running HDTune as admin doesn't change anything.
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Strange Server Performance Issue

Post by Support »

Could you try another test tool like CrystalDiskMark instead of HDTune to see if the results are the same?
Stateless
Level 3
Posts: 10
Joined: Wed Dec 28, 2016 3:17 pm

Re: Strange Server Performance Issue

Post by Stateless »

[Attachment: PERC System 3.png]
Same strange difference between volumes that are on the same 32GB cache task. I ran the tests multiple times and verified the benchmark activity never actually hit the disk.
Support
Support Team
Posts: 3623
Joined: Sun Dec 21, 2008 2:42 am

Re: Strange Server Performance Issue

Post by Support »

I'm sorry for the late reply because we were on Lunar New Year Holiday.

Firstly, I'd like to confirm that you enabled NUMA-aware as described on the following page, especially that you restarted the server after running the set command.
https://kb.romexsoftware.com/en-us/2-pr ... numa-aware

PS. Because these drives are in the same cache task, I don't think NUMA-aware is the cause.

According to your detailed testing results, it seems that the performance issue only happens on PERC devices, even though we think the peak performance of the L1 cache should only be related to the CPU and RAM, in addition to the Windows OS.

Having checked the results from HDTune and CrystalDiskMark, we see that under SEQ Q1T1 in CDM (Sequential in HDTune) and RND4K Q1T1, the PERC device's performance is not good, while its SEQ Q8T1 result is better than the other drive's. Based on these results, we suspect that some property of the PERC controller affects the benchmark performance, such as device queue length or NCQ (we are not sure which hardware property is the cause). The benchmark tools, or Windows, take these hardware properties into account when sending IO requests, which affects the benchmark results.
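As a rough illustration of this queue-depth effect, the sketch below compares one reader against eight concurrent readers on the same file. It is only an illustration, not a reproduction of what HDTune or CDM do internally: the file path is a placeholder, and the reads still go through the Windows file cache. If the gap between the two numbers is much larger on the PERC volume than on the StorageSpaces volume, that would support the queue-related explanation.

import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

PATH = r"D:\cachetest.bin"    # placeholder: large test file on the cached volume
BLOCK = 512 * 1024            # 512 KB per read
READS_PER_WORKER = 2000

def worker(seed):
    # issue block-aligned random reads from this worker's own file handle
    rng = random.Random(seed)
    last_block = os.path.getsize(PATH) // BLOCK - 1
    total = 0
    with open(PATH, "rb", buffering=0) as f:
        for _ in range(READS_PER_WORKER):
            f.seek(rng.randint(0, last_block) * BLOCK)
            total += len(f.read(BLOCK))
    return total

for workers in (1, 8):        # roughly "Q1" vs "Q8" style load
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(worker, range(workers)))
    elapsed = time.perf_counter() - start
    print(f"{workers} worker(s): {total / 2**30 / elapsed:.2f} GiB/s")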