ReFS vs. NTFS, Stripe vs. Simple Storage Space, Diskeeper vs. PrimoCache
Posted: Fri Sep 30, 2016 12:43 pm
Hi All!
I am currently running extensive tests on a 16-bay, dual 4 Gbit FC RAID subsystem directly connected to a dual hex-core Xeon server running Windows Server 2012 R2.
In an attempt to design the best-performing system for a future server replacement in my company, I am investigating different ways of connecting, controlling and using the storage, and comparing the results. Besides many questions that are still not fully answered, I have made a number of findings which I'd like to share here.
First, I ran a long series of tests comparing different ways to set up the storage. For that I configured the FC enclosure either as a 2x8 drive JBOD (2 FC host connections to two server ports), 2x8 RAID0, 2x8 RAID1, 2x8 RAID5 or 2x8 RAID6.
For the JBOD and RAID0 configurations I tried Storage Spaces with single or dual parity, and mirror modes with two or three data copies. Of course I also looked at a Simple storage space, but only as a performance check, since on plain JBOD that makes no sense safety-wise: there is NO chance of surviving any failure. Without going into detail, I can summarize that the Storage Spaces redundancy features (single/dual parity, two-way/three-way mirror) cost a lot of disk space and performance. In fact, performance drops to as low as a quarter of what is otherwise possible, which is why I rule them ALL out at this point.
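Just for reference, these resiliency modes map to the following PowerShell options. This is only a sketch with placeholder pool and disk names; the same choices exist in the Server Manager wizard, and each virtual disk would be created one at a time per test run.

  # Pool all poolable disks of the JBOD (pool name is a placeholder)
  New-StoragePool -FriendlyName "TestPool" -StorageSubSystemFriendlyName "Storage Spaces*" `
      -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

  # Single vs. dual parity (dual parity needs at least 7 disks in the pool)
  New-VirtualDisk -StoragePoolFriendlyName "TestPool" -FriendlyName "Parity1" `
      -ResiliencySettingName Parity -PhysicalDiskRedundancy 1 -UseMaximumSize
  New-VirtualDisk -StoragePoolFriendlyName "TestPool" -FriendlyName "Parity2" `
      -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 -UseMaximumSize

  # Two-way vs. three-way mirror
  New-VirtualDisk -StoragePoolFriendlyName "TestPool" -FriendlyName "Mirror2" `
      -ResiliencySettingName Mirror -NumberOfDataCopies 2 -UseMaximumSize
  New-VirtualDisk -StoragePoolFriendlyName "TestPool" -FriendlyName "Mirror3" `
      -ResiliencySettingName Mirror -NumberOfDataCopies 3 -UseMaximumSize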
So the first result was to use the chassis RAID controller in RAID6 mode, as the trade-off versus RAID5 is minimal: read speeds are identical and write speeds are about a tenth lower, but the redundancy is far better (2 drive failures tolerated per 8-drive set). However, 25% of the total disk space goes to parity. While the actual numbers are not THAT important, they have been helpful in deciding which way to go among the sheer endless mixture of possibilities.
With the RAID enclosure running in RAID6 mode, Windows does not really need to take care of redundancy itself. The remaining risk is what happens if a whole drive set fails, e.g. through a power outage (unlikely, as we have a 40 kWh 3-phase UPS behind it and everything has at least 2 PSUs) or, more likely, a broken cable/SFP link or a lost controller. I have not tested this scenario yet, but I have reasonable experience with NTFS stripes surviving such situations.
Comparing the test results, it turned out that a storage space with a virtual disk in "Simple" mode is noticeably faster at writing than a "standard" Windows disk stripe using dynamic disks. Dynamic disks have always felt a bit risky to me, with lots of possible problems underneath and little industry support. Read speeds are close to each other. So a storage space looks like a good idea over dynamic disk striping.
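For reference, a Simple storage space across the two RAID6 LUNs can be built roughly like this (again only a sketch with placeholder names; the two LUNs simply show up as two large poolable disks):

  # The two RAID6 LUNs appear as two poolable physical disks
  New-StoragePool -FriendlyName "R6Pool" -StorageSubSystemFriendlyName "Storage Spaces*" `
      -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

  # "Simple" = striping across the two LUNs, no extra Storage Spaces redundancy
  New-VirtualDisk -StoragePoolFriendlyName "R6Pool" -FriendlyName "R6Simple" `
      -ResiliencySettingName Simple -NumberOfColumns 2 -UseMaximumSize

  # Bring it online and format it (NTFS here; ReFS works the same way)
  Get-VirtualDisk -FriendlyName "R6Simple" | Get-Disk |
      Initialize-Disk -PartitionStyle GPT -PassThru |
      New-Partition -UseMaximumSize -AssignDriveLetter |
      Format-Volume -FileSystem NTFS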
The next point is to decide which file system is better: NTFS or ReFS.
I am still undecided on this one. Much of the information around ReFS makes it sound like a good choice, e.g. it is said to be "self-healing" and no longer requires CHKDSK. Then there is the fact that it scrubs data from one place to another and thereby essentially defragments the volume more or less automatically over time. Plus, it increases the maximum file/path name length, among some other important improvements. However, there are some drawbacks to using ReFS too:
- There is still little third-party support; for example, Condusiv Undelete Server does not support ReFS yet (no response yet on whether it ever will, and when). Undelete Server can save your butt when someone deletes the wrong file or folder via an SMB share.
- ReFS is new. We don't know if there are still bugs in it. Some users apparently had really bad experiences in the past, and I cannot tell whether their problems have been resolved in the meantime.
So some conservative arguments speak for NTFS!
What about speed, is there any difference?
After a long series of benchmark runs, I see minimal to no difference in speed between NTFS and ReFS.
So, here are some real numbers:
ATTO Disk Benchmark 3.05
Transfer size 4 KB to 64 MB, 256 MByte total length, Queue Depth 10
2x8 RAID6 NTFS Stripe ~511 MByte/s writes, ~777 MByte/s reads
2x8 RAID6 ReFS Stripe ~515 MByte/s writes, ~775 MByte/s reads
2x8 RAID6 NTFS Simple Storage Space ~653 MByte/s writes, ~811 MByte/s reads
2x8 RAID6 ReFS Simple Storage Space ~648 MByte/s writes, ~811 MByte/s reads
Transfer size 4 KB to 64 MB, 32 GByte total length, Queue Depth 10
2x8 RAID6 NTFS Stripe ~483 MByte/s writes, ~674 MByte/s reads
2x8 RAID6 ReFS Stripe ~485 MByte/s writes, ~668 MByte/s reads
2x8 RAID6 NTFS Simple Storage Space ~485 MByte/s writes, ~749 MByte/s reads
2x8 RAID6 ReFS Simple Storage Space ~496 MByte/s writes, ~749 MByte/s reads
So with the 256 MByte transfers we see a huge write-speed advantage (about +140 MByte/s, +28%) for the Simple storage space over the stripe sets.
And with the 32 GByte transfers we see a clear read-performance advantage (about +75 MByte/s, roughly +11%) for the Simple storage space over the stripe sets.
Given the minimal speed differences between NTFS and ReFS, and taking into account the conservative argument that NTFS is a proven file system with lots of third-party support, I currently tend to stay with NTFS for a while, at least until the third-party support for ReFS grows. In that context it is a pity that Microsoft does not reveal all the details behind ReFS to the public, as that would certainly boost its support and reliability tremendously.
While all of this was about TRUE HDD performance, we can add caching on top, such as PrimoCache, or use a tiered storage model with SSDs and HDDs together in the same storage space.
In that context I found this interesting document about ReFS and Storage Spaces, indicating that it relies heavily on write-back caching (1 GByte by default; shouldn't that be considered dangerous?) and describing how data is moved to and from the SSD storage tier:
https://blogs.technet.microsoft.com/lar ... r-2012-r2/
Another interesting document is this one, explaining step by step how to set up storage tiering:
https://blogs.technet.microsoft.com/ask ... r-2012-r2/
Tiered storage works with NTFS too. I didn't know that before... and it is another argument for NTFS.
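For those who don't want to read the whole article, the tiering setup from the second link boils down to something like the following. Names and tier sizes are placeholders; note the -WriteCacheSize parameter, which is where the 1 GByte write-back cache mentioned above comes from.

  # Pool containing both the SSDs and the HDDs
  New-StoragePool -FriendlyName "TierPool" -StorageSubSystemFriendlyName "Storage Spaces*" `
      -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

  # Define the two tiers by media type
  $ssdTier = New-StorageTier -StoragePoolFriendlyName "TierPool" -FriendlyName "SSDTier" -MediaType SSD
  $hddTier = New-StorageTier -StoragePoolFriendlyName "TierPool" -FriendlyName "HDDTier" -MediaType HDD

  # Tiered virtual disk; -WriteCacheSize overrides the 1 GByte default write-back cache
  New-VirtualDisk -StoragePoolFriendlyName "TierPool" -FriendlyName "TieredVD" `
      -ResiliencySettingName Simple -StorageTiers $ssdTier,$hddTier `
      -StorageTierSizes 200GB,4TB -WriteCacheSize 1GB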
While I very much like the idea of a tiered storage space, I am not sure how quickly it reacts when moving "hot" and "cold" data between the two tiers. It may not be as transparent as PrimoCache. I am also not sure how to make single SSDs alongside a true hardware RAID redundant without losing too much disk space. And remember, it is all still new, and we have little experience with what happens in a disaster situation.
So for the time being I tend to use PrimoCache as a READ cache, preferably with an SSD L2. I may enable write caching again if everyone else thinks that is safe: Windows does it (1 GByte default!!!), and Diskeeper does it too with its write caching based on "InvisiTasking". Plus, Diskeeper 2016 apparently added RAM READ caching - but no L2 cache like PrimoCache. So a tiered storage space, alone or together with Diskeeper 2016, may come close to PrimoCache right now. But at what price point?
A dedicated cache in front of a secure "main HDD array" sounds much better and more predictable to me than tiered storage. The problem is that I can't set different resiliency levels on the two storage tiers. I may need to solve this in hardware, e.g. using RAID1 mirroring on the SSDs and RAID6 on the HDDs and presenting them as two drives to build the tiers from.
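If that hardware route works (RAID1 SSD LUN plus RAID6 HDD LUN), the tiering commands above should stay the same; the only catch I'd expect is that LUNs coming from a RAID controller often report no usable media type, which can be set manually once they are in the pool (placeholder names again):

  # Hardware LUNs often report MediaType "UnSpecified"; tag them so the tiers can be defined
  # (in 2012 R2 this only works after the disks have been added to a storage pool)
  Set-PhysicalDisk -FriendlyName "RAID1-SSD-LUN" -MediaType SSD
  Set-PhysicalDisk -FriendlyName "RAID6-HDD-LUN" -MediaType HDD

After that, the two LUNs should be usable as the SSD and HDD tier exactly as in the example above.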
I'll continue my tests and configuration "trials" with this system to see how to improve it further. At least I was able to rule out some options for performance reasons. I will likely have new RAID hardware in place in the long run; this setup is experimental and meant to help design the best solution.
I hope this may be interesting for some of you.