
Has anyone used PrimoCache to accelerate an iSCSI target under Windows 2008/2012 server?

Posted: Tue Feb 16, 2016 8:45 am
by Axel Mertes
Hi All,

has anyone used PrimoCache to accelerate a local disk that contains iSCSI VHD LUNs shared by the Windows 2008 or Windows 2012 server OS as iSCSI targets?

Given the dramatic price drop in 10 GBit Ethernet, this might be a nice alternative for building an SSD-accelerated SAN replacement:

- Windows 2012 server
- Big local disk array
- Lots of RAM as L1 cache
- 2 TB of L2 SSD cache per physical volume
- Share VHDs on the physical volume as iSCSI target drives via 10 GBit Ethernet

I wonder if anyone has already tried this and can share some real-world feedback here?
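
For context, the target-side setup I have in mind would look roughly like this, sketched in Python driving PowerShell just for illustration. This is only a sketch: the cmdlet names and parameters are quoted from memory of the Windows Server 2012 iSCSI Target role, and the paths and IQN are placeholders, so please verify everything before using it.

[code]
# Rough sketch: create a VHDX-backed iSCSI LUN on a PrimoCache-accelerated
# volume (D:) and expose it to one initiator. Cmdlet names are from memory
# of the Windows Server 2012 iSCSI Target module; paths/IQNs are placeholders.
import subprocess

def ps(command: str) -> None:
    """Run one PowerShell command and fail loudly if it errors."""
    subprocess.run(["powershell", "-NoProfile", "-Command", command], check=True)

# D: is the local RAID volume that PrimoCache accelerates (RAM L1 + SSD L2).
ps(r'New-IscsiVirtualDisk -Path "D:\LUNs\projects01.vhdx" -SizeBytes 2TB')
ps(r'New-IscsiServerTarget -TargetName "projects01" '
   r'-InitiatorIds "IQN:iqn.1991-05.com.microsoft:workstation01"')
ps(r'Add-IscsiVirtualDiskTargetMapping -TargetName "projects01" '
   r'-Path "D:\LUNs\projects01.vhdx"')
[/code]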

Re: Has anyone used PrimoCache to accelerate an iSCSI target under Windows 2008/2012 server?

Posted: Tue Feb 23, 2016 10:02 am
by 0ss555ines
Hello

Unfortunately not; however, given that you are looking at the server OSes, I would hope you have looked at tiered storage under Windows Server 2012.

That supports combining "tier 1" fast disk with "tier 2" slower disk. It is not as easily pluggable as PrimoCache and I seem to remember you need to use Storage Pools.

Anyway, it still looks like it matches your scenario.

Re: Has anyone used PrimoCache to accelerate an iSCSI target under Windows 2008/2012 server?

Posted: Tue Feb 23, 2016 11:50 am
by Axel Mertes
I've been looking into tiered storage under Windows Server 2012; however, I have not yet built a test system.

Deeper research showed that it is rather complex and, unfortunately, currently quite an opaque process.

Coming from a 4 GBit Fibre Channel + Tiger Technologies MetaSAN environment, I did a lot of research. While I find the idea of true tiered storage really appealing, here are some of my thoughts:

- To build a more or less secure system you need at least three external chassis per pool from certain certified vendors, plus quite a bit of setup strategy, as total performance is said to be severely affected by the exact number of disks in the pools, etc.
- Data apparently lives on either HDD or SSD; it is moved between the tiers rather than cached. In return you must build your SSD pool with the same redundancy as your HDD pool, or you create a huge risk.
- Storage pools create redundancy by keeping one or more true copies of the data spread around the storage. I think that means that once you write a file to the HDD pool and it gets moved to the SSD pool for faster access, the file system may keep the first copy on the HDD as a backup. But I do not yet have much information on whether that thinking is correct or how it actually works. It also implies that once you fill your pools beyond a certain level, you can lose your redundancy.
- I have not found any information on whether and how we can control the policy for moving files from HDD to SSD and back (compare the caching policy in PrimoCache).
- I hope that storage pools reduce fragmentation by automatic scrubbing, e.g. by moving files from HDD to SSD and later back to HDD. However, that is hope, not knowledge; I have not found information on this yet. Microsoft should make it 100% transparent. In particular, the new ReFS file system is not openly documented (nor is NTFS), and that proprietary nature currently makes it hard to compare against other use cases.
- There is little to no information regarding benchmarks of storage pools. I talked to my usual FC/SAN/storage vendors and a few more, and the overall conclusion is that almost no one out there has really tackled storage pools yet or has any serious experience, not even with how best to set them up. It is almost totally unknown territory.


So while I would love to deploy a true tiered storage system with SSD and HDD pools including ReFS, it is a big risk to go for it:

- If I get something wrong in the configuration, it is very hard to change, because in the worst case I may need to move dozens to hundreds of TBytes of data.
- My experience with SAN systems tells me that some problems only show up in the long run, not short term on a fresh system. If that is the case, I may get stuck with the same problem: needing to move dozens to hundreds of TBytes of data in a hurry. That is not good.


Currently we have 4 GBit FC SAN drives with MetaSAN shared storage access running side by side with an SMB share solution using PrimoCache and 2 TByte of L2 cache per drive. The hardware is identical RAIDs; the only difference is the way they are presented to the clients. I get higher read/write/access performance via SMB shares on 10 GBit Ethernet than on 4 GBit FC SAN presented as local drives. This is amazing and apparently only possible using PrimoCache. It is not perfect yet and there is a lot of room for improvement, but it is a very good starting point.

If it is possible to use a VHD file on the server and promote it as an iSCSI target to the Ethernet clients, I can give my old RAIDs PrimoCache read cache acceleration and increase performance dramatically. Benchmarks run on the server itself showed read/write speed improvements of 8-10 times (350 MByte/s without, 3000+ MByte/s with PrimoCache), and I/O response is dramatically faster (SSD/RAM access instead of HDD seeks).
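
As a rough sanity check on what those numbers mean for clients, here is a simple back-of-envelope model of how the cache hit rate governs the effective throughput. This is purely illustrative and not how PrimoCache is implemented; the 350 / 3000 MByte/s figures are just my benchmark values from above.

[code]
# Effective throughput of a cached volume as a function of read hit rate,
# using the harmonic mean of the cached and uncached paths.
# The MByte/s figures are the benchmark numbers quoted above, not guarantees.
HDD_MBPS = 350      # uncached RAID read speed
CACHE_MBPS = 3000   # PrimoCache RAM/SSD hit speed

def effective_mbps(hit_rate: float) -> float:
    """Throughput when a fraction `hit_rate` of reads is served from cache."""
    miss_rate = 1.0 - hit_rate
    return 1.0 / (hit_rate / CACHE_MBPS + miss_rate / HDD_MBPS)

for h in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"hit rate {h:4.0%}: ~{effective_mbps(h):5.0f} MByte/s")
[/code]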

My idea was, as a kind of slow transition process, to re-use existing hardware and improve total throughput/speed by employing PrimoCache acceleration. As our old RAID systems support neither SSDs nor SSD caching (and new ones that do are seriously expensive), I think the work-around of using iSCSI plus PrimoCache might turn out to be a good option:

- 10 GBit iSCSI/Ethernet infrastructure rather than 4 GBit FC SAN.
- True RAM L1/SSD L2 caching using PrimoCache
- Higher bandwidth on iSCSI than SMB shares usually have on the same hardware
- 10 GBit point to point dedicated links possible if needed.

Besides that, we are considering stepping away from the shared SAN (MetaSAN), as we found issues with:
- failover not working as desired; a lot of manual administration is required to promote control to the right machine
- occasionally "stalled" systems, which can be "easily" recovered by restarting everything (every machine in the SAN)
- RAID drives getting filled "all over" due to the 200 MByte pre-allocation feature in the SAN software. With concurrent rendering processes, as we have in our render farm and multi-user environment, the disks become heavily fragmented very quickly. Defragmenting SAN drives requires unmounting them; an SMB shared drive is *dramatically* easier to handle.

We may then, step by step, replace the RAIDs with newer ones, and maybe even with Microsoft storage pools some day. However, I need to make a smooth transition in a working environment, one that fits our budgets and does not create too many unknown variables and risks.


In that context:
Has anyone tried PrimoCache on a clustered server?
Can this work?
I have some doubts, but maybe someone knows better?

Re: Has anyone used PrimoCache to accelerate an iSCSI target under Windows 2008/2012 server?

Posted: Thu Feb 25, 2016 6:14 am
by 0ss555ines
That's quite a comprehensive response.

Using storage pools:
* For data integrity reasons you would probably want to use a mirrored configuration for both the mechanical HDDs and the SSDs. You are correct that the data exists in one tier or the other rather than the SSD being used as a cache.
* Storage pools allocate contiguous blocks of (I think) 100 MB and move these around as required. How fragmentation within these blocks is managed is up to whatever volume maintenance is carried out on the volume.
* Microsoft uses the terms hot blocks and cold blocks, so in the simplest sense the tiering policy is "most frequently used". Beyond that I'm not sure they (or any other vendor I have seen use tiering) explain any more; whether that is for competitive edge or because only the techs understand it, I don't know. A toy sketch of the idea follows after this list.
* I would stick with NTFS at present unless you had a good reason to use ReFS.
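
To make the hot/cold block idea a bit more concrete, here is a toy sketch of a "promote the most frequently used slabs to the SSD tier" policy. Purely illustrative: this is not how Storage Spaces actually implements tiering, and the slab size is just the ~100 MB figure mentioned above.

[code]
# Toy model of tiered storage: the volume is divided into fixed-size slabs,
# accesses are counted per slab, and on each maintenance pass the hottest
# slabs are kept on the SSD tier while the rest stay on HDD.
from collections import Counter

SLAB_BYTES = 100 * 1024 * 1024   # ~100 MB slabs, as mentioned above
SSD_SLABS = 2                    # how many slabs fit on the fast tier (toy value)

access_counts = Counter()

def record_access(offset_bytes: int) -> None:
    """Count an I/O against the slab containing this offset."""
    access_counts[offset_bytes // SLAB_BYTES] += 1

def maintenance_pass() -> set:
    """Return the slab indices that should live on the SSD tier (the hot slabs)."""
    return {slab for slab, _ in access_counts.most_common(SSD_SLABS)}

# Simulated workload: slabs 2 and 7 are hammered, slabs 0 and 5 touched once.
for offset in [SLAB_BYTES * 2] * 50 + [SLAB_BYTES * 7] * 80 + [0, SLAB_BYTES * 5]:
    record_access(offset)

print("hot slabs promoted to SSD tier:", sorted(maintenance_pass()))
[/code]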

It sounds like there are a couple of key issues that you would run into using storage pools:
* Looking at MetaSAN it looks like you are after a distributed SAN of some kind. That's probably moving out of the realms of cheap / simple / possible for storage pools.
* It sounds like your performance requirements / expectations might be difficult to achieve by putting together a bunch of commodity components without considerable trial and error. On a more traditional SAN (over multipathed FC) with three tiers and 300 GB of RAM cache we could not get our benchmarks (much) above 1 GB/s from a single host with sequential workloads.

Your points around risk and budget sound reasonable.

Good luck with your R&D.

Re: Has anyone used PrimoCache to accelerate an iSCSI target under Windows 2008/2012 server?

Posted: Thu Feb 25, 2016 8:42 am
by Axel Mertes
My hope is that the iSCSI protocol allows us to get near the true 10 GBit transfer rate.

I have no idea whether iSCSI under Windows Server 2012 with Windows 10 clients would make use of SMB3's new multipathing architecture.
With SMB3, Windows automatically combines the multiple fastest connections to build a true multipath Ethernet route. I've seen demos with nearly 3 GByte/s transfer speed using 4 x 10 GBit links "at once". It would be nice if that worked with iSCSI too; I don't expect it, but I simply don't know yet.
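
For a quick sanity check on those numbers, here is the back-of-envelope arithmetic; the 90% efficiency factor is just a rough assumption for Ethernet/IP/TCP framing overhead, not a measured value.

[code]
# Back-of-envelope: roughly usable payload rate for 1 and 4 aggregated
# 10 GBit Ethernet links. The efficiency factor is an assumed allowance
# for framing/protocol overhead, not a measurement.
LINE_RATE_GBIT = 10.0
EFFICIENCY = 0.90          # assumed protocol overhead allowance

def usable_mbytes_per_s(links: int) -> float:
    bits_per_s = links * LINE_RATE_GBIT * 1e9 * EFFICIENCY
    return bits_per_s / 8 / 1e6   # bits -> MBytes (decimal)

for links in (1, 4):
    print(f"{links} x 10 GbE: ~{usable_mbytes_per_s(links):.0f} MByte/s usable")
# Roughly 1125 MByte/s per link and 4500 MByte/s across four links, so the
# ~3 GByte/s seen in the SMB3 multichannel demo is well within that ceiling.
[/code]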

You wrote you have a 3 tier FC SAN setup with 300 GB RAM cache. How does that work?
Do you have a computer that uses local (tiered) storage like HDD, SSD and additional RAM and behaves like a FC target drive, using multiple FC connections?
If yes, which OS / Software are you using?
How fast is your FC (1/2/4/8/16)?

The reason I am asking is that I have always played with the idea of building exactly that: a computer with locally attached HDDs and SSDs plus RAM as cache, which then provides that storage via e.g. FC as an FC target to other machines. Proprietary systems that do this cost a real fortune. I am sure most of it can be achieved with what is already available, possibly even in the open source arena. Redundancy can be achieved by mirroring, as you wrote.

Using our old RAIDs I am able to achieve around 780 MByte/s of real sequential transfer when reading from or writing to two of them at once in a stripe. This is 4 GBit FC, so nothing spectacular. For local / direct attached storage I favor PCIe SSDs, or in future M.2. It is extremely cheap, the capacity is usually enough for our requirements (rarely beyond 2-4 TByte for a project), and using e.g. Syncovery I can make incremental backups "all the time" to a dedicated 8 TByte HDD or a network share for safety. Our old OCZ RevoDrive3x2 cards do 1.2 GByte/s write and 1.6 GByte/s read; newer models go well beyond 2 GByte/s. Share these via a 10 GBit SMB share and you'll love the results.

For a single machine we rarely need transfer speeds as high as 1 GByte/s or faster. We do a lot of 2K/4K/6K work in film post production, but most of the time using compressed camera codecs. The data rates there are fairly small, and our research showed that uncompressed is useless most of the time and a very big waste of resources. Many of the shops that clung to uncompressed workflows are simply gone, killed by the competition.
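
To put a number on that, here is a quick calculation of what uncompressed playback would demand. The resolutions, the 10-bit RGB packing (4 bytes per pixel) and the 24 fps are just illustrative assumptions; compressed camera codecs land far below these figures.

[code]
# Rough data rates for uncompressed RGB image sequences at 24 fps.
# Assumes 10-bit RGB packed into 4 bytes/pixel (DPX-style); real formats vary.
BYTES_PER_PIXEL = 4
FPS = 24

def mbytes_per_second(width: int, height: int) -> float:
    return width * height * BYTES_PER_PIXEL * FPS / 1e6

for name, (w, h) in {"2K": (2048, 1080), "4K": (4096, 2160), "6K": (6144, 3240)}.items():
    print(f"uncompressed {name}: ~{mbytes_per_second(w, h):.0f} MByte/s per stream")
# Roughly 212 MByte/s for 2K, 849 MByte/s for 4K and 1911 MByte/s for 6K,
# versus a few tens of MByte/s for typical compressed camera codecs.
[/code]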

For us it is more the multi-user and render farm workload that creates massive I/O load, adding up to a few GByte/s in total. So a single server should be able to handle it.

Re: Has anyone used PrimoCache to accelerate an iSCSI target under Windows 2008/2012 server?

Posted: Thu Feb 25, 2016 11:02 pm
by 0ss555ines
I understand Microsoft's approach to increased throughput + redundancy on iSCSI is still the MPIO feature in Windows. This feature is only available on the server versions of the OS; however, I have noted that a Windows 8.1 client can have multiple connected (iSCSI) sessions with some sort of load distribution policy. I have never tested whether this actually queues data on both channels.

I expect the SMB 3.0 aggregation is done via multiple SMB sessions (probably at the Session Setup request or the Create request). If so, it sits way up at the application layer and would be of no benefit to iSCSI. That would also be the safest way to provide the functionality, as it places no specific requirements on the underlying technology.

The 3-tier FC SAN was a couple of million worth of EMC kit. It exposed multiple front-end 16G FC ports across three storage processors. The storage processors had an amount of cache memory in them, which was distributed for reads and redundant for writes (a write needed to be in the cache of two storage processors before it was acknowledged). The storage processors were connected through a private FC network, so that the loss of an FC switch, path, or storage processor would still allow the data to be written from the redundant copy on another storage processor. Stored data redundancy was provided via many small pools (four disks, I think), with blocks wide-striped across many pools. The storage was made up of SSDs, 10k or 15k (sorry, I cannot recall which) and 7.5k disks. Obviously this only did block storage; file storage was served off the same SAN via another device.
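
That write path (acknowledge only once the data sits in two storage processors' caches) is easy to illustrate with a toy sketch; this is not EMC's implementation, just the idea.

[code]
# Toy sketch of a redundant write-back cache: a write is acknowledged only
# after it has been copied into the caches of two storage processors, so the
# loss of one processor does not lose unflushed data.
class StorageProcessor:
    def __init__(self, name: str):
        self.name = name
        self.cache = {}                  # block number -> pending data

    def stage(self, block: int, data: bytes) -> None:
        self.cache[block] = data

def write(block: int, data: bytes, primary: StorageProcessor,
          mirror: StorageProcessor) -> str:
    primary.stage(block, data)
    mirror.stage(block, data)            # second copy must exist first
    return "ACK"                         # only now is the write acknowledged

sp_a, sp_b = StorageProcessor("SP-A"), StorageProcessor("SP-B")
print(write(42, b"frame data", sp_a, sp_b))   # -> ACK
[/code]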

It really does sound like you already have quite an effective system (for your use case) in place. If you could push all your high-performance workloads over SMB3 (using the link sharing for higher throughput), then you could do whatever you wanted on the storage server end to get high performance without having to fiddle with link aggregation / MPIO on iSCSI. Hopefully that would cut your iSCSI workload down to something that works without link aggregation / MPIO; if you could get link aggregation working on top of the iSCSI as well, that would just be a bonus.

Re: Has anyone used PrimoCache to accelerate an iSCSI target under Windows 2008/2012 server?

Posted: Fri Feb 26, 2016 8:27 pm
by Axel Mertes
As far as I know, SMB3 transfers with multipathing enabled only work from 2012 servers towards 2012 / 10 / 8.1 clients. I think peer-to-peer transfers between 10/8.1 machines will not use SMB3 multipathing, but I really need to check that.

You didn't get 1 GByte/s streaming on such a million-dollar EMC beast? What a shame for EMC...

I remember they once visited us in response to my request for a system with 2 GByte/s total transfer. They came up with a solution of roughly 300,000 Euros, eating two racks full of 15k FC drives (EMC TMS3800 or whatever it was called, I don't remember exactly), and they did not even promise the sustained data rate. And after they explained to me that I would have to pay 4,000 Euros extra per single partition on that storage just for setting the partition up, I showed them the door...

The EMC kit was completely over our budget, and its power draw of 10+ kW was a nightmare. I am looking at TCO, and electricity alone can easily justify buying other hardware. My old RAIDs cost about 5,000 Euros a year for power alone; that is enough to buy one new chassis for the same price, and since we need to replace drives anyway, I can easily decide to go for much larger disks and concentrate on SSD caching, so one or two chassis would be "enough". The fewer parts, the less can go wrong. The easiest is probably a fully mirrored system, either actively or passively mirrored.
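
The electricity side of that TCO argument is simple arithmetic; the 0.25 Euro/kWh price below is just an assumed rate for illustration, and the 2.3 kW is roughly what 5,000 Euros a year implies at that rate.

[code]
# Back-of-envelope yearly electricity cost of storage hardware running 24/7.
# The price per kWh is an assumed figure; plug in your actual tariff.
EUR_PER_KWH = 0.25        # assumed electricity price
HOURS_PER_YEAR = 24 * 365

def yearly_cost_eur(avg_kw: float) -> float:
    return avg_kw * HOURS_PER_YEAR * EUR_PER_KWH

print(f"old RAIDs drawing ~2.3 kW: ~{yearly_cost_eur(2.3):.0f} EUR/year")
print(f"EMC-class kit at ~10 kW:   ~{yearly_cost_eur(10.0):.0f} EUR/year")
[/code]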

If we did not have the requirement of 100+ TByte online capacity, I would go for pure SSD.

For now I tend towards SSD (read) caching until SSD prices come down enough to go 100% SSD.

Or, soon, Intel/Micron's 3D XPoint, replacing Flash/RAM/HDD with one smaller, faster, non-volatile storage. We will see.

Coming back to iSCSI:
I would only need it where traditional SMB sharing may create latencies that are uncommon in FC and, hopefully, iSCSI environments. We have a few workstations that may need such lower-latency storage access. The total bandwidth of one 10 GBit channel should probably be sufficient (so multipathing is not a real need as of now, it just sounds like a neat option).

I guess I will set up a test system in the coming weeks and play with it.