
improving 50 GB linear database performance

Posted: Wed Jun 22, 2016 4:13 am
by thunderzhao
Hi guys, I'm new to PrimoCache, currently on the trial period to see if it will speed up my work environment.

Our server has a RAID 1 of two 10k drives holding a 50 GB linear database and 150 GB of image data, with 25 PCs connected to it over a gigabit network. It's still quite slow because of the large linear database.

I'm testing PrimoCache with a 10 GB RAM L1 cache and a 120 GB SSD L2 cache. I'm thinking of ordering a 500 GB Samsung EVO to use as the L2 cache. Will I get a 100% L2 read hit rate, given that the entire C: drive is about 200 GB, much less than 500 GB?

Or should I do RAID 1 with 2x 500 GB drives, or 4x 500 GB in RAID 10? Will that wear out the SSDs very fast?

Please help.
Thanks

Re: improving 50 GB linear database performance

Posted: Wed Jun 22, 2016 8:46 am
by Support
I think even if you use just the 120 GB SSD L2 cache + 10 GB RAM L1 cache, you'll get quite a high read hit rate.
It will be faster still if you store the data (linear database & image data) directly on an SSD or SSD RAID. But if you don't want to migrate the data from the old disks/RAID to an SSD/SSD RAID, then L2 caching is a good choice.
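
To put rough numbers on the hit rate (a sketch in Python; the 200 GB figure comes from the question above, and the warm-up behaviour is my assumption):

Code:

# Rough L2 sizing check: once every block has been read at least once,
# a cache larger than the cached volume can keep everything resident.
data_gb = 200                # whole C: drive (database + images + OS)
for l2_gb in (120, 500):     # current test SSD vs planned Samsung EVO
    coverage = min(l2_gb / data_gb, 1.0)
    print(f"{l2_gb} GB L2 covers up to {coverage:.0%} of the volume")
# 120 GB covers 60% at best; 500 GB covers 100%, so the read hit rate
# should approach 100% after warm-up (cold first reads still hit the HDDs).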

Re: improving 50 GB linear database performance

Posted: Wed Jun 22, 2016 11:12 am
by Axel Mertes
Hi!

If you are already considering two or more 500 GByte SSDs, why not build a RAID 1 of two 500 GByte SSDs instead of caching an old HDD?
You can still do daily backups from your SSD RAID to HDD (e.g. with Syncovery or similar) if you are hesitant to trust SSD reliability.

You apparently have quite a small amount of data to deal with (I am coming from the TByte business), so in your case I would consider going straight for maximum performance and making it all-SSD (except for the backup, as explained above).
You can then still employ L1 RAM caching if necessary. In that context I would recommend increasing the RAM size if possible.

The difference from employing a RAM cache and using SSD over HDD is dramatic. You will get almost instant responses on your network clients, and the bottlenecks due to HDD seek times will simply be gone.

In that context:
Consider using PCIe or NVMe SSDs over SATA SSDs if you can. Clearly that depends on your motherboard: which PCIe slots are there (hopefully free), and how fast they are.

The point is that the throughput of a PCIe or NVMe SSD is *DRAMATICALLY* higher than that of any SATA SSD - at quite comparable prices.

Look for instance at the Samsung SM950 M.2 (the SM961 will become available soon too). The SM950 is spec'ed for ~2.5 GByte/s reads and 1.5 GByte/s writes; the SM961 is said to do 3.2 GByte/s reads and 1.8 GByte/s writes. There are relatively inexpensive M.2-to-PCIe adapter cards if your motherboard has no M.2 slot. However, you will not be able to move data into your network at that speed unless you take the next step: add a 10 GBit or dual 10 GBit card to your server and use a network switch with at least a 10 GBit (or dual 10 GBit) server uplink and 1 GBit connections to the clients. That kind of technology is becoming quite cheap nowadays. So even if you don't do it right now, investing in the right type of SSD will let you take that next step of removing the network bottleneck later without having to buy new SSDs.
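
To put numbers on that bottleneck (a rough Python calculation; the link speeds are nominal and ignore protocol overhead, so real SMB throughput will be lower):

Code:

# Nominal sequential throughput in MByte/s for the links discussed above.
links = {
    "1 GBit Ethernet":       1_000 / 8,   # ~125 MByte/s per client
    "dual 10 GBit uplink":  20_000 / 8,   # ~2500 MByte/s
    "SATA 600 SSD":             550,
    "SM950 M.2 NVMe (read)":   2500,
}
for name, mbs in links.items():
    print(f"{name:>24}: {mbs:7.0f} MByte/s")
# A single gigabit client caps out at ~125 MByte/s, so even a SATA SSD
# saturates it; the NVMe headroom only pays off behind a 10 GBit uplink.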

Alternatively you can use a true PCIe SSD from Mushkin, OCZ, Intel, etc. Look at their specs beforehand; some aren't faster than a typical SATA SSD, but some break the 2 GByte/s limit easily.

M.2 SSDs usually cost as much as their same-sized SATA counterparts, sometimes even less (no housing, etc.).
PCIe SSDs are slightly more expensive, but significantly faster than any SATA SSD can be.
Some PCIe SSDs are simply host cards carrying 2 or 4 M.2 SSDs.

Important:
There are also M.2 SSDs with a SATA interface. Never use these; they are obsolete products. True NVMe M.2 SSDs are always available, and they are typically 4-6 times faster than M.2 SATA models - at almost identical prices.

Re: improving 50 GB linear database performance

Posted: Thu Jun 23, 2016 11:56 pm
by thunderzhao
Axel Mertes wrote:If you are already considering two or more 500 GByte SSDs, why not build a RAID 1 of two 500 GByte SSDs instead of caching an old HDD? [...]
Thank you for your suggestion. To move the data to the new SSD, can I clone one of the RAID 1 drives to an SSD?

Re: improving 50 GB linear database performance

Posted: Fri Jun 24, 2016 6:33 am
by Axel Mertes
Technically you can clone to an SSD, but many older mainboard RAID controllers don't really support SSDs natively, so TRIM operations would not work - and TRIM is quite essential for SSD use.

Copying 50 GBytes usually takes just a few minutes.
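
For example (simple arithmetic; the source speeds are typical values, not measurements of your RAID):

Code:

# Time to copy the 50 GByte database at a few plausible source speeds.
size_gb = 50
for speed_mbs in (150, 300, 500):  # 10k HDD RAID 1, SATA 300 SSD, SATA 600 SSD
    minutes = size_gb * 1000 / speed_mbs / 60
    print(f"at {speed_mbs} MByte/s: ~{minutes:.1f} minutes")
# Even from the 10k HDD RAID 1 at ~150 MByte/s this is under 6 minutes.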

I'd rather recommend you set up your SSD and simply copy everything over from the RAID to the SSD. Later you can use the RAID to make regular backups of the SSD, which might in turn spare you an extra backup SSD.

As the SSD is super fast and has essentially no head-seek overhead, you can easily run an incremental backup or snapshot on an hourly basis without affecting your general system performance. So you would feed your clients/Ethernet from the SSD as the main source drive and make constant copies of that data to your RAID HDDs. You could use something like Syncovery, which is fairly cheap and useful for disk-to-disk backups; a do-it-yourself sketch follows below.
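
If you want to try the do-it-yourself route first, here is a minimal sketch of that hourly mirror using Windows' built-in robocopy (the drive letters and paths are placeholders; schedule it with Task Scheduler):

Code:

# Mirror the SSD data volume to the HDD RAID. /MIR copies changes and
# deletes files removed at the source; /R and /W limit retries so a
# locked database file doesn't stall the whole run.
import subprocess

SOURCE = r"S:\data"         # hypothetical SSD volume
DEST = r"R:\backup\data"    # hypothetical HDD RAID volume

subprocess.run(["robocopy", SOURCE, DEST, "/MIR", "/R:2", "/W:5"])
# Note: robocopy exit codes 0-7 signal success; 8 and above mean failure.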

I would also recommend defragmenting your RAID regularly, at least once a day, to keep the data intact and protect your system against data loss from a crash. Make sure you always have more than one copy of your database - not just on two drives, but also more than one state in time: yesterday & today, last Monday, the 1st of the month, etc. (a pruning sketch follows below).
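
If each backup run writes into a folder named after its date (e.g. R:\backup\snapshots\2016-06-24), a small pruning script can implement such a retention scheme (a sketch; the folder layout and the 30-day window are my own assumptions):

Code:

# Keep daily snapshots for 30 days, but keep 1st-of-month snapshots
# forever as long-term points in time.
import shutil
from datetime import date, timedelta
from pathlib import Path

BACKUP_ROOT = Path(r"R:\backup\snapshots")   # hypothetical RAID path
cutoff = date.today() - timedelta(days=30)

for snap in BACKUP_ROOT.glob("????-??-??"):  # folders like 2016-06-24
    snap_date = date.fromisoformat(snap.name)
    if snap_date < cutoff and snap_date.day != 1:
        shutil.rmtree(snap)   # drop old dailies, keep month-start sets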

On top of that you might add/test PrimoCache to accelerate the SSD from RAM. But honestly: don't use write caching in this scenario, as it could lead to corrupt data in case of a power loss. And that means you only gain a little read speed over the bare SSD, and I doubt you could measure/feel it from the network clients, as the SSD will already easily outperform your network connections (even if it's only SATA 300) - unless you have at least 4 gigabit ports running at full speed from the server into your switch (or 10 GBit connections, as I suggested).

As much as I recommend PrimoCache, in your specific scenario buying one large SSD for your database and using the RAID HDDs as the permanent backup target sounds like a cheap and useful solution. You will be impressed how reaction times change, and you will find that the network becomes the new bottleneck. I'd also recommend software such as Syncovery for the backups and professional defrag software for the RAID, like O&O Defrag, Diskeeper or even the freeware ones.

Re: improving 50 GB linear database performance

Posted: Sat Jul 16, 2016 4:47 pm
by InquiringMind
Axel Mertes wrote:...many older mainboard RAID controllers don't really support SSDs natively, so TRIM operations would not work - and TRIM is quite essential for SSD use.
While I'd agree with everything else posted, my experience (running an SSD RAID for 4-5 years on WinXP) suggests that TRIM is far from essential unless you write pathologically large amounts of data to the SSD (in which case reduced lifespan would seem the more pressing concern). Occasional checks with disk benchmarking tools have not shown any slowdown over that time on my system, which suggests that the SSD firmware's garbage collection suffices (indeed, Bit-Tech's later testing seems to reach the same conclusion).
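
For anyone who wants to check whether TRIM is active on their own system (a small sketch; the fsutil query exists on Windows 7 and later, not on the WinXP setup mentioned above):

Code:

# Query Windows' TRIM setting. "DisableDeleteNotify = 0" means TRIM
# commands are passed to the SSD; "= 1" means they are suppressed.
import subprocess

result = subprocess.run(
    ["fsutil", "behavior", "query", "DisableDeleteNotify"],
    capture_output=True, text=True,
)
print(result.stdout.strip())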
Axel Mertes wrote:...I doubt you could measure/feel it from the network clients, as the SSD will already easily outperform your network connections (even if it's only SATA 300) - unless you have at least 4 gigabit ports running at full speed from the server into your switch (or 10 GBit connections, as I suggested).
Agreed - using PrimoCache on the client workstations might offer a bigger benefit (since caching data would reduce network traffic as well as database access, though it carries the risk of missing recent changes made by other users), but the biggest gains are likely to come from a top-down analysis: how is the database being used, which data/queries are accessed most often, and where could indexes be employed to speed up data retrieval? (A small index illustration follows below.)
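
To illustrate the index point (a generic sketch using SQLite, which is almost certainly not the poster's actual database engine; the table and column names are invented):

Code:

# Show how an index turns a full table scan into an index search.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, customer TEXT)")

query = "EXPLAIN QUERY PLAN SELECT * FROM images WHERE customer = ?"
print(con.execute(query, ("acme",)).fetchall())  # plan: SCAN of the table

con.execute("CREATE INDEX idx_customer ON images (customer)")
print(con.execute(query, ("acme",)).fetchall())  # plan: SEARCH using idx_customer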

As you mention 150 GB of image data, it may be worth checking whether it uses lossy compression - JPEG can greatly reduce image size with little visual impact on "natural" images (e.g. photographs), but doesn't work as well on those with sharp contrasts (screen prints, scanned documents). Depending on how the images are used, a hybrid system (highly-compressed JPEGs for database queries, with high-quality lossless PNGs stored for archival use) could be worth investigating; a quick size comparison is sketched below.
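
A quick way to measure the potential saving on a sample image (a sketch using the Pillow library; the file names are placeholders and quality 80 is just a starting point):

Code:

# Compare a lossless PNG master against a compressed JPEG working copy.
import os
from PIL import Image

master = "sample.png"                     # hypothetical archival image
img = Image.open(master).convert("RGB")   # JPEG has no alpha channel
img.save("sample_working.jpg", "JPEG", quality=80)

png_kb = os.path.getsize(master) / 1024
jpg_kb = os.path.getsize("sample_working.jpg") / 1024
print(f"PNG master: {png_kb:.0f} KB, JPEG working copy: {jpg_kb:.0f} KB")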

Re: improving 50 GB linear database performance

Posted: Sat Jul 16, 2016 5:57 pm
by Axel Mertes
InquiringMind wrote:[..]Agreed - using PrimoCache on the client workstations might offer a bigger benefit (since caching data would reduce network traffic as well as database access, though it carries the risk of missing recent changes made by other users), but the biggest gains are likely to come from a top-down analysis: how is the database being used, which data/queries are accessed most often, and where could indexes be employed to speed up data retrieval?[..]
You simply can't use PrimoCache on a client to cache a network shared drive. It's impossible.

Network shared storage (Windows SMB) is file-based, not presented as block storage, so it's invisible to PrimoCache and cannot be cached this way. That would require a completely different caching architecture.

So don't get confused:
Having a central database is a *good thing*, and caching it locally on the server with PrimoCache will accelerate it. Even better would be to run it directly from a native SSD in the server rather than from an HDD with SSD caching, given the small size mentioned.

Network reaction times will improve significantly when using SSDs. Putting PrimoCache on top (with lots of RAM) may improve them further (read cache only, please ;-) ).