Tag Archives: Database

SQL Server HA and DR solutions ?

I have been trying to figure out the best way to HA and DR protect SQL servers for some time now.

My customer is currently running SQL server 2008 R2 on Veritas cluster with Veritas Volume Manager as block level replication between primary and DR site.  The cluster has 4 physical nodes on each site with multiple instances for different system. Two of the hosts on each site has FusionIO flash cards for tempdb storage.  All other data is stored in a NetApp SAN connected through dual 8 GB FC HBA’s. Symantec NetBackup 7.6 is used for backups, it’s using the latest snapshot technology in NetApp and Vmware to make backups of databases in just seconds. Restore is also done in just minutes.

Upgrade to SQL 2012 is in the pipe and we are also looking into the possibility of making these servers virtual in Vmware. Load on the physical boxes is not that high and the performance of the Vmware environment would be sufficient. Vmware is running on HP blades, blade chassi is connected to NetApp through dual (or more) 10 GB Ethernet. Datastores in Vmware uses NFS protocol. Veritas is considered expensive and the goal is to get rid of it.

IOmeter och Crystal diskmark gives us almost the same disk performance between physical and virtual servers. We are looking into the possibility of using either InfinIO and/or IOTurbine in the Vmware hosts to get even better performance.

Last 8 months we have been testing AlwaysON to see if that is the solution for a new environment. AlwaysOn is very nice when used with non-shared storage and both synchronous and asynchronous replication between hosts and sites seems to be working very well. We setup a cluster with 3 plus 3 hosts on primary and DR site and created AG’s for both Sharepoint environments and other both big and small databases. We setup AG’s local to only one site and AG’s spanning both sites and we used a file share witness that could also be moved to DR site.

All worked well until we performed  a planned DR test where we failed over all AG’s to the DR site and then pulled the plug between the sites. Primary site cluster stayed up but DR site stopped because of lost quorum…. My mistake of course, had to manually change the quorum configuration after failing over. Have fixed that now with a powershell script that fails over all AG’s, resumes replication and changes quorum. So far so good.

But there is still a problem. As I said, one of the AG’s I have created is only local to the servers on the DR site, it’s a database only used by application installed on the DR site so we don’t need it on the primary site. Problem is that the cluster on the DR site goes down when we lose connectivity between the sites..  Microsoft  Cluster is designed this way to avoid a split brain situation, but it’s not what I want….

One way to solve this would be two have separate AlwaysOn clusters, but that’s expensive and it would be better to utilize the same servers for more than one thing.

Another way would be two have two FCSI clusters and put AlwaysON on top of that, but that would require shared storage for each FCSI on each site and in Vmware this requires FC or ISCSI. If I understand it correctly it could work in Windows 2012 and Vmware 5.5. But we are not there yet.

On top of this there is the problem with logins and jobs that need s to be synched between the servers. Having six servers and 4 AG’s makes this a bit complicated.

Jonathan Kehayias at SQLskills has written a small application to help out with this, but it’s still not enough. Check it out here http://www.sqlskills.com/blogs/jonathan/synchronize-availability-group-logins-and-jobs/

So, I have started to look into different solutions. I have found the following alternatives that we should consider:

  • Vmware native HA with SRM for DR protection
  • SIOS
  • DH2i
  • Veritas

What I’m considering right now is to just put the SQL servers in Vmware as is and use SRM for DR protection, it would give basically the same protection level as the Veritas solution. Block level snapshot replication in the NetApp instead of block level volume replication with Veritas. Failover between local host on primary would be slightly slower because  it would require the virtual machine to restarted from scratch on a new hosts, compared to Veritas where only SQL server has to startup. But Veritas has to take care of shared storage and unmounting and mounting diskgroups and that can also take some time. Failing to DR site is manual in Veritas and would be the same in Vmware, data loss would possibly be the same if the line between the sites is not fast enough for high peak load.

Using some other third-party clustering like SIOS or DH2i would maybe be better than Veritas, less complicated, possibly cheaper but something new that needs to be implemented.

With the protection that Vmware, NetApp and Symantec provides is it really neccesary to add AlwaysOn protection as well if we don’t need readable secondarys or backup offloadning ?

What’s your opinion ?

Planning my #SQLPass session schedule….

Started planning my session schedule for SQLPASS while I was waiting inline for immigrations at O’Hare airport. It was a long line so I had time to go through most of the details in the #Guidebook and I picked everything I liked.

No I have a schedule with 3, 4 or even more sessions I want to go to for any given time. How should I choose ?  As a #sqlfirsttimer I’m not sure what the best approach is ? Should I go for the   speaker that I have read and listened to on webinars, youtube, the once that I’m following in Twitter and who seems to really know their stuff ? Or should I go for the once that I never heard off and hope that they have something completely new to teach me ?

There are so many interesting areas I would like to dig inn to, should I choose one area all the way or should I mix them as much as possible to get new input ?

BigData, SANLess Clustering, Extended Events, Power BI and Power Map, database corruption, AlwaysOn, Powershell, PDW, 2014 InMemory OLTP, Clustered updatable columnstore indexes, Indexes, Central Managment Server, SQLCAT session, SQL Clinic, Query optimising, etc, etc…

Not to mention all the other activities, First-Timers orientation, Welcome reception, Exhibitor reception, #SQLRun,  Sponsor breakfasts, Keynote, Community appreciation Party at Nascar hall of Fame, Birds of a Feather Luncheon, SQL Server Clinic an SQLCAT, SQL Kilt day and #SQLKaraoke.

I starting to believe I will not make to all of these events. But I will sure try. I will probably end up bying the DVD…

DataQualityServices setup with AlwaysON

Setting up Data Quality Service with Always on is well described in this document:

http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26-FEF9550EFD44/Using_AlwaysOn_Availability%20_Groups%20_for%20_HighAvailability_and_DisasterRecovery_of_DQS.docx.

I used that as guide when I installed DQS at a customer site today.

The procedure was very straight forward and I had no problems at all.

1. First I added DQS feature from the SQL 2012 installation media. I did not have to reboot or restart SQL server so that was easy.

2. I applied the SP1 and CU3 patches that where previously applied for the rest of the SQL server.

3. I repeated this on all the nodes where the AvailabilityGroup was setup to run.

4. After that I had to run the DQSInstaller.exe file in the …\MSSQL11.MSSQLSERVER\MSSQL\Binn folder on each node. This requires me to enter a Database Master Key, you must use the same key on all nodes and be careful not to forget it. It is need when a SQL upgrade takes place.

The DQSInstaller then created three databases.

5. Then I added a Windows account on the first node and gave it permissions to the DQS_MAIN database according to the document.

6. Then I deleted the new databases from the two other nodes in the cluster.

7. Now I could add the three databases as availability databases in the AG. And they where copied out to all nodes.

8. I then copied the Windows account to all three nodes. Keep in mind tha AlwaysOn doesn’t replicate any logins. I use a sp called dbs_CopyLogins, it can downloaded here: http://www.sqlsoldier.com/wp/sqlserver/transferring-logins-to-a-database-mirror, is very useful since it copies the password correctly..

9. Then I installed the DQS client from the SQL distribution and patches it. When I started the Client I Could just enter the AG listener name and could successfully login.

Staffan Olofsson
MCSE: Data Platform