Tag Archives: NetApp

Nutanix, is it a game changer?

Ever heard of Nutanix?

I think it’s time to tell the story about something that might change the data center as we know it.

Traditional infrastructure for virtualization includes servers, network and storage, normally operated by different departments within the IT organization. This often leads to problems when there is a performance issue: everybody starts blaming each other.

It is also, in general, a very expensive solution, with servers, switches and SAN equipment. All of this is often duplicated with redundant hardware to make it highly available, and that extra hardware does nothing most of the time.

With VDI becoming more and more common, even in smaller organizations, it has become a challenge to provide adequate performance for that environment without it being affected by other storage activity.

One way could be to put the VDI environment on its own storage, preferably something with low latency that can deliver high IOPS during boot storms and the like. Here flash-based arrays like Nimbus, Pure Storage or Violin, or hybrid arrays like Nimble, could play a part. Even the big storage providers like NetApp, EMC, HP and IBM now have both hybrid and pure flash-based arrays to compete with the new kids on the block.
But even with these arrays we still have a solution divided into three different areas. Server, network and storage are just faster; they still need to be maintained by different groups of people, and they still need to communicate over a network.

We are moving more and more towards the software-defined data center, with public, hybrid and private cloud becoming the standard for the new data center. We need to adapt to this and get rid of the old legacy systems to be able to smoothly migrate resources between in-house and hosted solutions.

With IT becoming more centralized and outsourcing into cloud solutions becoming some kind of best practice, IT technicians must strive to make the infrastructure as easy to maintain as possible, otherwise it will be outsourced to someone doing it faster and cheaper.

Nutanix, which delivers a kind of appliance with CPU and storage locally in the box, is definitely on the right track here. It delivers a 2U box with four servers, each with two CPUs, up to 512 GB RAM, and SSD and SATA drives, for a total of 8 TB usable storage with high IOPS. A distributed filesystem lets everything scale linearly as more 2U boxes are added, which makes it a very easy solution to manage.

Inline dedupe, compression, remote replication to a box at a DR site, and support for running VMware, Hyper-V or KVM as the hypervisor make it a very attractive solution for running VMs.

There are a number of different hardware specs to choose from: high-density boxes with four nodes, low-latency boxes with two nodes and added storage capacity, or even boxes with Nvidia GPUs for VDI solutions capable of delivering 3D and CAD performance.

If you are already running a VMware shop, just add the box to your VMware data center or DRS cluster and live migrate machines over to the new hosts.

There is so much new about this that I must urge you all to look deeper into it. Below are links to some interesting sites and some white papers.

Update #1:

vSAN is a new feature in vSphere 5.5 that has some of the features Nutanix offers. This article explains Nutanix in more detail and also compares it to vSAN.

http://up2v.nl/2013/08/26/nutanix-virtual-computing-platform-compared-to-vmware-vsan-1-0/

http://www.nutanix.com

http://stevenpoitras.com/

http://stevenpoitras.com/the-nutanix-bible/

http://stevenpoitras.com/2013/11/advanced-nutanix-sql-server-nutanix-best-practices-released/

http://www.nutanix.com/blog/2013/01/02/2013-predictions-the-end-of-big-iron/


SQL Server HA and DR solutions?

I have been trying to figure out the best way to provide HA and DR protection for SQL Server for some time now.

My customer is currently running SQL Server 2008 R2 on a Veritas cluster, with Veritas Volume Manager providing block-level replication between the primary and DR sites. The cluster has four physical nodes on each site with multiple instances for different systems. Two of the hosts on each site have FusionIO flash cards for tempdb storage. All other data is stored on a NetApp SAN connected through dual 8 Gb FC HBAs. Symantec NetBackup 7.6 is used for backups; it uses the latest snapshot technology in NetApp and VMware to back up databases in just seconds. Restores are also done in just minutes.

An upgrade to SQL Server 2012 is in the pipeline and we are also looking into the possibility of virtualizing these servers in VMware. Load on the physical boxes is not that high and the performance of the VMware environment would be sufficient. VMware is running on HP blades, and the blade chassis is connected to NetApp through dual (or more) 10 Gb Ethernet links. Datastores in VMware use the NFS protocol. Veritas is considered expensive and the goal is to get rid of it.

Iometer and CrystalDiskMark give us almost the same disk performance on physical and virtual servers. We are looking into the possibility of using InfinIO and/or IOTurbine in the VMware hosts to get even better performance.

For the last eight months we have been testing AlwaysOn to see if it is the solution for a new environment. AlwaysOn is very nice when used with non-shared storage, and both synchronous and asynchronous replication between hosts and sites seem to work very well. We set up a cluster with three hosts on the primary site and three on the DR site, and created AGs (availability groups) for the SharePoint environments and for other databases, both big and small. We set up AGs local to one site and AGs spanning both sites, and we used a file share witness that could also be moved to the DR site.

All worked well until we performed a planned DR test where we failed over all AGs to the DR site and then pulled the plug between the sites. The primary site cluster stayed up but the DR site stopped because of lost quorum. My mistake of course: I had to manually change the quorum configuration after failing over. I have fixed that now with a PowerShell script that fails over all AGs, resumes replication and changes the quorum configuration. So far so good.

But there is still a problem. As I said, one of the AGs I have created is local to the servers on the DR site; it holds a database used only by an application installed on the DR site, so we don’t need it on the primary site. The problem is that the cluster on the DR site goes down when we lose connectivity between the sites. Microsoft clustering is designed this way to avoid a split-brain situation, but it’s not what I want.
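The quorum behavior described above can be illustrated with a simple vote count. This is only a minimal sketch of a static node-and-file-share-majority model, assuming one vote per node and one vote for the witness; it ignores dynamic quorum and adjusted node weights, and the function names and site labels are made up for illustration.

```python
from __future__ import annotations


def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return votes_reachable > total_votes // 2


def sites_with_quorum(nodes_per_site: dict[str, int], witness_site: str) -> dict[str, bool]:
    """Report, per site, whether it keeps quorum once the inter-site link is cut.

    Simplified model: one vote per node plus one vote for the file share
    witness, which is only reachable from the site that hosts it.
    """
    total_votes = sum(nodes_per_site.values()) + 1  # +1 for the witness
    result = {}
    for site, nodes in nodes_per_site.items():
        reachable = nodes + (1 if site == witness_site else 0)
        result[site] = has_quorum(reachable, total_votes)
    return result


if __name__ == "__main__":
    layout = {"primary": 3, "dr": 3}  # the 3 plus 3 hosts from the test setup
    print(sites_with_quorum(layout, witness_site="primary"))
    print(sites_with_quorum(layout, witness_site="dr"))
```

With three nodes per site there are seven votes in total, so whichever site can still reach the witness holds four votes and stays up, while the other site holds three and stops. In this model a site-local AG cannot save the cluster on the losing side, because quorum is decided for the cluster as a whole.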

One way to solve this would be to have separate AlwaysOn clusters, but that’s expensive and it would be better to use the same servers for more than one thing.

Another way would be to have two FCI clusters and put AlwaysOn on top of them, but that would require shared storage for each FCI on each site, and in VMware this requires FC or iSCSI. If I understand it correctly it could work in Windows Server 2012 and VMware 5.5, but we are not there yet.

On top of this there is the problem of logins and jobs that need to be synchronized between the servers. Having six servers and four AGs makes this a bit complicated.

Jonathan Kehayias at SQLskills has written a small application to help out with this, but it’s still not enough. Check it out here: http://www.sqlskills.com/blogs/jonathan/synchronize-availability-group-logins-and-jobs/
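To show why this grows awkward with six servers, here is a minimal sketch of the comparison step only: given the set of login (or job) names found on each server, work out what each server is missing. It is not how the SQLskills tool works. The server and login names are hypothetical, and collecting the names from SQL Server is left out on purpose.

```python
from __future__ import annotations


def missing_items(inventory: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per server, the names that exist elsewhere but not on that server."""
    all_items: set[str] = set().union(*inventory.values())
    gaps = {}
    for server, items in inventory.items():
        missing = all_items - items
        if missing:  # only report servers that are actually out of sync
            gaps[server] = missing
    return gaps


if __name__ == "__main__":
    # Hypothetical inventory; in practice this would be read from each instance.
    logins = {
        "sql1": {"app_login", "report_login"},
        "sql2": {"app_login"},
        "sql3": {"app_login", "report_login"},
    }
    print(missing_items(logins))
```

A real sync would also have to compare passwords, SIDs and job definitions, and respect which servers actually host replicas of a given AG, which is where most of the complexity sits.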

So, I have started to look into different solutions. I have found the following alternatives that we should consider:

  • VMware native HA with SRM for DR protection
  • SIOS
  • DH2i
  • Veritas

What I’m considering right now is to just put the SQL servers in VMware as they are and use SRM for DR protection; it would give basically the same protection level as the Veritas solution, with block-level snapshot replication in the NetApp instead of block-level volume replication with Veritas. Failover between local hosts on the primary site would be slightly slower because the virtual machine has to be restarted from scratch on a new host, compared to Veritas where only SQL Server has to start up. But Veritas has to take care of shared storage, unmounting and mounting disk groups, and that can also take some time. Failing over to the DR site is manual in Veritas and would be the same in VMware. Data loss would possibly be the same if the line between the sites is not fast enough for peak load.
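The point about the line between the sites can be made concrete with some rough arithmetic. This is a minimal sketch under stated assumptions: the change rate and link speed are made-up figures in megabits per second, and the model ignores snapshot schedules, compression and protocol overhead.

```python
def replication_backlog_mb(change_rate_mbps: float, link_mbps: float, peak_seconds: float) -> float:
    """Megabytes not yet replicated to the DR site after a peak of the given length.

    Rates are in megabits per second; the result is converted to megabytes.
    """
    excess_mbit_per_s = max(0.0, change_rate_mbps - link_mbps)
    return excess_mbit_per_s * peak_seconds / 8


if __name__ == "__main__":
    # Hypothetical figures: 400 Mbit/s of changes over a 200 Mbit/s line for one minute.
    print(replication_backlog_mb(400, 200, 60))
```

Whatever is still in that backlog when the primary site is lost is the data at risk, regardless of whether Veritas or NetApp is doing the replication.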

Using some other third-party clustering product like SIOS or DH2i might be better than Veritas: less complicated and possibly cheaper, but it is something new that needs to be implemented.

With the protection that VMware, NetApp and Symantec provide, is it really necessary to add AlwaysOn protection as well if we don’t need readable secondaries or backup offloading?

What’s your opinion?