Rally is back in Stockholm, apparently this years event was the largest Rally event ever outside US. Even bigger then last time. That is nice and even though I only participated part-time this year I just love it.
This blog has been written during the event but was finished almost a week later. Timeline within the blog is therefore a bit confusing. But I prefer to leave it that way so that you can read it as I experienced it.
This Conference is obviously small compared to SQL Pass Summit in Charlotte some weeks ago, but the speakers are as good, food seems even better and Tuesdays entertainment sound promising.
“Setup best practice of the year” – Brent Ozar
I attended Brent Ozar’s pre-con the first day. And it’s amazing that there is always something ew to pickup, Brent has the same kind of background as me, learned SAN and virtualization because admins in these areas didn’t understand SQL related problems. SQL is after all the most demanding servers in most SAN or virtualization environment.
I won’t go into detail about all that was discussed during the day, I’m just going to comment on some of the notes I took during the session.
BrentOzat.com is a great source to get deeper into all of this. There are ton’s of free material to explore.
Try out sp_AskBrent with @expertmode=1 to find what your SQL Server is waiting on.
Try sp_Blitz to get at start on troubles in your SQL server.
Combine with Adam Machanic’s sp_WhoIsActive and use the @get_plans=1 parameter and you have a wonderful combination of tools to troubleshoot your environment.
Look also for the triage_waitstats script to look deeper into waits.
Best advice to boost performance of your SQL server is to buy memory instead of upgrading SAN. RAM is so cheap these days, getting enough RAM to hold your complete DB in memory is the best way to get rid of storage IO.
To define your hardware for HA and DR the best way you need to define RPO (RecoverPointObject) and RTO (RecoveryTimeObject). In other word, how much data can you lose and how long can your environment be down.
All systems needs to be defined in the following matrix.
none to 1 sec
Depending on your choices above you then have several options to achieve this RPO and RTO. AlwaysON, clustering, mirroring, replication in various combinations and with underlying storage replication and such.
AlwaysON is not always the best choice, it is still a product that requires a lot of hands on.And it’s not as easy as running just a wizard. You might end up with a less reliable environment if you don’t do it right.
Run more virtual hosts instead of instances. Combining more instances on one host often more difficult than running multiple VM’s instead. And if you are running Enterprise edition, licensing that on the hypervisor level, means you can run an unlimited number of instances on that host. Might be a solution for your environment.
Always start all VM’s with 2 cpu and increase when you can see wait’s on CPU, giving a VM more CPU can actually make it slower.
I’m a big fan of PCI flashcards and Intel 910 seems to be a bargain, need to check it up. It comes in 600 and 800 Gb size and delivers up to 2 Gb/s.
PCI cards with caching software is maybe a good idea, but Products like OCZ ZD-XL or FusionIO with IOTurbine adds an extra layer of problem with bugs etc. I was interested in these at first but I Think for SQL Server it is better to avoid this.
“Putting SSD in a SAN is like flushing money down the toilette”, according to Brent Ozar. SSD is better local to SQL Server. Fabric will still be the biggest bottleneck.
A normal core these days is capable of receiving about 200 MB/s. To saturate all cores in a 2 socket server with 4 cores each the SAN needs to deliver 1600 MB/s.
Imagine the SAN needed to saturate 2 socket, 10 cores….
Vmware 5.5 with local SSD storage is ideal for SQL server, it can be very good for standard edition with limited memory, cache can appear bigger with this solution. Less SAN IO.
INFINIO also has release a Product last week that uses RAM in vmware host to cache IO. All hosts forms a pool of RAM that is shared to get better cache hit rate. Would like to try that out in the future.
You can also check out brentozar.com/go/filecache for more info on this subject.
To check if there is enough RAM in your server run perfmon. Target memory should be same as max memory otherwise we have memory pressure
“Getting the most from your SAN – file and filegroup layout” – Stephen Archbold
User databases actually has the same hotspot problem as tempdb, adding extra files to a filegroup gives an immediate performance increase of up to 15%.
Check that read ahead of 512 kb/s test with an aggregate select on fact table and watch avg. bytes /transfer in perfmon .
The following startup parameters for SQL Server can be used to get better IO performance, investigate and make sure you understand how they should be used and for which enviornments. Google for SQL Server Datawarehouse Fast Track from Microsoft and read it all.
“Use your brain to beat SQL Server” – Adam Machanic and Thomas Kejser
Sherlock Holmes kind of feeling in this session. Thomas and Adam tried to learn us tricks we can use to manipulate how the optimizer creates queryplans. Especially when the queryplans is not looking the way we want and we can’t fix it with normal solutions. I will not even try to recap what was said during the session, I was just amazed. Looking forward to the presentation material when becomes available.
“Index Impact Workload Analysis” – Klaus Aschenbrenner
Klaus Aschenbrenner had an interesting session about new SQL 2012 feature named distributed replay. This tool can be used to record a trace of all activity in system for a specified time. The trace can then be moved to another system where is can be re run. This gives excellent opportunities to test out indexes or other changes to a system without affecting the production environment. Load testing can also be done by utilizing up to 16 clients to run the trace.
I have only one problem with this and that is that prod and test environment needs to be very similar so that we don’t get different queryplans and result just because of the HW differences.
In the evening on Tuesday there was a dinner with Brynolf and Ljung entertaining, they are two Swedish magicians how gave us an amazing show in magic and illusions.
“Data Warehouse in the Cloud – Marketing or Reality?” – Alexei Khalyako
Alexei tried to convince us that we can actually run DW’s in Azure. He has a point and for some customers this probably a viable solution, but for the majority of customers I still Think performance is not enough. Give it a year and will be possible.
“Automating Data Warehouse Patterns Through Metadata” – Davide Mauri
Davide is a great speaker and has a way of getting your attention that only Brent Ozar can beat. In this presentation he discussed different DW load patterns and how we can solve them and in the end he gave us insight in how we can automate much of the “monkey work” in a DW development with BIML (Business Intelligence Markup Language).
“HA and DR in the context of the SLA” – Tobiasz Koprowski
Was a nice recall of what we actually have SLA for. It’s not always as easy as just saying that a DB has a 99,999% uptime, all components a system is relaying must be added to the equation to get the real uptime. This polish guy did a great job describing the consequences.
“Does Optimistic Concurrency Resolve Your Blocking Problems?” – Margarita Naumova
Maggie Naumova walked us through the problems with locking and how we can solve some of the problems with RowLevelVersioning. I have personally used this on several systems to solve locking problems with good result.
She finished of by explaining how most of the locking problems we have to today will be solved by SQL 2014.
In Summary I had some very nice days at Arlanda Airport where the Conference was held. I meet old friends and also made some new. I learned alot, especially from Brent, how made me think twice about some coming purchases.
I’m sorry I could not attend SQL Rally in Amsterdam also. That would have been nice.