tag:blogger.com,1999:blog-80554185473329927882024-03-29T05:02:54.375-06:00Databases and Dead LanguagesAs a professional, I am a data scientist with an interest in the languages, linguistics, and cultures of ancient civilizations.
I primarily work with Microsoft SQL Server, Access, Excel, and PostgreSQL but I have 15 years of experience with databases. I am interested in data analysis, visualization, big data, and business intelligence.
I have a master's degree in Classical Studies from UNM where I taught Latin for four years. I run an online dictionary called "Numen - The Latin Lexicon."Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.comBlogger38125tag:blogger.com,1999:blog-8055418547332992788.post-60824885382591945202018-02-14T14:55:00.001-07:002018-02-14T14:59:32.802-07:00"Cross-Database" Execution Plans Favor Row Mode instead of Batch Mode in Clustered Columns IndexesColumnstore indexes are the most amazing technology. I have come to love them because of their massive increases in compression ratios and also their performance. My database servers can literally have terabytes of data cached in memory in a tiny 25GB footprint. It's some kind of miracle.<br />
<br />
While brushing up for an MCSE refresher in a couple of months, I came across a good series of articles. Itzik Ben-Gan wrote an article about <a href="http://www.itprotoday.com/microsoft-sql-server/what-you-need-know-about-batch-mode-window-aggregate-operator-sql-server-2016-0" target="_blank">batch mode processing in columnstore indexes</a> a year or so ago back when SQL Server 2016 SP1 was released. In them, he provides some examples. I had some tables with clustered columnstore indexes that looked almost exactly like his so I decided to try out his examples. Unfortunately, I couldn't get batch mode to work as advertised.<br />
<br />
Why, I wondered? Perhaps it was a difference of the data? Statistically, his test table and my real table looked almost identical. I kept digging. Then I paused to think about it. What was the difference between his query and mine? Well ... it turns out that I was actually using the <b>master</b> database but referencing the other database with 3-part object notation. It was a form of laziness that has crept into my ad-hoc queries over the past few years. Why bother using a database when I can just reference the database I need? Could that be the problem?<br />
<br />
Lo and behold, when I used the same database as the one where the data resided, the execution plan changed and surprisingly looked exactly like Ben-Gan's.<br />
<br />
I had accidentally rediscovered something I had learned many years ago: "cross-database" execution plans can be skewed because of the way the optimizer views "remote-database" data. Gak!<br />
<br />
Below are the two execution plans (with queries) for reference. Notice that the only difference between the two is that one employs <b>USE BI_DM</b> and the other employs <b>USE master</b>. Both employ 3-part object notation.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie8VQBHqenMdBTttATB27mjzP7PgkURH08ocyrmOKNAWzSmYFI5cHmZgFgK5NqNg2PUvFZZTEjwQ7wT1jbWWiKAj9VDFJBHxo38HSwIHu0egGB6Qy2_Y3d29SexeOEmRB5Tat9WjPn7Me5/s1600/Capture01.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="664" data-original-width="960" height="275" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie8VQBHqenMdBTttATB27mjzP7PgkURH08ocyrmOKNAWzSmYFI5cHmZgFgK5NqNg2PUvFZZTEjwQ7wT1jbWWiKAj9VDFJBHxo38HSwIHu0egGB6Qy2_Y3d29SexeOEmRB5Tat9WjPn7Me5/s400/Capture01.JPG" width="400" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheLMddSCm91qHHLqqPTt-jz-yohIuODkYsCZKeItAqXwMCJeSrG8MaasT-fR9AcgYAInGzXEyHd6nVS5FbXRQMHVcfXst-Kwl0s-bfDWmlJ0DPzG1RwLVm6IHnNK3frUXY97S_JPjuQq0l/s1600/Capture02.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="798" data-original-width="957" height="332" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheLMddSCm91qHHLqqPTt-jz-yohIuODkYsCZKeItAqXwMCJeSrG8MaasT-fR9AcgYAInGzXEyHd6nVS5FbXRQMHVcfXst-Kwl0s-bfDWmlJ0DPzG1RwLVm6IHnNK3frUXY97S_JPjuQq0l/s400/Capture02.JPG" width="400" /></a></div>
<br />
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com1tag:blogger.com,1999:blog-8055418547332992788.post-47877297492468039242015-10-28T09:33:00.002-06:002015-10-28T09:33:16.533-06:00SSDT: What is the Solution Items Folder?<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqL12qPT0IDCd6Qj2d9EhrH1cHasw3ixqhG87fecmzINwlc7HlQhGmVOnKDWxVXfJr0PKlX4pEC23VFvLT9vk_sVnChMRNAgVZJO43cEhqUCZEohyphenhyphen6xSsAWzMBMa1z52EYh-9o_Ck93Fp4/s1600/solutions+explorer+solution+items+unopened.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqL12qPT0IDCd6Qj2d9EhrH1cHasw3ixqhG87fecmzINwlc7HlQhGmVOnKDWxVXfJr0PKlX4pEC23VFvLT9vk_sVnChMRNAgVZJO43cEhqUCZEohyphenhyphen6xSsAWzMBMa1z52EYh-9o_Ck93Fp4/s200/solutions+explorer+solution+items+unopened.png" width="150" /></a></div>
There's a folder in a SQL Server Data Tools (SSDT) Solution Explorer called Solution Items*. It's a folder and its icon has a dotted outline -- so the implication seems to be that it's some kind of <i>ghost</i> folder, almost to say that it isn't quite like the <i>others</i>. I don't know how long it's been there, but today was the first day I noticed it**.<br />
<br />
"What's that?" I wondered to myself. I searched and searched long and hard hours, in the rain, in the snow, and near mountains****. Naturally, the first article that popped up was an MSDN article: <a href="https://msdn.microsoft.com/en-us/library/vstudio/0kz72ax7(v=vs.100).aspx" target="_blank">Solution Items</a>. There weren't very many articles about it, which means it was a good candidate for this blog****. <br />
<br />
In a lot of cases, there are files that a developer might want to draw attention to a file and add it to a Solution that doesn't necessarily fit into the actual code, for instance README files or installation instructions. They are not source files, they are not project files, they are just ... miscellaneous files! But they are still important.<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
In cases like the one below (an SSIS template), such files can be added (or drag-and-dropped!) into the Solution Items folder, as depicted below.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg10P41JuP9hiIBxsjaqqsu_p8rKymbPcTrkJRkfq01-K7xCj5II6VrI01VvdI9VILCasCvpadtyzlWJXj4IZ4jPwUSyCcllvI8aHZ9ozRLLBY4ByEeR3F9qyx_pJ4NnuU2hoHz5-pBtxpL/s1600/solutions+explorer+solution+items.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg10P41JuP9hiIBxsjaqqsu_p8rKymbPcTrkJRkfq01-K7xCj5II6VrI01VvdI9VILCasCvpadtyzlWJXj4IZ4jPwUSyCcllvI8aHZ9ozRLLBY4ByEeR3F9qyx_pJ4NnuU2hoHz5-pBtxpL/s400/solutions+explorer+solution+items.png" width="297" /></a><br />
<br />
<span style="font-size: xx-small;">* For those who don't know, SSDT is just a shell of Visual Studio. Ergo, this information also applies to the full product. </span><br />
<span style="font-size: xx-small;">** Which is weird because I spend 30 hours a week in SSDT. </span><br />
<span style="font-size: xx-small;">*** It's not hard. I live in Denver and it was raining that day. </span><br />
<span style="font-size: xx-small;">**** I try to only post about things that don't already have a good presence in the Internet Mindshare.</span>Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com4tag:blogger.com,1999:blog-8055418547332992788.post-35428580910405734842015-06-11T15:45:00.001-06:002015-06-11T15:47:08.431-06:00This is what I do with my time ... I will upload this Pivot Chart which comes from a data warehouse built off an ETL that pulls from an app on my phone that I use to track all my time.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgXyLJ_YuOZZ8FrNifcqgYDlMSqHEPpEHKa2ukE7XnCxlnyj5AMwpsCU66EdpLp40n5IQRHFgIc0cmrU9xl91o0O9ghuyao7-02svC9tt9hISBlqc3Il5uoI5WzAYWFIg1R4mUmmTH7TuOZ/s1600/area+chart+20150611.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="267" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgXyLJ_YuOZZ8FrNifcqgYDlMSqHEPpEHKa2ukE7XnCxlnyj5AMwpsCU66EdpLp40n5IQRHFgIc0cmrU9xl91o0O9ghuyao7-02svC9tt9hISBlqc3Il5uoI5WzAYWFIg1R4mUmmTH7TuOZ/s400/area+chart+20150611.png" width="400" /></a></div>
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com1tag:blogger.com,1999:blog-8055418547332992788.post-73588657284244302015-04-22T11:05:00.002-06:002015-04-22T13:05:07.110-06:00DAX: Introductory ResourcesAs anyone who has worked with a cube or a tabular model knows, getting data out is easy -- but doing more complex calculations is difficult.<br />
<br />
I've been developing a data warehouse (Kimball) for my job over the last two months and two of our business analysts are trying to figure out how to get data out of the SSAS tabular models. They know about DAX (the query language of SSAS Tabular and PowerPivot) but they've been having a hard time getting basic tutorials related to DAX.<br />
<br />
For analysts new to DAX, I'm going to post some links to sites that I think are helpful for learning DAX. They will be broken down into three categories. I will also try augmenting and improving these lists from time to time.<br />
<br />
<b>Update 20150422</b>: If SSMS is too primitive for you, there is also a tool on CodePlex called <a href="https://daxstudio.codeplex.com/" target="_blank">DAX Studio</a>.<br />
<br />
<h3>
Introductory DAX</h3>
<div style="background-color: white; color: #333333; font-family: arial; font-size: 1.7em; font-weight: normal; margin: 0px;">
</div>
<ol>
<li><a href="https://www.simple-talk.com/content/article.aspx?article=1824" target="_blank">Using DAX to retrieve tabular data</a>. This link is a really excellent starting point if you want to run DAX queries in SSMS. It's a much, much easier sandbox compared to PowerPivot or SSDT. </li>
<ul>
<li>In fact, <a href="https://www.simple-talk.com/author/robert-sheldon/" target="_blank">Robert Sheldon's blog posts</a> are great in general for step-by-step tutorials relating to SQL Server. </li>
</ul>
<li><a href="https://msdn.microsoft.com/en-us/library/gg413422.aspx" target="_blank">Data Analysis Expressions (DAX) Reference</a>. DAX is a Microsoft expression language. It was built to resemble the Excel Formula Bar, but it goes way, way beyond that. Microsoft's documentation is sometimes heavy on the technicals and low on intuition. Nonetheless, it's always a good idea to start at the source. </li>
<ul>
<li>The <a href="https://msdn.microsoft.com/en-us/library/gg399181.aspx" target="_blank">DAX Overview</a> is especially useful. </li>
<li>Also, pay close attention to <a href="https://msdn.microsoft.com/en-us/library/gg492156.aspx" target="_blank">DAX Query Syntax Reference</a>.</li>
</ul>
<li><a href="https://cwebbbi.wordpress.com/2011/07/13/dax-queries-part-1/" target="_blank">DAX Queries, Part 1</a>. Chris Webb also has a multi-part series that is an intro to DAX. </li>
<ul>
<li><a href="https://cwebbbi.wordpress.com/category/dax/" target="_blank">Chris Webb also has an entire series of posts on DAX</a>, though they are typically more advanced.</li>
</ul>
</ol>
<h3>
Intermediate DAX</h3>
<ol>
<li><a href="https://msdn.microsoft.com/en-us/library/gg413423(v=sql.110).aspx" target="_blank">Context in DAX Formulas</a>. Context is an important topic. However, it's not immediately necessary to get data out of your cubes. But rest assured, it will come up. And when it does, it's a <i>challenging</i> topic and will require extra brain-power to tackle. </li>
</ol>
<h3>
Advanced DAX</h3>
<div>
<ol>
<li><a href="http://www.daxpatterns.com/" target="_blank">DAX Patterns</a>. You can pull of some amazing queries and calculations in DAX. Once you've mastered the introductory and intermediate levels, it's time to start learning these great patterns. </li>
</ol>
<h3>
Books</h3>
<div>
<ol>
<li><a href="http://www.amazon.com/Microsoft-Building-Models-PowerPivot-Business/dp/0735676348/ref=sr_1_1?ie=UTF8&qid=1429722189&sr=8-1&keywords=Microsoft+Excel+2013%3A+Building+Data+Models+with+PowerPivot" target="_blank">Microsoft Excel 2013 Building Data Models with PowerPivot</a>. I read this book. It's slightly advanced, but a good reference. I recommend reading it after you've figured out the basics of DAX. I also recommend working through as many of the examples as possible!</li>
<li><a href="http://www.amazon.com/Microsoft-Server-2012-Analysis-Services/dp/0735658188/ref=sr_1_1?ie=UTF8&qid=1428567810&sr=8-1&keywords=SSAS+Tabular+Model" target="_blank">Microsoft SQL Server 2012 Analysis Services: The BISM Tabular Model</a>. This is an advanced book that is aimed at developers of tabular models. I recommend this only if you are going to be building cubes/tabular models.</li>
</ol>
<div>
<br /></div>
</div>
</div>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-51283067487321537442015-01-28T14:49:00.000-07:002015-01-28T15:17:26.416-07:00Pentaho Hosted Demo<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtaayE3kJvVmpLsRDdNlt6pliO9VgUpTF-ZBV6Ceau_blL8T9ARzGV30D003htTdfry1lEnJcCqdB83iwcmDh9sBJW-Oo1oqcTQ4l_zZTP-0nSoZRjP4SDuVYS3W1oGu7xWrEXpXHbXW1s/s1600/07_basic_chart2.gif" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtaayE3kJvVmpLsRDdNlt6pliO9VgUpTF-ZBV6Ceau_blL8T9ARzGV30D003htTdfry1lEnJcCqdB83iwcmDh9sBJW-Oo1oqcTQ4l_zZTP-0nSoZRjP4SDuVYS3W1oGu7xWrEXpXHbXW1s/s1600/07_basic_chart2.gif" height="196" width="320" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtaayE3kJvVmpLsRDdNlt6pliO9VgUpTF-ZBV6Ceau_blL8T9ARzGV30D003htTdfry1lEnJcCqdB83iwcmDh9sBJW-Oo1oqcTQ4l_zZTP-0nSoZRjP4SDuVYS3W1oGu7xWrEXpXHbXW1s/s1600/07_basic_chart2.gif" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a></div>
The <a href="http://demo.pentaho.com/pentaho/Login">Pentaho Hosted Demo</a> went well.<br />
<br />
First and foremost, every time I tried to login, the first attempt failed. I always had to backtrack and login again because the site would just hang at “Just a few moments please.” This is extremely frustrating. I was only able to get in every once in a while. Not cool. Also, the timeout is fairly low – I think I even got booted once while watching a tutorial video! <br />
<br />
The demo, naturally, is targeted at Business Analysts. That means it’s more functional and less technical. As a Data and Business Intelligence specialist, I had lots of questions initially. However, I got the gist of what was going on from the demo. Clearly this demo site wouldn’t help me install and configure a Pentaho server, but it would help me connect to data sources and build some reports. <br />
<br />
<img src="file:///C:/Users/kwoodell/AppData/Local/Temp/msohtmlclip1/01/clip_image002.jpg" />One thing to note here is that there are some terms used that aren’t defined well on the demo site. Terms such dimensions and measures. These are specialized jargon used in the database industries. However, I’ve had to explain the terms to every Business Analyst I’ve ever met. I don’t think they are common terms on the Business side of the house. Let me quickly explain these terms using analogies.<br />
<ol>
<li>A <b>dimension </b>(which has nothing to do with dimensions as in Star Trek or The Twilight Zone) is like the horizontal axis on a chart. A dimension is a “thing” so to speak, like time, people, houses and so on. </li>
<li>A <b>measure </b>is like the values that are inside the chart. These are typically numbers – anything that can be counted, really. A measure is a “value”, like $52, 30.525. If you’re just counting “things” (like dimensions, say, fruit) the value can just be 1 – which perhaps indicates “it exists.” Measures can be additive (meaning they can be summed across any dimension), semi-additive (they can be summed across only some dimensions), or non-additive (they cannot be summed at all). Please consult the information superhighway for more information. </li>
</ol>
The demo site itself is minimalist and well-designed. It has the now standard soft-gradient look. There is an intro video and several tutorial videos. Each one is about 5 minutes long and gives actual, functional information. In less than 30 minutes, any Business Analyst should be able to create an interactive report, an analyzer report, and a dashboard.<br />
<br />
For those of you who aren’t familiar with those terms, I’ll break them down.<br />
<ol>
<li>An <b>interactive </b>report is a fairly standard report (a grid, a matrix, a chart) that can be filtered using – as Pentaho calls them – prompts. Prompts in other systems are known as parameters or run-time options.</li>
<li>An <b>analyzer </b>report is much more interactive report that allows drill-downs. In fact, let’s say you click on a country: USA. If you have your data hierarchies set up correctly, the report will drill down to states. Pick Colorado and it will further drill down to cities. This type of report is dynamic and allows data to be "sliced and diced" as the saying goes.</li>
<li>A <b>dashboard </b>is a collection of charts and graphs. Each graph or chart is shown in a panel. The panels can be moved around, made larger, smaller, or reshaped. Colors and indicators are integral to dashboards. The whole point of a dashboard is to give basic, high-level information. A red color is often used to indicate to the viewer, “hey, pay attention to this!” </li>
</ol>
Somewhere during the middle of the demo, I got a call from a Pentaho rep. My one big question was this: how do I publish one of these reports or dashboards? Is there a plugin or is there some sort of widget that will allow me to integrate this into a webpage? What if I’ve created a marketing website and I want to show off my numbers. Can I throw a dashboard into that webpage easily? Or is Pentaho an data exploration tool meant for internal use only? <br />
<br />
He couldn’t answer the question, unfortunately. It sounds like publication is not a major part of Pentaho’s business model. If it’s not, then the product will unfortunately not be useful to me. I have plenty of tools for data exploration that do the exact same thing (Power View jumps to mind). What I need is a way to get these interactive reports and dashboards on a website! Anyway, he said he’ll get back to me with that information. <br />
<br />
<b>UPDATE</b>: <a href="http://khaidoan.wikidot.com/pentaho-bi-integration" target="_blank">Embedding Pentaho</a> in a webpagesounds way more complicated than I was hoping.<br />
<br />
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-4370155452003061722015-01-28T12:12:00.002-07:002015-01-28T14:49:37.019-07:00Open Source Business Intelligence: Pentaho<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibmx9-w33OYa0DFOfLJWZVOcnyEGpTP6MejZM05MY7MHFnvb5VCw1kCpZ6y7bGbbwuTpMQsMRF33LQ-wyZIpFP9wHD-i9wc0BiXbbrd_71mDJVQKpnyJ5undh6l9LUu6TZiuN7DJb_yvg0/s1600/ba-top-level_0.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibmx9-w33OYa0DFOfLJWZVOcnyEGpTP6MejZM05MY7MHFnvb5VCw1kCpZ6y7bGbbwuTpMQsMRF33LQ-wyZIpFP9wHD-i9wc0BiXbbrd_71mDJVQKpnyJ5undh6l9LUu6TZiuN7DJb_yvg0/s1600/ba-top-level_0.png" height="209" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Pentaho Demo? Looks sweet. </td></tr>
</tbody></table>
Business Intelligence has been around for a few years, but the underlying foundation of BI has been around forever. Since data has existed, businesses have wanted to summarize that data and extract valuable information from it like blood from a stone. When modern accounting was codified by Luca Pacioli in the 15th century, he demonstrated not only the double-entry book-keeping system but also ways to summarize that data. You could even <a href="http://en.wikipedia.org/wiki/History_of_accounting" target="_blank">go back to Mesopotamia to see early <b>reports</b></a>.<br />
<br />
Wait. BI is just reporting?<br />
<br />
Why, yes. That's true. In fact. However, BI tends toward the more modern, the more technical, the more visual, the more flashy. Also, BI might lean a little bit more towards the ... intelligent side of things. KPIs and such.<br />
<br />
What kind of offerings does Microsoft have out there?<br />
<br />
Excel and SharePoint has some really awesome BI tools thanks to Microsoft's recent initiatives. They want to compete with SAP, Tableau Software, and so forth. But guess what? Those tools are <i>massively</i> expensive and SharePoint is <i>ridiculously</i> hard to install and configure. (There's an <a href="https://www.microsoft.com/learning/en-us/mcse-sharepoint-certification.aspx" target="_blank">MCSE for SharePoint</a> if that gives you any clue).<br />
<br />
So what's the cure to all software <i>expensive</i>? Open source, of course.<br />
<br />
The one that looks most intriguing right now? I just found <a href="http://community.pentaho.com/" target="_blank">Pentaho Community</a>. It looks pretty slick. I'm about to try the <a href="http://www.pentaho.com/hosted-demo" target="_blank">Pentaho Hosted Demo</a>. Wish me luck! I'll report back -- if I make it back!<br />
<br />
UPDATE: Here's <a href="http://demo.pentaho.com/pentaho/Login" target="_blank">a direct link to the Pentaho demo</a>. Username and password are under the link that says "Login as an Evaluator."Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com91tag:blogger.com,1999:blog-8055418547332992788.post-90775402229296945522014-11-21T10:15:00.001-07:002014-11-21T10:38:40.689-07:00SSAS Warning: Define attribute relationships as 'Rigid' where appropriateWhen you're first deploying a dimension with hierarchies in SSDT-BI, you might notice this warning (or more appropriately, this directive): <b><a href="http://technet.microsoft.com/en-us/library/bb630312%28v=sql.105%29.aspx" target="_blank">Define attribute relationships as 'Rigid' where appropriate</a></b>.<br />
<br />
A quick interwebs search reveals the answer in dozens of different websites and you'll also probably run across the answer on Microsoft's website. The answer is -- and I quote:<br />
<blockquote class="tr_bq">
<span style="color: #2a2a2a; font-family: 'Segoe UI', 'Lucida Grande', Verdana, Arial, Helvetica, sans-serif; font-size: 12.8000001907349px; line-height: 18px;">This rule analyzes dimensions to determine whether they contain at least one user-defined hierarchy in which each member attribute uniquely implies the member attribute above it. If the dimension does have such a hierarchy, the rule checks to see whether all the attribute relationships in the dimension have the </span><span style="color: #2a2a2a; font-family: 'Segoe UI', 'Lucida Grande', Verdana, Arial, Helvetica, sans-serif; font-size: 12.8000001907349px; line-height: 18px;"><span class="input" style="font-weight: 700;">RelationshipType</span></span><span style="color: #2a2a2a; font-family: 'Segoe UI', 'Lucida Grande', Verdana, Arial, Helvetica, sans-serif; font-size: 12.8000001907349px; line-height: 18px;"> property set to </span><span style="color: #2a2a2a; font-family: 'Segoe UI', 'Lucida Grande', Verdana, Arial, Helvetica, sans-serif; font-size: 12.8000001907349px; line-height: 18px;"><span class="input" style="font-weight: 700;">Flexible</span></span><span style="color: #2a2a2a; font-family: 'Segoe UI', 'Lucida Grande', Verdana, Arial, Helvetica, sans-serif; font-size: 12.8000001907349px; line-height: 18px;">.</span></blockquote>
Thanks, Microsoft documentation! You've given me the answer. Now all I have to do is find the <b>RelationshipType</b> property and we're good to go.<br />
<br />
Now where is it? Is it here? Nope. There? Nope. Over yon? Nuh-uh.<br />
<br />
Well! Did you just spend 20 minutes of your life searching for the answer and you still haven't found it? Well, it's not completely obvious. Remember, everything in every relationship diagram has properties, even lines and arrows!<br />
<br />
<ol>
<li>In your Multidimensional Analysis Services project in SSDT-BI (or SSDT or BIDS or whatever Microsoft has renamed it this week), open your dimension for editing.</li>
<li>Click on <b>Attribute Relationships</b>. </li>
<li>Click on the line-arrow between any two attributes. </li>
<li>Take a look at the <b>Properties</b> window and you'll see <b>RelationshipType</b>. That's where you can change it to <b>Rigid</b>. </li>
</ol>
<br />
Of course, you could read that statement all day and it still might take another twenty minutes to figure out what I'm talking about. In that case -- like me -- you're a visual learner. So in that case, there are two screenshots below to help you find it.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWerd12tjtj9ENKdYvWRk3xK3iIKMbCQfvu-CVaZ1MkWnDVQr5c8BSfmJLocJAo_j1D7PnTxnDmgJlWiTYofb3X74cIhV5GcHyZWnbO0Arn3mb8BxaXVwt0vRPYY2ZmooFw1Xn13FPX1z-/s1600/SSDT-BI+RelationshipType+01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWerd12tjtj9ENKdYvWRk3xK3iIKMbCQfvu-CVaZ1MkWnDVQr5c8BSfmJLocJAo_j1D7PnTxnDmgJlWiTYofb3X74cIhV5GcHyZWnbO0Arn3mb8BxaXVwt0vRPYY2ZmooFw1Xn13FPX1z-/s1600/SSDT-BI+RelationshipType+01.png" height="82" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Click on the line-arrow!</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmkjTpd7d9rjlbvrb18bYwMS0ANvi_olGIYkb6vIE4aIUOi-PrZ67nyhAWklkb7F1Fr1lApw5pqRM5bp4aA7dGkoi3_2aKTzepk1GWGC2twu2b94RsL-dQBEguW1SmBLQPyfSJY34R1x6y/s1600/SSDT-BI+RelationshipType+02.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmkjTpd7d9rjlbvrb18bYwMS0ANvi_olGIYkb6vIE4aIUOi-PrZ67nyhAWklkb7F1Fr1lApw5pqRM5bp4aA7dGkoi3_2aKTzepk1GWGC2twu2b94RsL-dQBEguW1SmBLQPyfSJY34R1x6y/s1600/SSDT-BI+RelationshipType+02.png" height="210" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Check the properties window. Hint: press the F4 key.</td></tr>
</tbody></table>
You're welcome, internet.<br />
<br />
<b>Update (same day):</b> I just came up with two new helpful tips. Firstly, You can double-click the warning message in the error list and it will take you to the correct location in the project. Secondly, you can also double click on the line-arrow to bring up a dialog bog that will allow you to change the relationship type. See screenshots below.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq6G1mXljsV6lHv_SDiV7bs30okJYr90RFm_IIRSXyzJeS5tl47UNILu48FY9lL_BI_TeGhc3S6T-kVKKX_S9ccjM769fCAXTwNIzdAVK1GuJLAaUzd0hX9qEDLctoDa2pZxnLbKKANhkp/s1600/SSDT-BI+RelationshipType+03.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq6G1mXljsV6lHv_SDiV7bs30okJYr90RFm_IIRSXyzJeS5tl47UNILu48FY9lL_BI_TeGhc3S6T-kVKKX_S9ccjM769fCAXTwNIzdAVK1GuJLAaUzd0hX9qEDLctoDa2pZxnLbKKANhkp/s1600/SSDT-BI+RelationshipType+03.png" height="240" width="400" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9QZaeHtrOLLxHA_yPbNFJkc18vxRh1PttNEC_ddtvdjyW-Z6nHAwOROSFiycjqDWOrgop_OnP2F0S7RVo7nDpZs6EKVONpcHn8i4AYfwHnB6OIHElBozceMdve2LURFbZv4YG_nEmyoP1/s1600/SSDT-BI+RelationshipType+04.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9QZaeHtrOLLxHA_yPbNFJkc18vxRh1PttNEC_ddtvdjyW-Z6nHAwOROSFiycjqDWOrgop_OnP2F0S7RVo7nDpZs6EKVONpcHn8i4AYfwHnB6OIHElBozceMdve2LURFbZv4YG_nEmyoP1/s1600/SSDT-BI+RelationshipType+04.png" height="82" width="400" /></a></div>
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com6tag:blogger.com,1999:blog-8055418547332992788.post-80086930010147961202014-10-14T09:42:00.001-06:002014-10-14T09:42:18.879-06:00Learning the Microsoft BI Stack for Free through VideosMicrosoft's Business Intelligence offerings are varied and sometimes difficult to connect together. The marketing department sometimes doesn't know what the development departments are doing, and the the development departments sometimes don't interoperate as well as they could.<br />
<br />
Despite a bit of disorganization, the Microsoft BI stack -- last I heard it was called BISM -- the Business Intelligence Semantic Model -- is powerful and useful. Learning it can be a long process. Essentially, Microsoft's BI offerings are all categorized under their database offerings, so when you go into the field of database administration, you will also learn reporting, cubes, data cleansing, master data, integration, visualization and more!<br />
<br />
Microsoft offers a series of certifications that cover the entire gamut of database and data analysis tools. I talked about them briefly here: <a href="http://databases-and-dead-languages.blogspot.com/2014/05/road-to-certification-70-461-defeated.html" target="_blank">Road to Certification: 70-461 Defeated and an Analogy between Microsoft Certifications and the US Educational System</a>. One way to approach these certifications is to simply buy the training kits that Microsoft offers. This is not a bad approach in all honesty for people who learn by reading -- but some people simply learn better in visual environments.<br />
<br />
Luckily for us, there are a whole bunch of videos out there to help us learn the basics of these database and data analysis tools. Here is a quick breakdown of my favorite sites:<br />
<br />
<ol>
<li><a href="http://www.microsoftvirtualacademy.com/" target="_blank">Microsoft Virtual Academy</a>. First and foremost, Microsoft has produced hundreds if not thousands of categorized videos and they are all available for free. No matter which Microsoft product you'd like to learn or which certification you're studying for, there's some sort of offering. <a href="http://microsoft.com/learning" target="_blank">Find out which certification you're interested in</a>, buy the training kit, and start watching videos! They even have quizzes to test your knowledge.<br /></li>
<li><a href="http://msbiacademy.com/" target="_blank">MSBI Academy</a>. This is another great site -- one of my favorites. It goes into depth on a lot of really interesting topics that might be for the more advanced users -- cubes, PowerPivot, etc.<br /></li>
<li><a href="http://www.learnmicrosoftbi.com/" target="_blank">Learn Microsoft BI</a>. This site is not as well organized or as well updated, but it does have some good videos for some slightly older products.<br /></li>
<li><a href="http://www.lynda.com/Excel-tutorials/Microsoft-Business-Intelligence-Stack-Fundamentals/169623-2.html?utm_source=google&utm_medium=cpc&utm_campaign=Search-Dev-Microsoft%2BBusiness%2BIntelligence-XCT&utm_content=51598807146&utm_term=learn%20microsoft%20bi&gclid=CjwKEAjwk_OhBRD06abu3qSoxlwSJACt7sZ7vvgu_-XhwPuhXKvJ45CMqeWIkR1aPZ88j_mMw3Xl3BoCnbLw_wcB" target="_blank">Lynda</a>. You can learn quite a bit about the Microsoft BI stack from this general purpose learning site.<br /></li>
<li><a href="http://youtube.com/" target="_blank">YouTube</a>. YouTube is not exactly set up for Microsoft BI -- you know it's a big, disorganized mess when it comes to finding structured information. But if you know what you're looking for, just type it into the search box and start roaming around until you find something you like!</li>
</ol>
<br />
<div class="MsoNormal">
<o:p>If you know of other good video tutorial sites cover the Microsoft BI stack, please drop me a line! </o:p></div>
<br />
<br /><div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-82359514416049731642014-10-06T13:32:00.001-06:002014-10-06T13:32:55.673-06:00What Do I Really Need to Install Power View in SharePoint?<h2>
Introduction</h2>
The landscape for Microsoft Business Intelligence products in the enterprise is frankly cluttered. There are several great tools that run primarily in Excel – after all, 90% of the world’s data analysis currently happens in dimly lit cubicles using Excel. But Microsoft has been trying to get into the big leagues by bringing their BI offerings to SharePoint. SharePoint is, in effect, Microsoft’s version of a “wiki” platform. It’s their non-developer-centric web-based collaboration and publishing tool. It makes sense that when data analysts create complex Excel reports that they should be able to share them using a technology more modern than the F: drive – and that’s why SharePoint was built.<br />
<div>
<br /></div>
<div>
Nonetheless, trying to figure out what’s what is a bit mind boggling. This is fairly evident if you’ve ever tried to install or administer these various different products. Microsoft’s documentation in this arena is a scattershot collection of chaos. This is evident in the fact that there are 10 different documents explaining how to install PowerPivot for SharePoint and they all have different nuances and characteristics, not a single one of which “gets it right” – or in more concise terms, not a single one is authoritative. <br />
<br />
Power View for Excel in itself is not simple to install. It doesn’t come pre-activated from Microsoft. It requires a person dig deep inside the bowels of Excel’s labyrinthine options to activate a COM add in. Then, in most cases, the user needs to install Silverlight since it isn’t part of the operating system. Can you imagine the system administrator’s frustration when everyone in the organization suddenly wants Excel to do more? Beyond that, Power View requires you to essentially understand PowerPivot. PowerPivot is a data modeling tool – a very, very powerful one. Both of these pieces of software are easy to use, but getting maximum value out of them is – like any other software – a learning process.</div>
<div>
<br />
But that’s a minor inconvenience compared to the system administration nightmare that is SharePoint. SharePoint is dauntingly complex all by itself and then trying to add a bunch of services that are supposed to interact with other services and servers and software. Add to that complexity the Microsoft licensing model – you should be aware at this point right now that your deployment isn’t going to be cheap. SharePoint Enterprise edition is not cheap, nor is SQL Server Enterprise edition (or Business Intelligence edition). Let’s be honest – if you were planning on running Microsoft’s BI platform for less than $25,000 in licensing fees, you’re going to be sticker shocked to say the least. <br />
<br />
In any case, I’m going to try to break this down into the simplest terms and be thorough at the same time. This post is about all the actual requirements to get Power View for SharePoint working. I will also try to explain the software linkages and basic terminology. <b>If you don't want basic explanations and want to get to the nuts and bolts of this post, jump down a few sections. </b><br />
<h2>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRJbh98foN9x0ZR8xIgbiBJeigMXZNKRx0dD00kVWwfB36lZ96zKGevMYPZVcwhXkZJcjH157WRZQhC58BOl8USWtVBDyFzYV1pWu6dcvaBojttcNA8gxRGusis6XDi6VGrxa6xI1RUd1Z/s1600/fraglevels.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRJbh98foN9x0ZR8xIgbiBJeigMXZNKRx0dD00kVWwfB36lZ96zKGevMYPZVcwhXkZJcjH157WRZQhC58BOl8USWtVBDyFzYV1pWu6dcvaBojttcNA8gxRGusis6XDi6VGrxa6xI1RUd1Z/s1600/fraglevels.png" height="146" width="200" /></a>What is Power View?</h2>
Power View is an interactive data exploration tool. It allows you to explore data and relationships in order to find insight and trends. It's similar to other software like Tableau. It's really neat because you can filter data in one chart by choosing a data point in another chart. It allows you to see relationships visually. If you've never tried it, give it a whirl. You'll need Microsoft Excel 2013. </div>
<div>
<br /></div>
<div>
Power View comes in two flavors. It exists in Microsoft Excel and also in SharePoint. In SharePoint, people can look at your Power View worksheets and interact with them and they can also use your data model to create their own Power View worksheets. </div>
<h2>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBtye5pMoWh3WtbpjEBaOBg_EWsK9z0Nh09Qbf0qln96bImEJOEl44y_T-ayEW4LRsz51mhJCygh_GOxQAp_uURslt5CojUeRK4ix6reWyvCoj99Tq_VKdesVcNZrtZpPIICWYbROp82qJ/s1600/powerpivot.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBtye5pMoWh3WtbpjEBaOBg_EWsK9z0Nh09Qbf0qln96bImEJOEl44y_T-ayEW4LRsz51mhJCygh_GOxQAp_uURslt5CojUeRK4ix6reWyvCoj99Tq_VKdesVcNZrtZpPIICWYbROp82qJ/s1600/powerpivot.png" height="198" width="200" /></a>What is PowerPivot?</h2>
<div>
PowerPivot is the data modeling engine that sits <i>behind</i> Power View. If Power View is the car, then PowerPivot is what's sitting under the hood. This metaphor isn't great, because you still need to know how to be a mechanic to make this motor run. Granted, it's still 1910 and it's just a couple of spark plugs and a distributor (i.e., simple), but you will have to get under the hood to make this motor run. PowerPivot is not a database although it does store a lot of data in a tabular format. Tabular format data stores are beyond the scope of this post, but if you'd like more information, check out <a href="http://blogs.msdn.com/b/analysisservices/archive/2012/03/09/xvelocity-and-analysis-services.aspx" target="_blank">xVelocity and Analysis Services</a>. </div>
<div>
<br /></div>
<div>
PowerPivot comes in two flavors just like Power View. The Micrsoft Excel version and the SharePoint version. The SharePoint version actually doesn't stand alone -- its actually a "frontman" for Microsoft Analysis Services running in Tabular mode (a component of Microsoft SQL Server). </div>
<h2>
What is SharePoint Server?</h2>
<div>
SharePoint Server is a beast. It's a big piece of server software from Microsoft. Its purpose is to unify the universe. No, seriously! Okay, it's a collaboration engine. It's also gigantic and amorphous. You can do anything with it. The basic premise is this: anything that can be shared or collaborated on or published can be part of SharePoint. Think of it as a gigantic wiki page that allows people to share documents, collaborate on projects, build web pages, publish dashboards, and so on. The problem with describing SharePoint is that it's so big and complex and undefined. It's like a big ball of play dough -- you can quite actually mold it into any shape you want. Want a web page? Okay. Want a project management site? Okay. Want a blog? Okay. Remember: sharing, collaborating, publishing. That's key.<br />
<br />
There are four different services in SharePoint that you'll also need. The jumble of services gets tangled so I'm going to try to detangle them below.<br />
<h3>
Reporting Services</h3>
</div>
<div>
Technically, Reporting Services is part of SQL Server. However, it can be installed in SharePoint mode and it's one of the many pieces of the puzzle necessary to get Power View working.</div>
<h3>
PowerPivot Services</h3>
<div>
PowerPivot, as described above, is the tabular data engine inside Excel that stores the data that Power View uses as its data source. In SharePoint, this service must be installed, configured, and enabled to essentially connect to SQL Server Analysis Services running in PowerPivot mode (basically, a special form of Tabular mode). </div>
<h3>
Excel Services</h3>
<div>
Excel Services at their most basic level allow SharePoint users to open and use Excel spreadsheets. On a deeper level, these services are necessary to open PowerPivot models and Power View sheets that have been created in Excel on the desktop. </div>
<h3>
PerformancePoint Services</h3>
<div>
PerformancePoint Services are simply a dashboarding suite. They allow you to connect to various data sources (primarily SSAS cubes) and create meaningful and easy-to-use dashboards. Dashboards are typically one-page collections of informative charts, graphs, or KPIs that give a user a quick overview of what's happening with a system or a process. </div>
<h2>
What is SQL Server?</h2>
<div>
SQL Server is Microsoft's relational database product. The core database engine is fairly easy to understand. It's basically just a very advanced SQL engine. However, there are many, many components tacked on to SQL Server that are necessary to perform business intelligence using Microsoft products. For instance, two key components that must be installed are Reporting Services and Analysis Services. In addition, these services cannot simply be installed in their native modes -- they must be installed in SharePoint mode. </div>
<h2>
The Nuts and Bolts</h2>
<div>
That's a clever title, I reckon. (Insert smiley face)<br />
<br />
That's a lot of software and moving parts just to get one data exploration tool up and running. Unfortunately, there's no way around it if you want to share your Power View sheets within your organization. </div>
<div>
<br /></div>
<div>
You will need this software:</div>
<div>
<ol>
<li><b>Windows</b>. You will most likely need a server edition but possibly not -- I've always used the server editions for SharePoint, but check the system requirements to see if it'll run on desktop editions. </li>
<li><b>Internet Information Services</b>. This web server software is packaged with just about any modern version of Windows you can think of. But no matter which version of Windows you settle on, SharePoint needs it. </li>
<li><b>SharePoint Enterprise Edition</b>. Yes, Enterprise. You cannot run the BI tools on Standard. </li>
<li><b>SQL Server Enterprise Edition or Business Intelligence Edition</b>. I've never tried using Developer edition, but according to Microsoft, it is exactly the same as Enterprise Edition -- so in theory it should work. </li>
</ol>
<div>
Installing #1 - #3 above is outside the scope of this article. But let's say you get that far and you're going to install SQL Server. There are several things you need to keep in mind:</div>
</div>
<div>
<ol>
<li>When you install SQL Server (usually on a machine separate from SharePoint), make sure you do not install any Reporting Services of any kind. </li>
<li>Feel free to install Multi-Dimensional Analysis Services on your default instance. </li>
<li>After all that's done, you will go back to your SharePoint server and install Reporting Services for SharePoint and Reporting Services for SharePoint plugin. </li>
<li>Finally, you will install PowerPivot for SharePoint. This is one of the options when installing SQL Server. If you don't see the option, you might not be installing from Enterprise or Business Intelligence Edition disks. It will automatically install a new instance of Analysis Services called [ServerName]\POWERPIVOT. This is a special version of the Tabular model of Analysis Services. It's fairly well locked down to only service SharePoint so you won't be able to use this to do backups and restores of other, non-PowerPivot related databases. </li>
<ul>
<li>What's really important here is that your Excel Services configuration must point to the POWERPIVOT instance of analysis services. If it does not, the chances of getting it to work are very very low.</li>
</ul>
</ol>
<div>
Getting Power View to work in SharePoint is not easy or simple or straight foward. There are a number of guides on the web and you might have to go through more than one of them. I can pretty much guarantee this: there are a number of <i>important details</i> that Microsoft doesn't emphasize well. It's these small details that make the difference between a successful configuration and one that's not. </div>
<div>
<br /></div>
<div>
Good luck!</div>
<div>
<br /></div>
</div>
<div>
<br /></div>
<div>
<br />
<br />
<br /></div>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-66478749181195065932014-06-06T12:08:00.001-06:002014-06-06T13:44:40.062-06:00SQL Server Monitoring Tools (Dashboards)Now see here! Dashboards are awesome. I'm a person easily stimulated by visual input and a good dashboard is, in my opinion, the quickest way to get information into my brain.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhK7oUwv4mqYoCBLlNWVKZd1bLbsW2wuONy0l2sbUd5GJA5lt3Gjgc3lq0f7pH68Nbcfvo9XU3RRruehngVhrzWTCAhT-e5sLTeq-e9cY0yrv7_awiLMiIrov5B1hmZcxfBhFhDCzsdfScu/s1600/104Reload05_-_Zion_Control.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhK7oUwv4mqYoCBLlNWVKZd1bLbsW2wuONy0l2sbUd5GJA5lt3Gjgc3lq0f7pH68Nbcfvo9XU3RRruehngVhrzWTCAhT-e5sLTeq-e9cY0yrv7_awiLMiIrov5B1hmZcxfBhFhDCzsdfScu/s1600/104Reload05_-_Zion_Control.jpg" height="180" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Zion Control from Matrix Reloaded</td></tr>
</tbody></table>
What I really really want is an interface like Zion Control from the Matrix movies. I know, I know. We've wanted this kind of stuff for years -- all of us.<br />
<br />
Honestly, though, we're getting pretty deep into the 21st century, aren't we?<br />
<br />
So why do SQL Server monitoring interfaces suck so badly? Why do dashboards in general suck so badly? We are data scientists and the data visualization experts. All I want is an easily installable, easily customizable interface that I can hook up to a data source and instantly have metrics and graphs and charts. It's not that much of a request.<br />
<br />
Well, here's the reason, I think. Companies that do this kind of stuff make big big dollars charging for it. Companies like QlikView and what not. Microsoft is trying to get into that space with Power Query and Power Map and Power View, and those are <i>amazing</i> tools that I use on a daily basis. Unfortunately, we have to tie them to SharePoint and the administration and configuration of that is just a nightmare. It'll be at least another year before we have those bad boys installed here.<br />
<br />
So in the meantime, I'm evaluating SQL Server Monitoring software. Ugh. It's bad. It's ugly. It's hard to customize. Yes, I know, I'm the DBA and I need to know all this stuff. Honestly, though. It all needs work. Why can't we have something clean and simple and easy like <a href="http://dashingdemo.herokuapp.com/sample" target="_blank">Dashing</a>?<br />
<br />
Anyway, in my opinion, the best <i>visual</i> monitors are Spotlight on SQL Server and Idera's Diagnostic Manager. I'm still running through the evaluations. I probably won't post a full write up since there have been so many written before. Here's <a href="http://sqlbadboy.wordpress.com/2013/05/03/sql-server-monitoring-tools/" target="_blank">a thorough write up of various SQL Server Monitoring Tools</a>. Enjoy!<br />
<br />
<b>Update 2014.06.06: </b>Here is <a href="http://www.mssqltips.com/sqlservertip/1067/sql-server-performance-monitoring-tools/" target="_blank">a comprehensive list of SQL Server Monitoring Tools</a>. It doesn't dive into the details of each product.<br />
<br />
<b>Update 2014.06.06: </b>Here is <a href="http://www.serverwatch.com/trends/article.php/3937411/5-SQL-Server-Performance-Tools.htm" target="_blank">another review of some SQL Server Monitoring Products</a>. The article itself is not very interesting, rather the <i>comments</i> are what I found fascinating.<br />
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-80574790717263954902014-05-30T09:19:00.002-06:002014-05-30T09:25:16.339-06:00SQL Server: Huge Log Files and Recovery Pending StatusIt's been an interesting morning so far.<br />
<br />
We all got an email from some phantom IT department member last night (who knows who it was because it was a system account) letting us know that there was going to be some reconfiguration going on at the data center (which is hosted). It basically said, "There's nothing to see here. Move along."<br />
<br />
So we moved along.<br />
<br />
Last night around 8:33, I got a notification from my main production SQL Server (2008 R2 Enterprise) that the server had been restarted and moved to a different cluster node. I logged in remotely, checked it out, all looked good. I figured it was a failover or a failover test. (Incidentally, for some strange reason, our IT department sees fit not to allow the database administrator -- me -- any access to the clusters). Then at 10:45PM I got another reboot message -- same thing, but this time it was flipping back to the other node. Interesting, but I was able to get to all the data and nothing seemed amiss in the logs.<br />
<br />
This morning, there is basically nobody from IT in the building except for myself and one single system admin. This SA has been working with a contractor on a System Center database and noticed that he couldn't get any data from System Center. He logged into SSMS and noticed that the databases were all in a state that said "Recovery Pending" and he couldn't get any information about the databases whatsoever -- no properties, no file information, nothing. So he restarted the server. Still, "Recovery Pending."<br />
<br />
He came to me and asked me to help troubleshoot. Right away I knew that there were only a couple of paths that could cause this error message. Basically the log files for these database was missing or corrupt. However, the system databases were all online.<br />
<br />
Step 1. Check for properties on the database. Not available. Curious.<br />
Step 2. Check the logs. Opened up an archived log. Or tried. SSMS started choking. Okay, the log must be too big?<br />
Step 3. Check the physical log on the disk. It was 1.2GB. Yes! You read that correctly! It was 1.2GB! No wonder SSMS was choking.<br />
Step 4. Check the data files and the log files.<br />
<br />
CRASH! <b>My PC</b> (formerly my computer) was showing a single drive -- the system drive. There used to be 3 LUNS there mapped to three different drive letters (E:, F:, G:).<br />
<br />
Step 5. Open up the disk management tool. The LUNs were still there, they just offline. We flipped the switch and they all came online.<br />
Step 6. Restart SQL Server.<br />
Step 7. Verify in SSMS that the databases were online. They were.<br />
<br />
I am not a SAN administrator not am I a dedicated Windows Server Admin, but I have not to this date seen a SAN LUN completely go offline. I'll leave that work to the the Sysadmins ...<br />
<br />
What's the lesson here? If your databases are in "Recovery Pending" mode, one of the first things to check is whether the files actually exist -- in fact, if the drives they are on actually exist.<br />
<br />
<b>UPDATE:</b> I think the log was so large because the LUN volumes went offline while SQL Server was still running. I haven't had a chance to look at the log file yet, but that's my guess.<br />
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-2372615377660523862014-05-15T09:23:00.001-06:002014-05-15T09:27:31.727-06:00TSQL: Check to see if a partition function already exists OR check to see if a partition scheme already existsWhen working with TSQL scripts and creating objects, it's typical to drop the object before creating it using a conditional check on it's object identifier. For example:<br />
<br />
<pre class="brush:sql">IF( OBJECT_ID( 'myTable', 'U' ) IS NOT NULL )
BEGIN
DROP TABLE myTable;
END
</pre>
<br />
The biggest reason for this is to avoid an error. What if the table didn't exist? If you tried to drop it without the conditional expression, you'd get an error and your script would likely halt.<br />
<br />
But what if you're trying to drop/create an entity that isn't necessarily an object (i.e., it doesn't have an object identifier and therefore the OBJECT_ID function won't work)? You will definitely need to change your condition up a bit by querying a <a href="http://technet.microsoft.com/en-us/library/ms174365.aspx" target="_blank">catalog view</a>.<br />
<br />
For partition functions and partitions schemes, you will query <b>sys.partition_functions</b> and <b>sys.partition_schemes</b>.<br />
<br />
<pre class="brush:sql">-- create a paritition function to test this out
CREATE PARTITION FUNCTION PfTest( tinyint )
AS RANGE LEFT FOR VALUES( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 );
-- visually inspect the partition functions table to see that it's there
SELECT
*
FROM
sys.partition_functions
-- drop the partition function if it exists
IF( EXISTS( SELECT * FROM sys.partition_functions WHERE name = 'PfTest' ) )
BEGIN
DROP PARTITION FUNCTION PfTest;
END
-- visually inspect the table to see that it's gone
SELECT
*
FROM
sys.partition_functions
</pre>
<br />
The EXISTS function basically runs the query to see if there are 0 rows (hence, nothing exists) or 1+ rows (hence, something exists). If the sub-query returns a single row that matches the name of <b>PfTest</b>, then the condition is true and the partition function will be dropped. If it doesn't exist, then it will not be dropped.<br />
<br />
Another option -- if you prefer cleaner and more readable TSQL -- is to create some custom functions, something like PARTITION_FUNCTION_ID or PARTITION_SCHEME_ID.Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-11300832164439426802014-05-12T11:33:00.002-06:002014-05-12T11:35:30.671-06:00Road to Certification: 70-461 Defeated and an Analogy between Microsoft Certifications and the US Educational SystemIf I recall my previous post correctly, I was perfectly accepting of failure of the 70-461 exam. As I reported, last Saturday I sat through the exam knowing that failure was an option because I had the ever-helpful <a href="http://www.microsoftvirtualacademy.com/offers/20-percent-off-exam-and-free-second-shot" target="_blank">Second Shot</a> voucher in play. It turns out I didn't need it. I passed the exam on the first try! That's the second pass in a row and I couldn't be happier about it. Two more exams and I'll be an MCSA (Microsoft Certified Solutions Associate).<br />
<br />
On a related but slightly tangential note, I have been having some thoughts on the <i>cursus honorum </i>(Latin for <b>course of offices</b>, or in more common terms, the political ladder) through the certification process. Analogies are helpful and I think a comparison to our US education system will be appropriate to understand how the process works. Please note that this information is relevant for SQL Server 2012.<br />
<h4>
Level 1 - High School Diploma</h4>
<blockquote class="tr_bq">
<i>MCT - Microsoft Certified Technician</i>. This is the most basic level that Microsoft offers. You have to know quite a bit about databases to pass this test but if you've been working in the database world for a few years, it shouldn't be that hard. This won't really help you get a job just like a high school diploma won't, but it demonstrates effort and basic competency. Incidentally, elementary and middle school would be the equivalent of learning how to use operating systems and productivity applications. For an MCT in databases you have to pass one test. </blockquote>
<h4>
Level 2 - Bachelor's Degree</h4>
<blockquote class="tr_bq">
<i>MCSA - Microsoft Certified Solutions Associate. </i>After the MCT is the MCSA. Not even a seasoned database administrator would be able to pass these without specific training in the Microsoft SQL Server stack. Sure, knowing advanced queries would help a lot but without actually picking up some training material, you'd be very lucky to guess what <b>SET TRANSACTION ISOLATION LEVEL SNAPSHOT</b> actually means. This level includes a lot of theoretical material that will get you going on the basics of a lot of different aspects of querying, data warehousing, and administration. This will definitely help you get a job if you back it up with a few years of experience -- but unless you have lots and lots of experience, don't expect to be coming into the Senior level. For an MCSA in databases you need to pass three tests. </blockquote>
<h4>
Level 3 - Master's Degree</h4>
<blockquote class="tr_bq">
<i>MCSE - Microsoft Certified Solutions Expert. </i>This is where it gets interesting because, for the database track at least, the program actually splits off into two separate tracks -- Data Platform and Business Intelligence. Each one requires two tests. In my opinion, this coupled with a good amount of experience will definitely qualify you for a job at the Senior level in terms of database administrators or business intelligence analysts. Although many less experienced IT managers might not see the value in certifications, the ones who have been around for a long time will no doubt understand that certifications provide the theoretical underpinnings of a highly experienced professional. Sure, you may have been working on SQL Server technologies for 20 years by now, but if you don't know what you <i>don't know</i> then you're truly missing out on the bigger picture -- and by extension, so is the company you're working for. </blockquote>
<h4>
Level 4 - Doctorate</h4>
<blockquote class="tr_bq">
<i>MCM - Microsoft Certified Master. </i>Finally there's the grand-daddy of all database certifications. I've only read about this one and it sounds fairly hellacious. There was, of course, an exam or two to pass for this level but the most interesting part of it was that you had the option of spending several weeks on Microsoft's campus and learning from top-tier Microsoft product specialists and perhaps even product developers. Unfortunately, this level is gone. Microsoft Learning was not getting enough interest in this -- apparently it was a $20,000 investment and so they retired it late in 2013. Sadly, they have yet to announce a replacement. However, I'm sure that a new MCM program is in the works -- and hopefully a bit cheaper. If you ever achieve this level, you definitely deserve to be at the senior or director level in your local IT department simply because you somehow managed to talk someone in your company into shelling out the big bucks for the training. </blockquote>
<h4>
Level X - Honorary Degree </h4>
<blockquote class="tr_bq">
<i>MVP - Most Valuable Player. </i>Were you the popular kid in high school? The cheerleader or the quarterback or the class president? If so, you know how these kinds of popularity contests work. It's a lot of hand shaking and baby kissing. Basically, hook or crook your way to the top and become everybody's friend and they will vote for you. I'm joking, of course, but remember that every joke has a kernel of truth. If you're at this level, you are probably working the lecture and training circuit already and you are -- in some sense -- your own boss. </blockquote>
So -- you're probably thinking to yourself -- "Oh, this guy is barely in Level 2. What does he know?" Well, I know I have almost exactly 15 years of experience with the SQL Server product line dating back to the days of SQL Server 6.5. It's been a long road, of course, to get where I am. Now that I have a massive amount of experience in all three areas -- production, development, and business intelligence -- I want to fill in the gaps and get those shiny pieces of paper to prove that I know what I know. Plus, I already have an <i>actual</i> master's degree and I was a teacher for 4 years. So, long story short, I've been through the wringer a couple of times. Despite the fact that I'm at Level 2 in terms of certifications, I'm sitting somewhere within Level 3 in terms of actual job competency. And to be honest, I wouldn't mind being Level 4 or even Level X someday.<br />
<blockquote class="tr_bq">
<br /></blockquote>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-36464527463844219942014-05-10T08:56:00.002-06:002014-05-10T08:56:56.830-06:00Road to Certification: Pre-exam Jitters for 70-461About a month ago I took the MCT exam for database fundamentals (98-364). I signed up a day in advance and blew through the test in less than 30 minutes and got a high passing score. For me (as for any database administrator), the exam was simple. For anyone who doesn't have a lot of experience with databases and querying, the exam would have been difficult to pass. But that's the point of these certification exams, right? To basically make sure you're at the level you say you are.<br />
<br />
Today, in about 2 hours in fact, I'm going to take the first test in the MCSA exam series, the 70-461. I'm a bit nervous, I admit it. However, I'm not that nervous for two reasons. First, I managed to score a retake voucher -- if I fail, I get to take the exam again before May 31st. Therefore, I fully expect to fail and it's okay because I get a redo. Second, I studied -- a lot. I read the 70-461 prep book and took the practice exams. I also watched all the videos and took the practice exams on Microsoft Virtual Academy. Finally, <a href="http://borntolearn.mslearn.net/certification/database/f/516/t/321649.aspx" target="_blank">I read through this synopsis of an actual exam taker's experiences with the exam</a>.<br />
<br />
What scares me a little bit is that I've found a couple of exam questions and exam review topics that point to information that I have never even heard of before today. But, as I said earlier, I'm not that worried. I get a redo. After the last exam, I got a scoring sheet that told me what areas I need to work on. So, if I fail the exam the first time (and I hear it's a fairly common occurrence), then I'll bone up on my weak areas and take it again in two weeks.<br />
<br />
One other thing that kinda bothers me -- it's supposedly a 3 hour exam! Yikes. I'm glad I didn't schedule my post-exam, stress-relieving massage until 4 hours afterwards!Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-12008566682312670662014-04-10T10:37:00.001-06:002014-04-10T10:37:24.651-06:00Microsoft SSRS Error: The report parameter is read-only and cannot be modifiedSo one of my scheduled SSRS (SQL Server Reporting Services) reports that was called with GET query parameters in the URL started failing today. It started giving this message:<br />
<blockquote class="tr_bq">
Unknown error generating/adding reports</blockquote>
I checked the SSRS logs (in my case in <b><a href="http://technet.microsoft.com/en-us/library/cc512029.aspx" target="_blank">C:\Program Files\Microsoft SQL Server\MSRS10_50.MSSQLSERVER\Reporting Services\LogFiles</a></b>) and found a message I have never seen before and was not anticipating:<br />
<blockquote class="tr_bq">
The report parameter 'PARAM' is read-only and cannot be modified. ---> Microsoft.ReportingServices.Diagnostics.Utilities.ReadOnlyReportParameterException: The report parameter 'PARAM' is read-only and cannot be modified.</blockquote>
Note that PARAM was substituted for the actual parameter name to protect the innocent. Bizarre, right? This happened on a column that was previously hidden from the user in the SSRS Report Manager. It would run just fine in SSDT (SQL Server Data Tools).<br />
<br />
One thing I had noticed is that when I logged in, someone had applied updates to the SSRS server (or perhaps it was an automatic update). Aside from that communication slip-up between admins, I wonder if it's related? <a href="http://support.microsoft.com/kb/2410593" target="_blank">Microsoft released a bulletin a few months ago regarding this</a>.<br />
<br />
Here is the basic solution:<br />
<br />
<ol>
<li>Go into the SSRS Report Manager web interface.</li>
<li>Find the report.</li>
<li>Right click on the report and click <b>Manage</b>.</li>
<li>Go to the <b>Parameters </b>tab.</li>
<li>Uncheck the <b>Hide</b> and <b>Prompt User</b> checkboxes on the affected parameters.</li>
<li><b>Apply </b>the changes.</li>
<li>Recheck the <b>Hide</b> parameter.</li>
<li><b>Apply </b>the changes.</li>
<li>Test your report using it's parameterized URL to see if the problem is solved. </li>
</ol>
<br />
Simple!<br />
<br />
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-22484800029581169012014-03-19T12:04:00.000-06:002014-03-26T13:26:01.570-06:00TFS Plugin for SSMS Stopped Working TodayToday, the <a href="http://visualstudiogallery.msdn.microsoft.com/b5b5053e-af34-4fa3-9098-aaa3f3f007cd" target="_blank">TFS plugin that I use (MSSCCI)</a> that I use for version control stopped working. When I try to load a solution, SSMS simply ignores my request. No error message, no event logs, nothing. I can connect to the TFS server but when I open the solution from version control, again, SSMS ignores my request. I can see my solutions in VS2012 and in the Team Explorer window/Source Control Explorer.<br />
<br />
What happened!? Argh. I think the IT department pushed a bunch of updates. Unfortunately, I don't have any control over that. I don't know which one it might have been.<br />
<br />
So I uninstalled SQL Server 2012 and the MSSCCI Provider, reinstalled the client tools with updates and service packs and all that, reinstalled the plugin, and no go!<br />
<br />
Now I'm installing 2008R2 and 2014 CTP2 client tools to see if the older and newer versions of SSMS are also horked.<br />
<br />
<b>UPDATE: </b>I got desperate and I didn't want to reinstall the OS, so I uninstalled all SQL Server and Visual Studio products, uninstalled all the SSMS plugins, and reinstalled everything. Drastic, I know, but it fixed the problem.<br />
<br />
<b>UPDATE2</b>: Have you ever had a "duh" moment? Yeah, me too.<br />
<br />
Turn on the Solution Explorer with <b>CTRL+ALT+L</b>. See below.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAUUiYSyE39SpDwBqU3P429EGBRf60LqbTbKS6NSxAiEO9dv-xnwMLMv0qk8P0pnhHxeJAGGAx1c0ioaXKVZMr9PPX7v_v5cYDiVyzdiyk1Gh88SjeLicyieQsYOqARk212QoqhxMcX6Iy/s1600/ssms.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAUUiYSyE39SpDwBqU3P429EGBRf60LqbTbKS6NSxAiEO9dv-xnwMLMv0qk8P0pnhHxeJAGGAx1c0ioaXKVZMr9PPX7v_v5cYDiVyzdiyk1Gh88SjeLicyieQsYOqARk212QoqhxMcX6Iy/s1600/ssms.png" height="320" width="180" /></a></div>
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-20816468661130224882014-03-13T16:14:00.001-06:002014-03-13T16:15:39.505-06:00Set Theory Mathematics: The Problem with Sums of Sums and Averages of AveragesRecently, it has come to my attention that the reports that have been written prior to my coming have an itsy bitsy flaw. Let me explain.<br />
<br />
In one particular report, we count groups of people carrying out actions. We then add and subtract the counts of those actions to get a total number of people doing aggregate actions.<br />
<br />
I'm going to make up a hypothetical example. Let's say I'm counting the actions of grocery shoppers who bought certain items.<br />
<br />
<ul>
<li>32 people bought paper towels</li>
<li>17 people bought napkins</li>
<li>12 people bought chicken </li>
<li>5 people bought crackers</li>
</ul>
<div>
Now, let's say I want to know how many people bought paper products. The answer should be easy. </div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7d-K3yo9hyyNHIBaKzfoYpw3YYKYRaWh0-4rulRPrE2BHcCDCIpuFedcOI7u-BGUzj7A_Zoy974ejnbEteimmWjq_Q_Wc3rD2LIUsHigEC9usYIqQK15_ciPWCLSJgMKzX4jrPEd9lFFo/s1600/venn+diagram+01.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7d-K3yo9hyyNHIBaKzfoYpw3YYKYRaWh0-4rulRPrE2BHcCDCIpuFedcOI7u-BGUzj7A_Zoy974ejnbEteimmWjq_Q_Wc3rD2LIUsHigEC9usYIqQK15_ciPWCLSJgMKzX4jrPEd9lFFo/s1600/venn+diagram+01.png" height="132" width="200" /></a></div>
<div>
32 + 17 = 49</div>
<div>
<br /></div>
<div>
Right? That would be true if the sets were <i>mutually exclusive</i>, like the Venn diagram to the left. That means the sets don't overlap ever.<br />
<br />
Unfortunately, that is most likely incorrect! Why? A person can very easily buy both napkins and paper towerls.<br />
<br />
Let's say I failed to mention this fact:</div>
<div>
<ul>
<li>7 of the people who bought napkins also bought both napkins and paper towels</li>
</ul>
<div>
So the more accurate answer is:</div>
</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjht5cG9Dyax5O-xJ26AsyVfiKOF4uTT7FN8UqA8Jl593DgR8ATIqnu6tufNA0uqe0gtZRDml7cIpdV8kjxDazPpygK84TaWYJZ3YV5qyXu_ciEPkdOTbLYD3xpuYnaLFo-9xD8EhPITWWH/s1600/venn+diagram+02.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjht5cG9Dyax5O-xJ26AsyVfiKOF4uTT7FN8UqA8Jl593DgR8ATIqnu6tufNA0uqe0gtZRDml7cIpdV8kjxDazPpygK84TaWYJZ3YV5qyXu_ciEPkdOTbLYD3xpuYnaLFo-9xD8EhPITWWH/s1600/venn+diagram+02.png" height="148" width="200" /></a></div>
<div>
32 + 17 - 7 = 42</div>
<div>
<br /></div>
<div>
Why does this happen? Because of the count of counts and the average of averages and the sum of sums (or any combination thereof) are not the same! It's a mathematical fact of life, and we need to take it into account whenever we're dealing with sets. </div>
<div>
<br /></div>
<div>
Sets, like scalar values, can be "added together" and "subtracted" in a manner of speaking. There are lot of special terms and symbols used in <a href="http://en.wikipedia.org/wiki/Set_(mathematics)" target="_blank">set mathematics</a> that I won't get into here today. What is relevant now is how this all applies to database sets, otherwise known as tables.</div>
<div>
<br /></div>
<div>
If you want to get the proper count of two intersecting (or exceptional) tables, you will need to use the <a href="http://technet.microsoft.com/en-us/library/ff848745.aspx" target="_blank">set operators</a>, UNION, INTERSECT, and EXCEPT. </div>
<div>
<br /></div>
<div>
If you want to (distinctly) add two tables together, use UNION. If you want to get the rows where the tables intersect, use INTERSECT. Finally, if you want to subtract two tables, use EXCEPT.</div>
<div>
<br /></div>
<div>
Once you've completed your set operation, then you can summarize it. But if you want accurate numbers, never summarize before the set operation.</div>
<div>
<br /></div>
<div>
Here's some pseudo-sql as an example:<br />
<br /></div>
<div>
<pre class="brush:sql">SELECT
COUNT(customer_id) AS
FROM
(
SELECT
customer_id
FROM
products
WHERE
product = 'paper towels'
UNION
SELECT
customer_id
FROM
products
WHERE
product = 'napkins'
) unioned_products</pre>
<pre class="brush:sql"></pre>
<br />
The reason this works is because duplicate <b>customer_id </b>values are thrown away. UNION works by sorting both tables and then merging them together, throwing away any duplicates that it finds. In the second Venn diagram above, people who bought both napkins and paper towels were doubly counted when in fact they should have been singly counted.<br />
<br />
Rest assured, this is an imagined scenario that came about because of a real situation. Our reports at work were most definitely reporting the wrong numbers and now we have to go back and rewrite a lot of them with this specific problem in mind.<br />
<br />
What will you do the next time your CEO comes to you and asks you how -3 people bought your products this month? Embarassing. Just tell 'em "I inherited the report from the last report writer." Then fix it as fast as you can! </div>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-13313506098148587582014-02-28T10:26:00.001-07:002014-02-28T11:41:19.297-07:00Installing a trial of SQL Server 2014 and Windows Server 2012 in a VirtualBox VMIt's sometimes really useful to get a trial copy of SQL Server to play with. There are a number of reasons for this: certification testing, validation, upgrade testing, skills training, experimentation, recovery trials, etc.<br />
<br />
There are a lot of caveats, however. You don't want to install it on your work machine because you don't want to mess up your current configuration. The next best option is using a virtual machine. But you don't want to spend hours or days installing and configuring Windows and you don't even have Hyper-V running on your operating system (for instance, Windows 7).<br />
<br />
Luckily, all these problems have easy workarounds. Here's a preview:<br />
<br />
<ol>
<li>Download the VHD file for the Windows Server 2012 trial</li>
<li>Downlload and install VirtualBox</li>
<li>Windows is already installed! </li>
<li>Download the ISO for SQL Server 2014 </li>
<li>Mount the ISO virtually in your virtual machine</li>
<li>Start the virtual machine</li>
<li>Install SQL Server as you please</li>
<li>You're done! </li>
<li>However, if you're really interested, you don't even need to <i>sign in</i> to your new VM to access SQL Server. Just make it accessible to the network and make sure your VM host and guest can talk to eachother. Then just install your Client Tools (such as SSMS) on your host and you can remotely manage the guest. AMAZEBALLS!</li>
</ol>
<div>
This process looks longer than it actually is. The most tedious part about this will be waiting for the downloads, honestly. </div>
<div>
<br /></div>
<div>
Let's get started!<br />
<br /></div>
<h4>
Download the VHD file for Windows Server 2012 Datacenter (Trial)</h4>
<div>
This trial version of Windows comes in a pre-packaged VHD (virtual hard drive) file directly from Microsoft here: <a href="a.http://technet.microsoft.com/en-us/evalcenter/dn205286.aspx?WT.srch=1&WT.mc_id=SEM_GOOGLE_USEvergreenSearch_WS2012&CR_CC=300095958" target="_blank">Microsoft Windows 2012 Trial</a></div>
<div>
<br /></div>
<div>
Download it and put it somewhere safe on your local machine. From here forward, your local machine will be known as the <i>host machine</i>.<br />
<br /></div>
<h4>
Download and install VirtualBox</h4>
<div>
Download from here: <a href="https://www.virtualbox.org/wiki/Downloads" target="_blank">Download VirtualBox</a></div>
<div>
<br /></div>
<div>
You'll want the 64-bit version for Windows. Put it in a safe place on the host machine and then install it. You shouldn't have to worry about install options.<br />
<br /></div>
<h4>
Create a new virtual machine using the VHD file</h4>
<div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirF9xOfiobGF81YFsfKhY592YD4oOmFO6bigV7wuehCVlkqf-v76nV1NENOGxNPt91tES6mAAC1VvT3etePXPbIHea4dZE01GTpf5Q8eMv_3b-ne0R_U15V7Am4BNQdDtGOMXUXp6Ml8nj/s1600/virtualbox+machine.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirF9xOfiobGF81YFsfKhY592YD4oOmFO6bigV7wuehCVlkqf-v76nV1NENOGxNPt91tES6mAAC1VvT3etePXPbIHea4dZE01GTpf5Q8eMv_3b-ne0R_U15V7Am4BNQdDtGOMXUXp6Ml8nj/s1600/virtualbox+machine.png" height="175" width="200" /></a>This is easy. Just start up VirtualBox and click [New]. When you're setting up your virtual machine, make sure to choose Microsoft Windows for the type and Windows 2012 (64-bit) for the version. See the screenshot. </div>
<div>
<br /></div>
<div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQ9rnML_0MtysT-zpE5a-FZRUva5wfxJ9durl2GZNugE_CCIS0cVccsDKdjaxX24DY1tp_coFgtrYPtykFqXjhsAqTT0jsAGX5nyp1avbsNuopqZY5TA_pHLEi7aCqrMuBY9GDjxkPkkpv/s1600/virtualbox+machine+02.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQ9rnML_0MtysT-zpE5a-FZRUva5wfxJ9durl2GZNugE_CCIS0cVccsDKdjaxX24DY1tp_coFgtrYPtykFqXjhsAqTT0jsAGX5nyp1avbsNuopqZY5TA_pHLEi7aCqrMuBY9GDjxkPkkpv/s1600/virtualbox+machine+02.png" height="175" width="200" /></a>Next, you won't want to create a new hard drive. Instead, use the one you downloaded from Microsoft. </div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifQmssdrbQf_TnfvqgAC5GRKIeffvEVw-YnebX4ZFaTkq-qbjh3KP78dFRf6vp9MNYOUT75G4Gh9ieIi8LJvfoqWOBcaw649ka_sUBUgryVAhpE8K0mRDDo6e8-ShTFqMyveKVxMXYIbEQ/s1600/virtualbox+machine+03.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifQmssdrbQf_TnfvqgAC5GRKIeffvEVw-YnebX4ZFaTkq-qbjh3KP78dFRf6vp9MNYOUT75G4Gh9ieIi8LJvfoqWOBcaw649ka_sUBUgryVAhpE8K0mRDDo6e8-ShTFqMyveKVxMXYIbEQ/s1600/virtualbox+machine+03.png" height="60" width="200" /></a>You will <i>most likely</i> receive an error that the hard drive already exists See the screenshot. The error is approximately "<b>Cannot register the hard disk file because a hard disk with UUID already exists</b>." If that's the case, you need to cancel the process and go to the main VirtualBox window. From there, click the File menu, then Virtual Media Manager. Delete the old hard drive from the list. It will ask you if you want to delete the file -- don't do that (unless you'd like to redownload the file). <a href="chrome-extension://klbibkeccnjlkjkiokjodocebajanakg/suspended.html#url=http%3A%2F%2Fwww.archibel.com%2F103.html%3F%26tx_ttnews%255Btt_news%255D%3D1346%26cHash%3D60042121521c4fd52b12697dd22c672d" target="_blank">See here for more details on the error "Cannot register the hard disk file because a hard disk with UUID already exists"</a>.</div>
<div>
<br /></div>
<div>
Next, you can play with the settings of your Virtual Machine. Give it as much or as little memory as you feel appropriate. Give it more or fewer CPUs. Do what you feel is appropriate for your situation. The defaults should run adequately for demonstration purposes. </div>
<div>
<br /></div>
<div>
Here's some extra help: <a href="a.http://www.sysprobs.com/how-to-open-run-microsoft-vhd-files-on-virtualbox-pre-installed-vhd-images" target="_blank">Microsoft VHD images on VirtualBox</a> (although not everything in the article is accurate or applicable to this situation<br />
<br /></div>
<h4>
Download the ISO for SQL Server 2014 </h4>
<div>
Download the <a href="http://technet.microsoft.com/en-US/evalcenter/dn205290.aspx" target="_blank">Microsoft SQL Server 2014 trial</a>. Make sure to grab the ISO.<br />
<br /></div>
<h4>
Install the ISO as a Virtual DVD in your VM</h4>
<div>
Go into the settings of your VM, click on Storage, then click the little icon of a CD next to the CD/DVD Drive. Then choose "Choose a virtual CD/DVD disk file..." and browse to your SQL Server ISO file. Save the settings.<br />
<br /></div>
<h4>
Start your Virtual Machine</h4>
<div>
In the Oracle VM VirtualBox Manager window, highlight your server and then click start. It will take a while to boot and it will look a lot like a Remote Desktop session. Jump through all the standard hoops. </div>
<div>
<br /></div>
<div>
Next, you're wondering, how do I send CTRL+ALT+DEL to the Virtual Machine to log in? Click on the machine menu and then click on "Insert Ctrl-Alt-Del."<br />
<br /></div>
<div>
<h4>
Install SQL Server</h4>
<div>
Once you're in Windows, go to This PC and double click the installation DVD for SQL Server. Then click the setup icon to start the installation. </div>
<div>
<br /></div>
<div>
It is beyond the scope of this article to help you install SQL Server. If you're a SQL Server admin looking to become certified, this is the part when you should grab your certification manual and follow along so that you understand all your installation options.<br />
<br />
Special note: You will need to install .NET 3.5 ... this is not automatically included in Windows Server 2012, but you can add it as a feature. Check out <a href="http://technet.microsoft.com/en-us/library/hh831809.aspx#BKMK_FoD" target="_blank">To install .NET Framework 3.5 by using the Add Roles and Features Wizard</a> (scroll down about 3/4 of the way).<br />
<br /></div>
<h4>
Install Guest Additions</h4>
<div>
VirtualBox guest additions are some utilities and features to help VirtualBox talk to your virtual machine and speed it up a bit. To install them, while your machine is running, click on the Devices menu then click on "Insert Guest Additions CD Image." Next, inside the virtual machine, go to This PC then open up the VirtualBox Guest Additions CD Drive and install VBoxWindowsAdditions. The installation is straightforward. Make sure to allow the installation of all drivers and to reboot afterwards.<br />
<br /></div>
<h4>
Remotely Access Your Server</h4>
<div>
Your virtual machine is also known as the <i>guest machine</i>. You can install just the SQL Server client tools in your <i>host machine</i> and remotely administer the server running your <i>guest machine</i>. That configuration is also beyond the scope of this article, but most likely I will post a second article describing that. In the meantime, fire up your <i>guest machine</i> and start SQL Server Management Studio (under the Start menu or the All Apps menu). </div>
<div>
<br /></div>
<div>
Good hunting!</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirF9xOfiobGF81YFsfKhY592YD4oOmFO6bigV7wuehCVlkqf-v76nV1NENOGxNPt91tES6mAAC1VvT3etePXPbIHea4dZE01GTpf5Q8eMv_3b-ne0R_U15V7Am4BNQdDtGOMXUXp6Ml8nj/s1600/virtualbox+machine.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><br /></a></div>
</div>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-17333316508178470302014-02-10T10:10:00.001-07:002014-02-10T10:12:11.741-07:00SQL Server Agent Restart Notification / SQL Server Database Cluster Failover NotificationI work in an environment running a SQL Server Cluster (2-node, Active/Passive) on the production server. I want to know when the SQL Server Cluster fails over or when it's tested. It seems to have happened a few times. Since our IT department is somewhat regimental, I don't have access to a lot of system administrator functions. However, there is still a simple way to get that notification.<br />
<br />
When I was researching this problem, I came across this: <br />
<a href="http://www.practicalsqldba.com/2012/07/sql-server-how-can-i-get-notification.html">SQL SERVER: How Can I Get Notification Alert When Fail Over Happened?</a> This got me started down the right path but it turns out to be simpler.<br />
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimMLCUeWUiKNfUM9OAoFiZ-I8sEmpOkZMsUND5X6dsqSQP7-RyL6y0_EXuyvpj4NODNY7acbd6yj9q1AYmNn8y8q0ySBM5EAr4H0Lu62ZCNh2N1wboE_iAhYXxEVhmUcWDQ6bxXjAXsrJf/s1600/agent+restart+notification+01.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimMLCUeWUiKNfUM9OAoFiZ-I8sEmpOkZMsUND5X6dsqSQP7-RyL6y0_EXuyvpj4NODNY7acbd6yj9q1AYmNn8y8q0ySBM5EAr4H0Lu62ZCNh2N1wboE_iAhYXxEVhmUcWDQ6bxXjAXsrJf/s1600/agent+restart+notification+01.png" height="173" width="200" /></a></div>
<div>
First, a bit of background on the fail over. When fail over occurs, the clustering services on the remaining nodes detect that failure. They attempt to shut down services on the failing machine. As soon as that completes or when the machine is deemed inaccessible, the clustering service starts SQL Server services on the new active node. We can take advantage of that stop service/start service process because SQL Server Agent has a hook in its scheduler that can start a job when SQL Server Agent starts (see the screenshot to the right).</div>
<div>
<br /></div>
<div>
Here are the steps:</div>
<div>
<ol>
<li>Create the new job.</li>
<li>Name it something like "Agent Restart Notification".</li>
<li>Add at least one step. </li>
<ul>
<li>It can be an empty step with some do-nothing code in it, but it has to be there (otherwise the scheduler will not enable the job).</li>
</ul>
<li>Add a schedule, but choose <b>Start automatically when SQL Server Agent starts</b>.</li>
<li>On the notifications page, send an email when the job completes.</li>
<ol>
<li>Please note that you must have <b>Database Mail</b> enabled and properly configured.</li>
<li>You must also have an Operator properly configured.</li>
</ol>
<li>Save the job.</li>
</ol>
<div>
That's it. Next time the SQL Server Agent restarts, you will receive a notification. There's one problem here, of course. What if the SQL Server Agent restarts because of some other event, like someone manually restarting it? Unfortunately more complex code is necessary for such a specific event. To get started with that, see the above referenced blog post. As for me, this serves my purposes well enough. </div>
</div>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com3tag:blogger.com,1999:blog-8055418547332992788.post-40856002204822474362014-02-06T13:14:00.000-07:002014-02-06T14:55:01.244-07:00T-SQL Error: Each GROUP BY expression must contain at least one column that is not an outer referenceI've written thousands (perhaps tens of thousands) of queries in at least half a dozen database engines. I consider myself pretty good at this query business. Unfortunately, I run into a quizzical error message every once in a while that makes me scratch my head.<br />
<blockquote class="tr_bq">
<b>Msg 164, Level 15, State 1, Line 163:</b><b>Each GROUP BY expression must contain at least one column that is not an outer reference.</b></blockquote>
Gnarly.<br />
<br />
In general, I feel that Microsoft's error messages leave something to be desired. Oh, for instance, meaning, context, and substance.<br />
<br />
Here's the story. I had a regular old query and I wanted to sum the financials for each period.<br />
<br />
<pre class="brush:sql">SELECT
X,
Y,
@CURRENTYEAR as CurrentYear,
SUM( Z ) AS Z
FROM
FINANCIALSUMMARY
</pre>
<br />
No problem, right? Just add the non-aggregate columns into a GROUP BY statement and VOILA! But no. Didn't work after all.
Why? First, let's parse the error message to see if we can actually get any meaning out of it.<br />
<ul>
<li>expression</li>
<ul>
<li>Any value made of constants or variables combined using operators</li>
<ul>
<li>examples:</li>
<ul>
<li>1 </li>
<li>10</li>
<li>@x</li>
<li>10 + @x</li>
</ul>
</ul>
</ul>
<li>must contain at least one column</li>
<ul>
<li>Columns are by definition part of a table or table expression! Therefore any expression that doesn't have a column must be excluded. </li>
</ul>
<li>that is not an outer reference</li>
<ul>
<li>This is a very abstract concept in terms of the SQL query language. I can't really do it justice here in one sentence, so I recommend reading up on something like <a href="http://sqlmag.com/blog/aggregates-outer-reference" target="_blank">Aggregates with an Outer Reference</a>. </li>
</ul>
</ul>
Because there are variables that do not make reference to columns in some of the expressions, those expressions cannot be part of the <b>GROUP BY</b> clause. In other words, I had to remove the variable <b>@CURRENTYEAR</b> from the <b>GROUP BY </b>list.Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-44355163169381082732014-02-06T09:09:00.001-07:002014-02-06T13:14:21.523-07:00T-SQL Error: The multi-part identifier could not be boundSometimes when I work with different databases I forget the rules that are exclusive to each one. For example, today I wrote a trigger in T-SQL (SQL Server) that would have probably worked in PL/PGSQL (PostgreSQL). When it bombed with a syntax error, I had to scratch my head. I read most of the manual page from Technet but had no real luck finding the answer. I've been trying to use my own mental knowledge-base to figure out the answers to a lot of questions lately, but this time I was under an approaching deadline so I just Googled it.<br />
<br />
Here's the problem:<br />
<pre class="brush:sql">
CREATE TRIGGER
dbo.tr_FOO
ON
dbo.FOO
AFTER INSERT, UPDATE AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for trigger here
IF( UPDATE( REF ) AND ISNUMERIC( inserted.REF ) = 1 )
BEGIN
UPDATE
dbo.FOO
SET
REF = 'Y'
WHERE
ID = inserted.ID;
END
END
GO
</pre>
<br />
Why didn't this work in T-SQL? Because <b>inserted</b> is a table and I'm trying to reference a column. The error message I received was <b>The multi-part identifier "inserted.REF" could not be bound</b>.<br />
<br />
So it turns out that <a href="http://social.msdn.microsoft.com/Forums/sqlserver/en-US/206caae5-bfba-4971-ba1c-0b31f615d6f4/cant-access-inserted-table-from-trigger-msg-4104-the-multipart-identifier-could-not-be?forum=sqlexpress" target="_blank">there's no way to directly reference a table in T-SQL without using a SELECT statement</a>. Unfortunately, I write triggers so rarely that I forget this because it seems somewhat illogical and annoying to me. In PL/PGSQL that's not the case. In that language,<a href="http://www.postgresql.org/docs/9.2/static/plpgsql-trigger.html" target="_blank"> the <b>inserted</b> and <b>updated</b> sets is a <i>record</i> variable that can be referenced directly</a>, so referencing something like <b>inserted.REF</b> is perfectly legitimate. That's one of the really annoying "missing features" of T-SQL. So the solution is simple but long-winded.<br />
<br />
<pre class="brush:sql">CREATE TRIGGER
dbo.tr_FOO
ON
dbo.FOO
AFTER INSERT, UPDATE AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- In T-SQL,
DECLARE @ID char(10);
DECLARE @REF char(6);
SELECT
@ID = NID,
@REF = REF,
FROM
inserted;
-- Insert statements for trigger here
IF( UPDATE( REF ) AND ISNUMERIC( @REF ) = 1 )
BEGIN
UPDATE
dbo.FOO
SET
REF = 'Y'
WHERE
ID = @ID;
END
END
GO
</pre>
<pre class="brush:sql"></pre>
<br />
On another note, SQL Server has an <b>AFTER</b> and <b>INSTEAD OF </b>triggers but not the extremely useful <b>BEFORE</b> triggers that are in PostgreSQL. Dear Microsoft: Fix these two annoyances and I will love you more!
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-81425860843090953122014-01-16T14:25:00.000-07:002014-01-16T14:25:08.527-07:00How to Find All Ecrypted Columns in a SQL Server Database<span style="background-color: white; color: #222222; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 18px;">I needed to nullify the data in all encrypted columns in a development server (according to our business rules). I knew most of the columns because we used them regularly, but I wanted to be thorough and also I wanted to be able to prove that I've found them all.</span><br />
<span style="background-color: white; color: #222222; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 18px;"><br /></span>
I asked the question <a href="http://dba.stackexchange.com/questions/56844/is-there-a-quick-way-to-find-all-columns-in-sql-server-2008-r2-that-are-encrypte">Is there a quick way to find all columns in SQL Server 2008 R2 that are encrypted/have encrypted data</a> on Stack Exchange and got a couple of great responses that pointed to "no" but someone came through with the answer and it worked. Basically you have to use the KEY_NAME() function and loop through the system tables to get what you need. <div>
<br /></div>
<div>
I won't go into great detail here since it's all in the web page linked above. </div>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-3726479576365633042014-01-06T10:01:00.000-07:002014-01-06T10:01:06.473-07:00Questions to Ask the Interviewer during a Job InterviewI recently switched jobs because I was unhappy with the business and technical directions at my previous employer. When I decided to switch, I remembered two things. First, during every interview, they always ask the questions, "Now, do you have any questions?" If the answer is "no" then it's either a unicorn job or else you need to have a moment with yourself to come up with some. Second, I didn't ask enough questions (or at least didn't pay close enough attention to the answers) during the interview process at my last employer. I could have kicked myself for falling into such an obvious trap -- the signs of a sinking ship -- water pouring in through holes in the windows -- were everywhere.<br />
<br />
More recently when I went on the job hunt, I was very careful to come up with a comprehensive list of questions ahead of time -- most especially questions relating to the problematic nature of my previous employer. I think this strategy really paid off. My new and potential employers were much more engaged by the level of awareness I had going into the process. I feel like I've found a good career-home and I owe it directly to a strong interview process (and a good recruiter who agreed with my philosophy).<br />
<br />
This morning I was forwarded <a href="http://jvns.ca/blog/2013/12/30/questions-im-asking-in-interviews/" target="_blank">a list of interview questions by Julia Evans</a> -- not for your potentially new employer to ask you but for you to ask. There are a ton of questions and many of them are developer specific, but a lot of them are very good for general situations and for a starting point to get you going.<br />
<br />
My advice: spend an hour before the interview and get a good 20-40 questions ready. You won't use them all during any given interview, but when you glance at the list, you should have a good idea of which ones are appropriate. Also, highlight your high priority questions. Those are the ones for which you absolutely need to know the answers.<br />
<br />
Some examples from Julia Evans' blog:<br />
<ul>
<li>Do you do code review? Does all code get reviewed?</li>
<li>Do you have an issue tracker?</li>
<li>Do people work on the weekend?</li>
<li>Do people check in when they’re on vacation? How often?</li>
<li>Do your employees speak at conferences about your work?</li>
<li>Is it easy to move to other divisions or offices?</li>
<li>How does internal communication work? This one is super important and I need to remember to ask it more.</li>
<li>How much are you planning to hire in the next year?</li>
<li>What do you wish you had known when you joined this company?</li>
</ul>
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-55094302188042659562013-12-31T10:00:00.003-07:002013-12-31T10:00:26.305-07:00My New Years Resolution: Learn About Massively Distributed Database SystemsDatabase scalability seems to be a problem for many database administrators and data managers. The problem is not "how do I do it" or "what is it" but rather "what do I do with it?" I've worked with the smallest databases in many different traditional RDBMS platforms and I've worked with some pretty huge databases. My last job had several terabyte databases sitting in a single database server -- no clustering. I currently work in an environment with several small databases sitting on an over-sized cluster. Frankly, I think they have it backwards. The big environment should have been clustered and the small environment doesn't need to be. But I digress. The point is, I don't think most database architects know what to do with scalable systems because, frankly, they don't have enough data in their small to medium shops. But that will change -- and soon!<br />
<br />
In order to grow and to perhaps work in a bigger shop that has the need for massively distributed data, we have to think bigger and learn about the world outside of our small ponds. Not only that, but our small ponds will be getting bigger. The data landscape is changing quickly.<br />
<br />
I'm thinking about this today because I read <a href="http://opensource.com/business/13/12/infinisql" target="_blank">an article about InfiniSQL (actually, more of a press release)</a> from its principal developer. Some of the concepts didn't make a whole lot of sense to me because I've never worked in an environment that distributes data to multiple geographical locations. Thinking back on the articles I've read recently, it seems like distribution of data is going to be the big problem to solve in the future and it seems like an important niche to dive into. After all, <a href="http://www.npr.org/2012/12/20/167702665/geoff-nunbergs-word-of-the-year-big-data" target="_blank">last year's contender for word-of-the-year was Big Data</a>.<br />
<br />
But Big Data is conceptual in nature and very few people understand what it really is. Big Data is not a terabyte or two terabytes in a database cluster -- not even ten or twenty or fifty or a hundred. Big Data is more like a hundred terabytes <i>per day</i>. The actual storage systems are dozens to hundreds of petabytes in size. Big Data was actualized by massive players -- Yahoo and Google and maybe even Pixar. I found this out by reading <a href="http://www.amazon.com/Big-Data-Revolution-Transform-Think/dp/0544002695" target="_blank">Big Data: A Revolution That Will Transform How We Live, Work, and Think</a>.<br />
<br />
What I would like to talk about here is not Big Data. Big Data is going to get bigger -- that's true. Next decade's databases will probably be 10-100 times bigger than the ones we have now, even for small to medium sized firms. Since storage is so cheap we've all begun to hoard data. But in its most fundamental format data -- and especially Big Data -- is less than useless!<br />
<br />
The time for some is now -- and the time for the rest of us is soon -- when we will actually have to start leveraging our treasure mounds of data for something beyond simple reports. The new buzzwords are Data Visualization, Data Mining, and the like. We don't just need to <i>store</i> the data, we need to <i>extract useful information</i> from it.<br />
<br />
Sure, Data Visualization and Data Mining have been around for a while as concepts but they haven't really entered the lexicon of the common database administrator. The reason is because DV and DM have been so difficult to accomplish without a degree in higher mathematics -- seriously! Have you ever wanted to compare the graphs of two time periods to detect their "sameness" or "differentness?" There's a reason we draw graphs and give them to analysts to interpret -- because the human brain can do so much more with two simple graphs today than a team of computer scientists. The algorithms and mathematics are out there, for the most part, but where data, computer algorithms, and higher mathematics intersect is the <i>trivium</i> (the crossroads) where only a few have been able to tread successfully.<br />
<br />
Those people, with that trinity of accessible skills, are the ones who make $400-$500k annually as developers. They make so much money and live such charmed lives because conceptually it's such an exquisitely difficult nirvana to reach. It's hard enough learning how to program, and even harder to understand the intricacies of data, and harder still to master the highest orders of mathematics. Put them together and you have a requirement of 20 years of intense college education peppered with summers of internships and late and lonely nights of coding and experimentation. Unfortunately, twenty years in this industry is a lifetime and everything will have changed in 20 years.<br />
<br />
That's why it's important to keep up with the current trends in the data sciences. Sure, there are a few geniuses here and there who have the right mix of skills to command $500k (in a non-executive role) but they are far and few between. For the rest of the world, there are the data scientists who will earn $150k, $125k, $100k, $75k per year. They are not the ones who <i>develop</i> the database software or the analytics software necessarily (albeit there will be some small-scale development for those who can push themselves). Rather, they will be the ones who learn to listen to the tracks to hear the oncoming trains. In other words, they will be the ones who learn to leverage existing software and push their small shops to the bleeding edge by testing prototype systems in strange and new combinations to see what works and what doesn't.<br />
<br />
So here's my New Year's resolution. 2014 will be the year in which I push my skills into new territories. I don't just want to read about Big Data, Data Visualization, Data Mining, Statistical Analysis, Distributed Systems, etc. I want to learn their intricacies and nuances and I want to listen to what the data science community is saying about them. I want to learn about horizontal scaling, vertical scaling, large scale clustering, and anything having to do with storing and mining that data for useful information. Perhaps next year's resolution will be to come up with a problem to solve. Until then!<br />
<br />
Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0tag:blogger.com,1999:blog-8055418547332992788.post-70808417577256142742013-12-20T14:34:00.000-07:002013-12-20T14:34:48.723-07:00CONVERT_IMPLICIT in a JOIN or SubqueryImplicit conversion can be a big problem in SQL Server that can slow down your queries by orders of magnitude -- even with relatively low row counts.<br />
<br />
First of all, what is implicit conversion? When two values are compared in an expression, the data types should match. If they don't match, SQL Server will <i>implicitly</i> (without being told) convert one data type into the other so that they match. After they match, then SQL Server can make the comparison.<br />
<br />
<a href="http://www.brentozar.com/archive/2012/07/identifying-correcting-sql-server-implicit-conversion/" target="_blank">Implicit conversion can be identified and corrected in many cases</a> and it's most easily done in the query itself with explicit CAST and CONVERT statements. But what happens when joining two tables together with keys of different data types?<br />
<br />
When sets are overlapped (with a JOIN or with a subquery in an IN operator), each set has a key which correlates them together. If those keys are the same data type (but not simply compatible data types), then indexes which exist on the keys can be directly used to speed up the correlation. However, if the two data types are not the same, SQL Server must <i>implicitly</i> convert the data types and it can no longer use those indexes. So basically, all that work you did to make those indexes finely tuned and optimized gets thrown out the window because they will never be used.<br />
<br />
Look at this snippet of an execution plan:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdw9cLhU6QOUMJ7D6G_xY2Qln6qeCkiJdMpx_6HeHEf8HUx7NjJWPVFJxBK4KN_GI7UPfvjsDa9VwsUPSn0rkkJBUa4Q_-StidZ2T0G9WIywARggV7_YA5OWy9xzYLU6Xa4xg56xHSom8o/s1600/implicit+conversion.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="168" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdw9cLhU6QOUMJ7D6G_xY2Qln6qeCkiJdMpx_6HeHEf8HUx7NjJWPVFJxBK4KN_GI7UPfvjsDa9VwsUPSn0rkkJBUa4Q_-StidZ2T0G9WIywARggV7_YA5OWy9xzYLU6Xa4xg56xHSom8o/s320/implicit+conversion.png" width="320" /></a></div>
<br />
Above, an implicit conversion is happening inside the COMPUTE SCALAR nodes in this execution plan. You can see that almost half a million rows are being <i>scanned</i> from <b>Table3</b>. Notice that the compile time is almost 4 seconds. Client statistics of 10 runs revealed that this query takes an average of 2892ms to run.<br />
<br />
COMPUTE SCALAR is actually converting <b>nvarchar(15)</b> in <b>Table3 </b>to <b>char(10)</b>, and it's doing it 500,000 times! Is all that work really necessary?! Not at all.<br />
<br />
Look what happens when I change the data type of the base table <b>Table3 </b>to <b>char(10)</b>. Note that SQL Server will complain about this conversion because it could result in a loss of data, but in my case, I verified that it would not have.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRy9pyH50LgE27EA88Gtuf_RT7r7bjEpIqSTpENYqLzY1oxq2_fO6LW4T_olD-msVe0DCdRVvC7GYstLKexxenglt_84GHrg1FZZ9Z4KiSS8j-6VJ_HTBVi3SlfeFuG4jYuddDDAo27F1u/s1600/implicit+conversion+2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRy9pyH50LgE27EA88Gtuf_RT7r7bjEpIqSTpENYqLzY1oxq2_fO6LW4T_olD-msVe0DCdRVvC7GYstLKexxenglt_84GHrg1FZZ9Z4KiSS8j-6VJ_HTBVi3SlfeFuG4jYuddDDAo27F1u/s320/implicit+conversion+2.png" width="320" /></a></div>
<br />
Above, first notice that the anonymizer has changed <b>Table3</b> to <b>Table4</b>. Furthermore, there is no implicit conversion and the indexes can and in fact are being <i>seeked</i>. Instead of scanning 500,000 rows, you now see it's seeking 112 and 7. A tiny change has radically altered the dynamics of this query. Notice other stats, like the fact that it ran in almost 1/40th (3%!) of the time. Client statistics confirm the speedup.<br />
<br />
Your mileage may vary, of course, but always look out for CONVERT_IMPLICIT!<br />
<br />
<br />Keith Alexander W.http://www.blogger.com/profile/03164086700339769547noreply@blogger.com0