26 March 2015

#### Statistics

I work for an organisation in a job where understanding statistics is definitely required. I wont bore you in this post about the applications I use, as this is about a singular product that I have created using Microsoft Excel. If you are interested in my Bell Curve Excel Template, feel free to download it. Don’t use a low powered CPU to run this, as it does require a little grunt. Now, like most things I create, its a jigsaw of features from here and there. Feel free to open it and take a look prior to actually pressing any buttons or following my example further down.

BTW I am a BIG fan of using XLSB file types for saving Large Excel Documents.

#### Add ’em all up, Divide by the Count = Average

When I first started in this job, I was asked to create an ‘average’ number from a population of 1300 records. The expected average was to be somewhere around 80, but what I returned was approx 60% higher at 124. Why?

Lets face it, most people don’t care about anything beyond one piece of data – that average number. But when the numbers don’t add up, they can just assume you don’t know what you are doing. Thankfully I understood Standard Deviations and was able to create a Bell Curve Graph to explain why. Although the graph took 3 hours to create and the explanation took 15 minutes. Way too long for my liking. And instead of you wasting time, take a copy of my Bell Curve Excel Template and use it for yourself.

#### Standard Deviation

There are LOTS of good YouTube – I recommend ExcellsFun on YouTube, this guy knows his stuff and has some great formula ideas for Spreadsheets. In fact, if you look at some of my formulas, I think I used his ideas to ensure the chart would not show ‘null’ values.

I also acknowledge that a lot of my inspiration came from

Wittwer, J.W., “Graphing a Normal Distribution in Excel”.

While there are similarities tot he above references, hopefully you will conclude that I have made this a much more user friendly version.

**Percent of the Population**

Here it gets complicated. I want to try to keep the blog simple – so the complexity is in the Bell Curve Excel Template. The beauty of the spreadsheet, is I can simply highlight the area of interest to individuals (which to be honest, normally isn’t even the graph as that is way too complicated for them too).

The part of interest to most people is the 68% or the 95% section. My humble understanding is that I can throw away the 0.3% result AND here is where it gets really cool (I’m just a geek at heart) I can also ignore the 99.7% if I so desire. Suddenly, my average numbers are on target and explainable!

#### Bell Curve Template Spreadsheet

I wont go through the mechanics (VBA) of the Bell Curve Excel Template. I wont lock the code so feel free to take a look at it (Alt + F11). I believe I limit the record set to 20,000 (a constant set in the module) records, but if you have the grunt power, or the requirement or nerves of steel, feel free to up the anti. I had it at 100,000 at one point and… overkill!

*Disclaimer #1:* Two Spread Sheets. IF you rename the sheets, the code wont work. You can copy a sheet and rename the copied sheet if you so desire. I do this to compare the graph outputs from time to time.

*Disclaimer #2: If you are smart enough to understand the VBA and to make positive changes to it, feel free to show me what you did or how you improved it. If your a programmer and only want to banter about how crap VBA is or that I should have used an OO language instead.. Then go away. f*

*Sheet One:* **USER INPUT **

Option 1 – Delete the old data and paste your data.

Option 2 – In the Average Number cell, put a number. Say 60? In the Std Deviation cell, put the standard deviation you’d like to see. Say 5? Then how many records would you like to compare. More is best. Don’t say 5. don’t say 10.. make it a number above 400 preferably in the thousands. Then just click ‘Create Test Data Set’ and wait for a moment. The data set will now have changed.

Option 3: Click the Create Totally Random Numbers button. Just so you know. Its not totally random. Like all buttons, I made it to help ME test an idea.

THEN: Click ‘Load Graph’. And wait. Just wait for it. it will flicker. that’s good! Patience. It will show two pages at once.. No issues.. Pax Gentlemen, Pax. Then suddenly, you will have landed on the NEW sheet.

*Sheet One:* **NEW**

This is a large sheet that contains many wonderful things. Column A (labeled **x**) is your list of numbers. Column B (labeled **x (n Sorted)**) is your list of numbers sorted lowest to highest.

To understand Columns C, D and E, seriously, do some research. My code will automatically fill in these cells if they are needed.

Columns G and H took a bit of work… but it allows you to move a slider under the graph (not my idea and I forgot where I saw this at work, but I did modify it for the way I needed it in my spreadsheet).

Cell **J2:** Dont edit this. its the Average Number. ‘yay!’

Cell **J3:** Dont edit this cell. It is The Standard Deviation.

Cell **N2** and **N3:** you may alter this, but keep it like in the following formats:

(-1, 1) or (-2,2) or (-3,3) or (-4,4) .. or.. (-12, 12). If you don’t see the pattern, then I highly suggest you do NOT use the spreadsheet 😉

Cells **J8** Through to **J17.** (thats **J8, J9, J10….. J16, J17)** : Don’t change these either.

Ah, yes… Cell **J12** is indeed a Magic Number. It took me way too long to figure out how to calculate this number (it ended up being the MAX number in my data set.

If you want to reload the Chart after making a change, just press ‘Reload Chart’. Easy Huh?

Then the true magic. The results! A graph that automatically adjusts to your numbers. you needed one decimal place for the Standard Deviation? It (should) give you one decimal.. You need two decimal places. It (should) give you two decimals. I think its up to three decimal places max….

And finally, those words. Displaying the four optional Standard Deviation % rule. I forget what this rule is.

### Provisos, Quid Pro Quos, Disclaimers:

If there are errors, let me know.

#### And in Conclusion

Should you create a random set of numbers from the ‘User Input’ Sheet, then you should see a near perfect Bell Curve Graph.

Should you use your own set of numbers from a different subset, it MIGHT be a perfect Chart, or like mine, it might be skewed.

But therein lies the fun.

And no longer do I need to spend 3 hours to create a Bell Curve Graph.

March 26, 2015 at 4:46 PM

Great post about dovetails. Thanks.