Lightsaber Capability Analysis:
Picking the Right Distribution
In my previous post, you learned how to prepare your data for capability
analysis in Minitab. Now let's see where we need to go in the statistical
software to run the correct Capability Analysis.
When it comes to capability analysis, Minitab offers a few different choices.
We offer Normal Capability Analysis for when your data follow a normal
distribution. If your data follow a different distribution, such as the
Weibull distribution, there's Non-normal Capability Analysis.
We also offer Binomial Capability and Poisson Capability for when
you are looking to produce a process capability report...
Was Alabama's Blowout of
Notre Dame Really Unexpected?
In this year's BCS Championship game, Alabama dominated Notre Dame
42-14 in a game that was never really even close. While many people felt
Alabama would win the game, most expected a defensive battle. Few predicted
it would be so lopsided (and only a small percentage of those
would have actually bet money on a blowout).
But should we really be surprised? I mean, Alabama clearly outperformed
expectations—but did they do so in a truly unusual manner?
How Can Data Reveal If a Victory Was Unusual?
To investigate how expected or unexpected this game's
28-point margin of victory was, we...
Violations of the Assumptions for Linear
Regression (Day 2): Independence of the Residuals
Recap: Lionel Loosefit has been arrested and hauled to court for violating the
assumptions of regression analysis. In the previous court session,
the prosecution presented evidence to show that the errors in
Mr. Loosefit’s model were not normally distributed. Today, the prosecution
addresses the second alleged violation: namely, that the errors in the defendant’s
regression model are not independent. Dr. Minnie Tabber, a world-renowned
statistician, is on the witness stand.
Prosecutor: Let me remind the members of the jury that
a residual is simply the difference between the data value...
How Well Did Our Statistical Model
Project the Top 100 Fantasy Football Players?
Before the 2012 NFL season, I used Minitab Statistical Software to project
the top 100 fantasy football players. In that data analysis, I used projections
from ESPN, Yahoo, and ones derived solely from Minitab. I averaged the
projections from all 3 sources and then ranked the players by taking the
difference between their projection and the score of the “average” player
at their position. Like any statistical model that projects future events,
it’s always a good idea to go back and see how accurate you were.
So that’s exactly what we’ll do here!
Were the Projections Accurate? Our first order of...
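The ranking step described above can be sketched in a few lines of Python. This is only an illustration, not the analysis from the post: the player names and point totals are made up, and the "average" player at a position is assumed here to be the mean of the averaged projections for the players listed at that position.

```python
from statistics import mean

# Hypothetical projections (fantasy points) from three sources; not real data.
projections = {
    "QB A": ("QB", [300, 310, 290]),
    "QB B": ("QB", [250, 260, 240]),
    "RB A": ("RB", [220, 230, 210]),
    "RB B": ("RB", [180, 170, 190]),
}

def rank_by_value_over_average(projections):
    """Average each player's projections, then rank by the gap to the position mean."""
    averaged = {name: mean(points) for name, (_, points) in projections.items()}

    # Collect the averaged projections by position to define the "average" player.
    by_position = {}
    for name, (position, _) in projections.items():
        by_position.setdefault(position, []).append(averaged[name])
    position_mean = {pos: mean(values) for pos, values in by_position.items()}

    scores = {
        name: averaged[name] - position_mean[position]
        for name, (position, _) in projections.items()
    }
    # Highest value over the position average comes first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_by_value_over_average(projections)
for name, score in ranking:
    print(f"{name}: {score:+.1f}")
```

Ranking by the gap to the position average, rather than by raw points, keeps a high-scoring position such as quarterback from crowding out every other position.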
William of Occam Chooses a Model
Entities should not be multiplied unnecessarily.
— William of Occam, Quodlibeta Septem
We’ve had a chance now to explore Best Subsets Regression and Stepwise
Regression in Minitab statistical software. Both techniques are ways of
quickly looking at lots of candidate models so that you can identify promising ones.
We’ve seen that statistical significance and model fit statistics can’t
guarantee a fit as good as we’re looking for. With stepwise regression,
we found a model with unsatisfactory residuals until we considered extra terms.
With best subsets regression, we had to understand that there...
How Effective Are Flu Shots?
This flu season has been worse than normal. The Centers for Disease
Control and Prevention (CDC) data show that the flu has struck early and hard.
Influenza cases shot up during December rather than the more usual
January or February, and 47 states report widespread influenza cases.
I get a flu shot every year even though I know they’re not perfect.
I figure they’re a relatively easy and inexpensive way to reduce the
chance of having a miserable week.
I’ve heard on various news media that their effectiveness is about 60%.
But what does 60% effectiveness mean, exactly? How much does this...
Understanding Type 1 and Type 2 Errors from
the Feline Perspective: All Mistakes Are Not Equal!
Serving cat food? I sure hope you've set your alpha level high enough.
"Bad kitty!" That's a phrase you almost never hear, but even we cats make the
occasional mistake. I was reminded of this recently as I watched my human
trying to analyze some data. People frequently make mistakes when they test a
hypothesis with data analysis. Specifically, they can make either Type I or Type II errors.
When I first started reading my human's statistics textbooks a few years ago,
this idea seemed awfully silly to me. We cats appreciate being direct,
and you either get the answer correct or you don't. I...
Tip 3: Gain Confidence
with Confidence Intervals
New to confidence intervals? Here are some important things to keep in mind!
Confidence intervals are used to estimate population parameters (commonly the process mean,
standard deviation, % of defective units, or even capability indices).
Confidence intervals provide more meaningful information than any random sample
statistic for characterizing the population.
See “Tip 1: Every sample statistic is a little bit wrong.”
When your 95% confidence interval for the mean is (μlow, μhigh),
Testing the Equivalence of Instruments
I recently got a request from one of our Facebook fans to do a post about
orthogonal regression, which I admit is not a subject I’m very familiar with.
However, with a little help from Minitab’s help resources and by consulting
a few Minitab experts, I think I came up with a post that will be useful.
I thought it would help to discuss orthogonal regression with an example, but first...
What the Heck Is Orthogonal Regression?
Orthogonal regression is also known as “Deming regression” and examines
the linear relationship between two continuous variables.
It’s often used to test whether two...
Bewildering Things Statisticians Say:
"Failure to Reject the Null Hypothesis"
Subcultures have languages all their own. Teen gangs, statisticians, gamers,
music buffs, sports nuts, furries...all use terminology that baffles outsiders.
The arcane language helps identify kindred spirits: using the correct
phrase proves you belong. The proper buzzwords can gain you admittance
to the right professional circles...or the wrong biker bars. Maybe both.
Not knowing them can get you into serious trouble.
When you enter a dangerous place (like the data analysis arena),
you need at least a basic grasp of the jargon the local toughs use.
I'm not comparing any particular group of...
A Simple Guide to Gage
R&R for Destructive Testing
Measurement systems analysis (MSA) is essential to the success of any data
analysis. If you cannot rely on the tool you’re using to take measurements, then
why bother collecting data to begin with? It would be like trying to lose weight
while relying on a scale that doesn’t work. What’s the point in weighing yourself?
Minitab Statistical Software offers many types of tools that you can
use to assess your measurement system, including:
Parity in the NFL?
Nope! It’s the Sample Size!
It's almost Super Bowl Sunday, and this year’s matchup pits the Baltimore
Ravens against the San Francisco 49ers. The 49ers are no huge surprise,
as they were favored in both of their playoff games. However, the Ravens
had to win 3 games, pulling two major upsets along the way, to get to the
Super Bowl. It marks the 8th time in the last 10 years that a team that
played on Wild Card Weekend advanced the entire way to the Super Bowl.
This again shows how much parity there is in the NFL. It’s unpredictable!
Any team can win the championship!
Well...not quite. While I agree that the NFL playoffs are...
Violations of the Assumptions for Linear
Regression: Residuals versus the Fits (Day 3)
Lionel Loosefit has been hauled to court for violating the assumptions of
linear regression. On Day 3 of the trial, the court examines the allegation that
the residuals in Mr. Loosefit's model exhibit nonconstant variance. The
defendant’s mother, Mrs. Lottie Loosefit, has taken the stand on behalf of her son.
Defense Attorney: So, Mrs. Loosefit, from what you’ve described to us,
your son, Lionel, appears to have been a model child.
Lottie Loosefit [eyes watering]: He was every mother’s dream.
He brushed his teeth every morning and every night, made his bed,
folded his socks, picked up all his...
Super Bowl Ticket Prices
Tickets to attend the Super Bowl are among the most coveted sports
tickets in the world, and the high average price for even nose-bleed seats
illustrates how low supply and high demand can result in astronomical ticket pricing.
After last year’s Super Bowl, the Bleacher Report published this article that
ranked the average ticket price of every Super Bowl since 1966:
Can you believe that the average cost for a ticket in 1966 was only $12.00?
The pricing for the 2013 Super Bowl tickets range...
The Best Super Bowl
Commercials of 2013, Plotted in Minitab
Various commercials valiantly vied for the attention and dollars of
Super Bowl commercial fans on Sunday. The game was decided objectively
(or by the referees), but the drama of which commercials won lives on.
Because we love data analysis, I gathered a little bit to see which efforts
really stood out. Then I plotted the data in Minitab to explore the results.
Three top-ten lists attracted my attention:
Flu Shot Followup: Assessing the
Long-Term Benefits of Flu Vaccination
In my last post, I wrote about the 60% effectiveness rate for flu shots that
news media commonly report. The effectiveness is actually a relative measure
of the reduction in your flu risk if you’re vaccinated. Relative measures
are hard to interpret without additional information. With that in mind,
I reanalyzed the data to put it in absolute terms. I found that if you get a flu shot,
your average annual risk of getting the flu drops from 7.0% to 1.9%,
which is a reduction of 5.1 percentage points.
I’ve received several requests to look at this over a longer timeframe.
After all, flu shots aren’t a one-time thing....
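To get a feel for the longer timeframe, here is a minimal Python sketch that compounds the two annual risks quoted above (7.0% unvaccinated, 1.9% vaccinated). It assumes every flu season is independent and that the annual risk never changes, which is a simplification of my own and not necessarily how the full analysis treats it.

```python
UNVACCINATED_ANNUAL_RISK = 0.070  # figure quoted in the post
VACCINATED_ANNUAL_RISK = 0.019    # figure quoted in the post

def cumulative_risk(annual_risk: float, years: int) -> float:
    """Chance of catching the flu at least once in `years` seasons.

    Assumes independent seasons with a constant annual risk.
    """
    return 1 - (1 - annual_risk) ** years

# Absolute reduction for a single season, in probability units.
absolute_reduction = UNVACCINATED_ANNUAL_RISK - VACCINATED_ANNUAL_RISK

for years in (1, 5, 10, 20):
    without_shot = cumulative_risk(UNVACCINATED_ANNUAL_RISK, years)
    with_shot = cumulative_risk(VACCINATED_ANNUAL_RISK, years)
    print(f"{years:>2} years: {without_shot:.1%} unvaccinated vs {with_shot:.1%} vaccinated")
```

The gap between the two cumulative risks widens as the years add up, which is why a small-looking annual difference can matter over a lifetime of flu seasons.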
They Call Them
"Free" Throws For a Reason
When Penn State guard Jermaine Marshall stepped to the line to take
two free throws with 0:27 remaining against Ohio State, it didn’t really
matter whether he made the shots. The game was already out of reach,
and although the Nittany Lions would attempt to foul their way into a
miracle victory, most of the fans were all too aware that Penn State was
now 0-8. That Marshall then missed both free throws was the exclamation
point on a night where the team made just 13 of 22 free throw attempts.
Lest you not already know this,
a "free throw" is a shot taken against no defense, a shot that likely...
A Story-based Approach to
Learning Statistics (Statistical Software)
Want to learn more about analyzing data? Try taking a page from Aesop's book.
Well...really, I'm suggesting taking multiple pages from Minitab's
book, but my suggestion stems from an idea that Aesop epitomizes.
Aesop was no fool. When he wanted to convey even the heaviest of lessons,
he didn't waste time detailing the intellectual and philosophical arguments
behind them. He didn't argue, cajole, or berate. He didn't lecture or pontificate.
He told a story.
Minitab uses the same approach in Meet Minitab, the introductory
guide to data analysis and quality statistics using our statistical...
Cost of Quality: Well, hello,
my old, complex friend….
As I sat down to examine Cost of Quality (COQ) at Minitab, I flashed back
to my CQE exam almost 20 years ago. I can still vividly remember staring
down at a particularly difficult Cost of Quality question and wondering why
I didn’t just follow my 4th-grade career assessment and become a novelist.
Mental note: a study of the correlation between 4th grade career
assessments and actual career paths would make for an interesting blog post.
I briefly considered fleeing the building,
buying a bottle of absinthe and channeling Hemingway.
But then I remembered that, even with absinthe, I was no...
Sweet on Valentine's Day
Planning on giving a bag of M&M's to your sweetie this Valentine's Day?
Well, you can woo your Valentine with not only the gift of candy,
but also the statistics behind those candy-coated chocolate pieces.
Are there equal amounts of each color in a bag?
You can record your counts of each color in the bag in a Minitab worksheet,
and then use a pie chart (Graph > Pie Chart) to visualize the counts:
There were 138 blue M&M’s and only 63 red M&M’s in our sample.
But is the difference between these counts statistically significant?
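As a rough check outside Minitab, the two counts quoted above can be run through a chi-square goodness-of-fit calculation in plain Python. This sketch compares only blue and red, and it assumes the null hypothesis that the two colors are equally likely; a complete analysis would include every color in the bag. The function names are mine, for illustration only.

```python
import math

def chi_square_equal_proportions(counts):
    """Chi-square statistic for the hypothesis that all categories are equally likely."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)

def p_value_one_df(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    With 1 degree of freedom the statistic is a squared standard normal,
    so the tail probability equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2))

blue, red = 138, 63  # counts from the sample in the post
statistic = chi_square_equal_proportions([blue, red])
p_value = p_value_one_df(statistic)

print(f"chi-square = {statistic:.3f}, p-value = {p_value:.2g}")
```

A p-value below the usual 0.05 alpha level would suggest that the gap between the blue and red counts is larger than chance alone would explain.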
Performing DOE for Defect Reduction
Lean Six Sigma and process excellence leaders are often asked to
“remove defects” from products and processes. This can be quite a challenge!
Lou Johnson, senior Minitab technical trainer and mentor, has some tips
that might help if you’re faced with this situation. I had the chance to talk with Lou,
and here’s what he shared with me about how to first approach a DOE.
How to Approach a DOE
Before jumping into a Design of Experiment (DOE) for defect reduction,
Lou suggests stepping back and thinking first about what issue is likely
causing the problem. If you need help thinking about what might...
Is This the Craziest College
Basketball Season Ever?
The last few weeks have been pretty crazy in college basketball. In the
first 13 days of February, nine different teams ranked in the Top 10
have lost. And had Duke not squeaked by Boston College last Sunday,
it would have been the first time since 1992 that every team ranked in the
AP Top 5 had lost in a single week.
All of this has led to analysts saying that the parity in college basketball
is greater than it’s ever been. And while it might seem that way,
it’s always best to perform a data analysis to confirm whether your
claims are true. Have there really been more Top 10 upsets this year...
Violations of the Assumptions for Linear
Regression: Closing Arguments and Verdict
Lionel Loosefit has been hauled to court for violating the assumptions of
regression analysis. On the last day of the trial, the prosecution and defense present
their closing arguments. And the fate of Mr. Loosefit is decided by judge and jury...
The Prosecution's Summary
Prosecutor: Ladies and gentlemen, we’ve presented a slew of evidence in this trial.
You’ve seen, with your own eyes, every possible heinous violation of the assumptions
for regression in the defendant’s model. Here’s what we’ve shown, in a nutshell:
Prosecutor: We’ve carefully delineated each violation with specific graphic...
3 Common (and Dangerous!)
Statistical Misconceptions
Have you ever been a victim of a statistical misconception that’s affected
how you’ve interpreted your analysis? Like any field of study, statistics
has some common misconceptions that can trip up even experienced
statisticians. Here are a few common misconceptions to watch out for
as you complete your analyses and interpret the results.
Mistake #1: Misinterpreting Overlapping Confidence Intervals
When comparing multiple means, statistical practitioners are sometimes
advised to compare the results from confidence intervals
and determine whether the intervals overlap. When 95%...
Where to find meteorites,
the Pareto chart way
It’s an amazing thing when a mass of rock and iron streaks through
space and enters Earth’s atmosphere. So naturally, the Chelyabinsk
meteor has attracted a great deal of attention. We’re fascinated by
the images and captivated by the stories. And, if you’re interested in
statistical analysis, you start to wonder a little bit about meteorites.
The nice thing is that the Meteoritical Society has a large database
with information about meteorites recovered on Earth.
The database has over 50,000 records.
It’s particularly neat to see where people find meteorites with...
Why Statistics Is Important
"There are three kinds of lies: lies, damned lies, and statistics."
I’m sure you’ve heard this most vile expression, which was popularized
by Mark Twain among others. This dastardly phrase impugns the
reputation of statistics. The implication is that statistics can bolster
a weak argument, or that statistics can be used to prove anything.
I’ve had enough of this expression, and here’s the rebuttal! In fact,
I’ll make the case that statistics is not the problem, but the solution!
Mistakes Can Happen
First, let’s stipulate that an unscrupulous person can intentionally...
Lightsaber Capability Analysis:
Is Our Process In Control?
In my last post, we talked about using statistical tools to identify the right
distribution of our lightsaber manufacturing data. Now that we have our
data in Minitab along with a specific distribution picked out, we can
find out if we are dealing with an in-control process. If the process is not
in control, the capability estimates will be incorrect. Thus, an extremely
important (and often overlooked) aspect of Capability Analysis is to
make sure our process is first in control. We can do this with a tool
Minitab Statistical Software offers called the Capability Sixpack™.
First, let’s go to S...
For Want of an FMEA, the Empire Fell
For want of a nail the shoe was lost,
For want of a shoe the horse was lost,
For want of a horse the rider was lost
For want of a rider the battle was lost
For want of a battle the kingdom was lost
And all for the want of a horseshoe nail. (Lowe, 1980, 50)
According to the old nursery rhyme, "For Want of a Nail," an entire kingdom was
lost because of the lack of one nail for a horseshoe. The same could be said
for the Galactic Empire in Star Wars. The Empire would not have fallen if the
technicians who created the first Death Star had done a proper Failure Mode and...
Forget Statistical Assumptions -
Just Check the Requirements!
One of the most poorly understood concepts in the use of statistics is the idea
of assumptions. You've probably encountered many of these assumptions,
such as "data normality is an assumption of the 1-sample t-test."
But if you read that statement and believe normality is a requirement of the
1-sample t-test, then you have missed a subtle and important
characteristic of assumptions and need to read on...
An "assumption" is not necessarily a "requirement"!
To understand where this idea of assumptions comes from,
let's forget about statistics for a minute and imagine we sell bikes online. We...
How to Use Value Stream
Maps in Healthcare
While value stream mapping, or VSM, is a key tool used in many Lean
Six Sigma projects for manufacturing, it’s also widely used in healthcare.
Value stream mapping can help you map, visualize, and understand
the flow of patients, materials (e.g., bags of screened blood or plasma),
and information. The “value stream” is all of the actions required to
complete a particular process, and the goal of VSM is to identify
improvements that can be made to reduce waste (e.g., patient wait times).
How is VSM applied to healthcare?
When used within healthcare, one obvious application for VSM is mapping a...
Helping Beginners Learn about
Process Variation using Miles Per Gallon
One of the things that I love most about my job is that I get to help educate, coach,
and develop others on topics such as continuous improvement and data analysis.
In that capacity, one of the most frequently seen challenges is that team members
and managers want to react to every data point. Their intentions are noble –
but doing so is almost always an unnecessary exercise since these
variations are a normal part of how the process behaves.
I’ve used lots of different examples to illustrate this point,
but few seemed to resonate deeply with them and get them...
Basketball Statistics Question:
How Important Is a Team's "Momentum"
Heading into the NCAA Tournament?
It’s March, which means it’s the time of year when the country's sports fans
focus their gaze upon college basketball. And since there are still a few weeks
until the brackets come out, people will be trying to determine which teams
are poised for a deep run in the tournament. One of the criteria people use
to determine a team's potential is “momentum.” Everybody says you want
your team to be “peaking at the right time.” But is this really important?
We just saw the Baltimore Ravens win the Super Bowl despite losing
4 of their final 5 regular-season games.
What Makes Great
Presidents and Good Models?
If the title of this post made you think you’d be reading about
Abraham Lincoln and Tyra Banks, you’re only half right.
A few weeks ago, statistician and journalist Nate Silver published
an interesting post on how U.S. presidents are ranked by historians.
Silver showed that the percentage of electoral votes that a U.S. president
receives in his 2nd term election serves as a rough predictor
of his average ranking of greatness.
Here’s the model he came up with, which I’ve duplicated in Minitab using
the scatterplot with regression and groups (Graph > Scatterplot):
My Work in Statistics: Developing
New Tools for Analyzing Data
In honor of the International Year of Statistics, I interviewed Scott Pammer,
a technical product manager here at Minitab Inc. in State College,
Pa. Scott works to develop new product concepts and the accompanying
prototypes and business plans.
Before taking on the role of technical product manager, Scott worked for
Minitab as a senior statistician. In this role, he designed and
programmed various features in Minitab Statistical Software.
He’s been with Minitab since 1995.
What was your journey to becoming a statistician?
How to Win an Oscar
(If You Misunderstand Statistics)
Statistician-to-the-Stars William Briggs deserves credit for his correct prediction
of the Best Picture Oscar the day before the ceremonies. And while
Mr. Briggs would never encourage anyone to misuse his model this way,
I feel my statistics heartstrings strummed by the desire to remind everyone
about a particular common and dangerous statistical mistake:
Correlation does not = causation.
Mr. Briggs correctly predicted Argo would be selected as Best Picture from
among the nominated films and noted that "The key reasons for
its victory will be: the lead actor is at least forty, the other...
Using Data Analysis to Assess Fatality
Rates in Star Trek: The Original Series
I’m a Star Trek fan and a statistics fan. So, I’m thrilled to finally have the
opportunity to combine the two into a blog post! In the original Star Trek
series with Captain Kirk, the crew members of the U.S.S. Enterprise who
wear red shirts have a reputation for dying more frequently than those who
wear blue or gold shirts. Wearing a red shirt appears to be the kiss of death!
In this blog, we’ll conduct several hypothesis tests to determine whether this is true.
Matthew Barsalou published an article in Significance that studies
this from a statistical perspective. Barsalou is also a guest...
Why the Weibull Distribution
Is Always Welcome
In college I had a friend who could go anywhere and fit right in.
He'd have lunch with a group of professors, then play hacky-sack with
the hippies in the park, and later that evening he'd hang out with the
local bikers at the toughest bar in the city. Next day he'd play pickup football
with the jocks before going to an all-night LAN party with his gamer pals.
On an average weekend he might catch an all-ages show with the
small group of straight-edge punk rockers on our campus,
or else check out a kegger with some townies,
then finish the weekend by playing some D&D with his friends from the...
Build a DIY Catapult for DOE
(Design of Experiments), part 1
I needed to find a way to perform experiments to practice using design of
experiments (DOE), so I built a simple do-it-yourself (DIY) catapult.
The basic plan for the catapult is based on the table-top troll catapult from
My catapult is not as attractive as the troll catapult; my goal was to build a
catapult with multiple adjustable factors—and not to lay siege to a castle—
so I don’t mind the rough appearance of my catapult.
The frame consists of two pieces of 40 cm x 4 cm x 2 cm wood, two...
Build a DIY Catapult for DOE
(Design of Experiments), part 2
In my last post, I shared my plans for building a simple do-it-yourself
catapult for performing experiments to practice using design of experiments (DOE).
That's the completed catapult there on the right.
If you want to build your own, here are my plans and instructions in a PDF.
Now that my catapult is built, I have one last step to complete: to find
the optimal catapult setting using DOE, which I'll do with Minitab.
(If you'd like to follow along but don't already have it,
please download the 30-day free trial of Minitab.)
Gage Study With One Part
Recently, Minitab News featured an article that talked about how to perform
a Gage R&R Study with only one part. This prompted many users to contact
our technical support team with questions about next steps, like these:
What can I do with the output of a Gage study with only one part?
How can I use the variance component estimates to obtain meaningful
information about my measurement system?
By themselves, the variance component estimates from the ANOVA
output for a Gage study with just one part are not particularly useful.
However, if we combine what we’ve learned about the variance for...
Why Isn't This "Six Sigma"
Project Improving Quality?
Whether you're a quality improvement veteran or you're just starting to do
research about what quality improvement methods are available today,
you've seen headlines and articles that explain why Six Sigma and other
data-driven quality improvement methods don't work.
Typically these pieces have an attention-grabbing headline, like Six Sigma
Initiative Fails to Save the Universe, followed by a dissection of a deployment
or project that failed—usually in spectacular fashion—to achieve its goals.
"There!" the writer typically crows. "See? It's obvious
Six Sigma doesn't work!" What makes these...
Using Minitab to Choose the Best
Ranking System in College Basketball
Life is full of choices. Some are simple, such as what shirt to put on in the morning
(although if you’re like me, it’s not so much of a “choice” as it is throwing on the
first thing you grab out of the closet). And some choices are more complex.
In the quality world, you might have to determine which distribution to
choose for your capability analysis or which factor levels to use to bake
the best cookie in a design of experiments. But all of these choices
pale in comparison* to the most important decision you have to make each year:
which college basketball teams to pick during March...
Predicting the 2013
NCAA Tournament with Minitab
Did everybody have a good Selection Sunday? Hopefully you did, and now
you’re ready to jump into the brackets. And just like last year, Minitab is
here to help you along! But first we have to wait for ESPN to stop yelling
about who the 68th best team in the country is. I mean, honestly,
you think they would have learned their lesson two years ago
when they were adamant about what a travesty it was that VCU got
an at-large bid. You know, the same VCU team that then went to the Final Four.
Maybe next year Virginia should try not losing to Delaware at home.
Oh, what’s that? Dick Vitale finally...
Rethinking the Obvious: How Data Analysis
and Diagrams Can Upend Conventional Wisdom
Has it happened to you?
You organize a brainstorming session to begin analyzing your process.
At the kick-off meeting, several people sit with arms crossed, lips pursed,
eyes cast downward. Frequently, they’re the ones who’ve worked at the
process for most of their professional lives.
“Here we go again. Wasting time to prove the obvious,” their faces say. “I’ve done
my job for years. You’re not going to show me anything I don’t already know.”
Yet you bravely push forward. Every now and then you see someone roll their
eyes. “When can I get back to my desk and do some real work?!!!” they seem to...
If You Don't Try Minitab's Project
Manager, You'll Hate Yourself Later
Normally, I like to talk about fun statistical things to build your confidence:
gummi bears, poetry, and movies, just to name a few. But building your
confidence also means getting comfortable with Minitab Statistical Software.
One of the features that makes it easy to view your results and data in a
snap is Minitab's Project Manager.
My favorite way to use the Project Manager is through the toolbar:
Click the leftmost button once, and you see all of the output in your project.
Click the second button once, and you see all of the worksheets in your project.
Click the third button once, and...
Processes with Lean Six Sigma
I had the privilege of talking with Sue Schlegel, Lean Six Sigma black belt and
quality improvement mentor at White Sands Missile Range, which is located
just outside of Las Cruces, New Mexico. Schlegel and an improvement
team at White Sands recently conducted a Lean Six Sigma project to
streamline surveillance processes and they used Minitab to analyze the data.
We found Sue’s story an interesting case study for a LSS project,
so I thought I would share it with you here on the blog, too.
Reducing Work Hours
When clients request classified video surveillance missions, the White Sands Missile...
Choosing Statistical Software:
Four Questions You Should Ask
Data. Analysis. Statistics. It seems like everybody is talking about the importance
of doing data analysis, whether it's analytics for predicting consumer behavior
or looking at critical metrics for Six Sigma and other data-driven quality
improvement programs. Not only do we have more data available to us
than ever before, we're also blessed...and/or cursed...with an enormous
range of software options to help us make sense out of all this data
we're trying so hard to understand.
Your options for doing data analysis run the gamut—from a pencil,
paper and calculator costing a couple of bucks...
Getting Started with Factorial
Design of Experiments (DOE)
When I talk to quality professionals about how they use statistics, one tool
they mention again and again is design of experiments, or DOE. I'd never
even heard the term before I started getting involved in quality improvement
efforts, but now that I've learned how it works, I wonder why I didn't learn
about it sooner. If you need to find out how several factors are affecting
a process outcome, DOE is the way to go.
Somewhere in school you probably learned, like I did, that when
you do an experiment you need to hold all the factors constant
except for the one you're studying. That seems simple...
When is Easter . . .
for the next 2086 years?
Spring is in the air, and Easter is coming up soon! Easter occurs on
March 31, 2013, and I’ve heard people exclaim that it’s early this year.
I never really remember the date of Easter from one year to the next,
but I had vague memories of it being in March not too long ago.
Like any good statistician, I started wondering about the distribution
of Easter dates. What dates are more common and which are less common?
Is Easter in March really that unusual?
Even after reading the official definition of when Easter occurs,
I still wasn’t clear about the date range. Easter occurs on the Sunday that...
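If you want to explore the dates yourself, here is a small Python sketch. It uses the widely published "anonymous Gregorian" computus algorithm, which is my choice for illustration and not necessarily the method behind the analysis in this post. The year range in the tally is also just an example.

```python
from collections import Counter

def easter_date(year: int) -> tuple[int, int]:
    """Return (month, day) of Easter Sunday in the Gregorian calendar.

    Implements the anonymous Gregorian computus algorithm.
    """
    a = year % 19                      # position in the 19-year lunar cycle
    b, c = divmod(year, 100)           # century and year within the century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day_index = divmod(h + l - 7 * m + 114, 31)
    return month, day_index + 1

print(easter_date(2013))  # (3, 31), matching the March 31, 2013 date above

# Tally how often Easter lands in March (3) versus April (4) for 2013-2100.
month_counts = Counter(easter_date(year)[0] for year in range(2013, 2101))
print(month_counts)
```

Swapping the month tally for a tally of full (month, day) pairs gives the distribution of individual dates, which is the question raised above.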
Real-life Data Analysis: How Many Licks
to the Tootsie Roll Center of a Tootsie Pop?
Almost all of us have tried a Tootsie Pop at some point. I’m willing to bet
that most of us also thought, “I wonder how many licks it does take to get to
the center of the Tootsie Pop?” If you haven’t wondered about this, here’s
the classic commercial that may get you more curious:
Personally, I was not very satisfied with the owl's answer of “3,”
so I decided to continue the little boy’s quest to find the number
of licks required to reach the center of a Tootsie Pop.
Looking around the ‘net, I found that other studies done by student researchers at various...
What Statistical Software Should You
Choose: Three More Critical Questions
Earlier I wrote about four important questions you should ask if you're looking at
using statistical software to analyze data in your organization, especially
if you're hoping to improve quality using methods like Six Sigma.
But there are other points to consider as well. If you're in the market
for statistical software, be sure to investigate these questions, too!
What Types of Statistical Analysis Will They Be Doing?
The specific types of analysis you need to do could play a big part in
determining the right statistical software for your organization.
The American Statistical Association's softwa...
The Glass Slipper Story: Analyzing the
Madness in the 2013 NCAA Tournament
Cinderella showed up early and often during the first weekend of the 2013
NCAA Tournament. Florida Gulf Coast stole the show with their glass slippers,
becoming the first ever 15 seed to reach the Sweet 16. But don’t let that
overshadow what happened in the West Region: Wichita St and La Salle
both arrived in a pumpkin-turned-carriage, and now the Shockers are a
game away from the Final Four! And don’t forget about Harvard just because
the clock struck midnight on them first. They were at the ball, too! Madness indeed.
In the world of statistics, we have another word for this “madness.” It’s...
How to Prove You're (a)
Case Using Statistics
I enjoy using Minitab Statistical Software to uncover the vast causal
relationships unfolding in the universe all around me.
What kind of novel things have I proven with Minitab?
Almost anything you can imagine, mon petite shoe.
For example, the fitted line plot below clearly shows one thing:
it’s time for our political parties to stop all the bickering and finally give
Americans what we really want… …a much taller president!
(See the dot way up at the top of the plot?
That’s George Washington, the Father of our Country. He was one of...
My Job as a Minitab Statistician
In honor of the International Year of Statistics, I interviewed Rob Kelly,
a senior statistician here at Minitab Inc. in State College, Pa.
Rob designs features you see in Minitab Statistical Software, and the focus
of much of his work is on Design of Experiments. Check out what Rob
had to say about how he became interested in pursuing a career in statistics.
1. What kind of reaction do you get when you tell people
you are a statistician?
I find that a lot of people have some experience with statistics,
but I usually get a couple of different reactions.
There are the people who immediately tell...
Use Minitab to Graph
I have a good time putting together simple data sets that you can use to
build your confidence in statistics. But I tend to like fairly old things:
Shakespeare (1564-1616), Poe (1809-1849) and gummi bears
(invented 1922). But I have some modern interests too. One of those,
appearing in about 2009, is Minecraft.
If you like Minecraft, then here’s a data set that you can use to practice
a few things in Minitab Statistical Software. One of the nicest things about
Minitab is that even with this spreadsheet, saved in Google Docs,
you can copy and paste directly into Minitab.
Great Presidents Revisited: Does
History Provide a Different Perspective?
Recently, Patrick Runkel blogged about using regression models to explain
how historians ranked the U.S. presidents. Given both that I love regression
and that I’ve written about using regression to predict U.S. presidential elections,
I wanted to take Patrick up on his challenge to improve upon his model.
My goal isn’t merely to predict the eventual ranking for any President. Instead,
I’m much more interested in a fascinating question behind this analysis.
Is the public’s contemporary assessment of the president consistent with
the historical perspective, or do they differ?
With this in mind,...
Lightsaber Capability Analysis:
Here at the lightsaber factory, we've completed several
steps in doing a capability analysis:
We’re getting close to our deadline, and it’s finally time to carry out our
Capability Analysis and see if we are manufacturing our lightsabers
to the correct specifications as set forth by the Jedi Temple.
First, let’s go to Stat > Quality Tools > Capability Analysis > Normal.
Learning Process Capability
Analysis with a Catapult, part 1
We can use a simple catapult to teach process capability analysis using
Minitab Statistical Software’s Capability Sixpack™. Here's how.
A process capability analysis is performed to determine if a process
is statistically capable. Based on the results of the capability study,
we can estimate the number of defective components the process would produce.
However, a process must be in statistical control and have a normal distribution.
A process that is not in statistical control must be brought in control
before the capability analysis is performed. In addition,...
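To make "statistically capable" a little more concrete, here is a minimal sketch of the two most common capability indices, Cp and Cpk. The distances and specification limits are invented, and the code uses the overall sample standard deviation as a simplification; statistical packages typically estimate within-subgroup variation for Cp and Cpk, so their numbers can differ.

```python
from statistics import mean, stdev

def capability_indices(data, lsl, usl):
    """Return (Cp, Cpk) using the overall sample standard deviation."""
    mu = mean(data)
    sigma = stdev(data)
    cp = (usl - lsl) / (6 * sigma)                # spec width vs. process spread
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)   # penalizes an off-center process
    return cp, cpk

# Hypothetical catapult distances (inches) and hypothetical spec limits.
distances = [98.2, 101.5, 99.7, 100.4, 102.1, 97.9, 100.8, 99.3]
cp, cpk = capability_indices(distances, lsl=94.0, usl=106.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Both indices assume the process is stable and the data are roughly normal, which is exactly why those checks come first.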
Learning Process Capability
with a Catapult, part 2
Process capability analysis using Minitab Statistical Software’s Capability
Sixpack™ can be taught using a catapult. A process capability analysis
is performed to determine if a process is statistically capable.
In my last blog post, I collected data from a first run of catapult results
and found that the run not only had a large amount of variability,
it also violated the assumption of normality. Now it's time to do a second run.
The Second Run and Capability Analysis
A second run was performed using thicker and more robust wire
to stretch the rubber band;...
Status Reports: Reduce Your
Cycle Time using Minitab Macros!
Across all industries, there are many different ways professionals utilize
Minitab Statistical Software to improve the quality of their products and services.
You may be a professional in the health care industry who is interested
in monitoring the days between hospital-acquired infections (HAIs) using
rare event charts. You may be a professional in manufacturing who is
using a Pareto chart to evaluate the types of defects you are discovering
during the inspection process. Or you may be a product manager like me
who uses Minitab analytics to evaluate customer survey data.
But no matter the...
Lean, Six Sigma,
or Lean Six Sigma?
Due to recent comments on this blog post (scroll down to view the
comments section), I want to acknowledge that the definition of Lean
in this post is incomplete. The goal of this post wasn't to offer definitions
of Lean, Six Sigma, or any other methodology, but was rather to
state that the focus of improvement efforts should be on using all the
available tools, whether those be Lean or Six Sigma tools or both, to make
the necessary improvements. Thank you to those who left comments and
opinions. I appreciate your viewpoints and discussion on this topic. -Carly Barry
When I first started working...
The Top 10 (Statistically) Craziest Things that
Happened in the 2013 NCAA Tournament
In my previous blog post, I analyzed the madness in this year’s
NCAA tournament for games through the Sweet 16. I found that
it was one of the wackiest Sweet 16s ever. But things didn’t stop there—
the Final Four was pretty crazy, too, having two 4 seeds and a 9 seed!
So now that the tournament is over, I want to look back and see what were
(statistically speaking) the most unlikely things to have occurred.
Was it Florida Gulf Coast in the Sweet 16? Or Wichita State in the Final Four?
What about Wisconsin’s horrible shooting performance?
Let’s start analyzing the statistics to find out.
Benthic Invertebrates Gone Wild!
Using a Survey of Aquatic Bugs to Estimate Stream Quality
As we click, flip, and scroll through hundreds of sites and channels,
cruising for our daily dose of e-thrills, it’s easy to forget there’s a beautiful,
wild, creative universe right in our backyards.
I had the chance to experience a tiny part of that universe on a recent Saturday
afternoon, when a couple of friends, Yolanda and Monika, asked me if I wanted
to join them to monitor the water quality of the stream that runs in back of our house.
Yolanda and Monika are part of a large grassroots network
of volunteers who selflessly give their...
Enough Is Enough! Handling
Multicollinearity in Regression Analysis
In regression analysis, we look at the correlations between one or more
input variables, or factors, and a response. We might look at how baking
time and temperature relate to the hardness of a piece of plastic,
or how educational levels and the region of one's birth relate to annual income.
The number of potential factors you might include in a regression
model is limited only by your imagination...
and your capacity to actually gather the data you imagine.
But before throwing data about every potential predictor under the sun
into your regression model, remember a thing called multicollinearity...
When Should I Use Confidence Intervals,
Prediction Intervals, and Tolerance Intervals?
In statistics, we use a variety of intervals to characterize the results.
The most well-known of these are confidence intervals.
However, confidence intervals are not always appropriate.
In this post, we’ll take a look at the different types of intervals that are
available in Minitab, their characteristics, and when you should use them.
I’ll cover confidence intervals, prediction intervals, and tolerance intervals.
Because tolerance intervals are the least-known,
I’ll devote extra time to explaining how they work and
when you’d want to use them.
Positive in a Time of Tragedy
My holy of holies is the human body, health, intelligence, talent, inspiration,
love, and the most absolute freedom imaginable, freedom from violence and lies,
no matter what form the latter two take.
Normally, I write about subjects that are generally of interest to me when
I do a blog post: you may have seen the list before. So I’ll have to start
out this post with the admission that until I was leaving the office on Monday,
I had no idea that the Boston Marathon was going on that day.
I had no clue that the estimate of the number of spectators would be over...
Explaining Quality Statistics So My Boss Will
Understand: Measurement Systems Analysis (MSA)
As a teenaged dishwasher at a local eatery, I had a boss who'd never washed
dishes in a restaurant himself. I once spent 40 minutes trying to convince
him that forks and spoons should go in their holders with the business end up,
while knives should go in point-down. Whatever I said, he didn't get it.
We were ordered to put forks and spoons in the holders with the handles up.
The outraged wait staff soon made clear what I hadn't: you can't immediately
tell the difference between a fork and a spoon when all you can see is the handle!
Explaining that in the right way would have minimized wasted...
Using Binary Logistic Regression to
Investigate High Employee Turnover
Human resources might not be a business area where you’d typically expect
to conduct a Six Sigma project. However, Jeff Parks, Lean Six Sigma master
black belt, found the opportunity to apply Six Sigma to human resources while
leading quality improvement efforts at a large manufacturer of aerospace engine parts.
The manufacturer was suffering from high employee attrition, or turnover,
and struggled to understand why. With a DMAIC Six Sigma project,
Parks set out to work with the HR department to investigate
and reduce the high turnover rates.
In 2009, the manufacturer had normal attrition rates...
Nonparametrics & Symmetry Plots
“Shall I compare thee to a standard normal distribution?
Thou art more symmetric and more bell-shaped…” — Melvin Shakespeare
(William’s lesser-known statistician brother) The Greek philosopher Aristotle
believed that symmetry was one of the primary elements of the universal ideal
of beauty. Over 2000 years later, emerging research seems to bear him out.
Studies suggest we tend to be more attracted to people with symmetrical bodies.
Using motion-capture technology to record the movements of people dancing to
a popular song, one recent study concluded that we even prefer those who dance...
Leveraging Designed Experiments
(DOE) for Success
You know the drill…you’re in Six Sigma training and you’re learning
how to conduct a designed experiment (DOE). Everything is making sense,
and you’ve started thinking about how you’ll apply what you are learning
to find the optimal settings of a machine on the factory floor. You’ve even
got the DOE setup chosen and you know the factors you want to test …
Then … BAM! … You’re on your own and you immediately have
issues analyzing the data. The design you’ve chosen might actually
not be the best for the results you need. It's a classic case of learning
something in theory that becomes much more...
Understanding Alpha Alleviates Alarm
One of the more misunderstood concepts in statistics is alpha,
more formally known as the significance level. Alpha is typically set before
you conduct an experiment. When the calculated p-value from a hypothesis
test is less than the significance level (α), the results of an experiment are so
unlikely to happen by chance that the more likely explanation is the results occur
because of the effect being studied. That the results are unlikely to happen
by chance is what we mean by the phrase “statistical significance,”
not to be confused with practical significance.
There was a wonderful example...
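The decision rule described above can be sketched in a few lines. The coin-flip numbers are made up, and the example assumes an exact one-sided binomial test of a fair coin; it is only meant to show the comparison between a p-value and a significance level chosen in advance.

```python
from math import comb

def upper_tail_p_value(successes, trials, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

alpha = 0.05                       # significance level, set before the experiment
p_value = upper_tail_p_value(16, 20)  # hypothetical result: 16 heads in 20 flips

print(f"p-value = {p_value:.4f}")
if p_value < alpha:
    print("Statistically significant at alpha =", alpha)
else:
    print("Not statistically significant at alpha =", alpha)
```

A significant result here says nothing about whether the effect is large enough to matter in practice.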
What Are the Effects of Multicollinearity
and When Can I Ignore Them?
Multicollinearity is a problem that you can run into when you’re fitting a regression
model, or other linear model. It refers to predictors that are correlated with other
predictors in the model. Unfortunately, the effects of multicollinearity can feel
murky and intangible, which makes it unclear whether it’s important to fix.
My goal in this blog post is to bring the effects of multicollinearity
to life with real data! Along the way, I’ll show you a simple
tool that can remove multicollinearity in some cases.
My goal in this blog post is to bring multicollinearity to life with real data about...
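One common way to put a number on correlated predictors is the variance inflation factor (VIF). The sketch below covers only the two-predictor case, where the VIF reduces to 1 / (1 - r^2) with r the correlation between the predictors. The data are invented, and this is an illustration of the diagnostic, not necessarily the tool the post goes on to describe.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r**2)

# Hypothetical predictor values.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(f"r = {pearson_r(x1, x2):.2f}, VIF = {vif_two_predictors(x1, x2):.2f}")
```

A VIF of 1 means the predictors are uncorrelated; larger values mean the coefficient estimates are less precise.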
Control Charts: Rational Subgrouping
and Marshmallow Peeps!
Control charts are used to monitor the stability of processes,
and can turn time-ordered data for a particular characteristic—
such as product weight or hold time at a call center—into a picture
that is easy to understand. These charts indicate when there are points
out of control or unusual shifts in a process.
Statistically speaking, control charts help you detect nonrandom
sources of variation in the data. In other words, they separate variation
due to common causes from variation due to special causes, where:
Explaining Quality Statistics So Your
Boss Will Understand: Pareto Charts
I once had a boss who had difficulty understanding many, many things.
When I need to discuss statistical concepts with people who don't
have a statistical background, I like to think about how I could
explain things so even my old boss would get it.
My boss and I shared a common interest in rock and roll, so that's the
device I'll use to explain one of the workhorses of quality statistics,
the Pareto chart. I'd tell my boss to imagine that instead of managing
a surly gang of teenaged restaurant employees, he's managing a surly
rock and roll band, the Zero Sigmas. The band did a 100-date tour...
Talking Design of Experiments (DOE)
and Quality at the 2013 ASQ World Conference
The 2013 ASQ World Conference is taking place this week in Indianapolis,
Indiana, and it's been a treat to see how our software was used in
the projects highlighted in many of the presentations. As a supporter
of the conference, a key event for quality practitioners around the world,
Minitab was proud to sponsor one of the presentations that
seemed to get a lot of attendees talking. Scott Sterbenz,
a Six Sigma leader from Ford Motor Company, delivered a presentation
entitled "Leveraging Designed Experiments for Success,"
which explained how to make designed experiments succeed with examples...
How to “Expand” Your Gage Studies
As we said in yesterday’s post, it’s been exciting for Minitab to be a
supporter of the ASQ World Conference on Quality and
Improvement taking place this week in Indianapolis.
There have been many great sessions and an abundance
of case studies shared that highlight how quality teams worldwide
are improving the performance of their businesses.
One session that generated a lot of interest from the conference
participants was conducted by Minitab trainers Lou Johnson,
Daniel Griffith and Jim Colton.
Their presentation, Sampling Plan for Expanded Gage R&R Studies,
covered Gage R&R studies and how...
The Diversity (and Consistency) of Quality
Improvement: the 2013 ASQ ITEA Presentations
I'm in the airport at Indianapolis, waiting to go home after three exciting days
at the 2013 American Society for Quality World Conference.
As I write this, it's Wednesday evening after the conference has closed,
and it turns out my flight has been delayed.
This could give me ample opportunity to muse about the quality issues
that might keep me from reaching central Pennsylvania tonight.
But I'm kind of pumped up, so I'm more interested in thinking about
what I've experienced and seen over the past few days. This is the
kind of event that makes you want to keep focusing on the positive, not...
Which Big Ten Division is Better?
After another round of what seems like endless conference realignment,
the Big Ten has settled on 14 teams split into two divisions: East and West.
However, with the likes of Ohio State, Penn State, Michigan,
and Michigan State, the East division appears to be much stronger.
In fact, Indiana athletic director Fred Glass called it the “Big Boy Division,”
and Penn State coach Bill O’Brien referred to it as “Murderers' Row.”
But will the statistics back up their claims? After all, it’s easy to
spout off any opinion you want. I could claim that the Sun Belt
is a better football conference than the...
Get Your Way, Every Time: 7 Default Settings
in Minitab You Didn’t Know You Could Change
Unless you’re 3 years old, you probably can’t have
things just the way you want them all the time.
You can’t always have peanut butter and ranch dressing on your toast.
Or ketchup on your pineapple. Or sugar sprinkles on your peas.
But there is one small arena in life over which you can still exert your control.
Tools > Options in Minitab's statistical software allows you to change selected
default settings in the software, without having to throw a temper tantrum first.
This powerful, underutilized feature in Minitab may save you from the
inconvenience of having to change a default setting...
Using Games to Teach Statistics
We usually think of games as a distraction—just something we do for fun.
However, growing evidence suggests that games can do much more,
especially when it comes to learning in a classroom setting.
Because statistics is a topic that doesn’t come easily to most,
using properly designed games to teach statistics can become a valuable
tool to spark interest and help explain difficult concepts.
So what kinds of “properly designed” games are we talking about here?
Not traditional board games like Monopoly or Chutes and Ladders,
but interactive computer games—the types of games younger generations...
Planning Summer Fun
with Decision Matrix Tools
Normally, I tell you about ways to practice with Minitab Statistical Software
so that you can boost your confidence with statistical analysis. But over the
last few days in my house, we’ve been planning some activities for the family.
That planning has given me a chance to have some fun with Quality Companion.
Quality Companion is a substantial piece of software: everything that you need
to manage a quality improvement project in one application.
Quality Companion provides project management tools
so that you can make and communicate decisions.
My favorite tools in Quality Companion, with...
Expanding the Role of Statistics to Areas
Traditionally Dominated by Expert Judgment
Should this doctor consult a regression model?
In a previous post, I wrote about how the field of statistics is more important
now than ever before due to the modern deluge of data. Because you’re reading
Minitab's statistical blog, I’ll assume that we’re in agreement that statistics
allows you to use data to understand reality. However, I’d also bet that you’re
picturing important but “typical” statistical studies, such as studies where
Six Sigma analysts determine which factors affect product quality.
Or perhaps medical studies, like determining the effectiveness of flu shots.
In this post,...
Explaining Quality Statistics So Your Boss
Will Understand: Weighted Pareto Charts
Failure to properly calibrate this machine will result in defective rock and roll.
In my last post, I imagined using the example of a rock and roll band --
the Zero Sigmas -- to explain Pareto charts to my music-loving but
statistically-challenged boss. I showed him how easy it was to use a Pareto
chart to visualize defects or problems that occur most often, using the example
of various incidents that occurred on the Zero Sigmas' last tour.
The Pareto chart revealed that starting performances late was far and away
the Zero Sigmas' most frequent "defect," one that occurred every single night of...
Lean Six Sigma in Healthcare:
Improving Patient Satisfaction
For providers like Riverview Hospital Association, serving Wisconsin Rapids,
Wis. and surrounding areas, recent changes in the U.S. healthcare
system have placed more emphasis on improving the quality of
care and increasing patient satisfaction. “In this era of healthcare reform,
it is even more essential for providers to have a systematic method to
improve the way care is delivered,” says Christopher Spranger, director of
Lean Six Sigma and Quality Improvement at Riverview Hospital Association.
“We have had a Lean Six Sigma program in place for four years,
and we are continuously working on...
Has Your Minitab Had Its V8?
Have you ever seen those commercials where people are walking
at a slant because they haven't had their V8?
I was reminded of these ads recently when I had the opportunity
to visit one of our Minitab customers in Tampa, Fl. During the visit,
I presented a seminar on some Advanced Minitab Tips and Tricks*
using the same content we have presented in some of our free webinars.
One of the very first scenarios in the presentation walks through data
cleanup in your Minitab worksheet. As I started, I literally had to stop
the class to address the commotion this topic kicked up on one side of the...
Will the Weibull Distribution
Be on the Demonstration Test?
Over on the Indium Corporation's blog, Dr. Ron Lasky has been sharing
some interesting ideas about using the Weibull distribution in
electronics manufacturing. For instance, check out this discussion of how
dramatically an early first-failure can affect an analysis of a part or component
(in this case, an alloy used to solder components to a circuit board).
This got me thinking again about all the different situations in which
the Weibull distribution can help us make good decisions.
The main reason Weibull is so useful is that it's very flexible in fitting
different types of data, because it...
A Mommy’s Look at
Lyme Disease Statistics…
I spend a majority of my time entrenched in statistics. Using statistics.
Studying statistics. Developing and testing statistical software. Statistics guide
many of my decisions at work and in life. That’s the world of an engineer.
For this reason, you can imagine my surprise when my husband called me
at work on a bright, sunny June day in 2009 to tell me that our 4-year-old
daughter had been diagnosed with Lyme disease. That, to me,
seemed completely improbable. We live in a development in suburbia.
Our children don’t play deep in the woods. We don’t hike in the woods.
In accordance with the...
The Curious (Statistical)
Case of Marc-Andre Fleury
The Pittsburgh Penguins are in the midst of another Stanley Cup playoff run.
With a 3-1 lead over the Ottawa Senators, they are a mere 1 game away
from their 3rd Eastern Conference Final in 6 years. But it looks like
they will do so without starting goalie Marc-Andre Fleury.
After a string of disappointing playoff games, Fleury has been benched
and netminder Tomas Vokoun has been guarding the goal. And Vokoun
is playing so well that it doesn’t look like Fleury will see the ice anytime soon.
So what does this have to do with statistics?
Well, Fleury’s statistics tell the story of why he is on...
No Matter How Strong,
Correlation Still Doesn't Imply Causation
There's been a really interesting conversation about correlation and causation
going on in the LinkedIn Statistics and Analytics Consultants group.
This is a group with a pretty advanced appreciation of statistical nuances and
data analysis, and they've been focusing on how the understanding of
causation and correlation can be very field-dependent. For instance, evidence
supporting causation might be very different if we're looking at data from a clinical
trial conducted under controlled conditions as
opposed to observational economic data.
Contributors also have been citing some pretty...
Family Democracy, Summer Fun,
and the Ballot
Previously I wrote about using a decision matrix to help make a decision.
Matrices are nice tools for collecting your thoughts and visualizing a decision.
But complex decisions could involve collecting and synthesizing
input from a number of different people.
Quality Companion (Minitab's process improvement software)
uses ballots to let team members record their input to a decision matrix.
If you’ve already made the matrix, setting up the ballot is easy.
The ballot simplifies data collection and organization, even among
team members who are dispersed in space and time. You can follow along in...
Regression Analysis: How Do I Interpret
R-squared & Assess the Goodness-of-Fit?
After you have fit a linear model using regression analysis, ANOVA, or
design of experiments (DOE), you need to determine how well the model fits the data.
To help you out, Minitab statistical software presents a variety of goodness-of-fit statistics.
In this post, we’ll explore the R-squared (R²) statistic and some of its limitations,
and uncover some surprises along the way. For instance, low R-squared values
are not always bad and high R-squared values are not always good!
What Is Goodness-of-Fit for a Linear Model?
Definition: Residual = Observed value - Fitted value
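The residual definition above leads directly to R-squared, which compares the squared residuals with the total variation in the observed values. Here is a minimal sketch; the observed and fitted values are invented for illustration rather than taken from a real fitted model.

```python
def r_squared(observed, fitted):
    """R-squared = 1 - (sum of squared residuals) / (total sum of squares)."""
    residuals = [o - f for o, f in zip(observed, fitted)]  # observed - fitted
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum(r**2 for r in residuals)
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total

# Hypothetical observed values and the values a model predicted for them.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
print(f"R-squared = {r_squared(observed, fitted):.2f}")
```

Small residuals relative to the spread of the data push R-squared toward 1; it says nothing on its own about whether the model is biased.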
6 Simple Everyday Efficiency Tips
You Can Learn From Six Sigma
While it has been called the "million-dollar methodology" for the significant
investment sometimes required to deliver results, Six Sigma has
a wealth of practices that can be adapted to small and medium industries,
home businesses and even personal finances.
Organizations have used Six Sigma as a reliable part of the
quality improvement process since 1986. And while a large
Six Sigma project could cost anything from $1,000 to $1 million
in work-hours and other resources, the results of such projects
often far outweigh the investment. In addition to the direct...
Normal: The Kevin Bacon of Distributions
When you learned statistics, most of what you learned was
centered around the Normal distribution. Maybe you became close
friends and you later found out his birth name was Gaussian,
but either way you probably just call him Normal.
You might know Normal’s a pretty popular guy with plenty of relationships
with other distributions. There are some obvious connections,
like how e^Normal is Lognormal, but I thought I’d share some less obvious ones.
You probably already know that by subtracting his mean and
dividing by his standard deviation you get Standard Normal.
What if you squared Standard...
Lean Six Sigma in the Classroom: Preparing
Students for Careers in Quality Improvement
I recently had the opportunity to talk with Ken Jones,
professor of operations and supply chain management at Indiana State University,
about a business process improvement course he teaches at the university.
The course covers a variety of Lean Six Sigma tools and techniques
and gives students the opportunity to team with local businesses to complete
real quality improvement projects. Upon successful completion of the class,
students even become certified green belts.
One item we talked about was how valuable the experiential component
of the projects can be for students, especially...
Studying Old Dogs with New Statistical Tricks:
Bone-Cracking Hypercarnivores and 3D Surface Plots
A while back my colleague Jim Frost wrote about applying statistics to decisions
typically left to expert judgment; I was reminded of his post this week when
I came across a new research study that takes a statistical
technique commonly used in one discipline, and applies it in a new way.
The study, by paleontologist Zhijie Jack Tseng,
looked at how the skulls of bone-cracking carnivores--modern-day
hyenas--evolved. They may look like dogs, but hyenas in fact are more
closely related to cats. However,
some extinct dog species had skulls much like a hyena's.
Tseng analyzed data from 3D...
Studying Old Dogs with New Statistical Tricks
Part II: Contour Plots and Cracking Bones
Yesterday I wrote about how paleontologist Zhijie Jack Tseng
used 3D surface plots created in Minitab Statistical Software to
look at how the skulls of hyenas and some extinct dogs
with similar dining habits fit into a spectrum of possible skull forms
that had been created with 3D modelling techniques.
What's interesting about this from a data analysis perspective is how
Tseng took tools commonly used in quality improvement and engineering and
applied them to his research into evolutionary morphology.
We used Tseng's data to demonstrate
how to create and explore 3D surface plots yesterday, so...
The Lottery, the Casino,
or the Sportsbook: What’s Your Best Bet?
New Jersey Gov. Chris Christie is currently in a battle with sports leagues over
the issue of allowing sports betting at casinos in Atlantic City and horse
racing tracks across the state. If he wins and sports betting becomes legal
in New Jersey, it will open the door for other states to follow suit. It appears there
is a long way to go before this form of gambling spreads across the country.
But is sports betting really so much worse than casinos (which are
legal in just under half of all U.S. states) or the lottery
(which is legal in almost every U.S. state)? For the purposes of this...
What Is a t-test? And Why Is It Like Telling
a Kid to Clean Up that Mess in the Kitchen?
A t-test is one of the most frequently used procedures in statistics.
But even people who frequently use t-tests often don’t know exactly what happens
when their data are wheeled away and operated upon behind
the curtain using statistical software like Minitab.
It’s worth taking a quick peek behind that curtain.
Because if you know how a t-test works, you can understand what your
results really mean. You can also better grasp why your
study did (or didn’t) achieve “statistical significance.”
In fact, if you’ve ever tried to communicate with a distracted teenager,
you already have experience with...
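For a quick peek behind the curtain, here is a minimal sketch of the arithmetic in a one-sample t-test: the difference between the sample mean and the hypothesized mean, divided by the standard error of the mean. The data and hypothesized mean are made up, and turning the statistic into a p-value (which needs the t distribution) is left out.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, hypothesized_mean):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(data)
    difference = mean(data) - hypothesized_mean
    standard_error = stdev(data) / sqrt(n)
    return difference / standard_error

# Hypothetical measurements tested against a hypothesized mean of 8.
sample = [8, 10, 12]
t_value = one_sample_t(sample, hypothesized_mean=8)
print(f"t = {t_value:.3f} with {len(sample) - 1} degrees of freedom")
```

A larger difference, less variability, or more data all make the t-value bigger in absolute terms.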
The Lottery, the Casino, or the Sportsbook:
Simulating Each Bet in Minitab
I previously started looking into which method of gambling was your best bet:
an NFL bet, a number on a roulette wheel, or a scratch-off lottery ticket.
After calculating the expected value for each one, I found out that the NFL
bet and roulette bet were similar, as each had an expected value
close to -$0.50 on a $10 bet. The scratch-off ticket was much worse,
having an expected value of -$2.78.
But I want to see how each of these games could play out in real life.
After all, it is possible for people to come out ahead playing each game.
So I planned to take 300 people, split them into 3 groups (one...
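As a rough illustration of the expected-value figures quoted above, here is a minimal sketch in Python. The teaser does not state the bet terms, so the roulette payout (35 to 1 on a 38-pocket American wheel), the NFL odds (-110), and the 50% win probability are all assumptions made for this example. The scratch-off ticket is left out because its prize table is not given here.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Return the sum of probability * net payoff over all outcomes."""
    return sum(p * payoff for p, payoff in outcomes)


STAKE = 10  # dollars wagered on each bet

# Assumption: American roulette, 38 pockets, a single number pays 35 to 1.
roulette = [
    (Fraction(1, 38), 35 * STAKE),   # win: net gain of 35x the stake
    (Fraction(37, 38), -STAKE),      # lose: forfeit the stake
]

# Assumption: NFL bet at -110 odds (risk 110 to win 100), 50% chance to win.
nfl = [
    (Fraction(1, 2), Fraction(100, 110) * STAKE),
    (Fraction(1, 2), -STAKE),
]

roulette_ev = float(expected_value(roulette))
nfl_ev = float(expected_value(nfl))
print(f"Roulette: {roulette_ev:.2f}")
print(f"NFL bet:  {nfl_ev:.2f}")
```

Under these assumptions both bets come out close to -$0.50 per $10 wagered, in line with the figures in the post.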
The Lottery, the Casino, or the
Sportsbook: Who Came Out Ahead?
Have you heard about the Tennessee man who has 22 children with
17 different women? He was interviewed the other day, and when asked
how he supports all his kids he was quoted as saying:
"I'm just hoping one day I'll get lucky and might scratch off the numbers or
something. I play the hell out of the Tennessee lottery."
Well, what would it look like if a person really did play "the hell" out of the lottery?
Say you spent a year buying one $10 scratch-off ticket each day.
How likely would you be to come out ahead? And for that matter,
how would the lottery compare to making a $10 sports bet or a...
Multiple Regression Analysis: Use
Adjusted R-Squared and Predicted R-Squared
to Include the Correct Number of Variables
Multiple regression can be a beguiling, temptation-filled analysis.
It’s so easy to add more variables as you think of them, or just because
the data are handy. Some of the predictors will be significant.
Is there a real relationship, or is it just chance?
You can add higher-order polynomials to bend and twist that fitted line
as you like, but are you fitting real patterns or just connecting the dots?
All the while, the R-squared (R²) value increases, teasing you,
and egging you on to add more variables!
Previously, I showed how R-squared can be misleading when you assess the...
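To make the idea concrete, here is a small sketch of the standard adjusted R-squared formula, which penalizes R-squared for each predictor in the model. The function name and the sample numbers are hypothetical and chosen only for illustration. Predicted R-squared is not shown because it requires refitting the model with each observation left out.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof


# Hypothetical case: a third predictor that leaves R-squared unchanged.
two_predictors = adjusted_r_squared(0.80, n_obs=11, n_predictors=2)
three_predictors = adjusted_r_squared(0.80, n_obs=11, n_predictors=3)
print(round(two_predictors, 3), round(three_predictors, 3))
```

If an extra predictor does not raise R-squared enough to offset the lost degree of freedom, the adjusted value drops, which is the signal that the variable may not belong in the model.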
Did the NBA Finals End Tuesday?
My family moved to Los Angeles in 1987, just as the Los Angeles
Lakers were in the midst of winning back-to-back championships.
While I don’t consider myself a huge basketball fan,
the NBA finals always hold some interest for me. If you get to
watch James Worthy, Michael Cooper, Byron Scott, A.C. Green,
Magic Johnson, and Kareem Abdul-Jabbar win championships,
it sticks with you.
So now that the Spurs and Heat are competing
for the 2013 edition of the NBA championships,
I get a little drawn in by the excitement. One of the interesting
occurrences from this year’s finals is that the two teams...
Quality Improvement in Healthcare:
Completing Projects with DMAIC
The DMAIC methodology for completing quality improvement projects
divides project work into five phases: define, measure, analyze, improve,
and control. It’s also probably the most well-known and most used project
methodology for projects that focus on improving an existing process.
(Many other methodologies exist, such as DMADV, which focuses on using
quality improvement techniques to create a new product or process design.)
Franciscan Hospital for Children, a hospital in Brighton, Mass.,
that specializes in the care of children with special health care needs,
recently completed a project...
How to Create and
Read an I-MR Control Chart
When it comes to creating control charts, it's generally good to collect
data in subgroups, if possible. But sometimes gathering subgroups of
measurements isn't an option. Measurements may be too expensive.
Production volume may be too low. Products may have a long cycle time.
In many of those cases, you can use an I-MR chart. Like all control charts,
the I-MR chart has three main uses:
Monitoring the stability of a process. Even very stable processes have
some variation, and when you try to fix minor fluctuations in a process you can
actually cause instability. An I-MR chart can alert you to...
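For readers who want to see the arithmetic behind an I-MR chart, here is a minimal sketch using the conventional constants for moving ranges of span 2 (2.66 for the individuals limits and 3.267 for the moving range upper limit). The measurements are made up for the example, and this is not presented as Minitab's exact procedure, which also offers other ways to estimate sigma.

```python
from statistics import mean


def imr_limits(values):
    """Center lines and control limits for an I-MR chart (moving range of 2)."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "i_center": center,
        "i_ucl": center + 2.66 * mr_bar,  # 2.66 = 3 / d2, with d2 = 1.128
        "i_lcl": center - 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,         # D4 constant for span 2
        "mr_lcl": 0.0,
    }


# Hypothetical individual measurements.
limits = imr_limits([10, 12, 11, 13, 12])
for name, value in limits.items():
    print(f"{name}: {value:.3f}")
```

Points outside these limits are the ones a control chart would flag for investigation.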
LeBron vs. Jordan:
Is There a Comparison Yet?
LeBron James has just captured his 2nd NBA Championship in as many years,
and has secured himself a place as one of the greatest basketball
players of all time. And he even did so by overcoming
the “Winner of Game 3 wins the series 92% of the time” odds.
With the victory, there is a 99% chance the
“LeBron is a choker and can’t win the big one” narrative is dead and gone
(I say 99% because I’ll never underestimate the ability of Skip Bayless
to find a new way to beat a dead horse). But that means that there is
another narrative that is going to start being thrown around.
Seven Basic Quality Tools to
Keep in Your Back Pocket
Here are seven quality improvement tools I see in action again and again.
Most of these quality tools have been around for a while,
but that certainly doesn’t take away any of their worth!
The best part about these tools is that they are very simple to use
and work with quickly in Minitab Statistical Software or Quality Companion,
but of course you can use other methods, or even pen and paper.
1. Fishbone Diagram
Fishbones, or cause-and-effect diagrams, help you brainstorm potential
causes of a problem and see relationships among potential causes.
Using Design of Experiments to
Minimize Noise Effects
All processes are affected by various sources of variation over time.
Products designed around optimal settings will, in reality,
tend to drift away from their ideal settings during the manufacturing process.
Environmental fluctuations and process variability often cause major
quality problems. Focusing only on cost and performance is not enough.
Sensitivity to deterioration and process imperfections is an important issue.
It is often not possible to completely eliminate variations due to
uncontrollable factors (such as temperature changes,
contamination, humidity, dust, etc.).
Correlation, Causation, and
Remorse for my NBA Finals Prediction
Let me be direct. Last week I wrote that “while a game 4 victory gives the
Heat renewed hope, history is still on the side of Tony Parker, Tim Duncan,
and Manu Ginobili in their quest to win a fourth championship together.”
The implication is clear. I thought that because the Spurs had
been the first team to get to two victories in the 2013 NBA finals,
they were going to be the first team to get to 4 wins.
As any competent biostatistician with a British Science Association
media fellowship can tell you, “correlation is not causation.”
I promise it was just a momentary lapse. Here's a list of...
Coach Bill Belichick: A Statistical
"Hoodie" Analysis, Part 1
As a longtime Boston sports fan, these past 12 years have spoiled me for
the rest of my life—seven titles amongst all four of the major sports teams and
over 30 playoff berths. This era of dominance began with the New England Patriots,
and the one man at the center of the team’s ascent to greatness is
Coach Bill Belichick. Over the years, his choice in wardrobe has garnered just
as much attention as his mastery of football strategy and tactics.
His grey hoodie with the cutoff sleeves has become synonymous with his savvy,
if eccentric, football acumen.
Coach Bill Belichick: A Statistical
"Hoodie" Analysis, Part 2
Yesterday's post shared how an analysis of Bill Belichick's hoodie-wearing
patterns found no statistically significant difference in
New England Patriots wins if he wore sleeved or sleeveless hoodies,
nor if the hoodie were from Reebok or Nike.
Since these hypothesis tests failed to reject the null hypothesis,
I combined these factors under “grey hoodie” and started a new Minitab worksheet.
But when I took a look at all the different outfits Belichick wore,
there were still too many variables for a good analysis.
I then decided to split this category into two: Type and...
How to Interpret Regression Analysis
Results: P-values and Coefficients
Regression analysis generates an equation to describe the statistical relationship
between one or more predictor variables and the response variable.
After you use Minitab Statistical Software to fit a regression model,
and verify the fit by checking the residual plots, you’ll want to interpret the results.
In this post, I’ll show you how to interpret the p-values and coefficients
that appear in the output for linear regression analysis.
How Do I Interpret the P-Values in Linear Regression Analysis?
The p-value for each term tests the null hypothesis that the coefficient is equal to zero...
in Financial Services
Process improvement through methodologies such as Six Sigma and Lean
has found its way into nearly every industry. While Six Sigma had its beginnings
in manufacturing, we’ve seen it and other process improvement techniques
work very well in the service industry—from healthcare to more
service-oriented business functions, such as human resources.
However, Six Sigma seems to have had a slower rate of adoption in
financial services. I recently came across a great article about the challenges
faced in the financial industry when it comes to
successfully implementing a process improvement...
Coffee or Tea? Analyzing
Categorical Data with Minitab
Here at Minitab we have quite a few coffee drinkers. From personal observation,
it seemed as if people who are more outgoing are the ones doing most of the
coffee drinking, while people who are less outgoing seem to opt for tea. I’d
noticed this over a period of time, and eventually decided to investigate.
To test out my hypothesis, I decided to pester some of my coworkers by asking
them to participate in my beverage choice survey. Given that the data
I collected is categorical rather than continuous, this also seemed
like a great way to showcase some of Minitab’s tools for analyzing...
T for 2. Should I Use a
Paired t or a 2-sample t?
Boxers or briefs. Republican or Democrat. Yin or yang.
Why is it that life often seems to boil down to two choices?
Heck, it even happens when you open the Basic Stats menu in Minitab.
You’ll see a choice between a 2-sample t-test and a paired t-test:
Which test should you choose? And what’s at stake?
Ask a statistician, and you might get this response:
"Elementary, my dear Watson. Choose the 2-sample t-test to test
the difference in two means: H0: µ1 – µ2 = 0. Choose the paired
t-test to test the mean of pairwise differences: H0: µd = 0."
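The two hypotheses in that quote lead to different test statistics, and a short sketch may help show what is at stake. The data below are invented, and the 2-sample version uses the unequal-variance (Welch) form as an assumption for this example. Only the t statistics are computed, since p-values would need a t distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(first, second):
    """t statistic for H0: mean of pairwise differences (second - first) = 0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def two_sample_t(first, second):
    """Welch t statistic for H0: mu_second - mu_first = 0 (unequal variances)."""
    se = sqrt(variance(first) / len(first) + variance(second) / len(second))
    return (mean(second) - mean(first)) / se


# Hypothetical measurements on the same four subjects, before and after.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
print(f"paired t:   {paired_t(before, after):.3f}")
print(f"2-sample t: {two_sample_t(before, after):.3f}")
```

With these made-up numbers the paired statistic is far larger, because pairing removes the subject-to-subject variation that swamps the difference in the 2-sample calculation.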
Exploratory Data Analysis:
The First (and Sometimes Last) Step
A good way to begin researching a topic is with exploratory data analysis (EDA).
In his 1977 book Exploratory Data Analysis, John Tukey suggested using
EDA to collect and analyze data—not to confirm a hypothesis, but to form
a hypothesis that could later be confirmed through other methods.
In some cases, EDA can even eliminate the need for a more in-depth
hypothesis test. Here's a case in point.
When I heard about the new Star Trek movie, I had started to complain to
anybody who would listen (which was not many people)
that director J. J. Abrams had used such a...
Analyzing Baseball Park Factors:
Checking the Data
Being a fan of both baseball and statistics is a special kind of joy.
I’m sure many of you have noticed the same appeal that Leonard Koppett
did when he wrote that "Statistics are the lifeblood of baseball. In no other
sport are so many available and studied so assiduously by participants and fans.
Much of the game's appeal, as a conversation piece, lies in the opportunity the
fan gets to back up opinions and arguments with convincing figures.”
I recently got interested in the ways that different websites are reporting park factors.
Park factors are supposed to give you an idea of whether a...
My Favorite Quality Tool:
We all seem to have our favorite statistical or quality improvement tool.
Jim Frost wrote a tribute to regression analysis. Dawn Keller seems to
enjoy control charts. Eston Martz discusses reliability analysis and the
Weibull distribution pretty regularly. So, I started thinking …
what’s my favorite quality tool?
I’ve always been drawn to process mapping, or what's sometimes
referred to as ‘flow charting.’ Even before I started my work with
Minitab and learning about quality improvement techniques,
I’ve considered myself somewhat of a visual learner.
Regression Analysis: How to
Interpret the Constant (Y Intercept)
The constant term in linear regression analysis seems to be such a simple thing. Also
known as the y intercept, it is simply the value at which the fitted line crosses the y-axis.
While the concept is simple, I’ve seen a lot of confusion about interpreting the constant.
That’s not surprising because the value of the constant term is almost always meaningless!
Paradoxically, while the value is generally meaningless,
it is crucial to include the constant term in most regression models!
In this post, I’ll show you everything you need to know
about the constant in linear regression analysis.
Statistical Fun …
at the Grocery Store?
Grocery shopping. For some, it's the most dreaded household activity.
For others, it's fun, or perhaps just a “necessary evil.”
Personally, I enjoy it! My co-worker, Ginger, a content manager
here at Minitab, opened my eyes to something that made me love grocery
shopping even more: she shared the data behind her family’s shopping trips.
Being something of a data nerd, I really geeked out over
the ability to analyze spending habits at the grocery store!
So how did she collect her data? What I find especially interesting is that
Ginger didn’t have to save her receipts or manually transfer any...
P and U Charts and Limburger Cheese:
A Smelly Combination
The art of cheese-making has been around for thousands of years.
By that measure, Limburger cheese is a relative newcomer.
It was first produced in the 1800s in the Duchy of Limburg
(now split by the borders of Germany, Belgium, and the Netherlands).
Limburger cheese is notorious for its distinct smell…which resembles body odor!
Given the most common reaction to Limburger – “P.U.!” –
I thought it made a perfect topic for discussing how to decide between
using the P or U control charts for attribute data.
Is this the Year the Pittsburgh Pirates
Snap the Streak, or is the Collapse Imminent?
We’re 93 games into the Major League Baseball regular season, and for the
3rd year in a row the Pittsburgh Pirates have a winning record.
Unfortunately for Pittsburgh fans, the previous two seasons have ended with collapses,
making the Pirates the first professional sports team to have 20 consecutive
losing seasons. Optimists say 2013 is finally the year they snap the streak,
while pessimists see another collapse coming. So which is it going to be?
I’m going to use Minitab Statistical Software to analyze the statistics from the
previous two seasons, and then see how they compare to this year.
Uncle Joe’s Fantasy: Picking the Perfect
Age to Reverse Your Aging Process
My Uncle Joe is always fantasizing about ways to outsmart Father Time.
“Suppose you could reverse your aging process at some fixed point in your life,”
he says to me, a crazed gleam in his eye.
“So you could pick any age to turn the clock backwards and start aging in reverse.
What age would you pick to try to maximize your life span?”
In other words, suppose you pick age 75. That means you’d turn 75 and
then start aging backwards another 75 years. So you could live up to 150 years.
Your risk of mortality increases with age,
so picking a higher age doesn’t guarantee a longer life span. If you...
Analyzing Baseball Park Factors:
Highlighting Unrealistic Data
The last time I posted, I used Minitab to find an error in the baseball
park factor data that ESPN provides on its website.
The error was easy to spot because no other values for Chase Field were below 1.
As a reminder, here’s how ESPN reports it calculates park factors:
((homeRS + homeRA)/(homeG)) / ((roadRS + roadRA)/(roadG))
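The formula above translates directly into code. Here is a minimal sketch; the function name and the season totals are hypothetical, and the variable names simply mirror the ones in the formula as quoted.

```python
def park_factor(home_rs, home_ra, home_g, road_rs, road_ra, road_g):
    """Runs per game at home (scored + allowed) divided by the same on the road."""
    home_rate = (home_rs + home_ra) / home_g
    road_rate = (road_rs + road_ra) / road_g
    return home_rate / road_rate


# Hypothetical season totals for one team.
factor = park_factor(home_rs=400, home_ra=410, home_g=81,
                     road_rs=380, road_ra=350, road_g=81)
print(f"{factor:.3f}")
```

A value above 1 suggests more runs are scored in the home park than in the team's road games, and a value below 1 suggests fewer.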
The Odds of Finding a Four-Leaf Clover
Revisited: How Do Some People Find So Many?!
Picture of four-leaf clover by Joe Papp.
This may seem to be an odd time to write about four-leaf clovers,
the traditional Irish lucky charms. However, clovers are currently growing full-force in my yard!
I was out doing yard work when I noticed patches of clovers.
I blame my neighbor for them because, while I have patches of clover in my grass,
he has patches of grass in his clover-filled yard! The clovers got me
thinking about Carly Barry’s post about the odds of finding four-leaf clovers.
It also prompted some fun, backyard science with my daughter!
In Carly’s blog, reader comments raise a...
Minitab's LinkedIn Group:
A Great Place to Talk Stats
If you've got questions about quality improvement and statistics,
I've got a resource for you: the Minitab Network on LinkedIn.
I'm privileged to serve as the moderator of this group, which lets people
who use Minitab products communicate and network with like-minded
people from around the world.
LinkedIn is the leading social networking site for professionals,
and the Minitab Network on LinkedIn has become an excellent way for
Minitab users to share ideas and learn from each other. Since we launched
the group in August 2008, it's become a very active community...
The Gentleman Tasting Coffee:
A Variation on Fisher’s Famous Experiment
In the 1935 book The Design of Experiments, Ronald A. Fisher used the example
of a lady tasting tea to demonstrate basic principles of statistical experiments.
In Fisher’s example, a lady made the claim that she could taste whether milk or
tea was poured first into her cup, so Fisher did what any good statistician
would do—he performed an experiment.
The lady in question was given eight random combinations of cups of tea
with either the tea poured first or the milk poured first. She was required to
divide the cups into two groups based on whether the milk or...
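To see why this design works as a test, it helps to count how likely a pure guesser is to sort the cups correctly. The sketch below assumes the classic layout of four milk-first and four tea-first cups, which the teaser does not spell out, and the function name is hypothetical.

```python
from math import comb


def prob_correct_by_chance(k, cups_per_group=4):
    """Probability that a guesser picks exactly k of the milk-first cups.

    Assumes 2 * cups_per_group cups, half of each kind, and that the taster
    must select exactly cups_per_group cups as milk-first.
    """
    total = comb(2 * cups_per_group, cups_per_group)
    ways = comb(cups_per_group, k) * comb(cups_per_group, cups_per_group - k)
    return ways / total


p_all = prob_correct_by_chance(4)
p_at_least_three = p_all + prob_correct_by_chance(3)
print(f"All correct by chance:  {p_all:.4f}")
print(f"At least three correct: {p_at_least_three:.4f}")
```

Under this assumption there are 70 equally likely ways to choose four cups out of eight, so a perfect result by guessing alone has probability 1/70.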
How Lean Six Sigma Students at
Rose-Hulman Reduced Food Waste
To promote ethical and moral responsibility in shaping its graduates,
Rose-Hulman Institute of Technology created a sustainability initiative to
reduce its own environmental footprint.
As part of that team's efforts, Six Sigma students at Rose-Hulman
conducted a project to reduce food waste at the campus dining center.
We got the opportunity to learn more about the project by talking with
Dr. Diane Evans, a Six Sigma black belt and associate professor of
mathematics at Rose-Hulman who led the students’ efforts, and Neel Iyer,
a mechanical engineering student who was also on the project team.
How Do Elite Fantasy Football Players
Perform a Year Later? Part 1
In 2006, LaDainian Tomlinson set an NFL record by scoring 28 rushing touchdowns.
The next year he had about half of that, only scoring 15. In 2011,
Calvin Johnson caught a ridiculous 11 touchdown passes in the first 8 games
of the season. In his next 24 regular season games, he caught 10. In 2007 Tom
Brady set an NFL record with 50 touchdown passes.
In each of the 5 seasons after that, he hasn't broken 40.
The point being that it is really hard to continuously play football at an elite level.
Sooner or later you’re going to come down to Earth.
Last season Calvin Johnson set the...
Graph Quest: How to Show that Life on
Venus Is Safer than Life on Mars
True confession: Nothing fires quickly from the top of my head.
At least nothing very lucid or useful.
To come up with a good idea, I have to dredge thoughts
slowly from the thick sludge and sediment in my brain.
It's not always easy—there are deeply encrusted layers in my cerebral
cortex that go all the way back to the Paleozoic era.
So coming up with a useful data display—one that uncovers hidden patterns or
elucidates interesting relationships—often takes a bit of doing for me.
It's rare that I can nail it on the first shot.
How Do Elite Fantasy Football
Players Perform a Year Later? Part 2
Last week I used Minitab's paired t test to compare how quarterbacks and
running backs performed the season after finishing in the top 3 in fantasy
points. Quarterbacks did not perform significantly worse,
while running backs scored about 80 fewer points and finished ranked
8.7 spots lower than their top 3 year. Now it's time to move on to wide
receivers and tight ends. Will they follow suit and perform the same,
like quarterbacks? Or will their top 3 season turn out to be just an
outlier that they're unlikely to repeat, like running backs?
Wide Receivers. Much like running backs, it’s hard for...
Finding Value in Your
Fantasy Football Draft
When it comes to fantasy football, there is a common statistical
term that comes up again and again. It’s "variation."
From season to season, week to week, and even quarter to quarter,
NFL players can be very inconsistent. This can make selecting
your fantasy team as much about luck as it is about skill.
Nobody has a crystal ball that reveals who will be fantasy
sleepers and fantasy busts in the upcoming season.
And even if they did, they’d be keeping it to themselves and
making millions in Vegas rather than writing about it on the Internet.
Using Multi-Vari Charts to
Analyze Families of Variations
When trying to solve complex problems, you should first list all the
suspected variables, identify the few critical factors, and separate
them from the trivial many, which are not essential to understanding the cause.
Many statistical tools enable you to efficiently identify the effects that
are statistically significant in order to converge on the root cause of a
problem (for example ANOVA, regression, or even designed
experiments (DOEs)). In this post though, I am going to focus
on a very simple graphical tool, one that is very intuitive,
can be used by virtually anyone, and does not...
Doing Gage R&R at the
How would you measure a hole that was allowed to vary one tenth the size of
a human hair? What if the warmth from holding the part in your hand could
take the measurement from good to bad? These are the types of problems
that must be dealt with when measuring at the micron level.
As a Six Sigma professional, that was the challenge I was given when
Tenneco entered into high-precision manufacturing. In Six Sigma projects
“gage studies” and “Measurement System Analysis (MSA)” are
used to make sure measurements are reliable and repeatable.
It’s tough to imagine doing that...
Fantasy Studs and
Regression to the Mean
Kevin Rudy has recently written two great posts (here and here) about
how fantasy football studs perform the following year. For any fantasy
team manager, the results demonstrate how difficult it can be to predict
player performance...pity the person with the first pick in a draft,
who seems almost certain to not pick the best performer that year!
But why is this the case?
One cause is special circumstances such as injury...if you placed among
the top fantasy performers, you were almost certainly injury-free or
close to it for the entire season. So an injury the following year means you...
Calculating Baseball Park Factors:
Minitab Execs Make it Fast
I’ve expressed an interest in baseball park factors that I’m still exploring.
It intrigues me that parkfactors.com says that there are 6 neutral parks in
major league baseball, even though when I look at the graphs that I’ve made
from the ESPN scores it looks to me like there are only 6 clearly non-neutral parks.
Unfortunately, I’ve noticed that the ESPN park factors don’t match the
formula that they give on their website. I’m not positive that the ESPN park
factors are wrong because the inputs are wrong. The website may
employ a more complicated formula. But lacking a reliable source for...
How Residuals Can Save You Thousands
of Dollars on Your Next Car Purchase
Purchasing a used car can be stressful due to all the factors that need to be considered.
Web sites such as www.cars.com provide you a wealth of information,
but how do you navigate through it all to find the best deal?
Minitab to the rescue. Once you narrow your choice down to a particular car model,
such as an Acura TSX, the data from www.cars.com can be copied and pasted into Minitab.
After some data manipulation, you can use a regression analysis to develop
an equation that calculates the expected list price of a vehicle based on variables
such as year, mileage, whether or not the...
Spicy Statistics and
Attribute Agreement Analysis
My husband, Sean, and I were recently at my parents' house for a picnic dinner.
As a lover of hot sauce (I’m talking extremely hot, hot, hot, HOT sauce!),
my stepdad always has a plethora of bottles around to try.
While I do enjoy spicy foods from time to time, I’ve learned not to touch
his hot sauce selections. His favorites are much too spicy for my taste!
Unfortunately, Sean learned the hard way. He used Habanero hot sauce
on his hot sausage sandwich – talk about double the heat! I saw him
sinking in his seat, eyes watering … a few hoarse coughs …
Yikes! Anyway, Sean is alive and well after...
How Many Licks to the Tootsie Roll
Center of a Tootsie Pop? Part 2
A few months ago I posted a blog about Tootsie Pops and how many licks it takes to
get to the Tootsie Roll center. If you haven’t read the post, here's a quick summary.
Recap of Initial Study
I broke down my experiment into four parts where I would test:
Minitab Congratulates 2013 ASQ International Team Excellence Award Winners.
Minitab Customers Earn Manufacturing Leadership Awards, IQPC European Process Excellence Awards.
Minitab Customers Among 2013 ASQ International Team Excellence Awards Process Finalists.
© 2014 Cubic Computing Pvt Ltd. All rights reserved. All logos and trademarks in this site are property of their respective owner.