If you’ve been doing data analysis for long, you’ve probably had the ‘AHA’ moment where you realized statistical practice is a craft and not just a science. As with any craft, there are best practices that will save you a lot of pain and suffering and elevate the quality of your work. And yet, it’s likely that no one ever taught you these. I know I never had a class on this.

[Read more…] about Best Practices for Data Preparation

## Chi-Square Test of Independence Rule of Thumb: n > 5

We all want rules of thumb even though we know they can be wrong, misleading or misinterpreted.

Rules of thumb are like urban myths or a bad game of ‘Telephone’: the actual message gets distorted over time.

For example, you may have heard this one: “The Chi-Square test is invalid if we have fewer than 5 observations in a cell”.
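As a rough illustration of why the wording matters: the guideline is usually stated in terms of *expected* cell counts under independence, not the observed counts. The sketch below uses a made-up 2×2 table (the numbers and the `expected_counts` helper are assumptions for illustration only) in which one observed cell is below 5 while every expected count is well above it.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts (rows: group, columns: outcome).
observed = [[4, 36],
            [26, 34]]

expected = expected_counts(observed)

smallest_observed = min(min(row) for row in observed)
smallest_expected = min(min(row) for row in expected)

print("smallest observed count:", smallest_observed)
print("smallest expected count:", smallest_expected)
```

Here the smallest observed count is 4, yet the smallest expected count is 12, so a check on observed counts and a check on expected counts would give different answers for the same table.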

[Read more…] about Chi-Square Test of Independence Rule of Thumb: n > 5

## What is Kappa and How Does It Measure Inter-rater Reliability?

The Kappa Statistic or Cohen’s* Kappa is a statistical measure of inter-rater reliability for categorical variables. In fact, it’s almost synonymous with inter-rater reliability.
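For readers who want to see the arithmetic, here is a minimal sketch of the standard Cohen's Kappa calculation, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's marginal frequencies. The function name `cohens_kappa` and the ratings are hypothetical, chosen only to illustrate the formula.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters' categorical ratings of the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same, non-empty set of items.")
    n = len(rater_a)

    # Observed agreement: share of items where the two raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_e == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings: does the condition occur ("yes") or not ("no")?
rater_a = ["yes"] * 5 + ["no"] * 5
rater_b = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

In this example the raters agree on 8 of 10 items (p_o = 0.8), chance agreement is 0.5, and Kappa comes out at about 0.6, noticeably lower than the raw 80% agreement.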

Kappa is used when two raters both apply a criterion based on a tool to assess whether or not some condition occurs. Examples include:

[Read more…] about What is Kappa and How Does It Measure Inter-rater Reliability?

## The Secret to Importing Excel Spreadsheets into SAS

My poor colleague was pulling her hair out in frustration today.

You know when you’re trying to do something quickly, and it’s supposed to be easy, only it’s not? And you try every solution you can think of and it *still* doesn’t work?

And even in the great age of the Internet, which is supposed to know all the things you don’t, you *still* can’t find the answer anywhere?

Cue hair-pulling.

Here’s what happened: She was trying to import an Excel spreadsheet into SAS, and it didn’t work.

Instead she got:

[Read more…] about The Secret to Importing Excel Spreadsheets into SAS

## How to Understand a Risk Ratio of Less than 1

When a model has a binary outcome, one common effect size is a risk ratio. As a reminder, a risk ratio is simply a ratio of two probabilities. (The risk ratio is also called relative risk.)

Risk ratios are a bit trickier to interpret when they are less than one.

A predictor variable with a risk ratio of less than one is often labeled a “protective factor” (at least in Epidemiology). This can be confusing because in our typical understanding of those terms, it makes no sense that a risk be protective.
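To make the "ratio of two probabilities" concrete, here is a small sketch with invented counts (the `risk_ratio` helper and all numbers are assumptions for illustration). The outcome is less likely in the exposed group, so the ratio falls below one.

```python
def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Hypothetical data: 10 of 100 exposed people have the outcome,
# compared with 20 of 100 unexposed people.
rr = risk_ratio(events_exposed=10, n_exposed=100,
                events_unexposed=20, n_unexposed=100)
print(rr)
```

With these numbers the risks are 0.10 and 0.20, so the risk ratio is 0.5: the outcome is half as likely among the exposed, which is the sense in which the predictor gets called "protective".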

### So how can a RISK be protective?

[Read more…] about How to Understand a Risk Ratio of Less than 1

## What Is Regression to the Mean?

**by Audrey Schnell, PhD**

Have you ever heard that “two tall parents will have shorter children”?

This phenomenon, known as regression to the mean, has been used to explain everything from patterns in hereditary stature (as Galton first did in 1886) to why movie sequels or sophomore albums so often flop.
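A quick simulation can make the idea tangible. The sketch below is only an illustration under stated assumptions: heights are expressed as standardized scores (mean 0, standard deviation 1), and the parent-child correlation of 0.5 is an arbitrary value picked for the demo, not an empirical estimate. The function name `simulate_heights` is hypothetical.

```python
import random
import statistics


def simulate_heights(n_families=20000, correlation=0.5, seed=0):
    """Simulate (parent, child) standardized heights with a given correlation."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_families):
        parent_z = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Child height shares only part of the parent's deviation from the mean.
        child_z = correlation * parent_z + (1 - correlation ** 2) ** 0.5 * noise
        pairs.append((parent_z, child_z))
    return pairs


pairs = simulate_heights()

# Keep only families where the parent is more than 1 SD above average.
tall = [(p, c) for p, c in pairs if p > 1.0]
mean_parent = statistics.mean(p for p, _ in tall)
mean_child = statistics.mean(c for _, c in tall)

print("tall parents, mean z-score:  ", round(mean_parent, 2))
print("their children, mean z-score:", round(mean_child, 2))
```

In a run like this, the children of tall parents are still taller than average, but on average they sit closer to the population mean than their parents do. Nothing is pulling them down; it follows from the correlation being less than 1.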

So just what is regression to the mean (RTM)?

[Read more…] about What Is Regression to the Mean?