Here's my answer: "How big is big?" is simply not an intrinsic property of the data. Answering it takes subject-matter knowledge and a judgment about how important an effect might be for a particular objective (which can change over time). The same data can serve multiple purposes.

If it only took data, we wouldn't need any phoney-baloney scientists and managers. Armed with the universal confidence level of 95% (or alpha at 5%), we as statisticians would be able to analyze the data and make all of the decisions necessary. (Yes, I'm being facetious.)

Jim

------------------------------

Jim Baldwin

Station Statistician

USDA-Forest Service

------------------------------

Original Message:

Sent: 09-02-2015 07:12

From: Marc Bourdeau

Subject: How to answer a question about statistical testing

Hi! friends of Statistics,

Many statisticians know about this difficulty; all should know how to answer it.

As the sample size N grows, a test of hypothesis will tend to reject the null more and more often. Wrongly in many cases, needless to say. That is the main reason why no statistical testing is possible with big data.
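The dependence on N can be illustrated with a minimal sketch. It assumes a one-sample z-test with known unit variance and a hypothetical standardized effect of 0.01 (both choices are illustrative and not from the thread), and it takes the sample mean to equal that effect exactly, so no random data are involved. The names EFFECT_SIZE, SAMPLE_SIZES, expected_z and two_sided_p_value are made up for this example.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def expected_z(effect_size: float, n: int) -> float:
    """z statistic when the sample mean equals the standardized effect exactly."""
    return effect_size * math.sqrt(n)


# Hypothetical values, chosen only for illustration.
EFFECT_SIZE = 0.01                      # tiny standardized mean shift
SAMPLE_SIZES = [100, 10_000, 1_000_000]

results = []
for n in SAMPLE_SIZES:
    z = expected_z(EFFECT_SIZE, n)
    p = two_sided_p_value(z)
    # Half-width of the 95% confidence interval for the mean (unit variance).
    half_width = 1.96 / math.sqrt(n)
    results.append((n, z, p, half_width))
    print(f"N={n:>9,}  z={z:6.2f}  p={p:.3g}  "
          f"95% CI: {EFFECT_SIZE} +/- {half_width:.4f}")
```

Under these assumptions the estimated effect stays at 0.01 for every N; only the p-value and the width of the interval change. Whether a shift of 0.01 matters is not something the output can say.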

Strangely enough! How would you answer a question about that difficulty in the theory of statistical testing?

------------------------------

Marc Bourdeau

Ecole Polytechnique

------------------------------