ASA Connect

Sample Size estimation (Discrete case)

1. Sample Size estimation (Discrete case)

Sudhi Upadhyaya
Posted 05-10-2015 17:41
I have a simple algorithm that classifies a person as Diabetic or Not Diabetic. The output is strictly binary {1 = Yes, 0 = No}, not a score such as 0.24, 0.57, or 0.85.

I am planning to validate this algorithm, which requires choosing an appropriate sample size. Does anybody know the formula for sample size selection in this context? Assume power = 0.9.

------------------------------
Sudhi Upadhyaya

------------------------------
2. RE: Sample Size estimation (Discrete case)

Douglas Landsittel
Posted 05-11-2015 08:23
If your goal is validation, I would base the sample size on precision, not power. So, with a given sample size, how precisely can you estimate sensitivity, specificity, etc.? Of course, you then also need to consider the proportion of cases.
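A minimal sketch of the precision idea, assuming a Wald (normal-approximation) confidence interval. The function names and the example values (500 subjects, 10% prevalence, sensitivity 0.85, specificity 0.90) are hypothetical and do not come from the thread.

```python
import math
from statistics import NormalDist


def wald_half_width(p, n, conf=0.95):
    """Half-width of a Wald confidence interval for a proportion p based on n subjects."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * math.sqrt(p * (1 - p) / n)


def expected_precision(n_total, prevalence, sens, spec, conf=0.95):
    """Expected CI half-widths for sensitivity and specificity at a given total sample size."""
    # Sensitivity is estimated only from the cases, specificity only from the non-cases.
    n_cases = n_total * prevalence
    n_noncases = n_total * (1 - prevalence)
    return {
        "sensitivity_half_width": wald_half_width(sens, n_cases, conf),
        "specificity_half_width": wald_half_width(spec, n_noncases, conf),
    }


# Hypothetical planning values, for illustration only.
result = expected_precision(n_total=500, prevalence=0.10, sens=0.85, spec=0.90)
for name, value in result.items():
    print(f"{name}: +/- {value:.3f}")
```

With these assumed inputs, sensitivity is estimated to within roughly 0.10 and specificity to within roughly 0.03. The gap is driven by the proportion of cases: only about 50 of the 500 subjects contribute to the sensitivity estimate.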

Hope that's helpful.

------------------------------
Douglas Landsittel
Professor of Medicine, Biostatistics and Clinical and Translational Science
University of Pittsburgh-School of Medicine
------------------------------

3. RE: Sample Size estimation (Discrete case)

Brandy Sinco
Posted 05-11-2015 09:28
Hi Sudhi,

The variance of a sample proportion, p_hat, is largest when p_hat = 0.5, so planning with 0.5 gives a conservative (worst-case) sample size.
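To illustrate, here is a short sketch of the worst-case calculation, assuming a Wald interval and a target margin of error chosen by the analyst. The function name and the 0.05 margin are hypothetical, not from the thread.

```python
import math
from statistics import NormalDist


def worst_case_n(margin, conf=0.95):
    """Subjects needed so the Wald CI half-width is at most `margin` when p = 0.5."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # p * (1 - p) peaks at 0.25 when p = 0.5.
    return math.ceil(z**2 * 0.25 / margin**2)


# The per-subject variance term p * (1 - p) for a few values of p.
variances = {p: p * (1 - p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
largest = max(variances, key=variances.get)
print(f"p with the largest variance: {largest}")

# Hypothetical target: a 95% interval no wider than +/- 0.05.
print(f"worst-case n: {worst_case_n(0.05)}")
```

If a plausible value of the proportion is known in advance, replacing 0.25 with p * (1 - p) gives a smaller sample size.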

In SAS, Proc Power can be used for binary proportions.

Hope this helps to get you started.

------------------------------
Brandy Sinco
Research Associate
------------------------------

4. RE: Sample Size estimation (Discrete case)

Stephen Simon
Posted 05-11-2015 12:16
There are many ways to validate. I'm guessing here, but I suspect that you want to compare your algorithm, which is simple, cheap, or fast, to a gold standard measure of diabetes. The gold standard is something that has been around for a while and is well trusted by doctors, but it may be a lot more expensive or time consuming than what you are proposing.

Establishing validity in this framework is typically done by establishing that your sensitivity and specificity are large enough. You want to select a sample size so that the confidence intervals for sensitivity and specificity are reasonably narrow. A key statistic here is the proportion of patients in your sample that will have diabetes according to your gold standard.
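One common way to turn this into a number is sketched below. It assumes Wald intervals, anticipated values for sensitivity and specificity, and a single cross-sectional sample in which the gold standard decides who is a case. All names and planning values are hypothetical and are not taken from the linked page.

```python
import math
from statistics import NormalDist


def n_for_proportion(p, margin, conf=0.95):
    """Subjects needed so a Wald CI for proportion p has half-width at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z**2 * p * (1 - p) / margin**2)


def total_sample_size(sens, spec, prevalence, margin, conf=0.95):
    """Total subjects to recruit so both sensitivity and specificity reach the margin."""
    n_cases = n_for_proportion(sens, margin, conf)      # gold-standard positives needed
    n_noncases = n_for_proportion(spec, margin, conf)   # gold-standard negatives needed
    # Scale each group up by its expected share of the sample.
    n_from_sens = math.ceil(n_cases / prevalence)
    n_from_spec = math.ceil(n_noncases / (1 - prevalence))
    return max(n_from_sens, n_from_spec)


# Hypothetical planning values: 10% of the sample has diabetes by the gold standard.
total = total_sample_size(sens=0.85, spec=0.90, prevalence=0.10, margin=0.05)
print(f"total subjects to recruit: {total}")
```

With a low proportion of diabetic patients, the sensitivity requirement dominates, which is why that proportion is the key planning input. Exact or Wilson intervals would be a reasonable substitute when the anticipated proportions are close to 0 or 1.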

Psychologists use terms like "criterion validity" or "predictive validity" in this case, though I am always a bit unclear on their terminology. That's probably more of a limitation on my intellectual capacity than a criticism of their definitions.

Note that there is no "power" involved in this calculation. The reason for this is that validity is not something that is easily reduced to a simple hypothesis test.

If you want more details, I talk about sample sizes needed for a study of a diagnostic test at http://www.pmean.com/04/SampleSizeDiagnostic.html

Establishing good values for sensitivity and specificity is not the only way to validate your algorithm, of course. If you have a different method in mind for establishing validity, share it with us and we'll help you figure out how to justify your sample size.

------------------------------
Stephen Simon
Independent Statistical Consultant
P. Mean Consulting
------------------------------
