Developing and Validating a Scoring Tool

1. Developing and Validating a Scoring Tool

Shelley-Ann Walters
Posted 09-26-2012 14:21
Hello:

Can anyone provide me with guidance on how to validate a scoring tool?

We are trying to develop a scoring tool (or rather, improve an existing tool) to capture the severity of a clinical condition called incontinence-associated dermatitis (IAD).

The objective of the tool would be to track clinically relevant changes in the condition and help categorize IAD cases as mild, moderate, severe, or very severe. We ultimately want a tool for use in future comparative studies.

I have read about face/content validity; criterion/construct-related validity; inter-rater reliability or reproducibility and test-retest reliability...all of which sound relevant.

The questions are:
How do I go about proving my tool is valid in the eyes of the clinical users and the FDA reviewer?
What statistics or tests do I use to show validity?
What are the criteria for validity?
How do I determine the sample size for the study or studies that I need to prove validity/reliability?

Thanks much!

-------------------------------------------
Shelley-Ann Walters
3M
-------------------------------------------
2. RE:Developing and Validating a Scoring Tool

John Bartko
Posted 09-26-2012 15:35
Hello. Reliability measures how well groups of raters agree when rating a series of N subjects with the instrument. For continuous data, the intraclass correlation (ICC), estimated via an ANOVA, is often used; for multichotomous data, kappa is used. For reliability you would therefore have to conduct a study using test subjects and raters trained in the use of the instrument. Following Fleiss and others, look for an ICC or kappa of at least 0.7.
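As a rough illustration of the kappa option for categorical ratings, here is a minimal Python sketch of Fleiss' kappa, the multi-rater form of kappa. The function name, the category labels, and the tiny ratings table are all made up for illustration. Note that this form treats the categories as unordered, so for an ordered severity scale a weighted kappa or an ICC may be the better fit.

```python
from collections import Counter

# Hypothetical severity labels, used only for this illustration.
CATEGORIES = ["mild", "moderate", "severe", "very severe"]


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of subjects, each a list of labels (one per rater)."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    counts = [Counter(r) for r in ratings]

    # Share of all assignments that fell into each category.
    p_cat = [
        sum(c[cat] for c in counts) / (n_subjects * n_raters)
        for cat in categories
    ]

    # Observed agreement per subject: proportion of agreeing rater pairs.
    p_subj = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]

    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up data: 4 subjects, each scored by 3 raters.
demo = [
    ["mild", "mild", "mild"],
    ["moderate", "moderate", "severe"],
    ["severe", "severe", "severe"],
    ["very severe", "very severe", "severe"],
]
print(round(fleiss_kappa(demo, CATEGORIES), 3))
```

The value printed for the demo table would then be compared against whatever threshold the study protocol adopts, such as the 0.7 mentioned above.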

Validity is an assessment of whether the instrument measures what it purports to measure. This is often arrived at by comparing the "conclusions" of your instrument with standards in the field.

John

-------------------------------------------
John Bartko
Consulting Biostatistician
-------------------------------------------
3. RE:Developing and Validating a Scoring Tool

Shelley-Ann Walters
Posted 09-27-2012 15:02
Thanks, John, for your relevant and succinct feedback. I think ICC would be the most appropriate statistic to use. I looked at Wikipedia, and that directed me to R's icc function. So I think the relevant options for the icc function would be to compute the one-way random single-measure agreement, or a two-way random single measure for both agreement and consistency. This is all a bit overwhelming, but the criterion of an ICC >= 0.7 as good agreement/consistency is nice to know.
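To make the three single-measure options concrete, here is a minimal Python sketch that computes them from the ANOVA mean squares of a subjects-by-raters table. It is not the R icc function itself. The function name, dictionary keys, and the small ratings matrix are illustrative assumptions; the formulas are the usual single-measure forms often written ICC(1,1), ICC(2,1), and ICC(3,1).

```python
def icc_single_measures(ratings):
    """Single-measure ICCs for a table with subjects as rows, raters as columns."""
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                      # between-subjects mean square
    jms = ss_cols / (k - 1)                      # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))         # residual mean square
    wms = (ss_cols + ss_error) / (n * (k - 1))   # within-subjects mean square

    return {
        # One-way random effects, single measure: ICC(1,1).
        "oneway_agreement": (bms - wms) / (bms + (k - 1) * wms),
        # Two-way random effects, absolute agreement, single measure: ICC(2,1).
        "twoway_agreement": (bms - ems)
        / (bms + (k - 1) * ems + k * (jms - ems) / n),
        # Two-way, consistency, single measure: same formula as ICC(3,1).
        "twoway_consistency": (bms - ems) / (bms + (k - 1) * ems),
    }


# Illustrative data only: 6 subjects scored by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
for name, value in icc_single_measures(example).items():
    print(f"{name}: {value:.2f}")
```

In this example the raters differ systematically in how high they score, so the consistency ICC comes out much larger than the two agreement ICCs. That gap is the practical difference between the options: agreement penalizes rater offsets, consistency does not.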
Thanks again!

-------------------------------------------
Shelley-Ann Walters
3M
-------------------------------------------

4. RE:Developing and Validating a Scoring Tool

David Reasner
Posted 09-28-2012 16:01
If you are going in that direction, also check out the two references below. The first addresses various options in software packages, and the second is how I would cite your ICC work unless you are utilizing a specific published variation.

David

Richard N. MacLennan (November 1993). "Interrater Reliability with SPSS for Windows 5.0". The American Statistician (American Statistical Association) 47 (4): 292-296.

P. E. Shrout & Joseph L. Fleiss (1979). "Intraclass Correlations: Uses in Assessing Rater Reliability," Psychological Bulletin 86(2), 420-428

-------------------------------------------
David Reasner
Albemarle Scientific Consulting LLC
-------------------------------------------

5. RE:Developing and Validating a Scoring Tool

Nayak Polissar
Posted 09-28-2012 16:40
Thanks David, those are helpful references. The second one, Shrout and Fleiss, is the definitive publication on defining the various types of ICC. Perhaps those definitions can be found in textbooks, but some time ago when a reviewer of an article asked us what "flavor" of ICC we were using (there are several), we hunted and hunted in the article literature (not texts) and finally found that reference.

Best wishes,

Nayak

-------------------------------------------
Nayak Polissar
Principal Statistician
The Mountain-Whisper-Light Statistics
-------------------------------------------

6. RE:Developing and Validating a Scoring Tool

Shelley-Ann Walters
Posted 10-02-2012 11:28
Thank you, everyone, for your feedback. It is very much appreciated!
-------------------------------------------
Shelley-Ann Walters
3M
-------------------------------------------

7. RE:Developing and Validating a Scoring Tool

David Reasner
Posted 09-26-2012 15:44
Dear Shelley-Ann,

It would be worth looking at the FDA's PRO guidance (Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims). While the FDA emphasizes content validity (i.e., the patient perspective) rather than evidence based on relations to other variables, the release of the guidance triggered both an academic and industry response that is still ongoing. In addition to several dedicated conferences, ISPOR and other organizations have responded to the guidance with white papers, conference tracks, etc. Psychometrics is separate from biometrics and you will find a long-standing literature on the topics that you mention. It may be worth working with a consultancy (e.g., RTI) where researchers deal routinely with these topics and FDA expectations. I've also found Ron Hays' work very helpful and will paste his link below.

Best regards,
David

http://www.chime.ucla.edu/directory/Hays.htm

-------------------------------------------
David Reasner
Albemarle Scientific Consulting LLC
www.AlbemarleScientific.com
-------------------------------------------
8. RE:Developing and Validating a Scoring Tool

Chengwu Yang
Posted 09-27-2012 10:31
Hello Shelley-Ann,

I cannot agree more with David's comments on the FDA's PRO guidance and Dr. Ron Hays' work.

For scale development, based on my experience at work, I have found this small book by Dr. Robert F. DeVellis very useful:

DeVellis RF. Scale development: theory and applications, 3rd edn. Thousand Oaks, CA: Sage Publications 2012.

For scale validation, there have historically been many types of validity, and here is a good summary by Dr. Bruno D. Zumbo:

Zumbo BD. Validity: foundational issues and statistical methodology. In Rao CR and Sinharay S. (Ed.). Psychometrics, Handbook of Statistics [26]. Amsterdam, The Netherlands: Elsevier 2007; 45-79.

And here is another article on this topic:

Cook DA, Beckman TJ. Current Concepts in Validity and Reliability for Psychometric Instruments: Theory and Application. Am J Med. 2006 Feb;119(2):166.e7-16. Review.

As to your sample size question, I think there is no easy or consistent answer. Based on different authors (e.g., DeVellis RF; Embretson SE & Reise SP, Item Response Theory for Psychologists, New York, NY: Psychology Press 2000), a few hundred should suffice.

But I think the most challenging part of your plan is that you want to develop a 'tool that can track clinically relevant changes' and 'for use in future comparative studies'. As you may know, it is very difficult to define a 'minimum clinically important difference' (MCID), and in longitudinal studies that use scales/instruments ('tools' in your words) there is another big concern: you need to show 'longitudinal measurement invariance' (there are many articles on this topic, such as Brown TA. Confirmatory Factor Analysis for Applied Research. New York, NY: The Guilford Press 2006; 252-266).

Sincerely yours,

Chengwu Yang （杨成武）
______________________
Chengwu Yang, MD, MS, PhD
Assistant Professor of Biostatistics
Department of Public Health Sciences
College of Medicine, The Pennsylvania State University
A210, ASB 3400H, 600 Centerview Drive, Hershey, PA 17033
Email: yangc@psu.edu; Phone: 717-531-3016; Fax: 717-531-0146
http://profiles.psu.edu/profiles/ProfileDetails.aspx?From=SE&Person=244
-------------------------------------------

9. RE:Developing and Validating a Scoring Tool

Shelley-Ann Walters
Posted 09-27-2012 15:03
Thank you; thank you especially for taking the time to write down your citations.

I am hoping to find out the minimum work I need to do to get a valid tool developed and validated. The bare minimum : )

In terms of study design and analysis, that will no doubt depend on the criteria for proving validity. I am leaning towards using ICC for inter- and intra-rater agreement. I am limited in the total number of raters: up to 20 is feasible to recruit. However, another aspect of sample size would be the number of cases to present to the raters. Since I am hoping the tool will distinguish 4 classes of IAD condition, perhaps 3 cases per class, for a total of 12 cases.

So can I comfortably use 20 raters who will score 12 different patients' IAD condition (running the gamut of severity) as sufficient for my needs?
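One way to get a feel for that question is a small simulation: generate many hypothetical 12-case by 20-rater studies under an assumed "true" ICC and see how widely the estimated two-way agreement ICC scatters. The Python sketch below does this. Everything in it is an assumption for illustration (normally distributed scores, the three standard deviations, the function names), and real tool scores would be bounded and probably ordinal.

```python
import random
import statistics


def icc_twoway_agreement(x):
    """Two-way random, absolute agreement, single-measure ICC (rows = cases)."""
    n, k = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (n * k)
    row_means = [sum(row) / k for row in x]
    col_means = [sum(row[j] for row in x) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in x for v in row)
    bms = ss_rows / (n - 1)
    jms = ss_cols / (k - 1)
    ems = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def simulate_icc(n_cases=12, n_raters=20, sd_case=1.0, sd_rater=0.3,
                 sd_noise=0.58, n_sims=1000, seed=1):
    """Return (true ICC, mean estimate, 2.5th and 97.5th percentile estimates)."""
    rng = random.Random(seed)
    # ICC implied by the assumed variance components.
    true_icc = sd_case ** 2 / (sd_case ** 2 + sd_rater ** 2 + sd_noise ** 2)
    estimates = []
    for _ in range(n_sims):
        case_fx = [rng.gauss(0, sd_case) for _ in range(n_cases)]
        rater_fx = [rng.gauss(0, sd_rater) for _ in range(n_raters)]
        # score = case severity + rater offset + random noise
        scores = [[c + r + rng.gauss(0, sd_noise) for r in rater_fx]
                  for c in case_fx]
        estimates.append(icc_twoway_agreement(scores))
    estimates.sort()
    lo = estimates[int(0.025 * n_sims)]
    hi = estimates[int(0.975 * n_sims) - 1]
    return true_icc, statistics.mean(estimates), lo, hi


if __name__ == "__main__":
    true_icc, mean_est, lo, hi = simulate_icc()
    print(f"assumed true ICC: {true_icc:.2f}")
    print(f"mean estimate: {mean_est:.2f}, central 95% of estimates: "
          f"{lo:.2f} to {hi:.2f}")
```

If the spread of estimates is too wide to support a claim such as "ICC of at least 0.7", rerunning with a larger n_cases shows how much adding cases would help. In this setup the number of cases tends to matter more than the number of raters, since the between-case variance is estimated from only n_cases values.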

The question of MCID (the minimum clinically important difference) is one I would have to put some thought in...but I guess the tool's ability to distinguish between or among the 4 severity classes: mild, moderate, severe and very severe is related to that question. MCID could then be determined from the validation study?

Any further thoughts?

Thanks much!

-------------------------------------------
Shelley-Ann Walters
3M
-------------------------------------------

10. RE:Developing and Validating a Scoring Tool

Eric Siegel
Posted 09-26-2012 20:52
Please tell us more about the intended end-users. Would this tool take the form of a questionnaire that the patient fills out himself or herself? Or would this tool be something that the doctor or nurse fills out while inspecting the patient? Also, will the tool have quality-of-life-type questions and/or psychosocial questions? Or will it be entirely clinical and work along the lines of something we'd find in Versions 3 or 4 of the Common Terminology Criteria for Adverse Events?

-------------------------------------------
Eric Siegel
Biostatistician
Univ of Arkansas for Medical Sciences
-------------------------------------------
