I didn't perceive the confound. My understanding is that the only thing varying is the vehicle...same CD in each vehicle. CD contains all 3 music selections.
My thoughts:
1) To estimate sample size, you need to specify the effect size you'd like to be able to detect for the test statistic you anticipate using.
2) Analysis Plan
a) If participants place their Xs largely in the middle of the line, you may be able to get away with ANOVA, as your data may approximate a Gaussian distribution. This isn't too improbable given the psychophysics, which, if memory serves, will push people away from the extremes.
b) Alternatively, you could use a binomial model, recoding the location of the X as a proportion of the length of the line...or use the success/trials notation, with distance from the left end as successes and total line length as trials.
c) In either case, you would probably want to employ sandwich estimation.
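To make the recoding in 2b concrete, here is a minimal sketch. The 100 mm line length, the example marks, and the function name recode_mark are my assumptions for illustration, not details of the actual study.

```python
def recode_mark(distance_from_left_mm, line_length_mm):
    """Recode one X on a rating line.

    Returns (proportion, successes, trials), where successes is the
    distance from the left end and trials is the total line length,
    both rounded to whole millimetres for the success/trials notation.
    """
    if line_length_mm <= 0:
        raise ValueError("line length must be positive")
    if not 0 <= distance_from_left_mm <= line_length_mm:
        raise ValueError("mark must fall on the line")
    proportion = distance_from_left_mm / line_length_mm
    successes = round(distance_from_left_mm)
    trials = round(line_length_mm)
    return proportion, successes, trials


# Hypothetical marks (in mm from the left end) on an assumed 100 mm line.
marks_mm = [32, 55, 71]
recoded = [recode_mark(mark, 100) for mark in marks_mm]
for proportion, successes, trials in recoded:
    print(proportion, successes, trials)
```

Either form (the proportion, or the successes/trials pair) could then be passed to whatever binomial GLM routine you use; the recoding itself does not depend on the software.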
3) For sample size, it won't be perfect, but you could estimate using power for ANOVA (effect size f)...but you may want to conceptualize the differences initially using Cohen's d (mean difference/standard deviation of a group).
4) If someone else has a suggestion for power anticipating a generalized linear model for binomial data, I'd go with that...but I suspect you will be "close enough" using ANOVA
5) If not already available, you will probably need some pilot data, at least to estimate the standard deviation...it would be nice to get some sense of effect size too, though I'd recommend targeting a minimally relevant difference rather than assuming the pilot effect size is a great estimate of the "real" one.
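As a rough illustration of points 3 through 5, the sketch below computes Cohen's d and Cohen's f from assumed numbers and gives a normal-approximation sample size for comparing two group means. The means, the standard deviation, and the function names are hypothetical placeholders, and the sample-size formula is the textbook two-group approximation, not a full ANOVA or binomial GLM power calculation.

```python
from math import ceil
from statistics import NormalDist, pstdev


def cohens_d(mean_a, mean_b, sd):
    """Mean difference divided by the standard deviation of a group."""
    return (mean_a - mean_b) / sd


def cohens_f(group_means, sd_within):
    """Cohen's f for equal group sizes: SD of the group means
    (population form) divided by the within-group SD."""
    return pstdev(group_means) / sd_within


def n_per_group_two_means(d, alpha=0.05, power=0.80):
    """Approximate n per group to detect standardized difference d
    between two independent groups (two-sided test, normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Hypothetical inputs: a minimally relevant difference of 10 mm between two
# vehicles, and a pilot-based standard deviation of 20 mm.
d = cohens_d(60, 50, 20)
f = cohens_f([40, 50, 60], 20)
print(d, f, n_per_group_two_means(d))
```

With d = 0.5 this approximation gives 63 per group; a t-based calculation will come out slightly larger, and a within-subject design (the same people rating every vehicle) would change the numbers again, so treat this as a starting point only.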
6) Design
a) The order of music may matter...consider counterbalancing car x music order (3 x 3 = 9 combinations).
b) The potential for order effects is compounded since ratings take place after all 3 music selections have been heard, rather than following each...consider ratings after each if feasible. Order still may matter, but primacy/recency effects will be lessened.
c) The question of showrooms is more complicated...more epidemiology... An education or psychology researcher may use a hierarchical model (random effects) with showroom as the unit of analysis, rather than potential customer. This will appropriately nest observations and provide you with a closer answer to the fundamental question of which are better than which, somewhat robust to showroom.
d) On the other hand, if there are features of showroom that MODERATE the effect of the differences between combinations, you're missing them. This would seem relevant, particularly if the showrooms vary by region and/or demographic, and particularly as one considers music.
- Different regions/demographics may favor particular genres
- Different genres may interact with the psychoacoustics (a system may sound great for folk vocals and terrible for hard rock)
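For the counterbalancing in 6a, one way to lay out the 9 combinations is sketched below. It assumes 3 vehicles crossed with 3 music orders taken from a cyclic Latin square; the vehicle and selection labels, and the helper names, are placeholders of mine.

```python
from itertools import product


def latin_square_orders(items):
    """Cyclic Latin square: each item appears once in each position."""
    k = len(items)
    return [
        tuple(items[(start + i) % k] for i in range(k))
        for start in range(k)
    ]


# Hypothetical labels; substitute the real vehicles and music selections.
vehicles = ["Vehicle A", "Vehicle B", "Vehicle C"]
selections = ["Selection 1", "Selection 2", "Selection 3"]

music_orders = latin_square_orders(selections)
conditions = list(product(vehicles, music_orders))  # 3 x 3 = 9 combinations


def assign(participant_index):
    """Rotate participants through the 9 combinations in turn."""
    return conditions[participant_index % len(conditions)]


for i in range(len(conditions)):
    print(i, assign(i))
```

A cyclic square balances the position of each selection but not which selection follows which. With only 3 selections, using all 6 possible orders would balance that as well, at the cost of 18 combinations.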
7) The acoustical environment of a car is very challenging. This smells like an easy project to miss something in the details if there isn't enough convergence of knowledge. I suspect your customers understand this complexity, which is why they're employing you. That raises the bar. You should educate yourself on the various issues of psychoacoustics and the environment of a vehicle...there is a rich literature on psychoacoustics and the best in the car audio business are very well informed.
8) What a fun project!!!! I'm jealous. Enjoy.
-------------------------------------------
Jason T. Machan
Director, Lifespan Biostatistics Core,
Lifespan Hospital System
Research Scientist, Biostatistics, Research
Rhode Island Hospital
Assistant Professor, Departments of Orthopaedics and Surgery
The Warren Alpert Medical School, Brown University
Director Biostatistics Externship, Adjunct Assistant Professor, Department of Psychology
University of Rhode Island
-------------------------------------------
Original Message:
Sent: 06-17-2014 16:11
From: W. Vogt
Subject: SAMPLE SIZE question
-------------------------------------------
W. Vogt
Professor
-------------------------------------------
The sample size matters less here than the fact that the car companies and music selections are confounded.