I was recently asked to analyze a project intended to evaluate whether an initial (i.e. onboarding) customer support program reduces the number of problem calls later on.
The study appears to have been designed as follows: a sample of incoming customers over a specific time period was randomly assigned to either the 'onboarding' program or the status quo (I cannot tell by what method the assignment was done, so I can't speak to its validity). The sampling approach they used seemed odd to me, and I can't assert at this point that the groups were homogeneous in any way. They used the sample size calculator at http://www.raosoft.com/samplesize.html. The way that page proposes a sample size seems like it could dangerously arm a novice with erroneous information.
In the end, I am not sure what inputs they used, but they assured me that the sample was given "at the 95% confidence interval". The rules for sampling were clearly not laid out. For example, they defined a period in which they would "recruit" (over a month's time); however, they didn't define the length of time after the onboarding period during which event (problem ticket) data would be collected. It appears they defined the length of data capture within the "recruitment" period. So basically those who had been assigned to the program later in the game would likely have fewer events captured (oy vey - does it get worse?).
I think some of this can be salvaged, but I'm still working out tactically how. Since the ticket data is captured by customer, there shouldn't be a problem with defining a window of observation post hoc for all customers. Here's the catch: for reasons apparently due to the manpower available to support this pilot project, the 2 groups ended up unequal - 308 in the experimental group and 1743 in the control. I don't have info for the mean and variance of each yet.
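To make the post hoc window idea concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: the function name, the 90-day window, the cutoff date, and the three made-up customers. The idea is just that every customer gets the same length of follow-up counted from their own enrollment date, and anyone whose window runs past the end of the available data is set aside rather than counted with a shortened window.

```python
from datetime import date, timedelta

def tickets_in_window(enrolled, ticket_dates, window_days, data_cutoff):
    """Count tickets in [enrolled, enrolled + window_days).

    Returns None when the customer's window extends past data_cutoff,
    i.e. when full follow-up is not available for that customer.
    """
    window_end = enrolled + timedelta(days=window_days)
    if window_end > data_cutoff:
        return None  # incomplete follow-up: exclude instead of undercounting
    return sum(1 for d in ticket_dates if enrolled <= d < window_end)

# Hypothetical data: customer id -> (enrollment date, ticket dates)
customers = {
    "A": (date(2024, 1, 2), [date(2024, 1, 10), date(2024, 2, 20), date(2024, 5, 1)]),
    "B": (date(2024, 1, 30), [date(2024, 3, 15)]),
    "C": (date(2024, 3, 20), []),
}
cutoff = date(2024, 4, 30)  # assumed last day of ticket data

counts = {
    cid: tickets_in_window(enrolled, tickets, 90, cutoff)
    for cid, (enrolled, tickets) in customers.items()
}
print(counts)
```

Customers who come back as None would have to be dropped or handled separately, which may shrink the already small experimental group further.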
So here's the question: if one were to compare the means of the two groups by t-test post hoc given the sample provided (of course with unequal n's and, well, whatever I find out the sd's are for each), is there any way to salvage the problem of finding out whether the *magnitude* of the difference between the 2 groups is significant, particularly given that there was no a priori powering to determine the within-group sample size (and a really unknown effect size)? If we input the observed difference as the effect size and determine the achieved power post hoc (as if we're just doing a 'data discovery' project), is this appropriate?
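For reference, this is roughly the comparison I am picturing once I have the summary statistics: an unequal-variance (Welch) t statistic plus an interval for the difference in means, computed from group means, sd's, and n's. The group sizes are the real ones (308 and 1743); the means and sd's below are invented placeholders, and the function name is mine. The interval uses a normal quantile as an approximation to the t quantile, which I am assuming is tolerable with several hundred degrees of freedom. Ticket counts are probably skewed, so treat this as a sketch only.

```python
from math import sqrt
from statistics import NormalDist

def welch_from_summary(m1, s1, n1, m2, s2, n2, level=0.95):
    """Welch t statistic, Welch-Satterthwaite df, and an approximate CI
    for the difference in means (group 1 minus group 2)."""
    v1 = s1 ** 2 / n1  # squared standard error of mean 1
    v2 = s2 ** 2 / n2  # squared standard error of mean 2
    diff = m1 - m2
    se = sqrt(v1 + v2)
    t = diff / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Normal quantile used in place of the t quantile (large-df approximation)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {"diff": diff, "se": se, "t": t, "df": df,
            "ci": (diff - z * se, diff + z * se)}

# Real n's from the pilot; means and sd's are HYPOTHETICAL placeholders
result = welch_from_summary(m1=1.8, s1=2.0, n1=308, m2=2.3, s2=2.4, n2=1743)
print(result)
```

The width of the interval is what would speak to the magnitude of the difference, separately from whether the t statistic crosses a significance threshold.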
Canning the original study because it wasn't conducted correctly wouldn't bode well for departmental headcount (a lot of manpower appears to have been put into this experiment)... so the pressure is on here to try to help the best way I can. Any thoughts are greatly appreciated.
Many thanks!
-------------------------------------------
Phillip Middleton
-------------------------------------------