In my opinion, this is an instance of the common confusion between a tolerance interval and a confidence interval. A fraction of the Universe or Population is a fixed number without any standard error. In fact, for populations we talk about standard deviations and never about standard errors. Of course, errors could still exist as measurement errors. In addition, if tolerance intervals are extrapolated from a sample, they too will have their standard errors.
Therefore, I would be very careful about attaching a confidence interval to a fraction of the population distribution, as if a different sample could exist. The frequentist interpretation of a confidence interval rests on the hypothesis of
repeated samples from the same population. Each sample would show a different statistic, whose distribution would be its sampling distribution. However, if no repeated sample is possible, no different statistic would show up, and so speculating about p-values would be meaningless.
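The repeated-sampling argument can be illustrated with a small simulation. This is only a sketch under assumptions of mine: the population of 10,000 units with 2,000 "successes", the sample size of 200, and the seed are all made-up values, not data from this thread. It draws many samples without replacement and compares the spread of the sample proportion with what happens when the "sample" is the whole population.

```python
import random
import statistics

def sample_proportion(population, n, rng):
    """Proportion of 1s in a simple random sample of size n (without replacement)."""
    return sum(rng.sample(population, n)) / n

rng = random.Random(0)  # arbitrary fixed seed, only for reproducibility

# Hypothetical finite population: 10,000 units, 2,000 of them with the attribute.
population = [1] * 2000 + [0] * 8000
N = len(population)
true_p = sum(population) / N  # a fixed number, no standard error attached

# Repeated samples of size 200: the statistic changes from sample to sample.
sample_props = [sample_proportion(population, 200, rng) for _ in range(1000)]

# "Samples" of size N are the census: the statistic can never change.
census_props = [sample_proportion(population, N, rng) for _ in range(5)]

spread_sample = statistics.pstdev(sample_props)
spread_census = statistics.pstdev(census_props)

print("population proportion:", true_p)
print("spread of the proportion over samples of 200:", spread_sample)
print("spread of the proportion over full enumerations:", spread_census)
```

With samples of 200 the proportion has a sampling distribution with a positive spread; with the full enumeration every draw returns the population value and the spread is zero.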
I witnessed an extreme case of this confusion when somebody showed me a panel of banks, actually all the banks of the region he was investigating, analyzed under a Random Effects model. Of course, this case should have been analyzed under a Fixed Effects model. Each bank is a dummy variable, a column in the design matrix, without any need to speculate about how other banks would behave. The other banks do not exist if you take them all in your analysis.
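To make the "one dummy column per bank" idea concrete, here is a minimal sketch. The three bank labels and the six observations are hypothetical, and the model contains only the bank dummies (no intercept, no other regressors, no time dimension), which is a simplification of mine and not the panel described above. With such a design the cross-product matrix X'X is diagonal, so the least-squares coefficient of each bank is simply that bank's own mean.

```python
# Hypothetical panel: (bank label, observed value). Labels and values are made up.
banks = ["A", "B", "C"]
observations = [
    ("A", 1.0), ("A", 3.0),
    ("B", 10.0), ("B", 14.0),
    ("C", 5.0), ("C", 7.0),
]

def design_matrix(obs, levels):
    """One dummy (0/1) column per bank, one row per observation."""
    return [[1.0 if bank == level else 0.0 for level in levels] for bank, _ in obs]

X = design_matrix(observations, banks)
y = [value for _, value in observations]
k = len(banks)

# Normal equations X'X b = X'y. With dummy-only columns, X'X is diagonal
# (observation counts per bank), so each coefficient is solved on its own.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
beta = [xty[i] / xtx[i][i] for i in range(k)]

for bank, coef in zip(banks, beta):
    print(bank, coef)
```

Each coefficient describes one bank that is actually in the data; nothing in the design refers to banks outside the analysis.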
The fact that packages always show p-values or similar is due to an "economical" way of working: they print everything. What applies is up to you to decide.
Ulderico Santarelli
------------------------------
Ulderico Santarelli
Las Vegas, Nevada
------------------------------
Original Message:
Sent: 12-17-2019 16:00
From: Melissa N
Subject: Hypothesis Testing for Entire "Population"
Hello Everyone,
Example: I have data for all of the arrests in two different boroughs of NYC. I am being asked to use a proportion test to compare the proportion (number of female arrests)/(number of all arrests) between the two boroughs to see if there is a significant difference in the proportion of female arrests. However, I have ALL of the arrest data for the two boroughs for a specific time period. Does it make sense to do a hypothesis test/proportion test to assert if there is a significant difference even though we already have the true population proportions?
My opinion is no; however, a colleague disagrees.
Thanks