Yes Maria, you are right. Thank you for that correction.
Original Message:
Sent: 08-27-2015 09:53
From: Maria Shubina
Subject: Accessible Reading on Statistics vs Mathematics
------------------------------
Maria Shubina
------------------------------
| If you wanted to determine the number of degrees in a triangle inscribed on a sphere the total number of degrees
| in the three angles would be greater than 180.
Agree, sphere has positive curvature.
| Similarly, if you wanted to determine the number of degrees in a triangle inscribed on the inside surface of a sphere,
| like the inside of a world globe, the number of degrees would be less than 180.
Do not agree: inside of a sphere is just the same as its outside.
To get a triangle with the number of degrees < 180 you need to consider a surface of negative curvature, like hyperboloid.
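As a quick numerical check of the positive-curvature case, here is a minimal sketch (the helper names are my own, chosen for illustration) that measures the angles of a triangle drawn with great-circle arcs on a unit sphere. It uses the octant triangle, with one vertex at the pole and two on the equator a quarter turn apart. By Girard's theorem the angle sum on a unit sphere is 180 degrees plus the triangle's area expressed in degrees, so this triangle should come out near 270 whichever side of the surface you view it from.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _tangent(at, toward):
    """Unit tangent at point `at` of the great-circle arc heading to `toward`."""
    d = _dot(at, toward)
    t = [b - d * a for a, b in zip(at, toward)]  # remove the radial component
    norm = math.sqrt(_dot(t, t))
    return [x / norm for x in t]

def angle_sum_degrees(a, b, c):
    """Sum of the three interior angles of the spherical triangle a, b, c."""
    total = 0.0
    for vertex, p, q in ((a, b, c), (b, c, a), (c, a, b)):
        cos_angle = _dot(_tangent(vertex, p), _tangent(vertex, q))
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        total += math.degrees(math.acos(cos_angle))
    return total

# Octant triangle: three mutually perpendicular unit vectors.
octant = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
print(angle_sum_degrees(*octant))
```

The computation depends only on the positions of the vertices on the sphere, not on which side of the surface the triangle is drawn, which is the point made above.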
Original Message:
Sent: 08-21-2015 00:34
From: C. Sterling Portwood
Subject: Accessible Reading on Statistics vs Mathematics
So, translating a little bit, you would like to receive an explication of the relationship between mathematics and statistics?
Interestingly, I delivered a paper (really more of a think-piece) at JSM14 in Boston which spoke to those issues, but in a somewhat tangential way. The main focus of the presentation was observational causal inference, but, to get to that destination, I had to conduct a tour of my conception of how mathematics and statistics are related and will in the future be related. Surprisingly, it seems that certain aspects of the relationship will change over time. The paper appears in the JSM14 proceedings under the title, "Toward a General Theory of Observational Causal Inference" and can also be found at http://tinyurl.com/tgtoci
As a quick and very limited cut at the relationship between mathematics and statistics, I will first present the following one sentence Gestalt, pointing in the direction of my opinion:
"Frequentist and Bayesian statistics can be considered to be axiomatic, applied mathematical discourses, ultimately becoming inquiring paradigms."
If this opaque thought tells you everything you need to know, then you can move on to the next item on your to do list. If not, then I would suggest you read further.
For almost everyone this sentence may raise more questions than it answers. For that reason I have below presented an extract from the think-piece which rummages around in the area of interest, while traversing to the new observational statistical paradigm I call causal statistics. Oddly enough, this extraction may convey a fuller understanding of the connection between mathematics and statistics, than if I had started out with the goal of simply answering your question, because of the contextual relationships presented. Additionally, the extractive passage also points to how the sought relationship may morph in the future:
In 1953, Einstein, in a letter to a colleague, wrote (paraphrased here for clarity) that
the phenomenal success of the physical (i.e., experimental) sciences is based on two great achievements: 1) mathematical axiomization, as in the derivation of Euclidean geometry, and 2) causal inferences, arrived at through experimentation and leading to causal theories.
In this paper, we shall deal with both of these great achievements in the context of both new and old statistical paradigms and the sciences, especially the social sciences.
Thales and Einstein were thinking only about the physical sciences. But Thales’ insight is equally foundational and important for the social sciences. Yet the social sciences have shown not one 10,000th of the success of the physical sciences. Why is this? Think about it . . .
The general answer is that there are a number of reasons why, but the most important reason is the inability of social science researchers to experiment. In nonexperimental (also called observational) research, causal inferences are vastly more difficult. Social scientists are therefore very limited in their ability to build causal theories and social science research consumers are similarly limited in their ability to intervene appropriately. They are largely relegated to prediction, only.
- Can Classical Statistics Be Used to Draw Causal Inferences?
Summary of Section 2: Neither frequentist nor Bayesian statistical paradigms can be used to draw causal inferences because the word cause is not a part of their derivations. These inquiring systems cannot reach conclusions about a concept that they have no knowledge of, just as the mathematical discourse of Euclidean geometry cannot by itself draw conclusions about the apparent color produced by overlaying congruent fluorescent blue and yellow triangles. The answer is that the resulting figure would appear to be a green triangle, but that result could only be attained from geometry if geometry were extended to incorporate the color wheel or if a new color geometry were derived.
Classical statistics can be considered to be an axiomatic mathematical discourse, as is Euclidean geometry. For purposes of this presentation frequentist and Bayesian statistics will be considered the components of classical statistics. Fiducial statistics will be disregarded here because of its limited current use.
Andrey Kolmogorov (1903-1987) derived probability theory beginning with six axioms which later mathematicians reduced to three. Building on the pure mathematical discourse of probability theory, every time you teach your basic statistics class, you do a simplified derivation of classical statistics. The resulting pure mathematical discipline of statistics can then be transformed into the two alternative applied mathematical branches of frequentist and Bayesian statistics by inserting two different definitions for the technical element of probability.
An understanding of the relationships between Euclidean and non-Euclidean geometries will be helpful in comprehending the situation among Frequentist statistics, Bayesian statistics, and other potential statistical paradigms. Does Euclidean geometry apply to everything? No, it only applies to flat planes. If you wanted to determine the number of degrees in a triangle inscribed on a sphere the total number of degrees in the three angles would be greater than 180. Similarly, if you wanted to determine the number of degrees in a triangle inscribed on the inside surface of a sphere, like the inside of a world globe, the number of degrees would be less than 180. Various non-Euclidean geometries apply to these types of situations, but Euclidean geometry does not.
A non-Euclidean geometry is one in which one or more fundamental aspects of Euclidean geometry, often a postulate, are altered. For example, one might alter the fifth postulate in Euclidean geometry, i.e., the famous parallel postulate, so that parallel lines do intersect, which does not happen on flat planes, but is something that happens on both inside and outside surfaces of a sphere. Then, using one of these modified axiom sets, new and different theorems can be derived. These altered theorems then apply to different parts of the world (e.g., on convex surfaces) from the world of flat planes, for which Euclidean geometry is applicable.
The point being that it is normal and typical for a given branch of applied mathematics, including its provable theorems, to be applicable to specific phenomena or specific aspects of the real world and not to others.
The question at issue here is whether or not classical statistics can be employed to make causal inferences from either experimental or nonexperimental studies. The applied mathematical discourses of frequentist and Bayesian statistics are perfectly capable of handling, for example, sample means or correlations and inferring their values to populations, but are, by themselves, mathematically incapable of drawing causal inferences from samples.
This is because in the derivations of these classical inquiring paradigms, the word "cause" is never mentioned: not in the axioms, not in the technical terms or primitives, and not in the metalanguage (e.g., English). Therefore causes and causal inference cannot be a part of or a result of either one of these classical statistical paradigms, leading to the logically necessary conclusion that any statistical inquiring system which could validly draw statistical causal inferences must be a paradigm distinct from the classical statistical disciplines, i.e., an extension of classical statistics or a whole new paradigm.
Do you find it interesting that causality is the most fundamental and most important concept in science and yet the classical statistical paradigms cannot handle it? I find it shocking and almost incomprehensible! That’s a huge void in the firmament of statistics.
This is reminiscent of an old story which I will adapt for the situation. A statistician and a wheelchair-bound social science researcher were making their way down a path in the rain forest and came upon a mango tree with ripe mangoes at the very top. The statistician went directly to the tree and started pulling off leaves. He then ate some of them and gave some to the researcher to eat. Puzzled, the social science researcher asked the statistician, "You know the leaves have little food value and taste terrible. Why not climb to the top and pick the mangoes to eat?" The statistician responded, "That would be far better, but the leaves are so much easier to get and there is no risk of falling."
Causal knowledge is far better, but correlations are much easier to get, there is no risk of failure, and parenthetically they are accepted for tenure.
- So, How Have the Physical Sciences Been so Successful at Drawing Causal Inferences?
Summary of Section 3: The physical/experimental sciences have been overwhelmingly successful by intuitively extending the classical paradigms, implicitly inserting causality and injecting unstated, but simple and generally acceptable, assumptions.
Thales began science by claiming that natural phenomena had natural causes, as opposed to being caused by the gods. The physical/experimental sciences have been overwhelmingly successful by intuitively applying Thales’ insight. Yet few if any of the scientists (or statisticians for that matter) were conscious of the full explication of the mathematics or statistics underlying their efforts. They were tacitly extending the classical statistics paradigms into a different paradigm that might be appropriately called experimental causal statistics. Their implicit processes and assumptions were typically not questioned because they were so intuitively appealing and generally acceptable.
The fact that scientists and statisticians are not cognizant of some of the logical foundations of their research activities may sound incredible, but it’s really not. Both Euclid and Einstein, in their derivations, made intuitive and implicit assumptions that they were not consciously aware of at the time. In fact, in Euclid’s case it was 21 centuries before anyone became aware of his implicit assumptions.
To understand these intuitive and subconscious structures, one must at least allude to them explicitly. Analytically, one would begin with classical statistics and extend it with additional mathematical processes. Cause would be inserted into the extended axiomization as a technical term; its definition or interpretation would be specified using the metalanguage (e.g., English); a modern statement of Thales’ insight would be input as an axiom; and other needed assumptions or axioms concerning the experimental design would be inserted.
The updated Thales axiom would note that all events have natural (i.e., non-mystical) causes and/or that all real correlations are in some way a result of causal mechanisms. This would lay the foundation for progressing from correlation (which is observable, measurable, and calculable) to the inference of causation (which is none of those things).
A typical and generally acceptable experimental assumption would be that the experimenters’ action in manipulating the putative cause is not correlated with other potential causes of the putative effect. An example counter to this assumption would be the experimenter who arises early every morning and concludes that his Snap, Crackle, and Pop breakfast food causes the cock to crow, when in fact the sunrise is the causal variable.
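The breakfast-and-rooster example can be made concrete with a small simulation. This is only an illustrative sketch under invented assumptions: each morning is reduced to three yes/no variables, the 90% tracking rates are made up, and the function names are hypothetical. When breakfast time follows the sunrise, breakfast and crowing are strongly correlated even though neither causes the other; when the experimenter sets breakfast time by a coin flip, independent of the sunrise, the correlation largely disappears.

```python
import random

def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def breakfast_crow_correlation(n_mornings, randomized, seed=0):
    """Correlation between 'early breakfast' and 'early crow' over many mornings."""
    rng = random.Random(seed)
    breakfast, crow = [], []
    for _ in range(n_mornings):
        early_sunrise = rng.random() < 0.5  # the real cause (assumed 50/50)
        if randomized:
            # Experimenter's action is set by a coin flip, unrelated to sunrise.
            early_breakfast = rng.random() < 0.5
        else:
            # Experimenter rises with the sun 90% of the time (invented rate).
            early_breakfast = early_sunrise if rng.random() < 0.9 else not early_sunrise
        # The cock follows the sunrise 90% of the time and never the breakfast.
        early_crow = early_sunrise if rng.random() < 0.9 else not early_sunrise
        breakfast.append(int(early_breakfast))
        crow.append(int(early_crow))
    return correlation(breakfast, crow)

print("observational:", breakfast_crow_correlation(10_000, randomized=False))
print("randomized:   ", breakfast_crow_correlation(10_000, randomized=True))
```

In this toy model breakfast never influences the crowing, so any correlation in the observational run comes entirely from the shared dependence on sunrise.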
Through this explicit, lengthened derivation, it is clear that the resulting applied mathematical discourse is no longer a classical statistical paradigm. It is an expanded, more capable and far-reaching statistical inquiring system, arrived at via continued axiomization, beginning from the classical statistics discourse. This resulting paradigm, which I call experimental causal statistics, is capable of logically manipulating and relating correlation, causation, and experimental data and, in particular, of drawing causal inferences from experimental data. It gives an explicit, logical foundation for the research which physical/experimental scientists have been carrying out intuitively and implicitly over the past few hundred years.
On the other hand, I don’t mean to imply that I have above actually performed the extended derivation. I have only been explicit about describing how it would be done. The actual extended axiomization would, by itself, be a full paper and the proofs of important theorems could be another paper or so.
Now, turning to nonexperimental/observational causal inference, that is a whole different kettle of fish.
- What about Causal Inference in the Social Sciences?
Summary of Section 4: The social sciences are largely limited to conducting nonexperimental studies. These and other observational sciences have had the opposite experience to that of the physical/experimental sciences. The assumptions required for extension of classical statistics to draw causal inferences from social/observational studies are voluminous, complicated, and anything but generally acceptable. Therefore the development of a new observational causal inquiring paradigm is the only reasonable path forward.
One would hope that for the social and other observational sciences one could extend classical statistics in much the same way as was done implicitly for the physical/experimental sciences and alluded to explicitly in this paper. Unfortunately that turns out not to be the case.
Obtaining knowledge in the social/observational sciences is far more difficult at every level than obtaining knowledge in the physical/experimental sciences. There are far more variables involved in almost any social science situation. Typically the variables are far less precise; for example confidence may originate from intellectual performance or athletic performance. But, of all the difficulties faced by the social sciences, the most devastating is their general inability to experiment.
This inability to experiment makes drawing causal inferences almost impossible. Certainly the comparatively simple extension to classical statistics that worked for the physical sciences has not and will not work for the social sciences, epidemiology, and other nonexperimental disciplines. For social scientists to have any general chance of drawing causal connections from their observational studies, a completely new causal inquiring paradigm would have to be developed and derived.
The difficulties have dissuaded most statisticians and social scientists from even attempting observational causal inference, although many have chosen, inappropriately, to come down in the middle by using ambiguous synonyms like leads to, results in, yields, etc. All this does is add to the confusion.
A few of the best statisticians have, during the last 100 years, perceived the need and tried to appropriately fill it. They attempted to do largely as the physical scientists did, i.e., intuitively and more or less implicitly extend classical statistics to draw causal connections from nonexperimental data.
Unfortunately these efforts, although receiving a certain amount of acclaim in their time, have, in the end, been generally unsuccessful. We know that these efforts have been unsuccessful because, if any one of them was truly capable of, in a generalized and reasonably applicable way, drawing observational causal inferences, the response would’ve been sensational. Any paradigm truly capable of that would revolutionize the social and other nonexperimental sciences and that hasn’t happened.
Further and surprisingly, there is now empirical evidence that these approaches have not worked. In 2011, Stanley Young and Alan Karr analyzed 52, effectively randomly selected, published studies which had reported success in their efforts to draw nonexperimental causal inferences through the utilization of an extension of classical statistics. The findings of Young and Karr were presented in Significance under the title, "Deming, Data and Observational Studies: a Process Out Of Control and Needing Fixing."
In the lead up to the paper, the editors wrote:
Any claim coming from an observational study is most likely to be wrong. Startling, but true. Coffee causes pancreatic cancer. Type A personality causes heart attacks. Transfat is a killer. Women who eat breakfast cereal give birth to more boys. All these claims come from observational studies; yet when the studies are carefully examined, the claimed links appear to be incorrect. What is going wrong? Some have suggested that the scientific method is failing, that nature itself is playing tricks on us. But it is our way of studying nature that is broken and that urgently needs mending . . . .
Specifically, at the end of their analysis, they concluded that not even one of the 52 "discovered" nonexperimental causal conclusions was borne out by experimental replication. Further, to add insult to injury, in five of the cases, experimental findings yielded significant causal inferences in the opposite direction to those found by intuitive and implicit extension of classical statistics.
The natural question at this point is, why would an extension to classical statistics that works so well for the physical sciences be so powerless when applied to the social and other nonexperimental sciences? The answer is that the experimenter acts as a touchstone, over repeated runs limiting the causing variables to one, the one manipulated by the experimenter. In the nonexperimental case there is no such touchstone; any variables, even those not considered by the study, can be the cause or causes of a change in the putative effect variable. This is analogous to trying to solve for 10 unknowns with only one algebraic equation. There are logical ways around this, but they are complicated and require questionable assumptions. It’s not impossible; it’s only almost impossible.
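To make the algebra analogy tangible, here is a toy sketch (all numbers and names are invented for illustration) of a single linear equation in 10 unknowns. Three quite different assignments of "causal contributions" reproduce the same observed effect equally well, so the one equation cannot tell them apart.

```python
import math

def predicted_effect(weights, contributions):
    """Left-hand side of the single equation: a weighted sum of 10 unknowns."""
    return sum(w * c for w, c in zip(weights, contributions))

weights = [1.0] * 10      # assumed known coefficients
observed_effect = 5.0     # the single observed quantity

# Three very different "explanations" of the same observation.
only_first = [5.0] + [0.0] * 9
only_seventh = [0.0] * 6 + [5.0] + [0.0] * 3
spread_evenly = [0.5] * 10

for name, candidate in (
    ("only_first", only_first),
    ("only_seventh", only_seventh),
    ("spread_evenly", spread_evenly),
):
    fits = math.isclose(predicted_effect(weights, candidate), observed_effect)
    print(name, "fits the equation:", fits)
```

An experimenter who holds nine of the ten variables fixed is in effect supplying the missing equations; an observational study has to supply them through assumptions instead.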
This is a shocking and almost inconceivable finding and is testatory of the noxiousness of the causal-theory-construction difficulties faced by social, health, and other observational scientists. As Young and Karr mildly, but pointedly, note in their title, this needs fixing.
To their credit they proposed a fix, but even if their suggestions were scrupulously followed, the outcome would be beneficial, but far from a complete fix. I dealt with this issue some time ago and the next Section is a report on the development and the results of that attempted fix.
- A Proposed Fix
Summary of Section 5: This Section reports on the development of a new statistical paradigm for inferring causal connections from observational studies. It outlines the philosophical and technical investigations, the logic which led to the conclusion that an axiomization was required, and the derivation of the new inquiring paradigm. The derivation and accompanying information and explanations are about 300 pages long, necessitating the summarization reported in this Section.
------------------------------
C. Sterling Portwood, PhD
Causal Statistician
Center for Interdisciplinary Science
------------------------------
Original Message:
Sent: 08-16-2015 08:21
From: Samuel Cook
Subject: Accessible Reading on Statistics vs Mathematics
A colleague of mine asked me if I had any readings about the differences (and maybe similarities) between Mathematics and Statistics for a Math Ed reading course.
I did not have anything offhand but figured there must be something. Any suggestions?
------------------------------
Samuel Cook
Assistant Professor
Wheelock College
------------------------------