Where it is actually used, a (not the) Scientific Method, which I’ll just refer to as a (big-M) Method, is used for scaling an instance of a small-s-small-m scientific method. One that achieved unreasonable traction with respect to a particular problem, suggesting hidden investigative potential: a Method-Mystery Fit (MMF), by analogy to the notion of a Product-Market Fit (PMF) in entrepreneurship. There is no canonical Scientific Method, just a plurality of scalable investigation processes that come and go with particular streams of discovery. When an investigative approach proves to be unreasonably effective, we scale it from method to Method. When we attempt scaling without MMF, we get the cargo cult process I called Science! (with exclamation point). You can tell the difference easily: true Methods are built around scientific instruments, not philosophical concepts. A class of instrument-makers emerges around a true Method. Telescopic observation is a Method. Some funding agency bureaucrat’s idea of “observation, hypothesis, experiment, theory” is not.
Science itself is a methodologically anarchic process, driven by a sensibility rather than a set of techniques. The aggregate of all currently fertile Methods constitutes only a small part of all science. But the scaled Methods do share two features: a finite lifespan (there is no immortal Method that yields great discoveries for all eternity at a steady rate) and a deliberative element you could call “experiment design.”
Central to the idea of experiment design in the first codified Method (call it Method Zero) is the idea that one must change only one variable at a time. I suspect this idea first developed around the laboratory equipment of traditional wet chemistry, with Lavoisier’s experiments being prototypes. This of course has limited Method Zero to narrow domains, where input isolation/purification and independent parameter variation in fully sealed containers are possible. Method Zero yields WEIRD results when used outside that narrow zone.
Method Zero is mostly a commodity today. Algorithms and robots can do it. You almost have to automate Method Zero today to get a yield rate worth the effort.
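To make the "one variable at a time" rule concrete, here is a minimal sketch of how such a run list might be generated mechanically. The variable names (`temperature_c`, `pressure_atm`) and the helper `one_variable_at_a_time` are hypothetical, chosen only for illustration; nothing here is meant as a canonical experiment-design routine.

```python
from typing import Dict, List


def one_variable_at_a_time(
    baseline: Dict[str, float],
    levels: Dict[str, List[float]],
) -> List[Dict[str, float]]:
    """Return the baseline run, then runs that each change exactly one variable."""
    runs = [dict(baseline)]
    for name, values in levels.items():
        for value in values:
            if value == baseline[name]:
                continue  # this setting is already covered by the baseline run
            run = dict(baseline)  # hold everything else at its baseline value
            run[name] = value     # vary only this one variable
            runs.append(run)
    return runs


# Hypothetical variables for a sealed-container experiment.
baseline = {"temperature_c": 20.0, "pressure_atm": 1.0}
levels = {"temperature_c": [20.0, 40.0, 60.0], "pressure_atm": [1.0, 2.0]}
runs = one_variable_at_a_time(baseline, levels)

for run in runs:
    print(run)
```

The sketch only works because every variable can be set independently of the others, which is exactly the narrow-domain assumption described above. It also never explores combinations of changes, so any interaction between variables stays invisible to it.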
At the other extreme is what one might call Method Omega, used in astronomy, where the possibility of doing controlled design is basically non-existent. So experiment “design” consists of forming opinions about historical observations (sometimes made by ancient astronomers in the context of entirely non-scientific cosmological or astrological theories) and making predictions. Predictions in astronomy, even when they come true, are not necessarily conclusive evidence, as was the case with early evidence for general relativity based on flawed observations (involving eclipses and the orbit of Mercury). The flaws were overcome in later observations, but the point is, in Method Omega, one cannot conclusively prove that an observation means one thing in particular. It is an uncontrolled natural phenomenon. A recent example is a significant correction to four centuries’ worth of sunspot data (with implications for the climate change debate). Method Omega is basically the Method of history (which is based on the instruments of archaeology) generalized.
Most Methods, such as the Method of macroeconomics, fall somewhere between Method Zero and Method Omega. The system under study is somewhere between fully controlled and completely uncontrolled. Despite the criticisms often leveled at it, macroeconomics does count as a science with a true Method, since it is based on at least two real and well-defined instruments: controlling interest rates, and issuing debt. Both should be considered scientific laboratory equipment, not engineering tools. We know how to use them to uncover new macroeconomic truths. We do not know how to use them to actually manage economies effectively enough to deserve the term “engineering.” In recent decades, for example, we’ve used these instruments to uncover mysterious phenomena with names like zero lower bound that nobody seems able to explain.
The ability to vary some parameters (interest rates in macroeconomics for example) independently while holding others constant goes from comprehensive to shaky as we progress from Method Zero to Method Omega. There is little potential for codification of Methods in this gray zone (though there is some). The effect of some parameters varying autonomously, some varying in controlled ways, and others in coupled ways, makes experiment design mostly a matter of aesthetics. In the social sciences, you also have many adversarial investigators moving variables around with entirely different discovery intentions and instruments. The macroeconomic relationship between the US and China is like two experimenters in a chemistry lab sharing the same bench and using the same equipment in uncoordinated ways and running experiments that severely interfere with each other.
What we can say about this gray zone of Methods is that repetition helps. Many effects and parameter movements cancel out over time, leaving behind only two things: the movement of the controlled variables and the effects of systematically varying uncontrolled parameters. These must then be modeled before experimentation with the controlled variables of interest can proceed. So discovery in such cases must proceed by recursion. You have to push the phenomenon of interest onto a stack while you model out the unknown, non-self-canceling effects. But the burden of modeling is not as high, because you merely want to bound your actual experiment away from the unmodeled phenomenon, not model it in detail. One way to think of this recursion process is as progressive refinement of frame assumptions: they go from “things don’t change until I change them” (the so-called “closed-world” frame) to “things change this way, and I have good reason to believe they can’t affect these other things within this scope” to “everything can change unpredictably and I cannot influence any of it” (what we might call the Omega frame of astronomy).
This recursive process carries the risk and opportunity of never coming back up to provide an account of the original phenomenon. You may get lost in yak-shaving forever, or discover a more interesting phenomenon to explore.
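The cancellation-and-recursion argument can be illustrated with a toy simulation. This is a sketch under loudly stated assumptions, not a model of any real experiment: the constants `TRUE_EFFECT` and `DRIFT_RATE`, the linear form of the drift, the Gaussian noise, and the idea that a separate baseline run sees the same drift are all invented for illustration.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0   # assumed effect of the controlled variable (illustrative)
DRIFT_RATE = 0.002  # assumed systematic drift per trial (illustrative)


def observe(control, t, rng):
    """One trial: controlled effect + systematic drift + self-canceling noise."""
    return TRUE_EFFECT * control + DRIFT_RATE * t + rng.gauss(0.0, 1.0)


def effect_estimate(trials):
    """Difference in mean outcome between control=1 and control=0 trials."""
    on = [y for c, _, y in trials if c == 1]
    off = [y for c, _, y in trials if c == 0]
    return mean(on) - mean(off)


def fit_slope(ts, ys):
    """Ordinary least-squares slope of ys against ts."""
    t_bar, y_bar = mean(ts), mean(ys)
    num = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    den = sum((t - t_bar) ** 2 for t in ts)
    return num / den


rng = random.Random(0)
n = 2000

# Main experiment: control off for the first half, on for the second half.
trials = []
for t in range(n):
    c = 1 if t >= n // 2 else 0
    trials.append((c, t, observe(c, t, rng)))

# Repetition averages out the noise, but the drift survives in the estimate.
naive = effect_estimate(trials)

# Recursive step: set the main question aside and model the drift alone,
# using a baseline run in which the controlled variable is never touched.
baseline_ts = list(range(n))
baseline_ys = [observe(0, t, rng) for t in baseline_ts]
drift_hat = fit_slope(baseline_ts, baseline_ys)

# Pop back up: subtract the modeled drift and re-estimate the effect.
corrected = effect_estimate([(c, t, y - drift_hat * t) for c, t, y in trials])

print(f"naive={naive:.2f} corrected={corrected:.2f} drift_hat={drift_hat:.4f}")
```

With these made-up numbers the naive estimate comes out at roughly twice the true effect, because the drift accumulated between the two halves gets mistaken for the effect of the controlled variable. The drift model here is crude (a single slope), which matches the point that you only need enough modeling to bound the experiment away from the unmodeled phenomenon.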
The quality of scientific truths is always contingent upon the quality of sequestration achieved by the final refined frame of reference. The frame is never entirely “leak-proof” (dark matter and dark energy always threaten to sneak up on our non-dark physics truths), but we look for a zone where the “truths” do seem to be stable, without constant interference from known unknowns. This is the generalized equivalent of “laboratory conditions.” Every scientific truth only holds within “laboratory conditions” in this sense, and is weird outside of them. In toy science, we understand the laboratory before we do the experiment. In real science, the outcome of the experiment defines the laboratory. The amount of repetition you need to define the laboratory in terms of its frame is also the amount of repetition needed to establish the Method. “Reproducibility” is the corresponding cargo-cult bureaucratic notion. There is really no point to reproduction that does not add some clarity to the definition of the “laboratory” frame.
If MMF is analogous to PMF, then the “laboratory” is analogous to the discovered “market.” You cannot tackle the scaling problem for a Method until you’ve gotten a sense of the scale of the “laboratory.”
Physics is the prototypical example of this sort of discovery, where truths are neither entirely contingent on laboratory conditions, nor entirely open-world, but sequestered in a big “laboratory” defined by phrases like “at non-relativistic speeds…” We build scaled instruments like the Large Hadron Collider only once we get a sense that the laboratory is sufficiently big to be worth the investment.
This is one good reason (there are others) to operate by Carl Sagan’s principle that “extraordinary claims require extraordinary evidence.” This principle does not follow in any obvious way from non-quantitative (lacking a measure of “extraordinary”) philosophies of science such as positivism or falsifiability.
But it does follow from the idea that gray zone Methods generate truths that live in implied laboratory conditions. Conditions that can only be uncovered with a great deal of repetition. A truth without a domain of applicability mapped is speculation.
When the burden of extraordinary claims and extraordinary evidence is met, you have defined an extraordinary laboratory: an unanticipated truth with an unexpectedly large domain of applicability. One that justifies the building of extraordinary instruments like the Hubble or the LHC.
Or economies based entirely on issuing debt as a basis for currency, and interest rates. We don’t normally think of modern economies that way, but they are to old gold-standard economies as the LHC or Hubble are to Galileo-era laboratories.
Civilization is the world we build within these extraordinary laboratories. A contingent state of affairs that is only stable to the extent that the truths it is built on are sufficiently sequestered. Understood this way, all of civilization is one giant laboratory instrument, poking at the unknown.
I’m struck that you don’t directly address belief in this piece.
You have to believe in the meaningfulness of the measurements that your instrument provides you in order to follow a Method. At its strongest you have to have a fairly convincingly detailed model of your Method and how it corresponds to reality. This is the theoretical knowledge that allows you to believe in its MMF, since in part your experience of reality is mediated through the very Method you believe in.
The difference between a Method and a cargo-cult Science! becomes how seriously they take the construction and theory of their instruments (Method). Science! always says “close enough!”, so long as the reality it presents is appealing enough, and in line with other beliefs.
This bit about civilization-as-laboratory is eye-opening, and in the same sense you’d have to extend that idea to organisms, phylogenetic trees, and just about any complex system. The aforementioned entity creates a stable(-ish) ontology by establishing some kind of equilibrium with its environment–that equilibrium is the “laboratory”.
To put it more simply and precisely, the “laboratory” is the creation of boundary conditions. A method may not be based on a-priori concepts (though I haven’t really thought hard enough about that claim), but whatever method one chooses needs to separate subject from object–i.e. it needs to play Maxwell’s Demon with respect to the domain in question. From this angle, one could say that the “scientific sensibility” is some kind of knack for doing so; and one that is illegible by virtue of how hard it is to create such boundary conditions.
I would in this sense not put astronomy in the category of “omega science”–crazy stuff comes out of left field, but that just seems to be the relatively ordinary problem of unknown unknowns. The real omega territory is the social sciences precisely because it’s there that we’re dealing with entities that are on the same scale as us and therefore interact with us as subjects. Psychology may have given us some empirical insights, but it did so by looking at a single aspect of human behavior and looking at *aggregates* of people, which is not the same thing as looking at a person.
What makes economic policy a truly wicked problem is that it’s very much *not* a laboratory in any sense of the word: it’s not just that the US and China are two separate laboratories trying to run experiments with the same equipment; it’s that there are tons of actors including central banks, governments, multinationals, pundits, economists, etc. all trying to outsmart one another. Sure, maybe they’re hoping to get in an experiment of some kind, but they’re largely playing against one another with no way to distinguish a rival laboratory from a test subject.
What we have in this case is something that’s patently not inductive: competing OODA loops in place of stable boundary conditions (as an aside, you can probably equate boundary conditions to having sufficiently infiltrated another agent’s OODA loop). That zone is where we find multiplicity and subject-to-subject engagement in the form of art, philosophy, social interaction, war, etc; a rich chaos from which new scientific conjectures eventually become a possibility.
(One more thought re: social science):
Social science can work in some sense because you can explore the boundary conditions created by the interactions of people–but if you’re a big enough institution those boundary conditions will probably be too close to your own size. Similarly, someone who wants to learn how to influence people might be able to repeatedly interact with people to see what kind of response they get on average, but in order to do so they’d have to blind themselves to the person’s subjectivity, since that would involve engaging with phenomena equal in scale to the observer.
So the gray zone is like trying to control the flow of water in a water bed – that’s floating in the ocean.
And bringing a product to market involves predicting the war between paradigm shifts and the cults of personality that keep existing paradigms in place.