The scientific method or process is considered fundamental to the scientific investigation and acquisition of new knowledge based upon physical evidence. Scientists propose new assertions about our world in the form of theories, built from observations, hypotheses, and deductions. Predictions from these theories are tested by experiment. If a prediction turns out to be correct, the theory survives. Any theory which is cogent enough to make predictions can then be tested reproducibly in this way. The method is commonly taken as the underlying logic of scientific practice. The scientific method is essentially an extremely cautious means of building a supportable, evidenced understanding of our world.
See also: History of Science
The development of the scientific method is indivisible from the development of science itself.
The Edwin Smith Papyrus (ca 1600 BC), an ancient textbook on surgery, describes in exquisite detail the examination (characterization), diagnosis (hypothesis), treatment (experiment), and prognosis (review) of numerous ailments (Encyclopædia Britannica). Additionally, although the Ebers papyrus (ca 1550 BC) is full of incantations and foul applications meant to turn away disease-causing demons and other superstition, in it there is also evidence of a long tradition of empirical practice and observation.
In his enunciation of a 'method' in the 13th century Roger Bacon was inspired by the writings of Arab alchemists who had preserved and built upon Aristotle's portrait of induction. Bacon described a repeating cycle of observation, hypothesis, experimentation, and the need for independent verification. In the 17th century, Francis Bacon attempted to describe a rational procedure for establishing causation between phenomena.
In 1619, René Descartes began writing his first major treatise on proper scientific and philosophical thinking, the unfinished Rules for the Direction of the Mind. With this document, Descartes established the framework for the scientific method's guiding principles.
Galileo Galilei introduced quantitative experimentation and mathematical analysis, which permitted the enunciation of general physical laws. Isaac Newton systematised these laws, and his system became a model which other sciences sought to emulate.
Attempts to systematise the scientific method were faced with the problem of induction, which points out that inductive reasoning is not deductively valid. David Hume set the difficulty out in detail. Karl Popper, following others, argued that a hypothesis must be falsifiable: that is, it must be capable of disproof. Difficulties with this criterion have led to the rejection of the very idea that there is a single method that is universally applicable to all the sciences, and that serves to distinguish science from non-science.
The question of how science operates has importance well beyond scientific circles or the academic community. In the judicial system and in public policy controversies, for example, a study's deviation from accepted scientific practice is grounds for rejecting it as "junk science" or pseudoscience.
The scientific method
The scientific method's essential elements are iterations of the following four steps:
Characterization (Quantification, observation and measurement)
Hypothesis (a theoretical, hypothetical explanation) of the observations and measurements
Prediction (logical deduction from the hypothesis)
Experiment (test of all of the above)
The above is a hypothetico-deductive method, and includes observation in the first and fourth steps. Each step is subject to peer review for possible mistakes. These activities do not describe all that scientists do (see below) but apply mostly to experimental sciences (e.g., physics, chemistry). The steps above are often taught in education (see Note 1).
Science is a social activity. The process is subject to evaluation by scientists directly involved, or by the scientific community, at any stage. A scientist's theory (or proposal) becomes accepted only once it is known to others (by publication or, ideally, peer-reviewed publication) and criticised. A scientist cites the work of others, and desires to be cited often by other scientists.
The scientific method depends upon a careful observation or characterization of the subject of the investigation.
Scientific observation demands careful measurement and/or counting. The systematic, careful collection of measurements or counts of relevant quantities is often the critical difference between pseudo-sciences, such as alchemy, and a science, such as chemistry. Scientific measurements taken are usually tabulated, graphed, or mapped, and statistical manipulations, such as correlation and regression, performed on them. The measurements may be made in a controlled setting, such as a laboratory, or made on more or less inaccessible or unmanipulatable objects such as stars or human populations. The measurements often require specialized scientific instruments such as thermometers, spectroscopes, or voltmeters, and the progress of a scientific field is usually intimately tied to their invention and development.
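The correlation and regression mentioned above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the temperature and length readings are hypothetical, and the helper names `pearson_r` and `least_squares_line` are introduced only for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical tabulated measurements: rod length (cm) at several temperatures (deg C).
temperatures = [10.0, 20.0, 30.0, 40.0, 50.0]
lengths_cm = [100.1, 100.2, 100.3, 100.5, 100.6]

r = pearson_r(temperatures, lengths_cm)
slope, intercept = least_squares_line(temperatures, lengths_cm)
print(f"r = {r:.3f}, length ~ {intercept:.2f} + {slope:.4f} * T")
```

A correlation close to 1 here says only that the tabulated quantities vary together; whether one causes the other is a matter for hypothesis and experiment.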
Measurements demand the use of operational definitions of relevant quantities. That is, a scientific quantity is described or defined by how it is measured, as opposed to some more vague, inexact or "idealized" definition. For example, electrical current, measured in amperes, may be operationally defined in terms of the mass of silver deposited in a certain time on an electrode in an electrochemical device that is described in some detail. The operational definition of a thing often relies on comparisons with standards: the operational definition of "mass" long relied on the use of an artifact, a certain kilogram of platinum-iridium kept in a laboratory in France, until the 2019 revision of the SI redefined the kilogram in terms of the Planck constant.
The scientific definition of a term sometimes differs substantially from its natural-language usage. For example, mass and weight are often used interchangeably in common discourse, but have distinct meanings in physics. Scientific quantities often have dimensions and are described in terms of certain physical units.
Measurements in scientific work are also usually accompanied by estimates of their uncertainty. The uncertainty is often estimated by making repeated measurements of the desired quantity. Uncertainties may also be calculated by consideration of the uncertainties of the individual underlying quantities that are used. Counts of things, such as the number of people in a nation at a particular time, may also have an uncertainty due to limitations of the method used. Counts may only represent a sample of desired quantities, with an uncertainty that depends upon the sampling method used and the number of samples taken.
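As a rough sketch of the two estimates described above, the following computes an uncertainty from repeated measurements (the standard error of the mean) and then propagates uncertainties through a derived quantity. The readings are hypothetical, the function names are introduced only for illustration, and the propagation rule assumes the underlying errors are independent.

```python
import math
import statistics

def mean_and_standard_error(readings):
    """Return (mean, standard error of the mean) for repeated readings."""
    mean = statistics.mean(readings)
    # Sample standard deviation (n - 1 in the denominator).
    spread = statistics.stdev(readings)
    return mean, spread / math.sqrt(len(readings))

def product_uncertainty(a, sigma_a, b, sigma_b):
    """Propagate independent uncertainties through the product a * b."""
    value = a * b
    # For independent errors, relative uncertainties add in quadrature.
    relative = math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return value, abs(value) * relative

# Hypothetical repeated length readings in centimetres.
lengths = [10.1, 9.9, 10.0, 10.2, 9.8]
length, length_err = mean_and_standard_error(lengths)

# Hypothetical width (cm) with an uncertainty assumed to be known already.
area, area_err = product_uncertainty(length, length_err, 5.0, 0.1)
print(f"length = {length:.2f} +/- {length_err:.2f} cm")
print(f"area   = {area:.1f} +/- {area_err:.1f} cm^2")
```

More readings shrink the standard error of the mean, but they do nothing about a systematic error shared by every reading, such as a miscalibrated ruler.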
New theories sometimes arise upon realizing that certain terms had not previously been sufficiently clearly defined. For example, Albert Einstein's first paper on relativity begins by defining simultaneity and the means for determining length. These ideas were skipped over by Isaac Newton with, "I do not define time, space, place and motion, as being well known to all." Einstein's paper then demonstrates that they (viz., absolute time and length independent of motion) were approximations.
A hypothesis includes a suggested explanation of the subject. It will generally provide a causal explanation or propose some correlation.
Observations have the general form of existential statements, stating that some particular instance of the phenomenon being studied has some characteristic. Causal explanations have the general form of universal statements, stating that every instance of the phenomenon has a particular characteristic. It is not deductively valid to infer a universal statement from any series of particular observations. This is the problem of induction. Many solutions to this problem have been suggested, including falsifiability and Bayesian inference.
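One of the responses named above, Bayesian inference, can be illustrated with a short sketch. Rather than inferring a universal statement outright, it adjusts a degree of belief after each observation. The prior and the two likelihoods below are hypothetical numbers chosen only for illustration.

```python
def bayes_update(prior, p_obs_if_true, p_obs_if_false):
    """Return the posterior probability of a hypothesis after one observation."""
    evidence = prior * p_obs_if_true + (1.0 - prior) * p_obs_if_false
    return prior * p_obs_if_true / evidence

# Hypothetical numbers: start undecided, and suppose each confirming
# observation is 90% likely if the hypothesis is true and 50% likely if not.
belief = 0.5
history = []
for _ in range(10):
    belief = bayes_update(belief, 0.9, 0.5)
    history.append(belief)

print([round(b, 3) for b in history])
```

Each confirming instance raises the belief, yet under these assumptions it never reaches certainty, which mirrors the point that no finite series of observations deductively establishes a universal statement.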
Scientists use whatever they can (their own creativity, ideas from other fields, induction, systematic guessing, and so on) to imagine possible explanations for a phenomenon under study. There are no definitive guidelines for the production of new hypotheses. The history of science is filled with stories of scientists claiming a "flash of inspiration", or a hunch, which then motivated them to look for evidence to support or refute their idea. Michael Polanyi made such creativity the centrepiece of his discussion of methodology.
Prediction from the hypothesis
A useful hypothesis will enable predictions, by deductive reasoning, that can be experimentally assessed. If results contradict the predictions, then the hypothesis under test is incorrect or incomplete and requires either revision or abandonment. If results confirm the predictions, then the hypothesis might be correct but is still subject to further testing.
Einstein's theory of General Relativity makes several specific predictions about the observable structure of space-time, such as a prediction that light bends in a gravitational field and that the amount of bending depends in a precise way on the strength of that gravitational field. Observations made during a 1919 solar eclipse supported General Relativity rather than Newtonian gravitation.
Predictions refer to experiment designs with a currently unknown outcome; the classic example was Edmund Halley's prediction of the year of return of Halley's comet, which returned after his death. A prediction differs from a consequence, which does not necessarily bear a time-dependent connotation. Thus, one consequence of General Relativity, which Einstein deduced, was the size of the precession of the perihelion of the orbit of the planet Mercury. The observed value, about 43 arc-seconds per century, was one of the pieces of evidence for Einstein's characterization of his theory of General Relativity. This value was already known to Einstein, in contrast to his predictions, in which he had enough confidence to publish, but which still required corroboration as of 1915.
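The deduction of Mercury's perihelion precession can be reproduced approximately with the standard first-order formula from General Relativity, an advance of 6πGM / (c²a(1 − e²)) radians per orbit. The sketch below uses rounded reference values for the Sun and Mercury; these constants are assumptions supplied for illustration and are not taken from the text above.

```python
import math

# Standard reference values (rounded); assumptions for this sketch.
GM_SUN = 1.32712440018e20    # gravitational parameter of the Sun, m^3/s^2
C = 299_792_458.0            # speed of light, m/s
SEMI_MAJOR_AXIS = 5.7909e10  # Mercury's semi-major axis, m
ECCENTRICITY = 0.2056        # Mercury's orbital eccentricity
PERIOD_DAYS = 87.969         # Mercury's orbital period, days

def perihelion_advance_per_orbit(gm, a, e, c=C):
    """Relativistic perihelion advance in radians per orbit (first-order formula)."""
    return 6.0 * math.pi * gm / (c ** 2 * a * (1.0 - e ** 2))

# A Julian century has 36525 days.
orbits_per_century = 36525.0 / PERIOD_DAYS
radians = perihelion_advance_per_orbit(GM_SUN, SEMI_MAJOR_AXIS, ECCENTRICITY) * orbits_per_century
arcsec_per_century = math.degrees(radians) * 3600.0
print(f"{arcsec_per_century:.1f} arc-seconds per century")
```

With these inputs the result comes out close to 43 arc-seconds per century, in line with the observed figure.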
Once a prediction is made, an experiment is designed to test it. The experiment may seek either confirmation or falsification of the hypothesis. Yet an experiment is not an absolute requirement. In observation-based fields of science, actual experiments must be designed differently than in the classical laboratory-based sciences.
Scientists assume an attitude of openness and accountability on the part of those conducting an experiment. Detailed recordkeeping is essential, to aid in recording and reporting on the experimental results, and to provide evidence of the effectiveness and integrity of the procedure. Such records also assist in reproducing the experimental results.
The experiment's integrity should be ascertained by the introduction of a control. Two virtually identical experiments are run, in only one of which the factor being tested is varied. This serves to further isolate any causal phenomena. For example, in testing a drug it is important to carefully test that the supposed effect of the drug is produced only by the drug itself. Doctors may do this with a double-blind study: two virtually identical groups of patients are compared, one of which receives the drug and one of which receives a placebo. Neither the patients nor the doctor knows who is getting the real drug, isolating its effects.
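The double-blind design described above can be sketched as a random assignment hidden behind neutral codes. This is only an illustration under stated assumptions: the function `assign_blinded`, the codes "A" and "B", and the patient identifiers are all hypothetical, and a real trial would follow a formally specified randomisation protocol.

```python
import random

def assign_blinded(patient_ids, seed=0):
    """Randomly split patients into two equal arms hidden behind neutral codes.

    Returns (labels, key): `labels` maps each patient to "A" or "B", which is
    all that patients and doctors see; `key` maps the codes to "drug" or
    "placebo" and stays sealed until the data are collected.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    labels = {pid: "A" for pid in shuffled[:half]}
    labels.update({pid: "B" for pid in shuffled[half:]})
    # Decide at random which code stands for the real drug.
    treatments = ["drug", "placebo"]
    rng.shuffle(treatments)
    key = {"A": treatments[0], "B": treatments[1]}
    return labels, key

# Hypothetical patient identifiers 1..20.
labels, key = assign_blinded(range(1, 21), seed=42)
print(sorted(pid for pid, code in labels.items() if code == "A"))
```

Random assignment makes the two groups "virtually identical" on average, while keeping the key sealed prevents expectations from influencing either the patients or the doctor.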
Once an experiment is complete, a researcher determines whether the results (or data) gathered are what was predicted. If the experimental conclusions fail to match the predictions or hypothesis, then one returns to the failed hypothesis and repeats the process. If the experiment appears "successful", i.e. fits the hypothesis, then its details are published so that others (in theory) may reproduce the same experimental results.
Evaluation and iteration
Testing and improvement
The scientific process is iterative. At any stage it is possible that some consideration will lead the scientist to repeat an earlier part of the process. Failure to develop an interesting hypothesis may lead a scientist to re-define the subject they are considering. Failure of a hypothesis to produce interesting and testable predictions may lead to reconsideration of the hypothesis or of the definition of the subject. Failure of the experiment to produce interesting results may lead the scientist to reconsider the experimental method, the hypothesis or the definition of the subject.
Science is a social enterprise, and scientific work will become accepted by the community only if it can be verified. Crucially, experimental and theoretical results must be reproduced by others within the science community.
Scientific knowledge is in a state of flux, for at any time new evidence could be presented that contradicts a long-held hypothesis. A particularly luminous example is the theory of light. Light had long been supposed to be made of particles. Isaac Newton, and before him many of the Classical Greeks, was convinced it was so, but his light-is-particles account was overturned by evidence in favor of a wave theory of light, suggested most notably in the early 1800s by Thomas Young, an English physician. Light as waves neatly explained the observed diffraction and interference of light, which the light-as-a-particle theory did not. The wave interpretation of light was widely held to be unassailably correct for most of the 19th century. Around the turn of the century, however, observations were made that a wave theory of light could not explain. This new set of observations could be accounted for by Max Planck's quantum theory (including Albert Einstein's explanation of the photoelectric effect), but not by a wave theory of light. Nor, for that matter, by the particle theory.
Peer review evaluation
Scientific journals use a process of peer review, in which scientists' manuscripts are submitted by editors of scientific journals to (usually one to three) fellow (usually anonymous) scientists familiar with the field for evaluation. The referees may or may not recommend publication, publication with suggested modifications, or, sometimes, publication in another journal. This serves to keep the scientific literature free of unscientific or crackpot work, helps to cut down on obvious errors, and generally improves the quality of the scientific literature. Work announced in the popular press before going through this process is generally frowned upon. Sometimes peer review inhibits the circulation of unorthodox work, and at other times may be too permissive. The peer review process is not always successful, but has been very widely adopted by the scientific community.
The reproducibility or replication of scientific observations, while usually described as being very important in the scientific method, is seldom actually reported, and is in reality often not done. Referees and editors rightfully and generally reject papers purporting only to reproduce some observations as being unoriginal and not containing anything new. Occasionally reports of a failure to reproduce results are published, mostly in cases where controversy exists or a suspicion of fraud develops. The threat of failure to replicate by others, however, serves as a very effective deterrent for most scientists, who will usually replicate their own data several times before attempting to publish.
Sometimes useful observations or phenomena themselves cannot be reproduced. They may be unique events. How does one reproduce the extinction of the dinosaurs by a huge meteor? But measurements of the concentration of iridium (used to infer the meteor) in sediment at different places can, and should, be made by different laboratories using different methods.
Reproducibility of observations and replication of experiments are not a guarantee that they are correct or properly understood. Errors can all too often creep into more than one laboratory. There are no easy guarantees.
Evidence and assumptions
Evidence comes in different forms and quality, mostly due to underlying assumptions. An underlying assumption that 'objects heavier than air fall to the ground when dropped' is not likely to incite much disagreement. An underlying assumption like 'aliens abduct humans', however, is an extraordinary claim which requires solid proof. Many extraordinary claims also do not survive Occam's razor.
Elegance of hypothesis
In evaluating a hypothesis, scientists tend to look for theories that are "elegant" or "beautiful". In contrast to the usual English use of these terms, scientists have more specific meanings in mind. "Elegance" (or "beauty") refers to the ability of a theory to neatly explain as many of the known facts as possible, as simply as possible, or at least in a manner consistent with Occam's Razor, while at the same time being aesthetically pleasing.
The study of the scientific method is distinct from the practice of science and is more a part of the philosophy of science than of science itself. While such studies have limited direct impact on day-to-day scientific practice, they have a vital role in justifying and defending the scientific approach.
We find ourselves in a world that is not directly understandable. We find that we sometimes disagree with others as to the facts of the things we see in the world around us, and we find that there are things in the world that sometimes are at odds with our present understanding. The scientific method attempts to provide a way in which we can reach agreement and understanding. A perfect scientific method would work in such a way that rational application of the method would always result in agreement and understanding; in effect a perfect method would not leave any room for rational agents to disagree. Philosophers of science have long sought such a method. The material presented below is intended to show that, as with all philosophical topics, the search has been neither straightforward nor simple.
Theory-dependence of observation
The scientific method depends on observation, in defining the subject under investigation and in performing experiments.
Observation involves perception, and so is a cognitive process. That is, one does not make an observation passively, but is actively involved in distinguishing the thing being observed from surrounding sensory data. Therefore, observations depend on some underlying understanding of the way in which the world functions, and that understanding may influence what is perceived, noticed, or deemed worthy of consideration. (See the Sapir-Whorf hypothesis for an early version of this understanding of the impact of cultural artifacts on our perceptions of the world.)
Empirical observation is supposedly used to determine the acceptability of some hypothesis within a theory. When someone claims to have made an observation, it is reasonable to ask them to justify their claim. Such a justification must itself make reference to the theory - operational definitions and hypotheses - in which the observation is embedded. That is, the observation is a component of the theory that also contains the hypothesis it either verifies or falsifies. But this means that the observation cannot serve as a neutral arbiter between competing hypotheses. Observation could only do this "neutrally" if it were independent of the theory.
Thomas Kuhn denied that it is ever possible to isolate the theory being tested from the influence of the theory in which the observations are grounded. He argued that observations always rely on a specific paradigm, and that it is not possible to evaluate competing paradigms independently. By "paradigm" he meant, essentially, a logically consistent "portrait" of the world, one that involves no logical contradictions. More than one such logically consistent construct can each paint a usable likeness of the world, but it is pointless to pit them against each other, theory against theory. Neither is a standard by which the other can be judged. Instead, the question is which "portrait" is judged by some set of people to promise the most in terms of “puzzle solving”.
For Kuhn, the choice of paradigm was sustained by, but not ultimately determined by, logical processes. The individual's choice between paradigms involves setting two or more "portraits" against the world and deciding which likeness is most promising. In the case of a general acceptance of one paradigm or another, Kuhn believed that it represented the consensus of the community of scientists. Acceptance or rejection of some paradigm is, he argued, more a social than a logical process.
That observation is embedded in theory does not mean that observations are irrelevant to science. Scientific understanding derives from observation, but the acceptance of scientific statements is dependent on the related theoretical background or paradigm as well as on observation. Coherentism and scepticism offer alternatives to foundationalism for dealing with the difficulty of grounding scientific theories in something more than observations.
Indeterminacy of theory under empirical test
The Duhem-Quine thesis points out that any theory can be made compatible with any empirical observation by the addition of suitable ad hoc hypotheses. This is analogous to the way in which an infinite number of curves can be drawn through any set of data points on a graph.
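The curve-fitting analogy above can be made concrete. Starting from one polynomial that passes through every data point, adding any multiple of a term that vanishes at those points yields another curve that fits the same data equally well. The data points and the helper names `lagrange` and `rival` below are hypothetical, introduced only for this sketch.

```python
def lagrange(points):
    """Return the unique lowest-degree polynomial through the given points."""
    def curve(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return curve

def rival(points, k):
    """Return another curve through the same points, one for every value of k."""
    base = lagrange(points)
    def curve(x):
        bump = 1.0
        for xi, _ in points:
            bump *= (x - xi)  # this product vanishes at every data point
        return base(x) + k * bump
    return curve

# Hypothetical observations (x, y).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
curves = [rival(data, k) for k in (0.0, 1.0, -2.5)]

for f in curves:
    # Each curve reproduces the data, yet predicts a different value at x = 1.5.
    print([f(x) for x, _ in data], f(1.5))
```

All three curves agree on the observations and disagree between them, so the data alone cannot choose among the curves; some further criterion, such as simplicity, has to do that work.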
This thesis was accepted by Karl Popper, leading him to reject naïve falsification in favour of 'survival of the fittest', or most falsifiable, of scientific theories. In Popper's view, any hypothesis that does not make testable predictions is simply not science. Such a hypothesis may be useful or valuable, but it cannot be said to be science. Confirmation holism, developed by W. V. Quine, states that empirical data is not sufficient to make a judgement between theories. A theory can always be made to fit with the empirical data available.
That empirical evidence does not serve to determine between alternate theories does not imply that all theories are of equal value. Rather than pretending to use a universally applicable methodological principle, the scientist is making a personal choice when she chooses some particular theory over another.
One result of this is that specialists in the philosophy of science stress the requirement that observations made for the purposes of science be restricted to intersubjective objects. That is, science is restricted to those areas where there is general agreement on the nature of the observations involved. It is comparatively easy to agree on observations of physical phenomena, harder to agree on observations of social or mental phenomena, and difficult in the extreme to reach agreement on matters of theology or ethics.
The scientific method is touted as one way of determining which disciplines are scientific and which are not. Those which follow the scientific method might be considered sciences; those that do not are not. That is, method might be used as the criterion of demarcation between science and non-science. If it is not possible to articulate a definitive method, then it may also not be possible to articulate a definitive distinction between science and non-science, between science and pseudo-science, and between scientists and non-scientists.
Feyerabend denies there is a scientific method, and in his book Against Method argues that scientific progress is not the result of the application of any particular method. In essence, he says that anything goes.
Science as a communal activity
In his book The Structure of Scientific Revolutions, Kuhn argues that the processes of observation and evaluation take place within a paradigm. 'A paradigm is what the members of a community of scientists share, and, conversely, a scientific community consists of men who share a paradigm' (postscript, part 1). On this account, science can be done only as a part of a community, and is inherently a communal activity.
For Kuhn the fundamental difference between science and other disciplines is in the way in which the communities function. Others, especially Feyerabend and some post-modernist thinkers, have argued that there is insufficient difference between social practices in science and other disciplines to maintain this distinction. It is apparent that social factors play an important and direct role in scientific method, but that they do not serve to differentiate science from other disciplines. Furthermore, although on this account science is socially constructed, it does not follow that reality itself is a social construct. Kuhn’s ideas are equally applicable to both realist and anti-realist ontologies.
The scientific method is a source of ongoing debate and contention, and this area of study is undergoing considerable change. It appears that positivist, empiricist and falsificationist theories are unable to satisfy their aim of giving a definitive account of the logic of science. It may also be that the sociology of science is incapable of accounting for the success of the scientific enterprise.
The scientific method as an everyday tool
Carl Sagan, in his book The Demon-Haunted World, argues that we should use the scientific method as a tool for skeptical thinking. When we are presented with a new concept, ESP for example, we should test the claims of its proponents against experiment ourselves (or gather evidence from as many sources as possible), and reject the theory if the evidence shows its claims to be false. Sagan was particularly interested in those movements which misrepresent science: pseudoscience.
Scientific method and the practice of science
The primary constraints on science are:
- Publication, i.e. Peer review
- Resources (mostly, funding)
It has not always been like this: in the old days of the "gentleman scientist" funding (and to a lesser extent publication) were far weaker constraints.
Both of these constraints indirectly bring in the scientific method — work that too obviously violates the constraints will be difficult to publish and difficult to get funded. Journals do not require submitted papers to conform to anything more specific than "good scientific practice" and this is mostly enforced by peer review. Originality, importance and interest are more important - see for example the author guidelines for Nature.
Critics (see Critical theory) argue that these constraints are so nebulously defined (e.g. "good scientific practice"), and so open to ideological or even political manipulation apart from any rigorous practice of the scientific method, that they often serve to censor rather than promote scientific discovery. Apparent censorship, through refusal to publish ideas unpopular with mainstream scientists (whether for ideological reasons or because they seem to contradict long-held scientific theories), has soured the popular perception of scientists as neutral seekers of truth and has often damaged the popular perception of science as a whole.
Annotated list of related issues
Paradigm, perhaps the most abused word in English.
Thomas Kuhn wrote influentially on the sociology of scientific revolutions in The Structure of Scientific Revolutions.
Paradigm shift is a Kuhnian term referring to the change between one pervasively accepted theory (e.g., Aristotelian motion) and another (e.g., Newtonian gravitation). Kuhn himself came to prefer other terminology.
The problem of induction questions the logical ground for induction as a basis for science.
When Method goes wrong
Note 1: Teachers using inquiry as a teaching method sometimes teach a slightly modified version of the scientific method in which an inquiry, a "Question", is substituted for the first step of the scientific method: "Characterization, Observation, Definition, etc.".
Historical references to scientific method
W. Stanley Jevons, 1874, 1877. The Principles of Science, 786 pp., index. Reprinted by Dover, 1958, with a foreword by Ernest Nagel.