Probability theory is the mathematical study of probability.
Mathematicians think of probabilities as numbers in the interval from 0 to 1 assigned to "events" whose occurrence or failure to occur is random. Probabilities P(E) are assigned to events E according to the probability axioms.
The probability that an event E occurs given the known occurrence of an event F is the conditional probability of E given F, written P(E | F); its numerical value is P(E ∩ F) / P(F) (as long as P(F) is nonzero). If the conditional probability of E given F is the same as the ("unconditional") probability of E, then E and F are said to be independent events. That this relation between E and F is symmetric may be seen more readily by realizing that it is the same as saying P(E ∩ F) = P(E) P(F).
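As a small illustration of conditional probability and independence, the sketch below uses P(E | F) = P(E ∩ F) / P(F) on one roll of a fair six-sided die. The die, the events E and F, and the helper names prob and cond_prob are assumptions chosen for the example, not part of the theory itself.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (an illustrative assumption).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(e, f):
    """P(E | F) = P(E ∩ F) / P(F), defined only when P(F) is nonzero."""
    if prob(f) == 0:
        raise ValueError("P(F) must be nonzero")
    return prob(e & f) / prob(f)

E = {2, 4, 6}      # the roll is even
F = {1, 2, 3, 4}   # the roll is at most 4

print(cond_prob(E, F))                      # P(E | F)
print(prob(E))                              # P(E)
print(prob(E & F) == prob(E) * prob(F))     # symmetric test for independence
```

Here P(E | F) and P(E) both equal 1/2, so E and F are independent, and the product test gives the same verdict with the roles of E and F exchanged.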
Two crucial concepts in the theory of probability are those of a random variable and of the probability distribution of a random variable; see those articles for more information.
A somewhat more abstract view of probability
"Pure" mathematicians usually take probability theory to be the study of probability spaces and random variables — an approach introduced by Andrey Nikolaevich Kolmogorov in the 1930s. A probability space is a triple (Ω, F, P), where
 Ω is a nonempty set, sometimes called the "sample space", each of whose members is thought of as a potential outcome of a random experiment. For example, if 100 voters are to be drawn randomly from among all voters in California and asked for whom they will vote for governor, then the set of all sequences of 100 Californian voters would be the sample space Ω.

 F is a sigma-algebra of subsets of Ω whose members are called "events". For example, the set of all sequences of 100 Californian voters in which at least 60 will vote for Schwarzenegger is identified with the "event" that at least 60 of the 100 chosen voters will so vote. To say that F is a sigma-algebra implies that the complement of any event is an event, and that the union of any (finite or countably infinite) sequence of events is an event.
 P is a probability measure on F, i.e., a measure such that P(Ω) = 1.
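On a finite set the sigma-algebra conditions can be checked mechanically. The following is a minimal sketch, assuming a four-element Ω; the function name is_sigma_algebra and the two example families are hypothetical. On a finite set, closure under pairwise unions is enough to give closure under countable unions.

```python
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check the sigma-algebra conditions for subsets of a finite omega."""
    omega = frozenset(omega)
    family = {frozenset(s) for s in family}
    # The whole space must be an event.
    if omega not in family:
        return False
    # The complement of every event must be an event.
    if any((omega - a) not in family for a in family):
        return False
    # The union of any two events must be an event.
    return all((a | b) in family for a, b in combinations(family, 2))

omega = {1, 2, 3, 4}
coarse = [set(), {1, 2}, {3, 4}, omega]   # a sigma-algebra smaller than the power set
broken = [set(), {1}, omega]              # missing the complement {2, 3, 4}

print(is_sigma_algebra(omega, coarse))
print(is_sigma_algebra(omega, broken))
```

The first family passes and the second fails, which shows that a sigma-algebra need not contain every subset of Ω but cannot omit the complement of one of its members.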
It is important to note that P is defined on F and not on Ω. When Ω is countable (finite or countably infinite) we can define F := powerset(Ω), which is trivially a sigma-algebra and the largest one that can be built from Ω. In such a discrete space we can therefore omit F and just write (Ω, P). If on the other hand Ω is uncountable and we use F = powerset(Ω), we get into trouble defining our probability measure P because F contains too many sets. So we have to use a smaller sigma-algebra F (e.g. the Borel sigma-algebra of Ω). We call this sort of probability space a continuous probability space and are led to questions in measure theory when we try to define P.
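For a discrete space, one common way to specify P is to give a weight to each outcome and sum the weights over an event. The sketch below assumes a three-element Ω with illustrative weights; the names weights, powerset and P are chosen for the example.

```python
from fractions import Fraction
from itertools import chain, combinations

# Point masses on a finite sample space (illustrative values that sum to 1).
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
omega = frozenset(weights)

def powerset(s):
    """All subsets of s, used here as the sigma-algebra F."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

def P(event):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))

F = powerset(omega)
print(len(F))      # number of events, 2 ** 3
print(P(omega))    # total probability
```

Because every subset of Ω is an event here, the weights alone determine P, which is why F can be left implicit in the discrete case.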
A random variable is a measurable function on Ω. For example, the number of voters who will vote for Schwarzenegger in the aforementioned sample of 100 is a random variable.
If X is any random variable, the notation P(X ≥ 60) is shorthand for P({ ω in Ω : X(ω) ≥ 60 }), so that "X ≥ 60" is an "event".
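The shorthand can be made concrete on a scaled-down version of the voter example. The sketch below assumes a hypothetical population of five voters, three of whom support the candidate, and ordered samples of three distinct voters that are all equally likely; the threshold 2 plays the role of 60.

```python
from fractions import Fraction
from itertools import permutations

population = ["a", "b", "c", "d", "e"]   # hypothetical voters
supporters = {"a", "b", "c"}             # hypothetical: voters backing the candidate
sample_size = 3
threshold = 2

# Omega: all ordered samples of distinct voters, assumed equally likely.
omega = list(permutations(population, sample_size))

def X(w):
    """Random variable: number of supporters in the sample w."""
    return sum(1 for voter in w if voter in supporters)

# The event "X >= threshold" is the set of outcomes w with X(w) >= threshold.
event = [w for w in omega if X(w) >= threshold]
p = Fraction(len(event), len(omega))
print(p)  # 7/10
```

The probability is obtained by counting the outcomes ω that satisfy X(ω) ≥ 2, which is exactly what the notation P(X ≥ 2) abbreviates.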
For an algebraic alternative to Kolmogorov's approach, see algebra of random variables.
Philosophy of application of probability
Some statisticians will assign probabilities only to events that they think of as random, according to their relative frequencies of occurrence, or to subsets of populations as proportions of the whole; those are frequentists. Others assign probabilities to propositions that are uncertain, according either to subjective degrees of belief in their truth or to logically justifiable degrees of belief in their truth. Such persons are Bayesians. A Bayesian may assign a probability to the proposition that there was life on Mars a billion years ago, since that is uncertain; a frequentist would not assign such a probability, since it is not a random event that has a long-run relative frequency of occurrence.
Bibliography
 Pierre Simon de Laplace (1812) Analytical Theory of Probability

 The first major treatise blending calculus with probability theory, originally in French: Théorie analytique des probabilités.
 Andrey Nikolaevich Kolmogorov (1933) Foundations of the Theory of Probability

 The modern measure-theoretic foundation of probability theory, originally in German: Grundbegriffe der Wahrscheinlichkeitsrechnung.
 Harold Jeffreys (1939) Theory of Probability

 An empiricist, Bayesian approach to the foundations of probability theory.
 Edward Nelson (1987) Radically Elementary Probability Theory

 Discrete foundations of probability theory, based on nonstandard analysis and internal set theory.