In computer science and mathematics, a variable is a symbol denoting a quantity or symbolic representation. In mathematics, a variable often represents an unknown quantity; in computer science, it represents a place where a quantity can be stored. Variables are often contrasted with constants, which are known and unchanging.
In other scientific fields such as biology, chemistry, and physics, the word variable refers to a measurable factor, characteristic, or attribute of an individual or a system. In a scientific experiment, so-called "independent variables" are factors that the scientist can alter. For example, temperature is a common environmental factor that can be controlled in laboratory experiments. "Dependent variables" or "response variables" are those that are measured and collected as data.
Variables are used in open sentences. For instance, in the formula x + 1 = 5, x is a variable that represents an "unknown" number. In mathematics, variables are usually represented by letters of the Roman alphabet, but letters of other alphabets and various other symbols are also used. In computer programming, variables are usually represented by either single letters or alphanumeric strings.
Why are variables useful?
Variables are useful in mathematics and computer programming because they allow instructions to be specified in a general way. If one were forced to use actual values, the instructions would apply only to a narrower, more specific set of situations. For example, the square of ANY number can be defined as square(x) = x · x.
Now, all we need to do to find the square of a number is replace x with any number we want.
- square(x) = x · x = y
- square(1) = 1 · 1 = 1
- square(2) = 2 · 2 = 4
- square(3) = 3 · 3 = 9
In the above example, the variable x is a "placeholder" for ANY number. One important thing we are assuming is that the value of each occurrence of x is the same -- that x does not get a new value between the first x and the second x. In computer programming languages without referential transparency, such changes can occur.
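The same definition can be sketched in Python. The helper next_value below is hypothetical and is included only to illustrate how, without referential transparency, two textually identical expressions can yield different values.

```python
import itertools


def square(x):
    # Both occurrences of x refer to the same value within one call.
    return x * x


squares = [square(n) for n in (1, 2, 3)]
print(squares)  # [1, 4, 9]

# A function with hidden state is not referentially transparent:
# each call returns a different number.
_counter = itertools.count(1)


def next_value():
    return next(_counter)


# The two calls yield 1 and then 2, so this is not the square of anything.
product = next_value() * next_value()
print(product)  # 2
```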
In programming languages, a variable can be thought of as a place to store a value in computer memory. Variables are convenient ways to mimic mathematics.
In general, a variable binds an object to a name so that the object can be accessed later, much as a person has a name by which others can refer to them. This is analogous to the use of variables in mathematics, and variables in computer programs usually work in a similar manner. Put another way, an object can exist without being bound to any variable.
Typically, the name of a variable is bound to the address of a particular block of bytes in memory, and operations on the variable manipulate that block. This is called name binding. If the value is very large or its size is not known beforehand, referencing is more common: the variable does not store the value directly but stores the location of the value.
Two important questions about a variable are its lifetime and its scope. For space efficiency, the memory needed for a variable is allocated when the variable is first used and freed when it is no longer needed. The scope helps determine the lifetime: usually a variable resides in some scope in the program code, and entering and leaving that scope coincide with the beginning and end of the variable's life, respectively. Put in conceptual terms, a variable is visible in its scope, and the computer can assume the variable is needed only while it is visible. Under this scheme, however, a variable that is never used may still be given space. For this reason, a compiler often warns programmers when a variable is declared but never used.
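Python is one language in which variables hold references rather than the values themselves. The following minimal sketch (the names a, b, and c are illustrative) shows two names bound to one object, and a third name bound to a separate copy.

```python
a = [1, 2, 3]   # the name a is bound to a list object
b = a           # b is bound to the same object; nothing is copied
b.append(4)     # the change is visible through both names

print(a)        # [1, 2, 3, 4]
print(a is b)   # True

c = list(a)     # c is bound to a new, separate list
c.append(5)     # a and b are unaffected
```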
While a variable typically stores simple data such as integers and literal strings, some languages also allow a variable to stand for a datatype. Such type variables enable parametric polymorphic functions to be written. They operate like ordinary variables in that they can represent any type. For example, to determine the length of a list with the function length, it is only necessary to know the number of elements in the list; the type of the elements does not matter, so the type signature can be written with a type variable and the function is thus parametric polymorphic.
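As a rough illustration, Python's optional type hints can express such a signature. In this sketch the type variable T is an assumed name, and Python itself does not enforce the annotation at run time; an external type checker would.

```python
from typing import List, TypeVar

T = TypeVar("T")  # a type variable: stands for any element type


def length(items: List[T]) -> int:
    # Only the number of elements matters, never their type.
    count = 0
    for _ in items:
        count += 1
    return count


print(length([1, 2, 3]))    # 3
print(length(["a", "b"]))   # 2
```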
Variables can be either mutable or immutable. Mutable variables can be thought of as having an l-value, while immutable ones have only an r-value. One characteristic of functional programming is that variables are immutable. Because immutable variables are semantically the same as named constants or constant functions, the word variable, used without qualification, usually means a mutable variable.
See name for the naming rules and conventions of variable names.
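Python has no immutable local variables as such, so the sketch below approximates the distinction with the fields of a frozen dataclass, which cannot be reassigned after construction. The class Point and the other names are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    x: int
    y: int


total = 0
total = total + 5    # an ordinary (mutable) variable may be rebound

p = Point(1, 2)
try:
    p.x = 10         # assigning to a field of a frozen dataclass fails
    reassigned = True
except FrozenInstanceError:
    reassigned = False

print(total, p.x, reassigned)  # 5 1 False
```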
In C++ (not in C), "mutable" is a keyword to allow a mutable member to be modified by a const member function.
In programming languages, a variable can be thought of as a place to store a value in computer memory. More precisely, a variable associates a name (sometimes called an identifier) with the location of the value; the value in turn is stored as a data object in this location. The specifics of variable allocation and the representation of values vary widely, both among languages and among implementations of any given language.
The names of variables
The naming of variables is a matter of syntax, convention, and taste. Each language has lexical requirements for what may be used as a variable name. In addition, programming communities have informal conventions on the naming of variables -- as do individual programmers.
For instance, in C and related languages, variable names must be made of letters, digits, and underscores, and must not begin with a digit. However, the language does not spell out whether a variable should be named x_coordinate or xCoordinate or simply x.
In other languages, the name of a variable might tell you what kind of value it might contain. For instance, in Fortran, the first letter in a variable's name indicates whether by default it is created as an integer or floating point variable. In BASIC, the suffix $ on a variable name indicates that its value is a string. Perl uses the prefixes $, @, %, and & to indicate scalar, array, hash, and subroutine variables.
Internally, names are mapped to memory in a symbol table. In Lisp languages, the symbol table is exposed: the names of variables are not strings but symbols, a special data type which can be manipulated by the program.
In many languages, such as Java, Common Lisp, and Python, variable names can be arranged into namespaces or packages. Each namespace can be considered a separate symbol table, so a given name can occur in different namespaces without collision: thus, if a Common Lisp program has several packages, one of them named app, each can contain a variable called *mode* without conflict. A portion of a program (such as a module) may use a given namespace as its default, and refer to variables from other namespaces only by mentioning the namespace explicitly, e.g. as app:*mode*.
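In Python, modules play the role of namespaces. To keep the following sketch self-contained, two SimpleNamespace objects stand in for modules; the name ui and the values stored are assumptions made for illustration.

```python
from types import SimpleNamespace

# Two independent namespaces, each with its own variable called mode.
app = SimpleNamespace(mode="batch")
ui = SimpleNamespace(mode="dark")

# The same name is resolved separately in each namespace.
print(app.mode)  # batch
print(ui.mode)   # dark

app.mode = "interactive"   # changing one does not affect the other
```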
Scope and extent
The scope and extent (or lifetime) of a variable describe where in the program's text it may be used, and when in the program's execution it has a value.
In most languages, variables can have different scopes. The scope of a variable is the portion of the program code for which the variable's name has meaning. For instance, a variable with lexical scope is meaningful only within a certain block of statements or subroutine. A global variable, or one with indefinite scope, may be referred to anywhere in the program. When a variable has gone out of scope, it is erroneous or meaningless to refer to it. Static analysis of the program text can determine whether variables are used out of scope.
Likewise, the bindings of variables to values can have different extent. The extent of a binding is the length of time -- part of the course of the program's execution -- during which the variable continues to refer to the same value or place. A running program may enter and leave a given extent many times, as in the case of a closure. A variable can be unbound, meaning that it is in scope but has never been given a value, or its value has been destroyed; in many languages, trying to use the value of an unbound variable is an error or may yield unpredictable results.
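A short Python sketch of the difference between a global and a local name follows; the names counter, step, and increment are illustrative only.

```python
counter = 0   # global: may be referred to anywhere in this module


def increment():
    step = 1                 # local: meaningful only inside increment
    return counter + step    # the global name is visible here


result = increment()

# Outside the function, the name step has no meaning.
try:
    step
    visible = True
except NameError:
    visible = False

print(result, visible)  # 1 False
```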
In other words, scope is a lexical fact, but extent a runtime (dynamic) fact. If a variable name is out of scope, then it is an error for that name to be used in the program code. In compiled languages, this error can be detected statically at compile-time. If a variable is out of extent, its value cannot be referred to (since it doesn't have one; it is unbound) but it may be given a value, which gives it a new extent.
When a variable binding extends (in time) as the program's execution passes out of the variable's scope, this is no bug. It is a Lisp closure or a C static variable: when execution passes back into the variable's scope, the variable may be referred to again. But when a variable's extent ends, it becomes unbound -- if it is still in scope, referring to it is an error (or, in C, gets you a nice arbitrary value).
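Python closures behave in the way described for Lisp. In this minimal sketch (all names are hypothetical), the binding of count outlives the call that created it, even though the name count is out of scope everywhere except inside next_count.

```python
def make_counter():
    count = 0                # bound when make_counter is called

    def next_count():
        nonlocal count       # refers to the enclosing binding
        count += 1
        return count

    return next_count        # the binding of count survives the return


tick = make_counter()
first = tick()
second = tick()
print(first, second)  # 1 2
```

Each call to make_counter creates a fresh binding, so a second counter would start again from 1.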
Many programming languages employ a reserved value (often named null or nil) to indicate an invalid or uninitialized variable.
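In Python the reserved value is None, and reading a local variable before it has been given a value raises UnboundLocalError. The functions below are hypothetical examples of both situations.

```python
def find_index(items, target):
    # Returns the position of target, or None to signal "no valid value".
    for i, item in enumerate(items):
        if item == target:
            return i
    return None


def read_before_assignment():
    try:
        current = value      # value is local (assigned below) but unbound
    except UnboundLocalError:
        return "unbound"
    value = 1
    return current


print(find_index(["a", "b"], "b"))   # 1
print(find_index(["a", "b"], "z"))   # None
print(read_before_assignment())      # unbound
```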
Bound variables have values. A value, however, is an abstraction, an idea; in implementation, a value is represented by some data object, which is stored somewhere in computer memory. The program, or the runtime environment, must set aside memory for each data object and, since memory is finite, ensure that this memory is yielded for re-use when the object is no longer needed to represent some variable's value.
The handling of memory for variables is highly dependent on the programming language environment. Many language implementations handle the simplest cases easily by distinguishing those variables whose extent lasts no longer than a single function call. Space for these local variables is allocated on the execution stack, where their memory is automatically reclaimed when the function returns.
Space for other objects must be allocated on the heap, or pool of unused memory. These must be reclaimed specially when the objects are no longer needed. In a garbage-collected (gc) language such as Java or Lisp, the runtime environment automatically "reaps" objects when it can be proven that no extant variable refers to them. In a non-gc language such as C, it is up to the program (and thus the programmer) to explicitly allocate memory; and in turn to state when memory can be reclaimed, by explicitly freeing it. Failure to do so leads to memory leaks, in which the heap is depleted over the program's run. If the program runs long enough, it will exhaust available memory and fail.
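Automatic reclamation can be observed in Python with a weak reference, which does not keep its target alive. This is a sketch that assumes the behaviour of the standard CPython implementation; the class Node and the variable names are hypothetical, and the exact moment of reclamation may differ in other implementations.

```python
import gc
import weakref


class Node:
    pass


node = Node()
probe = weakref.ref(node)          # observes the object without owning it
alive_before = probe() is not None

del node                           # remove the only variable referring to it
gc.collect()                       # ask the collector to run now
alive_after = probe() is not None  # the object has been reclaimed
```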
Memory allocation goes beyond single variables. A variable may refer to a data structure created dynamically, where many structure components are not directly named by variables, but are reachable from a variable by traversing the structure. For this reason, garbage collectors (and programs in languages which lack them) must deal with the case where a portion of the memory reachable from a variable needs to be reclaimed.
Typed and untyped variables
In statically-typed languages such as Java or ML, a variable also has a type, meaning that only values of a given sort can be stored in it. In dynamically-typed languages such as Python or Lisp, it is values and not variables which carry type. See type system.
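The dynamically-typed case can be sketched in Python, where a single name (here the illustrative name value) is bound in turn to values of three different types.

```python
value = 42
kinds = [type(value).__name__]

value = "forty-two"                  # same name, now a string
kinds.append(type(value).__name__)

value = [4, 2]                       # and now a list
kinds.append(type(value).__name__)

print(kinds)  # ['int', 'str', 'list']
```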
Static typing of variables also allows some forms of polymorphism, such as overloading, to be resolved at compile time.
The arguments or formal parameters of functions are also referred to as variables. For instance, in these equivalent functions in Python and Lisp
def addtwo(x):
    return x + 2
(defun addtwo (x) (+ x 2))
the variable named x is an argument. It is given a value when the function is called. In most languages, function arguments have local scope; this specific variable named x can only be referred to within the addtwo function, though of course other functions can also have variables called x.
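The point about separate variables named x can be sketched as follows; the function double and the module-level x are assumptions added for illustration.

```python
def addtwo(x):
    return x + 2     # this x exists only during a call to addtwo


def double(x):
    return x * 2     # a different variable that happens to share the name


x = 10               # a module-level x, unaffected by either call
results = (addtwo(3), double(3), x)
print(results)  # (5, 6, 10)
```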
Last updated: 06-02-2005 02:04:05