R. M. R. Boltzmann Statistics
BOLTZMANN STATISTICS

Atomic particles in thermal equilibrium with their surroundings are in a 
state of rapid, but random, motion. In a gas, each particle moves along a straight 
line until it collides with another particle or a solid object such as the walls of the 
containing vessel. After a collision the particle rebounds in a random direction 
with random velocity until the next collision, and so on. The situation is much 
the same in a liquid except that the particles are closer together and the 
collisions are much more frequent. In a solid the heavier particles are more or 
less constrained to vibrate rapidly within the close neighborhood of a specific 
location. The three-dimensional components of the motion at any given instant 
are random both in speed and direction. I will now present some background
material intended to stimulate my reader's intuition about such matters.

The atoms in a solid tend to arrange themselves in orderly arrays, called 
crystal lattices, at least on a local scale. This can be seen by taking a 
photograph of X-rays scattered from almost any solid. If we know the wavelength 
of the X-rays, we can calculate the lattice spacings from the observed angles of 
diffraction. Lattice spacings are usually on the order of several Angstroms, while
the atoms themselves are often a factor of 3 or 4 less in diameter. (1 Angstrom = 
1E-10 meter). For our purposes here, we may imagine that the lattice sites in a 
solid are occupied by the heavy nuclei of the atoms which make up the solid and 
these are surrounded by a swarm of highly mobile electrons closely confined, for 
the most part, near the nuclei. The mass of a nuclear particle is roughly 1800 
times the mass of an electron. In the case of electrical conductors, a fraction of 
the electrons are more or less free to wander about through the lattice, able to 
transfer heat and electrical charge. In the case of insulating solids at low 
temperatures, almost all of the electrons are closely confined to the nuclei. 
These may aid in heat transfer by interactions with neighboring electrons, but 
they are not free to carry significant electric charge. At elevated temperatures, 
however, most insulators become conductors to some degree.

Although the atoms in a solid tend to be confined to specific locations in 
the lattice structure, even at room temperature they may occasionally escape the 
confines of their neighbors and wander to a similar location elsewhere in the 
lattice. At temperatures approaching the melting point the individual atoms may 
relocate themselves more or less rapidly while the lattice structure remains 
intact, but at the melting point even the structure disintegrates. 

In order to understand the process of diffusion in solids, it is necessary to 
understand the concepts of binding energy and random thermal motion. We note 
that the moon is bound to the earth by gravitational attraction. This pair as well 
as the other planets and their moons are also bound to the sun. Even the comets 
are bound to the sun, although they move at such high speed that they may 
spend years or centuries far from the sun or even the outermost planets before 
returning. When Newton reasoned out the laws of motion in a gravitational field, 
he realized that bodies of mass moving at or above a certain critical velocity 
could escape forever the gravitational field of the earth or the sun. We learned in
freshman physics that the critical escape velocity for a mass in the earth's field of 
gravity is roughly 7 miles/second. In an idealized two body model, a projectile 
fired in an upward direction at a lower velocity would eventually come to rest and 
return to the earth, but if the velocity were greater the projectile would continue
moving away indefinitely. The earth's atmosphere, for example, is relatively free 
of hydrogen while the moon is relatively free of all gas particles. A significant 
fraction of any hydrogen molecules, H2, in the atmosphere acquire a velocity in 
excess of that needed to escape the earth and the expected lifetime of a free 
hydrogen molecule in the atmosphere is short compared to the age of the earth. 
The escape velocity on the surface of the moon is roughly 1.5 miles/second and 
molecules lighter than argon will have an expected lifetime short compared to
the age of the moon.

When the kinetic theory of gases was being developed, the random nature 
of the motion of the individual particles was widely appreciated before Ludwig 
Boltzmann came up with a satisfactory mathematical expression to describe the 
situation. Before putting forth a non-rigorous argument intended to stimulate my 
reader's intuition, allow me to offer some other basic ideas, also intended as
an aid to the intuition.

This is perhaps the time and place to meet my primary challenge head on... 
how to explain what is essentially a mathematical argument to people, mostly 
children, with little or no mathematical background. There is no shortage of 
wisdom to the effect that it cannot be done and should not be tried, but I have 
watched enough television to know what is expected of me as a person of the 
male persuasion, namely that I will blunder on ahead heedless of the advice of 
wiser heads. "Ah, gentle dames, it gars me greet... etc.", in the immortal words of 
Bobby Burns.

A great deal of modern mathematics is devoted to the concept of the 
function. Roughly speaking, a function is a recipe for finding one number, called 
the dependent variable, given another number, called the independent variable. 
For example, my pocket calculator has some functions like EXP, SIN, COS, TAN, 
LOG, LN, SQRT, 1/x, etc. I can enter the independent variable from the keyboard, 
press the appropriate key, and the dependent variable appears in the register. I
know a number of people who are quite adept at using these and many similar 
buttons and computer icons to solve their everyday problems in business and 
engineering without any significant grasp of the underlying logic, although 
someone somewhere must understand. I can hope that at least one or two of my 
readers are sufficiently curious to take a few college level math courses 
whenever the opportunity may arise. Others may do very well indeed by simply 
forming a symbiotic relationship with a math nerd who is apt to be impaired with 
regard to the basic social skills. The mutual benefits could be substantial.

Variables may be thought of as symbols to which we can assign numerical 
values. To illustrate the concept further, consider the simple function y = 2x^2. x
and y are variables which can take on numerical values within some range, or 
domain, which we are relatively free to specify. If we restrict the domain of x to 
be the set of real numbers between -2 and 2, for example, we may write this as -2 
< x < 2. In this case the domain of y will be 0 <= y < 8. x is the independent 
variable, which means that we must specify its value first, and then, using the 
functional relationship, find y, the dependent variable. x is normally thought of 
as a continuous variable, which means that we may choose the value as precisely 
as we please. Furthermore, we can say that between any two specific values of x, 
however close to each other these values may be, we can always find an 
arbitrarily large number of values in between. The real number set is infinitely 
fine grained. Once x is chosen, we may then square it, multiply the result by 2, 
and find the value of y. The domain of y is continuous as well. For each value of 
x, there is one, and only one, value for y. In this case, y is a single valued 
function of x.

For any mathematical operation, we routinely insist on being able to invoke 
the inverse operation. My pocket calculator has an INV button which I can use 
whenever I wish to find the inverse of any of the functions listed above. With 
regard to the function y = 2x^2, we might prefer to specify y first and find the 
consequent value of x. We may, following the laws of algebra, divide both sides 
of this expression by 2 and then take the square root of both sides to get a new 
expression x = (+/-)SQRT(y/2) with y being the independent variable and x being 
the dependent variable. Note that these two functions are not the same. In 
particular, x is not a single valued function of y. For each y in the domain 0 <= y < 
8 there are two values of x since, as we are told in freshman algebra, (-x)^2 = +x^2.
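
For readers who like to experiment, here is a minimal sketch of these two functions 
in Python (my own illustration; the function names f and f_inverse are invented for 
the example, not taken from the text):

    import math

    def f(x):
        # The function y = 2x^2.
        return 2 * x**2

    def f_inverse(y):
        # The inverse: x = (+/-)SQRT(y/2); two values, so not single valued.
        root = math.sqrt(y / 2)
        return (root, -root)

    print(f(1.5))          # 4.5
    print(f_inverse(4.5))  # (1.5, -1.5)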

I was also taught in high school that since (-1)^2 = +1 it followed that SQRT(-1) 
= +i or -i, where i is the unit imaginary, take it or leave it. These concepts were
deeply troubling to me and I was unable to find anyone to explain away the 
difficulty I was having until my senior year in college. I finally found the 
satisfaction I was looking for in the concept of closure where we require that all 
operations, and their inverse, on numbers in a domain yield other numbers in the 
same domain. The set of positive integers, for example, yields only positive 
integers if all we are allowed to do is add and multiply. If, however, we wish to 
divide positive integers by positive integers of all sizes, we must invent rational 
fractions if we insist on closure. If we wish to subtract as well as add, we must 
invent negative numbers. If we wish to take squares and square roots when the 
independent domain includes fractions and negative numbers, we must then 
invent irrational numbers and imaginary numbers. I don't like the term, 
imaginary, because i = +SQRT(-1) is no more imaginary than -1 itself. I prefer the 
term, quadrature, since multiplication by i rotates a complex number by 90 
degrees counter-clockwise in the complex plane. Finally, we must invent the 
complex number set where each variable has both a real and a quadrature 
component; thus, x = a + ib is the complex form where a and b are real numbers.
The beauty of the complex number set is that a wide variety of common 
functions, including all of the partial differential equations of physics and their 
solutions, are to be found in the complex plane, where closure is always assured. 
If, in the above example, we allow x to be a complex variable, x = a + ib, then y = 
2x^2 = 2 * (a + ib)^2 = 2 * (a^2 - b^2) + i4ab is a complex variable as well.
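
Python's built-in complex type makes it easy to check this expansion numerically; 
the little sketch below is my own illustration, not part of the original text:

    # Check the expansion 2*(a + ib)^2 = 2*(a^2 - b^2) + i4ab for sample values.
    a, b = 3.0, 2.0
    x = complex(a, b)                  # x = a + ib
    lhs = 2 * x**2                     # computed directly in the complex plane
    rhs = complex(2 * (a**2 - b**2), 4 * a * b)
    print(lhs, rhs)                    # both print (10+24j)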

This one-to-one relationship is a primary reason why mathematical 
abstractions are so useful in describing concrete physical relationships. We can 
often find one-to-one relationships between measurable physical quantities and 
go on to find one-to-one mathematical relationships to correspond. We can then 
work with the mathematics and reliably predict what will happen in the physical 
situation. For example, people have observed that the distance travelled by a
vehicle moving at a certain speed is proportional to the time travelled and they 
have expressed this physical fact by the mathematical model D = S * t where D is 
the distance, miles, S is the speed, miles/hour, and t is the time, hours. We can 
reason out, to a moral certainty, the relationship between time, speed, and 
distance without actually having to carry out the exercise. This kind of reasoning 
has been applied to almost every observable relationship in nature. From careful 
measurements of the positions of the planets over many years Johannes Kepler 
was able to set forth several functional relationships which serve as major 
milestones in modern scientific thinking. From these relationships and his own 
intuition, Newton was able to work out his underlying theory of gravity from 
which he could derive Kepler's Laws. Newton published his PRINCIPIA in 1687 
and roughly 100 years later a new planet, Uranus, was discovered by accident in 
1781. The only problem was that the orbit of Uranus didn't quite obey the laws of 
motion as established. Going on the hope that Newton's Laws were indeed valid 
after all and the discrepancies in the motion of Uranus were due to the presence 
of another planet, as yet undiscovered, astronomers made a number of tedious 
calculations trying to discover where to look for the new planet. Neptune was 
finally discovered in 1846 and, since the discovery of Neptune did not entirely 
explain the orbit of Uranus, Pluto was found in 1930. These discoveries illustrate 
an essential characteristic of scientific theories and mathematical models... 
although based on well established observations there are almost always a few 
fine points which, upon closer examination, lead to new and unexpected 
discoveries. 

Having introduced the concepts of dependent and independent variables, 
we now turn to the concept of random variables. We recall that the dependent 
variable is found after we are given the independent variable and the functional 
relationship between the two variables. A random variable is a number whose 
value is determined as the result of an experiment, either a physical experiment 
or an experiment in the mind. Consider, for example, a roll of a single die. The 
number of spots showing on the top of the cube is a random variable whose 
domain is the set of integers... 1, 2, 3, 4, 5, and 6.

Much of the theory of probability is concerned with finding the distribution 
of random variables. This is a device for relating random variables to continuous 
variables, or at least piecewise continuous variables. We seek a continuous, or 
piecewise continuous, function which represents the probability that the 
numerical outcome of an experiment is less than some continuous variable, or 
lies within some range of a continuous variable. The notation, P{Nrv < Xcv} = F(X) 
is often used. This says the probability that N, a random variable outcome of an 
experiment, is less than a continuous variable X, is a continuous, or piecewise 
continuous, function of X. F(X) is sometimes referred to as the Cumulative 
Distribution Function, CDF, of X. When N is a real number and the domain of X is 
the real number line, F(-∞) = 0 while F(+∞) = 1. There is no probability that the 
numerical outcome of any experiment will be allowed to have a value less than 
-∞. The probability that the numerical outcome of any experiment is less than +∞ 
is certainty, or 1. All CDFs have the value zero when X is arbitrarily large in the 
negative direction. All CDFs increase in value, and never decrease in value, as X 
increases. All CDFs -> 1 as X -> +∞. At the risk of belaboring the point, consider 
the CDF when N is the number of spots showing at the roll of a single die. Since 
we can think of no rational argument to the contrary, we will assume that all 
outcomes are equally likely, thus P{N=1} = P{N=2}, etc., P{N=6} = 1/6. Thus

for -∞ < X < 1, F(X) = 0
for 1 <= X < 2, F(X) = 1/6
for 2 <= X < 3, F(X) = 2/6 = 1/3
for 3 <= X < 4, F(X) = 3/6 = 1/2
for 4 <= X < 5, F(X) = 4/6 = 2/3
for 5 <= X < 6, F(X) = 5/6
for 6 <= X < +∞, F(X) = 1

The Cumulative Distribution Function of X, a continuous real variable, is 
continuous everywhere except at the six discrete values of X (X = 1, 2, 3, 4, 5, 
or 6), where it jumps by 1/6. The CDF is thus a six step staircase from 0 to 1 from 
which we can find the value of P{Nrv < Xcv} = F(X) for any real value of X.
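
The staircase is easy to express directly; the following sketch (my own, in Python) 
returns F(X) for any real X:

    def die_cdf(x):
        # P{N < x} for one fair die: count the outcomes 1..6 strictly below x,
        # each carrying probability 1/6.
        return sum(1 for spots in range(1, 7) if spots < x) / 6

    print(die_cdf(0.5))   # 0.0  (no outcome lies below 0.5)
    print(die_cdf(3.5))   # 0.5  (outcomes 1, 2, and 3)
    print(die_cdf(7.0))   # 1.0  (all six outcomes)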

Another common probability function regarding random variables is the 
Probability Density Function, PDF, which specifies the probability that the 
numerical outcome of an experiment lies within a specified range of values. We 
are interested in P{X < Nrv < X + dX} = f(X) * dX. In other words, the probability 
that the random variable, Nrv, falls between X and X + dX, where dX is an 
arbitrarily small increment, is given by f(X) * dX. In the case of the number of 
spots showing at the roll of a single die, this function is zero except when the 
interval from X to X + dX includes one of the integers 1, 2, 3, 4, 5, or 6. The value 
of f(X) cannot be determined when X = 1, 2, 3, 4, 5, or 6, but the product, f(X) * dX 
= 1/6 in each case. 

In the discussion of random variables so far, I have only considered the 
case of a particular discrete random variable. Before the proliferation of digital 
computers, most math and science students were heavily schooled in analytic 
functions which deal, for the most part, with continuous complex numbers. 
Discontinuous events, such as the instantaneous onset of the flow of electric 
current when a switch is closed, were a bit awkward to describe even though a 
number of completely satisfying devices had been invented to deal with them. 
Gases were known to consist of discrete particles, roughly 3e19 of them per 
cm^3 in the air we breathe, but slide rules were only accurate to 2 or 3 significant 
figures, so for all practical purposes gases were a continuous medium and could
be adequately described by analytic functions. Modern computers are quite 
capable of discrete function analysis to more significant figures than are usually 
needed, but to discard analytic functions and the lore surrounding them would be 
a great mistake... like forgetting how to make stone tools when someone leaves 
you a pile of iron strapping. 

Purists today might recognize that the variables described by the partial 
differential equations of physics are really the expected values of random 
variables. The expected value of a random variable is, roughly speaking, the 
average value. In the case of the number of spots showing after the roll of a 
single die, the expected value is 3.5. Some explanation seems in order since we 
never expect to see 3.5 spots showing. By way of sharpening your intuition, 
consider the following carnival game: a single die is rolled and the payoff is $1 
for each spot showing. What is a fair wager to play this game? If the only 
outcome which paid $1 was the showing of a single spot, the player would win 
only 1/6th of the time, on average, and the fair price to play would be $1/6 = $0.167. 
If 2 spots paid $2, the fair price of that chance alone is $2/6 = $0.333, and so on. 
The value of an expectation is the sum of the payoff at a specific outcome times 
the probability of that outcome taken over all possibilities. The sum, 1/6 + 2/6 + 
3/6 + 4/6 + 5/6 + 6/6 = 21/6 = 3.5. The fair price to play the carnival game is $3.50... 
a greater price favors the house, a lesser price favors the player. 
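
The expectation sum is a one-liner; this small Python check (mine, not the 
author's) reproduces the $3.50 fair price:

    # Fair price = sum over all outcomes of (payoff * probability).
    fair_price = sum(spots * (1 / 6) for spots in range(1, 7))
    print(fair_price)     # 3.5, i.e. $3.50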

Before getting back to Boltzmann and the energy distribution of gas 
particles, let us play a few more mind games with the dice. Consider the number 
of spots showing at the roll of 2 dice, a red one and a green one. There is only 
one way to get snake eyes or boxcars... both dice must come up 1 for snake 
eyes or 6 for boxcars. There are 2 ways to get 3 spots showing... Nred=2 and 
Ngreen=1 or vice versa. There are 3 ways to have 4 spots showing, and so on. 
We can enumerate each case and find that there are 36 ways all told for the two 
dice to turn up. If we propose a carnival game in which the payoff is the number 
of spots showing, in dollars, we can use the principle stated above to find the fair 
price to play the game is $7. The expected number of spots showing on the roll 
of 2 dice is 7. The probability of snake eyes and the probability of boxcars is 1/36 
in each case. The reader is encouraged to calculate these results for himself as 
well as the expected number of spots showing.
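
One way to carry out that calculation is to let the machine enumerate all 36 
cases; the sketch below (my own Python, not the author's) confirms both the $7 
expectation and the 1/36 probabilities:

    from itertools import product

    outcomes = list(product(range(1, 7), repeat=2))   # all 36 (red, green) pairs
    totals = [red + green for red, green in outcomes]

    print(len(outcomes))                   # 36 ways all told
    print(sum(totals) / len(totals))       # 7.0, the expected number of spots
    print(totals.count(2) / 36)            # 1/36 for snake eyes
    print(totals.count(12) / 36)           # 1/36 for boxcars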

In general, the expected number of spots showing after the roll of M dice 
is 3.5 * M. Consider the roll of 10 dice. We expect to see somewhere in the 
neighborhood of 35 spots showing. Quite generally, if we can choose an 
outcome in P ways and then independently choose a second outcome in Q ways 
and a third outcome in R ways, etc., we can make all three choices in P * Q * R 
ways, and so on. There are thus 6^10 ways all told that 10 dice could come 
up... 60,466,176 ways, in fact. It is certainly possible that all of the dice could 
show a 1 or a 6 or any other single number, but the probabilities are extremely 
small. The expected number of spots is 35, but 33, 34, 36, & 37 are almost 
equally likely. An exact calculation of the number of ways 10 dice can show in 
the vicinity of 35 spots is a tedious proposition, but we can make a fairly good 
estimate using the logic set forth below.

The expected value of a random variable is the sum, taken over all possible 
outcomes, of the probability of each outcome times the numerical value of that 
outcome. We also need to know something about the spread of values... how 
widely dispersed about the expected value we can expect the outcomes to be. 
The variance of a random variable is the measure we seek. The variance is 
defined as the sum, taken over all possible outcomes, of the probability of each 
outcome times the square of the difference between each outcome and the 
expected value. In the case of a single die, the calculation is carried out as 
follows:

Outcome    (EV - OC)^2 * P{OC} = Value

   1       (3.5 - 1)^2 / 6 = 6.25/6
   2       (3.5 - 2)^2 / 6 = 2.25/6
   3       (3.5 - 3)^2 / 6 = 0.25/6
   4       (3.5 - 4)^2 / 6 = 0.25/6
   5       (3.5 - 5)^2 / 6 = 2.25/6
   6       (3.5 - 6)^2 / 6 = 6.25/6

SUM = VARIANCE = 17.5/6 = 2.91667

The Standard Deviation may be a more familiar term to some. This is simply the 
square root of the variance; in this case the standard deviation, sigma = 
SQRT(2.91667) = 1.7078. This result doesn't do much for our intuition in this case 
except to indicate that the dispersion of the number of spots showing as related 
to the expected value is fairly broad for a single die. As the number of dice is 
increased, however, the dispersion narrows considerably as we shall see.
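
The same table can be produced mechanically; here is a small Python sketch of 
the variance calculation (again my own illustration):

    import math

    ev = sum(oc / 6 for oc in range(1, 7))                    # expected value, 3.5
    variance = sum((ev - oc)**2 / 6 for oc in range(1, 7))    # 17.5/6 = 2.91667
    sigma = math.sqrt(variance)                               # 1.7078
    print(ev, variance, sigma)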

Let us revisit the Central Limit Theorem, mentioned earlier. The number of 
spots showing at the toss of a number of dice is the sum of the spots showing on 
each individual die. The Central Limit Theorem tells us that the distribution of a 
random variable which is made up as the sum of random variables tends to a 
centralized distribution having a specific form without regard, in many cases, to 
the distributions of the component random variables. The centralized 
distribution is the Gaussian, or Normal, distribution which is the familiar bell 
shaped curve we study in statistics courses. The expected value of the SUM is 
the sum of the expected values of its components, while the variance of the SUM 
is the sum of the variances of the components, provided that the expected value 
and the variance of each component are small compared to the total. In other 
words, the Central Limit Theorem breaks down if any single component of the 
SUM dominates the process. 

In the case of a roll of 10 dice, the expected number of spots showing is 10 
* 3.5 = 35, while the variance is 10 * 2.91667 = 29.1667. The standard deviation is 
SQRT(29.1667) = 5.40. We will not, therefore, be surprised to see as many as 35 + 5 
= 40 spots or as few as 35 - 5 = 30 spots, but outside of this range, the probabilities 
get progressively smaller, and rapidly at that. We do not expect to see all 10 dice 
show the same number of spots in our lifetime, although it is clearly a possibility. 
We can, however, write a simple BASIC computer routine to roll the dice in 
cyberspace and look for such an outcome, all the while keeping track of the 
number of rolls. When we get around to looking at diffusion in rocks we will see 
that some extremely rare events must happen in nature often enough to result in 
measurable concentration profile changes over geological or archaeological 
times.
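
Such a routine might look like the following; I have sketched it in Python rather 
than the BASIC the text mentions. The probability of all ten dice matching is 
6/6^10, about one chance in ten million per roll, so the loop may run for a minute 
or two before it succeeds:

    import random

    rolls = 0
    while True:
        rolls += 1
        dice = [random.randint(1, 6) for _ in range(10)]
        if len(set(dice)) == 1:     # all ten dice show the same number of spots
            break

    print("All ten dice matched after", rolls, "rolls")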

I once shared an office with someone who spent a great deal of time 
studying the history of science and I may have got the impression from him that 
some of the universally accepted theorems in probability had never been properly 
proved. Moreover, certain major aspects of the Central Limit Theorem had only 
been proved during the heroic efforts of scientists working on war time projects 
during WWII. The name Norbert Wiener, of CYBERNETICS fame, rings a bell. 
Boltzmann probably worried about the lack of rigor inherent in his treatment of the 
statistics of gases, but that didn't stop him from using his intuition and 
speculating. Newton had pretty well established that applying a force to a mass 
for a period of time would result in a change of the velocity of the mass. If we 
could somehow introduce a gas particle at rest into an aggregate of other gas 
particles, it would experience a series of impacts at random time intervals and in 
random directions and thus come into thermal equilibrium with its neighbors. 
The final random velocity components in the x, y, and z directions would result 
from the sum of the random impacts in those directions. From the state of the 
Central Limit Theorem in his day, Boltzmann reasoned that the three components 
of the velocity of a particle in thermal equilibrium in a gas were, most likely, 
distributed normally. The expected value of each component of the 
instantaneous velocity would be the sum of the expected values of the individual 
impulses while the variance of the instantaneous velocity would be the sum of 
the variances of the individual impulses. The number of impulses is arbitrarily 
large and we are in no position to estimate the indicated sums. We can, however, 
take the bull by the horns and say that, since the gas in a stationary closed 
vessel is not going anywhere, the average, or expected, value of the velocity 
components is zero. The variance presents a similar dilemma, but we can 
suppose that the sum of the individual variances is neither zero nor infinite. 
Boltzmann suggested that the variance of each velocity component was simply 
kT/m, so that the average kinetic energy per component is kT/2, where k, known 
today as Boltzmann's Constant, is to be determined experimentally, m is the mass 
of the particle, and T is the absolute temperature in kelvins.

Moving ahead using Boltzmann's assumption, we can calculate a number of 
the properties of a gas such as its pressure, viscosity, specific heat, and thermal 
conductivity and compare the results with experiment. The value of Boltzmann's 
Constant was thus found in a number of independent ways, all giving about the same 
result. Continued experimental refinement over the years has led us to believe that we 
now know Boltzmann's Constant to within 32 parts per million. 

Although the x, y, and z velocity components are normally distributed, the 
random kinetic energy of a particle in any specific direction is exponentially 
distributed. The probability that the kinetic energy exceeds some critical value, say 
Qd electron volts, is just EXP(-Qd/kT) = e^(-Qd/kT). (I will describe the EXPonential 
function in a bit more detail later when it comes time to discuss Fick's Laws). At room 
temperature, roughly 300 kelvins (26.8 °C), kT is approximately 0.026 eV, where 1 eV 
is the energy acquired by an electron falling through a potential of 1 volt. 
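
To put numbers on this, here is a small Python check (my own sketch, not part of 
the original text; the value of Boltzmann's Constant in eV per kelvin is the 
standard 8.617e-5, and the 0.456 eV threshold is the figure quoted below for water):

    import math

    k_eV = 8.617e-5        # Boltzmann's Constant in eV per kelvin
    T = 300.0              # room temperature, kelvins
    kT = k_eV * T          # about 0.026 eV

    Qd = 0.456             # threshold energy in eV (the water figure quoted below)
    print(kT)                    # 0.025851
    print(math.exp(-Qd / kT))    # about 2e-8, an extraordinarily small number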

No course in bonehead chemistry would be complete without a discussion of 
surface tension and the heat of vaporization. The emphasis is apt to be on the 
evaporation of water... the transformation of water molecules on the surface of the 
liquid to water vapor in the space adjacent to the surface. The molecules within a 
liquid, as in a gas, are in rapid thermal motion and the random energy distribution 
also follows the Boltzmann Law. In a gas, the mean free path between collisions is
typically hundreds or thousands of molecular diameters while in a liquid the 
molecules are typically 1 to 3 molecular diameters apart. In an ideal gas, essentially 
all of the energy is stored in the kinetics of motion, while in a liquid a large fraction of 
the total energy is stored in the attractive forces between the molecules. Work must 
be done to separate the molecules from each other. In the case of water, the escape 
energy is roughly 0.456 eV. When evaporation occurs, the escape energy comes from 
the tail end of the random thermal energy distribution. The threshold energy for the 
transition from liquid to vapor is called the heat of vaporization. The threshold energy 
for the transition from solid to liquid is called the heat of fusion. The threshold energy 
for the transition from one site of residence within a solid to a similar site of residence 
is called the heat of diffusion. There are other transitions from one state of matter to 
another characterized by a specific threshold of energy... adsorption, absorption, and 
desorption, to name a few. The probability that a particle in thermal equilibrium with 
its surroundings has sufficient energy at any specific time in excess of some 
threshold energy, Q, is just EXP(-Q/kT), typically an extraordinarily small number. 
Transitions do occur, however, because the frequency of attempts is very large, on 
the order of 1e12 to 1e13 times per second. The dwell time in such a situation is a 
random variable whose expectation is the reciprocal of the product of the probability 
of escape at a trial and the frequency of trials... a very small number times a very large 
number. Expected dwell times in our ordinary experience range from 
nanoseconds to the age of the universe, depending on the temperature and the 
binding energy. 
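
The dwell-time estimate is equally simple to sketch; the attempt frequency of 
1e13 per second is taken from the text, and the rest is my own illustration:

    import math

    k_eV = 8.617e-5     # Boltzmann's Constant, eV per kelvin
    nu = 1e13           # attempt frequency, trials per second (from the text)

    def expected_dwell_time(Q, T):
        # Reciprocal of (escape probability per trial) * (trials per second).
        return 1.0 / (nu * math.exp(-Q / (k_eV * T)))

    print(expected_dwell_time(0.456, 300))   # water at room temperature: ~5e-6 s
    print(expected_dwell_time(2.0, 300))     # a tightly bound atom: ~4e20 s,
                                             # far longer than the age of the universe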

Some old-time math teachers have told me that shortcut devices like these have cost 
us a generation of mathematicians, or at least most of those who survived New Math. 
My earlier story about the barrel hoops comes to mind. Algebra was invented in 
antiquity by an Arab olive oil merchant, one Al Jebra, by way of keeping his financial
affairs in order, or so I have been told.

I will discuss irrational numbers further in an appendix.

My mother-in-law could not bring herself to get on an airplane. She said that 
she couldn't see what held it up. It didn't help her when I explained that 
mathematics held it up.






