Analysis

Elliptic functions and integrals

The terminology for elliptic integrals and functions has changed during their investigation. What were originally called elliptic functions are now called elliptic integrals, and the term elliptic function is reserved for a different idea. We will therefore use modern terminology throughout this article to avoid confusion.

It is important to understand how mathematicians thought differently at different periods. Early algebraists had to prove their formulas by geometry. Similarly early workers with integration considered their problems solved if they could relate an integral to a geometric object.

Many integrals arose from attempts to solve mechanical problems. For example the period of a simple pendulum was found to be related to an integral which expressed arc length but no form could be found in terms of ‘simple’ functions. The same was true for the deflection of a thin elastic bar.

The study of elliptic integrals can be said to start in 1655 when Wallis began to study the arc length of an ellipse. In fact he considered the arc lengths of various cycloids and related these arc lengths to that of the ellipse. Both Wallis and Newton published an infinite series expansion for the arc length of the ellipse.

At this point we should give a definition of an elliptic integral. It is one of the form

∫ r(x, √p(x)) dx

where r(x,y) is a rational function in two variables and p(x) is a polynomial of degree 3 or 4 with no repeated roots.
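
As a concrete illustration (a standard computation, not part of the original text), the arc length of the ellipse that Wallis studied already has this form. For the ellipse x = a sin θ, y = b cos θ with eccentricity e, so that b² = a²(1 – e²), the substitution t = sin θ gives the arc length measured from θ = 0 as

```latex
s \;=\; \int_0^{\sin\theta} \frac{a\,(1 - e^2 t^2)}{\sqrt{(1 - t^2)(1 - e^2 t^2)}}\; dt .
```

Here p(t) = (1 – t²)(1 – e²t²) is a quartic with the distinct roots ±1 and ±1/e when 0 < e < 1, and the integrand is a rational function of t and √p(t), so this is an elliptic integral.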

In 1679 Jacob Bernoulli attempted to find the arc length of a spiral and encountered an example of an elliptic integral.

Jacob Bernoulli, in 1694, made an important step in the theory of elliptic integrals. He examined the shape that an elastic rod will take if compressed at the ends. He showed that the curve satisfied

ds/dt = 1/√(1 – t⁴)

then introduced the lemniscate curve

(x² + y²)² = x² – y²

whose arc length is given by the integral from 0 to x of

dt/√(1 – t⁴)

This integral, which clearly satisfies the above definition and so is an elliptic integral, became known as the lemniscate integral.

This is a particularly simple case of an elliptic integral. Notice for example that it is similar in form to the function sin⁻¹(x) which is given by the integral from 0 to x of

dt/√(1 – t²)
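
The analogy can be checked numerically. Here is a minimal sketch (ours, not part of the original article) comparing the two integrals with simple midpoint quadrature; the choice of rule and of x = 0.5 is arbitrary:

```python
from math import asin, sqrt

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

x = 0.5
# The arcsine integral reproduces sin⁻¹(x):
print(midpoint_integral(lambda t: 1 / sqrt(1 - t**2), 0, x), asin(x))
# The lemniscate integral, the arc length of the lemniscate from 0 to x:
print(midpoint_integral(lambda t: 1 / sqrt(1 - t**4), 0, x))  # ≈ 0.503209
```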

The other good feature of the lemniscate integral is that, although it is simple, many of its properties generalise to more general elliptic integrals, while the geometric intuition coming from the arc length of the lemniscate curve aids understanding.

In the year 1694 Jacob Bernoulli considered another elliptic integral

t² dt/√(1 – t⁴)

and conjectured that it could not be expressed in terms of ‘known’ functions such as sin, exp and sin⁻¹.

A history of the calculus

The main ideas which underpin the calculus developed over a very long period of time indeed. The first steps were taken by Greek mathematicians.

To the Greeks numbers were ratios of integers so the number line had “holes” in it. They got round this difficulty by using lengths, areas and volumes in addition to numbers for, to the Greeks, not all lengths were numbers.

Zeno of Elea, about 450 BC, gave a number of problems which were based on the infinite. For example he argued that motion is impossible:-

If a body moves from A to B then before it reaches B it passes through the mid-point, say B₁, of AB. Now to move to B₁ it must first reach the mid-point B₂ of AB₁. Continue this argument to see that the body must move through an infinite number of distances and so cannot move.

Leucippus, Democritus and Antiphon all made contributions to the Greek method of exhaustion which was put on a scientific basis by Eudoxus about 370 BC. The method of exhaustion is so called because one thinks of the areas measured expanding so that they account for more and more of the required area.

However Archimedes, around 225 BC, made one of the most significant of the Greek contributions. His first important advance was to show that the area of a segment of a parabola is 4/3 the area of a triangle with the same base and vertex and 2/3 of the area of the circumscribed parallelogram. Archimedes constructed an infinite sequence of triangles starting with one of area A and continually adding further triangles between the existing ones and the parabola to get areas

A, A + A/4 , A + A/4 + A/16 , A + A/4 + A/16 + A/64 , …

The area of the segment of the parabola is therefore

A(1 + 1/4 + 1/4² + 1/4³ + …) = (4/3)A.

This is the first known example of the summation of an infinite series.
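
In modern terms this is just the geometric series with ratio 1/4; a few lines of Python (a modern check, not Archimedes’ argument) confirm the value:

```python
# Summing the triangle areas A, A + A/4, A + A/4 + A/16, ... with A = 1;
# the partial sums approach (4/3)A.
A = 1.0
total, term = 0.0, A
for _ in range(30):
    total += term
    term /= 4
print(total, 4 * A / 3)  # both ≈ 1.3333333333333333
```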

Archimedes used the method of exhaustion to find an approximation to the area of a circle. This, of course, is an early example of integration which led to approximate values of π.


Among other ‘integrations’ by Archimedes were the volume and surface area of a sphere, the volume and area of a cone, the area of an ellipse, the volume of any segment of a paraboloid of revolution and a segment of an hyperboloid of revolution.

No further progress was made until the 16th Century when mechanics began to drive mathematicians to examine problems such as centres of gravity. Luca Valerio (1552-1618) published De quadratura parabolae in Rome (1606) which continued the Greek methods of attacking this type of area problem. Kepler, in his work on planetary motion, had to find the area of sectors of an ellipse. His method consisted of thinking of areas as sums of lines, another crude form of integration, but Kepler had little time for Greek rigour and was rather lucky to obtain the correct answer after making two cancelling errors in this work.

Three mathematicians, born within three years of each other, were the next to make major contributions. They were Fermat, Roberval and Cavalieri. Cavalieri was led to his ‘method of indivisibles’ by Kepler’s attempts at integration. He was not rigorous in his approach and it is hard to see clearly how he thought about his method. It appears that Cavalieri thought of an area as being made up of components which were lines and then summed his infinite number of ‘indivisibles’. He showed, using these methods, that the integral of xⁿ from 0 to a was aⁿ⁺¹/(n + 1) by showing the result for a number of values of n and inferring the general result.

Roberval considered problems of the same type but was much more rigorous than Cavalieri. Roberval looked at the area between a curve and a line as being made up of an infinite number of infinitely narrow rectangular strips. He applied this to the integral of xᵐ from 0 to 1 which he showed had approximate value

(0ᵐ + 1ᵐ + 2ᵐ + … + (n-1)ᵐ)/nᵐ⁺¹.

Roberval then asserted that this tends to 1/(m + 1) as n tends to infinity, so calculating the area.
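
A quick computation (a modern illustration, not Roberval’s) shows how the sum approaches 1/(m + 1):

```python
# Roberval's Riemann-type sum (0^m + 1^m + ... + (n-1)^m) / n^(m+1)
# tends to 1/(m + 1) as n grows.
def roberval_sum(m, n):
    return sum(k**m for k in range(n)) / n**(m + 1)

for m in (1, 2, 3):
    print(m, roberval_sum(m, 10**5), 1 / (m + 1))
```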

Fermat was also more rigorous in his approach but gave no proofs. He generalised the parabola and hyperbola:-

Parabola:    y/a = (x/b)²  to  (y/a)ⁿ = (x/b)ᵐ
Hyperbola:   y/a = b/x  to  (y/a)ⁿ = (b/x)ᵐ.

In the course of examining y/a = (x/b)ᵖ, Fermat computed the sum of rᵖ from r = 1 to r = n.

Fermat also investigated maxima and minima by considering when the tangent to the curve was parallel to the x-axis. He wrote to Descartes giving the method essentially as used today, namely finding maxima and minima by calculating when the derivative of the function was 0. In fact, because of this work, Lagrange stated clearly that he considers Fermat to be the inventor of the calculus.

Descartes produced an important method of determining normals in La Géométrie in 1637 based on double intersection. De Beaune extended his methods and applied them to tangents, where double intersection translates into double roots. Hudde discovered a simpler method, known as Hudde’s Rule, which basically involves the derivative. Descartes’ method and Hudde’s Rule were important in influencing Newton.

Huygens was critical of Cavalieri‘s proofs saying that what one needs is a proof which at least convinces one that a rigorous proof could be constructed. Huygens was a major influence on Leibniz and so played a considerable part in producing a more satisfactory approach to the calculus.

The next major step was provided by Torricelli and Barrow. Barrow gave a method of tangents to a curve in which the tangent is given as the limit of a chord as the points approach each other, a construction known as Barrow’s differential triangle.


Both Torricelli and Barrow considered the problem of motion with variable speed. The derivative of the distance is velocity and the inverse operation takes one from the velocity to the distance. Hence an awareness of the inverse of differentiation began to evolve naturally and the idea that integral and derivative were inverses of each other was familiar to Barrow. In fact, although Barrow never explicitly stated the fundamental theorem of the calculus, he was working towards the result and Newton was to continue in this direction and state the Fundamental Theorem of the Calculus explicitly.

Torricelli‘s work was continued in Italy by Mengoli and Angeli.

Newton wrote a tract on fluxions in October 1666. This was a work which was not published at the time but seen by many mathematicians and had a major influence on the direction the calculus was to take. Newton thought of a particle tracing out a curve with two moving lines which were the coordinates. The horizontal velocity x′ and the vertical velocity y′ were the fluxions of x and y associated with the flux of time. The fluents or flowing quantities were x and y themselves. With this fluxion notation y′/x′ was the tangent to f(x, y) = 0.

In his 1666 tract Newton discusses the converse problem: given the relationship between x and y′/x′, find y. Hence the slope of the tangent was given for each x and when y′/x′ = f(x) then Newton solves the problem by antidifferentiation. He also calculated areas by antidifferentiation and this work contains the first clear statement of the Fundamental Theorem of the Calculus.

Newton had problems publishing his mathematical work. Barrow was in some way to blame for this since the publisher of Barrow‘s work had gone bankrupt and publishers were, after this, wary of publishing mathematical works! Newton‘s work on Analysis with infinite series was written in 1669 and circulated in manuscript. It was not published until 1711. Similarly his Method of fluxions and infinite series was written in 1671 and published in English translation in 1736. The Latin original was not published until much later.

In these two works Newton calculated the series expansion for sin x and cos x and the expansion for what was actually the exponential function, although this function was not established until Euler introduced the present notation eˣ.


Newton’s next mathematical work was Tractatus de Quadratura Curvarum which he wrote in 1693 but which was not published until 1704, when it appeared as an appendix to his Opticks. This work contains another approach which involves taking limits. Newton says

In the time in which x by flowing becomes x + o, the quantity xⁿ becomes (x + o)ⁿ, i.e. by the method of infinite series, xⁿ + noxⁿ⁻¹ + ((n² – n)/2)o²xⁿ⁻² + …

At the end he lets the increment o vanish by ‘taking limits’.
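
In modern notation (a reconstruction of the argument, not Newton’s own wording) the computation reads:

```latex
\frac{(x+o)^n - x^n}{o}
  \;=\; n x^{n-1} + \frac{n^2 - n}{2}\, o\, x^{n-2} + \cdots
  \;\longrightarrow\; n x^{n-1} \qquad (o \to 0).
```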

Leibniz learnt much on a European tour which led him to meet Huygens in Paris in 1672. He also met Hooke and Boyle in London in 1673 where he bought several mathematics books, including Barrow‘s works. Leibniz was to have a lengthy correspondence with Barrow. On returning to Paris Leibniz did some very fine work on the calculus, thinking of the foundations very differently from Newton.

Newton considered variables changing with time. Leibniz thought of variables x, y as ranging over sequences of infinitely close values. He introduced dx and dy as differences between successive values of these sequences. Leibniz knew that dy/dx gives the tangent but he did not use it as a defining property.

For Newton integration consisted of finding fluents for a given fluxion so the fact that integration and differentiation were inverses was implied. Leibniz used integration as a sum, in a rather similar way to Cavalieri. He was also happy to use ‘infinitesimals’ dx and dy where Newton used x′ and y′ which were finite velocities. Neither Leibniz nor Newton thought in terms of functions, however; both always thought in terms of graphs. For Newton the calculus was geometrical while Leibniz took it towards analysis.

Leibniz was very conscious that finding a good notation was of fundamental importance and thought a lot about it. Newton, on the other hand, wrote more for himself and, as a consequence, tended to use whatever notation he thought of on the day. Leibniz‘s notation of d and ∫ highlighted the operator aspect which proved important in later developments. By 1675 Leibniz had settled on the notation

∫ y dy = y²/2

written exactly as it would be today. His results on the integral calculus were published in 1684 and 1686 under the name ‘calculus summatorius’; the name integral calculus was suggested by Jacob Bernoulli in 1690.

After Newton and Leibniz the development of the calculus was continued by Jacob Bernoulli and Johann Bernoulli. However when Berkeley published his Analyst in 1734 attacking the lack of rigour in the calculus and disputing the logic on which it was based much effort was made to tighten the reasoning. Maclaurin attempted to put the calculus on a rigorous geometrical basis but the really satisfactory basis for the calculus had to wait for the work of Cauchy in the 19th Century.

The trigonometric functions

The use of trigonometric functions arises from the early connection between mathematics and astronomy. Early work with spherical triangles was as important as work with plane triangles.

The first work on trigonometric functions related to chords of a circle. Given a circle of fixed radius, 60 units were often used in early calculations, then the problem was to find the length of the chord subtended by a given angle. For a circle of unit radius the length of the chord subtended by the angle x was 2sin (x/2). The first known table of chords was produced by the Greek mathematician Hipparchus in about 140 BC. Although these tables have not survived, it is claimed that twelve books of tables of chords were written by Hipparchus. This makes Hipparchus the founder of trigonometry.

The next Greek mathematician to produce a table of chords was Menelaus in about 100 AD. Menelaus worked in Rome producing six books of tables of chords which have been lost but his work on spherics has survived and is the earliest known work on spherical trigonometry. Menelaus proved a property of plane triangles and the corresponding spherical triangle property known as the regula sex quantitatum.

Ptolemy was the next author of a book of chords, showing the same Babylonian influence as Hipparchus, dividing the circle into 360° and the diameter into 120 parts. The suggestion here is that he was following earlier practice when the approximation 3 for π was used. Ptolemy, together with the earlier writers, used a form of the relation sin²x + cos²x = 1, although of course they did not actually use sines and cosines but chords.

Similarly, in terms of chords rather than sin and cos, Ptolemy knew the formulas

sin(x + y) = sin x cos y + cos x sin y
a/sin A = b/sin B = c/sin C.

Ptolemy calculated chords by first inscribing regular polygons of 3, 4, 5, 6 and 10 sides in a circle. This allowed him to calculate the chord subtended by angles of 36°, 72°, 60°, 90° and 120°. He then found a method of finding the chord subtended by half the arc of a known chord and this, together with interpolation, allowed him to calculate chords with a good degree of accuracy. Using these methods Ptolemy found that sin 30′ (30′ = half of 1°), which corresponds to the chord of 1°, was, as a number to base 60, 0 31′ 25″. Converted to decimals this is 0.0087268 which is correct to 6 decimal places, the answer to 7 decimal places being 0.0087265.
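
A short check (ours; the sexagesimal place values below are our interpretation of the notation “0 31′ 25″”) compares Ptolemy’s value with the modern sine:

```python
from math import sin, radians

# Interpreting "0 31' 25''" as the sexagesimal fraction 0;0,31,25 of the radius.
ptolemy = 31 / 60**2 + 25 / 60**3
print(f"{ptolemy:.7f}")            # 0.0087269
print(f"{sin(radians(0.5)):.7f}")  # 0.0087265
```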

The sine of an angle first appears in the work of the Hindus. Aryabhata, in about 500, gave tables of half chords which now really are sine tables and used jya for our sin. This same table was reproduced in the work of Brahmagupta (in 628) and detailed methods for constructing a table of sines for any angle were given by Bhaskara in 1150.

The Arabs worked with sines and cosines and by 980 Abu’l-Wafa knew that

sin 2x = 2 sin x cos x

although it could easily have been deduced from Ptolemy’s formula sin(x + y) = sin x cos y + cos x sin y with x = y.

The Hindu word jya for the sine was adopted by the Arabs who called the sine jiba, a meaningless word with the same sound as jya. Now jiba became jaib in later Arab writings and this word does have a meaning, namely a ‘fold’. When European authors translated the Arabic mathematical works into Latin they translated jaib into the word sinus meaning fold in Latin. In particular Fibonacci‘s use of the term sinus rectus arcus soon encouraged the universal use of sine.

Chapters of Copernicus’s book giving all the trigonometry relevant to astronomy were published in 1542 by Rheticus. Rheticus also produced substantial tables of sines and cosines which were published after his death. In 1533 Regiomontanus’s work De triangulis omnimodis was published. This contained work on planar and spherical trigonometry originally done much earlier in about 1464. The book is particularly strong on the sine and its inverse.

The term sine certainly was not accepted straight away as the standard notation by all authors. In times when mathematical notation was in itself a new idea many used their own notation. Edmund Gunter was the first to use the abbreviation sin in 1624 in a drawing. The first use of sin in a book was in 1634 by the French mathematician Hérigone while Cavalieri used Si and Oughtred S.

It is perhaps surprising that the second most important trigonometrical function during the period we have discussed was the versed sine, a function now hardly used at all. The versine is related to the sine by the formula

versin x = 1 – cos x.

It is just the sine turned (versed) through 90°.

The cosine follows a similar course of development in notation as the sine. Viète used the term sinus residuae for the cosine, Gunter (1620) suggested co-sinus. The notation Si.2 was used by Cavalieri, s co arc by Oughtred and S by Wallis.

Viète knew formulas for sin nx in terms of sin x and cos x. He gave explicitly the formulas (due to Pitiscus)

sin 3x = 3 cos²x sin x – sin³x
cos 3x = cos³x – 3 sin²x cos x.

The tangent and cotangent came via a different route from the chord approach of the sine. These developed together and were not at first associated with angles. They became important for calculating heights from the length of the shadow that the object cast. The length of shadows was also of importance in the sundial. Thales used the lengths of shadows to calculate the heights of pyramids.

The first known tables of shadows were produced by the Arabs around 860 and used two measures translated into Latin as umbra recta and umbra versa. Viète used the terms amsinus and prosinus. The name tangent was first used by Thomas Fincke in 1583. The term cotangens was first used by Edmund Gunter in 1620.

Abbreviations for the tan and cot followed a similar development to those of the sin and cos. Cavalieri used Ta and Ta.2, Oughtred used t arc and t co arc while Wallis used T and t. The common abbreviation used today, tan, first occurs in the work of Albert Girard in 1626, but tan was written over the angle

         tan
          A

cot was first used by Jonas Moore in 1674.

The secant and cosecant were not used by the early astronomers or surveyors. These came into their own when navigators around the 15th Century started to prepare tables. Copernicus knew of the secant which he called the hypotenusa. Viète knew the results

cosec x/sec x = cot x = 1/tan x
1/cosec x = cos x/cot x = sin x.

The abbreviations used by various authors were similar to the trigonometric functions already discussed. Cavalieri used Se and Se.2, Oughtred used se arc and sec co arc while Wallis used s and σ. Albert Girard used sec, written above the angle as he did for the tan.

The term ‘trigonometry’ first appears as the title of a book Trigonometria by B Pitiscus, published in 1595. Pitiscus also discovered the formulas for sin 2x, sin 3x, cos 2x, cos 3x.

The 18th Century saw trigonometric functions of a complex variable being studied. Johann Bernoulli found the relation between sin⁻¹z and log z in 1702 while Cotes, in a work published in 1722 after his death, showed that

ix = log(cos x + i sin x ).

De Moivre published his famous theorem

(cos x + i sin x )n = cos nx + i sin nx

in 1722 while Euler, in 1748, gave the formula (equivalent to that of Cotes)

exp(ix) = cos x + i sin x .
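
These identities are easy to verify numerically with modern complex arithmetic. A quick sketch (the angle x = 0.7 and exponent n = 5 are arbitrary choices):

```python
import cmath

x, n = 0.7, 5
# de Moivre: (cos x + i sin x)^n = cos nx + i sin nx
demoivre_gap = abs((cmath.cos(x) + 1j * cmath.sin(x))**n
                   - (cmath.cos(n * x) + 1j * cmath.sin(n * x)))
# Euler: exp(ix) = cos x + i sin x
euler_gap = abs(cmath.exp(1j * x) - (cmath.cos(x) + 1j * cmath.sin(x)))
print(demoivre_gap, euler_gap)  # both of order 1e-16
```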

The hyperbolic trigonometric functions were introduced by Lambert.

The function concept

If today we try to answer the difficult question “What is mathematics?” we often respond with an answer such as “It is the study of relations on sets” or “It is the study of functions on sets” or “It is the study of dependencies among variable quantities”. If these statements come anywhere close to the truth then it might be logical to suggest that the concept of a function must have arisen in the very earliest stages in the development of mathematics. Indeed if we look at Babylonian mathematics we find tables of squares of the natural numbers, cubes of the natural numbers, and reciprocals of the natural numbers. These tables certainly define functions from N to N. E T Bell wrote in 1945:-

It may not be too generous to credit the ancient Babylonians with the instinct for functionality; for a function has been successively defined as a table or a correspondence.

However this surely is the result of modern mathematicians seeing ancient mathematics through modern eyes. Although we can see that the Babylonians were dealing with functions, they would not have thought in these terms. We therefore have to reject the suggestion that the concept of a function was present in Babylonian mathematics even if we can see that they were studying particular functions.

If we move forward to Greek mathematics then we reach the work of Ptolemy. He computed chords of a circle which essentially means that he computed trigonometric functions. Surely, one might think, if he was computing trigonometric functions then Ptolemy must have understood the concept of a function. As O Petersen wrote in 1974 in [22]:-

But if we conceive a function, not as a formula, but as a more general relation associating the elements of one set of numbers with the elements of another set, it is obvious that functions in that sense abound throughout the Almagest.

Indeed Petersen is certainly correct to say that functions, in the modern sense, occur throughout the Almagest. Ptolemy dealt with functions, but it is very unlikely that he had any understanding of the concept of a function. As Thiele writes on the first page of [2]:-

From time to time, anachronistic comparisons like the one just given help us with the elucidation of documented facts, but not with the interpretation of their history.

Having suggested that the concept of a function is absent in these ancient pieces of mathematics, let us suggest, as Youschkevitch does in [32], that Oresme was getting closer in 1350 when he described the laws of nature as laws giving a dependence of one quantity on another. Youschkevitch writes [32]:-

The notion of a function first occurred in more general form in the 14th century in the schools of natural philosophy at Oxford and Paris.

Galileo was beginning to understand the concept even more clearly. His studies of motion contain the clear understanding of a relation between variables. Again another piece of his mathematics shows how he was beginning to grasp the concept of a mapping between sets. In 1638 he studied the problem of two concentric circles with centre O, the larger circle A with diameter twice that of the smaller one B. The familiar formula gives the circumference of A to be twice that of B. But taking any point P on the circle A, the line OP cuts circle B in exactly one point. So Galileo had constructed a function mapping each point of A to a point of B. Similarly if Q is a point on B then OQ produced cuts circle A in exactly one point. Again he has a function, this time from points of B to points of A. Although the circumference of A is twice the length of the circumference of B they have the same number of points. He also produced the standard one-to-one correspondence between the positive integers and their squares which (in modern terms) gave a bijection between N and a proper subset.

At almost the same time that Galileo was coming up with these ideas, Descartes was introducing algebra into geometry in La Géométrie. He says that a curve can be drawn by letting lines take successively an infinite number of different values. This again brings the concept of a function into the construction of a curve, for Descartes is thinking in terms of the magnitude of an algebraic expression taking an infinity of values as a magnitude from which the algebraic expression is composed takes an infinity of values.

Let us pause for a moment before reaching the first use of the word “function”. It is important to understand that the concept developed over time, changing its meaning as well as being defined more precisely as decades went by. We have already suggested that a table of values, although defining a function, need not be thought of by the creator of the table as a function. Early uses of the word “function” did encapsulate ideas of the modern concept but in a much more restrictive way.

Like so many mathematical terms, the word function was first used with its usual non-mathematical meaning. Leibniz wrote in August 1673 of:-

… other kinds of lines which, in a given figure, perform some function.

Johann Bernoulli, in a letter to Leibniz written on 2 September 1694, described a function as:-

… a quantity somehow formed from indeterminate and constant quantities.

In a paper in 1698 on isoperimetric problems Johann Bernoulli writes of “functions of ordinates” (see [32]). Leibniz wrote to Bernoulli saying:-

… I am pleased that you use the term function in my sense.

It was a concept whose introduction was particularly well timed as far as Johann Bernoulli was concerned for he was looking at problems in the calculus of variations where functions occur as solutions. See [28] for more information about how the author considers the calculus of variations to be the mathematical theory which developed most intimately in connection with the concept of a function.

One can say that in 1748 the concept of a function leapt to prominence in mathematics. This was due to Euler who published Introductio in analysin infinitorum in that year in which he makes the function concept central to his presentation of analysis. Euler defined a function in the book as follows:-

A function of a variable quantity is an analytic expression composed in any way whatsoever of the variable quantity and numbers or constant quantities.

This is all very well but Euler gives no definition of “analytic expression”; rather he assumes that the reader will understand it to mean expressions formed from the usual operations of addition, multiplication, powers, roots, etc. He divides his functions into different types such as algebraic and transcendental. The type depends on the nature of the analytic expression; for example, transcendental functions are those which are not algebraic, such as:-

… exponentials, logarithms, and others which integral calculus supplies in abundance.

Euler allowed the algebraic operations in his analytic expressions to be used an infinite number of times, resulting in infinite series, infinite products, and infinite continued fractions. He later suggests that a transcendental function should be studied by expanding it in a power series. He does not claim that all transcendental functions can be expanded in this way but says that one should prove it in each specific case. However there was a difficulty in Euler’s work which was to lead to confusion, for he failed to distinguish between a function and its representation. However Introductio in analysin infinitorum was to change the way that mathematicians thought about familiar concepts. Jahnke writes [2]:-

Until Euler the trigonometric quantities sine, cosine, tangent etc., were regarded as lines connected with the circle rather than functions. … It was Euler who introduced the functional point of view.

The function concept had led Euler to make many important discoveries before he wrote Introductio in analysin infinitorum. For example it had led him to define the gamma function and to solve the problem which had defeated mathematicians for some considerable time, namely summing the series

1/1² + 1/2² + 1/3² + 1/4² + …

He showed that the sum was π²/6, publishing the result in 1740.
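
A modern numerical check of Euler’s result (the number of terms is arbitrary; the error after N terms is roughly 1/N):

```python
from math import pi

N = 100_000
print(sum(1 / k**2 for k in range(1, N + 1)), pi**2 / 6)
# 1.6449240668... vs 1.6449340668...
```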

Let us return to the contents of Introductio in analysin infinitorum. In it Euler introduced continuous, discontinuous and mixed functions but since the first two of these concepts have different modern meanings we will choose to call Euler’s versions E-continuous and E-discontinuous to avoid confusion. An E-continuous function was one which was expressed by a single analytic expression, a mixed function was expressed in terms of two or more analytic expressions, and an E-discontinuous function included mixed functions but was a more general concept. Euler did not clearly indicate what he meant by an E-discontinuous function although it was clear that Euler thought of them as more general than mixed functions. He later defined them as those functions which had arbitrarily hand-drawn curves as their graphs (rather confusingly, essentially what we call a continuous function today).

In 1746 d’Alembert published a solution to the problem of a vibrating stretched string. The solution, of course, depended on the initial form of the string and d’Alembert insisted in his solution that the function which described the initial velocities of each point of the string had to be E-continuous, that is, expressed by a single analytic expression. Euler published a paper in 1749 which objected to this restriction imposed by d’Alembert, claiming that for physical reasons more general expressions for the initial form of the string had to be allowed. Youschkevitch writes [32]:-

d’Alembert did not agree with Euler. Thus began the long controversy about the nature of functions to be allowed in the initial conditions and in the integrals of partial differential equations, which continued to appear in an ever increasing number in the theory of elasticity, hydrodynamics, aerodynamics, and differential geometry.

In 1755 Euler published another highly influential book, namely Institutiones calculi differentialis. In this book he defined a function in an entirely general way, giving what we might reasonably say was a truly modern definition of a function:-

If some quantities so depend on other quantities that if the latter are changed the former undergoes change, then the former quantities are called functions of the latter. This definition applies rather widely and includes all ways in which one quantity could be determined by other. If, therefore, x denotes a variable quantity, then all quantities which depend upon x in any way, or are determined by it, are called functions of x.

This might have been a huge breakthrough but after giving this wide definition, Euler then devoted the book to the development of the differential calculus using only analytic functions. The first problems with Euler’s definition of types of functions were pointed out in 1780 when it was shown that a mixed function, given by different formulas, could sometimes be given by a single formula. The clearest example of such a function was given by Cauchy in 1844 when he noted that the function

y = x for x ≥ 0, y = –x for x < 0

can be expressed by the single formula y = √(x²). Hence dividing functions into E-continuous or mixed was meaningless. However, a more serious objection came through the work of Fourier who stated in 1805 that Euler was wrong. Fourier showed that some discontinuous functions could be represented by what today we call a Fourier series. The distinction between E-continuous and E-discontinuous functions, therefore, did not exist. Fourier’s work was not immediately accepted and leading mathematicians such as Lagrange did not accept his results at this stage. Luzin points out in [17] and [18] that confusion regarding functions had been due to a lack of understanding of the distinction between a “function” and its “representation”, for example as a series of sines and cosines. Fourier’s work would lead eventually to the clarification of the function concept when in 1829 Dirichlet proved results concerning the convergence of Fourier series, thus clarifying the distinction between a function and its representation.

Other mathematicians gave their own versions of the definition of a function. Condorcet seems to have been the first to take up Euler’s general definition of 1755, see [31] for details. In 1778 the first two parts of Condorcet’s intended five-part work Traité du calcul integral were sent to the Paris Academy. It was never published but was seen by many leading French mathematicians. In this work Condorcet distinguished three types of functions: explicit functions, implicit functions given only by unsolved equations, and functions which are defined from physical considerations such as being the solution to a differential equation.

Lacroix, who had read Condorcet‘s unfinished work, wrote in 1797:-

Every quantity whose value depends on one or more other quantities is called a function of these latter, whether one knows or is ignorant of what operation it is necessary to use to arrive from the latter to the first.

Cauchy, in 1821, came up with a definition making the dependence between variables central to the function concept. He wrote in Cours d’analyse:-

If variable quantities are so joined between themselves that, the value of one of these being given, one can conclude the values of all the others, one ordinarily conceives these diverse quantities expressed by means of the one of them, which then takes the name independent variable; and the other quantities expressed by means of the independent variable are those which one calls functions of this variable.

Notice that despite the generality of Cauchy‘s definition, which is designed to cover the case of explicit and implicit functions, he is still thinking of a function in terms of a formula. In fact he makes the distinction between explicit and implicit functions immediately after giving this definition. He also introduces concepts which indicate that he is still thinking in terms of analytic expressions.

Fourier, in Théorie analytique de la Chaleur in 1822, gave the following definition:-

In general, the function f(x) represents a succession of values or ordinates each of which is arbitrary. An infinity of values being given of the abscissa x, there are an equal number of ordinates f(x). All have actual numerical values, either positive or negative or nul. We do not suppose these ordinates to be subject to a common law; they succeed each other in any manner whatever, and each of them is given as it were a single quantity.

It is clear that Fourier has given a definition which deliberately moves away from analytic expressions. However, despite this, when he begins to prove theorems about expressing an arbitrary function as a Fourier series, he uses the fact that his arbitrary function is continuous in the modern sense!

Dirichlet, in 1837, accepted Fourier‘s definition of a function and immediately after giving this definition he defined a continuous function (using continuous in the modern sense). Dirichlet also gave an example of a function defined on the interval [ 0, 1] which is discontinuous at every point, namely f(x) which is defined to be 0 if x is rational and 1 if x is irrational.

In 1838 Lobachevsky gave a definition of a general function which still required it to be continuous:-

A function of x is a number which is given for each x and which changes gradually together with x. The value of the function could be given either by an analytic expression or by a condition which offers a means for testing all numbers and selecting one from them, or lastly the dependence may exist but remain unknown.

Certainly Dirichlet‘s everywhere discontinuous function will not be a function under Lobachevsky‘s definition. Hankel, in 1870, deplored the confusion which still reigned in the function concept:-

One person defines functions essentially in Euler‘s sense, the other requires that y must change with x according to a law, without giving an explanation of this obscure concept, the third defines it in Dirichlet‘s manner, the fourth does not define it at all. However, everybody deduces from his concept conclusions that are not contained in it.

Around this time many pathological functions were constructed. Cauchy gave an early example when he noted that f(x) = exp(-1/x²) for x ≠ 0, f(0) = 0, is a continuous function which has all its derivatives at 0 equal to 0. It therefore has a Taylor series which converges everywhere but only equals the function at 0. In 1876 Paul du Bois-Reymond made the distinction between a function and its representation even clearer when he constructed a continuous function whose Fourier series diverges at a point. This line was taken further in 1885 when Weierstrass showed that any continuous function on a closed bounded interval is the limit of a uniformly convergent sequence of polynomials. Earlier, in 1872, Weierstrass had sent a paper to the Berlin Academy of Science giving an example of a continuous function which is nowhere differentiable. Lützen writes in [2]:-

Weierstrass’s function contradicted an intuitive feeling held by most of his contemporaries to the effect that continuous functions were differentiable except at “special points”. It created a sensation and, according to Hankel, disbelief when du Bois-Reymond published it in 1875.
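
A sketch of Weierstrass’s example can be computed from its partial sums. The series W(x) = Σ aⁿcos(bⁿπx), with 0 < a < 1, b an odd integer and ab > 1 + 3π/2, is the form given in his 1872 paper; the parameter values below are our choice, and floating point limits how many terms are meaningful:

```python
from math import cos, pi

def weierstrass(x, a=0.5, b=13, terms=20):
    """Partial sum of W(x) = sum over n of a^n * cos(b^n * pi * x).

    a = 0.5, b = 13 satisfy ab > 1 + 3*pi/2. For large n the argument
    b^n * pi * x exceeds floating-point resolution, so this is only a sketch.
    """
    return sum(a**n * cos(b**n * pi * x) for n in range(terms))

print(weierstrass(0.0))  # geometric sum 2 - 0.5**19 ≈ 1.9999981
```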

Poincaré was unhappy with the direction that the definition of functions had taken. He wrote in 1899:-

For half a century we have seen a mass of bizarre functions which appear to be forced to resemble as little as possible honest functions which serve some purpose. … Formerly, when a new function was invented, it was in view of some practical end. Today they are invented on purpose to show that our ancestor’s reasoning was at fault, and we shall never get anything more than that out of them. If logic were the teacher’s only guide, he would have to begin with the most general, that is to say, the most weird functions.

Where have more modern definitions taken the concept? Goursat, in 1923, gave the definition which will appear in most textbooks today:-

One says that y is a function of x if to a value of x corresponds a value of y. One indicates this correspondence by the equation y = f(x).

Just in case this is not precise enough and involves undefined concepts such as ‘value’ and ‘corresponds’, look at the definition given by Patrick Suppes in 1960:-

Definition. A is a relation ⇔ (∀x)(x ∈ A ⇒ (∃y)(∃z)(x = (y, z))). We write y A z if (y, z) ∈ A.
Definition. f is a function ⇔ f is a relation and (∀x)(∀y)(∀z)(x f y and x f z ⇒ y = z).

What would Poincaré have thought of Suppes’ definition?
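
Suppes’ definition translates directly into code: a relation is a set of ordered pairs, and a function is a relation which never pairs one first element with two different second elements. A minimal sketch (ours):

```python
def is_function(relation):
    """relation: a set of (y, z) pairs; True if it is single-valued."""
    seen = {}
    for y, z in relation:
        if y in seen and seen[y] != z:
            return False  # y f z and y f z' with z != z'
        seen[y] = z
    return True

print(is_function({(1, 1), (2, 4), (3, 9)}))  # True
print(is_function({(1, 1), (1, 2)}))          # False
```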

The real numbers: Pythagoras to Stevin

Before we begin to discuss the historical development of the real number system it is useful to consider what a number is. Perhaps the reader might think that this is a silly question and that it is “obvious” what a number is. Well the first clear evidence that this is not so is the fact that the concept of number has changed greatly throughout the development of mathematics up to the present day. What is equally clear is that there is no reason, other than conceit, to believe that the present concept of number will not change in the future. Wittgenstein, in Philosophical Investigations, writes:-

Why do we call something a ‘number’? Well, perhaps because it has a direct relationship with several things that have hitherto been called number; and this can be said to give it an indirect relationship to other things we call the same name. And we extend our concept of number as in spinning a thread we twist fibre on fibre. And the strength of the thread does not reside in the fact that some one fibre runs through its whole length, but in the overlapping of many fibres. But if someone wished to say: “There is something common to all these constructions – namely the disjunction of all their common properties” – I should reply: “Now you are only playing with words. One might as well say: ‘Something runs through the whole thread – namely the continuous overlapping of those fibres.’”

“All right: the concept of number is defined for you as the logical sum of these individual interrelated concepts: cardinal numbers, rational numbers, real numbers etc.; and, in the same way the concept of a game is the logical sum of a corresponding set of sub-concepts.” – It need not be so. For I can give the concept ‘number’ rigid limits in this way, that is, use the word “number” for a rigidly limited concept, but I can also use it so that the extension of the concept is not closed by a frontier. And this is how we use the word “game”. For how is the concept of a game bounded?

We should begin a discussion of real numbers by looking at the concepts of magnitude and number in ancient Greek times. The first of these might refer to the length of a geometrical line while the second concept, namely number, was thought of as composed of units. Pythagoras seems to have thought that “All is number”; so what was a number to Pythagoras? It seems clear that Pythagoras would have thought of 1, 2, 3, 4, … (the natural numbers in the terminology of today) in a geometrical way, not as lengths of a line as we do, but rather in the form of discrete points. Addition, subtraction and multiplication of integers are natural concepts with this type of representation but there seems to have been no notion of division. A mathematician of this period, given the number 12, could easily see that 4 is a submultiple of it since 3 times 4 is exactly 12. Although to us this is clearly the same as division, it is important to see the distinction. We have used the word “submultiple” above, so should indicate what the Pythagoreans considered this to be. Nicomachus, following the tradition of Pythagoras, makes the following definition of a submultiple:-

The submultiple, which is by its nature the smaller, is the number which when compared with the greater can measure it more times than one so as to fill it out exactly.

Magnitudes, being distinct entities from numbers, had to have a separate definition and indeed Nicomachus makes such a parallel definition for magnitudes.

The idea of Pythagoras that “all is number” is explained by Aristotle in Metaphysics:-

[In the time of Pythagoras] since all other things seemed in their whole nature to be modelled on numbers, and numbers seemed to be the first things in the whole of nature, they supposed the elements of numbers to be the elements of all things, and the whole heaven to be a musical scale and a number. And all the properties of numbers and scales which they could show to agree with the attributes and parts and the whole arrangement of the heavens, they fitted into their scheme … the Pythagoreans say that things are what they are by intimating numbers … the Pythagoreans take objects themselves to be numbers and do not treat mathematical objects as distinct from them …

This concept certainly ran into difficulties once various magnitudes were studied. All numbers, essentially by definition, were, as we have seen, (positive integer) multiples of a base unit but ratios of lengths were shown not to have the property of being ratios of numbers (integers). The usual example given of this comes from a right angled triangle whose shorter sides are both of unit length. Such a triangle has as hypotenuse a line of length √2 times the lengths of the shorter sides. There is no length x such that 1 and √2 are both multiples (remember integer multiples) of x. Plato, in Theaetetus, tells of the discovery that √3, √5, … , √17 were not commensurable with 1:-

Theodorus was writing out for us something about roots, such as the sides of squares whose area was 3 or 5 units, showing that the sides are incommensurable with the unit: he took the examples up to 17, but there for some reason he stopped.

We suppose that the discovery that √2 was not commensurable with 1 came earlier, which is why Theodorus started with √3. Heimonen, in [10], looks at the views of different historians concerning the discovery of the irrational numbers:-

Von Fritz has proposed that the Pythagorean Hippasos first proved the irrationality of the golden ratio by studying the regular pentagon. The proof is based on the fact that the continued fraction expansion of the ratio of its diagonal and side is periodic.

The same idea of irrationality proof was expressed by Zeuthen and van der Waerden for the ratio of the diagonal and side of the square also, as well as for the square roots of 3, 5, … , 17, which according to Plato were proved to be irrational by Theodoros.

Knorr set out a new theory, trying especially to explain better why Theodoros stopped just at the square root of 17. His theory is a kind of geometrical version of the irrationality proof of the square root of 2 known from school.

Fowler accepted the main ideas of Knorr, but also returned to the continued fractions, maintaining even that also the common fractions were handled as continued fractions in Plato‘s time.

Before continuing to describe advances in ideas concerning numbers, it should be mentioned at this stage that the Egyptians and the Babylonians had a different notion of number to that of the ancient Greeks. The Babylonians looked at reciprocals and also at approximations to irrational numbers, such as √2, long before Greek mathematicians considered approximations. The Egyptians also looked at approximating irrational numbers.

Let us now look at the position as it occurs in Euclid’s Elements. This is an important stage since it would remain the state of play for nearly the next 2000 years. In Book V Euclid considers magnitudes and the theory of proportion of magnitudes. It is probable (and claimed in a later version of The Elements) that this was the work of Eudoxus. Usually when Euclid wants to illustrate a theorem about magnitudes he gives a diagram representing the magnitude by a line segment. However magnitude is an abstract concept to Euclid and applies to lines, surfaces and solids. More generally, Euclid also knows that his theory applies to time and angles.

Given that Euclid is famous for an axiomatic approach to mathematics, one might expect him to begin with a definition of magnitude and state some unproved axioms. However he leaves the concept of magnitude undefined and his first two definitions refer to the part of a magnitude and a multiple of a magnitude:

Definition V.1    A magnitude is a part of a magnitude, the less of the greater, when it measures the greater.

Again the term “measures” here is undefined but clearly Euclid means that (in modern symbols) the smaller magnitude x is a part of the greater magnitude y if nx = y for some natural number n > 1.

Definition V.2    The greater is a multiple of the less when it is measured by the less.

Then comes the definition of ratio.

Definition V.3    A ratio is a sort of relation in respect of size between two magnitudes of the same kind.

This is an exceptionally vague definition of ratio which basically fails to define it at all. He then defines when magnitudes have a ratio, which according to the definition is when there is a multiple (by a natural number) of the first which exceeds the second and a multiple of the second which exceeds the first. Then comes the vital definition of when two magnitudes are in the same ratio as a second pair of magnitudes. As it is quite hard to understand in Euclid’s language, let us translate it into modern notation. It says that a : b = c : d if given any natural numbers n and m we have

na > mb if and only if nc > md
na = mb if and only if nc = md
na < mb if and only if nc < md.
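
This criterion can be probed directly, though of course only over finitely many n and m. A minimal sketch (ours; the cut-off is an arbitrary choice):

```python
# a : b = c : d when, for all natural n, m, na > mb iff nc > md and
# na = mb iff nc = md (the < case then follows).
def same_ratio(a, b, c, d, bound=50):
    return all((n * a > m * b) == (n * c > m * d) and
               (n * a == m * b) == (n * c == m * d)
               for n in range(1, bound + 1) for m in range(1, bound + 1))

print(same_ratio(2, 3, 4, 6))  # True:  2 : 3 = 4 : 6
print(same_ratio(2, 3, 3, 4))  # False: 2 : 3 differs from 3 : 4
```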

Euclid then goes on to prove theorems which look to a modern mathematician as if magnitudes are vectors, integers are scalars, and he is proving the vector space axioms. For example for magnitudes a and b and natural numbers n and m he proves:-

Proposition V.1    n(a + b) = na + nb
Proposition V.2    (n + m)a = na + ma
Proposition V.3    n(ma) = (nm)a

In Book VII Euclid studies numbers. He makes a series of definitions. First he defines a unit, then a number is defined as being composed of a multitude of units, and parts and multiples are defined as for magnitudes. We should note that Euclid, as earlier Greek mathematicians, did not consider 1 as a number. It was a unit and the numbers 2, 3, 4, … were composed of units. Various properties of numbers are assumed but are not listed as axioms. For example the commutative law for multiplication is assumed without ever being stated as an axiom, as is the associative law for addition. He then introduces proportion for numbers and shows essentially that for numbers a, b, c, d we have a : b = c : d precisely when the least numbers with ratio a : b are equal to the least numbers with ratio c : d. This is logically equivalent to saying in modern terms that the rational a/b and the rational c/d are equal if they become the same when reduced to their lowest terms. An important result in Book VII is the Euclidean algorithm. We should note that Euclid never identified the ratio 2 : 1 with the number 2. These were two quite different concepts.

Book X considers commensurable and incommensurable magnitudes. It is a long book, over one quarter of the whole of The Elements. We have:

Definition X.1   Those magnitudes are said to be commensurable which are measured by the same measure, and those incommensurable which cannot have any common measure.

Euclid then proves results such as:-

Proposition X.2   If, when two unequal magnitudes are set out and the lesser is always subtracted in turn from the greater, the remainder never measures the magnitude before it, then the magnitudes will be incommensurable.

Proposition X.5   Commensurable magnitudes have to one another the ratio which a number has to a number.

Notice that Proposition X.2 says that two magnitudes are incommensurable if the Euclidean algorithm does not terminate. Euclid goes on to prove, among many other results, those of Theodorus, namely that segments of length √3, √5, … , √17 are incommensurable with a segment of unit length.
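
The connection can be illustrated by running the subtraction process (anthyphairesis) on the magnitudes 1 and √2; this is a modern sketch, and floating point limits us to a few steps:

```python
from math import sqrt

# Repeatedly subtract the smaller magnitude from the larger. For sqrt(2) and 1
# the quotients settle into the repeating pattern 1, 2, 2, 2, ..., so the
# process never terminates: by Proposition X.2 the magnitudes are incommensurable.
a, b = sqrt(2), 1.0
quotients = []
for _ in range(8):
    q = int(a // b)       # how many times b fits into a
    quotients.append(q)
    a, b = b, a - q * b   # continue with the remainder
print(quotients)          # [1, 2, 2, 2, 2, 2, 2, 2]
```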

So where does Euclid’s Elements leave us with respect to numbers? Basically numbers were 1, 2, 3, … and ratios of numbers were used which (although not considered to be numbers) basically allowed manipulation with what we call rationals. Also magnitudes were considered and these were essentially lengths constructible by ruler and compass from a line of unit length. No other magnitudes were considered. Hence mathematicians studied magnitudes which had lengths which, in modern terms, could be formed from positive integers by addition, subtraction, multiplication, division and taking square roots.

The Arabic mathematicians went further with constructible magnitudes for they used geometric methods to solve cubic equations which meant that they could construct magnitudes whose ratio to a unit length involved cube roots. For example Omar Khayyam showed how to solve all cubic equations by geometric methods. Fibonacci, using skills learnt from the Arabs, solved a cubic equation showing that its root was not formed from rationals and square roots of rationals as Euclid’s magnitudes were. He then went on to compute an approximate solution. Although no conceptual advances were taking place, by the end of the fifteenth century mathematicians were considering expressions built from positive integers by addition, subtraction, multiplication, division and taking nth roots. These are called radical expressions.

By the sixteenth century rational numbers and roots of numbers were becoming accepted as numbers although there was still a sharp distinction between these different types of numbers. Stifel, in his Arithmetica Integra (1544), argues that irrationals must be considered valid:-

It is rightly disputed whether irrational numbers are true numbers or false. Because in studying geometrical figures, where rational numbers desert us, irrationals take their place, and show precisely what rational numbers are unable to show … we are moved and compelled to admit that they are correct …

However, he goes on to argue that, as they are not proportional to rational numbers, they cannot be true numbers even if they are correct. He ends up arguing that all irrational numbers result from radical expressions. Well the obvious question the reader might feel they want to ask Stifel is: what about the length of the circumference of a circle with radius of unit length? In fact Stifel gives an answer to this in an appendix to the book. First he makes a distinction between physical circles and mathematical circles. One can measure the properties of physical circles, he claims, but one cannot measure a mathematical circle with physical instruments. He then goes on to consider the circle as the limit of a sequence of polygons of more and more sides. He writes:-

Therefore the mathematical circle is rightly described as the polygon of infinitely many sides. And thus the circumference of the mathematical circle receives no number, neither rational nor irrational.

Not too good an argument, but nevertheless a remarkable insight that there were lengths which did not correspond to radical expressions but which could be approximated as closely as one wished.

A major advance was made by Stevin in 1585 in De Thiende when he introduced decimal fractions. One has to understand here that in fact it was in a sense fortuitous that his invention led to a much deeper understanding of numbers for he certainly did not introduce the notation with that in mind. Only finite decimals were allowed, so with his notation only certain rationals could be represented exactly. Other rationals could be represented approximately and Stevin saw the system as a means to calculate with approximate rational values. His notation was to be taken up by Clavius and Napier but others resisted using it since they saw it as a backwards step to adopt a system which could not even represent 1/3 exactly.
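
In modern terms, Stevin’s finite decimals capture exactly those rationals whose lowest-terms denominator has no prime factor other than 2 and 5. A small sketch (ours) of the test:

```python
from fractions import Fraction

def has_finite_decimal(q):
    """True if the rational q has a terminating decimal expansion."""
    d = q.denominator  # Fraction keeps the denominator in lowest terms
    for p in (2, 5):
        while d % p == 0:
            d //= p
    return d == 1

print(has_finite_decimal(Fraction(7, 40)))  # True:  7/40 = 0.175
print(has_finite_decimal(Fraction(1, 3)))   # False: 0.333... never terminates
```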

Stevin made a number of other important advances in the study of the real numbers. He argued strongly in L’Arithmetique (1585) that all numbers such as square roots, irrational numbers, surds and negative numbers should be treated as numbers and not distinguished as being different in nature. He wrote:-

Thesis 1:   That unity is a number.
Thesis 2:   That any given numbers can be squares, cubes, fourth powers etc.
Thesis 3:   That any given root is a number.
Thesis 4:   That there are no absurd, irrational, irregular, inexplicable or surd numbers.

It is a very common thing amongst authors of arithmetics to treat numbers like √8 and similar ones, which they call absurd, irrational, irregular, inexplicable or surds etc and which we deny to be the case for number which turns up.

His first thesis was to argue against the Greek idea that 1 is not a number but a unit and the numbers 2, 3, 4, … were composed of units. The other three theses were encouraging people to treat different types of numbers, which were at that time treated separately, as a single entity – namely a number.

One further comment by Stevin in L’Arithmetique is worth recording. He noted that, as we stated above, Euclid‘s Proposition X.2 says that two magnitudes are incommensurable if the Euclidean algorithm does not terminate. Stevin writes about this pointing out what today we would say was the difference between an algorithm and a procedure (or semi-algorithm):-

Although this theorem is valid, nevertheless we cannot recognise by such experience the incommensurability of two given magnitudes. … even though it were possible for us to subtract by due process several hundred thousand times the smaller magnitude from the larger and continue that for several thousands of years, nevertheless if the two given numbers were incommensurable one would labour eternally, always ignorant of what could still happen in the end. This manner of cognition is therefore not legitimate, but rather an impossible position …

Further progress in the development of the real numbers only became possible after ideas of convergence were put on a firm basis. However, there was a strong influence in the other direction too, since progress in rigorous analysis required a deeper understanding of the real numbers. This is studied further in the article The real numbers: Stevin to Hilbert.

The real numbers: Stevin to Hilbert

By the time Stevin proposed the use of decimal fractions in 1585, the concept of a number had developed little from that of Euclid’s Elements. The earlier contributions are examined in some detail in our article The real numbers: Pythagoras to Stevin.

If we move forward almost exactly 100 years to the publication of A treatise of Algebra by Wallis in 1684 we find that he accepts, without any great enthusiasm, the use of Stevin’s decimals. He still only considers finite decimal expansions and realises that with these one can approximate numbers (which for him are constructed from positive integers by addition, subtraction, multiplication, division and taking nth roots) as closely as one wishes. However, Wallis understood that there were proportions which did not fall within this definition of number, such as those associated with the area and circumference of a circle:-

… such proportion is not to be expressed in the commonly received ways of notation: particularly that for the circles quadrature. … Now, as for other incommensurable quantities, though this proportion cannot be accurately expressed in absolute numbers, yet by continued approximation it may; so as to approach nearer to it than any difference assignable.

For Wallis there were a variety of ways that one might achieve this approximation, coming as close as one pleased. He considered approximations by continued fractions, and also approximations by taking successive square roots. This leads into the study of infinite series but, without the necessary machinery to prove that these infinite series converged to a limit, he was never going to be able to progress much further in studying real numbers. Real numbers became very much associated with magnitudes. No definition was really thought necessary, and in fact mathematics was considered to be the science of magnitudes. Euler, in Complete introduction to algebra (1771), wrote in the introduction:-

Mathematics, in general, is the science of quantity; or, the science which investigates the means of measuring quantity.

He also defined the notion of quantity as that which can be continuously increased or diminished, and thought of length, area, volume, mass, velocity, time, etc as different examples of quantity. All could be measured by real numbers. However, Euler’s mathematics itself led to a more abstract idea of quantity, a variable x which need not necessarily take real values. Symbolic mathematics took the notion of quantity too far, and a reassessment of the concept of a real number became necessary. By the beginning of the nineteenth century a more rigorous approach to mathematics, principally through the work of Cauchy and Bolzano, began to provide the machinery to put the real numbers on a firmer footing. Grabiner writes [2]:-

… though Cauchy implicitly assumed several forms of the completeness axiom for the real numbers, he did not fully understand the nature of completeness or the related topological properties of sets of real numbers or of points in space. … Cauchy did not have explicit formulations for the completeness of the real numbers. Among the forms of the completeness property he implicitly assumed are that a bounded monotone sequence converges to a limit and that the Cauchy criterion is a sufficient condition for the convergence of a series. Though Cauchy understood that a real number could be obtained as the limit of rationals, he did not develop his insight into a definition of real numbers or a detailed description of the properties of real numbers.

Cauchy, in Cours d’analyse (1821), did not worry too much about the definition of the real numbers. He does say that a real number is the limit of a sequence of rational numbers but he is assuming here that the real numbers are known. Certainly this is not considered by Cauchy to be a definition of a real number, rather it is simply a statement of what he considers an “obvious” property. He says nothing about the need for the sequence to be what we call today a Cauchy sequence and this is necessary if one is to define convergence of a sequence without assuming the existence of its limit. He does define the product of a rational number A and an irrational number B as follows:-

Let b, b’, b”, … be a sequence of rationals approaching B closer and closer. Then the product AB will be the limit of the sequence of rational numbers Ab, Ab’, Ab”, …

Bolzano, on the other hand, showed in 1817 that a bounded sequence of real numbers has a least upper bound. He later worked out his own theory of real numbers, which he did not publish. This was a quite remarkable achievement and it is only comparatively recently that we have understood exactly what he did achieve. His definition of a real number was made in terms of convergent sequences of rational numbers and is explained in [22], where Rychlik describes it as “not quite correct”. In [28] van Rootselaar disagrees, saying that “Bolzano’s elaboration is quite incorrect”. However, in J Berg’s edition of Bolzano’s Reine Zahlenlehre, published in 1976, Berg points out that Bolzano had discovered the difficulties himself, and Berg found notes by Bolzano which proposed amendments to his theory that make it completely correct. As Bolzano’s contributions were unpublished they had little influence in the development of the theory of the real numbers.

Cauchy himself does not seem to have understood the significance of his own “Cauchy sequence” criterion for defining the real numbers. Nor did his immediate successors. It was Weierstrass, Heine, Méray, Cantor and Dedekind who, after convergence and uniform convergence were better understood, were able to give rigorous definitions of the real numbers.

Up to this time there was no proof that numbers existed that were not the roots of polynomial equations with rational coefficients. Clearly √2 is the root of a polynomial equation with rational coefficients, namely x² = 2, and it is easy to see that all roots of rational numbers arise as solutions of such equations. A number is called transcendental if it is not the root of a polynomial equation with rational coefficients; the word transcendental is used because such numbers transcend the usual operations of arithmetic. Although mathematicians had guessed for a long time that π and e were transcendental, this had not been proved up to the middle of the 19th century. Liouville’s interest in transcendental numbers stemmed from reading a correspondence between Goldbach and Daniel Bernoulli. Liouville certainly aimed to prove that e is transcendental but he did not succeed. However, his contributions led him to prove the existence of a transcendental number in 1844 when he constructed an infinite class of such numbers using continued fractions. These were the first numbers to be proved transcendental. In 1851 he published results on transcendental numbers removing the dependence on continued fractions. In particular he gave an example of a transcendental number, the number now named the Liouvillian number

0.1100010000000000000000010000…

where there is a 1 in place n! and 0 elsewhere.
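The rule defining these digits is simple enough to mechanise. Here is a minimal sketch in Python (the code and its names are ours, purely for illustration) which prints the initial digits:

```python
from math import factorial

# Print the first num_digits decimal digits of the Liouvillian number:
# a 1 in every place n! (n = 1, 2, 3, ...) and a 0 everywhere else.
def liouville_digits(num_digits):
    ones = set()
    n = 1
    while factorial(n) <= num_digits:
        ones.add(factorial(n))
        n += 1
    return "".join("1" if k in ones else "0" for k in range(1, num_digits + 1))

print("0." + liouville_digits(30))   # 0.110001000000000000000001000000
```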

One of the first people to attempt to give a rigorous definition of the real numbers was Hamilton. Perhaps, if one thinks about it, it is logical that he would be interested in this, since his introduction of the quaternions had shown that there were new, previously unstudied number systems. In fact he came close to the idea of a Dedekind cut, as Euclid had done in the Elements, but failed to make the idea into a definition (again Euclid had spotted the property but never thought to use it as a definition). For a number a he noted that there are rationals a′, a″, b′, b″, c′, c″, d′, d″, … with

a′ < a < a″
b′ < a < b″
c′ < a < c″
d′ < a < d″

but he never thought to define a number by the sets {a′, b′, c′, d′, … } and {a″, b″, c″, d″, … }. He tried another approach, defining numbers given by some law, say x ↦ x². Hamilton writes:-

If x undergoes a continuous and constant increase from zero, then it will pass successively through every state of positive ratio b, and therefore every determined positive ratio b has one determined square root √b which will be commensurable or incommensurable according as b can or cannot be expressed as the square of a fraction. When b cannot be so expressed, it is still possible to approximate in fractions to the incommensurable square root √b by choosing successively larger and larger positive denominators …

One can see what Hamilton is getting at, but much here is without justification: can a quantity undergo a continuous and constant increase? Even if one got round this problem, he is only defining those numbers given by a law. It is unclear whether he thought that all real numbers would arise in this way.

When progress came in giving a rigorous definition of a real number, there was a sudden flood of contributions. Dedekind worked out his theory of Dedekind cuts in 1858 but it remained unpublished until 1872. Weierstrass gave his own theory of real numbers in his Berlin lectures beginning in 1865 but this work was not published. The first published contribution regarding this new approach came in 1867 from Hankel, who was a student of Weierstrass. Hankel, for the first time, suggested a total change in our point of view regarding the concept of a real number:-

Today number is no longer an object, a substance which exists outside the thinking subject and the objects giving rise to this substance, an independent principle, as it was for instance for the Pythagoreans. Therefore, the question of the existence of numbers can only refer to the thinking subject or to those objects of thought whose relations are represented by numbers. Strictly speaking, only that which is logically impossible (i.e. which contradicts itself) counts as impossible for the mathematician.

In his 1867 monograph Hankel addressed the question of whether there were other “number systems” which had essentially the same rules as the real numbers.

Two years after the publication of Hankel’s monograph, Méray published Remarques sur la nature des quantités in which he considered Cauchy sequences of rational numbers which, if they did not converge to a rational limit, had what he called a “fictitious limit”. He then considered the real numbers to consist of the rational numbers together with his fictitious limits. Three years later Heine published a similar notion in his book Elemente der Functionenlehre, although it was done independently of Méray. It was similar in nature to the ideas which Weierstrass had discussed in his lectures. Heine’s approach has become one of the two standard ways of defining the real numbers today. Essentially Heine looks at Cauchy sequences of rational numbers. He defines an equivalence relation on such sequences by defining

a1, a2, a3, a4, … and b1, b2, b3, b4, …

to be equivalent if the sequence of rational numbers a1 – b1, a2 – b2, a3 – b3, a4 – b4, … converges to 0. Heine then introduced arithmetic operations on his sequences and an order relation. Particular care is needed to handle division since sequences with a non-zero limit might still have terms equal to 0.
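To see Heine’s equivalence in action, here is a minimal Python sketch (the functions and the choice of √2 are our illustration, not Heine’s): two quite different Cauchy sequences of rationals, one from Newton’s iteration and one from the continued fraction convergents of √2, whose term-by-term differences converge to 0, so under Heine’s relation they define the same real number.

```python
from fractions import Fraction

def newton_seq(n):
    """Newton's iteration for √2 starting from 1: rationals 3/2, 17/12, ..."""
    x = Fraction(1)
    for _ in range(n):
        x = (x + 2 / x) / 2
        yield x

def convergent_seq(n):
    """Continued fraction convergents of √2 = [1; 2, 2, 2, ...]: 3/2, 7/5, 17/12, ..."""
    p, q, p0, q0 = 1, 1, 1, 0
    for _ in range(n):
        p, q, p0, q0 = 2 * p + p0, 2 * q + q0, p, q
        yield Fraction(p, q)

# The term-by-term differences tend to 0, so the two sequences are
# equivalent in Heine's sense and define the same real number, √2.
for a, b in zip(newton_seq(5), convergent_seq(5)):
    print(float(a - b))
```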

Cantor also published his version of the real numbers in 1872, which followed a similar method to that of Heine. His numbers were Cauchy sequences of rational numbers and he used the term “determinate limit”. It was clear to Hankel (see the quote above) that the new ideas of number had suddenly totally changed a concept which had been motivated by measurement and quantity. Similarly Cantor realised that if he wanted the line to represent the real numbers then he had to introduce an axiom to recover the connection between the way the real numbers were now being defined and the old concept of measurement. He writes about a distance of a point from the origin on the line:-

If this distance has a rational relation to the unit of measure, then it is expressed by a rational quantity in the domain of rational numbers; otherwise, if the point is one known through a construction, it is always possible to give a sequence of rationals a1 , a2 , a3 , …, an , … which has the properties indicated and relates to the distance in question in such a way that the points on the straight line to which the distances a1 , a2 , a3 , …, an , … are assigned approach in infinity the point to be determined with increasing n. … In order to complete the connection presented in this section of the domains of the quantities defined [his determinate limits] with the geometry of the straight line, one must add an axiom which simply says that every numerical quantity also has a determined point on the straight line whose coordinate is equal to that quantity, indeed, equal in the sense in which this is explained in this section.

As we mentioned above, Dedekind had worked out his idea of Dedekind cuts in 1858. When he realised that others like Heine and Cantor were about to publish their versions of a rigorous definition of the real numbers he decided that he too should publish his ideas. This resulted in yet another 1872 publication giving a definition of the real numbers. Dedekind considered all decompositions of the rational numbers into two sets A1 , A2 so that a1 < a2 for all a1 in A1 and a2 in A2. He called (A1, A2) a cut. If the rational a is either the maximum element of A1 or the minimum element of A2 then Dedekind said the cut was produced by a. However not all cuts were produced by a rational. He wrote:-

In every case in which a cut (A1, A2) is given that is not produced by a rational number, we create a new number, an irrational number a, which we consider to be completely defined by this cut; we will say that the number a corresponds to this cut or that it produces the cut.

He defined the usual arithmetic operations and ordering and showed that the usual laws apply.
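A cut is determined entirely by the membership test for its lower set, so the cut producing √2 can be sketched in a few lines of Python (our illustration, not Dedekind’s notation):

```python
from fractions import Fraction

# The lower set A1 of the cut producing √2: all negative rationals
# together with the non-negative rationals whose square is below 2.
# A2 is the complement of A1 in the rationals.
def in_A1(a):
    return a < 0 or a * a < 2

# A1 has no greatest element and A2 has no least, so no rational
# produces this cut; Dedekind creates the irrational √2 to correspond to it.
print(in_A1(Fraction(7, 5)))   # True:  (7/5)² = 49/25 < 2
print(in_A1(Fraction(3, 2)))   # False: (3/2)² = 9/4  > 2
```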

Another definition, similar in style to that of Heine and Cantor, appeared in a book by Thomae in 1880. Thomae had been a colleague of Heine and Cantor around the time they had been writing up their ideas. He claimed that the real numbers defined in this way had a right to exist because:-

… the rules of combination abstracted from calculations with integers may be applied to them without contradiction.

Frege, however, attacked these ideas of Thomae. He wanted to develop a theory of real numbers built on a purely logical foundation, and attacked the philosophy behind the constructions which had been published. Thomae added further explanation to his idea of “formal arithmetic” in the second edition of his text, which appeared in 1898:-

The formal conception of numbers requires of itself more modest limitations than does the logical conception. It does not ask, what are and what shall the numbers be, but it asks, what does one require of numbers in arithmetic.

Frege was still unhappy with the constructions of Weierstrass, Heine, Cantor, Thomae and Dedekind. How did one know, he asked, that the constructions led to systems which would not produce contradictions? He wrote in 1903:-

This task has never been approached seriously, let alone been solved.

Frege, however, never completed his own version of a logical framework. His hopes were shattered when he learnt of Russell’s paradox. Hilbert had taken a totally different approach to defining the real numbers in 1900. He defined the real numbers to be a system with eighteen axioms. Sixteen of these axioms define what today we call an ordered field, while the other two were the Archimedean axiom and the completeness axiom. The Archimedean axiom stated that, given positive numbers a and b, it is possible to add a to itself a finite number of times so that the sum exceeds b. The completeness property says that one cannot extend the system and maintain the validity of all the other axioms. This was totally new since all other methods built the real numbers from the known rational numbers. Hilbert’s numbers were unconnected with any known system. It was impossible to say whether a given mathematical object was a real number. Most seriously, there was no proof that any such system actually existed, and if it did exist it was still subject to the same questions concerning its consistency as those Frege had pointed out.

By the beginning of the 20th century, then, the concept of a real number had moved away completely from the concept of a number which had existed from the most ancient times to the beginning of the 19th century, namely its connection with measurement and quantity.

The real numbers: Attempts to understand

Epple writes in [2]:-

What was a real number at the end of the 19th century? An intuitive, geometrical or physical quantity, or a ratio of such quantities? An aggregate of things identical in thought? A creation of the human mind? An arbitrary sign subjected to certain rules? A purely logical concept? Nobody was able to decide this with certainty. Only one thing was beyond doubt: there was no consensus of any kind.

Were the real numbers consistent? Would an inconsistency appear one day and much of the mathematical building come tumbling down? Some of the intuitive difficulties that began to be felt revolved around the fact that the real numbers were not countable, that is, they could not be put in 1-1 correspondence with the natural numbers. Cantor proved that the real numbers were not countable in 1874. He produced his famous “diagonal argument” in 1890, which gave a second, more striking, proof that the real numbers were not countable. To do this he assumed that the real numbers were countable, that is, that they could be listed in order. Suppose that this list is

L = {n1 , n2 , n3 , n4 , … }

and let d(i, j) be the i-th digit of nj. Define the real number r to have k-th digit 1 if d(k, k) = 2, and k-th digit 2 if d(k, k) ≠ 2. Then the real number r is not in the list L since, if it were, it would be nt for some t. But the t-th digit of r differs from the t-th digit of nt by construction, so we have a contradiction. Hence the real numbers are not countable.
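The diagonal rule is easily made concrete. The following Python sketch (our illustration) applies it to a necessarily finite sample of the supposed list L, each entry standing for the digits of some nk after the decimal point:

```python
# Build r digit by digit: the k-th digit of r is 1 exactly when the
# k-th listed number has k-th digit 2, and 2 otherwise, so r differs
# from every number in the list in at least one digit.
def diagonal(expansions):
    return "0." + "".join(
        "1" if x[k] == "2" else "2" for k, x in enumerate(expansions)
    )

L = ["14159265", "12345678", "41421356", "33333333"]
print(diagonal(L))   # 0.2122, which differs from the k-th entry in digit k
```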

We’ll construct a certain real number which, although historically not one that was looked at, will let us understand some of the questions that arose. Let us start with the 100 two-digit numbers. A simple code will let us translate these into letters: 00 becomes a, 01 becomes b, … , 25 becomes z, 26 becomes A, 27 becomes B, … , 51 becomes Z; then code all the punctuation marks, and make all the remaining numbers up to 99 translate to an empty space. Now create a number, say c, starting from the 100 2-blocks.

c = 0.01020304050607080910111213141516171819202122232425…

Then continue with the 10000 pairs of 2-blocks 0000, 0001, 0002, …, 0099, 0100, 0101, …

Then the 1000000 triples of 2-blocks etc. We can represent c as a point on a line segment of length 1. Yet every English sentence ever written, or ever to be written, occurs in the decoding of c into letters. For example “one third” has 9 characters so will be decoded from c around 10¹⁸ digits after the decimal point. This article is there, both with the misprints which inevitably occur and in a corrected version (but one has to go rather a long way to the right of the decimal point to find it!). The whole of Shakespeare is there, as is every book yet to be written, etc!
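To make the coding concrete, here is a minimal Python sketch of one possible choice of the 2-digit code (the precise assignment of the punctuation blocks is our assumption; the text above does not fix it):

```python
import string

# 00-25 -> a-z, 26-51 -> A-Z, the next few blocks -> punctuation marks,
# and every remaining block up to 99 -> a space.
LETTERS = string.ascii_lowercase + string.ascii_uppercase
PUNCT = ".,;:!?'-()"

def decode_block(block):
    v = int(block)
    if v < 52:
        return LETTERS[v]
    if v < 52 + len(PUNCT):
        return PUNCT[v - 52]
    return " "

def decode(digits):
    return "".join(decode_block(digits[i:i + 2])
                   for i in range(0, len(digits) - 1, 2))

# "one third" is decoded from the nine consecutive blocks
# 14 13 04 99 19 07 08 17 03:
print(decode("141304991907081703"))   # -> one third
```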

Let us use c to describe a paradox which was discovered in 1905. The first thing to notice is that all descriptions of real numbers in English (let us forget about words in other languages) must appear in c, since every possible sentence occurs in c. For example, “one third” will occur as we have noted, as will “the base of natural logs”, and “the ratio of the circumference of a circle to its diameter”, etc. This will enable us to explain Richard’s paradox, discovered by Jules Richard in 1905. There are only a countable number of such descriptions of real numbers in English, so all but a tiny countable subset of the real numbers can never be described in English. However, this is not the paradox. Richard obtained the paradox by using Cantor’s diagonal argument. Create a real number r, say, as follows:

If the n-th block of c translates into a description of a real number r(n) then set the n-th digit of r to be different to the n-th digit of r(n). If the n-th block of c does not describe a real number (most of course will not even be meaningful in English) then set the n-th digit of r to be 1.

Now the real number r cannot be described in English, since it differs by construction from every real number which can be described in English. That is a bit worrying. Even worse, of course, is the fact that we have just described r in English in the previous paragraph! If Richard’s paradox tells us anything then perhaps it is a warning not to use English (or any other language for that matter) when we are doing mathematics.

Emile Borel introduced the concept of a normal real number in 1909. His idea was to provide a test as to whether the digits in a real number occur in the sort of way they would if we chose each one at random. First assume that we have a real number written in base 10, that is, a decimal expansion. Then if it is a “random” number the digit 1 should occur about 1/10 of the time so, if we denote by N(1,n) the number of times 1 occurs in the first n decimal digits, then N(1,n)/n should tend to 1/10 as n tends to infinity. Similarly, for every digit i in the set {0, 1, 2, …, 9} we should have N(i,n)/n tending to 1/10 as n tends to infinity. Likewise a specific 2-digit number, say 47, should occur among all two-digit blocks about 1/100 of the time, etc. Borel called a number normal (in base 10) if every k-digit number occurred among all the k-digit blocks about 1/10ᵏ of the time. He called a number absolutely normal if it was normal in every base b.
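Borel’s frequency test N(i,n)/n is easy to carry out by machine. As a minimal Python sketch (our code, purely for illustration) we count digit frequencies in 1/7 = 0.142857142857…; the repeating block forces each frequency to be 1/6 or 0 rather than 1/10, so the test immediately exposes this rational as non-normal:

```python
from collections import Counter

def decimal_digits(p, q, n):
    """First n decimal digits of the fraction p/q, by long division."""
    digits, r = [], p % q
    for _ in range(n):
        r *= 10
        digits.append(str(r // q))
        r %= q
    return "".join(digits)

n = 60000
counts = Counter(decimal_digits(1, 7, n))
for i in "0123456789":
    print(i, counts[i] / n)   # digits 1, 2, 4, 5, 7, 8 give 1/6; the rest give 0
```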

Now Borel was able to prove that, in one sense, almost every real number is normal. His proof involved showing that the non-normal numbers form a subset of the reals of measure zero. There are still uncountably many non-normal numbers, however, as is easily seen by taking the subset of all real numbers with no digit equal to 1. These are uncountable, as can be seen using Cantor’s diagonal argument, yet clearly they are all non-normal. Clearly no rational is normal since its expansion eventually ends in a repeating pattern. However, despite proving these facts, Borel couldn’t show that any specific number was absolutely normal. This was first achieved by Sierpinski in 1917.

In his list of problems proposed to the International Congress of Mathematicians at Paris in August 1900, Hilbert stated that one of the most pressing issues for the foundations of mathematics was a proof of the consistency of arithmetic. He attempted to solve this himself but was unsuccessful. It was one thing trying to prove arithmetic consistent, but it was known that set theory led to paradoxes. These paradoxes worried many mathematicians, who felt that the foundations of mathematics needed to be built on a logical foundation which did not contain inherent contradictions. Three major paradoxes were due to Burali-Forti in 1897, Russell in 1902, and Richard in 1905. The first of these derived from the fact that the ordinal numbers themselves formed an ordered set whose order type had to be an ordinal number. Russell’s paradox is the well-known one relating to the set of all sets which do not contain themselves as an element, and Richard’s paradox we have explained above. Solutions proposed by some mathematicians would only allow mathematics to treat objects which could be constructed. Poincaré (1908) and Weyl (1918) complained that analysis had to be based on a concept of the real numbers which eliminated the non-constructive features. Weyl argued in his 1918 work that analysis should be built on a countable continuum. It was the uncountable, and so non-constructible, aspects of the real line which Weyl felt caused problems.

Gödel proved some striking theorems in 1930. He showed that a formal theory which includes the arithmetic of the natural numbers had to lead to statements which could neither be proved nor disproved within the theory. In particular the consistency of arithmetic was unprovable unless one used a higher order system in which to create the proof, the consistency of this system being equally unprovable. In 1936 Gentzen proved arithmetic consistent, but only by using transfinite methods which were less accepted than arithmetic itself. Although this topic of research is still an active one, most mathematicians accept the uncountable world of Cantor and the non-constructive system of real numbers.

In 1933 David Champernowne, who was an undergraduate at Cambridge University and a friend of Alan Turing, devised Champernowne’s number. Write the numbers 1, 2, 3, …, 9, 10, 11, … in turn to form the decimal expansion of a number

0.12345678910111213141516171819202122232425262728293031323334353637383940 …

In The Construction of Decimals Normal in the Scale of Ten published in the Journal of the London Mathematical Society in 1933, Champernowne proved that his number was normal in base 10. In 1937 Mahler proved that Champernowne’s number was transcendental. In fact he proved the much stronger result that if p(x) is a polynomial with integer coefficients then the real number obtained by concatenating the integers p(1), p(2), p(3), … to get

0.p(1)p(2)p(3)…

is transcendental. Champernowne’s number is the special case where p(x) = x. In 1946 Copeland and Erdős proved that the number

0.2357111317192329313741434753596167717379838997101103107109113127131137139 …

obtained in a similar way to Champernowne’s number, but using primes instead of all positive integers, was normal. Neither Champernowne’s number nor the Copeland and Erdős number is absolutely normal.
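Both numbers come from the same concatenation recipe, which the following minimal Python sketch (the helper names are ours) makes explicit:

```python
def is_prime(m):
    """Trial division, good enough for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def concatenated(seq, n):
    """0.xyz... formed by concatenating the terms of seq, cut to n digits."""
    out = ""
    for k in seq:
        out += str(k)
        if len(out) >= n:
            return "0." + out[:n]

print(concatenated(range(1, 1000), 40))                                # Champernowne
print(concatenated((k for k in range(2, 1000) if is_prime(k)), 40))    # Copeland-Erdős
```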

It is reasonable to ask whether π, √2, e etc are normal. The answer is that, despite “knowing” that such numbers must be absolutely normal, no proof has yet been found. In fact, although no irrational algebraic number has yet been proved to be absolutely normal, it was conjectured in 2001 that this is the case.

In 1927 Borel came up with his “know-it-all” number. We illustrate this by again using our number c. Since c contains every English sentence, it contains every possible true/false question that can be asked in English. Create a real number k as follows. If the n-th block of c translates into a true/false question then set the n-th digit of k to be 1 if the answer to the question is true, and 2 if the answer is false. If the n-th block of c does not translate into a true/false question then set the n-th digit of k to be 3. Then k answers every possible question that has ever been asked, or ever will be asked, in English. Borel describes k as an unnatural real number, or an “unreal” real. Borel devotes a whole book [1], which he published in 1952, to discussing another idea, namely that of an “inaccessible number”.

An accessible number, to Borel, is a number which can be described as a mathematical object. The problem is that we can only use some finite process to describe a real number, so only such numbers are accessible. We can describe rationals easily enough, for example as, say, one-seventh, or by specifying the repeating decimal expansion 142857. Hence rationals are accessible. We can specify Liouville’s transcendental number easily enough as having a 1 in place n! and 0 elsewhere. Provided we have some finite way of specifying the n-th term in a Cauchy sequence of rationals we have a finite description of the resulting real number. However, as Borel pointed out, there are only countably many such descriptions. Hence, as Chaitin writes in [6]:-

Pick a real at random, and the probability is zero that it’s accessible – the probability is zero that it will ever be accessible to us as an individual mathematical object.

In 1936 Turing published a paper called On computable numbers. Rather than look at the real numbers which can be described in English, Turing looked at a very precise description of a number, namely one which can be output digit by digit by a computer. He then took Richard’s paradox and ran through it again, this time with computable numbers. Clearly computer programs, being composed of a finite number of symbols, are countable. Hence computable numbers are countable. List all computer programs; in fact they will all occur in the number c above. Create a new real number t by Cantor’s diagonal argument, whose n-th digit is defined as follows. If the n-th block is a program which outputs a real number, make the n-th digit of t different from the n-th digit of the computable number which is output. If the n-th block is not a valid program to output a real number, then make the n-th digit of t equal to 1. Now t cannot be computable since, by construction, it differs from each computable number in at least one digit. However, we have just given a recipe for producing t which could easily be turned into a computer program, so t is computable.

Although the “English descriptions” of Richard’s paradox must hold the key to that paradox, in this case our “computable numbers” are very precise and not subject to the same difficulties. Do we really have the ultimate paradox, one which shows that the real numbers are inconsistent? No! So where is the error in our paradox? The error lies in the fact that when we run the computer programs we do not know whether they will ever output an n-th digit. Indeed, we can deduce from this argument that it is impossible to tell whether a computer program which has output k digits will ever output a (k+1)-st digit.
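A minimal Python sketch (our modelling of “programs” as generators of digits, not Turing’s machines) shows exactly where the recipe breaks down:

```python
# Two "programs" that output digits one at a time.
def well_behaved():
    yield 1
    yield 4

def stalls():
    yield 1
    while True:   # computes forever without ever producing a second digit
        pass

programs = [well_behaved(), stalls()]
print(next(programs[0]))   # first digit of program 0: fine
print(next(programs[1]))   # first digit of program 1: fine
# Asking next(programs[1]) once more would never return, and no general
# test can tell us so in advance; this is why the diagonal number t
# cannot actually be computed digit by digit.
```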