Discover numerical computation in Java, the study and devising of procedures, methods, and functions for solving problems with a computer.
Introduction
Numerical computation is the study and devising of procedures, methods, and functions for solving problems with a computer. An algorithm is a systematic procedure that solves a problem. A numerical computation software developer is interested not just in computing the answer to a problem, but also in the best or optimal algorithm that gets there most efficiently. Efficiency can be measured by the number of steps in the algorithm, the CPU time, and the amount of memory allocated for a specific numerical computation. The advantage of numerical computation is that a numerical answer to a specific model problem can always be obtained, even when it has no analytic solution at all. Take the following equation:
If you try to find a closed-form analytical solution to the above equation, do not bother; there is none. You cannot apply traditional methods such as integration by parts or integration by substitution to solve for a closed-form solution. Numerical computation can compute the answer to the above equation, or any type of model problem, by using well-known numerical analysis techniques such as Euler, Runge-Kutta-Fehlberg, multistep, Adams-Moulton, and so forth. The only mathematical operations required for solving are addition, subtraction, multiplication, division, and numeric comparisons. Because these simple operations are exactly what a computer does, numerical computation is best performed with one. A numerical method always produces a numerical result, while analytical methods produce a result in terms of mathematical functions. Analysis of computer errors and all sources of error in numerical computation is critical to the accuracy of a calculation. Numerical computation can be done in any computer language, but there are traditional ones that find favour with the community of software developers. Fortran has been the cornerstone choice from the early days of computers; Pascal, C, C++, and MatLab have also established themselves in numerical computing. Java has risen amongst academic institutions, government research institutes, industry, and the scientific communities to become a preferred language for developing numerically based software.
Arithmetic and Errors
Because numerical computation is compute-intensive, with many internal iterations for a given algorithm, it is very important to set certain parameters at the correct level. If the tolerance level is set too small, the computation will consume a lot of memory or run into an effectively endless loop. If it is set too large relative to the search boundaries, the procedure will terminate early and miss the solution. The following pseudocode illustrates this point. It is the algorithm for the bisection method, which finds a root of a function of one independent variable z, that is, f(z):
DO
    c = (a + b)/2
    IF ( f(c)*f(a) < 0 )
        b = c
    ELSE
        a = c
WHILE ( b - a > 2*TOL )

The bisection method searches the interval between the boundaries z=a and z=b for a value of z where f(z)=0. The value z=c is a root solution to f(z) with an error of no more than ½|b-a|. A reasonable value of the tolerance (TOL) is 1.0E-3, and it should not be made much smaller than, say, 1.0E-5, or the problem will be too slow to converge to the solution. Sometimes the tolerance value depends on the type of problem to be solved. For example, if you are developing software for nuclear physics analysis, you must set the tolerance to a smaller value, less than the atomic or nucleon radius, which is on the order of 1.0E-9 to 1.0E-15 meters; if you do not do this, your software will never find any solution (roots).
In numerical computing, the software developer should always be looking for trade-offs and optimal parameter values so that computation is efficient. This applies even when you are using the correct algorithm.
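As an illustration, the pseudocode above might be implemented in modern Java as follows; the class and method names here are hypothetical, not part of any standard library:

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative bisection root-finder; names are hypothetical, not a standard API.
public class Bisection {
    // Finds a root of f on [a, b], assuming f(a) and f(b) have opposite signs.
    public static double findRoot(DoubleUnaryOperator f,
                                  double a, double b, double tol) {
        if (f.applyAsDouble(a) * f.applyAsDouble(b) > 0) {
            throw new IllegalArgumentException("f(a) and f(b) must differ in sign");
        }
        double c;
        do {
            c = (a + b) / 2.0;
            if (f.applyAsDouble(c) * f.applyAsDouble(a) < 0) {
                b = c;   // root lies in [a, c]
            } else {
                a = c;   // root lies in [c, b]
            }
        } while (b - a > 2 * tol);   // stop once the interval is narrow enough
        return (a + b) / 2.0;        // midpoint is within tol of the root
    }

    public static void main(String[] args) {
        // Root of z^2 - 2 on [0, 2] is sqrt(2), approximately 1.41421
        System.out.println(findRoot(z -> z * z - 2.0, 0.0, 2.0, 1.0E-3));
    }
}
```

Note that the loop condition is the logical opposite of the stopping criterion: the iteration continues while the interval is still wider than 2*TOL.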
The following sections describe the different types of common errors that arise in numerical computation; these originate both from the way numerical methods are implemented and from the computer's inexact arithmetic:
Truncation error
This type of error is caused by the approximation method itself. A good example is the polynomial approximation of sin(x) using the Taylor series. Let's say we approximate sin(x) with only the first three terms (n = 0, 1, 2):
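Reconstructed from the standard Taylor series of sin(x) (the original equation is not reproduced in this text), the three-term approximation is:

```latex
\sin(x) \;=\; \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, x^{2n+1}
       \;\approx\; x - \frac{x^3}{3!} + \frac{x^5}{5!}
```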
Nevertheless, we know from its analytic form that computing sin(x) exactly requires an infinitely long series. The error arises because the series is truncated; it has nothing to do with the computer at all. The discarded remainder of the summation represents the error. It is unavoidable, but the series must be truncated to make any real problem or algorithm computable.
Round-off error
Computers represent floating-point numbers (real numbers) with a fixed word length, so the exact or true value is not expressed precisely by such computer representations. When floating-point numbers are rounded before being stored, the round-off error is smaller than if the trailing digits were simply chopped off.
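A minimal Java sketch of round-off in action; the classic example below relies only on standard IEEE 754 double arithmetic:

```java
// Demonstrates round-off: 0.1 and 0.2 have no exact binary representation.
public class RoundOff {
    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum);        // prints 0.30000000000000004, not 0.3
        System.out.println(sum == 0.3); // prints false
        // Compare with a tolerance instead of exact equality:
        System.out.println(Math.abs(sum - 0.3) < 1.0E-9); // prints true
    }
}
```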
Input data error
Errors of this type arise from input measurements, and it is vital that there is a mechanism to control them. This is a very important issue in the design of DSP (Digital Signal Processor) chipsets and feedback control systems, where noise (unwanted data) fed into the system offsets the desired setpoint of the whole system. DSP is used in almost all electronic devices on the market, including computers, TVs, video recorders, mobile phones, and so on.
Propagation error
Propagation errors arise from the repeated accumulation of earlier, smaller errors across the internal iterations of a numeric computation. As each new iteration executes, the error from previous steps carries over to the next and tends to be magnified. In the bisection pseudocode above, there is a potential loss of precision each time the iteration returns to the top of the DO loop: the result from the previous step is stored and becomes the input for the next round of iteration. Again, this error type is taken into account by software engineers who design DSP and control systems.
Machine accuracy (EPSILON)
Machine accuracy is defined by the smallest floating-point number that, when added to the floating-point number 1.0, produces a result different from 1.0. This smallest floating-point number is called epsilon. In numeric-intensive applications where a divide-by-zero is likely to be encountered, epsilon is added to the denominator of the intended function to stop the propagation of a NaN (Not-a-Number) or Inf (Infinity) value through the rest of the calculation. An example of this is a type of signal known as the sinc pulse in the area of Digital Signal Processing. A sinc pulse is represented mathematically as a function of time by the equation sinc(t) = sin(πt)/(πt).
If you substitute zero for the independent variable t (t=0) in the sinc pulse equation, the result is a divide-by-zero. In Java, integer division by zero throws an ArithmeticException, while floating-point division of zero by zero silently produces NaN. The answer, as we know from applying the theory of limits in calculus to the sinc pulse, should be 1.0, but the computer returns NaN instead. Now, if epsilon (approximately 2.2E-16 for doubles) is added to the variable t, both in the numerator and the denominator, you will get the correct answer, 1.0, with no error. The addition of epsilon will not noticeably alter the magnitude of the target function because epsilon is so small (almost zero); it is essentially like adding zero to a number, which changes nothing. This is a normal practice (adding epsilon) in the development of numeric-intensive software in areas such as particle physics, DSP, cosmology, finance, stochastic processes in statistics, econometric analysis, engineering, and oil exploration.
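A sketch of the epsilon trick applied to the sinc pulse; the class name and the exact epsilon constant are illustrative assumptions:

```java
// Sketch of the epsilon trick for the sinc pulse sinc(t) = sin(pi*t)/(pi*t).
public class Sinc {
    static final double EPS = 2.2E-16; // roughly machine epsilon for double

    public static double sinc(double t) {
        // Adding EPS to t in both numerator and denominator avoids 0/0 at
        // t = 0 while leaving other values essentially unchanged.
        double u = Math.PI * (t + EPS);
        return Math.sin(u) / u;
    }

    public static void main(String[] args) {
        System.out.println(sinc(0.0)); // approximately 1.0 instead of NaN
        System.out.println(sinc(1.0)); // approximately 0.0
    }
}
```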
Java API for Intensive Numeric Computing
Currently there is a Java Specification Request (JSR-83) to draft an API for numeric computation. JSR-83 proposes a package implementing rectangular multidimensional arrays for the Java platform. The draft is endorsed by the Java Grande Forum (JGF), a forum formed by representatives from industry and academic institutions with a special interest in acting in an advisory role to those working on Grande Applications. They address Java language design and implementation issues and relay this input directly to Sun. A Grande Application is defined as a software application that consumes a lot of computer resources, and any numerically intensive software fits this definition.
This API supports multidimensional arrays; a multiarray is characterised by its rank, its datatype, and its shape. The multiarray package supports all primitive types as well as complex numbers. It also supports the concept of regular multiarray sections, which correspond to subarrays of another multiarray.
The package also implements Fortran-like functionality for arrays, such as element-wise multiplication, addition, subtraction, and division. Arrays can be concatenated (merged) column-wise, provided the arrays to be merged have the same number of rows, or row-wise when they have the same number of columns. Elementary and transcendental functions are also implemented, and an individual element of an array can be accessed through getter and setter methods. The multiarray package is implemented without any change to the JVM (Java Virtual Machine), the JNI (Java Native Interface), or the Java language specification. Multidimensional arrays are the foundation of scientific, engineering, and numeric-intensive computation.
The target platforms for this multiarray package are the desktop, server, personal, embedded, card, and so forth; the proposed name for the package is javax.math.multiarray, although it has not yet been finalized. The multiarray package is based on the NinJA (Numeric Intensive Java) project from IBM, which can be freely downloaded. The package also implements BLAS (Basic Linear Algebra Subroutines) routines.
Performance
A typical reaction to Java from numerical computing software developers is that "it is slow." Indeed, this was the case when the first JVMs appeared: they worked by completely interpreting the bytecode in class files, and performance was very poor. Technology has changed in the last few years, and nearly every JVM available today for traditional computing devices such as PCs, workstations, and servers uses just-in-time (JIT) compiler technology. A JIT operates as part of the JVM, compiling Java bytecode into native machine code at runtime; once the machine code is generated, it executes at raw machine speed. JIT compilers perform sophisticated optimisations, such as eliminating redundant array bounds checks and allocating objects on the stack.
Members of the JGF at the National Institute of Standards and Technology (NIST) conducted a benchmark test using the most common kernels found in scientific applications: fast Fourier transform (FFT), successive over-relaxation (SOR) iterations, Monte Carlo quadrature, sparse-matrix multiplication, and dense-matrix LU factorisation for the solution of linear systems. The results show that the performance of Java varies greatly across computing platforms, mainly because of different implementations of the JVM rather than the underlying hardware architecture. The test showed that Java code performed competitively with optimised C and Fortran; Java clearly outperformed C on the Windows platform, where both the Microsoft and Borland C compilers were used. The group emphasized that the test compares not the languages themselves, but the different implementations of compilers and execution environments, which differ from vendor to vendor. The JGF concluded that Java performance is certainly competitive with C, even though in the worst cases numeric code in Java achieved only about 50% of the performance of conventional compiled languages.
Obstacles to Numerical Computing with Java
Despite the advances of JVM and compiler technology in recent years, challenges still remain with numerical computing with Java. These challenges are discussed in the following sections.
Over-restrictive floating-point semantics
Java currently forbids common optimisations, such as making use of the associativity of mathematical operators; for example, (a+b)+c may produce a different rounded result than a+(b+c). Fortran compilers, by comparison, routinely exploit the associativity of real numbers for code optimisation. Java also forbids the use of fused multiply-add (FMA) operations. An FMA computes a quantity such as a*x+y as a single floating-point operation. This type of operation is found in many compute-intensive applications, particularly matrix algebra operations.
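A small Java demonstration of why reassociation changes results; the sample values below are chosen so that the two groupings round differently:

```java
// Floating-point addition is not associative, which is why Java's strict
// evaluation order forbids the reordering optimisations described above.
public class Associativity {
    public static void main(String[] args) {
        double a = 1.0E16, b = -1.0E16, c = 1.0;
        System.out.println((a + b) + c); // prints 1.0
        System.out.println(a + (b + c)); // prints 0.0 (b + c rounds to -1.0E16)
    }
}
```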
As an example, most government treasury departments around the world use a well-known economic model called the Leontief input-output model, named after its developer, an academic and Nobel Prize winner in Economics (1973). The Leontief model divides the nation's economy into many sectors, such as manufacturing, communication, entertainment, and service industries. Its matrix representation is shown below; the original model uses a 500 x 500 (rows by columns) matrix to represent the model variables, where I is the identity matrix, C the consumption matrix, D the demand column-matrix, and Y the output column-matrix of amounts produced.
Y = C*Y + D
Y = (I - C)^-1 * D   (solution)

As you can see, with a calculation that has hundreds of thousands (or even millions) of economic variables, it is going to take time even for a supercomputer to solve the equation. The computer takes two major steps to calculate the answer: first, it computes the inverse of the bracketed term; second, it calculates the matrix product of that inverse and D (the demand column-matrix). Calculating the inverse matrix in the first step is where the processor spends most of its time. There are other algorithms, somewhat faster than taking the inverse, such as partitioning the large matrix into smaller ones. Matrix partitioning is expected to be addressed in the JSR-83 multiarray package.
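As a hedged sketch, the solution step (I - C)Y = D can be computed without an explicit inverse; the two-sector consumption matrix and demand vector below are invented sample data, and the solver is a plain Gaussian elimination rather than any library routine:

```java
// Illustrative 2-sector Leontief model solved with naive Gaussian elimination.
// The consumption matrix C and demand vector D below are invented sample data.
public class Leontief {
    // Solves (I - C) * Y = D by Gaussian elimination with partial pivoting.
    public static double[] solve(double[][] C, double[] D) {
        int n = D.length;
        double[][] A = new double[n][n + 1]; // augmented matrix [I - C | D]
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                A[i][j] = (i == j ? 1.0 : 0.0) - C[i][j];
            }
            A[i][n] = D[i];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;                 // partial pivoting for stability
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(A[r][col]) > Math.abs(A[pivot][col])) pivot = r;
            }
            double[] tmp = A[col]; A[col] = A[pivot]; A[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double factor = A[r][col] / A[col][col];
                for (int j = col; j <= n; j++) A[r][j] -= factor * A[col][j];
            }
        }
        double[] Y = new double[n];          // back substitution
        for (int i = n - 1; i >= 0; i--) {
            double s = A[i][n];
            for (int j = i + 1; j < n; j++) s -= A[i][j] * Y[j];
            Y[i] = s / A[i][i];
        }
        return Y;
    }

    public static void main(String[] args) {
        double[][] C = {{0.2, 0.3}, {0.4, 0.1}}; // sample consumption matrix
        double[] D = {100.0, 50.0};              // sample final demand
        double[] Y = solve(C, D);
        System.out.println(Y[0] + ", " + Y[1]);  // approximately 175, 133.33
    }
}
```

For real 500 x 500 systems, an LU factorisation from a numerical library would be used instead of this textbook elimination, but the structure of the computation is the same.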
With the FMA instruction, only a single rounding occurs for the two arithmetic operations, yielding a more accurate result in less time than would be required for two separate operations. Java's strict language definition does not use FMA and thus sacrifices performance on some platforms.
For Java to achieve the fastest performance possible, these restrictions must be relaxed. This could be done by introducing a fast mode for the execution of floating-point operations in Java.
Inefficient support for complex numbers and alternate systems
Any language that seriously supports scientific and engineering computing should consider in its design the ease and efficiency with which computation with complex numbers can be done. Some compute-intensive applications are best handled with complex numbers, such as those in fluid dynamics, opto-electronics, electromagnetics, DSP, acoustics, and electrical power systems and transmission lines. Because complex numbers can only be created in Java as objects, the semantics of assignment (=) and equality (==) differ from those of primitives. Complex arithmetic is slower than Java's arithmetic on primitive types because it takes longer to create and manipulate objects.
Objects also incur more storage overhead than primitive datatypes, and temporary objects must be created for almost every method call. Because every arithmetic operation is a method call, this leads to a glut of temporary objects that must be dealt with frequently by the garbage collector. Primitive datatypes, in comparison, are allocated directly on the stack, which makes their manipulation very efficient.
The above shortfall could be remedied by introducing some new features to the Java language: the first is operator overloading and the second is lightweight objects. Operator overloading allows one to define, for example, the meaning of (A+B) when A and B are arbitrary objects. Operator overloading is available in other languages, such as C++. Java would need overloading of the arithmetic operators, the comparison operators, and the assignment operator.
Another approach for both efficiency and a convenient notation would be to extend the Java language with complex primitive datatypes; this could be done without extending the JVM. This job could be taken care of by the compiler.
Allowing operator overloading and complex primitives in Java would make arithmetic coding easy to read, code, and maintain. To illustrate this point, take the following example from electrical engineering: a two-element impedance (Z1, Z2) connected as a parallel network to a voltage source (V). This is how to implement the calculation of the electrical current (I) flowing through Z1 and Z2 by using the multiarray package from JSR-83:
Complex Z1 = new Complex(blah, blah);  // blah is a double floating-point value
Complex Z2 = new Complex(blah, blah);
Complex ONE = new Complex(1, 0);       // the real number one
Complex V = new Complex(blah, blah);
Complex I = V.times(ONE.div(Z1).plus(ONE.div(Z2)));  // I = V*(1/Z1 + 1/Z2)

The above calculation involves only one voltage source (V) and two impedance elements (Z1 and Z2), but the real world of electronics and circuitry involves hundreds, thousands, or even millions of electrical components on a board or a microchip. You can imagine how difficult it is to code very long equations for multiple-component circuits. It would be much more mathematically readable if operator overloading or complex primitives were available in Java. The last line of code from the above example would look like the following if operator overloading were allowed:
Complex I = V*(1/Z1 + 1/Z2);  // This way of coding is readable and easy.
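A minimal, hypothetical Complex class in the spirit of the example above; the method names plus, times, and div mirror the article's usage but are not taken from any finalized API:

```java
// Minimal immutable Complex sketch so the impedance example above compiles;
// the method names (plus, times, div) are illustrative, not a standard API.
public class Complex {
    public final double re, im;

    public Complex(double re, double im) { this.re = re; this.im = im; }

    public Complex plus(Complex o)  { return new Complex(re + o.re, im + o.im); }

    public Complex times(Complex o) {
        return new Complex(re * o.re - im * o.im, re * o.im + im * o.re);
    }

    public Complex div(Complex o) {          // a/b = a * conj(b) / |b|^2
        double d = o.re * o.re + o.im * o.im;
        return new Complex((re * o.re + im * o.im) / d,
                           (im * o.re - re * o.im) / d);
    }

    public static void main(String[] args) {
        Complex ONE = new Complex(1, 0);
        Complex V  = new Complex(10, 0);
        Complex Z1 = new Complex(1, 1);      // sample impedances
        Complex Z2 = new Complex(1, -1);
        Complex I  = V.times(ONE.div(Z1).plus(ONE.div(Z2)));
        System.out.println(I.re + " + " + I.im + "i"); // prints 10.0 + 0.0i
    }
}
```

With these sample values, 1/Z1 + 1/Z2 = 1, so I equals V; note how many method calls (and temporary objects) even this tiny calculation generates.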

The development of SPICE (Simulation Program with Integrated Circuit Emphasis) software in Java is starting to emerge from academic institutions. It is still too early to expect a massive rewrite of the various existing SPICE implementations in Java, but as Java's numeric computation capability improves over time, Java will appeal to commercial SPICE software makers.
Multidimensional arrays
Java lacks true multidimensional arrays. It offers multidimensional arrays only as arrays of one-dimensional arrays, which is how the JSR-83 multiarray package is implemented underneath. The lack of true multidimensional arrays causes a problem for optimisation: accessing an element of a multidimensional array requires multiple pointer indirections and multiple bounds checks at runtime. The JGF proposes that, if Java performance is to be brought on par with Fortran, the language specification requires true rectangular multidimensional arrays in which all rows have exactly the same length. Another proposal is to allow elegant and efficient access with notation like a[i, j] instead of calling the set(i, j) and get(i, j) accessor methods, which hurt performance when dealing with huge matrix arrays.
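A sketch of the "true rectangular array" idea: storing a 2-D matrix in one flat, contiguous 1-D array with row-major indexing; the class and method names are illustrative:

```java
// Sketch of a rectangular 2-D matrix stored in one flat 1-D array with
// row-major indexing, avoiding the per-row pointer chasing of double[][].
public class Matrix2D {
    final int rows, cols;
    final double[] data;                 // single contiguous block

    public Matrix2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    // Element (i, j) lives at flat offset i*cols + j.
    public double get(int i, int j) { return data[i * cols + j]; }

    public void set(int i, int j, double v) { data[i * cols + j] = v; }
}
```

One indexed access into a contiguous block replaces two dereferences and two bounds checks, which is essentially what the proposed language-level rectangular arrays would let the JIT do automatically.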
Applications of Numerical Computing
The following sections give some examples of the application of numerical computing and how different industries adopt them.
Electrical and Electronic engineering
As mentioned in the example about SPICE above, software engineers develop such applications for use by electrical and electronic engineers building circuitry, microchips, and transistors. A single modern microchip is made up of millions of transistors. All SPICE-variant software relies mainly on linear algebra techniques and systems of linear equations.
Oil Exploration and Petroleum Industries
Ships of some petroleum companies regularly search for offshore deposits, and their computer systems solve thousands of separate systems of linear equations every day. Seismic data for the equations are obtained from underwater shock waves created by explosions from air guns. The waves bounce off subsurface rocks and are measured by geophones attached to milelong cables behind the ship. Petroleum companies use supercomputers to solve the systems of linear equations that give them some hard data indicating where to drill.
Linear Programming and Operations Research
The term "Linear Programming" should not be mistaken for computer software programming; the "programming" here refers to planning rather than coding. It is a branch of mathematics that deals with optimisation in the allocation of limited resources. Many important management decisions today are made on the basis of Linear Programming models that utilize hundreds or thousands of variables. The airline industry, for instance, employs Linear Programming software that schedules flight crews, monitors the locations of aircraft, and plans the varied schedules of support services such as maintenance and terminal operations. Linear Programming models solve equations involving matrix algebra with huge matrices.
Climate and Weather Forecast
In many applications there is a need for a continuous time stream of weather data to be used as input to a computer model of the processes of interest. High-speed statistical and stochastic models, differential calculus, and matrix algebra software are used to predict and forecast weather patterns.
Financial Modelling
For stock market analysts, stock brokers, merchant bankers, security analysts, commercial bankers, corporate financiers, and so on, it is vital to use numerical analysis software to assist in making the right decision. Analytic financial modelling software provides functionality built on common numerical algorithms that are heavily used in financial analysis, such as Monte Carlo methods, probability distributions, matrix algebra, differential calculus, numerical integration, spline approximation, and the like.
There is confusion between the meanings of financial software and financial modelling, which arises among accountants and software developers who have no previous experience of numerical computing. To some developers the two terms mean one thing, but to a numerical analyst they are two similar, overlapping fields. Financial software deals with accounting-type calculations, such as balance sheets and interest rates, which do not use any significant amount of memory. Financial modelling involves heavy numeric calculations such as stock-price time series, portfolio covariance risk, and market option-pricing models such as arbitrage pricing, the capital asset pricing model, and binomial and Black-Scholes pricing. These calculations usually involve massive data input from financial databases, such as stock market repositories, and use the numerical algorithms mentioned above.
A number of academic publications available on the Internet show that researchers have applied techniques previously used only in signal processing, such as wavelet analysis, to the analysis of financial time series. The predictions obtained from wavelet analysis are regarded as being very true to life.
Artificial Intelligence and SoftComputing
Machine learning, soft computing, and data mining involve number-crunching computation to find patterns in large databases, such as mining NASA's huge astronomical databases to discover new planets, solar systems, galaxies, and so forth. NASA's data-mining software has discovered black holes, planets, and galaxies whose signatures had been buried in the database since the 1970s or earlier. The data is collected through telescope imaging, satellite pictures, radio and particle detectors, and so on. Data mining is also employed by big corporate businesses.
Bioinformatics and Life Sciences
Bioinformatics is a new IT discipline that has sprung up over the last decade. It is the marriage of computer science, mathematics, and the life sciences (biology, biochemistry, and pharmacology). Another term frequently used interchangeably with Bioinformatics in the academic literature is Computational Biology. The Genome project (gene mapping) falls under this category. Bioinformatics deals with gene mapping and sequencing, protein sequence analysis, protein structure prediction, phylogenetic inference, regulatory analysis, and so forth; the algorithms involved are heavily numeric.
Satellite Image Processing
The processing of satellite radar imaging data is a compute- and memory-intensive task because of the heavy numeric algorithms involved. Satellite imaging software can process thousands of 4000 x 4000 or even 8000 x 8000 pixel images within a short, regular period. Such large images can be relayed directly from a satellite to a processing centre for real-time analysis.
The Future of Java in Numerical Computing
If Sun Microsystems implements some of the proposals from the JGF in future versions of the Java language specification, we will see Java become the number-one language choice for developing numerically compute-intensive application software, as it has in other areas of computing such as server-side applications.
The multiarray API package coming out of JSR-83 through the Java Community Process (JCP) will add new numerical functionality to future versions of Java. There is no doubt that the multiarray API will evolve over time to include different specialised subpackages.
There has been informal talk from the JGF that the forum would like to propose a Digital Signal Processing API, but such a proposal would have to wait until the package "javax.math.multiarray" (proposed name) emerges from the JCP. If there is ever a draft specification for a DSP API, expect Java to make a big move into embedded software. Defence projects, such as radar station software, could then be developed entirely in Java, and costs would be expected to be lower than today, because DSP numerical algorithms are very complex to develop from the ground up. With a DSP API, the development of systems like radar station software in Java would be faster to market. Software developers involved in such projects would not have to worry about complicated numerical algorithms such as discrete signal convolution (CONV), signal cross-correlation, and finite-impulse-response (FIR) and infinite-impulse-response (IIR) filters, because they would already be implemented in the DSP API; developers could concentrate on application development.
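As an illustration of one of the kernels named above, here is a minimal discrete convolution in plain Java; a hypothetical DSP API would supply an optimised version of this:

```java
// Illustrative discrete convolution y[k] = sum over i of x[i] * h[k - i],
// one of the basic DSP kernels mentioned above.
public class Conv {
    public static double[] conv(double[] x, double[] h) {
        double[] y = new double[x.length + h.length - 1];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < h.length; j++) {
                y[i + j] += x[i] * h[j]; // accumulate each product at lag i+j
            }
        }
        return y;
    }

    public static void main(String[] args) {
        double[] y = conv(new double[]{1, 2, 3}, new double[]{1, 1});
        System.out.println(java.util.Arrays.toString(y)); // [1.0, 3.0, 5.0, 3.0]
    }
}
```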
Although there are language issues relating to numerical computing that need to be addressed in future versions of Java, the language is growing in popularity for writing heavy numeric computation software.
Java Numerical Libraries
There are numerous freeware, proprietary, commercial, and GPL numerical libraries available in Java today from commercial vendors, research institutes, academic institutions, and computer hobbyists. The National Institute of Standards and Technology maintains links to many of them on its Java Numerics home page (refer to the resources below). Most of the popular technical computing environments for scientific and engineering computation support Java, including MatLab from MathWorks, Mathematica from Wolfram Research, and Maple from Waterloo MapleSoft. All of these environments have interfaces and Java plugins through which they can call external Java applications, and Java applications can in turn call routines from these environments. The number of Java GPL numerical libraries is growing; new ones become available daily.
Resources
Web links and downloads
Books with excellent numerical algorithms
- Introductory Java for Scientists and Engineers by Richard Davies, pub: Addison Wesley Longman
- Java for Engineers and Scientists by Stephen J. Chapman, pub: Prentice Hall, 1999
- Digital Image Processing, a Practical Introduction Using Java, by Nick Efford, pub: Addison-Wesley
- Digital Image Processing, by R.C. Gonzalez and R.E. Woods, pub: Addison-Wesley
- Numerical Recipes in Fortran 77, the Art of Scientific Computing (Volume 1) by William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, pub: Cambridge University Press, 1997
- Linear Algebra and Its Applications (Second Edition), by David C. Lay, pub: Addison-Wesley Pub. Co.
- Applied Numerical Analysis (Sixth Edition) by Curtis F. Gerald and Patrick O. Wheatley, pub: Addison-Wesley Pub. Co., 1999
- Introduction to Algorithms by T.H. Cormen, C.E. Leiserson, and R.L. Rivest, pub: MIT Press
- Elements of the Theory of Computation by H.R. Lewis and C.H. Papadimitriou, pub: Prentice Hall
- Differential Equations with Applications and Historical Notes (2nd Edition), by G.F. Simmons
- Calculus, Volume 2, 2nd Edition, by T.M. Apostol, pub: Wiley and Sons, Inc. (This book has excellent notes on finite-element algorithms and also multiple integration)
- Signals and Linear System Analysis (2nd Edition), by G.E. Carlson, pub: Wiley and Sons, Inc. (This book is MatLab-based but is very rich in numerical algorithms)
- Complex Variables and Applications (6th Edition), by James Ward Brown and Ruel V. Churchill, pub: McGraw-Hill College
Books for numerical application in finance
- Financial Theory and Corporate Policy (3rd Edition), by T.E. Copeland and J.F. Weston, pub: Addison-Wesley
- Econometric Analysis (4th Edition), by W.H. Greene, pub: Prentice Hall
- Modern Industrial Statistics, Design, and Control of Quality and Reliability, by R.S. Kenett and S. Zacks, pub: Duxbury Press
About the Author
Sione Palu has developed software for publishing systems, imaging, and Web applications. Currently, Palu develops his own (Swing-based) software application for symbolic algebra and mathematics visualization aimed at high-school students. Palu graduated from the University of Auckland, New Zealand, with a science degree in mathematics and computing. He has a personal interest in applying Java and mathematics to mathematical modelling and simulation, symbolic AI and soft computing, wavelets, digital signal processing, and control systems.