- Lexical analysis of a chemical formula
- Calculation of the molecular weight
- Example 1: Calculation of molecular weight
- Example 2: Preparation of a solution with a given concentration
- Example 3: Preparation of a mixture of two compounds

## Lexical analysis of a chemical formula

A basic operation in chemistry is to transform a chemical formula, which is a string made of letters and digits, into a molecular weight (a real number). This is obvious to a chemist but is not easily achieved by a computer.

Lexical analysis of the chemical formula is performed in the class analysis:

analysis an = new analysis(chemical_formula);

Characters are analysed in the following sequence:

- Beginning of the lexical analysis of a chemical formula with n atoms and n coefficients
- Atom n and corresponding coefficient:
  - The first letter has to be uppercase: H, Cl, ...
  - If it exists, the second letter has to be lowercase: Cl, Al, ...
  - Then there could be a digit: H2O, Al2O3, ...
  - There could be a second digit if the coefficient is greater than 9
  - Then there could be a dot if the coefficient is a real number: Fe0.9O, ...
  - Then there could be a digit (first decimal)
  - Then there could be another digit (second decimal)
  - The next character could be a comma: NaCl,H2O
  - Atom n and its coefficient are obtained; go to atom n+1
- End of the lexical analysis
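
As an illustration only, the same sequence of checks can be condensed into a regular expression. The sketch below is not the article's implementation: the class name FormulaTokens and the method tokenize are hypothetical, and it assumes coefficients of at most two digits before and after the dot, as in the list above. Unlike the class analysis, it sums the coefficients of a symbol that appears several times and rejects unexpected characters with an exception.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's code.
class FormulaTokens {
    // One token: uppercase letter, optional lowercase letter, optional
    // coefficient (1-2 digits, optionally a dot and 1-2 decimals), optional comma.
    private static final Pattern TOKEN =
        Pattern.compile("([A-Z][a-z]?)(\\d{1,2}(?:\\.\\d{1,2})?)?,?");

    // Maps each element symbol to its summed coefficient, in order of first appearance.
    static Map<String, Float> tokenize(String formula) {
        Map<String, Float> counts = new LinkedHashMap<>();
        Matcher m = TOKEN.matcher(formula);
        int pos = 0;
        while (pos < formula.length()) {
            // The next token must start exactly where the previous one ended.
            if (!m.find(pos) || m.start() != pos) {
                throw new IllegalArgumentException(
                    "Unexpected character at index " + pos + " in " + formula);
            }
            float coeff = (m.group(2) == null) ? 1f : Float.parseFloat(m.group(2));
            counts.merge(m.group(1), coeff, Float::sum);
            pos = m.end();
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Al2O3"));      // {Al=2.0, O=3.0}
        System.out.println(tokenize("Fe0.9O"));     // {Fe=0.9, O=1.0}
        System.out.println(tokenize("NaClO4,H2O")); // {Na=1.0, Cl=1.0, O=5.0, H=2.0}
    }
}
```

The regular expression trades the explicit character-by-character checks for a single pattern, which makes the accepted grammar easier to read but hides the step order that the list describes.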

**class analysis{**

    String chemForm;
    float molmas = 0f;

    analysis(String cForm){
        chemForm = cForm; // Chemical formula
        String s[] = new String[20]; // Symbols of the elements in the chemical formula
        float massat = 0; // Atomic masses
        float coeff[] = new float[20]; // Coefficients
        int len = cForm.length(); // Number of characters in the formula
        char c;
        String ch, coefficient;
        int a = 0, i = 0, end = 0;
        cForm = cForm + " ";

        // Lexical analysis of the chemical formula in args[0]
        do{
            ch = ""; coefficient = "1"; coeff[a] = 0;

            // First letter has to be uppercase
            c = cForm.charAt(i);
            if(Character.isUpperCase(c)){
                ch = String.valueOf(c);
                s[a] = ch;
                i++;
            }

            // If exists, second letter has to be lowercase
            c = cForm.charAt(i);
            if(Character.isLowerCase(c)){
                ch = String.valueOf(c);
                s[a] = s[a] + ch; // The symbol of the element is obtained
                i++;
            }

            // Then could be a number (digit)
            c = cForm.charAt(i);
            if (Character.isDigit(c)){
                coefficient = String.valueOf(c);
                i++;
            }

            // Could be again a number
            c = cForm.charAt(i);
            if (Character.isDigit(c)){
                coefficient = coefficient + String.valueOf(c);
                i++;
            }

            // Then could be a dot if it is a real number
            c = cForm.charAt(i);
            if(c == '.'){
                coefficient = coefficient + ".";
                i++;
            }

            // Then could be a digit (first decimal)
            c = cForm.charAt(i);
            if (Character.isDigit(c)){
                coefficient = coefficient + String.valueOf(c);
                i++;
            }

            // Then could be again a digit (second decimal)
            c = cForm.charAt(i);
            if (Character.isDigit(c)){
                coefficient = coefficient + String.valueOf(c);
                i++;
            }

            // The next character could be a comma
            c = cForm.charAt(i);
            if(c == ',') i++;

            coeff[a] = Float.valueOf(coefficient).floatValue();
            if (coeff[a] == 0) coeff[a] = 1;
            a++;
        }while(i <= len-1); // End of the lexical analysis of the chemical formula

        end = a - 1;
        calc_masmol ms = new calc_masmol(end, s, coeff);
        molmas = ms.mt();
    }

    float result(){return molmas;}

}

## Calculation of the molecular weight

Molecular weights are obtained from the class calc_masmol. The atomic symbols symb[] and weights ma[] are stored in the program as final arrays. These data could be read from a separate file, but since they are fixed it is more convenient to compile them into the program.

**class calc_masmol{**

    float masmol;

    static final String symb[] = {
        "Ac", "Ag", "Al", "Am", "As", "At", "Au", "B", "Ba",
        "Be", "Bi", "Bk", "Br", "C", "Ca", "Cd", "Ce", "Cf", "Cl", "Co", "Cr", "Cs", "Cu",
        "Dy", "Er", "Es", "Eu", "F", "Fe", "Ga", "Gd", "Ge", "H", "Hf", "Hg", "Ho", "I",
        "In", "Ir", "K", "La", "Li", "Lu", "Lr", "Md", "Mg", "Mn", "Mo", "N", "Na", "Nb",
        "Nd", "Ni", "No", "Np", "Os", "P", "Pa", "Pb", "Pd", "Pm", "Po", "Pr", "Pt", "Pu",
        "Ra", "Rb", "Re", "Rh", "Ru", "S", "Sb", "Sc", "Se", "Si", "Sm", "Sn", "Sr", "Ta",
        "Tb", "Tc", "Te", "Th", "Ti", "Tl", "Tm", "U", "V", "W", "Y", "Yb", "Zn", "Zr", "O"};

    static final float ma[] = {
        227.0278f, 107.8682f, 26.98f, 243.0614f, 74.9216f, 209.987f,
        196.966f, 10.811f, 137.327f, 9.012f, 208.980f, 247.07f, 79.904f, 12.011f,
        40.078f, 112.411f, 140.115f, 251.0796f, 35.4527f, 58.933f, 51.996f, 132.905f,
        63.546f, 162.50f, 167.26f, 252.083f, 151.965f, 18.998f, 55.847f, 69.723f,
        157.25f, 72.61f, 1.00794f, 178.49f, 200.59f, 164.930f, 126.905f, 114.82f,
        192.22f, 39.0983f, 138.906f, 6.941f, 174.967f, 260.1053f, 258.099f, 24.305f,
        54.938f, 95.94f, 14.007f, 22.90f, 92.906f, 144.24f, 58.69f, 259.1009f, 237.048f,
        190.2f, 30.974f, 231.036f, 207.2f, 106.42f, 146.915f, 208.9824f, 140.908f,
        195.08f, 244.064f, 226.03f, 85.47f, 186.207f, 102.91f, 101.07f, 32.066f, 121.75f,
        44.96f, 78.96f, 28.09f, 150.36f, 118.71f, 87.62f, 180.95f, 158.93f, 98.91f,
        127.6f, 232.04f, 47.88f, 204.38f, 168.93f, 238.029f, 50.94f, 183.85f, 88.91f,
        173.04f, 65.39f, 91.224f, 15.994f};

    calc_masmol(int ed, String s[], float coeff[]){
        float massat[] = new float[ed + 1];

        for (int a = 0; a <= ed; a++){
            for (int i = 0; i <= symb.length-1; i++){
                if (s[a].equals(symb[i])){
                    massat[a] = ma[i];
                    break;
                }
            }
        }

        for (int a = 0; a <= ed; a++)
            if (massat[a] > 0) masmol = masmol + massat[a]*coeff[a];
            else {
                masmol = 0;
                break;
            }
    }

    float mt(){return masmol;}

}
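
The lookup in calc_masmol scans symb[] for each element and reads the weight at the same index in ma[]. As a hedged alternative sketch, the two parallel arrays can be loaded once into a map. The class name AtomicWeights and the method molecularWeight are hypothetical, and only three entries are copied from the article's arrays to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the article's class: same data, map-based lookup.
class AtomicWeights {
    // Excerpt only: three entries copied from the article's symb[] and ma[] arrays.
    private static final String[] SYMB = {"C", "H", "O"};
    private static final float[] MA = {12.011f, 1.00794f, 15.994f};

    private static final Map<String, Float> WEIGHT = new HashMap<>();
    static {
        for (int i = 0; i < SYMB.length; i++) {
            WEIGHT.put(SYMB[i], MA[i]);
        }
    }

    // Sum of atomic weight times coefficient. Returns 0 if any symbol is
    // unknown, mirroring the convention used in calc_masmol.
    static float molecularWeight(String[] symbols, float[] coeffs) {
        float total = 0f;
        for (int a = 0; a < symbols.length; a++) {
            Float w = WEIGHT.get(symbols[a]);
            if (w == null) {
                return 0f;
            }
            total += w * coeffs[a];
        }
        return total;
    }

    public static void main(String[] args) {
        // H2O: 2 x 1.00794 + 1 x 15.994
        float h2o = molecularWeight(new String[] {"H", "O"}, new float[] {2f, 1f});
        System.out.println("H2O = " + h2o);
    }
}
```

With fewer than a hundred elements the linear scan is fast enough; the map mainly makes the "unknown symbol" case explicit.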

## Usage

These two classes can be used for many purposes in chemistry, such as calculating a molecular weight, preparing a solution with a given concentration, or preparing a mixture of two compounds (or, more generally, of n compounds).

In the following applications, formulae are case sensitive: NaCl. The coefficient of an element has to be written after its symbol: C6H6. Non-integer coefficients are accepted: Ba0.5Sr0.5TiO3. Additive formulae such as NaClO4,H2O are also possible, but a formula like FeCl3,6H2O is not accepted and has to be written FeCl3,H12O6.
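
The rewriting of a hydrate such as FeCl3,6H2O can be done by hand, but it can also be sketched as a small preprocessing step. The helper below is hypothetical and not part of the article's application: the names HydrateExpander and expand are assumptions, and it only handles integer coefficients inside a part that starts with an integer multiplier. Parts without a leading multiplier are copied unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical preprocessing helper, not part of the article's code.
class HydrateExpander {
    // A leading integer multiplier followed by the rest of the part, e.g. "6" + "H2O".
    private static final Pattern LEAD = Pattern.compile("(\\d+)(.*)");
    // An element symbol with an optional integer coefficient.
    private static final Pattern TOKEN = Pattern.compile("([A-Z][a-z]?)(\\d+)?");

    static String expand(String formula) {
        StringBuilder out = new StringBuilder();
        for (String part : formula.split(",")) {
            if (out.length() > 0) {
                out.append(',');
            }
            Matcher lead = LEAD.matcher(part);
            if (!lead.matches()) {
                out.append(part); // no multiplier: keep the part as written
                continue;
            }
            int factor = Integer.parseInt(lead.group(1));
            Matcher m = TOKEN.matcher(lead.group(2));
            while (m.find()) {
                int n = (m.group(2) == null) ? 1 : Integer.parseInt(m.group(2));
                out.append(m.group(1));
                if (n * factor != 1) {
                    out.append(n * factor); // multiply each coefficient by the factor
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("FeCl3,6H2O")); // FeCl3,H12O6
    }
}
```

Note that the article's parser reads at most two digits before the dot, so an expanded coefficient above 99 would still not be accepted.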

To keep this article short, exception handling is omitted from the following listings, but it is included in the downloadable application (chemCalcApp.java).

### Example 1: Calculation of molecular weight

A very simple application can be written:

**public class chemCalcApp{**

    public static void main (String[] args){
        analysis an = new analysis(args[0]);
        System.out.println("Molecular weight of " + args[0] + " = " + an.result() + "g");
    }

}

The result in the console is as follows:

D>java chemCalcApp H2O

Molecular weight of H2O = 18.00988g

### Example 2: Preparation of a solution with a given concentration

**public class calcsol{**

    public static void main (String[] args){
        analysis an = new analysis(args[0]);
        System.out.println("Weigh " + an.result()*Float.valueOf(args[1]).floatValue()
            + " g of " + args[0] + " for 1 liter of solvent");
    }

}

The chemical formula and the desired concentration (in moles per liter) are obtained from the command line as args[0] and args[1].

In the console:

D>java calcsol NaCl 0.01

Weigh 0.58352697 g of NaCl for 1 liter of solvent
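
The calculation behind this example is mass = molecular weight x concentration, for a fixed volume of 1 liter. A minimal sketch that also takes the volume as a parameter could look as follows. The class SolutionMass, the method massToWeigh and the volume parameter are assumptions added for illustration; the Na and Cl values are those listed in the article's ma[] array.

```java
// Hypothetical sketch: generalizes the 1-liter case to an arbitrary volume.
class SolutionMass {
    // mass (g) = molar mass (g/mol) x concentration (mol/L) x volume (L)
    static float massToWeigh(float molarMass, float molPerLiter, float liters) {
        return molarMass * molPerLiter * liters;
    }

    public static void main(String[] args) {
        // Na and Cl weights as listed in the article's ma[] array
        float naCl = 22.90f + 35.4527f;
        System.out.println(massToWeigh(naCl, 0.01f, 1.0f) + " g for 1 L");
        System.out.println(massToWeigh(naCl, 0.01f, 0.25f) + " g for 250 mL");
    }
}
```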

### Example 3: Preparation of a mixture of two compounds

**public class mixing{**

    public static void main (String[] args){
        analysis an1 = new analysis(args[0]); float w1 = an1.result();
        analysis an2 = new analysis(args[1]); float w2 = an2.result();
        float coef1 = Float.valueOf(args[2]).floatValue();
        float coef2 = Float.valueOf(args[3]).floatValue();
        float total = Float.valueOf(args[4]).floatValue();
        float totalmolmas = w1*coef1 + w2*coef2;
        System.out.println("Amount to weigh for a total mass of " + total + "g");
        // Each mass is scaled by its molar coefficient so that the two amounts add up to the total
        System.out.println(args[0] + " = " + w1*coef1*total/totalmolmas + " g ");
        System.out.println(args[1] + " = " + w2*coef2*total/totalmolmas + " g ");
    }

}

The chemical formulae (args[0] and args[1]), the molar coefficients (args[2] and args[3]) and the desired total weight (args[4]) are obtained from the command line.

In the console:

D>java mixing NaCl KCl 1 1 10

Amount to weigh for a total mass of 10.0g

NaCl = 4.3905997 g

KCl = 5.6094 g
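
The same proportion extends to n compounds: the mass of compound i is its molar coefficient times its molecular weight, divided by the sum of these products over all compounds, times the total mass. The following is a sketch under that assumption; the class MixtureMasses and the method masses are hypothetical names, and the molecular weights are passed in directly rather than computed from formulae.

```java
// Hypothetical sketch for n compounds, not part of the article's application.
class MixtureMasses {
    // m_i = coef_i * w_i * total / sum_j(coef_j * w_j)
    static float[] masses(float[] molarMasses, float[] coefs, float total) {
        float sum = 0f;
        for (int i = 0; i < molarMasses.length; i++) {
            sum += coefs[i] * molarMasses[i];
        }
        float[] result = new float[molarMasses.length];
        for (int i = 0; i < molarMasses.length; i++) {
            result[i] = coefs[i] * molarMasses[i] * total / sum;
        }
        return result;
    }

    public static void main(String[] args) {
        // NaCl and KCl weights computed from the article's ma[] values
        float[] w = {22.90f + 35.4527f, 39.0983f + 35.4527f};
        float[] m = masses(w, new float[] {1f, 1f}, 10f);
        System.out.println("NaCl = " + m[0] + " g, KCl = " + m[1] + " g");
    }
}
```

Because every mass is divided by the same sum, the individual amounts always add up to the requested total, whatever the molar coefficients are.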

The corresponding applets are found at:

http://www.icmcb.u-bordeaux.fr/chemcalc/ccindex.html

## Download

Download chemCalcApp.java (5 KB)

Download the applets (1.23 MB)

### About the Author

*Josik Portier is Directeur de Recherche at the Institut de Chimie de la Matière Condensée de Bordeaux of the Centre National de la Recherche Scientifique.*