Literally meaning the measurement of chemical information, chemometrics is the field of calculating meaningful information from the data produced by spectroscopic instruments. Protea have applied specialist chemometric techniques on hundreds of projects in the past. We also provide in-depth training courses on chemometric techniques and data analysis of spectroscopic measurements, enabling our customers to benefit from the power of chemometrics.
Chemometrics can be a relatively simple analysis process, requiring only a single calibration spectrum. Or it can be complex, involving calibration sets of spectra covering dozens of gas species, hundreds of spectra and many different analysis bands. However, any chemometric technique will only ever be as good as the calibration data it is based on.
Protea shares a calibration lab with our UKAS-accredited test house. This means all gas standards and calibration equipment used are traceable to UKAS standards. This ensures the accuracy and repeatability of our calibration library of spectra.
Univariate analysis methods involve the use of a single calibration spectrum, of a single measurement component at a single concentration, e.g. a spectrum of Sulphur Dioxide at 100ppm. By applying Beer’s Law, we can generate a simple univariate model by measuring the peak height or the peak area under this single spectrum. This can then be applied to sample data collected on the FTIR analyser.
Univariate models give quick quantified measurements, but are only suitable for single-component samples over a limited concentration range. Protea Analyser Software allows univariate models to be built quickly and simply, and also allows the model to be span corrected to increase the measurement range. However, if there is interference between many overlapping spectral features, more complex analysis techniques are required.
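As a minimal sketch of the univariate idea, the Python below scales a single calibration concentration by the ratio of sample to calibration peak height (or peak area). The function names and the absorbance values are hypothetical and chosen only for illustration; this is not the routine used in Protea Analyser Software.

```python
def peak_height(absorbance):
    """Peak height: the largest absorbance value in the analysis band."""
    return max(absorbance)


def peak_area(absorbance, step):
    """Area under the band by the trapezoid rule; step is the point spacing."""
    return step * sum((a + b) / 2 for a, b in zip(absorbance, absorbance[1:]))


def univariate_predict(sample_measure, cal_measure, cal_conc_ppm):
    """Beer's Law: absorbance scales linearly with concentration, so the
    sample concentration is the calibration concentration scaled by the
    ratio of the two measures (both heights, or both areas)."""
    if cal_measure <= 0:
        raise ValueError("calibration measure must be positive")
    return cal_conc_ppm * sample_measure / cal_measure


# Hypothetical calibration band: SO2 at 100 ppm (values invented for illustration).
cal_band = [0.00, 0.05, 0.20, 0.05, 0.00]
# Hypothetical sample band at 40% of the calibration absorbance.
sample_band = [0.4 * a for a in cal_band]

by_height = univariate_predict(peak_height(sample_band), peak_height(cal_band), 100.0)
by_area = univariate_predict(peak_area(sample_band, 1.0), peak_area(cal_band, 1.0), 100.0)
print(round(by_height, 3), round(by_area, 3))
```

Both measures give the same answer here because the sample band is an exact scaled copy of the calibration band; with real spectra, noise and baseline effects would make them differ.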
When a mix of many components is to be measured, over various ranges, we can employ a number of multivariate analytical techniques, often referred to as modelling. A model is built upon a dedicated calibration set of spectra of known species at known concentrations. Once we have collected this calibration set, we apply different statistical regression techniques to that data set. There are many different techniques for modelling. Different manufacturers of spectrometers and software prefer different analytical techniques, often claiming that theirs is superior to the others.
As practised chemometricians and analysts, Protea know that no single technique works best in all applications. Protea offers a number of techniques with our instruments, enabling the best solution to any application problem to be found. Protea aims to deliver a working analytical model for each application into which a system is delivered, letting the customer sit back and enjoy the important process information.
Classical Least Squares (CLS)
One of the simplest multivariate techniques, and the most common in gas-phase Mid-IR analysis, is Classical Least Squares (CLS). At the heart of the CLS method is the standard Beer’s Law equation (Equation 1), which describes the total spectral absorbance A as the summation of the absorptions for each species n. The absorption is described as the concentration c multiplied by a factor, K. This factor is determined by the absorptivity of the substance and the absorbing pathlength of the sample cell. This is where the alternative name for the CLS method, K-matrix, comes from.
Beer’s Law is expanded (Equation 2) to include all constituent components making up the sample spectrum and to apply to each data point, i, in the spectrum. What is very important with CLS modelling is that we know all the constituent components making up the sample.
The least squares method is applied, looking to reduce the sum-of-squares error between the model and the constituent absorbance. This is most commonly described by the matrix maths relationship in Equation 3.
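The CLS steps described above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not Protea's implementation: the Gaussian band shapes, the concentrations, and the names `gaussian_band`, `cls_calibrate` and `cls_predict` are all hypothetical, and the spectra are noise-free synthetic data.

```python
import numpy as np


def gaussian_band(x, centre, width):
    """Synthetic absorption band used as a stand-in for a real spectrum."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)


def cls_calibrate(C, A):
    """Estimate the K matrix (species x data points) from calibration
    concentrations C (spectra x species) and absorbances A (spectra x points)
    by solving A ~= C @ K in the least squares sense."""
    K, *_ = np.linalg.lstsq(C, A, rcond=None)
    return K


def cls_predict(K, a):
    """Find the concentrations c that minimise the sum-of-squares error
    between the sample spectrum a and the modelled spectrum c @ K."""
    c, *_ = np.linalg.lstsq(K.T, a, rcond=None)
    return c


x = np.linspace(0.0, 1.0, 101)  # arbitrary spectral axis
# Assumed "true" absorbance per ppm for two overlapping species.
K_true = np.vstack([0.002 * gaussian_band(x, 0.40, 0.05),
                    0.001 * gaussian_band(x, 0.50, 0.05)])

# Hypothetical calibration set: two pure spectra and one mixture (ppm).
C_cal = np.array([[100.0, 0.0], [0.0, 50.0], [60.0, 20.0]])
A_cal = C_cal @ K_true

K = cls_calibrate(C_cal, A_cal)
sample = np.array([30.0, 10.0]) @ K_true  # sample with known composition
predicted = cls_predict(K, sample)
print(np.round(predicted, 3))
```

Because every species in the sample is also in the model and the data are noise-free, the predicted concentrations recover the values used to build the sample spectrum.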
What is important to understand about CLS modelling is that it provides a linear model of all the species that the model is built to include. If a substance not present in the model is in the sample, then the model will fail in its predictions. If we attempt to model a substance whose absorption responds non-linearly to concentration, the linear model cannot capture that response and its predictions will be in error.
Various modifications to the standard CLS routine can be applied to attempt to overcome some of the limitations of the method. On the whole, CLS models are quick to build, to edit and to add new calibrations to, and are a good method for Mid-IR gas-phase analyses.
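The effect of a species missing from the model can be illustrated with a small synthetic example. In the sketch below, all band positions, concentrations and names are assumptions made for illustration: a two-species CLS model is applied first to a clean sample and then to the same sample with an unmodelled interferent overlapping the first species.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 101)  # arbitrary spectral axis


def band(centre):
    """Synthetic Gaussian band, absorbance per ppm (hypothetical shape)."""
    return 0.002 * np.exp(-0.5 * ((x - centre) / 0.05) ** 2)


# The model knows two species; the interferent is NOT in the model.
K = np.vstack([band(0.40), band(0.60)])
interferent = band(0.45)  # overlaps heavily with the first species


def cls_fit(K, a):
    """Least squares concentrations and the norm of the unexplained residual."""
    c, *_ = np.linalg.lstsq(K.T, a, rcond=None)
    return c, np.linalg.norm(a - c @ K)


true_conc = np.array([30.0, 10.0])
clean = true_conc @ K
contaminated = clean + 20.0 * interferent

c_clean, r_clean = cls_fit(K, clean)
c_bad, r_bad = cls_fit(K, contaminated)
print("clean:       ", np.round(c_clean, 2), "residual", r_clean)
print("contaminated:", np.round(c_bad, 2), "residual", r_bad)
```

In this example the interferent's absorbance is attributed largely to the first species, inflating its predicted concentration, while the fit residual rises from essentially zero. Inspecting that residual is one way to flag a sample containing something the model does not include.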
Partial Least Squares (PLS)
When accurate, robust prediction results are required from spectroscopic data, the Partial Least Squares (PLS) algorithm is employed. PLS modelling involves the notion of factors that describe the collinear relationship between spectral absorption and concentration.
The PLS method is related to Principal Component Analysis (PCA) methods of screening data, where data is in effect screened for correlations, each subsequent correlation being described by a Principal Component or factor.
In PLS modelling, the concentration data is used in the decomposition of the data into the descriptive factors as well as the spectroscopic data. This has the effect of weighting the higher concentration data more favourably and getting accurate prediction results from as few factors as possible.
The algorithms employed in PLS model building are significantly more complex than those of, say, CLS, but the result is a set of models that are directly related to the concentration vs. absorbance response of the components of interest. If a PLS model is built with 5 constituents of interest, 5 PLS models are generated, each specific to one constituent. Compared to CLS, where there is a single model for all constituents, this gives better predictive abilities in cases of wide concentration ranges and many interfering species.
In fact, PLS models have been shown to still predict well in the presence of unknown species in the sample spectra. In addition, PLS can account for non-linearities in calibration spectra and can still give predictions for sample spectra with high noise.
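To illustrate the idea of one model per constituent, here is a minimal sketch of single-response PLS (often called PLS1) using the NIPALS procedure, a common textbook formulation. It is not necessarily the algorithm used in Protea's software, and the synthetic spectra, noise level, number of factors and function names are all assumptions.

```python
import numpy as np


def pls1_fit(X, y, n_factors):
    """Fit a PLS1 model with NIPALS; returns (x_mean, y_mean, b) so that
    a prediction is (x - x_mean) @ b + y_mean."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr                  # weights use the concentration data
        w /= np.linalg.norm(w)
        t = Xr @ w                     # scores for this factor
        tt = t @ t
        p = Xr.T @ t / tt              # spectral loadings
        qk = yr @ t / tt               # concentration loading
        Xr = Xr - np.outer(t, p)       # deflate spectra
        yr = yr - qk * t               # deflate concentrations
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)  # regression vector
    return x_mean, y_mean, b


def pls1_predict(model, x):
    x_mean, y_mean, b = model
    return (x - x_mean) @ b + y_mean


rng = np.random.default_rng(0)
x_axis = np.linspace(0.0, 1.0, 101)
# Hypothetical absorbance per ppm for two overlapping species.
K = np.vstack([0.002 * np.exp(-0.5 * ((x_axis - c) / 0.05) ** 2)
               for c in (0.40, 0.50)])

C_cal = rng.uniform(0.0, 100.0, size=(12, 2))            # calibration ppm
X_cal = C_cal @ K + 1e-4 * rng.standard_normal((12, 101))  # noisy spectra

# One PLS1 model per constituent, each built from the same spectra.
models = [pls1_fit(X_cal, C_cal[:, j], n_factors=2) for j in range(2)]

sample = np.array([30.0, 10.0]) @ K
predicted = np.array([pls1_predict(m, sample) for m in models])
print(np.round(predicted, 2))
```

Each entry in `models` is specific to a single constituent, in contrast to CLS, where one K matrix serves all of them. The choice of two factors matches the two species in this synthetic data; in practice the number of factors is normally chosen by validation.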
For specialised applications, such as the MCERTS incineration measurements of the 204M or the online Titanium Tetrachloride 304L system, Protea favours PLS modelling to give long-term, robust predictions.