Analytica Chimica Acta 455 (2002) 131-142

Classification of metal ions according to their complexing properties: a data-driven approach
Igor V. Pletnev, Vladimir V. Zernov
Department of Chemistry, Lomonosov Moscow State University, Vorobyovy Gory, 119899 Moscow, Russia
Received 1 February 2001; received in revised form 22 November 2001; accepted 28 November 2001

Abstract

Factor, cluster and self-organizing map analyses were performed for the stability constants of complexes of 24 metal ions and hydrogen with 3960 ligands (15,606 values of log K1). Five factors reproduce 89% of the data variability. Both direct clustering and clustering on the basis of factor analysis established the existence of six different classes of similar cations. The similarity series for metal ions and the relative similarity of several ions are discussed, and a Kohonen two-dimensional map, which visually represents the similarity, is presented. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Metal ion complexes; Stability constants; Chemometrics; Principal factor analysis; Hierarchical cluster analysis; Kohonen self-organizing map

Corresponding author. E-mail address: pletnev@analyt.chem.msu.ru (I.V. Pletnev).
0003-2670/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0003-2670(01)01571-9

1. Introduction

Analytical chemistry has a long-standing interest in the rationalization and prediction of the complexing ability of various metal ions. To this end, chemists have established a number of correlation equations and qualitative rules (see reviews [1-3] and monographs [4,5]). However, most of these rules are restricted to specific (though important) classes of metals or ligands. The most general concepts, the classification schemes for metal ions based on their complexing properties (Ahrland-Chatt-Davies [6] and Pearson's HSAB principle [7]), are relatively old and were designed using rather limited experimental data. This only emphasizes the significance of the work of those who pioneered the field; nowadays, however, the array of measured stability constants has grown significantly, and one may attempt to classify metal cations in more detail.

We are particularly interested in a purely phenomenological classification that uses no a priori assumptions. On the one hand, such a data-driven approach may seem inherently limited because it ignores existing knowledge about the subject. On the other hand, it may prove extremely helpful in broadening our knowledge of the nature of metal ions and in eliminating the misinterpretations and pitfalls that accompany even the most elegant and successful theoretical concepts (for peculiar examples, see Pearson's comments [8] on the implementation of the HSAB principle).

There exist a number of well-established statistical techniques for classification, namely principal factor analysis (PFA) and cluster analysis [9-11]. Factor analysis tries to explain all the variability in the complexation constants for m cations through a linear combination of increments inherent to a small number k of factors, k ≪ m. One may assume that the factors are related to archetypical metal ions or, better, to the archetypical features expressed, to different extents, in the complexation behavior of any metal. After the initial data set is reduced to k factors, it is possible to place the points corresponding to the various metal ions in k-dimensional space and to study their distribution by means of cluster analysis. Cluster analysis may also be applied to the original data set itself; however, pre-factorization reduces the size of the problem, eliminates redundancy and noise, and helps to treat missing data.

Finally, one may use the relatively new technique of Kohonen self-organizing maps (SOM) [13] for the analysis of similarity/dissimilarity in multidimensional space. The technique is based on the projection of k-dimensional data onto a two-dimensional map, thus providing easily analyzable patterns.

This paper reports the results of applying factor, cluster and SOM analyses to a large array of stability constants of metal ions (24 metal cations and the hydrogen cation, 3960 ligands, 15,606 values of log K1).

2. Theoretical background

2.1. Data representation

The key concept in the following consideration is the metal ion complexation profile. The profile of an ion is an l-dimensional vector containing the stability constant values for all l ligands under study. It characterizes the relative affinity of the cation for different ligands. The similarity in the complexing behavior of two metal ions is measured for their profile vectors by means of the Pearson correlation coefficient, r. Note that this measure refers not to similarity in absolute stability constant values but to similarity in trends, i.e. in how the values change along the series of ligands.

2.2. PFA [9,11]

Mathematically, PFA is a variation of principal component analysis (PCA). The sample correlation matrix R is constructed from the normalized initial data matrix Z (n × m) as follows:

$R = \frac{1}{n-1} Z' Z$,

where Z' is the transpose of Z (the data matrix contains normalized stability constant values; its m columns correspond to cations and its n rows to ligands). The diagonal elements of R (m × m), which are unity in PCA, are replaced with estimates of the communalities (this is the main difference between PFA and PCA). The factor analysis model is then used to calculate the principal factors:

$R = A A' + U$,

where A (m × k, with k the number of factors) is the factor pattern, A' is the transpose of A, and U is the residual matrix. The matrix of characteristic vectors is associated with the linear principal factors (the columns of A are the factor loadings). For easier interpretation, the factors may be rotated by various methods.

In our context, PFA reduces a set of metal ion complexing profiles to a set of archetypical profiles, or factors (the estimate of the number of factors may not be self-evident). In terms of factor analysis, the set of factors is the set of latent variables that adequately describes the variability of the initial data: a linear combination of several factors must accurately reproduce any ion profile. The number of factors should be smaller than the number of metals, since some metals are similar and, by design, the factors are orthogonal (i.e. they represent "pure" behavioral types, not "mixed" ones). Computationally, PFA/PCA is performed through eigendecomposition of the correlation matrix or, equivalently, through singular value decomposition of the normalized data matrix.

2.3. Cluster analysis [10,11]

A common method to investigate how samples group in multidimensional space is agglomerative cluster analysis. In this method, clusters are formed by grouping samples into bigger and bigger clusters until all samples become members of a single cluster. Before the analysis, each sample forms its own separate cluster. At the first stage, two samples are combined into a single cluster; at the second, a third sample is added to the growing cluster, and so on. Graphically, this process is illustrated by an agglomerative dendrogram. There are two important issues: the way of measuring the distance between samples (the metric) and the way of measuring the distance between a sample and a cluster (the linkage rule). In the present study, we applied the popular Euclidean, squared Euclidean and Manhattan (city-block) metrics in combination with complete linkage, Ward's linkage, and weighted/unweighted pair-group average linkage rules.

2.4. SOM analysis [13]

A powerful and popular method for visualizing multidimensional data on a two-dimensional map is the so-called Kohonen SOM (which may be considered a kind of unsupervised-learning neural network). The approach relies on the immersion of a two-dimensional grid into the multidimensional space. Then, in the process of "training", the grid deforms in order to approximate the initial "data cloud" and to preserve the neighborhood of the data points. In the final pattern, the point representation is locally continuous: two points that are near each other on the two-dimensional grid are also near each other in the initial multidimensional space. The converse does not generally hold, as two points that are close in the initial space may be located far apart on the grid; this is the price of dimension reduction. The resulting map is similar to a common topographical map: it is divided into regions that correspond to different objects (metal ions). The closer their regions are, the higher their similarity.
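The eigendecomposition route mentioned in Section 2.2 can be illustrated with a minimal Python sketch. It is a plain PCA on a correlation matrix with unit diagonal, not the PFA with communality estimates used in this work, and it extracts the leading eigenpairs by power iteration with deflation. The function names and the 2 × 2 example matrix are hypothetical and serve only for illustration.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix R = Z'Z / (n - 1) for complete columns.

    `columns` holds one list of values per variable (cation); all lists
    must have the same length n (ligands) and contain no missing values.
    """
    n = len(columns[0])
    z = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        z.append([(x - mean) / sd for x in col])  # normalized column
    m = len(z)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(m)] for i in range(m)]


def leading_eigenpairs(r, k, iterations=1000):
    """Top-k (eigenvalue, eigenvector) pairs of a symmetric positive
    semi-definite matrix, by power iteration with deflation."""
    m = len(r)
    work = [row[:] for row in r]  # copy, deflated step by step
    pairs = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(m)]  # arbitrary start vector
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # start vector has no component left
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate.
        lam = sum(v[i] * sum(work[i][j] * v[j] for j in range(m))
                  for i in range(m))
        pairs.append((lam, v))
        # Deflation: remove the factor just found from the matrix.
        for i in range(m):
            for j in range(m):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


# Hypothetical 2 x 2 correlation matrix, for illustration only.
r_demo = [[1.0, 0.8], [0.8, 1.0]]
for lam, _vec in leading_eigenpairs(r_demo, 2):
    # Share of variance = eigenvalue / trace(R); trace equals m here.
    print(round(lam, 3), round(100.0 * lam / len(r_demo), 1))
```

In this PCA setting the share of variance carried by a factor is its eigenvalue divided by the trace of R, which equals the number of variables when the diagonal is unity.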

3. Results and discussion

3.1. Data selection and pre-treatment

We used experimental data on the first stability constants, log K1, extracted from the IUPAC stability constants database (SCDB, Academic Software, UK). Our licensed version of the database contains data up to and including 1993. The size of the data set was a compromise between the requirement of compatible experimental conditions and the intention to obtain as large a data set as possible. We included metal ions for which log K1 values had been measured for more than N = 100 ligands (the exception is potassium, N = 99). Stability constant values measured at ionic strength 1 and at a temperature of 20-30 °C were selected (we did not adjust the data to a single ionic strength or temperature). If several log K1 values for the same complex were present, the median value was calculated and used further (however, if the scatter exceeded three logarithmic units, the entry was excluded).

The final data set contained stability constants of complexes of 25 cations (Ag+, Al3+, Ba2+, Be2+, Ca2+, Cd2+, Ce3+, Co2+, Cu2+, Fe2+, Fe3+, H+, Hg2+, K+, La3+, Mg2+, Mn2+, Na+, Ni2+, Pb2+, Sr2+, UO2 2+, VO2+, Y3+, Zn2+) with 3960 different ligands. Naturally, the corresponding data matrix contains empty cells; the total number of valid entries is 15,606.

A correlation matrix of size 25 × 25 was built for this data set (Fig. 1). The matrix element rij is the Pearson correlation coefficient between cations Mi and Mj. Each rij was calculated on the "common ligands base" (i.e. those ligands that form complexes with both metals i and j). Strictly speaking, it would be more rigorous to call the resulting matrix a pseudocorrelation matrix, since the "common ligands base" is different for different pairs of metals (interestingly, we have not met this approach to correlation matrix construction in the literature). The size of the "common ligands base" varies, for different ion pairs, from 16 to 1824, the average being 182 ligands.

3.2. Complexation profiles and their similarity

Each entry in the pseudocorrelation matrix defines the similarity between the complexation profiles of two cations. One may also build a similarity series for any ion. In such a series, the remaining ions are placed in decreasing order of the correlation coefficient. This series is a more informative and chemically intuitive way to represent the correlation matrix. The similarity series for all the ions are presented below. The three nearest "neighbors" of each ion (the first three entries of its series) are shown in bold. Ions with a "common ligands base" of fewer than 50 ligands are put in brackets (we cannot consider them trusted "neighbors"). The proton is underlined in all series (notably, the correlation with H+ is poor for the majority of cations).



Ag+: (VO2+) > (Na+) > Ba2+ > (Fe3+) > (Y3+) > Hg2+ > Pb2+ > Fe2+ > Sr2+ > Cd2+ > (UO2 2+) > (La3+) > (Al3+) > (K+) > Zn2+ > Co2+ > Cu2+ > (Ce3+) > Mn2+ > Ni2+ > Ca2+ > Mg2+ > H+ > (Be2+)
Al3+: (VO2+) > Fe3+ > (UO2 2+) > Pb2+ > Cu2+ > La3+ > Mn2+ > (Y3+) > (Ce3+) > (Be2+) > Cd2+ > Zn2+ > Co2+ > H+ > Mg2+ > (Fe2+) > Ni2+ > (Hg2+) > Ba2+ > Sr2+ > Ca2+ > (Ag+) > (Na+) > (K+)
Ba2+: Sr2+ > Ce3+ > Y3+ > Ca2+ > La3+ > Mn2+ > Mg2+ > Fe3+ > Na+ > Pb2+ > Cd2+ > Co2+ > (VO2+) > Zn2+ > Ag+ > Cu2+ > Ni2+ > Al3+ > UO2 2+ > Hg2+ > (K+) > (Be2+) > Fe2+ > H+
Be2+: (UO2 2+) > (Al3+) > (Pb2+) > H+ > (Y3+) > Zn2+ > (Fe3+) > Cu2+ > Mg2+ > Mn2+ > (Cd2+) > Co2+ > (La3+) > (VO2+) > (Fe2+) > (Ba2+) > (Sr2+) > Ca2+ > (Hg2+) > (Na+) > (Ce3+) > Ni2+ > (K+) > (Ag+)
Ca2+: Sr2+ > La3+ > Ce3+ > Mn2+ > Ba2+ > Y3+ > Mg2+ > Cd2+ > Pb2+ > Co2+ > Fe3+ > Zn2+ > Ni2+ > Cu2+ > VO2+ > UO2 2+ > Fe2+ > Al3+ > Be2+ > Hg2+ > Na+ > H+ > Ag+ > K+
Cd2+: Zn2+ > Ce3+ > Co2+ > Mn2+ > Pb2+ > La3+ > Cu2+ > Ni2+ > Y3+ > Fe3+ > Ca2+ > Sr2+ > Mg2+ > Ba2+ > Al3+ > VO2+ > Fe2+ > UO2 2+ > Hg2+ > (Be2+) > Ag+ > H+ > Na+ > (K+)
Ce3+: La3+ > Y3+ > Mn2+ > Cd2+ > Ca2+ > Ba2+ > Fe3+ > Co2+ > Sr2+ > Pb2+ > Zn2+ > Mg2+ > Cu2+ > Ni2+ > (VO2+) > (Al3+) > Hg2+ > Fe2+ > UO2 2+ > H+ > (Na+) > (Be2+) > (Ag+) > (K+)
Co2+: Ni2+ > Zn2+ > Cd2+ > Mn2+ > Cu2+ > Ce3+ > Y3+ > La3+ > Pb2+ > Fe3+ > VO2+ > Ca2+ > Mg2+ > Fe2+ > Sr2+ > Al3+ > Ba2+ > UO2 2+ > Be2+ > Hg2+ > H+ > Ag+ > Na+ > (K+)
Cu2+: Ni2+ > Zn2+ > Co2+ > Cd2+ > VO2+ > Fe3+ > Ce3+ > Al3+ > La3+ > Y3+ > Pb2+ > UO2 2+ > Mn2+ > Ca2+ > Hg2+ > Sr2+ > Fe2+ > Mg2+ > Ba2+ > Be2+ > H+ > Na+ > Ag+ > (K+)
Fe2+: (VO2+) > Fe3+ > (UO2 2+) > Ni2+ > Mn2+ > Co2+ > Zn2+ > Cd2+ > Pb2+ > (Al3+) > Hg2+ > Cu2+ > La3+ > Ce3+ > Mg2+ > Ca2+ > Ag+ > (Y3+) > (Be2+) > Sr2+ > Ba2+ > H+ > (Na+) > (K+)
Fe3+: Y3+ > UO2 2+ > Ce3+ > Mn2+ > La3+ > Cu2+ > (VO2+) > Fe2+ > Cd2+ > Pb2+ > Al3+ > Zn2+ > Ni2+ > Co2+ > Sr2+ > Hg2+ > Ba2+ > Mg2+ > Ca2+ > (Na+) > (Ag+) > (Be2+) > (K+) > H+
H+: Al3+ > Be2+ > UO2 2+ > VO2+ > Y3+ > Pb2+ > Cu2+ > Zn2+ > Ce3+ > La3+ > Cd2+ > Mg2+ > Fe3+ > Co2+ > Ni2+ > Ba2+ > Mn2+ > Sr2+ > Ca2+ > Fe2+ > Hg2+ > Ag+ > Na+ > K+
Hg2+: (VO2+) > Fe3+ > Cu2+ > La3+ > Fe2+ > Ce3+ > Mn2+ > (Al3+) > Ag+ > Pb2+ > Cd2+ > Ba2+ > (Y3+) > Zn2+ > Co2+ > Sr2+ > Ni2+ > (Be2+) > Ca2+ > (Na+) > (UO2 2+) > Mg2+ > (K+) > H+
K+: Na+ > (Ba2+) > Y3+ > Fe3+ > Ag+ > (Ce3+) > Be2+ > Sr2+ > (Pb2+) > (Mn2+) > (La3+) > (Hg2+) > (Cd2+) > (VO2+) > (Al3+) > Ca2+ > (Mg2+) > (Cu2+) > (Zn2+) > H+ > (Fe2+) > (Co2+) > (Ni2+) > (UO2 2+)
La3+: Ce3+ > Y3+ > Pb2+ > Ca2+ > Mn2+ > Cd2+ > Sr2+ > Ba2+ > Fe3+ > Co2+ > Mg2+ > Zn2+ > VO2+ > Cu2+ > Ni2+ > Al3+ > UO2 2+ > Hg2+ > (Na+) > Fe2+ > (Be2+) > H+ > (Ag+) > (K+)
Mg2+: Mn2+ > Ca2+ > La3+ > Y3+ > VO2+ > Ce3+ > Sr2+ > Co2+ > Pb2+ > Ba2+ > Fe3+ > Cd2+ > Zn2+ > Al3+ > Ni2+ > UO2 2+ > Cu2+ > Fe2+ > Be2+ > Na+ > H+ > Hg2+ > (K+) > Ag+
Mn2+: Ce3+ > Pb2+ > La3+ > Cd2+ > Y3+ > Co2+ > Mg2+ > Zn2+ > Ca2+ > Sr2+ > Fe3+ > Ba2+ > Ni2+ > VO2+ > Al3+ > Fe2+ > UO2 2+ > Cu2+ > Hg2+ > (Na+) > Be2+ > H+ > Ag+ > K+
Na+: K+ > Ba2+ > (Ag+) > (Fe3+) > Sr2+ > (La3+) > (Mn2+) > Mg2+ > (Ce3+) > (Pb2+) > Cd2+ > (VO2+) > (Y3+) > (Hg2+) > Ca2+ > (Be2+) > Cu2+ > (Al3+) > Zn2+ > Co2+ > Ni2+ > (Fe2+) > (UO2 2+) > H+
Ni2+: Co2+ > Zn2+ > Cu2+ > Cd2+ > Fe3+ > VO2+ > Mn2+ > Pb2+ > La3+ > UO2 2+ > Y3+ > Ce3+ > Fe2+ > Ca2+ > Mg2+ > Sr2+ > Al3+ > Ba2+ > Hg2+ > H+ > Be2+ > Ag+ > Na+ > (K+)
Pb2+: (VO2+) > Mn2+ > La3+ > Y3+ > Cd2+ > Zn2+ > Ce3+ > Co2+ > Fe3+ > Al3+ > Ca2+ > Ni2+ > Sr2+ > Mg2+ > Cu2+ > Ba2+ > (Be2+) > UO2 2+ > Fe2+ > Hg2+ > Ag+ > H+ > (Na+) > (K+)
Sr2+: Ba2+ > Ca2+ > Mn2+ > Ce3+ > La3+ > Y3+ > Mg2+ > Cd2+ > Pb2+ > Fe3+ > Co2+ > Zn2+ > (VO2+) > Na+ > Cu2+ > Ni2+ > Al3+ > UO2 2+ > Ag+ > Fe2+ > Hg2+ > (Be2+) > H+ > K+
UO2 2+: (VO2+) > Fe3+ > (Al3+) > (Fe2+) > (Be2+) > Ni2+ > Cu2+ > Mn2+ > Pb2+ > Y3+ > Zn2+ > La3+ > Co2+ > Cd2+ > Mg2+ > H+ > Ca2+ > Ba2+ > Ce3+ > Sr2+ > (Ag+) > (Hg2+) > (Na+) > (K+)
VO2+: (UO2 2+) > (Pb2+) > (Fe2+) > (Al3+) > Cu2+ > (Fe3+) > (Y3+) > Zn2+ > Mg2+ > Ni2+ > Co2+ > La3+ > Mn2+ > (Hg2+) > (Ag+) > (Ce3+) > Cd2+ > (Ba2+) > (Sr2+) > Ca2+ > H+ > (Be2+) > (Na+) > (K+)
Y3+: Ce3+ > La3+ > Fe3+ > Pb2+ > Mn2+ > Ba2+ > Ca2+ > Co2+ > Sr2+ > Cd2+ > Zn2+ > Mg2+ > (VO2+) > Cu2+ > Ni2+ > (Al3+) > UO2 2+ > (Be2+) > (Ag+) > (Hg2+) > H+ > (Fe2+) > (K+) > (Na+)
Zn2+: Co2+ > Cd2+ > Ni2+ > Cu2+ > Mn2+ > Pb2+ > Y3+ > Ce3+ > La3+ > Fe3+ > VO2+ > Mg2+ > Ca2+ > Fe2+ > Al3+ > Sr2+ > UO2 2+ > Ba2+ > Be2+ > Hg2+ > H+ > Ag+ > Na+ > (K+)
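The construction of one pseudocorrelation entry and of a similarity series can be sketched as follows. This is a minimal illustration, assuming each ion's profile is stored as a mapping from a ligand identifier to log K1. The function names, the three-ligand minimum for computing r, and the tiny data set (ions A, B, C and ligands L1 to L4) are invented for the example and are not taken from the database.

```python
import math

MIN_COMMON = 50  # entries based on fewer common ligands are bracketed


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pair_similarity(profile_a, profile_b):
    """Return (r, size of the common ligand base) for two profiles,
    or None if r cannot be computed on the shared ligands."""
    common = sorted(set(profile_a) & set(profile_b))
    xs = [profile_a[lig] for lig in common]
    ys = [profile_b[lig] for lig in common]
    # Assumed guard: need at least 3 points and non-constant values.
    if len(common) < 3 or len(set(xs)) < 2 or len(set(ys)) < 2:
        return None
    return pearson(xs, ys), len(common)


def similarity_series(ion, profiles, min_common=MIN_COMMON):
    """Other ions in decreasing order of r; weakly supported ones in brackets."""
    entries = []
    for other, profile in profiles.items():
        if other == ion:
            continue
        result = pair_similarity(profiles[ion], profile)
        if result is not None:
            r, size = result
            entries.append((r, other, size))
    entries.sort(key=lambda e: e[0], reverse=True)
    return " > ".join(name if size >= min_common else "(" + name + ")"
                      for _r, name, size in entries)


# Invented profiles, for illustration only.
profiles = {
    "A": {"L1": 1.0, "L2": 2.0, "L3": 3.0, "L4": 4.0},
    "B": {"L1": 2.0, "L2": 4.1, "L3": 6.0, "L4": 8.2},  # follows A closely
    "C": {"L1": 4.0, "L2": 3.0, "L3": 1.0},             # opposite trend
}
print(similarity_series("A", profiles, min_common=4))
```

Because every pair is evaluated on its own set of shared ligands, the resulting matrix is not guaranteed to be a true correlation matrix, which is why the text calls it a pseudocorrelation matrix.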

3.3. Factor and cluster analyses

PFA established that the minimal number of factors describing the whole data set equals four. This result was obtained by comparing the eigenvalues of the pseudocorrelation matrix with the eigenvalues of a random (normally distributed) matrix. However, another popular criterion recommends retaining all factors needed to describe 90% of the data variance; this results in the selection of five factors, which reproduce 89%. The characteristics of the first several factors, compared with the random values, are presented in Fig. 2. The factors are shown in more detail in Table 1.

In the PFA experiment, we estimated the initial communalities, i.e. the diagonal elements of the correlation matrix, as the squared multiple correlation coefficient of each original variable with all the others. The factor structure was improved through varimax rotation.

Table 1
Characteristics of the first five factors

Factor   Eigenvalue   Total variance (%)   Cumulative total variance (%)
1        18.574       74.297               74.297
2         1.519        6.076               80.373
3         0.937        3.748               84.121
4         0.816        3.265               87.386
5         0.373        1.494               88.880

Fig. 2. Comparison of the eigenvalues calculated for the correlation matrix under study and the random allocation matrix.

As shown in Table 1, the total variance for the first two factors is high, 80%. It would be logical to try to cluster the cations in the simple two-dimensional space of these factors; however, such an analysis (also performed for other factor pairs) did not prove informative.

The hierarchical cluster analysis (we experimented with Euclidean and Manhattan distances and with several linkage rules) was carried out in the space of the five factor loadings. Alternatively, cluster analysis was applied to the initial data, with the pseudocorrelation matrix treated as a matrix of distances in 25-dimensional space (the metric was 1 - rij, where rij is the Pearson coefficient). The dendrogram for this last version of clustering is presented in Fig. 3.

3.4. Metal ions classification

All versions of the analysis arrived at a similar picture. At the level of detail next after the simple ternary classification (as in [6,7]), six classes appear. These are presented below. Cations that may migrate from class to class in different versions of clustering are placed in brackets (note that any migrating cation moves only between two clusters).

I.   K+ Na+
II.  Ag+ Hg2+
III. H+ Be2+ (Al3+)
IV.  UO2 2+ VO2+ Fe2+ (Fe3+) (Al3+)
V.   Co2+ Cu2+ Ni2+ Zn2+ (Cd2+)
VI.  Ba2+ Sr2+ Ca2+ Mg2+ Mn2+ Pb2+ La3+ Y3+ Ce3+ (Fe3+) (Cd2+)
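The second clustering variant of Section 3.3, in which the distance 1 - rij is taken directly from the pseudocorrelation matrix, can be sketched as follows. For brevity the sketch uses unweighted pair-group average linkage rather than the Ward's linkage of Fig. 3, and it stops at a requested number of clusters instead of building the full dendrogram. The helper name agglomerate, the ion labels M1 to M4 and the correlation values are invented for illustration.

```python
def agglomerate(names, dist, n_clusters):
    """Agglomerative clustering with unweighted pair-group average linkage.

    `dist` is a symmetric matrix of pairwise distances between the
    objects listed in `names`; merging stops at `n_clusters` clusters.
    """
    clusters = [[i] for i in range(len(names))]  # each object starts alone

    def average_distance(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average distance.
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = average_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _d, p, q = best
        clusters[p] = clusters[p] + clusters[q]  # merge q into p
        del clusters[q]
    return [sorted(names[i] for i in cluster) for cluster in clusters]


# Invented correlation coefficients for four hypothetical ions.
names = ["M1", "M2", "M3", "M4"]
r = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
# Distance used in the text: 1 - Pearson r.
dist = [[1.0 - value for value in row] for row in r]
print(agglomerate(names, dist, 2))
```

Repeating the run with different linkage rules and metrics, as was done in this study, is one way to see which cations are stable members of a class and which migrate between two classes.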

Some notes seem appropriate here. Hg2+ and Ag+, while formally falling into a single class, in fact form two classes that are located closer to each other than to the remaining clusters. Be2+ and H+ form a tight group; notably, these cations have the smallest ionic radii. The similarity of alkali and alkaline earth metal ions is self-explanatory. Borderline (according to Pearson) metal ions group well. However, the position of manganese is interesting: the Mn2+ ion is located nearer to the alkaline earth and rare earth elements, as well as to Pb2+ (this may be attributed to its d5 electron configuration, which provides no ligand field stabilization).

Fig. 3. Example clustering diagram for 25 ions (Ward's linkage rule, Pearson metric).

The obtained classification does not contradict the well-known schemes of Ahrland-Chatt-Davies and Pearson but complements them. It allows the complexing properties to be examined in more detail.

Let us consider a typical practical task. For some metal ions, the number of measured stability constants is small, owing to "poor" chemistry in aqueous solutions or simply to neglect by researchers. Estimation of the missing stability constants would be of interest. One may naturally turn to similar metal ions and try to obtain a suitable (linear) dependence between their stability constants and those of the metal of interest. After calibration, the dependence may be used to predict stability constants in the missing cases. Naturally, the more similar the chosen cation, the better the chance of valid predictions. The classification helps to select the "nearest neighbor" for building a trusted relationship.
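The estimation procedure just described can be sketched as follows, assuming that each ion's profile is stored as a mapping from a ligand identifier to log K1. The line is calibrated by ordinary least squares on the ligands shared by the ion of interest and its chosen neighbor, then applied to ligands measured only for the neighbor. The function names and all numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_missing(target, neighbor):
    """Estimate log K1 of `target` for ligands known only for `neighbor`."""
    common = sorted(set(target) & set(neighbor))
    slope, intercept = fit_line([neighbor[lig] for lig in common],
                                [target[lig] for lig in common])
    return {lig: slope * value + intercept
            for lig, value in neighbor.items() if lig not in target}


# Invented log K1 values: the target ion lacks a value for ligand L4.
neighbor = {"L1": 1.0, "L2": 2.0, "L3": 3.0, "L4": 5.0}
target = {"L1": 3.0, "L2": 5.0, "L3": 7.0}
print(predict_missing(target, neighbor))
```

In practice the quality of such an estimate depends on the size of the common ligand base and on how far the prediction lies outside the calibrated range, so the sketch should be read as a starting point rather than a validated procedure.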

Let us consider two examples.

1. The divalent lead ion. According to Pearson, Pb2+ is a borderline ion. It seems natural to correlate log K1 values for Pb2+ with those for ions of the same group, e.g. Ni2+. However, this correlation is far from perfect, as illustrated by Fig. 4a. According to our classification, lead belongs to cluster VI. The similarity series shows that its nearest neighbor in the cluster is Mn2+ (VO2+ is located even closer, but this conclusion is based on too small a "common ligands base"). The result is rather unexpected. In different formulations of HSAB, Mn2+ is either missing or classified as "hard", depending on the author (Pearson himself did not classify Mn2+ [8]). Most authors place Pb2+ in the borderline group. In any case, the similarity between Mn2+ and Pb2+ is not what most chemists would expect.



Fig. 4. Correlation between the stability constants for complexes of Pb2+ vs. Ni2+ and Mn2+.



Fig. 5. Correlation between the stability constants for complexes of Ce3+ vs. Ca2+ and Cd2+.

Nevertheless, the observed correlation is evidently good (Fig. 4b). Linear regression makes it possible to estimate log K1 for Pb2+ from log K1 for Mn2+ (and vice versa) over a broad range of stability constant values, from 0 to 23 logarithmic units.

2. The trivalent cerium ion. It is undoubtedly a "hard" cation in Pearson's classification. According to the similarity series, Mn2+ and Cd2+ are similar to it. While Mn2+ is generally considered "hard", Cd2+ is commonly claimed to be "soft". Nevertheless, Ce3+ and Cd2+ are similar and, in fact, more similar than Ce3+ and Ca2+ (Fig. 5). This observation seems even more unexpected if we consider the ionic radii: 1.01 Å for Ce3+ and 0.85 Å for Cd2+, against practically identical values for Ce3+ and Ca2+, 1.01 and 1.00 Å [12].

Fig. 6. The Kohonen self-organizing map representing the distribution of metal ions by their complexing properties. The different gradations of gray indicate different clusters (see text); cations that may migrate from cluster to cluster are marked in white.

3.5. Two-dimensional map of metal ions similarity

To obtain visual patterns of similarity/dissimilarity in the complexing behavior of metal ions, we built a Kohonen map (Fig. 6). The map is built for the data in factor space, where the points are positioned precisely; direct mapping of the initial complexation data is impossible because of the empty cells in the data matrix. The factor loadings of the five most significant factors were used. The map was built with the Viscovery SOMine program [14]. The map has a hexagonal structure with a grid of 45 × 42 nodes ("neurons"); a Gaussian neighborhood function was used.

The map is a powerful visualization tool that complements the similarity series and the classification scheme. For example, the similarity between the lead and manganese ions becomes evident at first glance; it is also evident that cadmium is located close to cerium, as shown in the second example above. Multi-colored shading was used to help visualize the clusters established by hierarchical clustering and to check the mapping for artifacts (two points located close together in the initial space may be located far apart on the grid). The map proved to reflect the distribution of cations in factor loading space with good precision.
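The kind of training that produces such a map can be illustrated with a minimal sketch. It is not the Viscovery SOMine algorithm used for Fig. 6: it assumes a small rectangular (not hexagonal) grid, random initialization, a linearly decaying learning rate and neighborhood radius, and invented two-dimensional data. Only the Gaussian neighborhood function is shared with the settings described above, and all function and parameter names are illustrative.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid position (row, col) of the node whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a rectangular SOM with a Gaussian neighborhood function."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood radius
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking radius
        for x in rng.sample(data, len(data)):    # random presentation order
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    g = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                    w = weights[r][c]
                    for d in range(dim):
                        # Pull the node toward x, more strongly near the BMU.
                        w[d] += lr * g * (x[d] - w[d])
    return weights


# Invented points in a two-dimensional "factor space", scaled to [0, 1].
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(data, rows=4, cols=4, epochs=50)
for point in data:
    print(point, best_matching_unit(som, point))
```

Comparing the grid positions of the best-matching units with the distances between the original points is a simple way to look for the mapping artifacts mentioned above.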



3.6. Limitations

We should emphasize that the reported analysis is based on a purely data-driven approach, with all its inherent drawbacks. We intentionally did not interpret the obtained factors and classes. The analysis missed some relatively important points. In particular, the stability constants were not adjusted to a common ionic strength and a common temperature. Another problem is that the measurement of inter-cation similarity was limited to the existing "common ligands base", which may be insufficient. For example, the calculated similarity of K+-Hg2+ or Na+-Ag+ is undoubtedly overestimated, since the stability constants for complexes of alkali metals with sulfur-containing ligands are practically unknown and, naturally, are missing. Possibly, it would be better to assign them a formally negligible value, but such an approach could not be considered general. Nevertheless, the presented classification reflects numerous important patterns in the complexation behavior of metal ions, and we hope that it will be helpful for analytical and coordination chemists.

Acknowledgements

The authors thank Dr. O. Obrezkov, Prof. Yu. Zolotov (Moscow University) and the anonymous referees for their interest and useful comments. We are also grateful to Eudaptics GmbH (Wien, Austria), who generously made their SOM analysis software (limited version) available from their Web site.

References

[1] R.D. Hancock, Analyst 122 (1997) 51R.
[2] P.W. Dimmock, P. Warwick, R.A. Robbins, Analyst 120 (1995) 2159.
[3] R.D. Hancock, A.E. Martell, Chem. Rev. 89 (1989) 1875.
[4] M. Bek, I. Nadpal, Chemistry of complex equilibria, Mir, Moscow, 1989 (in Russian).
[5] V.N. Kumok, Principles of coordination complexes stability in solution, TGU-Publishing, Tomsk, 1977 (in Russian).
[6] S. Ahrland, J. Chatt, N.R. Davies, Quart. Rev. 12 (1958) 265.
[7] R.G. Pearson, J. Am. Chem. Soc. 85 (1963) 3533.
[8] R.G. Pearson, Inorg. Chim. Acta 240 (1995) 93.
[9] E.R. Malinowski, Factor Analysis in Chemistry, 2nd Edition, Wiley, New York, 1991.
[10] J.A. Hartigan, Clustering Algorithms, Wiley, New York, 1975.
[11] M.A. Sharaf, D.L. Illman, B.R. Kowalski, Chemometrics, Wiley, New York, 1986.
[12] R.D. Shannon, Acta Crystallogr. Sect. A A32 (1976) 761.
[13] T. Kohonen, Self-Organizing Maps, 2nd Edition, Springer, Berlin, 1997, 426 pp.
[14] Viscovery SOMine 3.0, copyright © 1999 by Eudaptics software GmbH; free limited version available from http://www.eudaptics.com.