Automatic Classification of Subdwarf Spectra using a Neural Network
C. Winter 1, C.S. Jeffery 1 and J.S. Drilling 2
1 Armagh Observatory, College Hill, Armagh BT61 9DG, N. Ireland
2 Dept. of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803
cwr@arm.ac.uk, csj@arm.ac.uk, drilling@rouge.phys.lsu.edu
Abstract
We apply a multilayer feed-forward back-propagation
artificial neural network to a sample of 380 subdwarf
spectra classified by Drilling et al. (2002), showing
that it is possible to use this technique on large sets
of spectra and obtain classifications in good agreement
with the standard. We briefly investigate the impact
of training set size, showing that large training sets do
not necessarily perform significantly better than small
sets. Plans for future work in this area are also outlined.
Automated Classification
Stellar spectra require the experience and judgement
of a trained expert in order to be classified. However,
current and future digital sky survey projects, like the
SDSS, along with space-based missions such as GAIA,
will collect huge quantities of spectra, far more than
human experts will be able to cope with. In light of
this, the investigation of automated classification
schemes as supplementary tools is becoming urgent if
we are to stay ahead of the data wave.
Following past examples (Gulati et al. 1994, von
Hippel et al. 1994, Bailer-Jones 1996), we aim to
establish whether an artificial neural network (ANN) is
capable of providing agreeable classifications for a set
of subdwarf spectra previously classified by Drilling et
al. (2002). Additionally, we briefly investigate how the
size and content of the ANN's training data (analogous
to spectral classification standards) affect its ability
to provide agreeable classifications.
Data Pre-Processing
Our sample of subdwarf spectra was taken from the
collection compiled by Drilling et al. (2002) from data
provided by Moehler et al. (1990a, 1990b), Dreizler
et al. (1990), and Theissen et al. (1993). It comprises
a more-or-less representative sample of 174 PG
subdwarfs and blue horizontal branch stars, plus a few
other stars not included in the PG catalog.
The Drilling classification system uses a spectral
type running from sdO1 to sdA (1–20), analogous
to MK spectral classes. It introduces a helium class
(0–40) based on H, He I and He II line strengths, and
uses luminosity classes IV–VIII, where most subdwarfs
have luminosity class VII. The mapping between
Drilling classes and those used elsewhere, e.g. in
the PG survey (Green et al. 1986), is illustrated in
Figure 1.
Figure 1. Comparison of Drilling spectral and helium
classes with the PG classes (from Drilling et al. 2002)
Our data have been coarsely classified on the
helium scale defined by Drilling et al. (2002), with a
grain size of 4 helium classes.
Before applying the ANN, the data must be in a
homogeneous form. Simple inspection revealed
dissimilarities in wavelength range and bin size. A
common wavelength range of 4300–4850 Å was
established, along with a common bin size of 0.6 Å.
Any spectra that could not conform to these were
removed from the data set.
After crude rectification of large cosmic-ray spikes
and instrumental end-effects, the spectra were velocity
corrected by way of a cross-correlation function.
Further elimination of spectra with no corresponding
spectral classification left a final collection of 380
spectra. These were then resampled onto a common
wavelength range of 4200–4900 Å, with a bin size of
0.6 Å, yielding 1167 data points per spectrum.
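The resampling step can be sketched as follows. The linear interpolation scheme and the function names are assumptions for illustration, since the poster does not describe the implementation:

```python
# Sketch: resample a spectrum onto a common wavelength grid by linear
# interpolation. Linear interpolation is an assumption; the poster does
# not state which scheme was actually used.

def resample(wave, flux, new_wave):
    """Linearly interpolate the sampled spectrum (wave, flux) onto new_wave."""
    out = []
    j = 0
    for w in new_wave:
        # advance to the interval [wave[j], wave[j+1]] that brackets w
        while j < len(wave) - 2 and wave[j + 1] < w:
            j += 1
        w0, w1 = wave[j], wave[j + 1]
        f0, f1 = flux[j], flux[j + 1]
        t = (w - w0) / (w1 - w0)
        out.append(f0 + t * (f1 - f0))
    return out

# Common grid: 4200-4900 A at 0.6 A per bin gives the 1167 points per
# spectrum quoted in the text.
n_bins = int((4900 - 4200) / 0.6) + 1
```

The bin count confirms the figure quoted above: 700 Å at 0.6 Å per bin, plus the endpoint, gives 1167 samples.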
Figure 2. Jittered scatter plots showing the distribution
of our final 380 spectra across the three classification
dimensions. Note the concentration of spectra at
Luminosity Class VII.
The Neural Network
An ANN is a statistical pattern recognition algorithm,
able to perform a non-linear, parameterised mapping
between two domains. Originally inspired by the struc-
ture of neuronal cells in the brain, a typical ANN con-
sists of an interlinked, hierarchical structure of pro-
cessing nodes. The interested reader should refer to
Bishop (1995) for more detailed instruction.
The feed-forward back propagation neural net-
work code statnet, by Dr. Coryn Bailer-Jones
(http://www.mpia-hd.mpg.de/homes/calj/
statnet.html), was used in this study.
Our main objective is to show whether ANNs are
able to perform the task of spectral classification;
hence the ANN architecture was kept reasonably simple.
A committee of 5 networks was used, with each network
consisting of 1 input layer with 1167 input nodes, 1
hidden layer of 5 nodes, and 1 output node.
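The committee architecture can be sketched as a forward pass plus output averaging. The sigmoid hidden activation, linear output, and simple mean over the committee are illustrative assumptions (statnet's internals are not described here), and a tiny 3-input network stands in for the 1167-input case:

```python
import math
import random

def forward(x, w_hid, b_hid, w_out, b_out):
    """One forward pass of a single-hidden-layer feed-forward network.
    Sigmoid hidden units and a linear output are assumed here; the
    actual activations used by statnet may differ."""
    hidden = []
    for wh, bh in zip(w_hid, b_hid):
        a = sum(wi * xi for wi, xi in zip(wh, x)) + bh
        hidden.append(1.0 / (1.0 + math.exp(-a)))   # sigmoid activation
    return sum(wo * h for wo, h in zip(w_out, hidden)) + b_out

def committee_predict(x, nets):
    """Average the outputs of a committee of networks (5 in the poster)."""
    return sum(forward(x, *net) for net in nets) / len(nets)

def tiny_net(n_in=3, n_hid=5, rng=random.Random(0)):
    """Build an illustrative random network (3 inputs instead of 1167)."""
    w_hid = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_hid)]
    b_hid = [0.0] * n_hid
    w_out = [rng.gauss(0, 0.1) for _ in range(n_hid)]
    return (w_hid, b_hid, w_out, 0.0)

nets = [tiny_net() for _ in range(5)]
y = committee_predict([1.0, 0.5, -0.2], nets)
```

A committee averages out the variance between individually trained networks, which is one common motivation for using several small networks rather than a single large one.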
To test the effect of training set size on ANN
performance, two training sets were created for each
parameter space we wanted to classify in. One set
comprised 100 spectra for training, with the resulting
ANN being tested on the remaining 280 spectra.
Similarly, the second training set contained 280
spectra, with the remaining 100 spectra used to test
the ANN. In each case, training set samples were chosen
stochastically from the main data set such that the
parameter space was represented evenly. An uneven
representation limits the ANN's ability to generalise,
and would thus reduce performance.
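The even-representation sampling can be sketched as a round-robin draw over classes. The exact procedure used in this study is not stated, so this is an assumed implementation:

```python
import random
from collections import defaultdict

def even_split(labels, n_train, seed=0):
    """Stochastically pick n_train indices, spread as evenly as possible
    across the classes in `labels`; return (train, test) index lists.
    The round-robin draw is an assumed implementation of the
    even-representation sampling described in the text."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    train = []
    classes = list(by_class)
    while len(train) < n_train:
        progressed = False
        # take one randomly chosen member from each class in turn
        for c in classes:
            if by_class[c] and len(train) < n_train:
                pick = by_class[c].pop(rng.randrange(len(by_class[c])))
                train.append(pick)
                progressed = True
        if not progressed:   # fewer spectra available than requested
            break
    chosen = set(train)
    test = [i for i in range(len(labels)) if i not in chosen]
    return train, test
```

Run once with n_train=100 and once with n_train=280 to reproduce the two training/test splits described above.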
Results
• ANN classification was limited to spectral type
and helium class only. 64% of our spectra reside in
luminosity class VII, making this class
over-represented and restricting the ability of the
ANN to make accurate classifications in other areas
of the parameter space.
Table 1: Summary of results.

                    SpT            HeC
Training set     100   280      100   280
RMS             2.09  1.99     4.79  4.55
r               0.89  0.90     0.92  0.94
• Beginning with spectral type, the training set of
100 spectra allowed the ANN to provide classifications
to within 2.09 subtypes, with a correlation coefficient
of 0.89. Training with a set of 280 spectra,
classifications were accurate to within 1.99 subtypes,
with a correlation coefficient of 0.90.
• In terms of helium class, a training set of 100
spectra enabled the ANN to classify to within 4.79
classes, with a correlation coefficient of 0.92. Using
the training set of 280 spectra, classifications were
within 4.55 classes, with a correlation coefficient of
0.94. The rather large errors in helium classifications
are due to the coarse grain of the original
classification scale.
• In each case, we see that the ANN trained on a
larger sample of 280 spectra yields a classification
error not significantly smaller than the ANN trained
using the smaller sample of 100, suggesting a large
training set is not necessarily required for good
performance.
Figure 3. The scatter plots show true classifications
against network classifications. Also plotted for each
case is a least-squares best-fit line.
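The two figures of merit in Table 1, the RMS classification error and the correlation coefficient r, are standard quantities; a generic sketch of their computation (not the authors' code) is:

```python
import math

def rms_error(true, pred):
    """Root-mean-square difference between catalogue and network classes."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

def pearson_r(x, y):
    """Pearson correlation coefficient between two sequences of classes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Applied to the true and network-assigned classes of the test spectra, these yield the RMS and r rows of Table 1.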
Conclusions
• We have established that ANNs are capable of
providing spectral classifications agreeable with those
made according to the classification standards.
• In addition, a large training set is not necessarily
required for the ANN to yield good results.
• Future work will allow us to determine if ANNs can
yield even better performance. We plan to investigate
a number of possibilities:
– Attempting to locate an optimal training set,
and whether such a set should contain the
classification standard spectra;
– Restricting the ANN's attention to the same
spectral lines as used in defining the
classification standards;
– Varying the network structure, an important
factor in ANN performance: if we add a second
hidden layer, adjust the number of processing
nodes, etc., what is the corresponding effect
on performance?
– In many cases, our data set contains several
spectra from the same star. Is the ANN giving
each spectrum the same classification?
– Pre-processing spectra with Principal
Components Analysis to remove noisy features
and compress the number of ANN inputs;
– Increasing the size of the data set to provide
a richer representation of the parameter space,
allowing further studies into hot subdwarfs.
References
Bailer-Jones C. A. L. 1996, PhD thesis, University of
Cambridge
Bishop C. M. 1995, Neural Networks for Pattern
Recognition (Oxford: Oxford University Press)
Dreizler S., Heber U., Werner K., Moehler S., de Boer
K. S. 1990, A&A, 235, 234
Drilling J. S., Moehler S., Jeffery C. S., Heber U.,
Napiwotzki R. 2002, in Probing the Personalities of
Stars and Galaxies, ed. R. Gray, in press
Green R. F., Schmidt M., Liebert J. 1986, ApJS, 61,
305
Gulati R. K., Gupta R., Gothoskar P., & Khobragade
S. 1994, ApJ, 426, 340
Moehler S., Richter T., de Boer K. S., Dettmar R. J.,
Heber U. 1990a, A&AS, 86, 53
Moehler S., Heber U., de Boer K. S. 1990b, A&A, 239,
265
Theissen A., Moehler S., Heber U., de Boer K. S. 1993,
A&A, 273, 524
von Hippel T., Storrie-Lombardi L. J., Storrie-
Lombardi M. C., & Irwin M. J. 1994, MNRAS, 269,
97