Source Segmentation and Classification
######################################

Background and Rationale
========================

We define "segmentation" as the process of breaking up an astronomical
image into individual sources and measuring basic properties of the
sources such as position and flux. "Classification" is the process of
analyzing the light distribution of each source in order to determine
what type of object it is and to measure its properties in more detail.
Several software packages to segment and/or classify sources exist
(SExtractor, FOCAS, DAOPHOT, GIM2D). Segmentation is usually done using
a peak finding or thresholding algorithm, often after filtering to
remove sky and to improve the ability to detect particular types of
objects. Classification requires a more detailed look at the light
distribution and techniques vary widely. A simple goal of
classification may be to discriminate between stars and galaxies, while
a more ambitious goal may be to classify different galaxy types or to
measure structural properties. The end product of segmentation and
classification is a "source catalog" which lists the sources and their
properties.
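
To make the thresholding approach described above concrete, the sketch
below uses the Python package sep, which exposes the SExtractor
source-extraction engine as a library; the filename, the 1.5-sigma
threshold, and the minimum area are illustrative values, not
prescriptions of this plan:

    import numpy as np
    import sep
    from astropy.io import fits

    # "image.fits" is a placeholder for any reduced HST direct image;
    # sep needs native-byte-order float data.
    data = fits.getdata("image.fits").astype(np.float32)

    # Estimate and subtract the smoothly varying sky background.
    bkg = sep.Background(data)
    data_sub = data - bkg.back()

    # Threshold at 1.5 sigma above the global background RMS,
    # requiring at least 5 connected pixels per detection.
    objects = sep.extract(data_sub, 1.5, err=bkg.globalrms, minarea=5)

    # Each entry carries the basic segmentation products: position,
    # flux, and crude shape (semi-axes of an ellipse fit).
    for obj in objects[:10]:
        print(obj["x"], obj["y"], obj["flux"], obj["a"], obj["b"])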

There are many uses for source segmentation and classification in
HST direct images. They can be loosely grouped into operational and
scientific uses.

The primary operational uses of source catalogs concern astrometry. By
cross-correlating the source catalog of an image with the Guide Star
Catalog or other large-scale astrometric surveys, one can improve the
absolute World Coordinate System (WCS) in the image header. With or
without astrometric standards in the field of view, one can use the
relative positions of sources in overlapping images to fine-tune the
alignment and registration of images, thus allowing improved
dither-combined products. The need for better WCS headers is discussed
in more detail in ****Megan's section****, while the need for making
improved dithered products is discussed in ****Bill's section****.
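
A minimal sketch of the cross-correlation step, assuming the image
catalog and an astrometric reference catalog have already been loaded
into arrays (det_ra, det_dec, ref_ra, ref_dec are placeholder inputs);
astropy matches the catalogs and estimates a zero-point shift for the
header CRVAL keywords:

    import numpy as np
    import astropy.units as u
    from astropy.coordinates import SkyCoord

    # det_ra/det_dec: positions from the image catalog; ref_ra/ref_dec:
    # positions from an astrometric reference such as the Guide Star
    # Catalog (all placeholder arrays, in degrees).
    det = SkyCoord(ra=det_ra * u.deg, dec=det_dec * u.deg)
    ref = SkyCoord(ra=ref_ra * u.deg, dec=ref_dec * u.deg)

    # For each detected source, find the nearest reference source and
    # keep only close pairs to reject spurious matches.
    idx, sep2d, _ = det.match_to_catalog_sky(ref)
    good = sep2d < 1.0 * u.arcsec

    # The median residual is a first-order correction to the WCS zero
    # point, to be applied to the CRVAL1/CRVAL2 header keywords.
    dra = np.median((ref.ra[idx] - det.ra)[good])
    ddec = np.median((ref.dec[idx] - det.dec)[good])
    print(dra.to(u.arcsec), ddec.to(u.arcsec))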

Source catalogs have many scientific uses. The limits are primarily set
by what the software measures, which could include: [1] positions; [2]
fluxes (within apertures, within isophotes, or from model fits); [3]
crude morphological information (position angle and ellipticity of an
isophote, image moments); [4] star-galaxy discrimination; [5] local sky
level; and [6] detailed morphological properties (such as the bulge/disk
ratio, asymmetry parameters, and galaxy classification). These
quantities are already valuable in instrumental units (e.g. pixel
positions, flux in DN). Additional value comes from calibrating them
(e.g. world coordinate positions and fluxes in f_lambda or f_nu). Since
the calibrations are essentially in the image headers, this is readily
achieved, either as part of the segmentation and classification process,
or offline by the user.
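
For instance, assuming a catalog of pixel positions (x_pix, y_pix) and
count rates (flux_dn), both calibrations reduce to a few lines of
Python reading the WCS and the standard HST inverse-sensitivity keyword
PHOTFLAM from the header:

    from astropy.io import fits
    from astropy.wcs import WCS

    hdr = fits.getheader("image.fits")   # placeholder filename
    wcs = WCS(hdr)

    # Pixel positions -> world coordinates (RA, Dec).
    sky = wcs.pixel_to_world(x_pix, y_pix)

    # Count rates in DN/s -> f_lambda in erg/s/cm^2/A, using the
    # inverse-sensitivity keyword written by the HST calibration
    # pipeline.
    f_lambda = flux_dn * hdr["PHOTFLAM"]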

Additional scientific utility can come from a statistical analysis of
the source catalog or by matching several catalogs. Possible examples
include determining the depth of an image from a magnitude histogram,
plotting color magnitude diagrams, looking for variability from repeated
observations, determining photometric redshifts,
and measuring lensing or shear with image shape parameters.
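
As an illustration of the first example, the turnover of the
differential number counts gives a crude completeness limit; the
magnitudes array below is a placeholder for a column of catalog
magnitudes:

    import numpy as np

    # Number counts rise toward faint magnitudes until incompleteness
    # sets in; the peak bin approximates the depth of the image.
    counts, edges = np.histogram(magnitudes,
                                 bins=np.arange(18.0, 30.0, 0.25))
    peak = np.argmax(counts)
    depth = 0.5 * (edges[peak] + edges[peak + 1])
    print("approximate completeness limit: %.2f mag" % depth)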

Segmentation and classification can be done at various stages of the
image processing, and the uses one makes of the catalogs can be
suitably tailored to the needs of that stage of processing. For
example, structural properties may not be of much use before all the
images of a field in a given filter are dither combined, but the pixel
positions may be very useful for fine-tuning the registration so that a
better dithered product can be made.

As one proceeds from the operational uses to the various scientific
uses: [1] the quality requirements increase; [2] the need for
intelligent processing increases (e.g. field dependence: sparse vs.
crowded fields, distant vs. nearby galaxies, etc.); [3] the
applicability and target audience narrow; and [4] the methods become
more specialized.


Assumptions
===========

Existing segmentation software works best with cosmic-ray-free images.
If segmentation is to be done on single CCD frames, the software must
be able to identify cosmic rays (and perhaps remove them).
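
One possible approach (our illustration; this document prescribes no
package) is the L.A.Cosmic algorithm as implemented in the Python
package astroscrappy, which flags and cleans cosmic rays in a single
frame:

    import numpy as np
    import astroscrappy
    from astropy.io import fits

    data = fits.getdata("single_frame.fits").astype(np.float32)

    # detect_cosmics returns a boolean mask of cosmic-ray pixels and a
    # cleaned copy of the image; gain and readnoise should come from
    # the instrument handbook or the header for real data.
    crmask, cleaned = astroscrappy.detect_cosmics(
        data, sigclip=4.5, gain=1.0, readnoise=5.0)

    print("flagged %d cosmic-ray pixels" % crmask.sum())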

- Are these catalogs going to be archived by MAST? ****
- OTFR implications ****.

Required Decisions
==================

****

Minimum, Intermediate and Maximum Goals
=======================================

[1] The minimum goal is to provide operational grade source catalogs.
These can be constructed both before and after dither combining. The
operational grade catalog will be geared towards tying down the WCS and
improving the image registration and alignment prior to dither
combining. The segmentation software must provide good pixel positions
and crude fluxes of compact local maxima and must be very robust against
false detections (e.g. have a high threshold cutoff).
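
A sketch of such a conservative detection pass, here using the
photutils package; the 10-sigma threshold and the FWHM are illustrative
values:

    from astropy.io import fits
    from astropy.stats import sigma_clipped_stats
    from photutils.detection import DAOStarFinder

    data = fits.getdata("image.fits")   # placeholder filename
    mean, median, std = sigma_clipped_stats(data, sigma=3.0)

    # A high (10 sigma) threshold keeps the false-detection rate low,
    # as an operational grade catalog requires.
    finder = DAOStarFinder(fwhm=2.5, threshold=10.0 * std)
    sources = finder(data - median)

    # Pixel positions and crude fluxes of compact local maxima.
    print(sources["xcentroid", "ycentroid", "flux"])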

[2] An intermediate goal is to provide "easy" science grade
catalogs. These should be made after cosmic-ray rejection or dither
combination of the images. Their primary purpose is quick-look
assessment of sources. The segmentation software must be geared to the
simplest fields (e.g. distant galaxy fields), and must be reasonably
robust against false detections. The users should be made aware of the
limitations of these catalogs (i.e. buyer beware).

[3] The maximum goal is to provide complete science grade catalogs.
These would have the deepest possible threshold and include more
esoteric measurements (e.g. galaxy morphology). These should be
constructed after cosmic-ray rejection or dither combination of the
images. The software
for constructing a complete science grade source catalog requires the
most intelligence in processing (e.g. field dependence in threshold
setting, bulge/disk decomposition only turned on for the highest S/N
extended sources) and requires the most quality control of output
(i.e. error estimates). Again, the users should be made aware of the
limitations of these catalogs (i.e. buyer beware).
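
The per-source intelligence meant here can be sketched as a simple gate
on signal-to-noise and size; the cuts and the fit_bulge_disk helper
below are hypothetical:

    def classify(source):
        """Decide how much classification a source warrants.

        `source` is assumed to carry snr, npix, and stellarity
        attributes from the segmentation step; all cuts are
        illustrative.
        """
        if source.snr < 10:
            return "detection only"        # too faint to classify
        if source.stellarity > 0.9:
            return "star"
        if source.snr > 50 and source.npix > 200:
            # Only the highest S/N extended sources justify an
            # expensive model fit (hypothetical helper).
            return fit_bulge_disk(source)
        return "galaxy (no decomposition)"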

Implementation Plan
===================

It is likely that the same software (e.g. SExtractor) can be used to
achieve goals [1], [2], and much of goal [3] (specialized classification
software may be needed for potential science goals such as bulge/disk
decomposition and galaxy morphology determination). The difference
between the different grades of catalogs is largely a matter of which
parameters are measured and how the threshold for object detection is
set. Hence the easy science grade catalog can be considered a subset
of the complete science grade catalog, and the operational grade catalog
a subset of both of these.
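
This subset relation could be expressed directly in the wrapper
configuration; the grouping below is our illustration, though the
parameter names follow SExtractor's output-catalog conventions:

    # Output parameters per catalog grade; each grade is a superset
    # of the one before it.
    OPERATIONAL = ["X_IMAGE", "Y_IMAGE", "FLUX_APER"]
    EASY_SCIENCE = OPERATIONAL + [
        "ALPHA_J2000", "DELTA_J2000", "MAG_AUTO",
        "A_IMAGE", "B_IMAGE", "THETA_IMAGE", "CLASS_STAR",
    ]
    COMPLETE_SCIENCE = EASY_SCIENCE + [
        "ISOAREA_IMAGE", "FLUX_RADIUS", "BACKGROUND",
    ]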

Research phase: The main work required for implementing any segmentation and
classification software is research. First the various existing
software packages for segmentation must be identified and tested on the
complete range of existing HST data (e.g. all imaging instruments, crowded
and sparse fields, some with foreground sources such as nearby galaxies,
taken with a range of exposure times) as well as simulated images (for
which we know the truth, and to plan for upcoming instruments such as
WFC3). The aim of the research would be to find out which packages work
best and which parameters should be adjusted to make the cuts between
the different grades of catalogs.
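
One such test on the simulated images is fake-source injection: add
sources of known flux and position, rerun the detection, and count the
recoveries. A sketch, assuming a detect(image) function (hypothetical)
that returns arrays of detected x and y positions, with a 2-pixel match
radius:

    import numpy as np

    def inject_gaussians(image, xs, ys, flux, fwhm=2.5):
        """Add round Gaussian fake sources to a copy of `image`."""
        sigma = fwhm / 2.355
        yy, xx = np.indices(image.shape)
        amp = flux / (2 * np.pi * sigma**2)
        out = image.astype(float)
        for x, y in zip(xs, ys):
            out = out + amp * np.exp(
                -((xx - x)**2 + (yy - y)**2) / (2 * sigma**2))
        return out

    def completeness(image, detect, nfake=100, flux=500.0):
        """Fraction of injected sources recovered by detect(image),
        which is assumed to return arrays of detected x, y positions."""
        rng = np.random.default_rng(1)
        xs = rng.uniform(10, image.shape[1] - 10, nfake)
        ys = rng.uniform(10, image.shape[0] - 10, nfake)
        det_x, det_y = detect(inject_gaussians(image, xs, ys, flux))
        found = sum(np.any(np.hypot(det_x - x, det_y - y) < 2.0)
                    for x, y in zip(xs, ys))
        return found / nfake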

Planning phase: Next, the data requirements (rules) for each level of
implementation should be determined. These rules could include: only
use CR-combined images; do not use on moving targets; position
constraints (galactic latitude, distance from nearby galaxies, etc.).
This will also allow us to flesh out the goals of the different grades
of source catalogs and hence make a more detailed implementation plan.
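
In practice such rules reduce to simple header checks; a sketch (the
cuts are illustrative; MTFLAG is the standard HST moving-target flag
and NCOMBINE the conventional count of combined frames):

    def catalog_allowed(hdr, grade="easy_science"):
        """Apply per-grade data rules read from the image header."""
        if hdr.get("MTFLAG") in ("T", True):
            return False              # no moving targets
        if grade != "operational" and hdr.get("NCOMBINE", 1) < 2:
            return False              # require CR-combined input
        return True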

Implementation phase: The rest of the work would involve writing
wrappers for the existing software to produce each catalog grade (which
will successively be phased in). In addition, extensive stress testing
will be required to make the scripts robust and to document the
limitations
of the different catalog grades (e.g. false detection rate as a function
of S/N).
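
A minimal sketch of such a wrapper around the SExtractor command line,
assuming the sex binary and a prepared default.sex configuration file
exist; the per-grade thresholds are illustrative:

    import subprocess

    # Per-grade detection thresholds (sigma); illustrative values.
    THRESH = {"operational": 10.0, "easy_science": 3.0,
              "complete_science": 1.5}

    def run_sextractor(image, grade, catalog):
        """Run SExtractor on `image`, writing `catalog`."""
        cmd = ["sex", image,
               "-c", "default.sex",
               "-CATALOG_NAME", catalog,
               "-DETECT_THRESH", str(THRESH[grade])]
        subprocess.run(cmd, check=True)

    run_sextractor("image.fits", "operational", "image_oper.cat")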


Required Resources and Timescales
=================================

**** should discuss this. This is vaguely based on ACS IDT experience.

Research phase: at least one scientist-year.

Planning phase: scientist and programmer working together for three
months.

Implementation phase: On the order of three programmer-months to
implement each catalog grade, followed by a similar amount of time by
1-2 scientists to stress test and document each grade.