|
Документ взят из кэша поисковой машины. Адрес
оригинального документа
: http://www.adass.org/adass/proceedings/adass03/P3-9/
Дата изменения: Thu Aug 26 22:59:39 2004 Дата индексирования: Tue Oct 2 04:33:12 2012 Кодировка: Поисковые слова: http www.astronomy.ru forum index.php topic 4644.0.html |
Current astronomical surveys, like the Sloan Digital Sky Survey are
expected to describe on the order of hundreds of millions of
objects. The metadata associated with each object is managed in a
database. Astronomers use databases of this kind to find objects not
only by their physical attributes, but also by their location. In fact
the spatial component of the query can take complex forms, such as
``find pairs of objects that are separated by
degrees,
and have other physical attributes''. Furthermore, spatial
correlation of pairs, triples, etc. of objects and regions of interest
require complex (therefore expensive, time consuming) geometric
computation. The challenge facing the astronomical database designer
is that databases handle information that is organized into rows and
columns (tables) very well, but the nature of spherical topology is
such, that mapping (flattening) a sphere into a rectangle using
any cartographic projection invariably introduces distortions, and
worse, unacceptable discontinuities. But relational database
management systems (RDBMS) are extremely good at organizing
information contained in tables by manipulating indices. By
mapping the sphere into a table, or onto a line, we harness the power
of the database engine maximally.
The 64-bit hids that represent unique trixels are used as indices by the RDBMS. As a result, queries that involve examining regions merely manipulate sets of hids. These functions are made accessible to the database engine by defining wraper functions particular to the databse in use. In our testbed and production system, Microsoft SQL Server 2000 is extended by providing an interface to our function through so-called extended stored procedures. These functions provide adequate encapsulation of the HTM methods, so that users need not know the details of HTM algorithms, but can formulate their requests at a substantially high level.
Familiar shapes, like rectangles, circles, bands, are transformed into an internal normal form based on the union of convexes, which, in turn are intersections of constraints. In a computer program, the region is an object that contains the hids of the trixels that represent the region. These are generated by the library from descriptions in terms of familiar shapes, such as circles, rectangles, arbitrary polygons. If a user needs to know whether an observation is outside of a region of interest, a simple call to the HTM toolkit with the coordinates of the observation provides the answer.
It is important to note, that any single shape (or a convex) is an intersection of a finite number of constraints, and any region consisting of disconnected shapes is nothing more that a union of convexes.
With SQL and the extra functions that implement the HTM algorithms, we can express these easily. The details of using SQL to express a general query is beyond the scope of the limited space in this article, but as an illustration, we show one simple example.
The most powerul funtion is HTMCover(). The argument given to it is a description of the region expressed as a union of convexes, but the interface also supports more intuitive shape descriptors made up of rectangles, polygons and circles. Figure 1 shows two search cones. In this example, our region of interest is their intersection, which is represented a convex with two constraints. The language which allows coordinates to be expressed in either as RA DEC pairs, or as Cartesian coordinates.
CONVEX CARTESIAN 1 1 0 0.8 1 0 0 0.8
This convex specification is a intersection of two constraints, both of whose
.
The two centers are
and
, respectively. The
toolkit automatically normalizes the vectors. The hids that cover the specified convex is the list:
32 62 141 253 9120 9121
16144 16146 36494 36608 64589 65408
The smaller numbers represent the larger spherical triangles in Figure 1. In some queries, the list of hids may become huge. Fortunately, the hid values have a local coherence property, that is to say, clumps of trixels in a neighborhood have large runs of consecutive hids. Therefore we can consider a list of runs of hids instead of a very large number of individual hids. Consequently, the actual value returned by HtmCover() is an SQL table where each row is a pair hid values. This is an effective run-length encoding of the hid list.
It is possible, that a particular request for a cover would produce a very fragmented hid range list. Therefore our software ensures that the list of hid range values is always bound by a specific, but tunable parameter (between 15 - 100). This is accomplished by merging ranges with small gaps first, then those with greater gaps until the number of ranges is within the specified parameter. This introduces ``false positives'' into the cover, but that is preferable to ignoring an hid value that is actually inside (or intersecting) the convex.
For an extensive discussion and references the reader is directed to the link at the end of this page.