Äîêóìåíò âçÿò èç êýøà ïîèñêîâîé ìàøèíû. Àäðåñ îðèãèíàëüíîãî äîêóìåíòà : http://www.mso.anu.edu.au/~charley/papers/9903206.pdf
Äàòà èçìåíåíèÿ: Sun Dec 12 15:18:59 2004
Äàòà èíäåêñèðîâàíèÿ: Tue Oct 2 02:53:39 2012
Êîäèðîâêà:

Ïîèñêîâûå ñëîâà: m 81
Mon. Not. R. Astron. Soc. 000, 000­000 (0000)

Printed 12 December 2004

a (MN LTEX style file v1.4)

Applications of Wavelets to the Analysis of Cosmic Microwave Background Maps
L. Tenorio,1 A.H. Jaffe,2 S. Hanany,
1 2 3 4

2,3

C.H. Lineweaver

4

Mathematical and Computer Sciences, Colorado School of Mines, Golden, CO 80401 Center for Particle Astrophysics, University of California, Berkeley, CA 94720 School of Physics and Astronomy, University of Minnesota, Minneapolis, MN 55455 School of Physics, University of New South Wales, Sydney Australia

12 December 2004

arXiv:astro-ph/9903206 v1 13 Mar 1999

ABSTRACT

We consider wavelets as a tool to perform a variety of tasks in the context of analyzing cosmic microwave background (CMB) maps. Using Spherical Haar Wavelets we define a position and angular-scale-dependent measure of power that can be used to assess the existence of spatial structure. We apply planar Daubechies wavelets for the identification and removal of points sources from small sections of sky maps. Our technique can successfully identify virtually all point sources which are above 3 and more than 80% of those above 1 . We discuss the trade-offs between the levels of correct and false detections. We denoise and compress a 100,000 pixel CMB map by a factor of 10 in 5 seconds achieving a noise reduction of about 35%. In contrast to Wiener filtering the compression process is model independent and very fast. We discuss the usefulness of wavelets for power spectrum and cosmological parameter estimation. We conclude that at present wavelet functions are most suitable for identifying localized sources. Key words: cosmic microwave background--methods: statistical.

1

INTRODUCTION

The cosmic microwave background (CMB) radiation encodes a vast amount of information ab out the early universe and its subsequent evolution. A numb er of exp eriments: ground-, balloon- and space-based, are p oised to generate large data sets from which one hop es to decode this information. Optimal data analysis methods to b e utilized with these data sets are presently an active area of research. In this pap er we explore the application of wavelet methods to the analysis of CMB data sets. Cosmological theories usually model the CMB sky as a homogeneous random field on the sphere. Spherical harmonics arise naturally through the sp ectral decomp osition of the field. They are optimal for frequency localization--indeed, they define spatial frequencies, , on the sphere--but they do not allow assessment of local features. To represent a signal that is nonzero only on a small patch of the sky a large numb er (formally infinite) of spherical harmonics are required to obtain all necessary cancellations outside the patch. Wavelets, on the other hand, are locally supp orted and therefore, only those supp orted in the patch are needed to represent the signal. Wavelets have b ecome an attractive tool for data analysis and compression b ecause they are computationaly effic 0000 RAS

cient and have b etter time-frequency localization than the usual Fourier methods. Although wavelets are usually defined on the real line, or subsets of Rn , there have b een some recent generalizations to other surfaces like the sphere. In this pap er we use spherical wavelets when dealing with large p ortions of the sky and planar wavelets for small patches. Spherical Haar wavelets (SHW) were introduced by Sweldens (1995) as a generalization of planar Haar wavelets to the pixelized unit sphere. The basic idea of SHW is to transform the original map into the sum of a smoothed lower resolution map and a set of coefficients encoding the smallangular-scale details not captured by the smoothed map. The smoothing is done by lowering the resolution of the map through a hierarchical pixelization scheme. Thus, pixelizations of sky maps automatically yield SHW, and these are useful to study data as a function of p osition and angular scale. SHW are not smooth but are nonetheless attractive b ecause they can b e easily implemented and used as a computationally efficient exploratory data analysis tool. In Section 2 we present a wavelet formalism on the sphere adapted to CMB sky maps. We then use this formalism to compare structure in the COBE-DMR maps with predictions of exp ected structure based on model dep endent simulations. A common thread throughout the rest of the pap er is


2

L. Tenorio et al.
Up to normalizing constants, the exp ected value of the RMS2 of the map is T
i 2 i

that of denoising a noisy signal. Because wavelet functions have excellent localization prop erties and their `thresholding' is the asymptotically b est way to remove additive Gaussian white noise (Donoho 1992), they are useful for b oth the identification of sources emb edded in Gaussian noise and for suppression of such noise. A further attractive feature of wavelet denoising is that the process is computationally efficient. In Section 4 we use wavelets for the identification of p oint sources emb edded in noise. We do not use SHW for this particular application b ecause SHW are not smooth and thus are not optimal for estimating smooth localized sources. We do not know of any computationally efficient smooth wavelets on the sphere. For these reasons the algorithm we prop ose is more appropriate for small patches of the sky (where the flat sky approximation is appropriate) and we use tensor products of one dimensional Daub echies wavelets. Wavelet decomp osition of CMB maps for the purp ose of identifying non-Gaussianity has b een considered recently by Ferreira, Magueijo & Silk (1997), Pando, Valls-Gabaud & Fang (1998), and Hobson, Jones & Lasenby (1998). Several other groups are concentrating efforts on usage of wavelet for foreground identification and removal (e.g., Cayon 1999, Sanz et al. 1999). Current methods for compressing and denoising CMB maps are based on Karhunen-Loeve transformations (Bond 1995, Bunn & Sugiyama 1995) or Wiener filtering (Bunn et al. 1996). For a sky map with N pixels the computational cost of these methods is O(N 3 ) so their brute force application to future mega-pixel maps (N 106 ) is an unsolved (and p erhaps unsolvable) computational challenge. Map denoising by wavelet thresholding is significantly more efficient taking only O(N log (N )) op erations. For example, applying the wavelet transform, denoising and reconstructing a 105 pixel map takes less than 6 seconds on an Alpha 500/266. Furthermore, wavelet thresholding can b e done in a model indep endent way. In Section 5 we present such a technique using SHW. It has b een p ointed out rep eatedly (e.g., Bond, Jaffe & Knox 1998, Borrill 1999, Bond et al. 1999 and references therein) that b ecause of the size of future data sets new analysis techniques will need to b e develop ed. Techniques that can efficiently extract cosmological information from mega-pixel sky maps. In Section 6 we discuss the usefulness of wavelets for this task. Section 7 summarizes the results. A summary of SHW applied to the COBE pixelization can b e found in the App endix.




(2 + 1)C /4 .

(3)

An observation, di , of this (pure-signal) map is the sum of the map with some noise comp onent, di = Ti + ni , so d
i 2 i




(2 + 1)C /4 + Nii

(4)

where Nii is the noise correlation matrix, and in particular 2 Nii = i is the noise variance of pixel i. We now introduce an analogous formalism, expressing the map RMS2 using wavelets. Since wavelets are localized b oth in the spatial and frequency domain, a wavelet expansion easily leads to partitioning of the RMS2 into comp onents from different angular scales and from distinct p ortions of the map. We introduce the formalism in terms of spherical wavelets, most suitable for maps covering large p ortions of the sky. Spherical Haar wavelets were introduced by Schroeder & Sweldens (1995) as an example of what they call secondgeneration wavelets. These are wavelets which are not dilations and translations of a mother wavelet and which can b e defined in spaces more general than Rn . As far as we know they were the first to introduce computationally efficient wavelets on the sphere. Sweldens (1995) applied SHW and other second-generation spherical wavelets to data compression. The SHW transform of a map d = (di ) =
J

of maximum resolution J is (see App endix) Wd = = (Jo , Jo , · · · ,
J -1

)t ,

where the vector Jo is the map at the lowest resolution, Jo , and j , Jo j J , is a vector of wavelet coefficients at scale j . We define the scale-j p ower in the wavelet domain as a normalized sum of squares of the coefficients at scale j . To interpret this measure we show its relation to the RMS2 2 2 of a map: RMS2 = i wi d2 / i wi , for chosen weights wi . i The wavelet transform (where the weights wi are implicitely included in J ) yields an angular-scale decomp osition of the RMS2 1 RMS2 = ( J )t J w 1 = [ ( Jo ) t A Jo Jo + ( Jo ) t B Jo Jo w + · · · + ( J -1 ) t B J -1 J -1 ] = RMS2 o + · · · + RMS2 , J J
2 wi

2

WAVELET POWER AND CMB SKY MAPS

(5)

A CMB map is a vector T = (Ti ) of temp eratures T in pixels i, located at p osition xi . We decomp ose this vector-- ^ which is also a field on the sky--into its spherical-harmonic comp onents, T (x) = ^
m

a

m Y m

(x). ^

(1)

We can then define the p ower sp ectrum in the usual way as the exp ectation of the square of the comp onents a
m a m

= C mm .

(2)

and AJo , BJo , ..., BJ -1 are diagonal where w = i matrices defined by the wavelet transform (see App endix). Expression (5) separates contributions to the RMS2 from different resolutions. Each RMS2 corresp onds to a window i function of C . Table 1 shows the results of the RMS2 decomp osition of some spherical harmonics of different order . The table shows the p ercentage of the total RMS2 that goes into each scale. As exp ected, as increases a higher p ercentage of the p ower goes into the higher scales (i.e., higher j ). Table 1 also shows that spherical harmonics are not well localized in wavelet scale.
c 0000 RAS, MNRAS 000, 000­000


Applications of Wavelets to CMB Maps
Table 1. Percentage of the total RMS2 that goes into each scale j for some spherical harmonics of different orders . The lowest and highest resolutions are Jo = 5 and J = 8, respectively. 2 20 60 500 RMS2 5 100 73 3 2 RMS2 6 0 20 53 6 RMS2 7 0 6 32 21 RMS2 (%) 8 0 1 12 71

3

Figure 1. Histograms of the zj statistics for 1000 simulations of beam smoothed, constant power (( + 1)C = constant) largeangular-scale maps, 28. The dashed lines correspond to the values of zj for the actual DMR 53+90 GHz DMR.
250 200 150 100 50 0 -4 250 200 150 100 50 0

functions are still orthogonal and the null distributions are exactly as b efore (with smaller kj ). We note that with only a 10 galactic cut (i.e., with considerable galactic contamination) the values of the zj obtained from the DMR maps are 71.2, 5.9, and 1.6 for j = 4, 5 and 6, resp ectively. The histograms representing the simulations, which have only a CMB signal in them, remain virtually unchanged. As exp ected, more significant structure is detected when some of the galactic signal is present, and this structure is not consistent with a cosmological signal alone. At resolution 4 the mean of the simulations and the actual map are b oth considerably displaced from zero. At resolution 5 z5 is somewhat less displaced from zero and at resolution 6 z6 is consistent with zero. This is consistent with the b eam scale b eing right around the pixel size at resolution 5 so there is structure on that scale and at the lower resolution, but no structure at the higher resolution. As an aside, note that the covariance matrix = of is = W (C +
noise

)W t ,

(7)

-2

0 z6

2

4

6

0

2 z5

4

6

8

250 200 150 100 50 0

where C and noise are the covariance matrices of the model and noise, resp ectively. In general the matrix , which is used to transform the covariance matrix of to the identity, is =
-1/2

.

(8)

10

20

30 z4

40

50

It is useful to construct the standardized sum of squares, SSQj = (j )t j , where is a diagonal matrix that transforms the covariance matrix of the SHW coefficients to the identity. Assuming indep endent Gaussian noise the null distribution of SSQj , i.e., its distribution in the absence 2 of signal, is a kj N (kj , 2kj ), where kj is the subspace dimension at scale j . To search for structure in the map at scale j we use the statistic zj = SSQj - k 2k
j j

If we take no correlations from a cosmological model (C = 0) and the noise is uncorrelated and homoscedastic (i.e., noise I), then is diagonal. But is a lot more complicated when a cosmological model is assumed. Our analysis has not taken full advantage of the localization prop erties of wavelets. SHW can easily yield a spatial decomp osition. First, divide the sky into P disjoint patches. Each patch i can b e represented as J,i by filling with zeros those pixels not in patch i. All the J,i are orthogonal and RMS2 = 1 [(J,1 )t J,1 + · · · + (J,P )t J,P ]. w An angular-scale decomp osition like (5) can now b e applied to every term.

.

(6)

3

WAVELET DENOISING OF A MAP

With uncorrelated noise and known noise RMS the zj are centered at zero if and only if there is no structure at scale j , regardless of the Gaussianity of the noise. Also, if there is no structure then, by the central limit theorem, the zj are approximately Gaussian-distributed even with uncorrelated non-Gaussian noise. The distribution of zj is thus an indication of structure. We apply our statistic to test for structure in the COBE DMR map. Figure 1 shows histograms of the zj (Eq. 6) for 1000 simulations of large-angular-scale maps 28, with ( + 1)C = (24 /5)Q2ms-PS , with Qrms-PS = 15.3 µK, 7 r FWHM b eam smoothing and the noise level of the DMR 53+90 GHz map. The dashed lines in the figure corresp ond to the actual values of zj for the DMR 53+90 GHz map. The values are consistent with those exp ected from the simulations. Both the simulations and the DMR maps have a 20 galactic cut. SHW are easily adaptable to sky maps with Galactic cuts or with any others kinds of gaps. The wavelet
c 0000 RAS, MNRAS 000, 000­000

In the following sections we apply wavelet denoising with two different applications in mind. In the first application denoising will b e used to identify and remove localized sources from a CMB map. In the second application denoising will b e used to suppress instrumental, or other non-localized noise, and compress a map. In b oth op erations we use the same denoising technique. The techniques discussed are applicable to b oth spherical and flat space wavelets. Both typ es of wavelets will b e considered. Let di b e the measured temp erature in pixel i, we write di = Ts,i + Fi + ni , where Ts,i is the underlying temp erature at pixel i, Fi is a foreground source temp erature, and ni is a noise term describing noise sources which are not localized on the sky (e.g., instrumental, atmospheric ). The assumptions we make ab out Ts,i dep end on whether we remove sources or suppress noise in a map. In the first case we assume that Ts,i and ni have zero mean and are uncorrelated, and that Fi is fixed and unknown. Ts,i is assumed


4

L. Tenorio et al.
p ossible mean square error of the wavelet coefficients estimates. Under certain conditions minimax estimators are p osterior Bayesian estimates given a least favorable prior, see Lehmann & Casella (1998) for a more complete discussion regarding the relation b etween minimax and Bayesian estimators.

random, drawn from some p ower sp ectrum, and we attempt to discriminate fixed sources from random realizations of Ts in our sky. In the second application, when suppressing noise in a map, b oth Ts,i and Fi are assumed fixed. The goal is to suppress ni and thus obtain a b etter estimate of the actual realization of the signal in our sky. In either case the procedure consists of `denoising' by thresholding the wavelet coefficients of the data and transforming back. The difference b etween the two cases arises in what is assumed as noise during the thresholding stage. This will b e discussed in Section 3.2. We start by computing the wavelet transform of the map. Wavelet coefficients are then either set to or shrunk towards zero dep ending on whether their absolute values are ab ove or b elow a chosen threshold. Finally we compute the inverse wavelet transform of the thresholded coefficients. We call this procedure `denoising' (see Donoho 1992 for a discussion of this term.) Although the procedure is similar to the one often used with Fourier coefficients, Donoho (1992) showed that wavelet denoising efficiently supresses noise while preserving the sharpness of the original signal. This prop erty makes wavelet denoising app ealing for our applications. 3.1 Thresholding

3.2

Noise

`Thresholding' is a prescription for using the sample wavelet coefficients j,m to estimate the true wavelet coefficients, i.e., the wavelet coefficients of the noiseless map. We use thresholding functions of the form (Donoho 1992) (
j,m

)=

0

j,m

- +

j

j,m

j

if j,m j if |j,m | < j if j,m -j ,

(9)

where j,m is a wavelet coefficient of the map and j is a chosen threshold at scale j . A thresholding scheme corresp onds to a particular prescription for choosing the j . The j,m are now replaced by the (j,m ) which are used as the "denoised" wavelet coeffecients. Given a coarsest resolution Jo , Donoho & Johnstone (1994) suggest keeping the coefficients Jo where the signal dominates the noise and thresholding the rest. Wavelet thresholding is an active area of research. Comparing different thresholding methods is outside the scop e of this pap er. See Nason (1995) for a survey of thresholding methods. We use the SureShrink procedure of Donoho & Johnstone (1995). Ideally one wants to use the threshold that minimizes the mean square error (MSE) MSE =
m

( (

j,m

) - µj,m )2 ,

(10)

Denoising requires information ab out what constitutes noise and what its distribution is. When we use wavelets to identify foreground sources, Fi , b oth the cosmological signal Ts,i and the noise ni are assumed to b e random comp onents of the noise that contaminates the wavelet coefficients, i.e., 2 2 2 = CMB + noise , and the denoised wavelet coeffecients are identified with the sources alone. In the second application, when we only suppress noise in a map, only ni is considered noise, i.e., = noise , and the signal comp onent is taken to b e the cosmological CMB comp onent, Ts,i . These values of are used by SureShrink to compute the estimate of the MSE (10). In this pap er we assume that pixel noise is uncorrelated and Gaussian. This was approximately the case for the DMR maps (Lineweaver et al. 1994) but it may not b e a reasonable assumption for other exp eriments. For cases where the noise is correlated we p oint out the following. If the correlation structure is known then the map can b e transformed to the uncorrelated case; however, dep ending on the size of the covariance matrix, such a transformation might b e computationally exp ensive. Moreover, this transformation is not appropriate when searching for localized sources b ecause it destroys and widens their shap e. More generally, such a decorrelating transformation no longer gives a "sky map", but a linear combination of sky pixels. Only in the case of suitably mild correlations can such a decorrelated dataset still b e considered a map p er se. One should also note that for the process of identifying localized sources, when the signal Ts,i is considered part of the noise, the cosmological correlation of Ts,i must also b e included in the noise correlation structure. Dep ending on the amount of correlation, normalizing only by the variance (as for uncorrelated noise in Section 2) may b e sufficient. Johnstone & Silverman (1997) show that SureShrink still works well under mild conditions on the correlation structure. They show that long range correlations in the data are reduced by the wavelet transform, that there is almost no correlation b etween coefficients from different scales, and that Stein's MSE estimate is still unbiased.

where µj,m is the mean of the coefficient j,m . However, the means µj,m are unknown and therefore so is the MSE (10). To overcome this difficulty SureShrink uses Stein's unbiased estimate of the MSE (see Donoho 1992). SureShrink selects a threshold j at each scale j that is the b est, for rules of the form (9), in the sense that it minimizes Stein's estimate of the MSE (10). The computational cost at each level is O(kj log(kj )). Denoising a map with gaps is done in exactly the same way but using only those coefficients based on pixels outside the gaps (see App endix). SureShrink wavelet thresholding is asymptotically minimax (Donoho & Johnstone 1995), i.e., it minimizes the worst

4

ESTIMATION AND REMOVAL OF POINT SOURCES WITH WAVELETS

We now use the denoising procedure to reduce localized source contamination in sky maps. Here Ts,i and ni are assumed to have zero mean and to b e uncorrelated, while Fi is fixed and unknown. The goal is to estimate the foreground source field F. Since the noise and the cosmological model have zero mean (we assume that the CMB monop ole has b een subtracted), uncontaminated denoised wavelet coefficients should b e close to zero. Our goal is to find an estimate
c 0000 RAS, MNRAS 000, 000­000


Applications of Wavelets to CMB Maps
Figure 2. Estimating a source field (top left) from a noisy source field (top right) by thresholding Haar (bottom left) or Daubechies (bottom right) wavelet coefficients. All sources have identical signal, and the s/n is 3, where s/n is defined as (peak amplitude)/(RMS noise). Daubechies wavelets yield higher s/n estimates of the source location and shape.
Source Field Sources + Noise

5

removal. Note also that source estimates are very close to flat wherever there are no sources. This minimizes p otential systematic effects arising from false source identification. 4.1 Single Source in White Noise

6 4 2 0 -2 -4 -6 80 60 40 20 0 0 Haar 40 20 60 80

6 4 2 0 -2 -4 -6 80 60 40 20 0 0 Daubechies 40 20 60 80

6 4 2 0 -2 -4 -6 80 60 40 20 0 0 40 20 60 80

6 4 2 0 -2 -4 -6 80 60 40 20 0 0 40 20 60 80

To learn ab out the denoising process we first apply the procedure to a single source buried in white noise. Table 2 (left) shows the relative error in reconstructing the source. We cal^ culate RMS(F - F)/RMS(F) inside the source, where `inside' is defined as pixels where F > 1% of the noise RMS. This measure of relative error is an indication of p ower in the source field not accounted for in the estimated field. The s/n is the ratio of the source p eak to the noise RMS. The size is the ratio b etween the numb er of pixels inside the source and the total numb er of pixels in the patch in p ercent. The first and second rows corresp ond to Haar and Daub echies wavelets, resp ectively. In all cases Daub echies wavelets yield smaller relative errors. In Table 2 (right) we quantify to what extent the wavelet thresholding affects regions originally not contaminated by the source. We calcu^ late the ratio RMS(F - F)/RMS(noise) outside the source. The table shows that the ripples introduced in the originally uncontaminated region have an amplitude of less than 20% of the noise RMS, well within the noise level, and that this amplitude is very nearly constant with source size. 4.2 Spectrum of Source Intensity

^ F of the foreground field F by thresholding the wavelet coefficients. We then use a criterion to identify sources in the field. We suggest masking pixels in identified locations. To estimate the source field we normalize the coefficients using the matrix defined by Equation (8). To estimate the variability of the wavelet coefficients at each resolution, i.e., the diagonal elements of , we use the median absolute deviation. This measure is resistant to outliers and thus less affected by p oint sources. Once coefficients have b een normalized they are denoised using SureShrink. By assumption, the means in Eq. 10 satisfy µj,m = 0 for all j,m not contaminated by sources. Using SHW on the whole sky has some drawbacks. First, SHW are not smooth and therefore do not provide good estimates of smooth p eaks of localized sources. (Localized sources are at least as broad and smooth as the b eam of the exp eriment.) Also, since local sources are sparse they are difficult to detect when considering the entire sky. For noise dominated data SureShrink uses a large threshold prop orlog(kj ). These two problems are alleviated by tional to taking smaller sky sections where smooth planar wavelets can b e used. Patches are easily defined through the map pixelization by taking lower resolution pixels as patches. On each patch we use tensor products of one-dimensional Daub echies wavelets. (We found that b etter results were obtained with four vanishing moments, and that other wavelet bases do not lead to significantly different results from those obtained with Daub echies wavelets.) Note that SHW on small patches reduce to tensor products of Haar wavelets. Figure 2 shows the original (six sources of the same amplitude) and original plus white noise (s/n = 3, where s/n is the ratio of the p eak amplitude to the noise RMS) source fields on a patch, and thresholded estimates using planar Haar and Daub echies wavelets. Daub echies wavelets recover the p eaks of the sources b etter than Haar wavelets. Hence we use Daub echies wavelets for p oint source identification and
c 0000 RAS, MNRAS 000, 000­000

We will now attempt to identify and remove a more realistic distribution of sources from a map. We generate a source field using a p ower-law d N (s ) s
-2.5

ds,

(11)

where N is the numb er of sources and s is signal-to-noise ratio. (As b efore, s/n is the ratio of p eak amplitude and noise RMS.) Source centers are randomly distributed on the patch and only sources with s > 1 are included. The p ower-law exp onent, -2.5, is appropriate for a p opulation of sources with a single luminosity distributed randomly through an infinite, flat, euclidean space. The total numb er of sources in the patch is determined by the choice of different ratios (total area inside sources)/(total patch area), where `inside' is defined as in the previous section. The noise is assumed to b e either white or characterized by a Gaussian correlation function with damping scale as a fraction of the maximum wavenumb er. We used two Gaussian random fields, G1 and G2 , with damping scales 0.8 and 0.6, resp ectively. The field G1 has correlations of approximately 12%, 2% and 1% b etween first, second and third neighb ors, resp ectively. The correlations for the second Gaussian field, G2 , are 20%, 3% and 1%. Rather than using a particular CMB model, we used the Gaussian correlated noise as a generic model for a correlated signal. As we will see our results are not sensitive to the particular correlation structure assumed. To determine source locations in the thresholded map ^ we p erform a second thresholding on F. This is done by comparing pixel absolute magnitudes with k â(MAD), where ^ MAD is the median absolute deviation of F, for different values of the threshold k. Figures 3, 4 show the results of the p oint-source identification process for sources that were ab ove 1 and 3 ,


6

L. Tenorio et al.
identified as sources. To reduce the numb er of `incorrect detections' over the entire map, which presumably consists of many patches, we may choose not to sub ject every patch to this source cleaning procedure. Rather, it may b e advantageous to first calculate the value of zj for the patch, as describ ed in Section 2, and then proceed with the source cleaning procedure only on patches with abnormally large values of zj at some scale j . Further analysis and simulations are necessary to optimize the method. One can compare the p erformance of wavelets in identifying p oint sources to the p erformance achieved by a numb er of alternative techniques. A detailed study is in progress. Here we only compare with the simple practice of removing map pixels whose amplitudes are larger than k , where is the RMS of the noisy source field. The dotted lines in Figures 3, 4 corresp ond to a 1 and 3 cuts, resp ectively. Wavelets achieve a significantly higher prop ortion of correct detections, and less errors are made when searching for low s/n sources. Interestingly, the k cut has fewer errors when searching for high s/n sources. That suggests that the optimal p oint source identification procedure may involve a combination of techniques, with wavelets b eing the leading tool.

Figure 3. (a) Proportion of correctly detected source pixels that were above 1 as a function of Area for different thresholds k âMAD. `Area' is the fraction of pixels covered with sources, k is an integer and MAD is the mean absolute deviation of the denoised map - see text for more details. (b) Proportion of pixels outside the source that were incorrectly identified as source pixels. As a comparison, the dotted lines correspond to the process of detecting source pixels by setting a simple 1 cut on the map.
(a) 1 (b) k=2 k=3 0.1 0.6 k=5

0.8

0.4

0.05

0.2

0

0

5

10 15 Area (%)

20

25

0

0

5

10 15 Area (%)

20

25

Figure 4. (a) Proportion of correctly detected source pixels that were above 3 as a function of Area for different thresholds k âMAD. `Area' is the fraction of pixels covered with sources, k is an integer and MAD is the mean absolute deviation of the denoised map - see text for more details. (b) Proportion of pixels outside the source that were incorrectly identified as source pixels. As a comparison, the dotted lines correspond to the process of detecting source pixels by setting a simple 3 cut on the map.
(a) 1 (b) k=2 k=3 0.1 0.6 k=5

5

MAP DENOISING AND COMPRESSION

0.8

0.4

0.05

0.2

0

0

5

10 15 Area (%)

20

25

0

0

5

10 15 Area (%)

20

25

resp ectively. For each figure, plate (a) shows the prop ortion of correctly identified source pixels for k = 2, 3, 5 and plate (b) shows the prop ortion of pixels outside sources that were incorrectly identified as source pixels, for the same k values. The area is defined as the prop ortion of source pixels ab ove 1 . As k increases the probability of correct detection decreases as does that of incorrect detections. The results in Figures 3, 4 are for sources in white noise but no significant difference was observed with Gaussian correlated noise. Even a Gaussian field with damping scale 0.3 did not show significant differences. The figures show that wavelets are extremely effective in identifying p oint sources. When we are looking for sources of moderate s/n (Figure 4) which cover up to 5% of the patch area, virtually all source pixels are identified regardless of the value of k. For low s/n sources (Figure 3), a criterion of k < 3 identifies more than 80% of the source pixels. Comparing Figures 3 and 4 we see that for wavelet thresholding the prop ortion of false detections does not change much with the s/n of the source b eing searched for. In addition, simulations show that when there are no sources in the patch only ab out 1% of the pixels in the patch are incorrectly

In this section we apply wavelet denoising with the goal of compressing information and reducing noise for visualisation of CMB maps. Denoised maps are useful for displaying or transmitting compressed map files. Here we assume that the CMB sky, Ts , is fixed and that the random field we are trying to isolate and remove is the instrumental noise. Hence, only the noise variability is included in the matrix (Eq. 8), and it is estimated using only the wavelet coefficients at the highest resolution where we exp ect little structure. Alternatively, could b e estimated using the difference of two indep endent maps (as with the A - B DMR maps). This time SureShrink determines the threshold j that minimizes (10) with µj,m b eing the unknown wavelet coefficients of the fixed sky Ts . The denoised estimate of Ts is then W-1 ( ), the inverse transform of the thresholded coefficients. The computational cost is O(N log (N )). Note also that the estimate of the underlying temp erature field achieved in this process is model indep endent. In other words, it did not require a cosmological model. In other work Wiener filtering has b een used for denoising CMB maps (Bunn et al. 1996). The Wiener filter estimate of Ts is Cd, where the matrix C is determined by assuming a cosmological model for Ts and minimizing ||Cd - Ts ||2 . The process takes O(N 3 ) op erations and dep ends on the fiducial model chosen. Thus, while Wiener filtering provides a p osterior mean estimate given a prior cosmological model, wavelet thresholding provides an estimate of the actual realization of the unknown field without assuming a cosmological model. The Wiener filter can b e defined as the maximum-probability signal contribution given the data, the underlying signal p ower sp ectrum and noise correlation, and assuming Gaussianity for b oth signal and noise. SureShrink wavelet thresholding, on the other hand,
c 0000 RAS, MNRAS 000, 000­000


Applications of Wavelets to CMB Maps
^ Table 2. Left: Relative error RMS(F - F)/RMS(F) in estimating a field with a single source embedded in white noise. The relative error is calculated using only pixels inside the source, where 'inside' is defined in the text. The s/n is the ratio of the source peak to the noise RMS. The size is the percentage of pixels of the patch inside the source. The first and second rows correspond to Haar and Daubechies wavelets respectively. Right: the ratio of the residual RMS to the noise RMS outside the source. s/n 2 4 6 18 . . . . . . 33 26 23 17 19 13 11 . . . . . . 64 53 41 29 32 22 size (%) 5 3 . . . . . . 83 73 52 40 42 31 1.02 .93 .63 .51 .52 .41 1 1.34 1.28 .81 .74 .67 .57 18 . . . . . . 11 12 15 16 17 17 11 . . . . . . 11 11 15 16 16 16 size (%) 5 3 . . . . . . 11 11 15 16 16 16 . . . . . . 11 11 15 16 16 16 . . . . . . 1 10 10 15 15 16 16

7

Table 3. Noise reduction and compression achieved by SureShrink denoising of a standard CDM model for different signal to noise ratios. signal/noise .9 .7 .6 RMS(residuals)/RMS(noise) .82 .65 .57 % coefficients used 33 14 7

wavelet domain when the goal is to estimate C s and cosmological parameters. Assuming Gaussianity of b oth the cosmological signal and the noise, the log-likelihood of the data given a model with covariance matrix M is - 2ln L(pi |d) = = ln |M| + dt M
t -1

d M
-1

ln | | + W

-t

W

-1



(12)

is aymptotically minimax (see Section 3.1 and Donoho & Johnstone 1995). Figure 5 shows the denoising of a simulated standard CDM map with b = .05, = 1, H0 = 50 km/s/Mp c. The signal to noise in the signal plus noise map (middle) is .7, where the signal to noise is the ratio of the noise RMS to the signal RMS. There are 98304 pixels in the map. The denoised map (b ottom) achieved a 35% noise reduction, as measured by the ratio RMS(true map - denoised map)/(noise RMS), using only 14% of the 98304 wavelet coefficients. It also has only 14% of the original numb er of pixels, but pixels have a variety of sizes, adapted to the local structure as identified by the thresholding procedure. The entire denoising process took less than 6 seconds on an Alpha 500/266 workstation. The compression rate that denoising can achieve dep ends on the signal to noise. As it increases more coefficients have to b e used. Table 3 shows the noise reduction and compression rate for different signal to noise ratios using the same standard CDM model. Wavelet denoising is a flexible procedure in that it allows one to choose the threshold level which will achieve a particular compression rate regardless of the denoising achieved. In our case the thresholds were chosen to minimize an estimate of the mean square error (10) but this may not b e the optimal procedure. Dep ending on the goal at hand one may b e able to devise a thresholding procedure tailored for a particular task.

6

SHW AND COSMOLOGICAL INFORMATION

It has b ecome traditional to consider the cosmological information from CMB datasets in the context of likelihood functions, usually expressed in pixels or in the spherical harmonic domain. Since no information is lost by transforming to wavelet space we can ask whether it is b etter to work in
c 0000 RAS, MNRAS 000, 000­000

(up to an additive constant) where pi are the cosmological parameters and = Wd is the wavelet transform of the data d. M = C + noise is the sum of contributions from the signal and the noise covariance. In the second equality we have transformed from the map pixel basis to the wavelet domain (see also Eq. 7). To estimate the pi it is enough to consider the eigenmodes of C which contribute the most to L (this is the Karhunen-Loeve or "signal-to-noise eigenmode" transformation, Bond 1995). This simplifies the likelihood by compressing the data. The reduced likelihood can b e used to estimate cosmological parameters. Can we apply the same technique on the thresholded wavelet coefficients? Because the thresholding procedure is nonlinear the distribution of the denoised = ( ) is no longer Gaussian and therefore Eq. 12 written in terms of is no longer the correct log-likelihood. It is computationally exp ensive to compute the correct likelihood function of the thresholded coefficients for each plausible cosmological model. We can still maximize Eq. 12, with resp ect to pi , to estimate the parameters but the p erfomance of this procedure is still to b e investigated. The difficulties due to the non-linearity of the wavelet thresholding process can p erhaps b e resolved by treating the process as a linear op eration. That is, we consider the thresholding op eration of Eq. 9 as if the thresholds j are fixed b eforehand. Then, the new wavelets coeffecients (j,m ) can b e considered as linearly related to the old coeffecients: (j,m ) = j m j,m . This is similar to, for example, Wiener filtering, where each Fourier mode is modified by some factor 0 k 1 (although in that case the process is strictly linear since the k do not dep end up on the data). Thus, the denoised wavelet coeffecients can b e considered a much smaller linear rendition of the original data. The correlation structure may indeed b e much more complicated than the raw data, but b ecause there may b e many fewer coefficients, the O(N 3 ) op erations necessary for manipulating the likelihood are much faster. This procedure is under further investigation.


8

L. Tenorio et al.
Figure 6. Histograms of the z4 statistic for the same kind of simulations as in Figure 1 but with spectral index n = 0.5 (instead of n = 1), where n is the spectral index of the primordial desnity fluctuation power spectrum which in turn gives ( + 1)C n-1 . The distribution of z4 under this model is clearly different from that of the n = 1 model (Figure 1). The dashed line indicates the value of z4 for the DMR 53+90 GHz map.
250

Figure 5. Top: a simulated, signal only, CMB map based on a standard CDM model. There are about 98000 pixels in the map. Middle: the signal map with noise added. The s/n in this map is .7. Bottom: The noisy map after denoising with SureShrink. Only 14% of the wavelet coefficients were used, and the denoising process took less than 6 seconds on an Alpha 500/266 workstation.

200

150

100

-200.00

231.00

50

0

0

5

10

15

20 z4

25

30

35

40

-200.00

231.00

-200.00

231.00

Since working with the denoised coefficients seems problematic one could use the likelihood without denoising the coefficients. But then nothing seems to b e gained by working in wavelet domain since the quadratic term in (12) does not split into indep endent comp onents of different scales--the second equality ab ove is not easier to compute than its version in the pixel domain. Consider, for example, a quadratic estimator t EJ of C (e.g., Tegmark 1997; Bond, Jaffe & J Knox 1998). Since the wavelet transform acts as a band-pass filter, one might hop e to reduce leakage and decrease computational cost by considering only high resolution coefficients to estimate small-angular-scale (large ) C , splitting t EJ J as in (5) and taking only high resolution coefficients. There

are three problems: (1) Spherical harmonics have comp onents in the lower wavelet scales (Table 1) that need to b e estimated; (2) Wavelet coefficients j are orthonormal with resp ect to the identity but not with resp ect to any matrix E. In general the cross terms in a quadratic estimator using a decomp osition of the form (5) do not cancel; (3) The numb er of coefficients at each scale grows exp onentially; using only the highest resolutions yields little compression. For example, 93% of the coefficients of a resolution 10 map b elong to the two highest resolutions. Because of these problems, the estimator (like the likelihood function itself ab ove) does not app ear to b e any easier to compute in the wavelet domain. Another approach is to consider using the wavelet coefficients themselves as intermediate parameters in the process of cosmological parameter estimation. A homogeneous random field is characterized by its p ower sp ectrum C , which is in turn parametrized by the cosmological parameters. In principle cosmological parameters can b e estimated by fitting to the C . Power in the wavelet domain provides some information ab out cosmological parameters. For example, Figure 6 shows the distribution of the normalized p ower at resolution 4 (z4 ) for the same large-angular-scale models of Figure 1 but now using n = 0.5 (instead of n = 1), where n is the sp ectral index of the primordial density fluctuation p ower sp ectrum which in turn gives ( + 1)C n-1 . Comparing this histogram to the one in Figure 1 it is obvious that the distribution of z4 can discriminate b etween the two models. However, even future high resolution CMB maps will only b e of resolutions j 10, and only the highest resolutions j 7 will b e most useful for extracting cosmological parameters . Using the wavelet p ower at scales j = 7, 8, 9, 10 will not yield enough degrees of freedom to fit 11 cosmological parameters. This paucity of information in the wavelet domain is b ecause the numb er of indep endent wavelet p ower coefficients (the zj s) scales only as the logarithm of the numb er of pixels. In contrast the numb er of indep endent p ower sp ectrum coeffecients C (which completely determine the prop erties of the homogeneous and isotropic Gaussian CMB field) scales as the square root of the numb er of pixels. Usc 0000 RAS, MNRAS 000, 000­000


Applications of Wavelets to CMB Maps
ing the wavelet p ower as we have defined it here will not b e sufficient. We could instead consider abandoning b oth C and the wavelet p ower defined here in exchange for a sp ectrum more localized in b oth location and scale. The question is whether we can find a sp ectral representation of homogeneous random fields on the sphere in terms of localized functions. However, it is easy to show that there is no smooth basis {i,j }, with local supp ort decreasing with j , with the property that any homogeneous random field on the sphere can 2 i j where i,j , i ,j = j i j . b e written as i,j i,j i,j That is, any other "p ower sp ectrum" we define in terms of locally supp orted functions will have an extended "window function" in , as with the wavelet coeffecients of Section 2; the usual C sp ectrum is still the most natural choice.

9

Bayesian thresholding methods that can b e used to include more informative prior information on the wavelet coefficients. See Abramovich et al. (1997) and references therein. The development of wavelet tools in statistics is a very active field of research. For a review see Silverman (1999) as well as the other articles in the same volume. The computer code used for work presented in this pap er will soon b e freely available at http://cfpa.b erkeley.edu/combat. Acknowledgements. This work was done under the auspices of the COMBAT collab oration supp orted by the NASA AISR program, grant NAG5-3941. We are grateful to L. Cay´n, D. Donoho , P. Ferreira and P. Stark o for helpful suggestions. Simulations of CMB maps were simplified thanks to Seljack & Zaldarriaga's CMBfast and E.Scannapieco's CMAP (http://cfpa.b erkeley.edu/combat). We thank G. Smoot for allowing access to his Alpha cluster. S.H. was partially supp orted by NASA grant NAG53941, A.J. by NASA LTSA NAG5-6552, and L.T. by grants NAG5-3941 and DMS-9404276. Part of this work uses the Wavelab package, available from http://wwwstat.stanford.edu/wavelab/.

7

SUMMARY

Wavelets provide a useful tool to investigate local structure in maps. We used SHW to define a local measure of p ower that is equivalent to a normalized local angular scale decomp osition of the map RMS. This measure can b e used to compare p ower in different regions in the sky. Similar measures can b e defined with other bases of orthogonal wavelets. SHW have the advantage of b eing easily implemented given any map pixelization scheme. SHW can also b e used to denoise and compress maps without doing any diagonalization or inversion of matrices. We denoised and compressed a 100,000 pixel CMB map by a factor of 10 in 5 seconds achieving a noise reduction of ab out 35%. In contrast to Wiener filtering the compression process is model indep endent. The total computational cost of wavelet transforming and thresholding a region with N pixels is O(N log (N )). Since wavelets are locally supp orted they can b e used to reduce localized source contamination in maps. We applied planar Daub echies wavelets for the identification and removal of p oints sources from small planar-like sections of sky maps. Our technique successfully identified most p oint sources which are ab ove 3 and more than 80% of those ab ove 1 . In addition, wavelet thresholded estimates of source fields introduce little structure in uncontaminated regions. We have concentrated on local source detection and map compression. We did not find useful direct applications of spherical wavelets to the estimation of angular p ower sp ectra or cosmological parameters. Ideally we would like to have a wavelet basis which approximately diagonalizes the covariance matrix of the comological model as well as the noise covariance matrix, thus simplifying the likelihood function. This is not the case for SHW. But in the future there may b e hop e for spherical wavelets that are more compatible with spherical harmonics. See for example Freeden & Windheuser (1997), and references therein. They define smooth spherical wavelets that combine a spherical harmonic expansion for the low order terms and wavelet expansions for the small scale structure. However, they do not yet have a fast wavelet transform. In Section 3.1 we p ointed out that SureShrink wavelet thresholding provides p osterior estimates for a priori distributions of the wavelet coefficients that give the largest mean square error. Although we have not used them there are also
c 0000 RAS, MNRAS 000, 000­000

APPENDIX A: HAAR WAVELETS FOR THE COBE PIXELIZATION Spherical Haar wavelets (Sweldens 1995, Schroeder & Sweldens 1995) are a generalization of planar Haar wavelets. To define SHW we need a hierarchical pixelization scheme. One example of such is the COBE sky cub e (CSC) pixelization where the surface of the unit sphere is divided into 6 · 4(j -1) approximately equal area pixels at resolution j . Each pixel k at resolution j , Sj,k , has four children, Sj +1,k1 , ..., Sj +1,k4 , at resolution j + 1. Let K (j ) b e the pixel numb ers corresp onding to resolution j . The functions that model the main features of the data at each scale are scaling functions, and the functions that represent the details of the data not captured by the scaling functions are the wavelets. The scaling functions of the SHW are defined as j,k ( ) = 1, if Sj,k and 0 otherwise. Define Vj L2 as the closed subspace generated by the j,k : Vj = clos{j,k | k K (j )}. For example, a DMR sky map is an element of V6 . By definition Vj Vj +1 . The SHW are defined as an orthogonal basis for the orthogonal complement, Wj , of Vj in Vj +1 . By construction, for J -1 Wi . For example, a resoluany chosen J , VJ = Vj i=j tion 10 map is a sum of a coarser resolution 9 map plus a function from W9 encoding the details not captured by the coarser map, or a resolution 8 map plus detail functions from W8 and W9 . The functions that capture the leftover details are combinations of the wavelets at different resolutions: there are three wavelets, j,m1 j,m2 , j,m3 , associated to each pixel Sj,k . Let µj,k b e the area of pixel Sj,k and L(j, k) = {mo (k), m1 (k), m2 (k), m3 (k)} b e the pixel numb ers of its four children, then the three wavelets for pixel Sj,k are j, j,
m1

=

m2

j +1,mo + j +1,m3 j - 2µj +1,mo + 2µj +1,m3 2µj j +1,m1 j +1,m2 = - 2µj +1,m1 2µj +1,m2

+1,m1 +1,m1

+ j +1,m2 + 2µj +1,m2


10
j,
m3

L. Tenorio et al.
= j +1,mo j +1,m3 - 2µj +1,mo 2µj +1,m3 REFERENCES
Abramovich F., Sapatinas T. & Silverman B.W., 1998, J. Roy. Statist. Soc. Ser. B, 60, 725-749 Bond J.R., 1995, Phys. Rev. Lett., 74 Bond J.R., Jaffe A.H. & Knox L.E., 1998, Phys. Rev. D, in press; astro-ph/9708203 Bond J.R., Crittenden R.C, Jaffe A.H. & Knox L.E., 1999, Computers in Science & Engineering (to appear) Borrill J., 1999, Phys. Rev. D, 59, 7302, astro-ph/9712121. See also http://cfpa.berkeley.edu/borrill/cmb/quadest.html Bunn E. & Sugiyama N., 1995, ApJ, 446 Bunn E., Sugiyama N. & Silk J., 1996, ApJ, 464, 1 Cayon L. et al., 1999, submitted ´ Donoho D.L., 1992, Technical Report, Stanford University Donoho D.L. & Johnstone I.M., 1994, Biometrika, 81 Donoho D.L. & Johnstone I.M., 1995, J. Amer. Statist. Assoc., 90 Ferreira P., Magueijo J. & Silk J., 1997, Phys. Rev. D, 56, 4592 Freeden W. & Windheuser U., 1997, Appl. Comput. Harmonic Anal., 4 Hobson M.P., Jones A.W. & Lasenby A.N., 1999, MNRAS, submitted, astro-ph/9810200 Johnstone I.M. & Silverman B.W., 1997, J. Roy. Statist. Soc. Ser. B, 59, 319-351 Lehmann E.L. & Casella G., 1998, Theory of Point Estimation, second edition. Springer-Verlag Lineweaver C.H. et al., 1994, ApJ, 436, 2 Nason G.P., 1995, in Wavelets and Statistics, Springer-Verlag, 329 Pando J., Valls-Gabaud D. & Fang L., 1998, Phys. Rev. Lett. 81, 4568 Sanz J.L. et al. , 1999, in Oliveira-Costa A. & Tegmark M., eds, Microwave Foregrounds. ASP, San Francisco Schroeder P. & Sweldens W., 1995, Technical Report, University of South Carolina Silverman B.W., 1999, Phil. Trans. R. Soc. Lond. A (to appear) Sweldens W., 1995, Technical Report, University of South Carolina Tegmark M., astro-ph/9611174, Phys. Rev. D, 55, 5895 (1997)

The wavelets {j,mi (k) | i = 1, 2, 3; k K (j )} form a basis for Wj . In the limit, for any choice of Jo , {j,m , j j0 } {j0 ,k | k K (j0 )} is an orthogonal basis for the space of integrable functions on the sphere. For example, take a function T (e.g., a sky map) at a finest resolution level J (i.e., T VJ ) and choose a starting resolution Jo , then T can b e written as
J -1

T (i) =
kK (j0 )



j0 ,k

j0 ,k (i) +
j =j0 m



j,m

j,m (i).

(A1)

The first sum corresp onds to the lowest resolution map and the second one to the details from the higher resolutions. The coefficients in the expansion (A1) define the wavelet transform WT of T WT = = (jo , j o , · · · , ({
Jo , k J -1

)

t j,m

, k K (Jo )}, {

, (A2)

m M (j ), Jo j J - 1})t .

No linear system needs to b e solved in order to compute the comp onents of , they are obtained recursively starting from the finest resolution coefficients J = T
j,k

=
lK (j +1)

~ j, h

k,l j +1,l

, and

j,m

=
lK (j +1)

gj, ~

m,l j +1,l

.

To reconstruct a function given the wavelet coefficients we have a recursive inverse transform
j +1,l

=
k K (j )

hj,

k,l j,k

+
mM (j )

g

j,m,l j,m

.

The coefficients for the forward and inverse transforms defined ab ove are ~ hj,
k ,l

=

µj +1,l µj,k

0 1 0

if l L(j, k) otherwise

hj,

k ,l

=

if l L(j, k) otherwise
Sj +1,l

gj, ~

m,l

=

~ j,

m



if l L(j, k) otherwise

0
1 µj +1,k Sj +1,l

g

j,m,l

=

j,

m



if l L(j, k) otherwise,

0

~ where j,m is the dual of j,m . In order to avoid biasing analyses of CMB skies, data contaminated by Galactic emissions or other identified foreground sources are usually discarded. To compute a wavelet transform of data with gaps, we discard those pixels at the lowest resolution Jo which are inside the gaps. Only the wavelet coefficients correp onding to descendants of pixels outside the gaps are used. Note that the scaling and wavelet functions corresp onding to these selected pixels form an orthogonal basis for the sky with gaps. This is due to the local supp ort of the functions and to the vanishing of the integral of j,m over any pixel Sj,k .
c 0000 RAS, MNRAS 000, 000­000