Document retrieved from a search-engine cache. Address of the original document: http://www.mrao.cam.ac.uk/ppeuc/astronomy/papers/hobson/node7.html
Last modified: Tue Feb 24 13:18:53 1998

A Bayesian value for $\alpha$ may be found simply by treating it as another
parameter in our hypothesis space. This procedure is outlined for the case of
real images in Skilling (1989) and Gull & Skilling (1990), and we modify their
treatment here in order to accommodate complex images.

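After including $\alpha$ into our hypothesis space, the full joint probability
distribution can be expanded as

$$\Pr(\mathbf{h}, \mathbf{d}, \alpha) = \Pr(\alpha)\,\Pr(\mathbf{h}|\alpha)\,\Pr(\mathbf{d}|\mathbf{h}, \alpha) = \Pr(\alpha)\,\Pr(\mathbf{h}|\alpha)\,\Pr(\mathbf{d}|\mathbf{h}), \tag{39}$$

where in the last factor we can drop the conditioning on $\alpha$ since it is
$\mathbf{h}$ alone that induces the data $\mathbf{d}$. We then recognise
this factor as the likelihood. Furthermore, the second factor
$\Pr(\mathbf{h}|\alpha)$ can be identified as the entropic prior and so (39)
becomes

$$\Pr(\mathbf{h}, \mathbf{d}, \alpha) = \Pr(\alpha)\,\frac{e^{\alpha S(\mathbf{h})}}{Z_S(\alpha)}\,\frac{e^{-\chi^2(\mathbf{h})}}{Z_L} = \Pr(\alpha)\,\frac{e^{\alpha S(\mathbf{h}) - \chi^2(\mathbf{h})}}{Z_S(\alpha)\,Z_L}, \tag{40}$$

where $Z_S(\alpha)$ and $Z_L$ are respectively the normalisation constants for
the entropic prior and the likelihood such that the total probability density
function in each case integrates to unity. For convenience we have dropped the
explicit dependence of the cross entropy $S(\mathbf{h}, \mathbf{m}_u, \mathbf{m}_v)$
on the models $\mathbf{m}_u$ and $\mathbf{m}_v$.
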
Since we have assumed the instrumental noise on the data to be Gaussian, the
likelihood function is also Gaussian and so the normalisation factor $Z_L$ is
easily found. Evaluating the appropriate Gaussian integral gives

$$Z_L = \pi^{n_f}\,|\mathbf{N}|,$$

where $n_f$ is the dimension of the complex data vector $\mathbf{d}$ and is
equal to the number of observing frequencies that make up the Planck Surveyor
data set; $|\mathbf{N}|$ is the determinant of the noise covariance matrix defined
in (7).

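
As a quick numerical sanity check of this normalisation, the sketch below assumes that the likelihood has the complex-Gaussian form $\exp(-\boldsymbol{\epsilon}^\dagger \mathbf{N}^{-1} \boldsymbol{\epsilon})$ and that the noise covariance matrix is diagonal, so that $\pi^{n_f}|\mathbf{N}|$ factorises into one factor $\pi\sigma^2$ per frequency channel. The function names and the example variances are illustrative and are not taken from the paper.

```python
import math


def z_l_diagonal(noise_variances):
    """Closed form pi**n_f * |N| for a diagonal noise covariance (assumed form)."""
    n_f = len(noise_variances)
    return math.pi ** n_f * math.prod(noise_variances)


def z_l_one_channel_quadrature(variance, half_width=8.0, steps=200):
    """Midpoint-rule integral of exp(-|eps|^2 / variance) over the complex plane."""
    sigma = math.sqrt(variance)
    limit = half_width * sigma            # the integrand is negligible beyond this
    dx = 2.0 * limit / steps
    total = 0.0
    for i in range(steps):
        x = -limit + (i + 0.5) * dx       # real part of the residual
        for j in range(steps):
            y = -limit + (j + 0.5) * dx   # imaginary part of the residual
            total += math.exp(-(x * x + y * y) / variance)
    return total * dx * dx


if __name__ == "__main__":
    variances = [0.7, 1.3, 2.0]           # hypothetical per-channel noise variances
    numeric = math.prod(z_l_one_channel_quadrature(v) for v in variances)
    print(numeric, z_l_diagonal(variances))
```

The two printed numbers should agree closely; for a non-diagonal $\mathbf{N}$ the same result follows after rotating to the eigenbasis of the covariance matrix.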
The normalisation factor $Z_S(\alpha)$ for the entropic prior is more difficult
to calculate since this prior is not Gaussian in shape. Nevertheless, we find
that a reasonable approximation to $Z_S(\alpha)$ for all $\alpha$ may be
obtained by making a Gaussian approximation to the prior at its maximum, which
occurs at $\mathbf{h}_m = \mathbf{m}_u - \mathbf{m}_v$. As discussed in
Appendix A, the Hessian matrix of the entropy at this point is given by
$-\mathbf{G}_m^{-1}$, where $\mathbf{G}_m$ is the metric on image space evaluated
at the maximum of the prior $\mathbf{h} = \mathbf{h}_m$; the metric matrix is real
and diagonal. Remembering that $S(\mathbf{h}_m) = 0$ and using the Gaussian
approximation, $Z_S(\alpha)$ is then given by

$$Z_S(\alpha) \approx (2\pi)^{n_c}\,\alpha^{-n_c}\,|\mathbf{G}_m|, \tag{41}$$

where $n_c$ is the dimension of the complex (hidden) image vector $\mathbf{h}$
and is equal to the number of physical components present in the
simulations.

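
The quality of this Gaussian approximation can be probed numerically for a single real degree of freedom. The sketch below assumes the positive/negative cross entropy $S(h) = \psi - m_u - m_v - h \ln[(\psi + h)/(2 m_u)]$ with $\psi = (h^2 + 4 m_u m_v)^{1/2}$; this form is not restated in this section and is adopted here only for illustration. It peaks at $h_m = m_u - m_v$ with $S = 0$ and second derivative $-1/\psi$, so the Gaussian estimate per real degree of freedom is $(2\pi\psi_m/\alpha)^{1/2}$, and a complex component contributes the square of this factor. The function names and parameter values are hypothetical.

```python
import math


def entropy(h, m_u, m_v):
    """Assumed positive/negative cross entropy for one real degree of freedom."""
    psi = math.sqrt(h * h + 4.0 * m_u * m_v)
    return psi - m_u - m_v - h * math.log((psi + h) / (2.0 * m_u))


def z_s_numeric(alpha, m_u, m_v, half_width=60.0, steps=120000):
    """Midpoint-rule integral of exp(alpha * S(h)) centred on the prior maximum."""
    h_max = m_u - m_v
    dh = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        h = h_max - half_width + (i + 0.5) * dh
        total += math.exp(alpha * entropy(h, m_u, m_v))
    return total * dh


def z_s_gaussian(alpha, m_u, m_v):
    """Gaussian approximation about the maximum, where psi_m = m_u + m_v."""
    psi_m = m_u + m_v
    return math.sqrt(2.0 * math.pi * psi_m / alpha)


if __name__ == "__main__":
    for alpha in (1.0, 5.0, 50.0):
        ratio = z_s_numeric(alpha, 1.0, 1.0) / z_s_gaussian(alpha, 1.0, 1.0)
        print(alpha, ratio)
```

Under these assumptions the ratio of the numerical integral to the Gaussian estimate approaches unity as $\alpha$ grows, since the prior then becomes narrow compared with the scale on which the entropy departs from a quadratic.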
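Now, returning to (40), in order to investigate more closely the
role of $\alpha$, we begin by considering the joint probability distribution
$\Pr(\mathbf{d}, \alpha)$, which may be obtained by integrating out $\mathbf{h}$
in (40):

$$\Pr(\mathbf{d}, \alpha) = \frac{\Pr(\alpha)}{Z_S(\alpha)\,Z_L} \int e^{\alpha S(\mathbf{h}) - \chi^2(\mathbf{h})}\,\mathrm{d}\mathbf{h} \equiv \Pr(\alpha)\,\frac{Z_\Phi(\alpha)}{Z_S(\alpha)\,Z_L}, \tag{42}$$

where we have defined the normalisation integral $Z_\Phi(\alpha)$, with
$\Phi(\mathbf{h}) = \chi^2(\mathbf{h}) - \alpha S(\mathbf{h})$. In order to
calculate $Z_\Phi(\alpha)$, we follow a similar approach to that used to
calculate $Z_S(\alpha)$ and make a Gaussian approximation to
$\exp[-\Phi(\mathbf{h})]$ about its maximum at $\hat{\mathbf{h}}$. The required
Hessian matrix $\mathbf{H}$ is given by (38) evaluated at $\hat{\mathbf{h}}$.
Let us, however, define a new matrix $\mathbf{M}$ that is given by

$$\mathbf{M} = \hat{\mathbf{G}}^{1/2}\,\mathbf{H}\,\hat{\mathbf{G}}^{1/2}, \tag{43}$$

with $\hat{\mathbf{G}}$ the metric evaluated at $\hat{\mathbf{h}}$. The integral
is then approximated by

$$Z_\Phi(\alpha) \approx (2\pi)^{n_c}\,e^{\alpha S(\hat{\mathbf{h}}) - \chi^2(\hat{\mathbf{h}})}\,|\hat{\mathbf{G}}|\,|\mathbf{M}|^{-1}. \tag{44}$$
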
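Thus, substituting into (42) the expressions for $Z_S(\alpha)$ and
$Z_\Phi(\alpha)$ given by (41) and (44) respectively, we
find that in the Gaussian approximation the joint probability distribution
$\Pr(\mathbf{d}, \alpha)$ has the form

$$\Pr(\mathbf{d}, \alpha) = \Pr(\alpha)\,\Pr(\mathbf{d}|\alpha) = \Pr(\alpha)\,\frac{\alpha^{n_c}\,|\hat{\mathbf{G}}|}{Z_L\,|\mathbf{G}_m|\,|\mathbf{M}|}\,e^{\alpha S(\hat{\mathbf{h}}) - \chi^2(\hat{\mathbf{h}})}.$$
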
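Now, in order to obtain a Bayesian estimate for $\alpha$, we should choose an
appropriate form for the prior $\Pr(\alpha)$. Nevertheless, for realistically
large data sets, the distribution $\Pr(\mathbf{d}|\alpha)$ is so strongly
peaked that it overwhelms any reasonable prior on $\alpha$, and so we assign
the Bayesian value $\hat{\alpha}$ of the regularisation constant to be that
which maximises $\Pr(\mathbf{d}|\alpha)$. Taking logarithms we obtain

$$\ln \Pr(\mathbf{d}|\alpha) = n_c \ln\alpha + \alpha S(\hat{\mathbf{h}}) - \chi^2(\hat{\mathbf{h}}) - \ln|\mathbf{M}| + \mathrm{constant},$$

where the constant collects the terms with no explicit dependence on $\alpha$.
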
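Differentiating with respect to $\alpha$, and noting that the
$\hat{\mathbf{h}}$-derivatives cancel, we find

$$\frac{\partial}{\partial\alpha} \ln \Pr(\mathbf{d}|\alpha) = \frac{n_c}{\alpha} + S(\hat{\mathbf{h}}) - \mathrm{Tr}\!\left(\mathbf{M}^{-1}\,\frac{\partial\mathbf{M}}{\partial\alpha}\right), \tag{45}$$

where we have used the identity

$$\frac{\partial}{\partial\alpha} \ln|\mathbf{A}| = \mathrm{Tr}\!\left(\mathbf{A}^{-1}\,\frac{\partial\mathbf{A}}{\partial\alpha}\right),$$

which is valid for any non-singular matrix $\mathbf{A}$. From
(43), however, we see that
$\partial\mathbf{M}/\partial\alpha = \hat{\mathbf{G}}^{1/2}\,(\partial\mathbf{H}/\partial\alpha)\,\hat{\mathbf{G}}^{1/2} = \mathbf{I}$.
Substituting this relation into (45) and equating
the result to zero, we find that in order to maximise
$\Pr(\mathbf{d}|\alpha)$, the parameter $\alpha$ must satisfy

$$-\alpha S(\hat{\mathbf{h}}) = n_c - \alpha\,\mathrm{Tr}(\mathbf{M}^{-1}).$$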
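
The structure of this condition is easiest to see in the eigenbasis of $\mathbf{M}$. The sketch below assumes, as the relation $\partial\mathbf{M}/\partial\alpha = \mathbf{I}$ implies, that $\mathbf{M} = \mathbf{B} + \alpha\mathbf{I}$ for some $\alpha$-independent matrix $\mathbf{B}$ with eigenvalues $\lambda_i$. Then $\ln|\mathbf{M}| = \sum_i \ln(\lambda_i + \alpha)$, $\mathrm{Tr}(\mathbf{M}^{-1}) = \sum_i 1/(\lambda_i + \alpha)$, and the combination $n_c - \alpha\,\mathrm{Tr}(\mathbf{M}^{-1})$ reduces to $\sum_i \lambda_i/(\lambda_i + \alpha)$. The code checks the log-determinant identity by finite differences and evaluates that combination; the eigenvalues and function names are hypothetical.

```python
import math


def log_det_m(eigenvalues, alpha):
    """ln|M| for M = B + alpha * I, given the eigenvalues of B."""
    return sum(math.log(lam + alpha) for lam in eigenvalues)


def trace_m_inverse(eigenvalues, alpha):
    """Tr(M^{-1}) for M = B + alpha * I."""
    return sum(1.0 / (lam + alpha) for lam in eigenvalues)


def rhs_of_condition(eigenvalues, alpha):
    """n_c - alpha * Tr(M^{-1}), with n_c the number of eigenvalues."""
    return len(eigenvalues) - alpha * trace_m_inverse(eigenvalues, alpha)


if __name__ == "__main__":
    eigs = [250.0, 40.0, 3.0, 0.2]   # hypothetical eigenvalues of B
    alpha = 2.0
    step = 1e-6
    # Central finite difference of ln|M| with respect to alpha.
    finite_diff = (log_det_m(eigs, alpha + step)
                   - log_det_m(eigs, alpha - step)) / (2.0 * step)
    print(finite_diff, trace_m_inverse(eigs, alpha))
    print(rhs_of_condition(eigs, alpha),
          sum(lam / (lam + alpha) for lam in eigs))
```

Each eigen-direction with $\lambda_i \gg \alpha$ contributes nearly one to the sum and each with $\lambda_i \ll \alpha$ contributes nearly zero, which is why this quantity is commonly read in the maximum-entropy literature as the number of good measurements.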