Abstract
We present a Bayesian hierarchical framework to analyze photometric galaxy survey data with stellar population synthesis (SPS) models. Our method couples robust modeling of spectral energy distributions with a population model and a noise model to characterize the statistical properties of the galaxy populations and real observations, respectively. By self-consistently inferring all model parameters, from high-level hyperparameters to SPS parameters of individual galaxies, one can separate sources of bias and uncertainty in the data. We demonstrate the strengths and flexibility of this approach by deriving accurate photometric redshifts for a sample of spectroscopically confirmed galaxies in the COSMOS field, all with 26-band photometry and spectroscopic redshifts. We achieve a performance competitive with publicly released photometric redshift catalogs based on the same data. Prior to this work, this approach was computationally intractable in practice due to the heavy computational load of SPS model calls; we overcome this challenge by the addition of neural emulators. We find that the largest photometric residuals are associated with poor calibration for emission-line luminosities and thus build a framework to mitigate these effects. This combination of physics-based modeling accelerated with machine learning paves the path toward meeting the stringent requirements on the accuracy of photometric redshift estimation imposed by upcoming cosmological surveys. The approach also has the potential to create new links between cosmology and galaxy evolution through the analysis of photometric data sets.
Export citation and abstract BibTeX RIS
![](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99license.gif)
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
Deriving accurate redshifts and redshift distributions from photometry alone is of central importance for the scientific exploitation of modern galaxy surveys. In particular, cosmological analyses involving galaxy clustering and weak gravitational lensing require exquisitely accurate estimates of the redshift distributions of the selected galaxy populations (see, e.g., Hildebrandt et al. 2020; Abbott et al. 2022). Photometric redshifts (photo-z's) of individual galaxies are also useful for selecting targets of interest (e.g., at high redshift).
Photo-z's are typically obtained via two types of methods: template fitting, and machine learning. 8 Template fitting (e.g., Benitez 2000; Ilbert et al. 2006, 2006; Brammer et al. 2008; Acquaviva et al. 2015; Tanaka 2015; Carnall et al. 2018; Battisti et al. 2019) relies on lists of grids or templates for the spectral energy distributions (SEDs) of the galaxies considered, combined with a prior on the parameters involved (galaxy type, magnitude, etc). Machine-learning techniques (e.g., Collister & Lahav 2004; Kind & Brunner 2013) involve flexible models trained on spectroscopic or synthetic data. Despite their past successes, both these approaches are unable to satisfy the accuracy requirements imposed by upcoming Stage IV surveys, as they are not robust to errors in the data (calibration offsets, underestimated uncertainties), in the SED model, or to the non-representativity or sample variance of the spectroscopic data used for training and validation (e.g., Hartley et al.2020; Gatti et al. 2021; Myles et al. 2021; Newman & Gruen 2022). For example, the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will require the uncertainty on the mean of redshift distributions to be smaller than 0.001(1 + z) by its Year 10 data release (The LSST Dark Energy Science Collaboration et al. 2018), which in turn imposes very stringent requirements on the accuracy of individual photo-z's.
We introduce a framework capable of addressing these needs: a hierarchical model of galaxy photometry, consisting of three components describing a process for generating survey data:
- 1.galaxy SEDs generated with stellar population synthesis;
- 2.a population model describing the distributions of the intrinsic parameters of galaxies in the sample analyzed;
- 3.a noise model for generating photometric observations (i.e., a likelihood function).
This provides a powerful combination of empirical modeling with physics-driven components, incorporating known (astro)physics while leaving sufficient room for data-driven corrections, addressing the aforementioned issues of flexibility, interpretability, miscalibration and extrapolation outside of training data. Thus, once applied and calibrated to large survey data sets, it has the potential to yield both precise and accurate inferences on intrinsic galaxy parameters, including redshift. Building on previous work on hierarchical modeling in photometric surveys (Leistedt et al. 2016; Jones & Heavens 2018; Sánchez & Bernstein 2018; Rau et al. 2019; Alarcon et al. 2020; Sánchez et al. 2020), our framework takes advantage of the increasing accuracy and flexibility of SPS, and of methodological innovations in Bayesian inference for large hierarchical models, in particular involving machine-learning-facilitated acceleration of computationally intensive analysis steps.
In this paper, we demonstrate the efficacy of this framework by applying it to the COSMOS2020 data set (Weaver et al. 2022). For the choice of SED model, we consider Prospector-
The outline of this paper is as follows. In Section 2 we describe our framework in the form of a generative model of photometric data. An inference formalism is presented in Section 3. We describe the data in Section 4 and the results in Section 5. We discuss the results in further detail in Section 6 and conclude in Section 7. In what follows, a Planck 2015 (Ade et al. 2016) and a Chabrier (2003) initial mass function are adopted for all relevant calculations. p( · ∣ · ) refers to the probability density of the quantity before the "∣," conditioned on quantities behind it. We work with fluxes in AB units.
2. Generative Model for Galaxy Photometry
Hierarchical models tackling photometric redshifts and redshift distributions have been developed (Leistedt et al. 2016; Jones & Heavens 2018; Sánchez & Bernstein 2018; Rau et al. 2019; Alarcon et al. 2020; Sánchez et al. 2020), but typically assume a fixed SED model. Leistedt et al. (2019) pioneered the approach of inferring redshifts jointly with the calibration of the hyperparameters of an SED model. However, like most photo-z methodologies (Ilbert et al. 2006; Brammer et al. 2008; Acquaviva et al. 2015; Tanaka 2015; Carnall et al. 2018; Battisti et al. 2019), the model still relied on a curated set of SED templates, which made it difficult to construct more structured, interpretable corrections to the SEDs or to the population model. Our work introduces the use of full continuous stellar population synthesis (SPS) models for redshift estimation, tailored to the multilevel calibration and inference that will be needed for the upcoming generation of wide-deep photometric surveys (Newman & Gruen 2022).
2.1. SED Model
SPS provides a powerful way to model the SED of a galaxy. It exploits fundamental principles to generate and sum the contributions of "simple" stellar populations (as observed in, e.g., star clusters) and to apply the effects of additional complications, such as dust and nebular emission, in order to create galaxy SEDs within a forward modeling framework. This step involves a set of specific modeling choices for these components; for example an explicit star formation history (SFH). We use the Flexible Stellar Population Synthesis (FSPS; Conroy et al. 2009a, 2009b; Conroy & Gunn 2010) code, accessed through the python-FSPS binding (Foreman-Mackey et al. 2014).
For the SED model we consider a variation of Prospector-
- 1.Eight parameters describing the SFH: one parameter for the total stellar mass formed
[M⊙] (integral of the SFH); one parameter for the stellar metallicity
, assumed to be the same for all stars in the galaxy; and six parameters for the relative age SFH bins (ratios ri of the piece-wise SFR in adjacent temporal bins), where the seven time bins are spaced following Leja et al. (2020).
- 2.
- 3.A gas-phase metallicity parameter
for the nebular emission (decoupled from the stellar metallicity) modeled after the grids of Byler et al. (2017).
- 4.Two parameters, logfAGN and
τ AGN, for the AGN torus emission model of Nenkova et al. (2008). - 5.Redshift, z.
The SED includes dust heating from stars via energy balance, via a dust SED of fixed shape (Draine & Li 2007).
2.2. Emission Lines
Prospector-
We model offsets and uncertainty in the strength of the emission lines by encoding them into parametric bias and variance parameters. For a set of SPS parameters
![Equation (1)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn1.gif)
where ℓj is the amplitude of the jth line,
2.3. Synthetic Photometry
We can obtain total model fluxes in the bth band by integrating the SED through the bandpass filter Wb
(
![Equation (2)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn2.gif)
where dL
(z) is the luminosity distance and
2.4. Population Model
So far we have a mechanism for using SPS to generate the photometry of a single galaxy given its intrinsic properties,
We adopt minor variations on the fiducial Prospector-
![Equation (3)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn3.gif)
Most parameters have simple uniform or log-uniform priors, except in the cases discussed below.
Table 1. Summary of Parameters and Priors Describing the SPS Model
Parameter | Prior Bounds | Prior |
---|---|---|
![]() | [7, 13] | Uniform(7, 13) |
![]() | [−1.98, 0.19] | Uniform (−1.98, 0.19) |
![]() | [−5, 5] | Stu(0, 0.3, 2) |
| [0, 4] |
![]() |
| [0, 2] |
![]() |
n | [−1, 0.4] |
![]() |
logfAGN | [10−5, 150] | LogUniform(10−5, 150) |
| [−2, 0.5] | LogUniform(5, 150) |
![]() | [−2, 0.5] |
![]() |
z | [0, 2.5] | Uniform(0, 2.5) |
Note.
refers to a truncated normal distribution of location l, scale s, and in the range [m, M]. Our fiducial values for the parameters below are
,
Download table as: ASCIITypeset image
For the gas-phase metallicity, we take a normal
9
prior of mean at the stellar metallicity and standard deviation where
Finally, we collect the following hyperparameters in a vector
2.5. Noise Model
The final stage of the generative model is to simulate observed photometry from the model flux Fb
, by adding noise. The noise model can be fully characterized using a likelihood function. We use a scaled and translated Student's-t distribution
10
with two degrees of freedom, which is more robust to outliers than a normal distribution, since it has heavier tails:
![Equation (4)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn4.gif)
Aside from the observational uncertainty
We write the model uncertainty as the sum (in quadrature) of a fraction of the total model flux and a fraction of the total line fluxes,
![Equation (5)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn5.gif)
where Ljb
is the photometric flux of the jth line in the bth band, i.e., the result of applying Equation (2) to ℓj
3. Inference Formalism
We now assemble the components presented in the previous section into a hierarchical model for generating and analyzing photometric data. We describe an inference methodology and also computational acceleration using machine learning.
3.1. Hierarchical Model
A summary of the parameters of our model is provided in Table 2. A graphical representation is provided in Figure 1. Each galaxy has 15 intrinsic SPS parameters (including redshift), as described in Section 2.1, and in Sections 2.2–2.5 we have introduced five sets of hyperparameters, aiming to add some targeted sources of flexibility into the model. Furthermore, a hierarchical modeling approach has the ability to fix subsets of parameters, and to observe the effect on the others, offering additional robustness diagnostics.
Table 2. Notation for All Model Parameters
Parameter | Description |
---|---|
Hyperparameters | |
| Zero-point offset (for bth band, relative to the i band) |
| Additional flux uncertainty contribution for the bth band (fraction of the total model flux) |
| Additional flux for the jth emission line (fraction of the line strength in FSPS) |
| Additional flux uncertainty contribution for the jth emission line (fraction of the line strength in FSPS) |
| Hyperparameters describing the galaxy population model |
Latent parameters | |
| Stellar population parameters describing the rest-frame spectrum (per galaxy) |
z | Redshift (per galaxy) |
Derived parameters | |
ℓSPS(
| Flux energy density predicted by FSPS (per galaxy) |
ℓj
(
| Amplitude of emission line (Dirac delta model) in FSPS (per galaxy) |
Ljb
(
| Flux of the jth emission line in the bth band (per galaxy), i.e., ℓj inserted in Equation (3) |
Fb
(
| Total model flux in the bth band, with all contributions (per galaxy), i.e., Equation (1) inserted into Equation (3) |
Additional flux uncertainty in the bth band, with all contributions (per galaxy), see Equation (5) | |
Data | |
![]() | Measured flux in the bth band (per galaxy) |
| Flux measurement uncertainty in the bth band (per galaxy) |
Note. Vectors are indicated with bold symbols. Our total log likelihood and graphical model presented below include an extra index i per galaxy, which we have omitted in this table and all other equations in the paper for simplicity.
Download table as: ASCIITypeset image
Figure 1. Graphical representation of the main parameters involved of our model. Circles are inferred random variables, shaded circles are observed and dots indicate random variables that are deterministic in their inputs.
Download figure:
Standard image High-resolution imageThe full posterior distribution of our model (adding a subscript i for galaxies) is
![Equation (6)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn6.gif)
3.2. Priors
In our approach it is straightforward to encode prior knowledge or constraints on corrections such as zero-points or emission-line contributions, as the prior on hyperparameters appears explicitly as p(
Table 3. Priors on Hyperparameters
Hyperparameter | Prior |
---|---|
![]() | Laplace(0, 10) |
![]() | Normal(0, 1) |
![]() | Normal(1, 1) |
![]() | Normal(0, 1) |
Note. The numbers in parentheses indicate the location and scale of the distribution.
Download table as: ASCIITypeset image
We find that the Laplace prior for each
3.3. Optimization Strategy
Our inference strategy is to optimize the posterior distribution with respect to all parameters. This is because the mixture of broad bands (typically deeper) and noisier intermediate or narrow bands makes it possible to pin down SED features accurately and unequivocally. As a result, it is possible to derive estimates of the hyper- and latent parameters through optimization, without the need for more involved inference strategies, such as MCMC. We adopt the Adam optimizer with a learning rate of 10−3.
Challenges that need to be overcome at this stage involve the computational needs of SPS calls, and the fact that they are not differentiable analytically (although differentiable SPS models are now being developed, Hearin et al. 2021; Alarcon et al. 2023). We address these issues with neural emulators of the model predictions, which are both ∼104 faster than native SPS calls and differentiable (Alsing et al. 2020). We return to the details of our neural emulators in the next section.
In principle, one could determine all parameters in a single run. But, for the purpose of this demonstration study, we adopt a two-stage approach (e.g., Laigle et al. 2016; Weaver et al. 2022) that is commonly used and leverages the availability of spectroscopic redshifts. We first determine the hyperparameters with redshifts fixed at the values acquired via spectroscopy (optimizing all SPS parameters other than the redshift). A second run is then performed by fixing the hyperparameters at the inferred values, and only optimizing the latent SPS parameters (now including the redshift). The resulting optimized redshifts are denoted (omitting the galaxy index).
Finally, we employ bijectors to map the original parameters to new variables which have zero-mean, unit-variance normal priors. These are easier to work with than the original variables, since they do not have boundaries, which facilitates optimization (and sampling). The parameter bounds are shown explicitly in Table 1.
3.4. Emulators
As previously mentioned, one drawback associated with SPS models is the computational cost of each model call, which is the result of the many intensive numerical steps involved in predicting flux densities and photometry from stellar isochrones, initial mass function and the other ingredients of SPS. While this has been overcome with MCMC in the past and applied to medium-sized (∼105 objects) data sets (e.g., Leja et al. 2019), photometric galaxy surveys are approaching the order of a billion objects (The LSST Dark Energy Science Collaboration et al. 2018; Abbott et al. 2022), so brute-force inference is not practical. Hyperparameters or model calibration would make the computational load even heavier. We now discuss how we employ emulators 11 of fluxes and line emission in order to overcome this challenge.
3.4.1. SPS Fluxes
For each band, we use the methodology described in Alsing et al. (2020) to train an emulator to predict the photometric flux, i.e., ℓSPS fed into Equation (3). While, in principle, one could train emulators to predict the ℓSPS(
3.4.2. Emission-line Fluxes
Our model photometry has additional contributions from emission lines, so for any parameter vector (
3.4.3. Emulator Accuracy
When comparing the initial and the emulator-reconstructed fluxes (computed from fluxes), we find that they are accurate at a level better than 1% (99th percentile of the magnitude differences in the range spanned by the COSMOS2020 data below). This will be comfortably covered by
4. Data
We now construct a data set that demonstrates that our framework is able to produce accurate photometric redshift inferences from a benchmark sample. This consists of selecting broad- and narrow-band photometry in the COSMOS field, and cross-matching with a selection of spectroscopic data sets to enable validation of inferred redshifts.
4.1. Photometry
Weaver et al. (2022) presented COSMOS2020, the latest compilation of imaging in the COSMOS field. Two sets of photometric measurements were performed on these data: Classic and Farmer. For our case study we choose Farmer, which is based on the model-fitting software The Tractor. 13 The photo-z's published with this data set were derived using Eazy (Brammer et al. 2008), as well as LePhare (Ilbert et al. 2006, 2008) using slightly different subsets of bands. While we will compare our results to both of these methods, we adopt the same subset of selected bands as Eazy; its SED templates are also derived using FSPS, thus providing a more direct comparison. This selection of bands exclude the Subaru Suprime-Cam broad bands (shallower than other filters at similar wavelengths) and the GALEX bands (shallow and with broad PSFs) (Weaver et al. 2022). We apply the recommended "combined" mask, which retains the deepest regions with the greatest number of available bands and removes areas corrupted by bright stars and artifacts in all the relevant bands.
This photometric data set is prepared for further analysis using the code released with the COSMOS2020 data, which applies various flux corrections (including Galactic extinction) and unit conversions.
4.2. Spectroscopic Redshifts
We compile the positions and spectroscopic redshifts from three large campaigns covering the COSMOS field: zCosmos-bright (Lilly et al. 2007), DEIMOS (Hasinger et al. 2018)], and C3R2 (Masters et al. 2017, 2019; Stanford et al. 2021).
zCosmos-bright (Lilly et al. 2007) is a highly representative sample of 20,000 bright (i* ≤ 22.5) galaxies. DEIMOS (Hasinger et al. 2018) is a sample of ∼4 × 103 objects selected from a variety of input catalogs based on multiwavelength observations in the field and thus represents a diverse selection function. Finally, the Complete Calibration of the Color-Redshift Relation (C3R2, Masters et al. 2017, 2019; Stanford et al. 2021) is a multi-institution, multi-instrument survey targeting faint galaxies (i ∼ 24.5) populating regions of color space under-sampled by other surveys, in order to provide validation data for upcoming weak lensing surveys. We use data from the Data Releases 1, 2, and 3 (∼4 × 103 objects).
For all three samples, we keep objects with quality flags 3 and 4. We perform a spatial cross-match between the positions of these objects in the spectroscopic and photometric catalogs. Specifically, we keep the nearest unique photometric source within a radius of 06 around each spectroscopic object, as recommended by Weaver et al. (2022). We also remove objects classified as stars by Eazy or LePhare, as well as the few sources with spectroscopic redshift >2.5.
We are left with 12,473 objects: 6920, 3168, and 2043 for zCosmos-bright, DEIMOS and C3R2, respectively. This is the data set upon which we carry out our demonstration study.
This data set combines three samples and so has a complicated selection function (i.e., the probability, conditional on its intrinsic properties, that a galaxy appears in the sample). The selection function must be incorporated into any population level analysis, as is treated explicitly in the inference of redshift distributions presented by Alsing et al. (2022). But if the focus is on inference of object-level parameters (i.e., individual galaxy redshifts) then selection effects can be ignored if the data on detected objects is sufficiently constraining; in the limit that the photometry determines the redshift (and other parameters) perfectly, the posterior is a delta function and hence is completely decoupled from the population. While not strictly satisfied here—the high-quality photometry and broad wavelength coverage do yield finite parameter uncertainties—this is an excellent approximation. We hence do not consider selection effects further.
5. Results
We now evaluate the performance of our inference framework applied to the crossmatched data set described in the previous section. In what follows, unless explicitly stated, the hyperparameters of the population model (
In practice, to avoid degeneracies with other parameters of the model, we anchor the zero-point of the HSC i band to unity and define the others relative to this band.
5.1. Emission Lines
We perform an initial fit to the data with our model with no contribution from lines, i.e., setting , as well as the SPS parameters of each galaxy, with the procedure outlined in Section 3.3. This consists of a run, with redshifts fixed at the spectroscopic redshifts, to estimate the hyperparameters, and a second run now relaxing redshifts but fixing the hyperparameters. The resulting maximum a posteriori redshift estimates,
, are poor compared to the released COSMOS2020 photo-z's. We also observe sharp features in the redshift residuals
spec-z and in the flux residuals (data minus model at the best-fit parameters) binned in spec-z. Many of these features correspond to strong emission lines, such as [O ii], demonstrating that their luminosities are not correctly calibrated for our data set. Re-calibrating them is straightforward within our framework.
We must first determine a suitable subset of emission lines (out of the 128 implemented in FSPS by Byler et al. 2017) to include in the model. To achieve this we take the best-fit SEDs of our first run described above (at z = spec-z) and calculate the contribution of each emission line to the total photometric flux, i.e., the flux ratio Ljb /Fb , for all objects, lines, and bands. We then take the average of these values over objects and bands. This procedure yields the lines which make the strongest contributions to the data at hand. This set includes most of the lines with intrinsic large variance according to Byler et al. (2017). Importantly, this procedure also filters out lines that are outside our wavelength coverage. We are left with 44 lines. Note that nearby emission lines are not necessarily all well-resolved by the data, i.e., each may only unequivocally contribute to a few objects. This could be improved with more informative priors on the lines or on line selection; however we do not expect this to have a significant impact on our redshift results.
In total, following this line-selection procedure, we have 2 × Nbands + 2 × Nlines = 140 hyperparameters (excluding
.
5.2. Photo-z's
We evaluate the performance of this model using some commonly used metrics. For every object in our catalog, a spectroscopic redshift zspec and a redshift estimate are available. The main quantity of interest is the difference between the two, divided by the standard 1 + z factor accounting for the expected scaling of the quality of the estimates due to the effect of redshift on wavelength,
![Equation (7)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn7.gif)
We compute the mean, median, and standard deviation of
Figure 2 shows these metrics, calculated in redshift and magnitude bins (in the reference HSC i band, which is not affected by zero-points), for our framework, as well as the publicly released COSMOS2020 LePhare and Eazy redshifts.
Figure 2. Photo-z metrics calculated using our sample for our hierarchical model compared with the COSMOS2020 released LePhare and Eazy photo-z's. We compute the median of , as well as
Download figure:
Standard image High-resolution imageOverall, our approach yields comparable levels of bias (as measured by the median of
While these photo-z metrics are informative summaries of the results, it is also interesting to examine the distributions of redshift estimates themselves. The residuals
Figure 3. Differences between the maximum a posteriori redshift estimates and the spectroscopic redshifts (divided by 1 + z) for the new model compared with the COSMOS2020 released LePhare and Eazy redshifts. This highlights the areas where they achieve similar performance in terms of scatter and outliers. Splitting between the three sources of spectroscopic redshifts also reveals the differences in photo-z performance for samples with different selection, as described in Section 4.
Download figure:
Standard image High-resolution imageFigure 4. Maximum a posteriori redshift estimates versus spectroscopic redshifts for the three spectroscopic sources comprising the data we analyze. Differences are best highlighted in Figure 3, whereas these scatter plots with a logarithmic color bar better highlight the overall distributions and the outliers (which are rare).
Download figure:
Standard image High-resolution image5.3. Hyperparameters
In this section, we investigate the results of our optimization procedure for the model's hyperparameters and their implications for the robustness of the data model. The hyperparameters resulting from the optimization are shown in Figure 5. We also show the result of a run with the noise and emission-line offsets set to fiducial values
Figure 5. Hyperparameters obtained with the optimization procedure of Section 3.3. The black stars correspond to our best model, with all parameters optimized, while the green diamonds correspond to an additional model (used for robustness checks, see text for details) fixing the emission-line offsets
Download figure:
Standard image High-resolution image5.3.1. Zero-point Offsets
The inferred zero-points {
5.3.2. Examination of Photometric Uncertainties
Figure 5 shows that the inferred values of flux uncertainties
5.3.3. Emission-line Offsets
We find that including extra contributions from emission lines significantly improves the quality of the derived photo-z's. This is consistent with previous findings by Ilbert et al. (2006, 2008), Brammer et al. (2008) and Alarcon et al. (2021). Note that in Figure 5 emission lines are ordered by their contribution relative to the total broadband flux, following the procedure described above. Offsets are primarily given to the strongest lines, with some of them being almost entirely canceled (
5.4. Population Model
We perform one additional run, also optimizing the hyperparameters of the population model,
5.5. Residuals
Finally, we check the ability of the presented model to explain the statistical properties of the data. We examine the distribution of the flux residuals, defined as
![Equation (8)](https://content.cld.iop.org/journals/0067-0049/264/1/23/revision1/apjsac9d99eqn8.gif)
These are shown in Figure 6. The blue lines show our likelihood function, the Student's-t distribution described in Equation (4). There is good agreement between the chosen likelihood and the residuals. We have checked that this agreement significantly deteriorates when all of the hyperparameters are not optimized (not shown here). The deviations in the u and ch2 bands are likely to be due to residual data systematics which are not well described by our model.
Figure 6. Flux residuals (black), as defined in Equation (8), compared with the likelihood function of Equation (4) in blue, showing a satisfactory agreement between the data and the model predictions.
Download figure:
Standard image High-resolution imageWhen repeating this analysis with a normal likelihood function, we find that the residuals are worse and provide a poor fit of the likelihood function, despite optimizing the other components of the model. This setting also results in worse redshift estimates. This demonstrates the presence of outliers and shows that accommodating them with a suitable likelihood function can significantly improve the ability of a model to fit complicated data.
6. Discussion
Our results demonstrate that the hierarchical SPS inference framework presented here (and in Alsing et al. 2022) delivers accurate redshifts on a benchmark data set, therefore passing a stringent validation test. We now discuss possible extensions which relax some of its assumptions.
The model itself could be made more sophisticated in various ways in order to better suit specific applications and increase the accuracy of the photo-z's further. Such extensions include a stellar model, which could straightforwardly be incorporated to allow star-galaxy separation, akin to what is done in Eazy and LePhare with stellar templates.
In the future, it will be critical to include effects which are currently neglected, but which affect the data (and in turn, photometric redshifts) at significant levels for LSST, such as airmass-dependent fluxes (Graham et al. 2018). Furthermore, extending our model to include spatial information (Sánchez & Bernstein 2018; Jasche & Lavaux 2019; Alarcon et al. 2020; Sánchez et al. 2020) could provide a powerful probe of the connection between galaxies and dark matter and exploit more of the information offered by photometric surveys.
Our chosen optimization strategy, paired with the posterior distribution taken as a loss function, minimizes the residuals between the measured and the model fluxes. Other choices may be possible, especially when including spectroscopic redshifts. For example, one could explicitly minimize the residuals between photometric (inferred) and spectroscopic redshifts, although this will render the results more sensitive to selection effects in the data.
We neglected the uncertainties in hyperparameters and SPS parameters. This is because the extensive wavelength coverage of the 26-band COSMOS2020 data, the small flux uncertainties, and the availability of spectroscopic redshifts combine to yield well-constrained galaxy SEDs and in turn hyperparameters. Nevertheless, there are fundamental degeneracies expected between some of the SPS parameters (Leja et al. 2017, 2019), as well as with the redshift when it is not fixed to a spectroscopic value. Selection effects may also be important, and can be accounted for in the posterior distributions as described in Alsing et al. (2022). In analyzing data sets where the uncertainties in hyperparameters or SPS parameters (including redshift) are not negligible (e.g., broadband-only surveys like the LSST), one can resort to the variety of Bayesian inference techniques (i.e., MCMC, variational inference, Laplace approximation, simulation-based inference) available for hierarchical models. Since the impact of selection cuts and parameter uncertainties is strongly dependent on the specific data and models being considered, we defer further discussion and the development of effective inference strategies to future work.
7. Conclusion
We have presented a hierarchical model to infer redshift and other intrinsic galaxy properties from photometric data. This approach makes it possible to self-consistently encode knowledge of (1) galaxy formation and evolution via a population model, (2) astrophysics via SEDs predicted with stellar population synthesis, and (3) observational effects via a noise model and a likelihood function. By formulating the inference within a Bayesian hierarchical model, we are able to parameterize additional sources of bias and uncertainty in each of these components and to solve for them self-consistently. Thus, any manual tuning or inversion methods can be avoided, with the added benefit that it is possible to set informative priors on these additional terms.
Our photo-z's (as measured with traditional metrics on redshift point estimates) are competitive with the publicly released photo-z's from COSMOS2020 (Weaver et al. 2022). While this is a benchmark test on a relatively bright, low-redshift sample of spectroscopically confirmed galaxies, these results demonstrate the power and flexibility of our methodology and motivate its further development toward addressing the challenging setting presented by stage IV cosmological surveys. We made our inference tractable using neural emulators to accelerate model calls (Alsing et al. 2020).
In future applications of our framework, we will relax some of the modeling assumptions made here. In particular, we demonstrate how to incorporate selection in the companion paper Alsing et al. (2022). Comparing inferences from multiple SPS and population models will also provide valuable consistency checks. In fact, refining the SED and population models directly from the upcoming survey data is a promising avenue for directly probing the underlying (astro)physics using large amounts of untapped statistical power. Leveraging hybrid data sets and incorporating additional data available (e.g., spec-z's, or even full spectra) will be instrumental for correctly separating the different physical effects at play. This is straightforwardly included in hierarchical models (see, e.g., Alarcon et al. 2021; Nagaraj et al. 2022). Together, these attributes make the methodology presented here well-suited for addressing the challenges of upcoming photometric surveys and the robust scientific exploitation of their data.
We thank George Efstathiou for valuable input during the course of this project, and John R. Weaver for assistance with the COSMOS2020 data. We also thank Konrad Kuijken, Hendrik Hildebrandt, Angus Wright, and Will Hartley for useful discussions. B.L. is supported by the Royal Society through a University Research Fellowship. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No. 101018897 CosmicExplorer). This work has also been enabled by support from the research project grant Understanding the Dynamic Universe' funded by the Knut and Alice Wallenberg Foundation under Dnr KAW 2018.0067. J.A., H.V.P. and D.J.M. were partially supported by the research project grant "Fundamental Physics from Cosmological Surveys" funded by the Swedish Research Council (VR) under Dnr 2017-04212. The work of H.V.P. was additionally supported by the Göran Gustafsson Foundation for Research in Natural Sciences and Medicine. H.V.P. and D.J.M. acknowledge the hospitality of the Aspen Center for Physics, which is supported by National Science Foundation grant PHY-1607611. The participation of H.V.P. and D.J.M. at the Aspen Center for Physics was supported by the Simons Foundation.
Author Contributions
B.L.: conceptualization, methodology, software, validation, formal analysis, writing—original draft. J.A.: conceptualization, methodology, software, validation, writing—review and editing. H.V.P.: conceptualization, methodology, validation, writing—review and editing, funding acquisition. D.M.: conceptualization, methodology, validation, writing—review and editing, funding acquisition. J.L.: conceptualization, methodology, validation, writing—review and editing.
Footnotes
- 8
- 9
Since we apply normal priors to parameters only defined in finite ranges, a renormalization is necessary, giving rise to truncated normals, as described in Table 1. For simplicity we omit this detail throughout this section.
- 10
The density of the scaled and translated Student's-t distribution is
with t = (x − l)/s as the residuals and
Γ ( · ) as the Gamma function. l and s are often referred to as location and scale parameters andν as the number of degrees of freedom. - 11
- 12
For reference, a standard FSPS spectrum with the default MILES spectral library has ∼6 × 103 wavelength points.
- 13