Abstract
This paper presents the nonhomogeneous Poisson process (NHPP) for modeling the rate of fast radio bursts (FRBs) and other infrequently observed astronomical events. The NHPP, well-known in statistics, can model the dependence of the rate on both astronomical features and the details of an observing campaign. This is particularly helpful for rare events like FRBs because the NHPP can combine information across surveys, making the most of all available information. The goal of the paper is two-fold. First, it is intended to be a tutorial on the use of the NHPP. Second, we build an NHPP model that incorporates beam patterns and a power-law flux distribution for the rate of FRBs. Using information from 12 surveys including 15 detections, we find an all-sky FRB rate of 587 events per sky per day above a flux of 1 Jy (95% CI: 272, 924) and a flux power-law index of 0.91 (95% CI: 0.57, 1.25). Our rate is lower than other published rates, but consistent with the rate given in Champion et al.
1. Introduction
Rates for astronomical events are typically computed using only some of the available information. When some part of the observed parameter space (e.g., sensitivity) is non-ideal, astronomers cut that space out prior to estimating a rate (e.g., "completeness limit"). This simplifies the analysis, but is dangerous when events are observed below the conservative threshold. Many compelling and active fields of astronomical research are subject to strong observational bias and low numbers of events, such as exoplanets (Foreman-Mackey et al. 2014), gravitational waves (Abbott et al. 2016), and near-Earth asteroids (Stuart & Binzel 2004).
The study of fast radio bursts (FRBs) is another good example. See, for example, Lorimer et al. (2007), Thornton et al. (2013), or Burke-Spolaor & Bannister (2014). To date, 17 FRBs, discovered in millisecond radio transient surveys, are listed in the FRB Catalog of Petroff et al. (2016). In many of the detection papers, the rate is estimated by scaling the number of observed FRB detections by the product of the instrument observing area and the time-on-sky. This rate is generally assumed to be a cumulative rate for all events above a certain flux chosen to be the minimum detectable flux at the full-width-at-half-maximum (FWHM) sensitivity for the instrument. Measured flux values do not otherwise affect the analysis. However, the actual flux values contain information about how the rate changes as a function of flux. Oppermann et al. (2016) is an exception that does use the flux values. These studies typically, but not always, make other assumptions that are potentially detrimental. The telescope sensitivity is often modeled as a flat function covering the FWHM region of the beam (Deneva et al. 2009 is an exception). They also tend to include only those surveys in which FRBs are detected.
The approach based only on counts, correct but incomplete, can be limiting. This is especially true for FRB rate calculations because few have been observed thus far and we want to use all available information in the rate computations. For example, we know that radio telescopes have a sensitivity pattern (the "beam") that drops with distance from the center, so there is more observing area at lower sensitivity and recorded fluxes are likely to be lower than the true flux that arrives at Earth. Furthermore, since most of the universe is far away, we expect to see a much higher number of faint FRBs. The first effect reduces the flux that we record, while the second reduces the actual flux that reaches Earth. Both effects lead us to expect more events at one flux than we would at a higher flux.
The paper presents the use of the nonhomogeneous Poisson process (NHPP) for computing the rate of FRBs. The approach, frequently used in statistical modeling, will address the issues raised above and could be extended to address other issues that ultimately cause the probability of detection to vary. The method is applicable to event searches of any kind. The paper has two goals. First, it is intended to be a tutorial in the use of the nonhomogeneous Poisson process, which we expect to be widely applicable in astronomy. Second, we aim to produce an estimate for the rate of FRBs that combines data across several surveys. Section 2 describes the general NHPP model and the specifics of our approach for FRB surveys. Section 3 describes the results of the fit to the FRB survey data. Finally, Section 4 presents some discussion on the model and what can be added in a more sophisticated analysis.
2. The Nonhomogeneous Poisson Process Model
A homogeneous Poisson process is a counting process that says that (1) the number of events that occur in an interval (e.g., of time or space or some other domain) follows a Poisson distribution whose mean is proportional to the size of that interval (e.g., length in 1D, area in 2D, etc.) and that (2) the number of events in two disjoint intervals are independent of each other. Thus, for a region of the sky, a telescope's field of view, for example, with area A, we might think that the number of FRBs occurring in this region over an interval of time T is described by the Poisson distribution:

$$P(N = n) = \frac{(\lambda A T)^{n}\, e^{-\lambda A T}}{n!}, \qquad n = 0, 1, 2, \ldots,$$

where the parameter $\lambda$ is the constant rate of events per unit area per unit time.
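As a quick numerical check, the Poisson distribution above can be evaluated directly. The sketch below (with a made-up value for the product of rate, area, and time) verifies that the probabilities sum to one and that the distribution's mean equals that product:

```python
import math

def poisson_pmf(n, mu):
    """P(N = n) for a Poisson random variable with mean mu = rate * area * time."""
    return mu**n * math.exp(-mu) / math.factorial(n)

mu = 3.7  # hypothetical value of lambda * A * T
probs = [poisson_pmf(n, mu) for n in range(100)]
print(sum(probs))                               # ~1 (probabilities sum to one)
print(sum(n * p for n, p in enumerate(probs)))  # ~mu (the Poisson mean)
```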
The nonhomogeneous Poisson model assumes that the instantaneous rate may vary over the space of observations. For example, in practice, a radio telescope has varying sensitivity over its field of view. Assuming a radially symmetric sensitivity, the constant rate $\lambda$ in the above example will need to be replaced by $\lambda(r)$, which varies with the radius from the center of the beam. The term $\lambda A T$ would be replaced by an integral of the rate over the desired region. We will flesh out the detail below when we consider a rate that varies with both the radius and the flux measurement itself. For more details on Poisson processes, homogeneous and otherwise, see Kingman (1993) or Ross et al. (1996).
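To build intuition for the nonhomogeneous case, the following sketch simulates event times from a one-dimensional NHPP by thinning: candidates are drawn from a homogeneous process at the maximum rate, then kept with probability proportional to the local rate. The decaying rate function and all numeric values are illustrative only, not taken from the FRB model:

```python
import math
import random

def simulate_nhpp_1d(rate, rate_max, t_max, rng):
    """Simulate event times of a 1D NHPP on [0, t_max] by thinning:
    draw candidates from a homogeneous process at rate_max, then keep
    each candidate with probability rate(t) / rate_max."""
    events, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)        # next candidate arrival
        if t > t_max:
            return events
        if rng.random() < rate(t) / rate_max:
            events.append(t)

rng = random.Random(42)
rate = lambda t: 5.0 * math.exp(-t / 4.0)     # hypothetical decaying rate
# Expected count is the integral of the rate over [0, 10]
expected = 5.0 * 4.0 * (1.0 - math.exp(-10.0 / 4.0))
counts = [len(simulate_nhpp_1d(rate, 5.0, 10.0, rng)) for _ in range(2000)]
print(sum(counts) / len(counts))              # should be close to `expected`
```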
We assume that observations are made at a mix of single-dish radio telescopes (e.g., Parkes) and radio interferometers (e.g., VLA). We begin with a distinction between incident and observed flux. The incident flux is the flux that arrives at the telescope and that would be measured if the event were detected at the center of the beam. Let $\Lambda(s_I)$ be the cumulative rate of events with incident flux, that is, the flux that is independent of attenuation from the observing device, greater than $s_I$. Assuming a uniform spatial distribution of possible events, we know that this should follow a power law with an exponent of −3/2. We will consider a general power-law form:
$$\Lambda(s_I) = a\, s_I^{-b},$$

where $b = 3/2$ is the so-called Euclidean scaling that is valid for any population with a fixed luminosity distribution, as long as the luminosity does not evolve with redshift and the population has a uniform spatial distribution (see Oppermann et al. 2016). Let $\lambda(s_I)$ be the instantaneous rate for events with incident flux $s_I$, which is given by the negative of the derivative of $\Lambda(s_I)$ with respect to $s_I$:

$$\lambda(s_I) = a b\, s_I^{-b-1}.$$
Our actual measurements are also affected by the telescope or array that we use. For simplicity, we will assume that the beam attenuation is radially symmetric and given by a function p(r). The flux we measure, $s_O$, at some radius r from the center of the beam is the product of the incident flux and the attenuation function, $s_O = s_I\, p(r)$. We can make substitutions to determine the instantaneous rate for observed flux of value $s_O$ at radius r:

$$\lambda(s_O, r)\, ds_O\, 2\pi r\, dr\, dt = a b\, s_O^{-b-1}\, p(r)^{b}\, ds_O\, 2\pi r\, dr\, dt.$$
The term $ds_O / p(r)$ replaces the $ds_I$ term as part of the change of variables and $2\pi r\, dr$ is the infinitesimal observing area of the beam at radius r. The dt acknowledges that the process has a time dimension, although the rate is constant in time. (FRB 121102 is perhaps an exception as it has been observed to repeat.) For simplicity, we will assume a Gaussian beam shape for all telescopes, $p(r) = \exp(-r^2 \ln 2 / R^2)$, where R is the half-max radius for the beam. This gives a rate function of

$$\lambda(s_O, r) = a b\, s_O^{-b-1} \exp(-b\, r^2 \ln 2 / R^2).$$
Beams with sidelobes are considered in Section 4. For the beams of a single-dish telescope, the radius is not observed. For these devices, we need to integrate this function over the radius to produce the relevant rate function:

$$\lambda_O(s_O) = \int_0^\infty a b\, s_O^{-b-1} \exp(-b\, r^2 \ln 2 / R^2)\, 2\pi r\, dr = a b\, s_O^{-b-1}\, \frac{\pi R^2}{b \ln 2}.$$
The term $\pi R^2 / (b \ln 2)$ can be interpreted as the effective area of the receiver.
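The closed-form effective area for a Gaussian beam, $\pi R^2/(b \ln 2)$, can be checked by direct numerical integration of the attenuated beam profile. This sketch uses illustrative values of R and b:

```python
import math

def effective_area_numeric(R, b, r_max_factor=20.0, n=200000):
    """Numerically integrate  p(r)^b * 2*pi*r  over r with the trapezoid
    rule, for the Gaussian beam p(r) = exp(-r^2 ln2 / R^2)."""
    r_max = r_max_factor * R       # far enough out that the tail is negligible
    h = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        f = math.exp(-b * r * r * math.log(2) / R**2) * 2.0 * math.pi * r
        total += f if 0 < i < n else 0.5 * f
    return total * h

R, b = 7.0, 1.5                    # half-max radius (arcmin), Euclidean index
closed_form = math.pi * R**2 / (b * math.log(2))
numeric = effective_area_numeric(R, b)
print(closed_form, numeric)        # the two should agree closely
```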
We will take a maximum likelihood approach to estimating the model. In brief, we will write down the probability for the model, plug in the observed data, and then maximize this equation over the unknown parameters. The probability of observing n events with fluxes $s_1, \ldots, s_n$ over T hours at a single-dish telescope with minimum detectable flux $s_{\min}$ is

$$P = \Big[ \prod_{j=1}^{n} \lambda_O(s_j)\, T \Big]\, e^{-\Lambda_O(s_{\min})\, T},$$

where $\Lambda_O(s) = a\, s^{-b}\, \pi R^2 / (b \ln 2)$ is the cumulative rate for observed fluxes greater than $s$.
Analysis is done on the log scale and the important quantity is the log-likelihood. Consider the log-likelihood component for survey i with $n_i$ flux measurements $s_{i1}, \ldots, s_{i n_i}$:

$$\ell_i(a, b) = \sum_{j=1}^{n_i} \log\big[\lambda_O(s_{ij})\, T_i\big] - \Lambda_O(s_{\min,i})\, T_i$$

for single-dish observations and

$$\ell_i(a, b) = \sum_{j=1}^{n_i} \log\big[\lambda(s_{ij}, r_{ij})\, T_i\big] - \Lambda_O(s_{\min,i})\, T_i$$

for interferometer observations with radius measurements $r_{i1}, \ldots, r_{i n_i}$. The log-likelihood over all surveys is simply the sum of these:

$$\ell(a, b) = \sum_i \ell_i(a, b).$$
This function is maximized over the unknown parameters to obtain estimates for a and b. This can be done with a gradient-based numerical optimization scheme. The covariance of these parameters is estimated using the second derivative matrix (the Hessian) of this function: $\mathrm{Cov}(\hat{a}, \hat{b}) \approx [-\nabla^2 \ell(\hat{a}, \hat{b})]^{-1}$. See Theorem 5.1 of Lehmann & Casella (1998) for details of the asymptotic normality of these estimates and the rest of the book for background on the theory of maximum likelihood. Details of these computations are provided in Appendix A.
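To make the likelihood machinery concrete, consider a toy single-dish case with threshold 1 Jy, where (with this threshold) the maximum likelihood estimates have closed forms: a Pareto-type estimate for b and a count-matching estimate for a. The survey values below are hypothetical, not from Table 1, and the sketch verifies that the closed forms beat nearby parameter values:

```python
import math

def log_lik(a, b, fluxes, T, R, s_min=1.0):
    """Single-dish NHPP log-likelihood: sum of log event-rate terms minus
    the expected count  a * s_min**(-b) * pi*R^2/(b ln2) * T."""
    area = math.pi * R**2 / (b * math.log(2))
    ll = -a * s_min**(-b) * area * T
    for s in fluxes:
        ll += math.log(a * b * s**(-b - 1) * area * T)
    return ll

# Hypothetical survey: fluxes (Jy) above a 1 Jy threshold, hours, half-max radius
fluxes = [1.3, 2.1, 1.1, 4.0, 1.6]
T, R = 5000.0, 7.0

# With s_min = 1 Jy, setting the gradient to zero gives closed-form MLEs:
n = len(fluxes)
b_hat = n / sum(math.log(s) for s in fluxes)
a_hat = n * b_hat * math.log(2) / (math.pi * R**2 * T)

# Sanity check: the closed form should beat nearby parameter values
best = log_lik(a_hat, b_hat, fluxes, T, R)
for da, db in [(1.01, 1.0), (0.99, 1.0), (1.0, 1.01), (1.0, 0.99)]:
    assert best >= log_lik(a_hat * da, b_hat * db, fluxes, T, R)
print(a_hat, b_hat)
```

In the real analysis the thresholds differ across surveys and no closed form exists, so the full log-likelihood is maximized numerically as described above.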
3. Results for FRB Surveys
Our data are given in Table 1. The data come from 12 surveys using four telescopes and recording a total of 15 FRBs. This list does not include all surveys or detections, but only those for which we could determine the relevant observing characteristics. The list does include a variety of surveys, including some with zero detections. It also includes nearly 90% of the published FRB detections. The table lists the number of FRBs, the observed fluxes, the minimum detectable flux, the time-on-sky, and the full width at half maximum for each instrument. The time-on-sky includes the number of beams for the single-dish instruments (for example, if the Parkes telescope observed for x hours, the time listed here would be 13x to account for the 13 beams). For simplicity, the beams are assumed to not overlap. We have focused the work on flux thresholds instead of fluence. Where necessary, we follow Law et al. (2015) and Burke-Spolaor et al. (2016) by assuming a pulse width of 3 ms based on the average value for the FRBs detected to date.
Table 1. Survey Data for Several Published FRB Searches with 15 Total Detections
Survey Ref. | n | Flux si (Jy) | Sensitivity (Jy) | Time Ti (beams*hr) | FWHM (arcmin) |
---|---|---|---|---|---|
Lorimer et al. (2007) | 1 | 30 | 0.295 | 6376.4 | 14 |
Deneva et al. (2009) | 0 | NA | 0.1825 | 3213.7 | 4 |
Keane et al. (2010) | 0 | NA | 0.2235 | 20247.5 | 14 |
Siemion et al. (2011) | 0 | NA | 59 | 4074 | 150 |
Burgay et al. (2013) | 0 | NA | 0.2645 | 6923.8 | 14 |
Petroff et al. (2014) | 0 | NA | 0.434 | 12038 | 14 |
Spitler et al. (2014) | 1 | 0.4 | 0.105 | 80523.1 | 4 |
Burke-Spolaor & Bannister (2014) | 1 | 0.3 | 0.3075 | 11924.9 | 14 |
Ravi et al. (2015) | 1 | 2.1 | 0.2775 | 889.2 | 14 |
Petroff et al. (2015) | 1 | 0.47 | 0.2775 | 1111.5 | 14 |
Law et al. (2015) | 0 | NA | 0.12 | 166 | 30 |
Champion et al. (2016) | 10 | 1.3, 0.4, 0.5, 0.5, 0.87, 0.42, 0.69, 1.67, 0.18, 2.2 | 0.28 | 36224.5 | 14 |
Note. Except for the information for Law et al. (2015), the values for n, sensitivity, and time are derived from Burke-Spolaor et al. (2016), which assumes a pulse width of 3 ms. Flux values are taken directly from each reference. The flux for Ravi et al. (2015) was read from Figure 3 of that reference. The first four fluxes for Champion et al. (2016) are taken from Thornton et al. (2013). The last five were read from Figure 1 of the reference. Note that the FRB catalog includes two other detections, but the survey characteristics were difficult to obtain. The time-on-sky includes the number of beams for the single-dish instruments (for example, if the Parkes telescope observed for x hours, the time listed here would be 13x to account for the 13 beams).
The observed data are plugged into the log-likelihood described above. Note that the only interferometer survey, Law et al. (2015), has no detections, so the instantaneous rate including a radius is not used here (the summation in Equation (10) is not used). The likelihood is maximized at the values a = 586.88 events per sky per day above a flux of 1 Jy and power-law index b = 0.91. The 95% confidence intervals are (271.86, 923.72) for a and (0.57, 1.25) for b. The latter is notable, as it excludes the Euclidean scaling value of 1.5. However, our confidence interval for the power-law index has substantial overlap with the 95% confidence interval reported by Oppermann et al. (2016), (0.8, 1.7), and the estimate reported by Caleb et al. (2016), 0.9 ± 0.3. Li et al. (2017) parameterize the model differently and their estimates of the power-law index are generally even lower, with some of their intervals (see Figure 4 of that paper) overlapping ours and others excluding even a value of 1.
These estimates give distinctly lower rates than other commonly cited results. Champion et al. (2016) give a "fluence-complete rate" of 2500 FRBs per sky per day above about 2 Jy ms, or above 0.67 Jy assuming a 3 ms pulse. Their 95% confidence interval is 900–5700 FRBs per sky per day. The corresponding rate from our model is only about 849.29 FRBs per sky per day with a 95% confidence interval between about 508.86 and 1417.47 FRBs per sky per day. These intervals overlap, indicating some agreement. Rane et al. (2016) use information from a collection of Parkes surveys and report a rate of 4400 FRBs per sky per day above 4.0 Jy ms, or 1.33 Jy assuming a 3 ms pulse. They report a 99% confidence interval of 1300–9600 FRBs per sky per day. Our corresponding estimate is about 451.51 FRBs per sky per day with a 99% confidence interval between about 225.21 and 905.20 FRBs per sky per day. These intervals do not overlap, suggesting a significant difference. These differences arise, at least in part, because our approach has included more surveys from a variety of telescopes, with more zero-detection results. Further, the methods are quite different, with our approach attempting to leverage the information contained in the actual flux measurements.
Figure 1 shows the fit in terms of the cumulative rate as a function of incident flux, along with individual point estimates and uncertainties from each survey. The solid black line shows the maximum likelihood estimate and the dashed black lines show the 95% confidence interval for the fit (see Appendix B).
The individual survey results are plotted at their FWHM sensitivity values, as is commonly done. Their rate values are computed as described in Section 1, with the observing area assumed, as is commonly done, to be the area of the circle with diameter given by the full width at half maximum. This approach can be somewhat problematic, however. Equation (6) indicates that this area needs to be divided by the scaling term $b \ln 2$. If b takes the assumed value of 3/2, the scaling term is about 1.04 and the simple area calculation slightly overestimates the true effective area, so estimated rates are not much affected. If we use the maximum likelihood estimate of b = 0.91, the scaling term is 0.63 and the simple area calculation underestimates the effective area by about 60%, and therefore overestimates the effective rate by about 60%.
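The scaling arithmetic can be reproduced in a few lines:

```python
import math

# Scaling term b*ln2 relating the FWHM-circle area to the effective area
scale_euclid = 1.5 * math.log(2)    # Euclidean index
scale_fit = 0.91 * math.log(2)      # maximum likelihood index
print(f"b = 1.5:  scaling term = {scale_euclid:.2f}")   # ~1.04
print(f"b = 0.91: scaling term = {scale_fit:.2f}")      # ~0.63

# With b = 0.91 the true effective area exceeds the FWHM circle:
ratio = 1.0 / scale_fit
print(f"effective area / FWHM area = {ratio:.2f}")      # ~1.59, i.e. ~60% larger
```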
Figure 1 highlights the advantage of using the flux values. A crude estimate of the power-law model could be determined using the traditionally calculated individual rates by simply drawing a line through the various points. However, this approach is difficult, at least in part, because most of the surveys have similar flux thresholds. This makes it difficult to constrain the parameters, particularly the power-law index value. It would also create an awkward incentive to obtain detections using less sensitive surveys (those with higher thresholds) in order to constrain the index. The NHPP approach uses the flux values themselves to estimate the power-law model directly.
To check the quality of the model fit, we compare the fitted survival function with the empirical survival function in Figure 2. In statistics, the survival function refers to one minus the cumulative distribution function. As the cumulative distribution function gives the probability of observing random variables less than or equal to a given value, the survival function gives the probability of observing random variables greater than a given value. This makes particular sense in this context, since we are often discussing fluxes that are greater than or equal to a particular value. The unusual name arises from its use in reliability and medical studies, where it gives the probability of a part or patient surviving at least a certain period of time. In this case, the survival function gives the probability of observing an FRB with flux greater than s; the x-axis of Figure 2 is flux and the y-axis is this probability. As shown below, the function depends on b, but not on a. No detections currently come from interferometers, but if some did, we recommend treating the position as unknown and using the observed flux with the single-dish rate function for this model check. Ultimately, the comparison presented here is similar to the quantile-quantile plot commonly used in statistics to visually check a distributional assumption.
To derive this function, consider the Poisson process over the combination of all of the surveys. The rate function for this process is just the sum of the individual rate functions. Given that an FRB is detected in a survey, its flux value has a probability density proportional to the sum of the rate functions. The normalizing constant for this density is just the sum of the cumulative rate functions evaluated at the minimum sensitivity. To find the survival probability at a flux s, we just integrate each rate function from s to infinity. Thus, the survival function is given by

$$S(s) = \frac{\sum_i \Lambda_{O,i}\big(\max(s, s_{\min,i})\big)\, T_i}{\sum_i \Lambda_{O,i}(s_{\min,i})\, T_i},$$

where i indexes the surveys, $\Lambda_{O,i}$ is the cumulative rate function for observed flux for survey i, and $s_{\min,i}$ is the minimum observable flux for survey i. The term $\max(s, s_{\min,i})$ arises because survey i can only see FRBs with flux greater than s when $s > s_{\min,i}$. As a function of s, each term in the numerator is flat until s exceeds $s_{\min,i}$. As a result, the function's behavior can change sharply at these points. This behavior can be seen as kinks in the curves in Figure 2. The short dashes above the x-axis indicate the minimum detectable flux $s_{\min,i}$ for each survey. The sharp behavior is most easily noticed through the last of the dashed lines below one and the dashed line at s = 59.
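The survival function is straightforward to compute. This sketch uses the Gaussian-beam cumulative rate with an assumed index b and a few illustrative survey tuples of (threshold, half-max radius, time); the overall rate constant a cancels in the ratio:

```python
import math

def survival(s, surveys, b=0.91):
    """Mixture survival function over surveys: each survey contributes its
    cumulative observed-flux rate, frozen at its threshold when s is below
    it (the survey cannot see fainter events).
    surveys: list of (s_min, R, T) = threshold (Jy), half-max radius, hours."""
    def lam(s_eval, R, T):
        # cumulative rate (up to the constant a, which cancels)
        return s_eval**(-b) * math.pi * R**2 / (b * math.log(2)) * T
    num = sum(lam(max(s, s_min), R, T) for s_min, R, T in surveys)
    den = sum(lam(s_min, R, T) for s_min, R, T in surveys)
    return num / den

# Illustrative tuples loosely patterned on Table 1 (not exact inputs)
surveys = [(0.295, 7.0, 6376.4), (0.105, 2.0, 80523.1), (59.0, 75.0, 4074.0)]
print(survival(0.05, surveys))    # equals 1 below every threshold
print(survival(1.0, surveys), survival(100.0, surveys))  # decreasing in s
```

The kink behavior described above appears where s crosses one of the survey thresholds, here most visibly at s = 59.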
Figure 2 shows the survival function for the estimated model. The solid black line shows the function computed using the maximum likelihood estimate (a = 586.88, b = 0.91) and the dashed black lines show the 95% confidence interval on this function computed using a parametric bootstrap (see Appendix B).
The data in Table 1 and software used to make the FRB rate estimate are available at https://github.com/lanl/NHPP-for-FRBs.
4. Discussion
The NHPP model provides an approach that can use all of the available information from astronomical events by giving probabilities for both the number of events detected in a survey and the measured flux values of those events. This is especially important when events, like FRB detections, are infrequently observed across a number of surveys; the model provides a natural solution to combining all of the available data across multiple single-dish and interferometer surveys. This approach avoids the need to make conservative choices about sensitivity (e.g., computing the sensitivity and field of view for the FWHM) that throw away information. Using the NHPP model, we estimate a rate of 586.88 events per sky per day above a flux of 1 Jy. This rate is significantly lower than those described in other works for two reasons. First, other rate estimates have not considered how non-Euclidean brightness distributions can change the effective volume probed. In the best-fit model discussed here, this increases the effective volume by 60%, which reduces the rate by a similar factor. Second, this rate estimate includes surveys from all points on the sky and does not attempt to model Galactic latitude dependence (discussed below).
One important use for this model is survey planning. We can estimate the expected number of hours until a detection using Equation (6). In the case of the VLA, this equation still applies assuming a uniform distribution of events over the beam. We integrate this equation over $s_O$ to get the cumulative rate

$$\Lambda_O(s_{\min}) = a\, s_{\min}^{-b}\, \frac{\pi R^2}{b \ln 2},$$
and plug in the minimum sensitivity value to get a Poisson rate for a particular telescope. We can incorporate all sources of uncertainty by sampling a and b using the parametric bootstrap to get a rate and then drawing an exponential random variable using the resulting rate. Assuming the sensitivity and FWHM values used for Spitler et al. (2014), Petroff et al. (2015), and Law et al. (2015), we can get estimates for the time-to-first detection for Arecibo, Parkes, and the VLA. The mean time-to-first detection for these instruments is about 5989, 650, and 861 hours, respectively. In the case of Parkes, this value, multiplied by 13 beams to give 8450 beam hours, is reasonably close to the total number of beam hours divided by the total number of detections, 6838 beam hours per event, across all surveys in our data set. The 95% upper bounds on the time-to-first detection are 18375, 2017, and 2693 hr for Arecibo, Parkes, and the VLA, respectively.
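The time-to-first-detection logic can be sketched as follows. The values of a, the threshold, and the beam radius below are illustrative placeholders, not the fitted inputs used in the paper, and the sketch checks the exponential waiting time against its analytic mean:

```python
import math
import random

def cumulative_rate(s_min, R, a, b):
    """Cumulative observed-flux rate  a * s_min**(-b) * pi*R^2/(b ln2),
    in events per hour if a is expressed per unit area per hour."""
    return a * s_min**(-b) * math.pi * R**2 / (b * math.log(2))

# Hypothetical inputs: a in events per arcmin^2 per hour, threshold in Jy,
# R the half-max radius in arcmin
a, b = 2.0e-7, 0.91
rate = cumulative_rate(0.105, 2.0, a, b)   # Poisson rate for one pointing
mean_wait = 1.0 / rate                     # expected hours to first event

# Monte Carlo check: exponential waiting times with this rate
rng = random.Random(0)
draws = [rng.expovariate(rate) for _ in range(20000)]
print(mean_wait, sum(draws) / len(draws))  # the two should be close
```

In the full procedure, each bootstrap sample of (a, b) yields its own rate and an exponential draw, which propagates parameter uncertainty into the waiting-time distribution.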
The method presented here can be extended to include anything that systematically varies the sensitivity of a detection including dispersion measure scalloping, unknown pulse widths, or galactic position. The method can also be extended to include differences in sensitivities and effective areas from different pointings within the same survey. The approach can also be extended to a slightly generalized probability model. The negative binomial model is also a counting model, which can be derived from the Poisson model by letting the rate follow a gamma distribution. This introduces extra variation that can be used to account for effects that are not included in the rate model.
An important extension could be the inclusion of more precise beam shapes. To improve incrementally on commonly assumed flat beams, we have assumed radially symmetric Gaussian beams that do not overlap in the case of the single-dish telescopes. Although uncertainty quantification and model checking are more difficult, the approach can be extended to include numerical integration of precise descriptions of the beam, including two-dimensional descriptions of the multi-beam single-dish telescopes that properly include beam overlap. Figure 3 shows the results from a limited exploration of the beam shape. Here, the sensitivity is based on the square of a sinc function, $p(r) = [\sin(x)/x]^2$, where $x$ is proportional to r and scaled so that $p(R) = 1/2$ at the half-max radius R.
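A numerical comparison of the Gaussian and sinc-squared beams is easy to set up. In this sketch the sinc argument is scaled by an assumed constant c ≈ 1.392 so that the beam falls to half power at the same radius R as the Gaussian; b = 3/2 is used because, carried to infinite radius, the idealized sinc² sidelobe integral diverges for indices b ≤ 1:

```python
import math

def effective_area(p, b, r_max, n=200000):
    """Approximate the integral of p(r)**b * 2*pi*r over [0, r_max]
    with the trapezoid rule."""
    h = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * (p(r) ** b) * 2.0 * math.pi * r
    return total * h

R = 7.0          # half-max radius
c = 1.39156      # assumed constant: (sin(c)/c)**2 = 1/2, so p(R) = 1/2

def sinc_sq(r):
    x = c * r / R
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2

def gaussian(r):
    return math.exp(-r * r * math.log(2) / R ** 2)

b = 1.5          # Euclidean index; sinc^2 sidelobe integral diverges for b <= 1
eff_sinc = effective_area(sinc_sq, b, 30.0 * R)    # truncated at 30R; the
eff_gauss = effective_area(gaussian, b, 30.0 * R)  # neglected tail is small
print(eff_sinc, eff_gauss)
```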
A second important extension is the inclusion of dependence on galactic latitude. We can crudely consider latitude dependence by estimating the model parameters using just the surveys that collected data exclusively or mostly at high galactic latitudes. We include Lorimer et al. (2007), Ravi et al. (2015), Petroff et al. (2015), Law et al. (2015), and Champion et al. (2016) in this group. These surveys produce an estimated rate above 1 Jy (95% CI: 509.83, 2152.37 FRBs per sky per day) and power-law index (95% CI: 0.38, 1.22). The 1 Jy rate is higher and the index is lower than in the full-data fit. As before, we assume a 3 ms pulse to compare with other rates given in fluence. Our predicted rate above 0.67 Jy is about 1814.18 FRBs per sky per day with a 95% confidence interval between about 1029.68 and 3196.40, which falls entirely within the interval given by Champion et al. (2016) at this flux. Our predicted rate above 1.33 Jy is about 1046.52 FRBs per sky per day with a 99% confidence interval between about 500.80 and 2186.89, which does now overlap with the interval given by Rane et al. (2016), though their rate is computed using data over a range of latitudes and is given as an all-sky rate. While this suggests some dependence, we are not comfortable stating this with certainty without building latitude into the model more directly.
This model also finds evidence for a non-Euclidean flux distribution of FRBs, with a power-law index of 0.91. This is consistent with estimates made with other techniques and data (Vedantham et al. 2016). A non-Euclidean distribution produces unintuitive observed FRB properties, such as that it is easier to detect more distant events and that small, wide-field radio telescopes can detect more events than large, sensitive telescopes. Finally, we note that a flat flux distribution is inherently surprising, as demonstrated by the extreme brightness of the first FRB reported by Lorimer et al. (2007).
This work was supported by the University of California Lab Fees Program under award number #LF-12-237863.
Appendix A: Gradient and Hessian of the Log-likelihood
Here we derive the gradient and Hessian for the log-likelihood. The latter is important for computing an estimate of the variance of the unknown parameters. For convenience, the model can be reparameterized in terms of transformed versions of a and b (e.g., their logarithms); derivatives on the transformed scale follow from the chain rule.
Recall that the log-likelihood is the sum of the log-likelihoods from each survey,

$$\ell(a, b) = \sum_i \ell_i(a, b);$$

thus, we can consider the derivatives of each summand individually. Consider the gradient of one survey-level log-likelihood, a vector of the partial derivatives. For the interferometer case,

$$\nabla \ell_i = \sum_{j=1}^{n_i} \frac{\nabla \lambda(s_{ij}, r_{ij})}{\lambda(s_{ij}, r_{ij})} - T_i\, \nabla \Lambda_O(s_{\min,i}).$$
The result for a single-dish term is similar. The key thing to note is that this function only depends on the derivatives of the rate function. The same is true of the second derivative matrix, the Hessian, which only depends on the first and second derivatives of the rate function:

$$\nabla^2 \ell_i = \sum_{j=1}^{n_i} \left[ \frac{\nabla^2 \lambda(s_{ij}, r_{ij})}{\lambda(s_{ij}, r_{ij})} - \frac{\nabla \lambda(s_{ij}, r_{ij})\, \nabla \lambda(s_{ij}, r_{ij})^{\top}}{\lambda(s_{ij}, r_{ij})^2} \right] - T_i\, \nabla^2 \Lambda_O(s_{\min,i}).$$
Note that $\nabla \ell_i$ is a column vector of length two and that $\nabla^2 \ell_i$ is a 2 × 2 matrix.
Here are the first and second partial derivative terms for the instantaneous rate $\lambda(s_O, r) = a b\, s_O^{-b-1} e^{-b r^2 \ln 2 / R^2}$ (writing $c = \ln 2 / R^2$):

$$\frac{\partial \lambda}{\partial a} = \frac{\lambda}{a}, \qquad \frac{\partial \lambda}{\partial b} = \lambda \left( \frac{1}{b} - \log s_O - c\, r^2 \right),$$

$$\frac{\partial^2 \lambda}{\partial a^2} = 0, \qquad \frac{\partial^2 \lambda}{\partial a\, \partial b} = \frac{1}{a} \frac{\partial \lambda}{\partial b}, \qquad \frac{\partial^2 \lambda}{\partial b^2} = \lambda \left[ \left( \frac{1}{b} - \log s_O - c\, r^2 \right)^2 - \frac{1}{b^2} \right].$$
Here are the first and second partial derivative terms for the cumulative rate $\Lambda_O(s) = a\, s^{-b}\, \pi R^2 / (b \ln 2)$:

$$\frac{\partial \Lambda_O}{\partial a} = \frac{\Lambda_O}{a}, \qquad \frac{\partial \Lambda_O}{\partial b} = -\Lambda_O \left( \log s + \frac{1}{b} \right),$$

$$\frac{\partial^2 \Lambda_O}{\partial a^2} = 0, \qquad \frac{\partial^2 \Lambda_O}{\partial a\, \partial b} = \frac{1}{a} \frac{\partial \Lambda_O}{\partial b}, \qquad \frac{\partial^2 \Lambda_O}{\partial b^2} = \Lambda_O \left[ \left( \log s + \frac{1}{b} \right)^2 + \frac{1}{b^2} \right].$$
The gradient is helpful for maximizing the likelihood. The Hessian is necessary for the covariance of the parameters, as described above.
Appendix B: Confidence Intervals Based on the Parametric Bootstrap
Figures 1 and 2 show confidence intervals for the cumulative rate of incident flux and the survival plot. Both sets of bounds are computed using the parametric bootstrap. This uses the covariance of the estimated parameters given in Section 2. The statistical theory says that our unknown parameters are approximately normally distributed with mean given by the maximum likelihood estimates and covariance given by the inverse of the negative Hessian evaluated at those estimates. The parametric bootstrap propagates this uncertainty by sampling the parameters from this distribution and computing a quantity of interest (e.g., the cumulative rate over some grid of flux values) for each sample. This provides an empirical distribution for the quantity of interest from which we can compute quantiles to use as bounds.
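The parametric bootstrap can be sketched in a few lines of pure Python. The covariance matrix below is a made-up illustration, not the fitted Hessian-based covariance, and the bivariate normal draws are generated via the Cholesky factor of that matrix:

```python
import math
import random

def bootstrap_rate_bounds(a_hat, b_hat, cov, s, n_boot=4000, seed=1):
    """Parametric bootstrap for the cumulative rate a * s**(-b): sample
    (a, b) from a bivariate normal with mean (a_hat, b_hat) and covariance
    `cov`, evaluate the rate at flux s, and take 2.5%/97.5% quantiles."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    rates = []
    for _ in range(n_boot):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        a = a_hat + l11 * z1
        b = b_hat + l21 * z1 + l22 * z2
        rates.append(a * s ** (-b))
    rates.sort()
    return rates[int(0.025 * n_boot)], rates[int(0.975 * n_boot)]

# Illustrative numbers only: the MLE and a made-up covariance matrix
a_hat, b_hat = 586.88, 0.91
cov = [[27000.0, -1.5], [-1.5, 0.03]]
lo, hi = bootstrap_rate_bounds(a_hat, b_hat, cov, s=1.0)
print(lo, hi)   # bounds should bracket the point estimate
```

Repeating this over a grid of flux values traces out the dashed confidence bands shown in Figure 1.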