2 One Parameter Families
As we want inferences to be unaffected by the choice of parameter,
we describe the basics of inference without reference to parameters.
Parameterizations will be introduced to describe the smooth structure
of estimators.
Let $M_{\mathcal{X}}$ be a family of probability measures
having common support $\mathcal{X}$. While $\mathcal{X}$ can be
an abstract space, for most applications $\mathcal{X}\subset\mathbb{R}^{d}$.
Points in $M_{\mathcal{X}}$ serve as models for a population whose
individuals take values in $\mathcal{X}$. We consider inference
for models from $M_{\mathcal{X}}$ based on a sample that is denoted
by $y$ and let $\mathcal{Y}$ be the corresponding sample space.
The relationship between $\mathcal{X}$ and $\mathcal{Y}$ will depend
on the sampling plan, conditioning, and dimension reduction using
sufficient statistics. For a simple random sample of size $n$ without
conditioning and no dimension reduction $\mathcal{Y}=\mathcal{X}^{n}$.
Let $M=M_{\mathcal{Y}}$ be the family of probability measures obtained
from $M_{\mathcal{X}}$ using a sampling plan whose sample space is
$\mathcal{Y}$. For $\mathcal{Y}=\mathcal{X}^{n}$
$$M=\left\{m:m(y)=\prod m_{\mathcal{X}}(x_{i}),\ m_{\mathcal{X}}\in M_{\mathcal{X}}\right\}.$$
For the Bernoulli family of distributions, $\mathcal{X}=\left\{0,1\right\}$,
$$M_{\mathcal{X}}=\left\{m:0<m(1)<1,\ m(0)+m(1)=1\right\}.$$
For a sample of size $n$ we use the sufficient statistic $y=\sum x_{i}$
so that $\mathcal{Y}=\left\{0,1,2,\ldots,n\right\}$ and
$$M=\left\{m:m(y)={n\choose y}m_{\mathcal{X}}(1)^{y}m_{\mathcal{X}}(0)^{n-y},\ m_{\mathcal{X}}\in M_{\mathcal{X}}\right\}. \tag{1}$$
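
As a concrete illustration of (1), the following minimal Python sketch builds one member of the binomial family $M$ from a Bernoulli model. The function name `binomial_model` and the argument `p`, which stands for $m_{\mathcal{X}}(1)$, are hypothetical names introduced here for illustration and are not part of the notation above.

```python
from math import comb


def binomial_model(n, p):
    """Return [m(0), ..., m(n)] for the member of M induced by m_X(1) = p.

    `p` is an illustrative name for m_X(1); the Bernoulli family needs 0 < p < 1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("m_X(1) must lie strictly between 0 and 1")
    q = 1.0 - p  # m_X(0)
    return [comb(n, y) * p**y * q ** (n - y) for y in range(n + 1)]


# One point m of M for n = 5; its values over Y = {0, ..., 5} sum to one.
m = binomial_model(5, 0.3)
total_mass = sum(m)
```

Each admissible value of `p` gives a different point of $M$, so in this example the family is traced out by varying a single number in the open interval $(0,1)$.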
When $\mathcal{Y}$ is open it will be convenient to let $m$ be a
probability density with respect to a dominating measure $\mu$. For
$\mathcal{X}=\mathbb{R}$ and function $\phi>0$ such that $\int\phi(x)\,d\mu=1$
there is a location family
$$M_{\mathcal{X}}=\left\{m:m(x)=\phi(x-a),\ a\in\mathbb{R}\right\}.$$
For a simple random sample with $y=\left(x_{1},x_{2},\ldots,x_{n}\right)^{{\tt t}}$
$$M=\left\{m:m(y)=\prod\phi(x_{i}-a),\ a\in\mathbb{R}\right\}.$$
If $\phi(x)=\left(2\pi\right)^{-1/2}\exp\left(-\frac{1}{2}x^{2}\right)$
then $M_{\mathcal{X}}$ is the normal location family with unit variance.
Using the sufficient statistic $y=\bar{x}=(\sum x_{i})/n\in\mathcal{Y}=\mathbb{R}$,
$$M=\left\{m:m\left(y\right)=\sqrt{n}\,\phi\left(\sqrt{n}\left(y-a\right)\right),\ a\in\mathbb{R}\right\}. \tag{2}$$
If $\phi(x)=\pi^{-1}\left(1+x^{2}\right)^{-1}$ then $M_{\mathcal{X}}$
is the Cauchy location family with unit scale factor. There is no
sufficient statistic of dimension less than $n$ so we use $y=\left(x_{1},x_{2},\ldots,x_{n}\right)^{{\tt t}}$,
$$M=\left\{m:m(y)=\pi^{-n}\prod\left(1+\left(x_{i}-a\right)^{2}\right)^{-1},\ a\in\mathbb{R}\right\}. \tag{3}$$
For a real-valued measurable function $h$ we define the expected value
of $h$ at $m$,
$$E_{m}h=\int_{\mathcal{Y}}h(y)m(y)\,d\mu$$
when $\mathcal{Y}$ is open and $E_{m}h=\sum_{y\in\mathcal{Y}}h(y)m(y)$
when $\mathcal{Y}$ is discrete. We use the following Hilbert space
$$H_{M}=\left\{h:E_{m}h^{2}<\infty,\ \forall\ m\in M\right\}$$
which has a family of inner products indexed by $M$,
$$\langle h,h^{\prime}\rangle_{m}=E_{m}\left(hh^{\prime}\right)\ \text{for all }h,h^{\prime}\in H_{M}.$$
When $E_{m}(hh^{\prime})=0$ the vectors $h$ and $h^{\prime}$ are $m$-orthogonal
and we write $h\perp_{m}h^{\prime}$. At each $m\in M$ there is a copy of
$H_{M}$ and this collection we denote by
$$H\!M=M\times H_{M}.$$
The copy of $H_{M}$ at $m$ with inner product $\langle\cdot,\cdot\rangle_{m}$
is $H_{m}$ which we also write as $H_{m}M$ to indicate its relationship
to $H\!M$. For inference, $H_{m}$ will be restricted to the orthogonal
complement of the constant functions, $H_{m}^{\perp}=\{h\in H_{m}:E_{m}h=0\}$,
so that
$$H_{m}=H_{m}^{\perp}\oplus H_{m}^{0}\ \text{and}\ H_{m}^{\perp}\perp_{m}H_{m}^{0}. \tag{4}$$
Note $E_{m}h=\langle h,1\rangle_{m}$ and $H_{m}^{0}$, the space of constant
functions, does not depend on $m$. Since (4) holds for each $m$ we write
$$H\!M=H^{\perp}\!M\oplus H^{0}M\ \text{and}\ H^{\perp}M\perp H^{0}M \tag{5}$$
where $\perp$ indicates $\perp_{m}$ holds for $H_{m}^{\perp}M=H_{m}^{\perp}$.
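
The orthogonal decomposition in (4) can be checked numerically on a discrete sample space. The sketch below is only an illustration under stated assumptions: it takes $m$ to be a binomial model, and the helper names `binomial_model`, `expect`, `inner`, and `split` are hypothetical names introduced here.

```python
from math import comb


def binomial_model(n, p):
    """Probabilities m(y), y = 0..n, for the binomial model with m_X(1) = p."""
    return [comb(n, y) * p**y * (1.0 - p) ** (n - y) for y in range(n + 1)]


def expect(h, m):
    """E_m h = sum over y of h(y) m(y) for discrete Y = {0, ..., n}."""
    return sum(h(y) * w for y, w in enumerate(m))


def inner(h1, h2, m):
    """The m-inner product <h1, h2>_m = E_m(h1 h2)."""
    return expect(lambda y: h1(y) * h2(y), m)


def split(h, m):
    """Split h into its mean-zero part (in H_m^perp) and constant part (in H_m^0)."""
    mean = expect(h, m)  # E_m h = <h, 1>_m
    return (lambda y: h(y) - mean), (lambda y: mean)


m = binomial_model(4, 0.3)
h = lambda y: y**2
h_perp, h_const = split(h, m)
mean_of_perp = expect(h_perp, m)        # zero up to rounding
cross_term = inner(h_perp, h_const, m)  # zero up to rounding
```

The constant part is the same kind of function for every $m$, but the value of the constant, and hence the mean-zero part, changes with $m$.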
As the notation suggests, $H\!M$ is a vector bundle on
$M$ with vector space $H_{M}$. It extends the tangent bundle $T\!M$
since $T\!M\subset H^{\perp}\!M$.
For inference regarding models in $M$, we consider functions $g_{M}:\mathcal{Y}\times M\rightarrow\mathbb{R}$
such that
$$g_{M}\left(\cdot,m\right)\in H_{m}^{\perp}\ \text{ for all }m\in M. \tag{6}$$
We also want $g_{M}$ to be continuous on $M$,
$$g_{M}\left(y,\cdot\right)\in C(M)\ \text{ for a.e. }y\in\mathcal{Y}, \tag{7}$$
so that the expectation of $g_{M}$ is a continuous function. For
point estimators of a parameter, say $\theta$, the expectation of
the estimator $\hat{\theta}$ is a real number. To emphasize this
distinction we use the sans serif font to indicate the expectation
of $g_{M}$
$$\mathsf{E}g_{M}\in C(M)\ \text{ while }\ E\hat{\theta}\in\mathbb{R}.$$
Expectation $\mathsf{E}$ operates on $C(M)$-valued distributions,
whereas $E$ operates on $\mathbb{R}$-valued distributions. To be
a generalized estimator, $g_{M}(y,\cdot)$ will be required to have
continuous derivatives on $M$ and these will be described using parameterizations
that are diffeomorphisms.
We assume $M$ is a 1-dimensional smooth manifold. While more general
manifolds can be considered (e.g., Fisher's circle model), we will
only consider families that have a global parameterization
$$\theta:M\rightarrow\Theta\subset\mathbb{R} \tag{8}$$
and are connected so that $\Theta$ is an open interval. For $g_{M}:\mathcal{Y}\times M\rightarrow\mathbb{R}$
we define $g_{\Theta}=g_{M}\circ\theta^{-1}:\mathcal{Y}\times\Theta\rightarrow\mathbb{R}$.
Unless more than one parameterization is being used, we drop the subscript
and write $g$ for $g_{\Theta}$. The log likelihood function
on $\Theta$ for $y$ is the function defined by
$$\ell=\ell(y,\cdot)=\ell_{M}(y,\cdot)\circ\theta^{-1}$$
where $\ell_{M}\left(y,m\right)=\log m\left(y\right)$. The score
function on $\Theta$ for $y$ is
$$s=\nabla\ell=\partial\ell/\partial\theta.$$
We only consider $M$ such that $s(\cdot,\theta)\in H_{M}$ for all
$\theta\in\Theta$. Because $M$ is a smooth manifold $s\left(y,\cdot\right)\in C^{1}(\Theta)$ a.e. $y$
and since $\mathsf{E}s=0$,
$$s(\cdot,\theta)\in H_{\theta}^{\perp}. \tag{9}$$
These properties of $s$ are used to define generalized estimators.
Definition 1.
A generalized estimator
for scalar parameter $\theta$ is a function
$$g:\mathcal{Y}\times\Theta\longrightarrow\mathbb{R}$$
and $g=g(y,\cdot)$ is the corresponding generalized estimate
at $y$ if
(i) $g\left(y,\cdot\right)\in C^{1}(\Theta)$ a.e. $y$
(ii) $g\left(\cdot,\theta\right)\in H_{\theta}^{\perp}$ for all $\theta$
(iii) $\mathsf{V}\left(g\right)>0$
where $\mathsf{V}\left(g\right)=\mathsf{E}\left(g^{2}\right)\in C^{1}(\Theta)$.
The space of generalized estimators for $\theta$ is $\mathcal{G}$
which we write as $\mathcal{G}_{\Theta}$ if we consider more than
one parameterization. Any function $f\notin H_{\theta}^{\perp}$
that satisfies $f\in H_{\theta}$ and conditions (i) and (iii) of
Definition 1 is a pre-generalized
estimator, or simply a pre-estimator. For any pre-estimator
$f$, its orthogonalization is
$$f^{\perp}=f-f^{\top}\in\mathcal{G}, \tag{10}$$
where $f^{\top}=\mathsf{E}f$.
Godambe (1960) has similar criteria but allows $\mathsf{V}(g)=0$
for some $\theta\in\Theta$ and adds that $\mathsf{E}\left(\nabla g\right)^{2}>0$
so that $\mathsf{E}\left(\nabla g\right)$ can never be zero on $\Theta$.
We do not need this restriction since we describe estimators in terms
of information rather than variance. Allowing $\mathsf{E}\left(\nabla g\right)$
to be zero will be useful for nuisance parameters in the multidimensional
setting. Because $\mathsf{V}(g)>0$ we can define the standardization
of $g$ as
$$\bar{g}=\frac{g}{\sqrt{\mathsf{V}(g)}}.$$
Since $\bar{g}(\cdot,\theta)\in H_{\theta}^{\perp}$ is a vector of
unit length, $\bar{g}$ is also called the direction of $g$.
Standardized estimators are the same in every parameterization. That
is, for any $m^{\prime}\in M$, $\bar{g}_{\Theta}(\cdot,\theta^{\prime})=\bar{g}_{\Xi}(\cdot,\xi^{\prime})$
where $\theta^{\prime}=\theta(m^{\prime})$ and $\xi^{\prime}=\xi(m^{\prime})$.
A non-degenerate point estimator $\hat{\theta}$ whose first two moments
are smooth functions on $\Theta$ is a pre-estimator so that
$$\hat{\theta}-\mathsf{E}\hat{\theta}\in\mathcal{G}.$$
We use the sans serif notation because as a pre-estimator $\hat{\theta}(y,\cdot)$
is a function on the parameter space, the constant function taking
the value of the point estimate at $y$. The estimator need not be
unbiased, so that generalized estimation can be used to compare biased
and unbiased point estimators as well as estimators not constrained
to be constant on $\Theta$. Generalized estimators are compared in
terms of their information.
Definition 2.
The information for scalar parameter $\theta$
utilized by $g$ is
$$\Lambda(g)=\left(\mathsf{E}\nabla\bar{g}\right)^{2}=\frac{(\mathsf{E}\nabla g)^{2}}{\mathsf{E}(g^{2})}, \tag{11}$$
where the second equality follows from the definition of $\bar{g}$
and $\mathsf{E}g=0$.
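
To make (11) concrete, the sketch below evaluates $\Lambda(g)$ for two orthogonalized point estimators in a binomial model, one unbiased and one biased. It is an illustration only: the binomial setting, the finite-difference approximation of $\nabla g$, and the names `expect`, `orthogonalize`, and `information` are assumptions introduced here rather than constructions from the text.

```python
from math import comb


def expect(h, n, theta):
    """E_theta h for y ~ Binomial(n, theta), with h a function of y."""
    return sum(
        h(y) * comb(n, y) * theta**y * (1.0 - theta) ** (n - y)
        for y in range(n + 1)
    )


def orthogonalize(point_estimator, n):
    """Return g(y, theta) = f(y) - E_theta f for a point estimator f(y)."""
    return lambda y, theta: point_estimator(y) - expect(point_estimator, n, theta)


def information(g, n, theta, eps=1e-5):
    """Lambda(g) = (E grad g)^2 / E(g^2); grad g by a central difference in theta."""
    grad_g = lambda y: (g(y, theta + eps) - g(y, theta - eps)) / (2.0 * eps)
    mean_grad = expect(grad_g, n, theta)
    second_moment = expect(lambda y: g(y, theta) ** 2, n, theta)
    return mean_grad**2 / second_moment


n, theta = 10, 0.3
unbiased = orthogonalize(lambda y: y / n, n)            # sample proportion
shrunk = orthogonalize(lambda y: (y + 1) / (n + 2), n)  # biased point estimator
fisher = n / (theta * (1.0 - theta))                    # binomial Fisher information
lam_unbiased = information(unbiased, n, theta)
lam_shrunk = information(shrunk, n, theta)
```

Both point estimators here are affine functions of $y$, so their orthogonalizations have the same direction and utilize the same information, which in this binomial example equals $n/(\theta(1-\theta))$. The bias of the second estimator does not enter $\Lambda$.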
The Fisher information for a sample of size $n$, $I_{(n)}$, and
the Fisher information in a single observation, $I_{(1)}$, satisfy
$I_{(n)}=nI_{(1)}$. This relationship also holds for the information
utilized by an estimator
$$\Lambda(g_{(n)})=n\Lambda(g_{(1)}). \tag{12}$$
When considering only samples of size $n$ we use $I=I_{(n)}$ and
$\Lambda(g)=\Lambda(g_{(n)})$.
As the score is the archetype for a generalized estimator $g$, the
log likelihood function is the archetype for the scalar potential.
Definition 3.
A scalar potential of $g$
is any function $G:\mathcal{Y}\times\Theta\longrightarrow\mathbb{R}$
such that $\nabla G=g$.
While $G\notin\mathcal{G}$ we define the information utilized by
$G$ to be the information of its derivative: $\Lambda(G)=\Lambda(g)$.
Information is a local property and so does not distinguish between
a generalized estimator and its scalar potential. The scalar potential
is useful for finding confidence regions especially when the parameterization
is multidimensional.
We assume differentiation commutes with the integral sign so for any
pre-estimator $f$
$$\nabla\left(\mathsf{E}f\right)=\mathsf{E}\left(\nabla f\right)+\left(\nabla\mathsf{E}\right)\left(f\right) \tag{13}$$
where $\left(\nabla\mathsf{E}\right)$ is the linear operator on $H_{M}$
defined by
$$\left(\nabla\mathsf{E}\right)(h)=\mathsf{E}\left(\left(\nabla\ell\right)h\right).$$
Note that we use $f$ and $g$ for functions on $\mathcal{Y}\times\Theta$
while $h\in H_{M}$ is a function on $\mathcal{Y}$. For generalized
estimator $g$, $\mathsf{E}g$ vanishes so (13)
becomes, after switching left- and right-hand sides, the score
equation
$$\mathsf{E}\left(\nabla g\right)+\mathsf{E}\left(sg\right)=0. \tag{14}$$
When $g=s$, the score equation gives the equivalent definitions of
the Fisher information for $\theta$
\[
I=-\mathsf{E}(\nabla s)=\mathsf{E}(s^{2}).
\]
The information upper bound follows from the score identity.
Theorem 1.
The information for $\theta$ utilized by $g$ is bounded
by the Fisher information
\[
\Lambda\left(g\right)\leq I.
\]
Furthermore, the score $s$ attains this bound and for any $g\in\mathcal{G}$
\[
\Lambda(g)=\mathsf{V}(\mathsf{P}_{g}s)=\mathsf{R}^{2}I
\]
where $\mathsf{P}_{g}s$ is the projection of $s$ onto the space
spanned by $g$ and $\mathsf{R}=\mathsf{E}(\bar{s}\bar{g})$ is the
correlation between $s$ and $g$.
Proof.
From the score equation
\[
\Lambda(g)=\mathsf{E}^{2}(s\bar{g}). \tag{15}
\]
The second displayed equality follows upon noting that $\mathsf{E}^{2}(s\bar{g})=\mathsf{E}^{2}(\bar{s}\bar{g})I=\mathsf{R}^{2}I$.
The first equality follows by expressing the projection using basis
vector $\bar{g}$
\[
\mathsf{V}(\mathsf{P}_{g}s)=\mathsf{V}\left(\mathsf{E}(s\bar{g})\bar{g}\right)=\mathsf{E}^{2}(s\bar{g}).
\]
∎
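As a numerical illustration of Theorem 1, the following sketch checks the two forms of the Fisher information and the bound $\Lambda(g)\leq I$ on a binomial sample space, where expectations are exact finite sums. The values $n=20$ and $p=0.3$, the helper names, and the estimator $g=y^{2}-\mathsf{E}(y^{2})$ are assumptions chosen for illustration; $\Lambda(g)$ is evaluated in the form $\mathsf{E}^{2}(s\bar{g})$ given by (15).

```python
from math import comb, sqrt

def binom_pmf(n, p):
    """Probabilities of y = 0, ..., n under Binomial(n, p)."""
    return [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)]

def expect(values, pmf):
    """Exact expectation of a function tabulated over the sample space."""
    return sum(v * w for v, w in zip(values, pmf))

def information(g, s, pmf):
    """Lambda(g) evaluated as E^2(s * g_bar), the form given in (15)."""
    sd_g = sqrt(expect([v * v for v in g], pmf))
    return expect([a * b / sd_g for a, b in zip(s, g)], pmf) ** 2

n, p = 20, 0.3                      # illustrative values, not from the text
pmf = binom_pmf(n, p)
ys = range(n + 1)

s = [(y - n * p) / (p * (1 - p)) for y in ys]        # score in p
ds = [-y / p**2 - (n - y) / (1 - p)**2 for y in ys]  # derivative of the score in p

fisher_from_square = expect([v * v for v in s], pmf)  # E(s^2)
fisher_from_slope = -expect(ds, pmf)                  # -E(grad s)

# Hypothetical generalized estimator: y^2 centered at its expectation.
mean_y2 = expect([y * y for y in ys], pmf)
g = [y * y - mean_y2 for y in ys]

lam_s = information(s, s, pmf)           # the score attains the bound
lam_g = information(g, s, pmf)           # strictly below the bound here
r_squared = lam_g / fisher_from_square   # squared correlation of g with s

print(fisher_from_square, fisher_from_slope, lam_s, lam_g, r_squared)
```

Under these assumptions both expressions for $I$ agree with $n/(p(1-p))$, the score returns $\Lambda(s)=I$, and the nonlinear estimator gives a ratio strictly between 0 and 1.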
Efficiency of a point estimator is defined using the ratio of its
variance to the variance bound. Efficiency of a generalized estimator
is defined as the ratio of its information to the information bound,
$I$.
Definition 4.
The $\Lambda$-efficiency of $g$ is
\[
\mbox{Eff}^{\Lambda}(g)=I^{-1}\Lambda(g).
\]
An immediate corollary of Theorem 1 is that the $\Lambda$-efficiency
is the square of the correlation between the estimator and the score.
Corollary 1.
\[
\mbox{Eff}^{\Lambda}(g)=\mathsf{V}\left(\mathsf{P}_{g}\bar{s}\right)=\mathsf{R}^{2}.
\]
The $\Lambda$-efficiency of a point estimator $\hat{\theta}$ is
the $\Lambda$-efficiency of its generalized estimator $g_{\hat{\theta}}=\hat{\theta}-\mathsf{E}\hat{\theta}$.
When $\hat{\theta}$ is unbiased $\Lambda(g_{\hat{\theta}})=\mathsf{V}^{-1}(\hat{\theta})$
so that $\Lambda$-efficiency is identical to efficiency based on
variance.
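The relation between information and variance for point estimators can be checked numerically. The sketch below is an illustration under assumed values ($n=20$, $p=0.3$) on the binomial sample space; the shrunken estimator $(y+1)/(n+2)$ and the function names are hypothetical choices, not taken from the text. It evaluates $\Lambda(g_{\hat{\theta}})$ as $\mathsf{E}^{2}(s\bar{g})$, following (15).

```python
from math import comb

def binom_pmf(n, p):
    """Probabilities of y = 0, ..., n under Binomial(n, p)."""
    return [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)]

def info_and_variance(theta_hat, n, p):
    """Return (Lambda(g), V(theta_hat)) for g = theta_hat - E(theta_hat).

    Lambda is evaluated as E^2(s * g_bar) = Cov(s, theta_hat)^2 / V(theta_hat).
    """
    pmf = binom_pmf(n, p)
    mean = sum(t * w for t, w in zip(theta_hat, pmf))
    g = [t - mean for t in theta_hat]
    var = sum(v * v * w for v, w in zip(g, pmf))
    s = [(y - n * p) / (p * (1 - p)) for y in range(n + 1)]
    cov = sum(a * b * w for a, b, w in zip(s, g, pmf))
    return cov**2 / var, var

n, p = 20, 0.3                      # illustrative values
fisher = n / (p * (1 - p))

# Unbiased estimator y/n: information equals the reciprocal of the variance.
lam_unbiased, var_unbiased = info_and_variance(
    [y / n for y in range(n + 1)], n, p)

# Biased (hypothetical) estimator (y+1)/(n+2): same information, different variance.
lam_shrunk, var_shrunk = info_and_variance(
    [(y + 1) / (n + 2) for y in range(n + 1)], n, p)

print(lam_unbiased, 1 / var_unbiased, lam_shrunk, 1 / var_shrunk, fisher)
```

In this sketch the unbiased estimator has $\Lambda=\mathsf{V}^{-1}$, while the biased one keeps $\Lambda=I$ (it is an affine function of $y$, so its correlation with the score is one) even though the reciprocal of its variance is a different number.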
Even though these efficiencies can take the same numerical value,
it is incorrect to characterize the information as the reciprocal
of the variance. The information at $\theta'=\theta(m')$, $\Lambda(g)|_{\theta=\theta'}$,
is a measure of how $g$ changes in a neighborhood of $m'\in M$; that
is, information depends on $M$. The variance at $\theta'$, $\mathsf{V}(g)|_{\theta=\theta'}$,
depends only on $m'$; it is the same for the countless manifolds
we could choose that contain $m'$. Another difference is that variance
is defined on horizontal distributions while information is defined
on vertical distributions. Horizontal and vertical distributions are
described in Example 1.
Example 1.
We consider inference for the proportion of a
population having a genetic variation or other specified characteristic.
We let $1$ ($0$) indicate the characteristic is present (absent) so
$\mathcal{X}=\left\{0,1\right\}$ and for a sample of size $n$,
$M$ is given by (1). Figure 1
shows the standardized score
\[
\bar{s}=\frac{y-np}{\sqrt{np(1-p)}}
\]
where $n=20$ and $p$ is the parameter defined by $p(m)=m(1)$ with
parameter space $P=(0,1)$. The graph of the estimate $\bar{s}_{y}$
when $y=6$ is the black curve. The estimator $\bar{s}$ is represented
by the family of 21 curves, one for each $y$ in the sample space
(unrealized estimates are shown in white).
Figure 1: The standardized score estimate $\bar{s}_{6}$
obtained from the sample with $y=6$ and $n=20$ for the Bernoulli
manifold with the parameter $p=m(1)$ is shown by the black curve.
The standardized score estimator $\bar{s}$ is represented by the
family of 21 curves, one for each $y$ in the sample space (unrealized
estimates are shown in white). Of the continuum of vertical slices
two are shown at $p=.50$ and $p=.55$. The distribution of the point
estimate $\hat{p}$ is shown by the intersection of these 21 curves
with the horizontal axis. Note that for two of these curves the intersection
occurs for a value outside of the parameter space.
Of the continuum of vertical slices two are shown, one at $p=.50$
and another at $p=.55$. Every vertical slice for $0<p<1$ intersects
all 21 curves and while the ordinates of these points of intersection
depend on $p$ the resulting distributions all have mean zero and
variance one. These vertical distributions are the same in every parameterization.
For any parameter $\theta$, $\bar{s}(y,p(m'))=\bar{s}_{\Theta}(y,\theta(m'))$
for all $y$ and all $m'$. In contrast, the abscissa values obtained
from the intersection of these curves with the parameter axis are
the same for all $p$ but the mean and variance of these horizontal
distributions depend on the value of the parameter and on the choice
of parameterization. The horizontal distributions describe the inferential
properties in terms of the mean and variance of the roots of $s$
while the vertical distributions describe how each estimate $\bar{s}_{y}$
changes with the parameter.
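The contrast between vertical and horizontal distributions can be reproduced with exact binomial sums. The sketch below uses $n=20$ and the two slices $p=.50$ and $p=.55$ from the example; the function names are hypothetical, and the roots of $\bar{s}_{y}$ are taken to be $y/n$, including the two boundary values $0$ and $1$ that fall outside the parameter space.

```python
from math import comb, sqrt

N = 20  # sample size used in Example 1

def binom_pmf(p, n=N):
    """Probabilities of y = 0, ..., n under Binomial(n, p)."""
    return [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)]

def s_bar(y, p, n=N):
    """Standardized score (y - n p) / sqrt(n p (1 - p))."""
    return (y - n * p) / sqrt(n * p * (1 - p))

def moments(values, pmf):
    """Mean and variance of a function tabulated over the sample space."""
    mean = sum(v * w for v, w in zip(values, pmf))
    var = sum((v - mean) ** 2 * w for v, w in zip(values, pmf))
    return mean, var

def vertical_moments(p, n=N):
    """Distribution of the ordinates s_bar_y(p) along the slice at p."""
    return moments([s_bar(y, p, n) for y in range(n + 1)], binom_pmf(p, n))

def horizontal_moments(p, n=N):
    """Distribution of the roots y/n of the curves, under the model at p."""
    return moments([y / n for y in range(n + 1)], binom_pmf(p, n))

vertical = {p: vertical_moments(p) for p in (0.50, 0.55)}
horizontal = {p: horizontal_moments(p) for p in (0.50, 0.55)}

for p in (0.50, 0.55):
    print(p, vertical[p], horizontal[p])
```

Each vertical slice returns mean zero and variance one, while the horizontal mean and variance move with $p$ (here $p$ and $p(1-p)/n$).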
When the maximum likelihood estimator exists and is unique it is,
by definition, the parameter-intercept of the score, $\hat{p}=s^{-1}(0)$.
For $y=0$ and for $y=20$, the maximum likelihood estimate does
not exist since $s_{y}$ does not cross the parameter axis. Even when
the point estimate does not exist, confidence regions can be constructed
from the standardized score $\bar{s}$. All 21 estimates $\bar{s}_{y}$
provide $z$-standard deviation intervals
\[
\mbox{CI}_{-z}(y)=\left\{p:\bar{s}_{y}(p)\geq-z\right\},\quad\mbox{CI}_{+z}(y)=\left\{p:\bar{s}_{y}(p)\leq z\right\}.
\]
The intersections of the curve $\bar{s}_{6}$ with the white lines
at $\bar{s}_{y}=\pm2$ in Figure 1 show
the endpoints of $\mbox{CI}_{-2}(6)$ and $\mbox{CI}_{+2}(6)$. Since
generalized estimators are parameter invariant, these intervals correspond
to subsets of the space of models $M$. The interpretation of these
intervals can be stated in terms of their complement: if the true
model is not in $\mbox{CI}_{-2}(y)$ or $\mbox{CI}_{+2}(y)$ then
the score test for the observed data $y$ is at least two standard
deviations from zero. That is, for models outside these intervals
the observed data $y$ would be improbable since the score is at least
two standard deviations from zero. Intervals based on tail probabilities
can be obtained by allowing $z$ to be a function of the parameter;
for $\mbox{CI}_{+z}(6)$ the value for $z$ would be obtained using
the mass assigned to the values $\{0,1,\ldots,5,6\}$.
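The endpoints of these intervals can be computed directly, since $\bar{s}_{y}(p)=\pm z$ reduces to the quadratic $(y-np)^{2}=z^{2}np(1-p)$ in $p$. The following is a minimal sketch for $y=6$, $n=20$, $z=2$; the function names are hypothetical, and the assignment of roots to intervals relies on $\bar{s}_{y}$ being decreasing in $p$.

```python
from math import sqrt

def s_bar(y, n, p):
    """Standardized score (y - n p) / sqrt(n p (1 - p))."""
    return (y - n * p) / sqrt(n * p * (1 - p))

def score_interval_endpoints(y, n, z):
    """Roots in p of (y - n p)^2 = z^2 n p (1 - p), smallest first."""
    a = n * n + z * z * n              # coefficient of p^2
    b = -(2 * n * y + z * z * n)       # coefficient of p
    c = y * y                          # constant term
    disc = sqrt(b * b - 4 * a * c)
    return (-b - disc) / (2 * a), (-b + disc) / (2 * a)

n, z = 20, 2
lower, upper = score_interval_endpoints(6, n, z)

# s_bar_6 is decreasing in p, so
#   CI_{+2}(6) = {p : s_bar_6(p) <= 2}  = [lower, 1)
#   CI_{-2}(6) = {p : s_bar_6(p) >= -2} = (0, upper]
print(lower, upper, s_bar(6, n, lower), s_bar(6, n, upper))

# For y = 0 no maximum likelihood estimate exists, yet CI_{-2}(0) = (0, upper0].
_, upper0 = score_interval_endpoints(0, n, z)
print(upper0)
```

With these inputs the endpoints are roughly $0.143$ and $0.523$, and for $y=0$ the interval $\mbox{CI}_{-2}(0)$ ends at $z^{2}/(n+z^{2})=1/6$.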
Figure 2 shows the log likelihood ratio statistic
$S$ for $y=6$ and its distribution on the other 20 values in the
sample space. The vertical slices at $p=.50$ and $p=.55$ correspond
to those from Figure 1 but the circles
are only plotted when the slope of the intersecting curve is negative.
Each vertical slice has 6 points of intersection corresponding to
samples as extreme as $y=6$. The resulting p-value is the same as
for the score. This will be true for any vertical slice so that inference
from the score and the signed log likelihood ratio are identical in
this example. This will not be true when the curves of the estimator
$g$ intersect. Also, inference from $g$ and unsigned scalar potential
function $G$ will not be identical. In particular, the score and
unsigned log likelihood ratio are not identical in this example.
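The agreement between the score and the signed log likelihood ratio, and the disagreement with the unsigned version, can be checked on a vertical slice. The sketch below is an illustration only: the function names are hypothetical, the signed statistic is taken to be the signed square root of twice the log likelihood ratio, and for $y=0$ and $y=20$ the supremum of the likelihood over the closed interval is used in place of a maximum likelihood estimate (an assumption, with $0\log 0$ read as $0$).

```python
from math import comb, log, sqrt

N = 20  # sample size used in Example 1

def binom_pmf(p, n=N):
    """Probabilities of y = 0, ..., n under Binomial(n, p)."""
    return [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)]

def s_bar(y, p, n=N):
    """Standardized score."""
    return (y - n * p) / sqrt(n * p * (1 - p))

def twice_log_lr(y, p, n=N):
    """2 * [sup_q loglik_y(q) - loglik_y(p)], with 0 * log(0) taken as 0."""
    total = 0.0
    if y > 0:
        total += y * log((y / n) / p)
    if y < n:
        total += (n - y) * log(((n - y) / n) / (1 - p))
    return 2.0 * total

def signed_root(y, p, n=N):
    """Signed square root of twice the log likelihood ratio."""
    sign = 1.0 if y >= n * p else -1.0
    return sign * sqrt(max(twice_log_lr(y, p, n), 0.0))

def lower_tail(stat, y_obs, p, n=N):
    """Mass of samples whose statistic is at most the observed one."""
    cut = stat(y_obs, p, n) + 1e-12
    return sum(w for y, w in enumerate(binom_pmf(p, n)) if stat(y, p, n) <= cut)

def unsigned_tail(y_obs, p, n=N):
    """Mass of samples whose unsigned statistic is at least the observed one."""
    cut = twice_log_lr(y_obs, p, n) - 1e-12
    return sum(w for y, w in enumerate(binom_pmf(p, n))
               if twice_log_lr(y, p, n) >= cut)

p_score = {p: lower_tail(s_bar, 6, p) for p in (0.50, 0.55)}
p_signed = {p: lower_tail(signed_root, 6, p) for p in (0.50, 0.55)}
p_unsigned = {p: unsigned_tail(6, p) for p in (0.50, 0.55)}
print(p_score, p_signed, p_unsigned)
```

Both signed statistics are increasing in $y$ on every slice, so they order the samples identically and give the mass of $\{0,1,\ldots,6\}$; the unsigned statistic also collects samples from the opposite tail and so returns a larger value.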
Figure 2: Twice the log likelihood ratio statistic
obtained from observing $y=6$ out of a sample of size $n=20$ for
the Bernoulli manifold with the parameter $p=m(1)$ is shown by the
black curve. The distribution of twice the log likelihood ratio statistic
is represented by the black curve and 20 white curves.
Example 2.
We consider the same population as before but
now the variable of interest is a measured quantity and we choose
$M_{\mathcal{X}}$ to be the Cauchy family so that for a random sample
of size $n$, $M$ is given by (3). For comparison
we also consider models from the Normal family for which the family
of sampling distributions is given by (2); we use $M_{{\tt Gaus}}$
to identify this manifold. For parameterization $\theta$, the graph
of a generalized estimate $\bar{g}_{y}$ for an observation $y=\left(x_{1},x_{2},\ldots,x_{n}\right)^{{\tt t}}$
is a curve over the parameter space $\Theta$. This corresponds to
the black curve in the previous example. The distribution of the estimator
$\bar{g}$ is more difficult to represent since there is a continuum
of curves indexed by $y$. For $M_{{\tt Gaus}}$ there is also a continuum
of curves but now the sufficient statistic $\bar{x}=n^{-1}\sum x_{i}$
provides a one dimensional index. Nevertheless, the properties of
the vertical distributions for $M$ and $M_{{\tt Gaus}}$ still hold
and confidence regions for $g_{y}$ are defined in the same way.
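To make the curve $\bar{g}_{y}$ concrete, the sketch below evaluates the standardized score for a Cauchy sample on a grid of parameter values. It assumes a unit-scale Cauchy location family with density $1/(\pi(1+(x-\theta)^{2}))$, which may differ from the family in (3); the sample values, grid, and function names are hypothetical. The per-observation Fisher information is obtained by numerical integration rather than quoted.

```python
from math import pi, sqrt, tan

def cauchy_score(xs, theta):
    """Score for location theta under the assumed unit-scale Cauchy model."""
    return sum(2 * (x - theta) / (1 + (x - theta) ** 2) for x in xs)

def fisher_info_one_obs(steps=20000):
    """E(s_1^2) for one observation, midpoint rule after substituting u = tan(t).

    With u = tan(t) the Cauchy density times du becomes dt / pi on (-pi/2, pi/2).
    """
    h = pi / steps
    total = 0.0
    for k in range(steps):
        u = tan(-pi / 2 + (k + 0.5) * h)
        total += (2 * u / (1 + u * u)) ** 2 * h / pi
    return total

def standardized_score_curve(xs, grid):
    """Values of s_bar_y(theta) = s_y(theta) / sqrt(n * I_1) over the grid."""
    scale = sqrt(len(xs) * fisher_info_one_obs())
    return [cauchy_score(xs, theta) / scale for theta in grid]

xs = [-1.2, 0.3, 0.8, 2.5, 9.0]             # hypothetical sample
grid = [-5 + 0.1 * k for k in range(201)]   # theta from -5 to 15
info_1 = fisher_info_one_obs()
curve = standardized_score_curve(xs, grid)
print(info_1, min(curve), max(curve))
```

Under this assumed model each term of the score lies in $[-1,1]$, so the standardized curve stays within $\pm\sqrt{2n}$ for every sample, in contrast to the unbounded curves of the Normal family.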
3 Multi-parameter Families
We consider inference for a parameter $\theta=\left(\theta^{1},\theta^{2},\ldots,\theta^{k}\right)^{{\tt t}}\in\mathbb{R}^{k}$
in the presence of a $k'$-dimensional nuisance parameter $\undertilde{\theta}=(\undertilde{\theta}^{1},\undertilde{\theta}^{2},\ldots,\undertilde{\theta}^{k'})^{{\tt t}}$
so that $M$ is a manifold of dimension $(k+k')$ and $\underline{\theta}^{{\tt t}}=(\theta^{{\tt t}},\undertilde{\theta}^{{\tt t}})$
is a global parameterization $\underline{\theta}:M\rightarrow\underline{\Theta}$. We use $\overline{\nabla}$,
$\nabla$, and $\widetilde{\nabla}$ to indicate differentiation with respect
to $\underline{\theta}$, $\theta$, and $\undertilde{\theta}$, respectively,
so that
\[
\undertilde{s}=\widetilde{\nabla}\ell=(\partial\ell/\partial\undertilde{\theta}^{1},\partial\ell/\partial\undertilde{\theta}^{2},\ldots,\partial\ell/\partial\undertilde{\theta}^{k'})^{{\tt t}}.
\]
Note that subscripts are used for the components of $g$ while superscripts
are used for $\theta$. This convention allows us to use the Einstein
summation convention for calculations involving bases. It also reminds
us that the component $g_{a}$ is not a point estimate for $\theta^{a}$;
if it were, we would want to use superscripts for the components of
$g$. While $\theta$ and $g$ are both $k$-tuples, geometrically
$\theta$ is a contra-variant (tangent) vector while $g$ is a covariant
vector as its components co-vary with the change of basis.
Generalized estimators may depend on the value of the nuisance parameter
but we can make them independent of the nuisance parameterization
by restricting to functions that are orthogonal to $\undertilde{s}$. For
any fixed $m_{\circ}\in M$ there is a $k'$-dimensional submanifold
through $m_{\circ}$
\[
M|_{m_{\circ}}=\left\{m\in M:\theta(m)=\theta_{\circ}\right\}
\]
where $\theta_{\circ}=\theta(m_{\circ})$. The tangent space of $M|_{m_{\circ}}$
at $m\in M|_{m_{\circ}}$ is
\[
\widetilde{T}_{m}M=\mbox{span}\{\undertilde{s}(\cdot,\underline{\theta})\}|_{\underline{\theta}^{{\tt t}}=(\theta_{\circ}^{{\tt t}},\undertilde{\theta}^{{\tt t}})}.
\]
We will require estimators to be orthogonal to $\widetilde{T}_{m}M$
and so define
\[
H_{m}^{\bot}=\left\{h\in H_{M}:E_{m}h=0,\ h\perp_{m}\widetilde{T}_{m}M\right\}.
\]
Equations (4) and (5) for the one
dimensional case become
\begin{align*}
H_{m} &= H_{m}^{\perp}\oplus\widetilde{T}_{m}M\oplus H_{m}^{0} \tag{16}\\
H\!M &= H^{\perp\!}M\oplus\widetilde{T}\!M\oplus H^{0}\!M
\end{align*}
When $M$ is parameterized by $\underline{\theta}$,
$m$ in (16) is replaced with $\underline{\theta}=\underline{\theta}(m)$.
Definition 5.
A generalized estimator for a $k$-dimensional
parameter $\theta$ is a function
\[
g:\mathcal{Y}\times\underline{\Theta}\longrightarrow\mathbb{R}^{k}
\]
and $g_{y}=g(y,\cdot)$ is the corresponding generalized estimate
at $y$ if
\begin{align*}
\mathrm{(I)}\quad & g\left(y,\cdot\right)\in C^{1}(\underline{\Theta},\mathbb{R}^{k})\mbox{ a.e. }y\\
\mathrm{(II)}\quad & g(\cdot,\underline{\theta})\in H_{\underline{\theta}}^{\bot}\mbox{ for all }\underline{\theta}\in\underline{\Theta}\\
\mathrm{(III)}\quad & \mathsf{V}\left(g\right)>0
\end{align*}
where $\mathsf{V}(g)=\mathsf{E}(gg^{\mathsf{t}})\in C^{1}\left(\underline{\Theta},\mathbb{R}^{k\times k}\right)$.
The space of generalized estimators for ΞΈ γγΌγ π \theta italic_ΞΈ γγΌγ is π’ π’ \mathcal{G} caligraphic_G
which we write as π’ Ξ γγΌγ subscript π’ Ξ γγΌγ \mathcal{G}_{\Theta} caligraphic_G start_POSTSUBSCRIPT roman_Ξ γγΌγ end_POSTSUBSCRIPT if we consider more than
one parameterization. If f ΞΈ γγΌγ Β― = f β’ ( β
, ΞΈ γγΌγ Β― ) β H ΞΈ γγΌγ Β― subscript π Β― π π β
Β― π subscript π» Β― π f_{\text{$\underline{\theta}$}}=f(\cdot,\text{$\underline{\theta}$})\in H_{%
\underline{\theta}} italic_f start_POSTSUBSCRIPT underΒ― start_ARG italic_ΞΈ γγΌγ end_ARG end_POSTSUBSCRIPT = italic_f ( β
, underΒ― start_ARG italic_ΞΈ γγΌγ end_ARG ) β italic_H start_POSTSUBSCRIPT underΒ― start_ARG italic_ΞΈ γγΌγ end_ARG end_POSTSUBSCRIPT
for all $\underline{\theta}\in\underline{\Theta}$ and satisfies conditions (I) and (III) of
Definition 5 but $f_{\underline{\theta}}\not\in H_{\underline{\theta}}^{\perp}$,
then $f$ is a pre-estimator. The orthogonalization
of $f$ at $\underline{\theta}$ is
\[
f_{\underline{\theta}}^{\bot}=f_{\underline{\theta}}-f_{\underline{\theta}}^{\top}\in H_{\underline{\theta}}^{\perp}
\tag{17}
\]
where $f_{\underline{\theta}}^{\top}=E_{\underline{\theta}}(f_{\underline{\theta}})+\widetilde{P}_{\underline{\theta}}f_{\underline{\theta}}$
and $\widetilde{P}_{\underline{\theta}}$
is the orthogonal projection onto $\widetilde{T}_{\underline{\theta}}M$.
Since (17) holds for all $\underline{\theta}$
and expectation and orthogonal projections are smooth functions, we
have
\[
f^{\perp}=f-f^{\top}\in\mathcal{G}
\]
where $f^{\top}=\mathsf{E}f+\widetilde{\mathsf{P}}f$.
The score $\nabla\ell$ is a pre-estimator, so we define $s$
to be the orthogonalized score
\[
s=(\nabla\ell)^{\bot}\in\mathcal{G}.
\]
The Fisher information for $\theta$ is $I=\mathsf{V}\left(\nabla\ell\right)$
and the nuisance orthogonalized Fisher information is $I^{\perp}=\mathsf{V}\left((\nabla\ell)^{\bot}\right)=\mathsf{V}\left(s\right)$;
both can be functions of $\undertilde{\theta}$ but only $I^{\perp}$
is the same for all nuisance parameterizations.
The relationship between the score (information) and the orthogonalized
score (orthogonalized information) expressed in the $\underline{\theta}$
parameterization is
\begin{align*}
\left(\nabla\ell\right)^{\perp} &= \nabla\ell-I_{\nabla\widetilde{\nabla}}\undertilde{I}^{-1}\widetilde{\nabla}\ell\\
I^{\perp} &= I-I_{\nabla\widetilde{\nabla}}\undertilde{I}^{-1}I_{\widetilde{\nabla}\nabla}
\end{align*}
where
\[
\underline{I}=I_{\overline{\nabla}\overline{\nabla}}=\begin{pmatrix}
I & I_{\nabla\widetilde{\nabla}}\\
I_{\widetilde{\nabla}\nabla} & \undertilde{I}
\end{pmatrix}
\]
and $I=I_{\nabla\nabla}$ and $\undertilde{I}=I_{\widetilde{\nabla}\widetilde{\nabla}}$
are the Fisher informations for $\theta$ and $\undertilde{\theta}$.
When $I_{\nabla\widetilde{\nabla}}$ vanishes on $\underline{\Theta}$,
the parameterizations $\theta$ and $\undertilde{\theta}$ are orthogonal.
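The formula for $I^{\perp}$ is a Schur complement of the full information matrix $\underline{I}$, so it is easy to evaluate numerically. The following is a minimal sketch, assuming $\underline{I}$ is supplied as a NumPy array with the $k$ interest coordinates listed first; the function name `orthogonalized_information` and the numerical inputs are hypothetical and chosen only for illustration.

```python
import numpy as np

def orthogonalized_information(I_full, k):
    """Return I_perp = I - I_12 @ inv(I_22) @ I_21.

    I_full : full information matrix, interest parameters in the first k
             rows/columns and nuisance parameters in the remaining ones.
    """
    I_full = np.asarray(I_full, dtype=float)
    I_11 = I_full[:k, :k]   # information for the interest parameter
    I_12 = I_full[:k, k:]   # cross information (interest, nuisance)
    I_21 = I_full[k:, :k]   # cross information (nuisance, interest)
    I_22 = I_full[k:, k:]   # information for the nuisance parameter
    # solve() avoids forming the explicit inverse of I_22
    return I_11 - I_12 @ np.linalg.solve(I_22, I_21)

# Hypothetical non-orthogonal case: information is lost to the nuisance.
I_corr = np.array([[2.0, 1.0],
                   [1.0, 2.0]])
I_perp_corr = orthogonalized_information(I_corr, k=1)

# Orthogonal case: one N(mu, sigma^2) observation in the (mu, sigma)
# parameterization has information diag(1/sigma^2, 2/sigma^2).
sigma = 2.0
I_normal = np.diag([1.0 / sigma**2, 2.0 / sigma**2])
I_perp_normal = orthogonalized_information(I_normal, k=1)
```

In the first case the cross term reduces the information from 2 to 1.5; in the second the cross information vanishes, so $I^{\perp}=I$.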
The definition of the scalar potential in the multi-parameter case
is straightforward. As in the scalar parameter case, the log
likelihood $\ell$ is the scalar potential for $s$.
Definition 6.
A scalar potential of
$g\in\mathcal{G}$ is any function $G:\mathcal{Y}\times\underline{\Theta}\longrightarrow\mathbb{R}$
such that $g=(\nabla G)^{\bot}$.
The multivariate version of (13) is
\[
\nabla\mathsf{E}(f^{{\tt t}})=\mathsf{E}\left(\nabla f^{{\tt t}}\right)+\left(\nabla\mathsf{E}\right)(f^{{\tt t}})
\tag{18}
\]
where
\[
\left(\nabla\mathsf{E}\right)\left(f^{{\tt t}}\right)=\mathsf{E}\left(\left(\nabla\ell\right)f^{{\tt t}}\right).
\]
Since $g\in H_{M}^{\perp}$ we have $\mathsf{E}\left(\left(\nabla\ell\right)g^{{\tt t}}\right)=\mathsf{E}\left(sg^{{\tt t}}\right)$,
so that the multivariate version of the score equation (14)
is
\[
\mathsf{E}\left(\nabla g^{{\tt t}}\right)+\mathsf{E}\left(sg^{{\tt t}}\right)=0.
\tag{19}
\]
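As a quick sanity check of (19), consider an illustration that is not taken from the text: a single observation $y\sim N(\theta,1)$ with scalar $\theta$ and no nuisance parameter, so that $s=\nabla\ell$, together with the estimating function $g(y,\theta)=y-\theta$. Both the model and the choice of $g$ are assumptions made only for this example.

```latex
\begin{align*}
\ell(\theta) &= -\tfrac{1}{2}(y-\theta)^{2}+\text{const},
  & s=\nabla\ell &= y-\theta,\\
\mathsf{E}\left(\nabla g^{{\tt t}}\right) &= \mathsf{E}(-1)=-1,
  & \mathsf{E}\left(sg^{{\tt t}}\right) &= \mathsf{E}\left((y-\theta)^{2}\right)=1,\\
\mathsf{E}\left(\nabla g^{{\tt t}}\right)+\mathsf{E}\left(sg^{{\tt t}}\right) &= -1+1=0.
\end{align*}
```

The mean slope of $g$ and its covariance with the score cancel exactly, which is what (19) asserts.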
Differentiating with respect to the nuisance parameter we obtain
\[
\mathsf{E}(\widetilde{\nabla}g^{{\tt t}})+\mathsf{E}(\undertilde{s}g^{{\tt t}})=0
\tag{20}
\]
so that $g$ being nuisance orthogonal means that the average change
of $g$ in the direction of the nuisance parameter is zero.
For the mean slope to be meaningful we need to use its standardized
version.
Definition 7.
For $g\in\mathcal{G}$, define
\[
\bar{g}=\mathsf{V}^{-1/2}g
\]
where $\mathsf{V}=\mathsf{V}\left(g\right)$ so that $\mathsf{V}\left(\bar{g}\right)$
is $I_{\mathsf{id}}$, the $k\times k$ identity matrix. Any $g$
such that $\mathsf{V}\left(g\right)=I_{\mathsf{id}}$ is called a
standardized estimator.
Definition 8.
The information for $\theta$ utilized by $g$
is
\begin{align*}
\Lambda\left(g\right) &= \left(\mathsf{E}\nabla\bar{g}^{\mathsf{t}}\right)\left(\mathsf{E}\nabla\bar{g}^{\mathsf{t}}\right)^{\mathsf{t}}\\
 &= \left(\mathsf{E}\nabla g^{\mathsf{t}}\right)\mathsf{V}^{-1}(g)\left(\mathsf{E}\nabla g^{\mathsf{t}}\right)^{\mathsf{t}}.
\end{align*}
The scalar information for $\theta$ utilized by $g$ is
\[
\lambda(g)=\operatorname{tr}\Lambda(g).
\]
Note $\Lambda(g)\in C^{1}(\underline{\Theta},\mathbb{R}^{k}\times\mathbb{R}^{k})$.
Using the Frobenius norm for a matrix $A$, $||A||=\sqrt{\operatorname{tr}(A^{{\tt t}}A)}$,
we see that the scalar information is the square of the norm of $\mathsf{E}\nabla\bar{g}^{{\tt t}}$:
\[
\lambda(g)=||\mathsf{E}\nabla\bar{g}^{{\tt t}}||^{2}.
\]
By replacing $\nabla$ with $\widetilde{\nabla}$ in Definition 8
we could define $\undertilde{\Lambda}(g)$, the information for $\undertilde{\theta}$.
However, equation (20) shows $\undertilde{\Lambda}(g)=0$
for all $g\in\mathcal{G}$. Restricting estimators to be orthogonal
to the space spanned by the nuisance parameters makes inferences independent
of the choice of the nuisance parameter but also means that estimators
for the parameter of interest have no information for the nuisance
parameter.
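The two expressions for $\Lambda(g)$ in Definition 8 and the Frobenius-norm form of $\lambda(g)$ can be compared numerically. The sketch below assumes the mean slope $\mathsf{E}\nabla g^{\mathsf{t}}$ and the variance $\mathsf{V}(g)$ are already available as matrices; the values of `A` and `V` and the helper names are hypothetical. It uses the second line of Definition 8, under which the mean slope of $\bar{g}$ is $(\mathsf{E}\nabla g^{\mathsf{t}})\mathsf{V}^{-1/2}$.

```python
import numpy as np

def inv_sqrt(V):
    """Symmetric inverse square root of a symmetric positive definite matrix."""
    w, U = np.linalg.eigh(V)
    return U @ np.diag(w ** -0.5) @ U.T

def information(A, V):
    """Lambda(g) = A V^{-1} A^t with A = E(grad g^t) and V = V(g)."""
    return A @ np.linalg.solve(V, A.T)

# Hypothetical mean slope and variance of some estimator g with k = 2.
A = np.array([[-1.0, 0.2],
              [0.0, -0.5]])
V = np.array([[2.0, 0.3],
              [0.3, 1.0]])

Lam = information(A, V)      # second line of Definition 8
A_bar = A @ inv_sqrt(V)      # mean slope of the standardized g_bar
Lam_bar = A_bar @ A_bar.T    # first line of Definition 8
lam = np.trace(Lam)          # scalar information
lam_frob = np.linalg.norm(A_bar, "fro") ** 2
```

Both routes give the same matrix, and the trace agrees with the squared Frobenius norm of the standardized mean slope.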
Theorem 2.
For $k$-dimensional parameter $\theta$ let $s=(\nabla\ell)^{\perp}$
and let $I^{\perp}=\mathsf{V}(s)$ be the orthogonalized Fisher information
for $\theta$. For any $g\in\mathcal{G}$, $\Lambda(g)\leq I^{\perp}$
and $s$ attains this bound, $\Lambda(s)=I^{\perp}$. Furthermore,
\begin{align*}
\Lambda(g) &= \mathsf{V}(\mathsf{P}_{g}s)=\mathsf{V}(\mathsf{P}_{g}\nabla\ell)\\
 &= (I^{\perp})^{1/2}\mathsf{R}\mathsf{R}^{{\tt t}}(I^{\perp})^{1/2}
\end{align*}
where $\mathsf{R}=\mathsf{E}(\bar{s}\bar{g}^{{\tt t}})$ is the correlation
matrix between $s$ and $g$.
Proof.
The displayed equations in the Theorem are obtained from the score
equation (19), which gives
\[
\Lambda(g)=\mathsf{E}(s\bar{g}^{{\tt t}})\mathsf{E}(\bar{g}s^{{\tt t}}).
\]
The first equation follows from the definition of the projection and
its variance: $\mathsf{P}_{g}s=\mathsf{E}(s\bar{g}^{{\tt t}})\bar{g}$
so $\mathsf{V}(\mathsf{P}_{g}s)=\mathsf{E}(s\bar{g}^{{\tt t}})\mathsf{E}(\bar{g}s^{{\tt t}})$.
The second equation follows because $\nabla\ell=s+(\nabla\ell)^{\top}$
and $g$ is orthogonal to $\left(\nabla\ell\right)^{\top}$. The third
equation follows from $\mathsf{E}(s\bar{g}^{{\tt t}})=(I^{\bot})^{1/2}\mathsf{E}(\bar{s}\bar{g}^{{\tt t}})$
since $\mathsf{V}(s)=I^{\bot}$. The inequality $\Lambda(g)\leq I^{\bot}$
follows because the squared length of a projection cannot exceed
the squared length of the original vector.
∎
When there are no nuisance parameters, Theorem 2 holds
with $I^{\bot}=I$ and $s=\nabla\ell$.
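The bound in Theorem 2 can be illustrated numerically from second moments alone. The proof gives $\Lambda(g)=\mathsf{E}(s\bar{g}^{{\tt t}})\mathsf{E}(\bar{g}s^{{\tt t}})$, which equals $\mathsf{E}(sg^{{\tt t}})\mathsf{V}^{-1}(g)\mathsf{E}(gs^{{\tt t}})$. The sketch below is an assumption-laden illustration rather than a model-based computation: it draws an arbitrary positive definite matrix to play the role of the joint covariance of $(s,g)$ and checks that $I^{\perp}-\Lambda(g)$ has no negative eigenvalues.

```python
import numpy as np

def information_from_covariance(C_sg, V_g):
    """Lambda(g) = C_sg V_g^{-1} C_sg^t, with C_sg = E(s g^t) and V_g = V(g)."""
    return C_sg @ np.linalg.solve(V_g, C_sg.T)

rng = np.random.default_rng(0)
k = 2

# Hypothetical joint covariance of the stacked vector (s, g):
# B B^t plus a small ridge is symmetric positive definite.
B = rng.normal(size=(2 * k, 2 * k))
joint = B @ B.T + 1e-3 * np.eye(2 * k)

I_perp = joint[:k, :k]   # V(s)
C_sg = joint[:k, k:]     # E(s g^t)
V_g = joint[k:, k:]      # V(g)

Lam_g = information_from_covariance(C_sg, V_g)

# Lambda(g) <= I_perp means I_perp - Lambda(g) is positive semidefinite.
gap = I_perp - Lam_g
min_eig = np.linalg.eigvalsh(gap).min()

# Taking g = s gives C_sg = V_g = I_perp, so the bound is attained.
Lam_s = information_from_covariance(I_perp, I_perp)
```

The matrix `gap` is the Schur complement of `V_g` in `joint`, which is one way to see why the inequality holds in the matrix (Loewner) sense.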
Definition 9.
The $\Lambda$-efficiency of $g$ is
\[
\mbox{Eff}^{\Lambda}\left(g\right)=(I^{\perp})^{-1/2}\Lambda(g)(I^{\perp})^{-1/2}.
\]
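Definition 9 rescales $\Lambda(g)$ by the bound of Theorem 2, so the efficiency matrix can be computed from the same second moments. The following sketch compares it with $\mathsf{R}\mathsf{R}^{{\tt t}}$, where $\mathsf{R}$ is the correlation matrix of Theorem 2. The matrices `I_perp`, `C_sg` and `V_g` are hypothetical values standing in for $\mathsf{V}(s)$, $\mathsf{E}(sg^{{\tt t}})$ and $\mathsf{V}(g)$.

```python
import numpy as np

def inv_sqrt(V):
    """Symmetric inverse square root of a symmetric positive definite matrix."""
    w, U = np.linalg.eigh(V)
    return U @ np.diag(w ** -0.5) @ U.T

def efficiency(I_perp, C_sg, V_g):
    """Eff(g) = I_perp^{-1/2} Lambda(g) I_perp^{-1/2}."""
    Lam = C_sg @ np.linalg.solve(V_g, C_sg.T)   # Lambda(g)
    S = inv_sqrt(I_perp)
    return S @ Lam @ S

# Hypothetical second moments of (s, g) for k = 2.
I_perp = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
C_sg = np.array([[1.0, 0.2],
                 [0.1, 0.6]])
V_g = np.array([[1.5, 0.1],
                [0.1, 1.2]])

eff = efficiency(I_perp, C_sg, V_g)

# Correlation matrix R = E(s_bar g_bar^t) between standardized s and g.
R = inv_sqrt(I_perp) @ C_sg @ inv_sqrt(V_g)
eff_eigs = np.linalg.eigvalsh(eff)
```

For these inputs the eigenvalues of the efficiency matrix lie between 0 and 1, with 1 corresponding to a direction in which $g$ uses all of the available information.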
Corollary 2 follows immediately from Theorem 2.
Corollary 2.
\begin{align*}
\mbox{Eff}^{\Lambda}\left(g\right) &= \mathsf{V}(\mathsf{P}_{g}\bar{s})\\
 &= \mathsf{R}\mathsf{R}^{{\tt t}}.
\end{align*}