Efficient estimation of partially linear additive Cox models and variance estimation under shape restrictions

Junjun Lang
KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai 200062, China

Yukun Liu (corresponding author: ykliu@sfs.ecnu.edu.cn)
KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai 200062, China

Jing Qin
National Institute of Allergy and Infectious Diseases, National Institutes of Health, USA

Abstract

Shape-restricted inference has exhibited empirical success in various applications with survival data. However, some existing works lack a rigorous theoretical justification and an easy-to-use variance estimator with theoretical guarantees. Motivated by Deng et al. (2023), this paper studies an additive and shape-restricted partially linear Cox model for right-censored data, where each additive component satisfies a specific shape restriction: monotone increasing, monotone decreasing, convex, or concave. We systematically investigate the consistency and convergence rates of the shape-restricted maximum partial likelihood estimator (SMPLE) of all the underlying parameters. We further establish the asymptotic normality and semiparametric efficiency of the SMPLE for the linear covariate effect. To estimate the asymptotic variance, we propose a data-splitting variance estimation method that is versatile and broadly applicable. Our simulation results and an analysis of the Rotterdam Breast Cancer dataset demonstrate that the SMPLE performs comparably to the maximum likelihood estimator under the Cox model when the Cox model is correct, and outperforms the latter and the method of Huang (1999) when the Cox model is violated or the hazard is nonsmooth. Meanwhile, the proposed variance estimation method usually leads to reliable interval estimates based on the SMPLE and its competitors.

Keywords:  Shape restriction; Right-censored data; Additive model; Semiparametric efficiency; Variance estimation

1 Introduction

Shape restrictions (such as monotonicity and convexity) arise naturally in numerous practical scenarios. For instance, the growth curves of animals and plants in ecology and dose-response curves in medicine are inherently non-decreasing (Chang et al., 2007; Wang and Ghosh, 2012). In economics, utility and production functions are often concave in income and prices (Matzkin, 1991; Varian, 1984), while cost functions are monotone increasing, concave in input prices, and may exhibit non-increasing or non-decreasing returns to scale (Horowitz and Lee, 2017). In genetic epidemiology studies, the cumulative risk of a disease for individuals is monotone (Qin et al., 2014), and in reliability analysis, the bathtub curve describing the failure rate is typically convex.

Incorporating shape restrictions into statistical analysis, apart from its interpretability and its ability to enforce domain-specific constraints, often results in an estimation procedure that is free of tuning parameters, which enhances its efficiency and robustness. Shape-restricted techniques have therefore become increasingly popular tools for statistical inference and learning in various settings over the past decades. A comprehensive review of shape-restricted nonparametric inference can be found in Groeneboom and Jongbloed (2014) and the references therein. Recently, Chen and Samworth (2016) developed an algorithm for the estimation of the generalized additive model in which each of the additive components is linear or subject to a shape restriction. Balabdaoui et al. (2019) considered the estimation of the index parameter in a single-index model with a monotonically increasing link function. Deng and Zhang (2020) studied minimax and adaptation rates in general multiple isotonic regression. Feng et al. (2022) systematically investigated the theoretical properties of the least squares estimator of an S-shaped regression function.

This paper focuses on statistical inference for right-censored survival data. Let $T$ denote the survival time and $(Z,X)\in\mathbb{R}^{p}\times\mathbb{R}^{d}$ denote a $(p+d)\times 1$ vector of covariates. We consider the partially linear Cox model (PLCM; Sasieni, 1992) for modelling the conditional hazard function, i.e.

$$\lambda_{T}(t\mid x,z)=\lambda(t)\exp(\beta^{\top}x+g(z)), \tag{1}$$

where $\lambda(\cdot)$ is the unspecified baseline hazard function, $\beta\in\mathbb{R}^{d}$ is unspecified and $g(\cdot):\mathbb{R}^{p}\mapsto\mathbb{R}$ is an unknown function. This model reduces to the renowned Cox proportional hazards model (Cox, 1972, 1975) when the covariate $Z$ is absent, and it becomes the nonparametric Cox model (Sleeper and Harrington, 1990; O’Sullivan, 1993) in the absence of $X$.

Many nonparametric techniques have been developed for the estimation of the PLCM, in particular for the linear covariate effect. Examples include profile partial likelihood together with a kernel technique (Heller, 2001), maximum likelihood estimation with a deep neural network (Zhong et al., 2022), and a kernel machine representation method (Rong et al., 2024). However, these methods are hampered by the curse of dimensionality, lack interpretability for $g(Z)$, or depend on tuning parameters whose selection is not always straightforward. Alternatively, Huang (1999) proposed to model $g(Z)$ by a generalized additive model (Hastie and Tibshirani, 1986, 1990), which effectively avoids the curse of dimensionality and enforces an additive effect for the covariate $Z$. Specifically,

$$g(Z)=\sum_{j=1}^{p}g_{j}(Z_{(j)}), \tag{2}$$

where for $1\leq j\leq p$, $Z_{(j)}$ is the $j$-th component of $Z$ and $g_{j}$ is an unknown function. Huang (1999) proposed the use of polynomial splines to fit the unknown additive components. This method entails a number of tuning parameters and yields convergence rates that lack conciseness and elegance. Furthermore, the spline method does not provide good interpretability for the effect of the additive covariate $Z_{(j)}$.

Our paper is motivated by the work of Deng et al. (2023), which studied a shape-restricted and additive PLCM. Specifically, under models (1) and (2), they assumed that each $g_{j}$ is monotone increasing, monotone decreasing, convex, or concave. An active-set optimization algorithm was provided to calculate the shape-restricted maximum likelihood estimator. The shape-restriction strategy facilitates the use of prior knowledge about the effect of each covariate $Z_{(j)}$ on the log conditional hazard function and leads to a tuning-parameter-free estimation procedure. However, they proved only a consistency result and did not provide any asymptotic normality results. Qin et al. (2021) studied a PLCM with a single additive component subject to shape restrictions, but they did not establish any $\sqrt{n}$-consistency result. In addition, in general shape-restricted inference, even if asymptotic normality results can be established, it is generally challenging to construct reasonable estimators for the asymptotic variances with theoretical guarantees (Groeneboom and Hendrickx, 2017).

This paper makes two main contributions to the literature on additive and shape-restricted PLCMs for survival data. The first contribution is to provide rigorous statistical guarantees for the shape-restricted maximum partial likelihood estimator (SMPLE) and the induced Breslow-type estimator for the baseline cumulative hazard function under the model assumption of Deng et al. (2023). This includes a thorough convergence rate analysis for the estimators of the infinite-dimensional parameters, as well as asymptotic normality and semiparametric efficiency for the estimator of the linear covariate effect. Our second contribution is to offer an easy-to-use estimator for the asymptotic variance of the linear covariate effect estimator. We show that this variance estimation method always provides consistent estimators once the corresponding asymptotic normality result holds. This method is flexible and generally applicable, especially in shape-restricted inference, where theoretical guarantees for a bootstrap variance estimator are generally rather challenging to establish (Groeneboom and Hendrickx, 2017). Our simulation results and an analysis of the Rotterdam Breast Cancer dataset demonstrate that the SMPLE performs comparably to the maximum likelihood estimator under the Cox model when the Cox model is correct, and outperforms the latter and the method of Huang (1999) when the Cox model is violated and the hazard is nonsmooth. Meanwhile, the proposed variance estimation method usually leads to reliable interval estimates for the SMPLE and its competitors.

The rest of this paper is organized as follows. Section 2 introduces notation, data, and the shape-restricted maximum partial likelihood estimators (SMPLEs). Section 3 investigates the convergence rates of the SMPLEs for all the unknown parameters, including $\beta$, the unknown additive components, and the baseline cumulative hazard function. Section 4 establishes the asymptotic normality and semiparametric efficiency of the SMPLE for $\beta$. A novel method is also provided to estimate the asymptotic variance of the SMPLE of $\beta$. A simulation study and a real data analysis are presented in Sections 5 and 6, respectively. Section 7 contains concluding remarks. For clarity, all technical proofs are postponed to the supplementary material.

2 Methodology

2.1 Data and model assumptions

Let $T$ and $(X,Z)$ be the survival time and the vector of covariates, respectively, as in the introduction. Suppose that, given $(X,Z)$, the conditional hazard function of $T$ satisfies model (1) with $g(Z)$ satisfying (2). The survival time $T$ may be right censored by a censoring time $C$, and we only observe $Y=\min(T,C)$. Throughout this paper, we use $\mathbf{1}(A)$ to denote the indicator function of the set $A$ and use a subscript 0 to highlight the true counterpart of a parameter. Let $\Delta=\mathbf{1}(T\leq C)$ be the non-censoring indicator. Given $n$ independent and identically distributed (iid) observations $(X_{i},Z_{i},Y_{i},\Delta_{i})$, $1\leq i\leq n$, from $(X,Z,Y,\Delta)$, we wish to infer $(\beta_{0},g_{0})$ and the baseline cumulative hazard function $\Lambda_{0}(y)=\int_{0}^{y}\lambda_{0}(t)dt$.
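To make the observation scheme concrete, the following is a minimal simulation sketch, not the authors' simulation design. It assumes scalar covariates ($d=p=1$), a constant baseline hazard (so that, given the covariates, $T$ is exponential with rate $\lambda\exp(\beta x+g(z))$), and uniform censoring; the function name `simulate` and all parameter names and default values are hypothetical, and the centering condition used later for identifiability is not enforced.

```python
import math
import random
from typing import Callable, List, Tuple


def simulate(
    n: int,
    beta: float,
    g: Callable[[float], float],
    base_rate: float = 1.0,   # assumed constant baseline hazard lambda(t) = base_rate
    cens_max: float = 2.0,    # assumed censoring time C ~ Uniform(0, cens_max)
    seed: int = 0,
) -> List[Tuple[float, float, float, int]]:
    """Draw n observations (X_i, Z_i, Y_i, Delta_i) under model (1) with scalar X and Z."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)        # linear covariate X
        z = rng.random()                  # additive covariate Z in [0, 1)
        rate = base_rate * math.exp(beta * x + g(z))
        t = rng.expovariate(rate)         # survival time T: constant hazard => exponential
        c = rng.uniform(0.0, cens_max)    # censoring time C, independent of T given (X, Z)
        y = min(t, c)                     # observed time Y = min(T, C)
        delta = 1 if t <= c else 0        # non-censoring indicator Delta = 1(T <= C)
        data.append((x, z, y, delta))
    return data
```

For example, `simulate(200, 0.5, lambda z: z - 0.5)` uses a monotone increasing additive component.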

The identifiability issue of models (1) and (2) was investigated by Deng et al. (2023), following which we assume ${\mathbb{E}}\{g_{0,j}(Z_{(j)})\Delta\}=0$, $j=1,2,\cdots,p$, for identifiability. Furthermore, we assume that for $1\leq j\leq p$, $g_{0,j}(\cdot)$ satisfies one of the four shape restrictions: monotone increasing, monotone decreasing, convex and concave, which are encoded as shape types 1, 2, 3 and 4, respectively. For any additive function $g=\sum_{j=1}^{p}g_{j}(z_{(j)})$, we define ${\rm sha}(g)=({\rm sha}(g_{1}),\cdots,{\rm sha}(g_{p}))^{\top}$, where ${\rm sha}(h)\in\{1,2,3,4\}$ denotes the shape type of a univariate function $h$. We always denote $\boldsymbol{k}_{0}={\rm sha}(g_{0})$.
Let $\mathcal{X}$ be the support of $X$ and, for simplicity, we assume that the support of $Z_{(j)}$ is $[0,1]$ for $j=1,2,\cdots,p$.
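The four shape types can be illustrated on a finite grid. The sketch below checks whether values $h(z_1),\dots,h(z_m)$ at strictly increasing points are compatible with a given shape type; it is only an illustration, with the hypothetical name `satisfies_shape`, and it assumes that "monotone increasing" means non-decreasing and that convexity is checked through non-decreasing successive slopes.

```python
from typing import Sequence


def satisfies_shape(
    z: Sequence[float], h: Sequence[float], shape: int, tol: float = 1e-12
) -> bool:
    """Check shape type 1 (increasing), 2 (decreasing), 3 (convex) or 4 (concave).

    z must be strictly increasing; h[i] is the function value at z[i].
    """
    if shape not in (1, 2, 3, 4):
        raise ValueError("shape must be one of 1, 2, 3, 4")
    m = len(z)
    if shape in (1, 2):
        sign = 1.0 if shape == 1 else -1.0
        # successive differences must all have the required sign
        return all(sign * (h[i + 1] - h[i]) >= -tol for i in range(m - 1))
    # slopes between neighbouring grid points
    slopes = [(h[i + 1] - h[i]) / (z[i + 1] - z[i]) for i in range(m - 1)]
    sign = 1.0 if shape == 3 else -1.0
    # convex: slopes non-decreasing; concave: slopes non-increasing
    return all(
        sign * (slopes[i + 1] - slopes[i]) >= -tol for i in range(len(slopes) - 1)
    )
```

On the grid $0, 0.25, 0.5, 0.75, 1$, the function $h(z)=z^2$ passes the checks for shape types 1 and 3 and fails those for types 2 and 4.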

2.2 SMPLE

For any $(\beta,g)$, denote $\eta=(\beta,g)$ and $R_{\eta}(U)=X^{\top}\beta+g(Z)$, where $U=(X^{\top},Z^{\top})^{\top}$. The usual partial log-likelihood, scaled by $1/n$, is

$$L_{n}(\eta)=\frac{1}{n}\sum_{i=1}^{n}\Delta_{i}\left[R_{\eta}(U_{i})-\log\left(\sum_{j=1}^{n}\mathbf{1}(Y_{j}\geq Y_{i})\exp(R_{\eta}(U_{j}))\right)\right].$$
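As a sketch of how the scaled partial log-likelihood $L_n(\eta)$ above can be evaluated, the function below takes the observed times, the non-censoring indicators, and the precomputed risk scores $R_\eta(U_i)$. The name `partial_loglik` is hypothetical, and the direct double loop costs $O(n^2)$ operations; it is written for readability rather than speed.

```python
import math
from typing import Sequence


def partial_loglik(
    y: Sequence[float], delta: Sequence[int], r: Sequence[float]
) -> float:
    """Evaluate L_n(eta), where r[i] = R_eta(U_i) = X_i' beta + g(Z_i)."""
    n = len(y)
    total = 0.0
    for i in range(n):
        if delta[i] == 1:  # only uncensored observations contribute
            # risk set at Y_i: all subjects j with Y_j >= Y_i
            risk = sum(math.exp(r[j]) for j in range(n) if y[j] >= y[i])
            total += r[i] - math.log(risk)
    return total / n
```

Adding the same constant to every risk score leaves the value unchanged, which is why the centering constraints on the additive components are needed for identifiability.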

We propose to estimate $\eta$ by the shape-restricted maximum partial likelihood estimator (SMPLE),

$$\hat{\eta}:=(\hat{\beta},\hat{g})=\operatorname{argmax}_{\eta\in\mathbb{R}^{d}\times\mathcal{G}_{\boldsymbol{k}_{0}}}L_{n}(\eta), \tag{3}$$

where

$$\mathcal{G}_{\boldsymbol{k}_{0}}=\bigg\{g:[0,1]^{p}\mapsto\mathbb{R}\mid g(z)=\sum_{j=1}^{p}g_{j}(z_{(j)}),~{\rm sha}(g)=\boldsymbol{k}_{0},~\mathbb{E}\left[\Delta g_{j}(Z_{(j)})\right]=0,~1\leq j\leq p\bigg\}$$

is the parameter space of $g$. With the SMPLE in (3), we estimate $\Lambda_{0}(y)$ by the Breslow-type estimator

$$\hat{\Lambda}(y;\hat{\eta})=\frac{1}{n}\sum_{j=1}^{n}\frac{\Delta_{j}}{S_{0,n}(Y_{j},\hat{\eta})}\mathbf{1}(y\geq Y_{j}), \tag{4}$$

where $S_{0,n}(y,\eta)=(1/n)\sum_{i=1}^{n}\{\mathbf{1}(Y_{i}\geq y)\exp(R_{\eta}(U_{i}))\}$.
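The Breslow-type estimator in (4) can be computed directly from its definition once the fitted risk scores are available. The following is a minimal sketch with hypothetical function names `s0n` and `breslow`; it takes the scores $R_{\hat\eta}(U_i)$ as given and does not perform the shape-restricted fit itself.

```python
import math
from typing import Sequence


def s0n(y_obs: Sequence[float], r: Sequence[float], y: float) -> float:
    """S_{0,n}(y, eta) = (1/n) * sum_i 1(Y_i >= y) * exp(R_eta(U_i))."""
    n = len(y_obs)
    return sum(math.exp(r[i]) for i in range(n) if y_obs[i] >= y) / n


def breslow(
    y_obs: Sequence[float], delta: Sequence[int], r: Sequence[float], y: float
) -> float:
    """Breslow-type estimate of the baseline cumulative hazard at time y, as in (4)."""
    n = len(y_obs)
    total = 0.0
    for j in range(n):
        if delta[j] == 1 and y_obs[j] <= y:  # uncensored events up to time y
            total += 1.0 / s0n(y_obs, r, y_obs[j])
    return total / n
```

When all risk scores are zero, the estimate reduces to the Nelson-Aalen estimator, which gives a quick sanity check: with three uncensored times 1, 2, 3, the value at $y=3$ is $1/3+1/2+1=11/6$.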

The SMPLE defined in (3) can be calculated with the active-set algorithm introduced in Deng et al. (2023). Let $\hat{g}_{j}$ be functions satisfying $\hat{g}(Z)=\sum_{j=1}^{p}\hat{g}_{j}(Z_{(j)})$ for all $Z$. The function $\hat{g}(Z)$ is unique only at the observed $Z_{i}$ and is therefore typically non-unique for $Z$ other than the $Z_{i}$’s, which is akin to general shape-restricted regression estimators (Chen and Samworth, 2016). This implies that $\hat{g}_{j}(\cdot)$ is usually non-unique for $1\leq j\leq p$, and the solution set of $\hat{g}_{j}(\cdot)$ always contains a piecewise linear function (Deng et al., 2023). See Figure 3 for an illustration.

3 Rate of convergence

The consistency property of the SMPLEs $\hat{\beta}$, $\hat{g}$, and $\hat{g}_{j}$ was established by Deng et al. (2023). In this section, we establish their convergence rates. We make the following assumptions.

Assumption 1.

(i) The observed $\{Y_{i}\}_{i=1}^{n}$ are in the interval $[0,\tau]$, for some $\tau>0$. (ii) Given $U$, $T$ and $C$ are conditionally independent. (iii) $\Lambda_{0}(\tau)<\infty$ and ${\rm pr}(C\geq\tau\mid U)\geq c>0$ almost surely for some constant $c$. (iv) $\mathbb{E}[\Delta X]=0$ and $\mathbb{E}[\Delta]>0$.

Let $\|\cdot\|$ denote the usual Euclidean norm and $\|f(\cdot)\|_{\infty}$ the supremum norm of a real-valued function $f$. For any constant $M>0$, define

$$\mathcal{K}_{M,\boldsymbol{k}_{0}}:=\left\{\eta\mid\eta=(\beta,g),~g\in\mathcal{G}_{\boldsymbol{k}_{0}},~\|\beta\|+\sum_{j=1}^{p}\|g_{j}(\cdot)\|_{\infty}\leq M\right\}.$$
Assumption 2.

The support $\mathcal{X}$ of $X$ is a bounded subset of $\mathbb{R}^{d}$ and there exists a constant $M_{0}>0$ such that $\eta_{0}\in\mathcal{K}_{M_{0},\boldsymbol{k}_{0}}$.

Assumption 3.

There exists a small positive constant $\epsilon$ such that ${\rm pr}(\Delta=1\mid U)>\epsilon$ almost surely with respect to the probability measure of $U$.

Assumption 4.

The joint density of $(Y,Z,\Delta)$ satisfies

$$0<\inf_{(y,z)\in[0,\tau]\times[0,1]^{p}}{\rm pr}(Y=y,Z=z,\Delta=1)\leq\sup_{(y,z)\in[0,\tau]\times[0,1]^{p}}{\rm pr}(Y=y,Z=z,\Delta=1)<\infty.$$
Assumption 5.

When ${\rm sha}(g_{0,j})\in\{3,4\}$, the density function of $Z_{(j)}$ with respect to the Lebesgue measure is bounded above and below by positive constants uniformly on $[0,1]$.

Assumptions 1 and 2 are standard in the theoretical analysis of the traditional Cox model and its variants (Huang, 1999; Zhong et al., 2022). Assumption 3 ensures that the probability of being uncensored is bounded away from zero regardless of the covariate values, and it is used to establish the convergence rate results in Theorem 1. Assumption 4 is used in the calculation of the semiparametric efficiency lower bound (Huang, 1999). In Assumption 5, the upper bound requirement is used to calculate some entropy results needed in the proof of Proposition 1, and the lower bound requirement guarantees, in the proof of Theorem 3, that the errors incurred in approximating the convex/concave additive components by piecewise linear functions are small enough.

The proposition below establishes the consistency of $R_{\hat{\eta}}(\cdot)$ as an estimator of $R_{\eta_{0}}(\cdot)$, which roughly implies the consistency of $\hat{\eta}=(\hat{\beta},\hat{g})$. We could also have proved the consistency of $\hat{\beta}$ and of each $\hat{g}_{j}$ separately. However, given the consistency of $R_{\hat{\eta}}(\cdot)$, these results are not needed in the proofs of the subsequent convergence rate results.

Proposition 1.

Suppose that models (1) and (2) and Assumptions 1, 2 and 5 are satisfied. As $n\rightarrow\infty$, we have

$$\|R_{\hat{\eta}}(\cdot)-R_{\eta_{0}}(\cdot)\|_{\infty}=o_{p}(1). \tag{5}$$

Define $d^{2}(\eta,\eta_{0})=\mathbb{E}_{U}\left\{(R_{\eta}(U)-R_{\eta_{0}}(U))^{2}\right\}$, where $\mathbb{E}_{U}$ denotes the expectation with respect to $U$. Let $\|\cdot\|_{L_{2}}$ denote the $L_{2}(P)$ norm and $\rho(\boldsymbol{k}_{0})=0.5+0.5\cdot\mathbf{1}(\cup_{i=1}^{p}({\rm sha}(g_{0,i})\in\{1,2\}))$. One of our main results establishes the convergence rate of the SMPLE $\hat{\eta}$.

Theorem 1.

Assume the same conditions as in Proposition 1. As $n\rightarrow\infty$, we have

$$d(\hat{\eta},\eta_{0})=O_{p}\left(n^{-\frac{1}{2+\rho(\boldsymbol{k}_{0})}}\right).$$

Furthermore, if Assumptions 3–4 are satisfied and $I(\beta_{0})$ (defined in (7)) is non-singular, then for $1\leq j\leq p$,

$$\|\hat{\beta}-\beta_{0}\|=O_{p}\left(n^{-\frac{1}{2+\rho(\boldsymbol{k}_{0})}}\right),\quad\|\hat{g}_{j}(Z_{(j)})-g_{0,j}(Z_{(j)})\|_{L_{2}}=O_{p}\left(n^{-\frac{1}{2+\rho(\boldsymbol{k}_{0})}}\right).$$

According to Theorem 1, all $\hat{g}_{j}$ converge at the rate $O_{p}(n^{-2/5})$ if none of the additive components of $g$ is monotonic. Conversely, if at least one additive component of $g$ is monotonic, their convergence rates all slow down to $O_{p}(n^{-1/3})$. An explanation for this finding is that the class of bounded monotonic functions is much more complex than the class of bounded convex (or concave) functions. These convergence rates do not depend on the covariate dimension and take a simpler form than those in Huang (1999) and Zhong et al. (2022). Theorem 1 also establishes the convergence rate of $\hat{\beta}$, although this rate is sub-optimal.
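To make the two rates explicit, the exponent in Theorem 1 can be evaluated case by case. This is only a restatement of the theorem, reading ${\rm sha}(g_{0,i})\in\{1,2\}$ as "the $i$-th component is monotone", which is how the discussion above uses it:

$$
\rho(\boldsymbol{k}_{0})=
\begin{cases}
0.5, & \text{no component is monotone},\\
1, & \text{at least one component is monotone},
\end{cases}
\qquad
n^{-\frac{1}{2+\rho(\boldsymbol{k}_{0})}}=
\begin{cases}
n^{-1/2.5}=n^{-2/5}, & \rho(\boldsymbol{k}_{0})=0.5,\\
n^{-1/3}, & \rho(\boldsymbol{k}_{0})=1.
\end{cases}
$$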

With Theorem 1, we are able to establish the uniform rate of convergence of the SMPLE $\hat{\Lambda}(y;\hat{\eta})$ in (4) of the baseline cumulative hazard function $\Lambda_{0}(y)$. It turns out that $\hat{\Lambda}(y;\hat{\eta})$ has the same convergence rate as $\hat{\eta}$, although the two rates are measured in different distances.

Theorem 2.

Assume the same conditions as in Proposition 1. As $n\rightarrow\infty$, it holds that

$$\sup_{y\in[0,\tau]}\left|\hat{\Lambda}(y;\hat{\eta})-\Lambda_{0}(y)\right|=O_{p}\left(n^{-\frac{1}{2+\rho(\boldsymbol{k}_{0})}}\right).$$
Remark 1.

In practice, one may impose a combination of monotonicity and convexity/concavity constraints on the additive components according to domain knowledge. See Chen and Samworth (2016), Kuchibhotla et al. (2023) and Deng et al. (2023) for further motivation on additional shape constraints. Proposition 1 and Theorems 1–4 still hold when model (2) incorporates additive components that satisfy both monotonicity and convexity/concavity restrictions. An intuitive explanation is that additional constraints on the additive components shrink the parameter space $\mathcal{G}_{\boldsymbol{k}_{0}}$, so the convergence rates of the SMPLE are at least as good as before.

4 Asymptotic normality and efficiency

Building on the convergence rates in the previous section, we show in this section that the SMPLE $\hat{\beta}$ in (3) of the linear covariate effect is asymptotically normal and semiparametric efficient, in the sense that its asymptotic variance achieves the semiparametric efficiency lower bound (Bickel et al., 1993), or the information bound, for estimating $\beta_{0}$ under models (1) and (2).

We begin by presenting the information bound for $\beta_{0}$. Recall that $U=(X^{\top},Z^{\top})^{\top}$, and define

$$M(y)\equiv M(y\mid Y,\Delta,U)=\Delta\mathbf{1}(Y\leq y)-\int_{0}^{y}\mathbf{1}(Y\geq t)\exp\{R_{\eta_{0}}(U)\}\,d\Lambda_{0}(t),$$

which is a counting process martingale associated with the Cox model. The log-likelihood of model (1) based on one observation $(X,Z,Y,\Delta)$ is (up to a constant)

$$\ell(\beta,g,\Lambda)=\Delta\log\lambda(Y)+\Delta\{X^{\top}\beta+g(Z)\}-\Lambda(Y)\exp\{X^{\top}\beta+g(Z)\}.\tag{6}$$

Consider parametric smooth sub-models $\{\lambda_{(\nu)}:\nu\in\mathbb{R}\}$ and $\{g_{j,(\nu)}:\nu\in\mathbb{R}\}$, $1\leq j\leq p$, with $\lambda_{(0)}=\lambda_{0}$ and $g_{j,(0)}=g_{0,j}$. Define $L_{2}(P_{Y})$ to be the set of functions $a(\cdot)$ satisfying $\mathbb{E}\{\Delta a^{2}(Y)\}<\infty$ and $a(y)=\partial\log\lambda_{(\nu)}(y)/\partial\nu|_{\nu=0}$ for some sub-model.
Similarly, for $1\leq j\leq p$, define $L_{2}^{0}(P_{Z_{(j)}})$ to be the set of functions $h_{j}$ satisfying $\mathbb{E}\{\Delta h_{j}(Z_{(j)})\}=0$, $\mathbb{E}\{\Delta h_{j}^{2}(Z_{(j)})\}<\infty$ and $h_{j}(z_{(j)})=\partial g_{j,(\nu)}(z_{(j)})/\partial\nu|_{\nu=0}$ for some sub-model. The following lemma, which is Theorem 3.1 of Huang (1999), gives the information bound for $\beta_{0}$.

Lemma 1 (Theorem 3.1 of Huang (1999)).

Suppose that models (1) and (2) and Assumptions 1–4 are satisfied. Let $((\boldsymbol{a}^{\star})^{\top},(\boldsymbol{h}_{1}^{\star})^{\top},\cdots,(\boldsymbol{h}_{p}^{\star})^{\top})^{\top}$ be the unique vector-valued function in $L_{2}(P_{Y})^{d}\times L_{2}^{0}(P_{Z_{(1)}})^{d}\times\cdots\times L_{2}^{0}(P_{Z_{(p)}})^{d}$ that minimizes

$$\mathbb{E}\left\{\Delta\|X-\boldsymbol{a}(Y)-\boldsymbol{h}_{1}(Z_{(1)})-\cdots-\boldsymbol{h}_{p}(Z_{(p)})\|^{2}\right\}.$$
  • (1)

    The efficient score for estimation of $\beta_{0}$ is

    $$\ell_{\beta_{0}}^{\star}(Y,\Delta,U)=\int_{0}^{\tau}\{X-\boldsymbol{a}^{\star}(y)-\boldsymbol{h}^{\star}(Z)\}\,dM(y),$$

    where $\boldsymbol{h}^{\star}(Z)=\sum_{j=1}^{p}\boldsymbol{h}_{j}^{\star}(Z_{(j)})$ and $\boldsymbol{a}^{\star}(y)=\mathbb{E}\left\{X-\boldsymbol{h}^{\star}(Z)\mid Y=y,\Delta=1\right\}$.

  • (2)

    The information bound for estimation of $\beta_{0}$ is

    $$I(\beta_{0})=\mathbb{E}\left[\left\{\ell_{\beta_{0}}^{\star}(Y,\Delta,U)\right\}^{\otimes 2}\right]=\mathbb{E}\left[\Delta\left\{X-\boldsymbol{a}^{\star}(Y)-\boldsymbol{h}^{\star}(Z)\right\}^{\otimes 2}\right],\tag{7}$$

    where $A^{\otimes 2}=AA^{\top}$ for any vector or matrix $A$.

Additional assumptions are needed to obtain the asymptotic normality and efficiency of $\hat{\beta}$. Denote $\boldsymbol{h}_{j}^{\star}=(\boldsymbol{h}_{j,1}^{\star},\cdots,\boldsymbol{h}_{j,d}^{\star})^{\top}$ for $1\leq j\leq p$.

Assumption 6.

When ${\rm sha}(g_{0,j})\in\{1,2\}$, there exist constants $\tilde{C}_{1}>0$ and $\tilde{C}_{2}>0$ such that $\|\boldsymbol{h}^{\star}_{j}(x_{1})-\boldsymbol{h}^{\star}_{j}(y_{1})\|\leq\tilde{C}_{1}|x_{1}-y_{1}|$ and $|g_{0,j}^{-1}(x_{2})-g_{0,j}^{-1}(y_{2})|\leq\tilde{C}_{2}|x_{2}-y_{2}|$ for all $x_{1},y_{1}\in[0,1]$ and $x_{2},y_{2}\in[-M_{0},M_{0}]$.

Assumption 7.

When ${\rm sha}(g_{0,j})\in\{3,4\}$, the function $\boldsymbol{h}_{j,\tilde{i}}^{\star}(\cdot)$ is $\bar{p}$-Hölder continuous for all $1\leq\tilde{i}\leq d$ and some $\bar{p}\in(\rho(\boldsymbol{k}_{0}),\,2]$.

Assumption 8.

When ${\rm sha}(g_{0,j})\in\{3\}$, the function $g_{0,j}(t)-\tilde{C}t^{2}$ is convex for some constant $\tilde{C}>0$; when ${\rm sha}(g_{0,j})\in\{4\}$, the function $g_{0,j}(t)+\tilde{C}t^{2}$ is concave for some constant $\tilde{C}>0$.

Assumption 6 controls the fluctuation of the score function corresponding to the monotone components in the direction of the projection defined in Lemma 1; Huang (2002) and Cheng (2009) adopted a similar assumption. Assumptions 7–8, which are analogues of Assumption B1 of Kuchibhotla et al. (2023), are used for approximations of the convex (concave) components.

Theorem 3.

Suppose that models (1) and (2) and Assumptions 1–8 are satisfied. Further assume that $I(\beta_{0})$ is non-singular. Then as $n\rightarrow\infty$, $\sqrt{n}(\hat{\beta}-\beta_{0})\stackrel{d}{\longrightarrow}N(0,I^{-1}(\beta_{0}))$. This implies that the asymptotic variance achieves the information bound and $\hat{\beta}$ is asymptotically semiparametric efficient among all regular estimators of $\beta_{0}$.

By Theorem 3, $\hat{\beta}$ is asymptotically normal with asymptotic variance $I^{-1}(\beta_{0})$. To make inference about $\beta_{0}$ based on this theorem, we need a consistent estimator of $I^{-1}(\beta_{0})$. However, $I^{-1}(\beta_{0})$, or equivalently $I(\beta_{0})$, has a rather complicated form, which makes its plug-in estimator hard to use. To overcome this difficulty, we propose a novel data-splitting method to estimate $I^{-1}(\beta_{0})$.

4.1 Data-splitting variance estimation and inference on $\beta$

We introduce the proposed data-splitting variance estimation method in a general setting, as it is generally applicable. Let $\theta_{0}\in\mathbb{R}^{d}$ be a functional of a statistical population $\mathscr{P}$ and $\hat{\theta}_{n}$ be an estimator of $\theta_{0}$ based on i.i.d. samples $\{O_{i}\}_{i=1}^{n}$ from $\mathscr{P}$. Suppose that $\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\overset{d}{\longrightarrow}N(0,\Sigma)$, where $\Sigma$ is a positive semidefinite matrix. Let $k_{n}<n$ and $k_{n}\rightarrow\infty$.
We partition the sample into $\lfloor k_{n}\rfloor$ subsamples, each of which has $m_{n}=\lfloor n/k_{n}\rfloor$ observations, and let $\hat{\theta}_{ni}$ denote the estimator of $\theta_{0}$ based on the $i$-th subsample, $1\leq i\leq\lfloor k_{n}\rfloor$. Our data-splitting estimator of the asymptotic variance $\Sigma$ is defined as

$$\hat{\Sigma}=\frac{m_{n}}{\lfloor k_{n}\rfloor}\sum_{i=1}^{\lfloor k_{n}\rfloor}(\hat{\theta}_{ni}-\bar{\hat{\theta}}_{n})(\hat{\theta}_{ni}-\bar{\hat{\theta}}_{n})^{\top},\tag{8}$$

where $\bar{\hat{\theta}}_{n}$ is the sample mean of $\hat{\theta}_{n1},\ldots,\hat{\theta}_{n\lfloor k_{n}\rfloor}$. For better stability, we may repeat the above splitting and estimation procedure many times and take the average of the resulting variance estimates as the final variance estimate.

Theorem 4.

Let $\hat{\theta}_{n}$ be an estimator of $\theta_{0}$ based on i.i.d. samples $\{O_{i}\}_{i=1}^{n}$ such that $\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\overset{d}{\longrightarrow}N(0,\Sigma)$. Let $k_{n}=n^{\tilde{\alpha}}$ for some $\tilde{\alpha}\in(0,1)$, $m_{n}=\lfloor n^{1-\tilde{\alpha}}\rfloor$ and $\hat{\Sigma}$ be the variance estimator in (8). Then $\hat{\Sigma}=\Sigma+o_{p}(1)$ as $n\rightarrow\infty$.

Theorem 4 guarantees the validity of the data-splitting variance estimator. The method is easy to use and flexible enough for general-purpose use. Alternatively, we may construct a variance estimator by bootstrap. However, the consistency of a bootstrap variance estimator often requires stronger conditions (Groeneboom and Hendrickx, 2017) and is often very difficult to prove, especially under shape restrictions.
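The estimator in (8), together with the optional averaging over repeated random splits, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `data_splitting_variance` and `_one_split`, the `estimator` callable, the default $\tilde{\alpha}=0.5$ and the toy sample-mean estimator are all hypothetical choices made for the example.

```python
import random
from statistics import fmean


def _one_split(sample, estimator, k, m, rng):
    """Sigma_hat in (8) from one random partition into k blocks of size m."""
    shuffled = list(sample)
    rng.shuffle(shuffled)
    # theta_hat_{ni}: the estimator applied to the i-th disjoint subsample.
    estimates = [estimator(shuffled[i * m:(i + 1) * m]) for i in range(k)]
    d = len(estimates[0])
    mean = [fmean([est[a] for est in estimates]) for a in range(d)]
    # (m / k) * sum_i (theta_i - mean)(theta_i - mean)^T, entry by entry.
    return [
        [m / k * sum((e[a] - mean[a]) * (e[b] - mean[b]) for e in estimates)
         for b in range(d)]
        for a in range(d)
    ]


def data_splitting_variance(sample, estimator, alpha_tilde=0.5, n_repeats=1, seed=0):
    """Data-splitting estimate of the asymptotic covariance matrix Sigma.

    sample      : list of i.i.d. observations O_1, ..., O_n
    estimator   : hypothetical callable mapping a list of observations
                  to a length-d list (an estimate of theta_0)
    alpha_tilde : exponent in k_n = n ** alpha_tilde, 0 < alpha_tilde < 1
    n_repeats   : number of random splits whose estimates are averaged
    """
    n = len(sample)
    k = int(n ** alpha_tilde)  # floor(k_n): number of subsamples
    if k < 2:
        raise ValueError("need at least two subsamples")
    m = n // k                 # m_n = floor(n / k_n): subsample size
    rng = random.Random(seed)
    splits = [_one_split(sample, estimator, k, m, rng) for _ in range(n_repeats)]
    d = len(splits[0])
    return [[fmean([s[a][b] for s in splits]) for b in range(d)] for a in range(d)]


# Toy usage: the sample mean of N(0, 2^2) data, for which Sigma = 4.
toy_rng = random.Random(1)
toy_sample = [toy_rng.gauss(0.0, 2.0) for _ in range(10_000)]
print(data_splitting_variance(toy_sample, lambda obs: [fmean(obs)], n_repeats=5))
```

In the setting of this paper, `estimator` would be replaced by a routine that fits the SMPLE on a subsample and returns $\hat{\beta}$. The choice of $\tilde{\alpha}$ trades off the number of subsamples against their size, and the sketch does not attempt to tune it.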

As a specific application, we apply the data-splitting estimation method to construct an estimator for the information bound or the asymptotic variance $I^{-1}(\beta_{0})$ of $\hat{\beta}$. Denote the resulting estimator by $\widehat{I^{-1}(\beta_{0})}$, which is consistent for $I^{-1}(\beta_{0})$ by Theorem 4. Therefore, together with the asymptotic normality of $\hat{\beta}$, $n(\hat{\beta}-\beta_{0})^{\top}\{\widehat{I^{-1}(\beta_{0})}\}^{-1}(\hat{\beta}-\beta_{0})$ follows asymptotically $\chi^{2}(d)$, a chi-square distribution with $d$ degrees of freedom.
For $\alpha\in(0,1)$, let $\chi^{2}_{1-\alpha}(d)$ be the $(1-\alpha)$ quantile of $\chi^{2}(d)$. A $(1-\alpha)$-level confidence region for $\beta_{0}$ can be constructed as

$$\{\beta: n(\hat{\beta}-\beta)^{\top}\{\widehat{I^{-1}(\beta_{0})}\}^{-1}(\hat{\beta}-\beta)\leq\chi^{2}_{1-\alpha}(d)\}. \tag{9}$$

For testing the hypothesis $H_{0}:\beta=\beta_{0}\leftrightarrow H_{1}:\beta\neq\beta_{0}$, we propose to reject $H_{0}$ at the significance level $\alpha$ if

$$n(\hat{\beta}-\beta_{0})^{\top}\{\widehat{I^{-1}(\beta_{0})}\}^{-1}(\hat{\beta}-\beta_{0})>\chi^{2}_{1-\alpha}(d). \tag{10}$$

By Theorems 3 and 4, the confidence region (9) has an asymptotically correct coverage probability of $1-\alpha$, and the test defined by the rejection region (10) has an asymptotically correct type I error rate of $\alpha$.
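To make (9) and (10) concrete in the scalar case $d=1$, the following Python sketch may be helpful. It assumes that a variance estimate is already available from some procedure; the names `wald_interval`, `wald_reject` and `var_hat` (standing in for $\widehat{I^{-1}(\beta_{0})}$) are hypothetical and introduced only for illustration. For $d=1$, the quantile $\chi^{2}_{1-\alpha}(1)$ equals the squared standard normal quantile $z_{1-\alpha/2}^{2}$, so the region (9) reduces to an interval.

```python
import math
from statistics import NormalDist


def wald_interval(beta_hat, var_hat, n, alpha=0.05):
    """Scalar (d = 1) version of region (9).

    var_hat is an assumed estimate of I^{-1}(beta_0); how it is obtained
    (e.g. by data splitting) is outside the scope of this sketch.
    """
    # For d = 1, chi2_{1-alpha}(1) = z_{1-alpha/2}^2.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(var_hat / n)
    return beta_hat - half_width, beta_hat + half_width


def wald_reject(beta_hat, beta_null, var_hat, n, alpha=0.05):
    """Scalar (d = 1) version of the rejection rule (10)."""
    stat = n * (beta_hat - beta_null) ** 2 / var_hat
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    return stat > crit
```

By construction, `wald_reject` returns `True` exactly when `beta_null` falls outside the interval returned by `wald_interval`, which is the usual duality between (9) and (10).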

5 Simulations

In this section, we conduct simulations to assess the finite-sample performance of the proposed SMPLE $\hat{\beta}$ and the proposed confidence region (9) for the linear covariate effect $\beta$. To generate data, we take $X$ and $Z$ to be two scalar random variables, which are iid from the standard normal distribution, and take the conditional distribution of $T$ given $(X,Z)$ to be an exponential distribution with mean $1/\exp(\beta_{0}X+g_{0}(Z))$. Therefore the conditional hazard function of $T$ given $(X,Z)$ is $\exp(\beta_{0}X+g_{0}(Z))$. We set the censoring time $C$ to follow a uniform distribution on $(0,c)$. We consider three scenarios for $g_{0}$: (I) $g_{0}(z)=-2z$, (II) $g_{0}(z)=-|z|^{3}/2$ and (III) $g_{0}(z)=2|z|$. We set $\beta_{0}=-2$ and consider two choices for $c$: $5$ and $10$, and two sample sizes: $600$ and $800$. A larger $c$ results in a smaller censoring proportion.
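The data-generating mechanism described above can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the reported results: the function names, the seed handling and the tuple layout $(Y,\Delta,X,Z)$ with $Y=\min(T,C)$ and $\Delta=1\{T\leq C\}$ are assumptions made here for the usual right-censoring setup.

```python
import math
import random

# The three scenarios for g_0 listed in the text.
G0 = {
    "I": lambda z: -2.0 * z,
    "II": lambda z: -abs(z) ** 3 / 2.0,
    "III": lambda z: 2.0 * abs(z),
}


def simulate(n, c, scenario, beta0=-2.0, seed=0):
    """Draw n right-censored observations (y, delta, x, z)."""
    rng = random.Random(seed)
    g0 = G0[scenario]
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)             # X ~ N(0, 1)
        z = rng.gauss(0.0, 1.0)             # Z ~ N(0, 1)
        rate = math.exp(beta0 * x + g0(z))  # conditional hazard of T
        t = rng.expovariate(rate)           # T | (X, Z) ~ Exp with mean 1/rate
        cens = rng.uniform(0.0, c)          # C ~ Uniform(0, c)
        y = min(t, cens)                    # observed time
        delta = 1 if t <= cens else 0       # 1 = event observed, 0 = censored
        data.append((y, delta, x, z))
    return data


def censoring_proportion(data):
    """Fraction of observations that are censored."""
    return sum(1 - delta for _, delta, _, _ in data) / len(data)
```

For example, comparing `censoring_proportion(simulate(600, 5, "I"))` with `censoring_proportion(simulate(600, 10, "I"))` illustrates that a larger $c$ leads to less censoring.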

When implementing our SMPLE, we set $\boldsymbol{k}_{0}$ to be $3$, $4$ and $3$ in Scenarios I–III, respectively. For comparison, we also consider the traditional Cox regression estimator (TCR) of $\beta$ and the partial likelihood estimator of $\beta$ with order-$r$ polynomial splines under the partially linear additive model (SPLA-$r$; Huang, 1999), where $r$ may be $2$, $3$ or $4$. We generate 1000 random samples to evaluate the performance of the above five estimation methods.

5.1 Point estimation

Table 1 presents the simulated root mean square errors (RMSEs) and absolute biases (BIASs) of these estimators, all multiplied by $100$. The model assumptions of SMPLE and SPLA-$r$ are correct in all three scenarios, whereas the standard Cox model is correctly specified only in Scenario I. As expected, in Scenario I, TCR has uniformly the best performance among the five estimators under comparison in terms of RMSE and BIAS. Nevertheless, the SMPLE and SPLA-$r$ estimators have RMSEs and BIASs almost identical to those of TCR. When the standard Cox model is misspecified, as in Scenarios II and III, TCR has much larger RMSEs and BIASs than SMPLE and SPLA-$r$; in other words, SMPLE and SPLA-$r$ clearly outperform TCR. Compared with SPLA-$r$, SMPLE is comparable or slightly inferior in Scenarios I and II, but has uniformly smaller RMSEs and BIASs in Scenario III. A possible explanation for this phenomenon is that, although continuous in all three scenarios, the hazard function is smooth in Scenarios I and II but nonsmooth in Scenario III. As the sample size $n$ or the constant $c$ increases, we have more completely observed data, and consequently all estimators have improved performance when the underlying model assumption is correct. An exception is TCR in Scenarios II and III, where the model is misspecified and TCR has larger RMSEs and BIASs as $n$ or $c$ increases.
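As a small illustration of how such summaries can be computed from replicated estimates, consider the Python sketch below. The precise formula for BIAS is not spelled out in this section; the sketch takes it to be the average of $|\hat{\beta}-\beta_{0}|$ over replications, which is an assumption, and the function name `rmse_and_bias` is hypothetical.

```python
import math


def rmse_and_bias(estimates, beta0, scale=100.0):
    """Return (scale * RMSE, scale * absolute bias) over replications.

    Assumption: absolute bias is the mean of |beta_hat - beta0|;
    the source does not state the exact definition.
    """
    errors = [b - beta0 for b in estimates]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    abs_bias = sum(abs(e) for e in errors) / len(errors)
    return scale * rmse, scale * abs_bias
```

In a simulation study, `estimates` would hold the 1000 values of $\hat{\beta}$ obtained from the 1000 generated samples, and `scale=100.0` matches the scaling used in Table 1.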

Figure 1 displays the boxplots of the SMPLE and SPLA-$r$ ($r=2,3,4$) estimators (minus $\beta_{0}$) of $\beta$ when the sample size is $n=800$. TCR is excluded here as it has extremely large RMSEs and BIASs in Scenarios II and III. SMPLE and the three SPLA-$r$ estimators exhibit almost the same performance in Scenario I. In Scenario II, where the true hazard function is smooth, the four methods have similar variances, but from SMPLE to SPLA-$2$, SPLA-$3$ and SPLA-$4$, the BIASs become progressively smaller. In Scenario III, where the true hazard function is nonsmooth, the four methods again have similar variances; however, the three SPLA-$r$ estimators have much larger BIASs than SMPLE, whose BIASs are negligible.

Table 1: Simulated root mean square errors and absolute biases (in parentheses) of the five point estimators under comparison. All results have been multiplied by 100.
Scenario  c   n    SMPLE         TCR            SPLA-2         SPLA-3         SPLA-4
I         5   600  9.25 (7.29)   9.13 (7.17)    9.17 (7.21)    9.22 (7.27)    9.29 (7.32)
I         5   800  8.04 (6.36)   7.95 (6.30)    7.99 (6.33)    8.01 (6.35)    8.03 (6.36)
I         10  600  8.96 (7.19)   8.86 (7.07)    8.90 (7.11)    8.96 (7.17)    9.01 (7.21)
I         10  800  7.65 (6.07)   7.57 (6.03)    7.61 (6.05)    7.62 (6.05)    7.64 (6.06)
II        5   600  10.29 (8.19)  68.89 (67.57)  9.46 (7.65)    9.46 (7.66)    9.70 (7.78)
II        5   800  8.49 (6.68)   70.06 (69.02)  8.17 (6.53)    8.10 (6.44)    8.09 (6.36)
II        10  600  9.89 (7.86)   78.58 (77.53)  9.21 (7.42)    9.18 (7.40)    9.30 (7.44)
II        10  800  8.20 (6.46)   79.62 (78.79)  8.04 (6.48)    7.93 (6.38)    7.78 (6.15)
III       5   600  8.54 (6.87)   63.46 (63.15)  20.54 (18.83)  19.39 (17.63)  9.74 (7.95)
III       5   800  7.28 (5.74)   63.61 (63.35)  20.92 (19.60)  19.98 (18.62)  9.28 (7.74)
III       10  600  8.43 (6.78)   63.70 (63.40)  21.05 (19.40)  19.91 (18.21)  9.86 (8.04)
III       10  800  7.17 (5.66)   63.71 (63.47)  21.25 (20.01)  20.31 (19.04)  9.34 (7.78)
Figure 1: Boxplots of the SMPLE, SPLA-$2$, SPLA-$3$ and SPLA-$4$ estimates (minus $\beta_{0}$) of $\beta$ when the sample size is $n=800$.
Table 2: Simulated coverage probabilities and average interval lengths (in parentheses) of the confidence intervals at the 95% confidence level based on the five estimators under comparison
Scenario  c   n    SMPLE          TCR            SPLA-2         SPLA-3         SPLA-4
I         5   600  0.947 (0.412)  0.930 (0.384)  0.940 (0.396)  0.944 (0.409)  0.948 (0.422)
I         5   800  0.941 (0.343)  0.937 (0.324)  0.937 (0.332)  0.941 (0.340)  0.949 (0.348)
I         10  600  0.940 (0.388)  0.930 (0.364)  0.937 (0.374)  0.940 (0.385)  0.942 (0.395)
800 0.945(0.327)0.3270.945\underset{(0.327)}{0.945}start_UNDERACCENT ( 0.327 ) end_UNDERACCENT start_ARG 0.945 end_ARG 0.935(0.310)0.3100.935\underset{(0.310)}{0.935}start_UNDERACCENT ( 0.310 ) end_UNDERACCENT start_ARG 0.935 end_ARG 0.941(0.317)0.3170.941\underset{(0.317)}{0.941}start_UNDERACCENT ( 0.317