On the convergence rate of d-dimensional fourth-order beta polynomial kernels

This article focuses on formulating the Asymptotic Mean Integrated Squared Error (AMISE) scheme for d-dimensional fourth-order beta polynomial kernels in the context of kernel density estimation.


INTRODUCTION
Density estimation, a fundamental aspect of statistical analysis, involves deriving insights into probability density functions based on observed data [1]. Two primary approaches in this field are parametric and nonparametric methods. Nonparametric methods, particularly kernel density estimation (KDE), have gained prominence for their versatility in avoiding assumptions about specific density forms and their applicability across various scenarios, including multivariate analysis [2].
Significant research spanning from the 1950s onwards has focused on optimizing bandwidth selection for univariate KDE [3][4][5]. The test graph method introduced by Silverman [6], characterized by swift convergence and consistent performance, has been influential. Scott and Factor [7] explored the bandwidth's impact on bias, while Rudemo [8] focused on minimizing the mean integrated squared error (MISE) to identify optimal bandwidth choices.
Multivariate Kernel Density Estimation (MKDE) has witnessed noteworthy contributions, encompassing diverse parameterizations and optimal bandwidths for bivariate density estimation, as outlined by Wand and Jones [9]. Adaptive techniques, tailoring smoothing based on local data features, have emerged through the work of Bowman and Foster [10]. Wand and Jones [2] extended KDE to higher dimensions, deriving asymptotic and exact mean integrated squared error results. Duong and Hazelton [11] explored plug-in bandwidth matrices for bivariate estimation.
Recent research by Siloko et al. [20] has focused on reducing the AMISE in KDE. Their exploration involved techniques related to kernel density derivatives and kernel boosting, revealing that an increase in kernel derivatives and the number of boosting steps can effectively decrease AMISE. KDE plays a pivotal role in exploratory data analysis and statistical visualization due to its simplicity, interpretability, and wide applicability, making it a valuable tool for data analysis. Ongoing endeavors aim to develop generalized convergence strategies that reduce AMISE while preserving the statistical characteristics of real-world observations.
Within the context of KDE, for a given random sample X_1, X_2, ..., X_n drawn from a common density f, the d-dimensional kernel density estimator is defined as follows:

f̂_H(x) = (1/n) Σ_{i=1}^{n} K_H(x − X_i).  (1)

Here, K_H(t) = |H|^{−1/2} K(H^{−1/2} t), where K is a d-dimensional kernel and H is a symmetric, positive definite d × d bandwidth matrix. In this article, we adopt the parameterization H = h²I_d proposed by Cacoullos [21] for selecting the bandwidths. This choice allows for closed-form expressions for both the optimal bandwidth and AMISE [17-19, 22, 23]. The d-dimensional kernel density estimator, using the chosen parameterization as discussed by Wand and Jones [2], can be expressed as follows:

f̂(x) = (1/(n h^d)) Σ_{i=1}^{n} K((x − X_i)/h).  (2)

The central objective of this study is to explore and elucidate the convergence rates within the domain of kernel density estimation. Convergence rates play a pivotal role in delineating how rapidly the estimated density, denoted as f̂(x), approaches the true density, represented by f(x). This exploration is particularly significant as it extends observations from previous research conducted in univariate settings by Silverman [1] and Afere [12] to the more complex and intricate landscape of higher-order d-dimensional kernels.
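As a concrete illustration of Eq. (2), the sketch below implements the d-dimensional estimator under the parameterization H = h²I_d. A product Gaussian kernel stands in for the article's beta polynomial kernels (which are not reproduced here), and the function name is ours, so treat this as an assumption-laden sketch rather than the authors' implementation:

```python
import numpy as np

def kde(x, data, h):
    """Sketch of Eq. (2): f_hat(x) = (1/(n h^d)) * sum_i K((x - X_i)/h),
    i.e. the d-dimensional estimator with bandwidth matrix H = h^2 * I_d.
    A product Gaussian kernel is used as a stand-in for the beta
    polynomial kernels discussed in the article."""
    data = np.atleast_2d(data)            # sample of shape (n, d)
    n, d = data.shape
    u = (x - data) / h                    # scaled differences (x - X_i)/h
    # product Gaussian kernel K(u) = (2*pi)^(-d/2) * exp(-||u||^2 / 2)
    k = (2 * np.pi) ** (-d / 2) * np.exp(-0.5 * np.sum(u * u, axis=1))
    return k.sum() / (n * h ** d)
```

For a large standard normal sample in d = 1, evaluating this estimator at the origin should land near the true density value 1/√(2π) ≈ 0.3989, up to the smoothing bias and sampling noise discussed in the sections that follow.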
The phenomenon of convergence rates serves as a crucial indicator, offering insights into the efficiency and speed of the estimation process. Faster convergence rates imply a swifter approximation of the estimated density to the underlying true density. Such insights are pivotal in understanding and improving the accuracy of kernel density estimation, especially when dealing with multivariate scenarios.
The subsequent sections of this article delve into a detailed examination of the convergence properties and behavior specifically related to fourth-order multivariate kernels. Building upon the well-established convergence schemes for second-order kernels documented in the existing literature [1,2], our focus pivots towards the nuances and intricacies associated with fourth-order kernels. This shift in focus is motivated by the desire to enhance our understanding of convergence dynamics in higher-dimensional spaces.
The structure of this work unfolds as follows: Section 2 lays the foundation for this article, discussing the framework and convergence schemes for fourth-order multivariate kernels. Furthermore, Section 3 is dedicated to the derivation of a generalized fourth-order convergence scheme for AMISE. This involves a rigorous mathematical exploration, providing explicit formulas that capture the essence of the convergence behavior in this specific context.
In Section 4, we delve into the detailed quantitative validation of the outcomes derived in our study. This section is instrumental in providing a rigorous assessment of the reliability, accuracy, and applicability of the proposed convergence schemes for fourth-order multivariate kernels. The validation process involves a meticulous comparison with the outcomes reported by Deheuvels (see Scott [24] pg 189) and Jones et al. (see Chacón and Duong [25] pg 70), encompassing various sample sizes and dimensionalities.
Finally, in Section 5, we embark on a meticulous examination of the results obtained from our study. This examination critically assesses the implications and importance of the observed convergence rates in kernel density estimation. Conclusions drawn from this analysis contribute to a broader understanding of the efficiency of the proposed convergence rate.

CONVERGENCE SCHEMES FOR THE FOURTH-ORDER KERNELS
In the landscape of this research domain, it is a prevalent practice to gauge the effectiveness of the density estimate f̂ by employing the Mean Integrated Squared Error (MISE). However, the derivation of MISE in closed form is often unattainable, necessitating the adoption of an asymptotic approximation as a pragmatic solution. This approximation serves as an estimate for the MISE, providing a means to evaluate the performance of the density estimate under consideration. The asymptotic approximation takes the following form:

AMISE(f̂_H) = AISB(f̂_H) + AIV(f̂_H),  (3)

where

AISB(f̂_H) = ∫ [Bias f̂_H(x)]² dx  (4)

and

AIV(f̂_H) = ∫ Var f̂_H(x) dx.  (5)

The basic assumptions for any multivariate second-order symmetric kernel are:

∫ K(t) dt = 1,  ∫ t K(t) dt = 0,  ∫ t tᵀ K(t) dt = μ₂(K) I_d,  μ₂(K) ≠ 0.  (6)

However, in 1962, Parzen [26] introduced the concept of using kernels that can take both negative and positive values. Consequently, for the purpose of achieving a fourth-order estimation, we relax the assumptions outlined in Eq. (6) and obtain the following expression:

∫ K(t) dt = 1,  ∫ t K(t) dt = 0,  ∫ t tᵀ K(t) dt = 0,  ∫ (tᵀt)² K(t) dt = μ₄(K) ≠ 0.  (7)

Now, if the Taylor series expansion of f(x − H^{1/2}t) up to the fifth term is used, and on application of the assumptions in Eq. (7) with the parametrisation H = h²I_d, the AMISE in Eq. (3) results in:

AMISE(h) = R(K)/(n h^d) + (h⁸/576) μ₄(K)² ∫ (∇⁴f(x))² dx,  where R(K) = ∫ K(t)² dt.  (8)

On differentiating Eq. (8) with respect to h and minimizing accordingly, we obtain h as:

h_AMISE = [72 d R(K) / (μ₄(K)² ∫ (∇⁴f(x))² dx)]^{1/(d+8)} n^{−1/(d+8)}.  (9)

Substituting Eq. (9) into Eq. (8), we have:

AMISE(h_AMISE) = C₄(K, f, d) n^{−8/(d+8)},  (10)

where C₄(K, f, d) is a constant depending on R(K), μ₄(K), ∫ (∇⁴f(x))² dx and d, but not on n. The main significance of this is that the convergence rate n^{−4/(d+4)} proposed by Deheuvels in 1977 (see Scott [24] pg 189) decreases to the rate n^{−8/(d+8)} proposed by Jones et al. when kernels meeting the conditions in Eq. (7) are applied (see Chacón and Duong [25] pg 70). Section 3 below presents our theorem, which is an improvement on the result in Eq. (10) when fourth-order kernels are used.
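The two rates just contrasted are easy to tabulate. The sketch below (function names are ours) evaluates the Deheuvels rate n^{−4/(d+4)} and the Jones et al. rate n^{−8/(d+8)} to show the faster decay obtained under the relaxed conditions of Eq. (7):

```python
def rate_second_order(n, d):
    """Deheuvels' AMISE convergence rate n^(-4/(d+4)) for second-order kernels."""
    return n ** (-4 / (d + 4))

def rate_fourth_order(n, d):
    """Jones et al.'s AMISE convergence rate n^(-8/(d+8)) for fourth-order
    kernels satisfying the relaxed moment conditions in Eq. (7)."""
    return n ** (-8 / (d + 8))

# the fourth-order rate decays faster in every dimension
for d in (1, 2, 3):
    print(d, rate_second_order(10_000, d), rate_fourth_order(10_000, d))
```

These closed-form rates are the quantities tabulated and compared in Section 4.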

THE PROPOSED GENERALIZED CONVERGENCE SCHEME
In this section, we introduce a comprehensive framework that builds upon the advancements made in Eqs. (9) and (10). We present Theorem 1, which encapsulates the enhanced outcomes achieved through this scheme.

Theorem 1. If K(t) is any differentiable d-dimensional fourth-order kernel that is parameterized by H = h²I_d, satisfying the conditions in Eq. (7), and whose (2m+2)-th moment ∫ (tᵀt)^{m+1} K(t) dt = μ_{2m+2}(K) is finite and nonzero, then the generalized asymptotic mean integrated squared error is given as:

AMISE(f̂_H) = (h^{4m+4} μ_{2m+2}(K)² / ((2m+2)!)²) ∫ (∇^{2m+2} f(x))² dx + R(K)/(n h^d).

Proof. The bias term is given by:

Bias f̂_H(x) = E[f̂_H(x)] − f(x).  (11)

If we substitute Eq. (2) into Eq. (11) and simplify, we have:

Bias f̂_H(x) = ∫ K(t) f(x − H^{1/2}t) dt − f(x).  (12)

Now, on using the multivariate Taylor series expansion up to (m+1)-st order on Eq. (12), we have:

Bias f̂_H(x) = ∫ K(t) [f(x) − tᵀH^{1/2} Df(x) + (1/2) tᵀH^{1/2} H_f(x) H^{1/2}t − · · ·] dt − f(x),  (13)

where Df(x) is the vector of first-order partial derivatives of f and H_f(x) is the Hessian of f, the d × d matrix having (i, j) entry equal to ∂²f(x)/∂x_i∂x_j = 0 for i ≠ j. On expanding the last equation further and imposing the moment conditions of Eq. (7) and the (2m+2)-th moment of the statement of the theorem on Eq. (13), Eq. (13) reduces to:

Bias f̂_H(x) ≈ (μ_{2m+2}(K)/(2m+2)!) h^{2m+2} tr^{(m+1)}(H_f(x)),  (14)

where tr^{(m+1)} denotes the (m+1)-fold iterate of the trace-of-Hessian operator. The trace of the Hessian matrix H_f(x) in Eq. (14) is denoted by ∇²f(x). With this definition, the bias can be expressed as:

Bias f̂_H(x) ≈ (μ_{2m+2}(K)/(2m+2)!) h^{2m+2} ∇^{2m+2} f(x).  (15)

Squaring both sides of Eq. (15), using the bandwidth matrix H = h²I_d and then substituting into Eq. (4), the AISB is obtained as:

AISB(f̂_H) = (h^{4m+4}/((2m+2)!)²) (∫ (tᵀt)^{m+1} K(t) dt)² ∫ (∇^{2m+2} f(x))² dx.  (16)

Also, writing the (2m+2)-th moment in the statement of the Theorem as μ_{2m+2}(K), Eq. (16) becomes:

AISB(f̂_H) = (h^{4m+4} μ_{2m+2}(K)²/((2m+2)!)²) ∫ (∇^{2m+2} f(x))² dx.  (17)

The variance term is given by:

Var f̂_H(x) = (1/n) [E K_H(x − X)² − (E K_H(x − X))²].  (18)

On expanding Eq. (18) and using the necessary assumptions as in the case of bias, we have:

Var f̂_H(x) = (1/(n|H|^{1/2})) ∫ K(t)² f(x − H^{1/2}t) dt − (1/n) [∫ K(t) f(x − H^{1/2}t) dt]².  (19)

Hence, following a similar algebraic variable substitution and Taylor series expansion argument as in the case of bias and using all the necessary assumptions, Eq. (19) becomes:

Var f̂_H(x) ≈ f(x) R(K)/(n|H|^{1/2}),  (20)

where R(K) = ∫ K(t)² dt. On substituting Eq. (20) into Eq. (5) and using ∫ f(x) dx = 1, the resulting equation becomes:

AIV(f̂_H) = R(K)/(n|H|^{1/2}).  (21)

Also, on using the parametrisation H = h²I_d, Eq. (21) becomes:

AIV(f̂_H) = R(K)/(n h^d).  (22)

Plugging back Eqs. (17) and (22) into Eq. (3), we have:

AMISE(f̂_H) = (h^{4m+4} μ_{2m+2}(K)²/((2m+2)!)²) ∫ (∇^{2m+2} f(x))² dx + R(K)/(n h^d).  (23)

On differentiating Eq. (23) with respect to h, we have:

∂AMISE(f̂_H)/∂h = (4m+4) h^{4m+3} (μ_{2m+2}(K)²/((2m+2)!)²) ∫ (∇^{2m+2} f(x))² dx − d R(K)/(n h^{d+1}).

But at the minimum point, ∂AMISE(f̂_H)/∂h = 0. Therefore,

h^{d+4m+4} = d ((2m+2)!)² R(K) / ((4m+4) μ_{2m+2}(K)² n ∫ (∇^{2m+2} f(x))² dx).
Hence, the approximate fourth-order optimal bandwidth, in the sense of minimizing Eq. (23) with respect to h, is of the form:

h_AMISE = [d ((2m+2)!)² R(K) / ((4m+4) μ_{2m+2}(K)² ∫ (∇^{2m+2} f(x))² dx)]^{1/(d+4m+4)} n^{−1/(d+4m+4)}.  (24)

On substituting Eq. (24) into Eq. (23), we have:

AMISE(f̂_H) = ((d+4m+4)/d) [μ_{2m+2}(K)² ∫ (∇^{2m+2} f(x))² dx / ((2m+2)!)²]^{d/(d+4m+4)} [d R(K)/(4m+4)]^{(4m+4)/(d+4m+4)} n^{−(4m+4)/(d+4m+4)}.  (25)

Thus, the expression for the generalized AMISE, independent of the optimal bandwidth h, can be formulated as follows. Adopting the nomenclature AMISE_{2m} for AMISE(f̂_H), Eq. (25) becomes:

AMISE_{2m} = C_{2m}(K, f, d) n^{−(4m+4)/(d+4m+4)},  (26)

where C_{2m}(K, f, d) collects the constants of Eq. (25) and does not depend on n. By comparing Eqs. (10) and (26), we observe a significant improvement in the convergence rate of the global error, also known as AMISE. The rate has transitioned from n^{−8/(d+8)} to n^{−(4m+4)/(d+4m+4)}, showcasing a more favorable convergence behavior for m > 1. This improvement has been made possible by the regularity conditions outlined in the statement of the theorem (that is, the moment conditions of Eq. (7) together with the finite, nonzero (2m+2)-th moment). The regularity conditions not only contribute to the reduction in global error but also play a crucial role in establishing the generalized convergence schemes for any d-dimensional fourth-order polynomial symmetric kernel. Thus, these conditions provide valuable insights and enable us to achieve improved convergence performance in the estimation process.
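As a quick numerical check of the rate claim, the hypothetical helper below evaluates the generalized rate n^{−(4m+4)/(d+4m+4)} of Eq. (26); m = 1 recovers the Jones et al. rate n^{−8/(d+8)}, and larger m pushes the rate toward the parametric-style n^{−1}:

```python
def generalized_rate(n, d, m):
    """Generalized AMISE rate n^(-(4m+4)/(d+4m+4)) from Theorem 1 (Eq. (26)),
    up to the constant C_2m(K, f, d), which is independent of n."""
    return n ** (-(4 * m + 4) / (d + 4 * m + 4))

# m = 1 reproduces the fourth-order rate n^(-8/(d+8));
# the rate improves (decays faster) as m increases
print([generalized_rate(1_000, 2, m) for m in (1, 2, 3)])
```
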

QUANTITATIVE VALIDATION OF RESULTS
In this section, our primary aim is to validate and reinforce the findings delineated in Section 3 through a comprehensive comparison with the outcomes reported by Deheuvels (see Scott [24] pg 189) and Jones et al. (see Chacón and Duong [25] pg 70). The objective of this quantitative validation is twofold. Firstly, it aims to affirm the robustness and precision of the obtained results by scrutinizing their consistency with established findings in the literature. Secondly, it seeks to elucidate the generalizability of our proposed convergence scheme across diverse scenarios of sample sizes and dimensionality. This thorough examination is crucial for bolstering the credibility and broader applicability of our research findings.
The detailed tabulated results presented in Tables 1 through 3 serve as the foundation for this quantitative validation. These tables provide a comprehensive comparison between the proposed convergence rate (PG) and those presented by Jones et al. (JMH) [25] and Deheuvels (PD) [24]. The examination spans various sample sizes and dimensionalities, allowing for a nuanced understanding of how well our proposed schemes perform across a spectrum of scenarios.
By subjecting our results to this quantitative scrutiny, we aim to fortify the scientific rigor of our study and provide a clear assessment of the effectiveness and reliability of the proposed convergence scheme for fourth-order multivariate kernels. This process contributes to the comprehensive evaluation of our research findings and enhances their value in the broader context of kernel density estimation.

n        PG             JMH            PD
10       0.1003750000   0.1291550000   0.1584890000
25       0.0402118000   0.0571988000   0.0761462000
300      0.0033656200   0.0062822400   0.0104304000
1,000    0.0010119300   0.0021544300   0.0039810700
100,000  0.0000102135   0.0000359381   0.0001000000

In Table 1, a comparative analysis of the proposed convergence rate (PG) against Jones et al. (JMH) [25] and Deheuvels (PD) [24] is presented. The comparison is conducted across various sample sizes (n) while maintaining a fixed dimensionality (d = 1). As the sample size increases, all approaches exhibit a decline in their rate values. Notably, PG consistently demonstrates the fastest convergence, followed by JMH, while PD generally exhibits higher values than JMH. This observation underscores the superior performance of PG in achieving faster convergence under the specified conditions.
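The JMH and PD columns of Table 1 can be reproduced directly from the closed-form rates n^{−8/(d+8)} and n^{−4/(d+4)}, as the check below shows; the PG column depends on the constants of Eq. (26) and is therefore not recomputed here:

```python
# Reported values for d = 1 from Table 1: sample size -> (JMH, PD)
table1 = {
    10:      (0.1291550000, 0.1584890000),
    25:      (0.0571988000, 0.0761462000),
    300:     (0.0062822400, 0.0104304000),
    1_000:   (0.0021544300, 0.0039810700),
    100_000: (0.0000359381, 0.0001000000),
}
d = 1
for n, (jmh, pd_val) in table1.items():
    assert abs(n ** (-8 / (d + 8)) - jmh) < 1e-6      # Jones et al.: n^(-8/(d+8))
    assert abs(n ** (-4 / (d + 4)) - pd_val) < 1e-6   # Deheuvels:    n^(-4/(d+4))
print("Table 1 JMH and PD columns match the closed-form rates")
```
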

RESULTS AND CONCLUSION
Upon comparing Eqs. (10) and (26), a clear improvement in the convergence rate is observed for the global error of any fourth-order polynomial symmetric kernel. The transition from the rate n^{−8/(d+8)} in Eq. (10) to n^{−(4m+4)/(d+4m+4)} in Eq. (26) signifies a notable enhancement in convergence behavior across varying dimensions and orders. Moreover, obtaining a closed-form solution for the bandwidth and minimizing the generalized AMISE expression for fourth-order kernels yields noteworthy insights. Initially, the introduction of negative kernels proved to be a pivotal factor in improving MISE. This deliberate integration of negative kernels represents a substantial advancement in the overall efficiency of the estimation process, contributing significantly to the precision and accuracy of our methodology. Additionally, the AMISE of fourth-order kernels exhibits a swifter convergence rate compared to second-order kernels. However, it is crucial to recognize that, for d > 1, this accelerated rate remains slower than in the univariate case (d = 1) owing to the impact of the curse of dimensionality on the convergence rate, Scott [24].
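The curse-of-dimensionality effect noted above can be made concrete: the magnitude of the exponent in n^{−(4m+4)/(d+4m+4)} shrinks as d grows, so convergence slows in higher dimensions. A small sketch (the helper name is ours):

```python
def rate_exponent(d, m=1):
    """Exponent of n in the generalized AMISE rate n^(-(4m+4)/(d+4m+4)).
    A more negative exponent means faster convergence."""
    return -(4 * m + 4) / (d + 4 * m + 4)

# the exponent rises toward 0 as d grows: convergence slows with dimension
print({d: round(rate_exponent(d), 4) for d in (1, 2, 3, 5, 10)})
```
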
Quantitatively validating these results, Table 1 compares the proposed convergence rate (PG) with Jones et al. (JMH) and Deheuvels (PD) across various sample sizes (n) when the dimensionality (d) is fixed at 1. Larger sample sizes contribute to decreased rate values for all methods, with PG consistently exhibiting the lowest values, indicating the fastest convergence rate. The effectiveness of PG in achieving faster convergence remains consistent across all sample sizes, underscoring the importance of ample data for accurate estimates in scenarios with d = 1.
Table 2 extends this comparison to a dimensionality of 2. Increasing sample sizes leads to decreased rate values for all methods, and PG consistently demonstrates lower values compared to JMH and PD, with differences becoming more pronounced as sample size increases. The reduction in rate values with larger sample sizes emphasizes the importance of sufficient data for accurate estimation, particularly in a higher-dimensional space (d = 2). The observed differences between PG and other methods are more substantial, indicating a potentially more significant practical impact.
Table 3 further extends the analysis to a dimensionality of 3. Larger sample sizes again result in decreased rate values for all methods, and PG consistently exhibits lower values compared to JMH and PD, with differences becoming more pronounced with increasing sample size. The reduction in rate values with larger sample sizes underscores the importance of adequate data for accurate estimation, particularly in a higher-dimensional space (d = 3). The substantial differences between PG and other methods also highlight the potential practical impact of PG in this higher-dimensional scenario.
In summary, the tables collectively suggest that the proposed method (PG) consistently outperforms the existing methods (JMH, PD) across varying dimensionalities. PG's superiority in achieving a faster convergence rate is particularly evident in scenarios with larger sample sizes, emphasizing its potential advantages in practical applications.
In conclusion, our extensive investigation, encompassing both analytical and quantitative analyses, has revealed the superior performance of the new convergence rate compared to its predecessors. The new convergence scheme for fourth-order kernels not only reduces bias but also improves convergence rates, especially in higher dimensions. Consequently, this generalized convergence scheme for fourth-order kernels helps address challenges related to increasing dimensionality during the estimation procedure.

Table 1. Comparing the proposed convergence rate (PG) with the convergence rates presented by Jones et al. (JMH) [25] and Deheuvels (PD) [24] across various sample sizes (n) when d = 1.

Table 2. Comparing the proposed convergence rate (PG) with the convergence rates presented by Jones et al. (JMH) [25] and Deheuvels (PD) [24] across various sample sizes (n) when d = 2.

Table 3. Comparing the proposed convergence rate (PG) with the convergence rates presented by Jones et al. (JMH) [25] and Deheuvels (PD) [24] across various sample sizes (n) when d = 3.