Tropical cyclones (TCs), as intense weather systems, exert destructive impacts that depend not only on their intensity but also closely on their size. However, the relationship between TC translation speed and TC size remains insufficiently understood. Based on the Extended Best Track (EBT) dataset and ERA5 reanalysis data for the North Atlantic from 1988 to 2021, this study investigates the statistical relationship between TC translation speed and two size metrics, the radius of maximum wind (RMW) and the radius of 17 m·s⁻¹ wind (R17), together with the possible physical mechanisms. Results show that R17 increases significantly with translation speed, with its higher percentiles being more sensitive to changes in moving speed. RMW also increases slightly with translation speed, and its change can largely be explained by that of R17. Physically, faster-moving TCs significantly weaken sea surface temperature (SST) cooling, thereby maintaining or enhancing surface enthalpy fluxes and increasing atmospheric instability. Concurrently, faster-moving TCs substantially intensify low-level convergence in the forward quadrant. These thermal and dynamical processes jointly intensify the upward motion, facilitating the development of spiral rainbands, whose resultant diabatic heating further enhances the inward transport of angular momentum. This process expands the outer wind field, increasing R17 and consequently enlarging RMW.
This study investigates the differences in the initial size distribution of tropical cyclones (TCs) generated over the Western North Pacific (WNP) and the South China Sea (SCS) from July to October during 1981-2017. Results show that although the median initial TC sizes in the two basins are similar, the WNP exhibits greater variance with a right-skewed distribution, while SCS TC sizes follow a distribution closer to normal. Further analysis indicates that during the non-convective phase of the Boreal Summer Intraseasonal Oscillation (BSISO), the median initial TC sizes in the two regions do not differ significantly. During the convective phase, however, the WNP experiences a pronounced increase in TC size, whereas the SCS shows only limited growth. In terms of environmental conditions, although the SCS has stronger low-level relative vorticity and humidity, its higher vertical wind shear leads to a marked wavenumber-1 asymmetry in TC convective structure. During the BSISO convective phase, environmental factors such as low-level vorticity and humidity improve significantly over the WNP, resulting in systematically enhanced TC convection and more efficient inward transport of angular momentum, favoring the formation of larger TCs. In contrast, the SCS sees no substantial weakening of vertical wind shear, and its asymmetric structure becomes more pronounced; the overall TC circulation and central convection do not strengthen significantly, leading to a weaker size response. These findings highlight that the modulation of environmental conditions and TC structure by the BSISO convective phase contributes importantly to the differences in initial TC size distribution between the WNP and the SCS.
Different El Niño-Southern Oscillation (ENSO) events exhibit pronounced diversity during their decay phases, and the rate of ENSO decay can substantially influence the locations of tropical cyclone (TC) rapid intensification (RI) over the Western North Pacific (WNP) in boreal summer. To quantitatively characterize ENSO evolution during the decay phase, this study introduces a new metric, the ENSO Changing Rate (ECR), and reveals the modulatory role of the Interdecadal Pacific Oscillation (IPO) in the relationship between the ECR and RI locations. The period 1951-2024 is divided into three subperiods (P1: 1951-1978, P2: 1979-1998, and P3: 1999-2024). The results show pronounced interdecadal variations in the correlation between the ECR and the longitude of RI occurrence, synchronized with IPO phase transitions: a significant positive correlation is found during the negative IPO phases (P1 and P3), whereas a negative correlation emerges during the positive IPO phase (P2). By modulating the ENSO decay rate, the IPO alters the large-scale atmospheric and oceanic responses over the WNP during ENSO decay summers. During the negative IPO phases, ENSO events tend to decay rapidly, producing atmospheric and oceanic conditions more favorable for RI over the western WNP, such that the ECR effectively regulates the longitude of RI occurrence. In contrast, during the positive IPO phase, ENSO decay is slower and the large-scale environment over the WNP becomes less favorable for RI, with RI locations more strongly influenced by TC tracks and environmental conditions along those tracks. These results provide physical insight into the interdecadal modulation of ENSO impacts on RI and offer a scientific basis for developing RI seasonal prediction models with interdecadal adaptability.
By analyzing representative observed cases, the differences in the development and roles of initial mid-level (MV) and low-level (LV) vortices during tropical cyclone (TC) genesis were examined. During the early stage of the MV-type TC Koinu, a persistent positive temperature anomaly existed that was not observed in the LV-type TC Mawar. In addition, the warm core appeared earlier in Koinu than in Mawar, and the initial warm core was at a lower altitude in Mawar than in Koinu. When the warm cores strengthened to the same extent, the low-level winds in Mawar intensified more rapidly than in Koinu, enhancing the radial gradient of sea level pressure and generating stronger low-level radial inflow, which was more favorable for TC intensification. At the same time, the initial mid-level cold anomaly in Koinu also inhibited its intensification. The appearance of the warm core in Mawar was closely related to subsidence-induced warming, a process that did not occur in Koinu. Therefore, the difference in the initial altitudes of the warm cores in Mawar and Koinu ultimately led to a higher TC genesis efficiency (with warm-core occurrence taken as the start time in the definition) for Mawar than for Koinu. Moreover, the warm cores in both Koinu and Mawar appeared before obvious intensification and before the systems reached tropical storm strength, indicating that the warm core is a prerequisite for TC genesis.
Accurately simulating regional climate over East Asia has become increasingly important for understanding the impacts of global climate change. To overcome the coarse spatial resolution of Global Climate Models (GCMs), this study develops a novel Regional Climate Model Emulator (RCM-Emulator) based on deep learning techniques and conducts high-resolution downscaling experiments over East Asia. The proposed model integrates high-resolution topographic and land-sea mask constraints, and introduces additional inputs of incoming shortwave radiation and surface latent heat flux for temperature and precipitation, respectively, thereby enhancing its sensitivity to energy balance and moisture transport processes. Furthermore, a Bernoulli-Gamma loss function is adopted to address the highly skewed nature of precipitation distributions and to improve the representation of extreme rainfall. Results demonstrate that the model faithfully reconstructs near-surface temperature and precipitation fields in homogeneous experiments using RegCM4 simulations, with minimal spatial biases and RMSE values significantly lower than those of bilinear interpolation. When transferred to the ERA5 reanalysis dataset without additional calibration, the emulator successfully reproduces the spatial patterns and temporal variations of the reference data, exhibiting strong cross-dataset generalization. Overall, the proposed RCM-Emulator achieves high physical consistency and computational efficiency, enabling the generation of multi-year regional climate fields within minutes. This approach provides a promising, high-accuracy, and low-cost alternative for regional climate studies, ensemble simulations, and climate risk assessments.
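As a minimal sketch of the Bernoulli-Gamma idea above: precipitation is modeled as a point mass at zero (dry) mixed with a Gamma density for wet amounts, and the loss is the negative log-likelihood. The shape/scale parameterization below is an assumption for illustration; the paper's exact network outputs and clipping are not specified in the abstract.

```python
import math

def bernoulli_gamma_nll(y, p, alpha, beta):
    """Per-sample negative log-likelihood of a Bernoulli-Gamma mixture.

    y     : observed precipitation amount (>= 0)
    p     : predicted probability of rain (y > 0)
    alpha : Gamma shape parameter (assumed parameterization)
    beta  : Gamma scale parameter (assumed parameterization)
    """
    eps = 1e-8
    if y <= 0.0:
        # dry case: only the Bernoulli "no rain" term contributes
        return -math.log(max(1.0 - p, eps))
    # wet case: Bernoulli occurrence term plus the Gamma log-density
    log_gamma_pdf = ((alpha - 1.0) * math.log(y)
                     - y / beta
                     - alpha * math.log(beta)
                     - math.lgamma(alpha))
    return -(math.log(max(p, eps)) + log_gamma_pdf)
```

Averaging this quantity over a batch gives a differentiable training objective that handles the mass of exact zeros and the heavy right tail of rainfall in one loss.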
Atmospheric turbulence dissipation refers to the conversion of turbulence kinetic energy into thermal energy. The turbulent dissipation rate is a crucial parameter for quantifying turbulence intensity, mixing, and transport characteristics, and it is also an important indicator in engineering applications such as aviation safety and wind power generation. Radiosonde observations are widely used to obtain vertical atmospheric profiles of wind, temperature, and humidity. However, because turbulence dissipation occurs at the smallest continuous scales of the atmosphere (millimeter and millisecond scales), radiosondes cannot observe the dissipation rate directly. To overcome this limitation and enrich vertical profile observations of turbulent dissipation rates, a machine learning approach is developed based on large eddy simulation data of the convective boundary layer. An XGBRegressor model is trained to predict dissipation from the vertical profiles of key meteorological variables, including wind, potential temperature, and pressure, as well as their vertical gradients. Model performance is evaluated in terms of feature extraction, nonlinear modeling, and generalization capability. The results demonstrate that the proposed model exhibits solid diagnostic skill, outperforming the classic Thorpe diagnostic model for dissipation rates. Furthermore, the model generalizes well to vertical resolutions other than those of the training datasets. This machine-learning model provides an alternative approach for profiling turbulence dissipation rates from radiosonde data, and can potentially be used for the parameterization of turbulence dissipation rates in PBL schemes.
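A rough sketch of the regression setup described above, with scikit-learn's GradientBoostingRegressor standing in for the XGBRegressor and all data synthetic: each sample is one height level, with local wind, potential temperature, pressure, and their vertical gradients as predictors. The synthetic "dissipation" target (a shear-production-like quantity in log space) is purely illustrative, not the LES-derived truth used in the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical per-level features named in the abstract.
n = 500
wind = rng.uniform(1, 15, n)            # wind speed (m/s)
theta = rng.uniform(290, 310, n)        # potential temperature (K)
press = rng.uniform(700, 1000, n)       # pressure (hPa)
dwind_dz = rng.normal(0, 5e-3, n)       # vertical wind shear (1/s)
dtheta_dz = rng.normal(5e-3, 2e-3, n)   # stability (K/m)
X = np.column_stack([wind, theta, press, dwind_dz, dtheta_dz])

# Illustrative target: log10 of a shear-driven "dissipation" plus noise.
log_eps = np.log10(1e-4 + (wind * np.abs(dwind_dz)) ** 1.5) + rng.normal(0, 0.1, n)

# Gradient boosting stands in for the XGBRegressor used in the study.
model = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
model.fit(X[:400], log_eps[:400])
r2 = model.score(X[400:], log_eps[400:])  # skill on held-out levels
```

Predicting in log space is a common choice for dissipation rates, which span several orders of magnitude through the boundary layer.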
The gust factor, an indicator of gust intensity, is widely used to analyze gust characteristics during typhoon events. This study analyzes gust factors during Typhoon Mangkhut using 13-level two-dimensional ultrasonic wind data from the Shenzhen Meteorological Gradient Tower. The results show that the gust factor primarily ranges from 1 to 1.75, and gust intensity gradually decreases with height. The gust factor is larger during the severe typhoon stage, owing to the low mean wind speed while the typhoon center is still far from the tower; it decreases significantly as landfall approaches and increases again after the system weakens to a tropical storm. The gust factor decreases with increasing wind speed and approaches 1 under high wind speeds. It is also negatively correlated with the 10-minute mean wind speed across layers, with stronger correlations within the same layer. The gust factor is more pronounced in the lower layers and under low wind speeds. It is lower under the dominant wind directions (northerly and easterly) but higher under the less frequent southerly and westerly winds. In the wind direction range of 157.5° to 292.5°, values in the lower layers account for approximately 55% of the total. At high gust factor values, the wind direction shifts noticeably at 10 m and 20 m.
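For reference, the gust factor is conventionally the peak short-duration gust divided by the period-mean wind speed. A minimal sketch, assuming the common 3 s gust / 10 min mean convention (the abstract confirms the 10-minute mean; the 3 s window and 10 Hz sampling rate are our assumptions):

```python
import numpy as np

def gust_factor(wind, hz=10, gust_s=3, mean_s=600):
    """Gust factor: maximum gust_s-second running mean over the mean_s mean.

    wind   : 1-D sequence of instantaneous wind speed samples
    hz     : sampling rate in Hz (10 Hz assumed for a sonic anemometer)
    gust_s : gust averaging window in seconds (3 s convention assumed)
    mean_s : mean-wind window in seconds (600 s = 10 min, as in the study)
    """
    wind = np.asarray(wind, dtype=float)[: mean_s * hz]
    w = gust_s * hz
    # running gust_s-second means; the largest one is the gust
    running = np.convolve(wind, np.ones(w) / w, mode="valid")
    return running.max() / wind.mean()
```

A perfectly steady wind gives a gust factor of exactly 1, and the value grows as short bursts stand out against a low mean, which matches the low-wind-speed behavior reported above.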
With the rapid development of Internet technology, recommendation systems play an increasingly important role in addressing information overload. However, traditional recommendation methods often overlook the complex latent relationships between users' personalized features and items, leading to suboptimal performance. To tackle this issue, we propose PKGRec, a feature-interactive graph neural network recommendation model based on personal knowledge graphs. PKGRec integrates users' personal knowledge graphs with public knowledge graphs and captures complex interaction patterns among entities through a feature-entity interaction layer. Furthermore, we design a preference-aware attention mechanism that enables fine-grained user representation learning based on the user's interaction weights with different items, effectively enhancing the model's expressive power. We evaluate our model on two large-scale real-world datasets, NetEase Cloud Music and KuaiRec. Experimental results show that PKGRec significantly outperforms eight strong baselines, including BPRMF, NFM, and CKE, across three evaluation metrics: Precision, Recall, and NDCG. Notably, PKGRec exhibits clear advantages in cold-start and long-tail recommendation scenarios, validating the effectiveness of personal knowledge graphs in enhancing recommendation systems.
Knowledge Tracing (KT) dynamically assesses and tracks students' knowledge mastery based on their historical learning trajectories, enabling the prediction of future learning performance. As a core technology in online learning systems, KT facilitates personalized learning experiences. While existing deep neural network-based KT models (e.g., DKT, DKVMN) have demonstrated significant advantages over traditional methods, they typically require large-scale training data. Early-stage interactions, where student response data are extremely sparse, pose substantial challenges to training complex and effective deep KT models. To address this limitation, we propose MetaKT (Meta-Learning-Enhanced Knowledge Tracing), a framework that leverages meta-learning to enhance early-stage KT performance. Given a target KT task and several related auxiliary tasks, MetaKT first pre-trains the model on the auxiliary tasks and then fine-tunes it on the target task's limited data until convergence. Experiments on seven public datasets, with DKT and DKVMN as backbones, demonstrate that MetaKT improves AUC in 27 and 33 out of 35 test scenarios for DKT and DKVMN, respectively.
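The pre-train-then-fine-tune recipe above can be sketched generically. The toy below uses a plain logistic-regression "backbone" trained by gradient descent in place of DKT/DKVMN, with synthetic auxiliary and target tasks sharing similar (but not identical) ground-truth weights; everything here is an illustrative assumption, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_logreg(X, y, w, lr=0.1, epochs=200):
    """Full-batch gradient descent for logistic regression (toy backbone)."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

d = 8
w_true = rng.normal(size=d)

def make_task(n, shift=0.0):
    """Synthetic task: labels from a (possibly perturbed) linear teacher."""
    X = rng.normal(size=(n, d))
    logits = X @ (w_true + shift * rng.normal(size=d))
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
    return X, y

X_aux, y_aux = make_task(2000, shift=0.2)   # plentiful auxiliary data
X_tgt, y_tgt = make_task(40)                # sparse early-stage target data
X_test, y_test = make_task(1000)

# MetaKT-style recipe: pre-train on auxiliary tasks, fine-tune on the target.
w = train_logreg(X_aux, y_aux, np.zeros(d))
w = train_logreg(X_tgt, y_tgt, w, lr=0.05, epochs=50)

def acc(w):
    return float((((X_test @ w) > 0) == (y_test > 0.5)).mean())
```

The point of the sketch is the transfer structure: the auxiliary pre-training supplies an initialization so that the 40-sample target task only needs a short fine-tuning run.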
Large Language Models (LLMs) have demonstrated significant potential in tabular data generation. However, they often struggle to accurately preserve the statistical dependencies between columns. To address this challenge, we propose TabProLLM, a probabilistic prompting framework that generates numerical and categorical columns separately, using strategies grounded in probability distributions. For numerical columns, we fit a Gaussian Mixture Model (GMM) to decompose the empirical distribution into multiple Gaussian components; prompts are then constructed from these component distributions to guide the LLM in generating realistic numerical values. For categorical columns, we condition on a reference numerical column by partitioning its range and computing the conditional probability distribution of each category within each interval. These conditional probabilities are embedded into the prompt design to steer the generation of categorical data consistent with the observed inter-variable dependencies. During prompt construction, correlation coefficients and other statistical measures are incorporated to verify that the generated data preserve the correlation structure of the original dataset. Experimental results on 10 public datasets show that TabProLLM, while ensuring strong data privacy, achieves performance gains of 0.5% to 18.3% over existing methods across multiple fidelity metrics in the SDMetrics toolkit, including RangeCoverage, CategoryCoverage, KSComplement, and TVComplement. On the CorrelationSimilarity metric, TabProLLM performs comparably to the state-of-the-art TabDDPM model and surpasses GPT-4o (using mean-variance prompts) by approximately 4.1%. Furthermore, in privacy evaluations, TabProLLM achieves top or second-best performance on the DCR and NNDR metrics (evaluated at the 5th percentile), highlighting its robust privacy-preserving capability.
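The numerical-column step can be sketched concretely: fit a GMM to the column, then render each component's weight, mean, and standard deviation into a plain-language prompt constraint. The column name, wording, and two-component choice below are illustrative assumptions; the actual TabProLLM prompt templates are not given in the abstract.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical bimodal numerical column (e.g., an "income"-like variable).
values = np.concatenate([rng.normal(30, 5, 300), rng.normal(80, 10, 200)])

# Decompose the empirical distribution into Gaussian components.
gmm = GaussianMixture(n_components=2, random_state=0).fit(values.reshape(-1, 1))

# Turn each component into a constraint line for the LLM prompt.
lines = []
for weight, mean, cov in zip(gmm.weights_, gmm.means_.ravel(), gmm.covariances_.ravel()):
    lines.append(
        f"- about {weight:.0%} of values drawn from N(mean={mean:.1f}, std={np.sqrt(cov):.1f})"
    )
prompt = "Generate values for column 'income' so that:\n" + "\n".join(lines)
```

Prompting per component, rather than with a single global mean and variance, is what lets the LLM reproduce multi-modal numerical distributions.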
Complications of diabetes mellitus are important contributors to patient mortality, and revealing their key features can help physicians develop targeted intervention strategies to reduce the risk of death in comorbid conditions. However, most previous studies have focused on identifying risk factors for a single diabetic complication, ignoring potential associations between complications. Therefore, based on the Diabetes Complications Early Warning Dataset provided by the National Population Health Sciences Data Centre, we used Pearson's correlation coefficient and the chi-square test to screen out significantly associated diabetic complications and incorporated them into a multi-task learning model for joint modeling. The importance of each feature was then assessed using SHAP (SHapley Additive exPlanations), and 11 features with SHAP values above the 75th percentile were screened as significant risk factors for diabetic comorbidities. Predictive models for diabetes-related complications were constructed using random forest, logistic regression, gradient boosting, extreme gradient boosting, adaptive boosting, and categorical feature gradient boosting, with input variables comprising features whose SHAP values exceeded the 25th percentile. Optimal parameter combinations were selected via grid search, and predictive performance was evaluated using metrics including accuracy, precision, F1-score, and AUC. Results indicated that features selected through the interpretable multi-task learning model constituted key predictors, with all six predictive models achieving AUC values approaching 0.90. Finally, LIME (Local Interpretable Model-Agnostic Explanations) was introduced to interpret the model outcomes, further validating the effectiveness and reliability of the constructed interpretable multi-task learning model for screening key features.
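The percentile-threshold screening step can be sketched as follows. For a self-contained example, random-forest impurity importance stands in for the SHAP values used in the study, and the 12-feature dataset is entirely synthetic (only the first features actually drive the label):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical dataset: 12 clinical features, first three informative.
n, d = 600, 12
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2] + rng.normal(0, 0.5, n) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Screening rule from the study, with impurity importance as a SHAP proxy:
# keep features whose importance exceeds the 75th percentile.
imp = forest.feature_importances_
selected = np.flatnonzero(imp > np.percentile(imp, 75))
```

By construction the strict 75th-percentile cut keeps roughly the top quarter of features, which is how the study narrows a larger clinical feature set down to its 11 key risk factors.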
The interpretable multi-task learning model comprehensively accounts for the underlying relationships between complications, enabling precise identification of key risk factors for concurrent diabetic complications. This assists clinicians in formulating targeted intervention strategies, thereby helping to reduce patient mortality attributable to complications.
In multi-instance learning (MIL), data objects are hierarchically organized as bags containing multiple instances. The well-known MIL embedding approach represents each bag as a vector by selecting representative instances. However, most existing methods ignore the hierarchical structure of bags, so the generated key instance set (KIS) contains substantial outlier instances (instances that cannot trigger the bag label). Additionally, the KIS is not used to exclude outliers within bags, which degrades embedding quality. To address these issues, we propose the HKMIL (Hierarchical Key Instance Selection for Multi-Instance Embedding Learning) algorithm, built on three techniques. First, hierarchical instance selection (HIS) uses subspace- and affinity-based updates to identify and refine the KIS, generating new bags while accounting for instance density. Second, the Fisher vector embedding (FVE) technique uses Gaussian mixture models to extract key statistical information from the new bags, converting them into vectors to simplify the MIL problem. Third, the ensemble classification technique (ECT) dynamically weights the information before and after KIS updates for improved bag-label prediction. Experiments on six MIL tasks show that HKMIL outperforms nine state-of-the-art algorithms, achieving superior classification performance.
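The FVE step above can be illustrated with a simplified Fisher-vector-style embedding: fit one GMM over the pooled instances, then summarize each bag by its mean component responsibilities and responsibility-weighted deviations from the component means. The KIS-refinement step is omitted, and the bags are synthetic; this is a sketch of the embedding idea, not HKMIL itself.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical MIL data: each bag is a variable-size set of 2-D instances.
bags = [rng.normal(loc=rng.choice([0.0, 3.0]), scale=1.0,
                   size=(rng.integers(5, 15), 2))
        for _ in range(20)]

# One GMM fitted over the pooled instances of all bags.
gmm = GaussianMixture(n_components=3, random_state=0).fit(np.vstack(bags))

def embed(bag):
    """Fixed-length bag vector: mean responsibilities plus
    responsibility-weighted mean deviations from component means."""
    r = gmm.predict_proba(bag)                       # (n_inst, K)
    dev = bag[:, None, :] - gmm.means_[None, :, :]   # (n_inst, K, 2)
    weighted = (r[:, :, None] * dev).mean(axis=0)    # (K, 2)
    return np.concatenate([r.mean(axis=0), weighted.ravel()])

vectors = np.array([embed(b) for b in bags])  # one row per bag
```

Because every bag maps to the same fixed-length vector regardless of its instance count, a standard single-instance classifier can then be trained on `vectors`, which is exactly how embedding approaches simplify the MIL problem.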
Recent remarkable advances in machine learning (ML) have inspired the exploration of novel algorithms for solving differential equations. After nearly three decades of development, numerous ML-based solvers have emerged, demonstrating significant performance advantages in specific scenarios. However, recent studies have revealed a widespread and systematic omission of negative results in the current literature, leading to an overly optimistic bias in the academic assessment of ML's capabilities for solving differential equations. Consequently, there is an urgent need for more comprehensive empirical evidence to objectively evaluate algorithmic efficacy, particularly to establish a rational understanding of failure cases and performance boundaries. This study investigates the widely used Physics-Informed Neural Network (PINN) framework for solving the Landau–Lifshitz–Gilbert (LLG) equation, the core governing equation in micromagnetics. By systematically varying the magnetocrystalline anisotropy constant (
In practical adaptive filtering systems, stochastic processing delays and heterogeneous measurement noises, such as Gaussian noise and impulsive noise, are commonly encountered. However, existing variable step-size least mean square (VSSLMS) algorithms typically assume a delay-free system in their analysis. To address this limitation, we propose a stochastic delay-tolerant robust VSSLMS algorithm. The proposed method leverages two key advantages of the Squareplus function. First, it is inherently smooth, which stabilizes gradient estimation under time-delayed conditions. Second, it can suppress nonlinear interference arising from multiple types of noise distributions. We theoretically analyze the algorithm's mean square error (MSE) and steady-state MSE to evaluate its performance. Furthermore, system identification experiments are conducted via simulation to verify the effectiveness of the proposed algorithm. The experimental results align well with the theoretical analysis and demonstrate superior performance compared with existing adaptive filtering algorithms. Consequently, the proposed algorithm not only achieves better steady-state performance but also exhibits enhanced robustness in the presence of stochastic time delays and diverse measurement noises.
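For context, the Squareplus function is the smooth, softplus-like map f(x) = (x + sqrt(x² + b)) / 2. The sketch below wires it into a simple system-identification LMS loop: the specific step-size rule (shrinking the step when the squared error is large and therefore likely impulsive) is our illustrative assumption, not the update derived in the paper, and the stochastic-delay aspect is omitted.

```python
import numpy as np

def squareplus(x, b=4.0):
    """Squareplus: smooth softplus-like function (x + sqrt(x^2 + b)) / 2."""
    return 0.5 * (x + np.sqrt(x * x + b))

def robust_vsslms(x, d, order=8, mu0=0.05, tau=1.0):
    """LMS with a Squareplus-controlled variable step size (assumed rule)."""
    w = np.zeros(order)
    for n in range(order - 1, len(x)):
        u = x[n - order + 1:n + 1][::-1]      # most recent inputs first
        e = d[n] - w @ u                      # a priori error
        # large e^2 (likely an impulse) drives the step size toward zero
        mu = mu0 / (1.0 + squareplus(e * e - tau))
        w += mu * e * u
    return w

rng = np.random.default_rng(0)
w_true = rng.normal(size=8)                  # unknown FIR system
x = rng.normal(size=5000)                    # white input
d = np.convolve(x, w_true)[:len(x)]          # system output
d += rng.normal(0, 0.05, len(x))             # Gaussian background noise
impulses = rng.random(len(x)) < 0.01         # sparse impulsive noise
d[impulses] += rng.normal(0, 20.0, int(impulses.sum()))

w_hat = robust_vsslms(x, d)
```

Because the Squareplus term grows roughly linearly in e² for large errors, an impulsive sample contributes only a vanishing update, while small steady-state errors leave the step size near mu0, which is the qualitative robustness behavior the abstract describes.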
