Journal of Nanjing University (Natural Science), 2021, Vol. 57, Issue (4): 627-640. doi: 10.13232/j.cnki.jnju.2021.04.011
Zhiliang Yan, Zhipeng Feng, Dan Liu, Huiqing Wang
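Abstract:
Lysine acetylation (Kace) is widespread among human metabolic enzymes and is closely associated with a variety of metabolic diseases, so accurately identifying Kace sites is important for research on the treatment of these diseases. Most existing Kace site prediction methods take sequence-level protein information as input and do not fully account for protein structural properties; in addition, their feature extraction ignores the sequential dependencies between amino acid residues, which causes substantial information loss and lowers prediction accuracy. This paper proposes CL-Kace, a new deep learning model for Kace site prediction. CL-Kace introduces protein structural properties and combines them with the raw protein sequence and amino acid physicochemical properties to construct the site feature space, from which a convolutional neural network (CNN) extracts features. A bidirectional long short-term memory (BiLSTM) network is then introduced to capture the sequential dependencies between residues, which improves the network's abstraction ability and helps identify potential Kace sites. Experimental results show that CL-Kace outperforms existing Kace site predictors and can effectively predict potential sites.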
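As a rough illustration of how a raw protein sequence might enter such a site feature space, the sketch below extracts a fixed-length window around each lysine (K) and one-hot encodes it. This is only an assumption about a common preprocessing step for site predictors, not the paper's confirmed pipeline: the function names `lysine_windows` and `one_hot`, the flank size of 3, and the `X` padding symbol are all hypothetical, and the structural and physicochemical features described in the abstract are not covered.

```python
# Hypothetical helper names; the flank size and padding symbol are assumptions,
# not values taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
PAD = "X"  # assumed padding symbol for positions beyond the sequence ends


def lysine_windows(sequence, flank=3):
    """Return (position, window) pairs, one per lysine (K) in the sequence.

    Each window has length 2 * flank + 1 and is centered on the lysine.
    """
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in `sequence` sits at index i + flank in `padded`,
            # so the slice starting at i is centered on that residue.
            windows.append((i, padded[i:i + 2 * flank + 1]))
    return windows


def one_hot(window):
    """Encode a window as a list of 20-dimensional 0/1 vectors.

    Padding or unknown residues become all-zero vectors.
    """
    return [[1 if residue == aa else 0 for aa in AMINO_ACIDS] for residue in window]


if __name__ == "__main__":
    for pos, win in lysine_windows("MKTAYIAKQR"):
        print(pos, win, len(one_hot(win)))
```

A matrix of this kind (window length by 20) is the sort of input a CNN followed by a BiLSTM could consume, with further feature channels concatenated along the second axis.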