猜你喜欢
材料信息学导论(上)

材料信息学导论(上)

书籍作者:张统一 ISBN:9787030728982
书籍语言:简体中文 连载状态:全集
电子书格式:pdf,txt,epub,mobi,azw3 下载次数:7900
创建日期:2023-05-06 发布日期:2023-05-06
运行环境:PC/Windows/Linux/Mac/IOS/iPhone/iPad/Kindle/Android/安卓/平板
内容简介

材料信息学是一门新兴的交叉学科,为在材料基因组理念下加速材料科学研究和技术发展提供了一个全新的方法。作为材料和力学学者,作者在推动材料信息学发展方面做了大量工作,在人工智能(AI)、机器学习(ML)和材料科学技术融合交叉方面,有诸多的尝试和心得体会。作者旨在写一本易懂的材料信息学简介,以进一步推动材料信息学的发展。为便于读者尽快理解和掌握材料信息学的核心内容,兼顾成书的完整性,《材料信息学导论.上,机器学习基础》分为上下两卷,上卷侧重于机器学习基础,下卷侧重于深度学习并综述材料信息学的现状及发展前景。
  本上卷共十二章,内容包括线性回归与线性分类、支持向量机、决策树和K近邻(KNN)、集成学习、贝叶斯定理和期望放大(EM)算法、符号回归、神经网络、隐型马尔可夫链、数据预处理和特征选择、可解释性机器学习,等等。叙述力求从简单明了的数学定义和物理图像出发,密切结合材料科学研究案例,给出了各种算法的详细步骤,便于读者学习和运用。

目录
Contents
Foreword
Preface
Symbols and Notations
Chapter 1 Introduction 1
References 13
Chapter 2 Linear Regression 15
2.1 Least Squares Linear Regression 15
2.2 Principal Component Analysis and Principal Component Regression 26
2.3 Least Absolute Shrinkage and Selection Operator (L1) 37
2.4 Ridge Regression (L2) 40
2.5 Elastic Net Regression 44
2.6 Multiply Task LASSO (MultiTaskLASSO) 49
Homework 52
References 53
Chapter 3 Linear Classification 55
3.1 Perceptron 57
3.2 Logistic Regression 60
3.3 Linear Discriminant Analysis 73
Homework 80
References 82
Chapter 4 Support Vector Machine 83
4.1 SVC 83
4.2 Kernel Functions 88
4.3 Soft Margin 96
4.4 SVR 102
Homework 108
References 110
Chapter 5 Decision Tree and K-Nearest-Neighbors (KNN) 112
5.1 Classification Trees 112
5.2 Regression Tree 121
5.3 K-Nearest-Neighbors (KNN) Methods 129
Homework 133
References 134
Chapter 6 Ensemble Learning 136
6.1 Boosting 137
6.1.1 AdaBoost 137
6.1.2 Gradient Boosting Machine (GBM) 145
6.1.3 eXtreme Gradient Boosting (XGBoost) 151
6.2 Bagging 153
Homework 158
References 159
Chapter 7 Bayesian Theorem and Expectation-Maximization (EM) Algorithm 160
7.1 Bayesian Theorem 160
7.2 Naive Bayes Classifier 161
7.3 Maximum Likelihood Estimation 168
7.3.1 Gaussian distribution 168
7.3.2 Weibull distribution 170
7.4 Bayesian Linear Regression 175
7.5 Expectation-Maximization (EM) Algorithm 184
7.5.1 Gaussian mixture model (GMM) 185
7.5.2 The mixture of Lorentz and Gaussian distributions 197
7.6 Gaussian Process (GP) Regression 209
Homework 219
References 219
Chapter 8 Symbolic Regression 221
8.1 Overview of Evolutionary Computation 221
8.2 Genetic Programming 223
8.3 Grammar-Guided Genetic Programming and Grammatical Evolution 225
8.4 The Application of LASSO in Symbolic Regression 234
Homework 235
References 235
Chapter 9 Neural Networks 238
9.1 Neural Networks and Perceptron 238
9.2 Back Propagation Algorithm 241
9.3 Regularization in NNs 250
9.3.1 L1 regularization 250
9.3.2 L2 regularization 257
9.4 Classification NNs 261
9.4.1 Binary classification 261
9.4.2 Multiclassification of multiply grades in a category 267
9.5 Autoencoders 272
9.5.1 Introduction 272
9.5.2 Denoising autoencoder 273
9.5.3 Sparse autoencoder 280
9.5.4 Variational autoencoder 288
Homework 311
References 312
Chapter 10 Hidden Markov Chains 313
10.1 Markov Chain 313
10.2 Stationary Markov Chain 317
10.3 Markov Chain Monte Carlo Methods 318
10.3.1 Metropolis Hastings (M-H) algorithm 320
10.3.2 Gibbs sampling algorithm 321
10.4 Calculation Methods for the Probability of Observation Sequence 325
10.4.1 Direct method 325
10.4.2 Forward method 328
10.4.3 Backward method 330
10.5 Estimation of Optimal State Sequence 332
10.5.1 Direct method 332
10.5.2 Viterbi algorithm 333
10.6 Estimation of Intrinsic Parameters—The Baum-Welch Algorithm 334
Homework 344
References 345
Chapter 11 Data Preprocessing and Feature Selection 347
11.1 Reliable Data, Normals and Anomalies 348
11.1.1 Local outlier factor 348
11.1.2 Isolated forest 352
11.1.3 One-class support vector machine 355
11.1.4 Support vector data description 361
11.2 Feature Selection 365
11.2.1 Filter approach 366
11.2.2 Wrapper approach 394
11.2.3 Embedded approach 402
Homework 408
References 408
Chapter 12 Interpretative SHAP Value and Partial Dependence Plot 410
12.1 SHapley Additive exPlanation value 410
12.2 The joint SHAP value of two features 426
12.3 Partial Dependence Plot 427
Homework 440
References 440
Appendix 1 Vector and Matrix 442
A1.1 Definition 442
A1.1.1 Vector 442
A1.1.2 Matrix 442
A1.2 Matrix Algebra 442
A1.2.1 Inverse and transpose 442
A1.2.2 Trace 443
A1.2.3 Determinant 443
A1.2.4 Eigenvalues and eigenvectors 444
A1.2.5 Singular value decomposition (SVD) 444
A1.2.6 Pseudo inverse 445
A1.2.7 Some useful identities 445
A1.3 Matrix Analysis 446
A1.3.1 Derivative of matrix 446
A1.3.2 Derivative of the determinant of a matrix 446
A1.3.3 Derivative of an inverse matrix 447
A1.3.4 Jacobian matrix and Hessian matrix 447
A1.3.5 The chain rule 447
References 447
Appendix 2 Basic Statistics 448
A2.1 Probability 448
A2.1.1 Joint probability 448
A2.1.2 Bayesian theorem and conjugation 448
A2.1.3 Probability density of continuous variables 449
A2.1.4 Quantile function 449
A2.1.5 Expectation, variance and covariance of random variables 449
A2.2 Distributions 449
A2.2.1 Bernoulli distribution 450
A2.2.2 Binomial distribution 450
A2.2.3 Poisson distribution 450
A2.2.4 Gaussian distribution 450
A2.2.5 Weibull distribution 451
A2.2.6 The chi-square (χ2) distribution and χ2-test 451
A2.2.7 Th