- Defining the Goal
- Estimating Complexity
- Machine Learning
- Machine Learning Engineering
- Machine Learning Team
- ML Cost
- ML Impact
- Model-Based vs Instance-Based Learning
- Notation
- Parameters and Hyperparameters
- Reinforcement Learning
- Semi-Supervised Learning
- Shallow Learning
- Shallow vs Deep Learning
- Supervised Learning
- Unsupervised Learning
- When to (not) Use ML
- Why ML Projects Fail
- Adaptive Synthetic Sampling Method (ADASYN)
- Adversarial Validation
- Causes of Data Leakage
- Class Weighting
- Common Problems with Data
- Concept Drift
- Data Augmentation
- Data Bias
- Data Imputation
- Data Leakage
- Data Manipulation Best Practices
- Data Noise
- Data Partitioning
- Data Sampling
- Dealing with Missing Attributes
- Distribution Shift
- Feedback Loop
- Good Data
- Homoscedasticity
- Imbalanced Data
- Interaction Data
- Multicollinearity
- Outliers
- Oversampling
- Questions about Data
- Raw and Tidy Data
- Storing Data
- Synthetic Minority Oversampling Technique (SMOTE)
- Tomek Links
- Training and Holdout Datasets
- Undersampling
- Feature Vector
- Feature Engineering
- Properties of Good Features
- Bag of Words
- Boruta
- Dimensionality Reduction
- Feature Hashing
- Feature Scaling
- Feature Selection
- Mean Encoding
- Normalization
- One-Hot Encoding
- Standardization
- Storing and Documenting Features
- Synthesizing Features
- t-SNE
- The Curse of Dimensionality
- Accuracy
- Almost Correct Prediction Error Rate
- AUC-ROC
- Baseline
- Cohen's Kappa Statistic
- Confusion Matrix
- Cross-Entropy Loss
- Cumulative Gain
- Discounted Cumulative Gain
- Entropy
- F-Score
- Gini Index
- Hinge Loss
- Ideal Discounted Cumulative Gain
- Information Gain
- Mean Absolute Error
- Mean Average Precision
- Mean Squared Error
- Median Absolute Error
- Model Performance Metrics
- Normalized Discounted Cumulative Gain
- Overfitting
- Precision
- Precision-Recall Tradeoff
- Properties of a Successful Model
- R-squared
- Recall
- Receiver Operating Characteristic
- Underfitting
- Selecting the Learning Algorithm
- AdaBoost
- AutoEncoder
- Bagging
- Clustering
- Decision Tree
- Density-Based Spatial Clustering of Applications with Noise
- Ensemble Methods
- Ensemble of Resampled Datasets
- Gaussian Mixture Models
- Gradient Boosting Machines
- Hierarchical Clustering
- Isomap
- K-Means Clustering
- k-Nearest Neighbors
- Latent Dirichlet Allocation (LDA)
- Latent Semantic Analysis (LSA)
- Linear Regression
- Logistic Regression
- Naive Bayes
- Principal Component Analysis
- Random Forest
- Singular Value Decomposition (SVD)
- Spectral Clustering
- Stacking
- Support Vector Machines
- Tokenization
- Topic Modelling
- Transfer Learning
- Backpropagation
- Convolutional Neural Networks
- Deep Learning
- Deep Learning Optimization Algorithms
- Deep Learning Strategy
- Feedforward Neural Networks
- Gradient Descent
- Handling Multiple Inputs and Outputs
- Long Short-Term Memory
- Neural Networks
- Non-convex Optimization Problems
- Parameter Initialization
- Recurrent Neural Networks
- Stochastic Gradient Descent
- Transformers

Use the sidebar to navigate the topics.