PMLPP
This is an experimental Machine Learning module for the Pandemonium Engine.
Based on https://github.com/novak-99/MLPP .
Status
- Reworked codestyle to match the engine's.
- Reworked classes to work well with the engine.
- Added custom MLPPVector and MLPPMatrix classes that store their data in a single contiguous array. Their downside is that push back and remove use realloc().
- Added binds for most of the classes.
- Added unit tests.
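The contiguous-storage trade-off mentioned in the list above can be sketched roughly as follows. `FlatMatrix` is a hypothetical stand-in written for this README, not the module's MLPPMatrix, and it uses `std::vector` rather than realloc() to keep the example short. It only illustrates the row-major layout and why adding or removing rows is comparatively expensive.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration only; this is not MLPPMatrix.
class FlatMatrix {
public:
	FlatMatrix(std::size_t rows, std::size_t cols) :
			_rows(rows), _cols(cols), _data(rows * cols, 0.0) {}

	std::size_t rows() const { return _rows; }
	std::size_t cols() const { return _cols; }

	// Element (r, c) lives at offset r * cols + c in the single buffer.
	// No bounds checking, to keep the sketch short.
	double get(std::size_t r, std::size_t c) const { return _data[r * _cols + c]; }
	void set(std::size_t r, std::size_t c, double v) { _data[r * _cols + c] = v; }

	// Appending a row grows the buffer, which may reallocate and copy
	// every existing element.
	void push_row(const std::vector<double> &row) {
		if (row.size() != _cols) {
			return; // Reject rows of the wrong width.
		}
		_data.insert(_data.end(), row.begin(), row.end());
		++_rows;
	}

	// Removing a row shifts all following rows down by one.
	void remove_row(std::size_t r) {
		if (r >= _rows) {
			return;
		}
		const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(r * _cols);
		const std::ptrdiff_t last = static_cast<std::ptrdiff_t>((r + 1) * _cols);
		_data.erase(_data.begin() + first, _data.begin() + last);
		--_rows;
	}

private:
	std::size_t _rows;
	std::size_t _cols;
	std::vector<double> _data;
};
```

Element access stays a single multiply and add, while every row insertion or removal touches the whole tail of the buffer.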
Todos
Saves
Reimplement saving.
Bind remaining methods
Go through and bind all methods. Also add properties as needed.
Add initialization api to all classes that need it
The original library used constructors to initialize everything, but scripts in the engine can't rely on this. Make sure all classes have initialization APIs, and that they bail out when they are in an uninitialized state.
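One possible shape for such an initialization API is sketched below. `ExampleRegression`, `initialize()` and `is_initialized()` are hypothetical names chosen for this illustration and do not describe any existing class in the module. The point is only that default construction is harmless and every method checks the initialized flag before doing work.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model class, for illustration only.
class ExampleRegression {
public:
	// Default construction leaves the object uninitialized, since a
	// script cannot be relied on to pass constructor arguments.
	ExampleRegression() : _initialized(false), _bias(0.0) {}

	// Explicit initialization step that replaces the old constructor.
	// Returns false and leaves the object untouched on invalid input.
	bool initialize(const std::vector<double> &weights, double bias) {
		if (weights.empty()) {
			return false;
		}
		_weights = weights;
		_bias = bias;
		_initialized = true;
		return true;
	}

	bool is_initialized() const { return _initialized; }

	// Bails out with a neutral value instead of touching unset state.
	double evaluate(const std::vector<double> &x) const {
		if (!_initialized || x.size() != _weights.size()) {
			return 0.0;
		}
		double sum = _bias;
		for (std::size_t i = 0; i < x.size(); ++i) {
			sum += _weights[i] * x[i];
		}
		return sum;
	}

private:
	bool _initialized;
	std::vector<double> _weights;
	double _bias;
};
```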
Rework remaining apis.
Rework and bind the remaining apis, so they can be used from scripts.
Error handling
Make error macro usage consistent. A command line option should also be available that disables them for math operations.
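As a rough sketch of how checks could be switched off for math code, assume the command line option ends up as a preprocessor define. The macro `EXAMPLE_MATH_FAIL_COND_V`, the define `EXAMPLE_MATH_CHECKS_DISABLED` and the function `example_dot` are all hypothetical names made up for this example; they are not the engine's error macros.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical names throughout; these are not the engine's macros.
// Building with -DEXAMPLE_MATH_CHECKS_DISABLED compiles the check away.
#ifdef EXAMPLE_MATH_CHECKS_DISABLED
#define EXAMPLE_MATH_FAIL_COND_V(m_cond, m_retval) ((void)0)
#else
#define EXAMPLE_MATH_FAIL_COND_V(m_cond, m_retval) \
	do { \
		if (m_cond) { \
			std::fprintf(stderr, "Condition \"%s\" is true.\n", #m_cond); \
			return m_retval; \
		} \
	} while (0)
#endif

// Dot product guarded by the optional check.
double example_dot(const std::vector<double> &a, const std::vector<double> &b) {
	EXAMPLE_MATH_FAIL_COND_V(a.size() != b.size(), 0.0);
	double sum = 0.0;
	for (std::size_t i = 0; i < a.size(); ++i) {
		sum += a[i] * b[i];
	}
	return sum;
}
```

With the checks compiled out, a size mismatch is no longer caught, so callers become responsible for passing valid input. That is the trade-off such an option would make in exchange for speed.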
Crashes
There are likely still lots of crashes. Find and fix them.
Unit tests
- Add more unit tests
- Also use the engine's own unit test module. It still needs to be finished, so it would be a good idea to do that alongside this module's tests.
- They should only be built when you want them. Command line option:
mlpp_tests=yes
Old classes
- Remove all old classes once they are no longer needed.
- Old classes should only be built when you want them. Command line option:
mlpp_tests=yes
or maybe something else?
- Make old classes use old utilities.
- Remove old apis from new classes that have been ported.
std::random
Replace remaining std::random usage with engine internals.
Tensor
Add a tensor class. It should work the same way as MLPPVector and MLPPMatrix, except that it is n-dimensional.
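A minimal sketch of what n-dimensional indexing over one contiguous buffer could look like is given below. `ExampleTensor` is a hypothetical class for illustration, and the row-major layout is an assumption carried over from the usual convention for flat matrices; the eventual class may be designed differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of an n-dimensional container; not part of the module.
class ExampleTensor {
public:
	explicit ExampleTensor(const std::vector<std::size_t> &shape) :
			_shape(shape) {
		// The buffer holds the product of all dimension sizes.
		std::size_t total = 1;
		for (std::size_t dim : _shape) {
			total *= dim;
		}
		_data.assign(total, 0.0);
	}

	std::size_t size() const { return _data.size(); }

	// Row-major offset: the last index varies fastest.
	// Assumes index has one in-range entry per dimension (no bounds checking).
	std::size_t offset(const std::vector<std::size_t> &index) const {
		std::size_t off = 0;
		for (std::size_t i = 0; i < _shape.size(); ++i) {
			off = off * _shape[i] + index[i];
		}
		return off;
	}

	double get(const std::vector<std::size_t> &index) const {
		return _data[offset(index)];
	}

	void set(const std::vector<std::size_t> &index, double value) {
		_data[offset(index)] = value;
	}

private:
	std::vector<std::size_t> _shape;
	std::vector<double> _data;
};
```

With a shape of one or two dimensions, the same offset formula reduces to the vector and matrix cases.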
More algos
Add more machine learning algorithms.
Contents of the Library
- Regression
- Linear Regression
- Logistic Regression
- Softmax Regression
- Exponential Regression
- Probit Regression
- CLogLog Regression
- Tanh Regression
- Deep, Dynamically Sized Neural Networks
- Possible Activation Functions
- Linear
- Sigmoid
- Softmax
- Swish
- Mish
- SinC
- Softplus
- Softsign
- CLogLog
- Logit
- Gaussian CDF
- RELU
- GELU
- Sign
- Unit Step
- Sinh
- Cosh
- Tanh
- Csch
- Sech
- Coth
- Arsinh
- Arcosh
- Artanh
- Arcsch
- Arsech
- Arcoth
- Possible Optimization Algorithms
- Batch Gradient Descent
- Mini-Batch Gradient Descent
- Stochastic Gradient Descent
- Gradient Descent with Momentum
- Nesterov Accelerated Gradient
- Adagrad Optimizer
- Adadelta Optimizer
- Adam Optimizer
- Adamax Optimizer
- Nadam Optimizer
- AMSGrad Optimizer
- 2nd Order Newton-Raphson Optimizer*
- Normal Equation*
- Possible Loss Functions
- MSE
- RMSE
- MAE
- MBE
- Log Loss
- Cross Entropy
- Hinge Loss
- Wasserstein Loss
- Possible Regularization Methods
- Lasso
- Ridge
- ElasticNet
- Weight Clipping
- Possible Weight Initialization Methods
- Uniform
- Xavier Normal
- Xavier Uniform
- He Normal
- He Uniform
- LeCun Normal
- LeCun Uniform
- Possible Learning Rate Schedulers
- Time Based
- Epoch Based
- Step Based
- Exponential
- Prebuilt Neural Networks
- Multilayer Perceptron
- Autoencoder
- Softmax Network
- Generative Modeling
- Tabular Generative Adversarial Networks
- Tabular Wasserstein Generative Adversarial Networks
- Natural Language Processing
- Word2Vec (Continuous Bag of Words, Skip-Gram)
- Stemming
- Bag of Words
- TFIDF
- Tokenization
- Auxiliary Text Processing Functions
- Computer Vision
- The Convolution Operation
- Max, Min, Average Pooling
- Global Max, Min, Average Pooling
- Prebuilt Feature Detectors
- Horizontal/Vertical Prewitt Filter
- Horizontal/Vertical Sobel Filter
- Horizontal/Vertical Scharr Filter
- Horizontal/Vertical Roberts Filter
- Gaussian Filter
- Harris Corner Detector
- Principal Component Analysis
- Naive Bayes Classifiers
- Multinomial Naive Bayes
- Bernoulli Naive Bayes
- Gaussian Naive Bayes
- Support Vector Classification
- Primal Formulation (Hinge Loss Objective)
- Dual Formulation (Via Lagrangian Multipliers)
- K-Means
- k-Nearest Neighbors
- Outlier Finder (Using z-scores)
- Matrix Decompositions
- SVD Decomposition
- Cholesky Decomposition
- Positive Definiteness Checker
- QR Decomposition
- Numerical Analysis
- Numerical Differentiation
- Univariate Functions
- Multivariate Functions
- Jacobian Vector Calculator
- Hessian Matrix Calculator
- Function approximator
- Constant Approximation
- Linear Approximation
- Quadratic Approximation
- Cubic Approximation
- Differential Equation Solvers
- Euler's Method
- Growth Method
- Mathematical Transforms
- Discrete Cosine Transform
- Linear Algebra Module
- Statistics Module
- Data Processing Module
- Setting and Printing Datasets
- Available Datasets
- Wisconsin Breast Cancer Dataset
- Binary
- SVM
- MNIST Dataset
- Train
- Test
- Iris Flower Dataset
- Wine Dataset
- California Housing Dataset
- Fires and Crime Dataset (Chicago)
- Feature Scaling
- Mean Normalization
- One Hot Representation
- Reverse One Hot Representation
- Supported Color Space Conversions
- RGB to Grayscale
- RGB to HSV
- RGB to YCbCr
- RGB to XYZ
- XYZ to RGB
- Utilities
- TP, FP, TN, FN function
- Precision
- Recall
- Accuracy
- F1 score
Citations
Various materials helped me along the way while creating ML++, and I would like to give credit to several of them here. This article by TutorialsPoint was a big help when implementing the determinant of a matrix, and this article by GeeksForGeeks was very helpful when computing the adjoint and inverse of a matrix.