A library created to revitalize C++ as a machine learning front end. Per aspera ad astra.

ML++

Machine learning is a vast and exciting discipline, garnering attention from specialists in many fields. Unfortunately, for C++ programmers and enthusiasts, there appears to be a lack of support in the field of machine learning. This library was written to fill that void and give C++ a true foothold in the ML sphere. The intent is for it to act as a crossroads between low-level developers and machine learning engineers.

Installation

Begin by downloading the header files for the ML++ library. You can do this by cloning the repository and extracting the MLPP directory within it, as well as the "MLPP.so" file.

git clone https://github.com/novak-99/MLPP

After doing so, keep the ML++ source files in a local directory and include them as follows:

#include "MLPP/Stat/Stat.hpp" // Including the ML++ statistics module. 

int main(){
    // ...
}

Finally, after you have finished creating your project, compile it using g++. Be sure to keep the MLPP.so file in a local directory.

g++ main.cpp MLPP.so --std=c++17

Usage

Please note that ML++ uses the std::vector<double> data type to emulate vectors, and the std::vector<std::vector<double>> data type to emulate matrices.
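
For example, a 3-dimensional vector and a 2x3 matrix (two examples with three features each) would be declared like so:

std::vector<double> vec = {1.0, 2.0, 3.0};                // a 3-dimensional vector
std::vector<std::vector<double>> mat = {{1.0, 2.0, 3.0},
                                        {4.0, 5.0, 6.0}}; // a 2x3 matrix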

Begin by including the respective header file of your choice.

#include "MLPP/LinReg/LinReg.hpp"

Next, instantiate an object of the class. Don't forget to pass the input set and output set as parameters.

LinReg model(inputSet, outputSet);

Afterwards, call the optimizer that you would like to use. For iterative optimizers such as gradient descent, pass in the learning rate, the number of epochs, and whether or not to utilize the UI panel.

model.gradientDescent(0.001, 1000, 0); // learning rate, number of epochs, UI panel (0 = off)

Great, you are now ready to test! To test a single instance, use the following function:

model.modelTest(testSetInstance);

This will return the model's prediction for that single example.

To test an entire test set, use the following function:

model.modelSetTest(testSet);

The result will be the model's predictions for the entire dataset.
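
Putting these steps together, a minimal end-to-end program might look like the sketch below. The dataset values are illustrative placeholders, and the return type of modelSetTest is captured with auto since it is not specified above:

#include "MLPP/LinReg/LinReg.hpp" // Including the ML++ linear regression module.
#include <vector>

int main(){
    // Toy dataset: each inner vector is one training example with one feature.
    std::vector<std::vector<double>> inputSet = {{1}, {2}, {3}, {4}, {5}};
    std::vector<double> outputSet = {2, 4, 6, 8, 10};

    LinReg model(inputSet, outputSet);     // pass the input set and output set
    model.gradientDescent(0.001, 1000, 0); // learning rate, epochs, UI panel off

    auto predictions = model.modelSetTest(inputSet); // predictions for the whole set
    return 0;
}

Compile it as described in the Installation section: g++ main.cpp MLPP.so --std=c++17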

Contents of the Library

  1. Regression
    1. Linear Regression
    2. Logistic Regression
    3. Softmax Regression
    4. Exponential Regression
    5. Probit Regression
    6. CLogLog Regression
    7. Tanh Regression
  2. Deep, Dynamically Sized Neural Networks
    1. Possible Activation Functions
      • Linear
      • Sigmoid
      • Softmax
      • Swish
      • Mish
      • SinC
      • Softplus
      • Softsign
      • CLogLog
      • Logit
      • Gaussian CDF
      • ReLU
      • GELU
      • Sign
      • Unit Step
      • Sinh
      • Cosh
      • Tanh
      • Csch
      • Sech
      • Coth
      • Arsinh
      • Arcosh
      • Artanh
      • Arcsch
      • Arsech
      • Arcoth
    2. Possible Loss Functions
      • MSE
      • RMSE
      • MAE
      • MBE
      • Log Loss
      • Cross Entropy
      • Hinge Loss
    3. Possible Regularization Methods
      • Lasso
      • Ridge
      • ElasticNet
    4. Possible Weight Initialization Methods
      • Uniform
      • Xavier Normal
      • Xavier Uniform
      • He Normal
      • He Uniform
      • LeCun Normal
      • LeCun Uniform
  3. Prebuilt Neural Networks
    1. Multilayer Perceptron
    2. Autoencoder
    3. Softmax Network
  4. Natural Language Processing
    1. Word2Vec (Continuous Bag of Words, Skip-Gram)
    2. Stemming
    3. Bag of Words
    4. TF-IDF
    5. Tokenization
    6. Auxiliary Text Processing Functions
  5. Computer Vision
    1. The Convolution Operation
    2. Max, Min, Average Pooling
    3. Global Max, Min, Average Pooling
    4. Prebuilt Feature Detectors
      • Horizontal/Vertical Prewitt Filter
      • Horizontal/Vertical Sobel Filter
      • Horizontal/Vertical Scharr Filter
      • Horizontal/Vertical Roberts Filter
  6. Principal Component Analysis
  7. Naive Bayes Classifiers
    1. Multinomial Naive Bayes
    2. Bernoulli Naive Bayes
    3. Gaussian Naive Bayes
  8. Support Vector Classification
    1. Primal Formulation (Hinge Loss Objective)
    2. Dual Formulation (Via Lagrange Multipliers)
  9. K-Means
  10. k-Nearest Neighbors
  11. Outlier Finder (Using z-scores)
  12. Matrix Decompositions
    1. Singular Value Decomposition (SVD)
    2. Cholesky Decomposition
      • Positive Definiteness Checker
    3. QR Decomposition
  13. Numerical Analysis
  14. Linear Algebra Module
  15. Statistics Module
  16. Data Processing Module
    1. Setting and Printing Datasets
    2. Feature Scaling
    3. Mean Normalization
    4. Reverse One Hot Representation
  17. Utilities
    1. TP, FP, TN, FN function
    2. Precision
    3. Recall
    4. Accuracy
    5. F1 score

What's in the Works?

ML++, like most frameworks, is dynamic and constantly changing. This is especially important in the world of ML, as new algorithms and techniques are developed day by day. Here are a few things currently being developed for ML++:

- Convolutional Neural Networks

- Kernels for SVMs

- Support Vector Regression

Citations

Various materials helped me along the way of creating ML++, and I would like to credit several of them here. This article by TutorialsPoint was a big help when implementing the determinant of a matrix, and this article by GeeksForGeeks was very helpful for taking the adjoint and inverse of a matrix. Lastly, I would like to thank this article by Towards Data Science, which helped illustrate a practical definition of the hinge loss function and its gradient when optimizing with SGD.