Projects



    Book Recommendation System Using Goodreads Dataset

    Goals: Create an AI book recommendation system based on the Goodreads raw review dataset for spoiler detection.

    Results: See below. Project work is still ongoing.

    Motivation

    This project was completed for Physics 5680 (Big Data Analytics for Physics). It was one of several pre-defined project options in the course: create an AI book recommendation system based on the Goodreads dataset. The project pipeline is shown below.

    Pipeline

    Report

    A report was required as part of the project and is attached below. It outlines all aspects of the project, including the training of BERT-Tiny, the confusion matrix, the pipeline, the collaborative filtering, and the results.

    Project IPYNB

    The PDF version of the Jupyter Notebook used to create the project is attached below for viewing.

    See the full project on Github

    Dementia Classification AI

    Goals: Revisit the project for OSU Hack AI 2024 and make the model not overfit.

    Results: Stacked the 61 image layers of each MRI scan into 3D tensors, built a TensorFlow functional model, and trained it with GPU acceleration on an NVIDIA RTX 3060, reaching a test accuracy of 97.31%.

    Motivation

    We wanted to create a classification model that could detect whether a patient has dementia from a set of 61 layers of an MRI scan. If successful, the model could serve as a support tool for doctors. It may not be the definitive way to make a diagnosis, but it could help doctors to see which classification the model predicts and with what confidence.


    Data

    The data comes from the OASIS Alzheimer’s dataset, a public dataset consisting of 80,000 MRI images. Since each patient accounts for 61 MRI images, the dataset holds over 1,300 patients’ worth of data (80,000 / 61 ≈ 1,311). The data was downloaded from Kaggle, but it can also be accessed through the OASIS website.


    The HackAI 2024 Model

    As part of the OSU AI Club Hack AI 2024, our team selected this project to try to complete within the 24-hour hackathon. We downsized the OASIS images and converted them to grayscale to decrease training time and complexity. We stacked each patient’s 61 MRI images into a 3D tensor as a NumPy ndarray and saved the patient’s 3D MRI scan as an .npz file. Because few patients were classified with mild or moderate dementia, we then oversampled some of those samples through duplication. Finally, we created a TensorFlow functional model to train on the sample data.
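
    The duplication step can be sketched in plain Python. This is only an illustration of oversampling by duplication, not the code used at the hackathon: the function name, the file names, the class labels, and the number of copies are all hypothetical.

```python
import random
from collections import Counter


def oversample_by_duplication(samples, labels, minority_labels, copies, seed=0):
    """Return new (samples, labels) lists in which every sample whose label is
    in `minority_labels` appears `copies` times; other samples appear once."""
    out_samples, out_labels = [], []
    for sample, label in zip(samples, labels):
        repeat = copies if label in minority_labels else 1
        out_samples.extend([sample] * repeat)
        out_labels.extend([label] * repeat)
    # Shuffle with a fixed seed so duplicates are not adjacent in the output
    order = list(range(len(out_samples)))
    random.Random(seed).shuffle(order)
    return [out_samples[i] for i in order], [out_labels[i] for i in order]


# Hypothetical file names and class labels, for illustration only
samples = ["p01.npz", "p02.npz", "p03.npz", "p04.npz", "p05.npz"]
labels = ["none", "none", "none", "mild", "moderate"]
new_samples, new_labels = oversample_by_duplication(
    samples, labels, minority_labels={"mild", "moderate"}, copies=3
)
print(Counter(new_labels))
```

    In general, duplicating only within the training split keeps copies of the same scan from appearing in both the training and testing sets.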

    Although we had a good training accuracy, we had a relatively low testing accuracy (indicating overfitting). Unfortunately, because we did not have access to a GPU at the hackathon, we could not GPU accelerate the model, and we did not have enough time to test a simpler model. The training accuracy was in the high 90% range, but the testing accuracy hovered in the high 70% range.

    Although this repository is being integrated with the Hack AI 2024 repository (meaning the files here will now also be there), the older version of the model can still be found by looking back in the commits here.


    This Model & Results

    We revisited this project to try to create a better model. We continued to use the stacked 3D tensors, which are generated by the npz_generation.py file.

    We created a simpler TensorFlow functional model and trained it with GPU acceleration on an NVIDIA RTX 3060 to decrease training time. We added the ModelCheckpoint and EarlyStopping training callbacks to save the best possible model. We got a training accuracy of 100% (a rounded value, not exactly 100%) with a testing accuracy of 97.31%, indicating that the model generalizes much better this time around.
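
    The idea behind combining a best-model checkpoint with early stopping can be illustrated in plain Python. This sketch is not the Keras implementation and not the project's configuration: the monitored quantity (validation loss), the patience value, and the loss history are assumptions made for the example.

```python
def train_with_early_stopping(val_losses, patience):
    """Walk through per-epoch validation losses and return
    (best_epoch, best_loss, stopped_epoch), using 0-based epoch indices.

    best_epoch stands in for the saved checkpoint; training stops once
    `patience` consecutive epochs fail to improve on the best loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    stopped_epoch = len(val_losses) - 1  # default: all epochs were run
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: this is where a checkpoint would be written
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                stopped_epoch = epoch
                break
    return best_epoch, best_loss, stopped_epoch


# Made-up validation losses, for illustration only
history = [0.90, 0.60, 0.45, 0.47, 0.44, 0.46, 0.48, 0.50, 0.43]
best_epoch, best_loss, stopped_epoch = train_with_early_stopping(history, patience=3)
print(best_epoch, best_loss, stopped_epoch)  # 4 0.44 7
```

    With a patience of 3, training in this example halts at epoch 7 and the model kept is the one from epoch 4, even though a later epoch in the made-up history would have been slightly better.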

    Loss and accuracy graphs:

    Accuracy Graph Loss Graph

    Confusion matrix:

    Confusion Matrix

    See the full project on Github

    Starr Programming Language

    Goals: Create a custom programming language written in C++, Flex, and Bison, using LLVM and CMake.

    Results: Work on this project is ongoing, but the preliminary basic phase has been completed.

    Example Syntax:

    int one(int a) {
      int x = a * 5
      return x + 3
    }
    
    int two() {
      return 5 % 3
    }
    
    out(one(12)) // Outputs 63
    out(two()) // Outputs 2
    

    See the current project on Github

    Dot on Wheel Animation

    Goals: Use Mathematica to create an animation of a dot on a wheel that is rolling at a constant angular velocity.

    Problem:

    A wheel with radius R has a constant angular velocity in the -z direction (so the wheel rolls in +x). There is a dot on the wheel a distance p away from the center where p>0. We wish for R>p, but we do explore when p>R and even when p=R. We wish to animate the motion of this dot.
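
    The path of the dot can be written in closed form. The sketch below is in Python rather than Mathematica and rests on assumptions not stated above: the wheel rolls without slipping along y = 0, and the dot starts directly below the center. The function name is hypothetical.

```python
import math


def dot_position(t, R, p, omega):
    """Position (x, y) of the dot at time t for a wheel of radius R rolling
    without slipping along y = 0, with the dot starting directly below the
    center at distance p from it.

    omega is the z-component of the angular velocity; omega < 0 rolls toward +x.
    """
    theta = -omega * t  # angle rolled through, positive for motion in +x
    x = R * theta - p * math.sin(theta)  # center is at x = R * theta
    y = R - p * math.cos(theta)          # center is at height R
    return x, y


# R = 2 and angular velocity -2, with the three p values shown in the results
for p in (1, 2, 4):
    start = dot_position(0.0, R=2, p=p, omega=-2)
    top = dot_position(math.pi / 2, R=2, p=p, omega=-2)  # half a revolution
    print(p, start, top)
```

    Under these assumptions the curve is a curtate cycloid for p<R, an ordinary cycloid for p=R (the dot touches the ground once per revolution), and a prolate cycloid for p>R (the dot dips below the ground line and loops backward).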

    Result:

    We go through many different results and iterations of the animation. The entire process is on the Github tied to the project. These are the final results for R=2 and an angular velocity of -2, with p=1, p=2, and p=4.

    p=1: Result Gif

    p=2: Result Gif 2

    p=4: Result Gif 3

    Note that the original project used a .mov file. The resulting .gif files are of lower quality than the original video.

    See the full project on Github

    Graphene Unsupervised AI Clustering

    Goals: Create a custom Graphene layer identification model through AI.

    Results: Incomplete, but the preliminary results are good. Work on the project has currently stalled.

    Example Output:

    Graphene Clustered

    See the current project on Github

    There was a poster presentation associated with this project. Out of discretion for the others involved, it has not been included on this website.

    Chocolate - Vanilla Minecraft Extension

    Goals: Use FabricAPI and Gradle to implement custom behavior in the popular game Minecraft. Work with multiple collaborators throughout the project.

    Implementations:

    This project implements a lot. To see the custom behavior, check the wiki of the repository.

    Addition Example:

    Quartz Spike

    See the full project on Github

    House Price AI, Revisited

    Goals: Revisit the Scikit-Learn House Price prediction AI and redo it with TensorFlow, excluding location data.

    Results: A model with an r2 of 0.9946 that generalizes better than the previous one.

    Process:

    Use Pandas to read a csv into a DataFrame. Drop all date and location information and ensure we are only working with numerical data for regression. Deal with null values accordingly, and then split the data into training and testing sets. Define a model that normalizes the data, passes it through 4 ReLU Dense layers of 512 units each with L2 regularization, and finishes with a single-unit Dense layer. Train the model with an Adam optimizer at a 0.001 learning rate, using MAE loss, and then compare the predictions to the true values.
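
    The final comparison step can be made concrete with the two metrics mentioned in this project, MAE and r2. The sketch below defines both in plain Python from their standard formulas; the function names are local to this example (they are not library calls), and the prices are made up.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up house prices, for illustration only
y_true = [200_000, 350_000, 500_000, 650_000]
y_pred = [210_000, 340_000, 520_000, 640_000]

print("MAE:", mean_absolute_error(y_true, y_pred))
print("r2: ", r2_score(y_true, y_pred))
```

    An r2 of 1 means the predictions match the true values exactly, while an r2 of 0 means the model does no better than always predicting the mean.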

    Project File:

    See the full project on Github

    Hemoglobin Binding Project

    Goals: Use Jupyter Notebook, LaTeX, and a MATLAB kernel to do data fitting on experimental hemoglobin binding data. Curve fit different models to the data, including the "Non-Cooperative", Pauling, and Adair models. Run a Monte Carlo simulation on the Adair model, but use brute force for the "Non-Cooperative" and Pauling models. Learn how to derive the models using the grand partition function. Create a poster in LaTeX and present it. Part of the Polaris Program at Ohio State.

    Results: We see experimentally that oxygen binding to hemoglobin is cooperative to some extent.
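
    For reference, all three models can be expressed through one binding polynomial derived from the grand partition function. The Python sketch below uses common textbook forms, which are an assumption here: the project's own derivations and parameterization may differ. In it, x is the oxygen concentration (or partial pressure), K is a binding constant, alpha is a pairwise interaction factor, and all function names are hypothetical.

```python
def saturation(x, coeffs):
    """Fractional saturation Y = <n> / N for a binding polynomial
    Xi(x) = sum_n coeffs[n] * x**n, using <n> = x * (dXi/dx) / Xi."""
    n_sites = len(coeffs) - 1
    xi = sum(c * x**n for n, c in enumerate(coeffs))
    mean_bound = sum(n * c * x**n for n, c in enumerate(coeffs))
    return mean_bound / (n_sites * xi)


def non_cooperative_coeffs(K):
    # Four independent, identical sites: Xi = (1 + K x)^4
    return [1, 4 * K, 6 * K**2, 4 * K**3, K**4]


def pauling_coeffs(K, alpha):
    # Assumed tetrahedral form: every pair of occupied sites contributes a
    # factor alpha (2 bound -> 1 pair, 3 bound -> 3 pairs, 4 bound -> 6 pairs)
    return [1, 4 * K, 6 * alpha * K**2, 4 * alpha**3 * K**3, alpha**6 * K**4]


def adair_coeffs(K1, K2, K3, K4):
    # Stepwise binding constants for the 1st through 4th oxygen
    return [1, K1, K1 * K2, K1 * K2 * K3, K1 * K2 * K3 * K4]


x, K = 0.5, 2.0
print(saturation(x, non_cooperative_coeffs(K)))     # equals K*x / (1 + K*x)
print(saturation(x, pauling_coeffs(K, alpha=3.0)))  # alpha > 1: cooperative
```

    In this form, setting alpha = 1 in the Pauling model, or choosing the Adair constants as 4K, 3K/2, 2K/3, and K/4, recovers the non-cooperative curve, so both can be read as extensions of it.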

    Project File:

    Derivations:

    Poster (Obscured):

    See the full project on Github

    PG Stock Price Prediction RNN w/ TensorFlow

    Goals: Learn about Neural Networks and RNNs. Get comfortable with TensorFlow. Predict the Procter & Gamble stock (closing) price.

    Results: A model whose predictions are fairly close to the true stock prices over the predicted range (the shape matches well, but exact values are not attained).

    Process:

    Collect csv data into a Pandas DataFrame. Enumerate the data to make it all numbers and then visualize the data to see what we’re working with. We then reshape and compile the closing price data and do the same to the high and low price values. We concatenate these using NumPy into one array and scale the data. We split the data into training and testing sets, and then use TensorFlow and Keras to generate a Sequential model with an LSTM and a Dense layer. After compiling it and training it on the training set, we predict the closing stock price for the test set and graph the results to inspect the accuracy.
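
    The scaling and reshaping steps can be illustrated in plain Python. The sketch below assumes min-max scaling and a sliding-window framing (a common way to prepare a series for an LSTM); the project may frame its sequences differently. The function names, the window length, and the prices are made up for this example.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(series, window):
    """Split a series into (inputs, targets): each input is `window`
    consecutive values and its target is the value that follows them."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Made-up closing prices, for illustration only
closes = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]
scaled = min_max_scale(closes)
X, y = make_windows(scaled, window=3)

for window_values, target in zip(X, y):
    print(window_values, "->", target)
```

    Each row of X would then be one input sequence for the recurrent layer, and the matching entry of y is the value the model learns to predict.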

    Snippets:

    PG Stock Price Graph: PG Price

    The RNN Summary: PG Price

    The Prediction: PG Price

    The Prediction w/ More Data: PG Price

    Full Scope of Prediction (Vertical Line Where Prediction Starts): PG Price

    See the full project on Github

    Scikit-Learn House Price AI

    Goals: Get accustomed to Jupyter Notebooks, Scikit-Learn, and simple regression AI modeling. Learn concepts such as normalization, imputation, enumeration, the foundations of CRISP-DM, and the basics of AI modeling.

    Results: A model with an r2 of 0.9999968, but one that does not generalize to arbitrary houses because location data is factored into the model.

    Process:

    Use Pandas to read a csv into a DataFrame. Enumerate the data to get a frame with only numbers. Check for unusable data and use imputation, if needed, to fill in missing values. After inspecting graphs of the data, normalize the data and filter it accordingly. Split the data into training and test sets, and then fit a KNeighborsRegressor model on the training set. Then, we predict on the test set and measure the error. Finally, we fiddle with the model a bit to find the most accurate one, and then we’re done.
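
    What a k-nearest-neighbors regressor does at prediction time can be sketched in a few lines of plain Python. This is an illustration of the technique with Euclidean distance and uniform weights, not Scikit-Learn's implementation; the function name, the features, and the prices are hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=5):
    """Predict by averaging the targets of the k training points nearest to
    `query` (Euclidean distance, every neighbor weighted equally)."""
    distances = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = distances[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Made-up normalized features (e.g. size, rooms) and prices in thousands
train_X = [(0.1, 0.2), (0.2, 0.2), (0.8, 0.9), (0.9, 0.8)]
train_y = [100.0, 120.0, 400.0, 420.0]

print(knn_predict(train_X, train_y, query=(0.15, 0.2), k=2))   # 110.0
print(knn_predict(train_X, train_y, query=(0.85, 0.85), k=2))  # 410.0
```

    Because predictions depend directly on distances between feature vectors, normalizing the features first keeps any single large-valued column from dominating the neighbor search.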

    Snippets:

    Information Gain on Parameters (VarianceThreshold Not Pictured): Information Gain

    The r2 Values: R-Squared Values

    The Final Model: Final Model

    See the full project on Github

    PandoraPvP

    About: Pandora PvP is a collection of Spigot Minecraft plugins for a designated modded Minecraft public server. The plugins ranged from economy to utility and moderation. My personal work was on the development side, where I worked on server optimizations regarding instant block placement, world border creation, staff command logs, and moderator utility commands. Although my time on the project only spanned a couple of months, I enjoyed working on it.

    What is Spigot?: Minecraft is a video game made in Java (for non-console systems). Spigot is one of many server-hosting options for Minecraft, like Bukkit, Paper, Yatopia, and more. Spigot allows programmers to create plugins that are loaded with the server as enhancements. Using Spigot’s API, more optimizations and customizability are available for developers.

    This project helped me learn how to organize many tasks into smaller projects, and taught me how to work with more complicated programming systems, like APIs, documentation, and Maven.

    See the full project on Github

    Discord Bot Applications

    About: Discord is a messaging platform with many quality-of-life features. Any developer can create a Discord Bot to add enhancements to the application, which can be fun things (like games) or more useful things (like moderation abilities). Starting in May 2020, I worked on a Discord Bot called Source Code. Later, I worked on a larger bot called Psyduck. Both bots were for personal use only, and never became public verified Discord Bots.

    Through this project I learned a lot about Python development and asynchronous programming. Additionally, I learned about cloud hosting environments like Heroku and Google Cloud VM to host the bot 24/7. I worked a lot with JSON and SQLite as well to store important data. This project helped fuel my love for computer science and also taught me how to handle a large-scale project.

    Screenshots:
    Uptime Bug Ping Status

    See an obscured version of one of my bots (Psyduck) on Github

    © 2022-2024 Neil Ghugare, All Rights Reserved