Projects



    CURE: Crude Traceback of IC4665 Cluster

    CURE: Class-based Undergraduate Research Experience.

    Throughout this project, we investigated the traceback of the IC4665 star cluster to characterize what the cluster looked like millions of years ago. This was done primarily by leveraging Gaia SQL queries on proper motion and “rewinding” those proper motions to find the right ascension and declination of where each star was previously located. We also made use of Simbad data in the traceback analysis. Our initial prediction was that the cluster members would be closer together in the past than they are today. We do not see precisely this: the cluster tends to stay together over time, but some members spread out slightly in the past. This can be understood from the constraints on our traceback calculations and from the fact that we do not consider any gravitational forces. We also made an animation of the cluster traceback using prior knowledge and open-source code from Drs. Furnstahl and Brandenburg for Physics 5300.

    Description

    As a part of the Astronomy 3350 course (Methods of Astronomical Observation and Data Analysis), our last computational essay involved doing an actual research project (of our choosing) on the IC4665 cluster. Our group chose the “Traceback” project, which involved identifying cluster members by tracing the motion of the cluster back in time using Right Ascension and Declination values. Since this project was intended to be a small undergraduate research experience, the results were unknown in advance.

    Theory

    The traceback of the cluster members was done very crudely. We used public Simbad data to bound the Gaia data to the rough area of the cluster using parallax, R.A., and Dec. We used an SQL query of the following form:

    top_n = 10000  # how many rows we want to fetch from the archive
    
    # The parallax/ra/dec bounds come from the Simbad lookup above
    query = """SELECT TOP %d
    source_id, ra, ra_error, dec, dec_error, parallax, parallax_error, pmra, pmra_error, pmdec, pmdec_error
    FROM gaiadr3.gaia_source
    WHERE parallax >= %.2f AND parallax <= %.2f AND ra >= %.2f
    AND ra <= %.2f AND dec >= %.2f AND dec <= %.2f
    """ % (top_n, parallax_min, parallax_max, ra_min, ra_max, dec_min, dec_max)
    print(query)  # let's see what the query looks like
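
    The query can then be submitted to the Gaia archive. A minimal sketch of one way to do that, assuming the astroquery package (the notebook's exact submission code may differ):

    from astroquery.gaia import Gaia
    
    # Submit the query to the Gaia TAP service and pull the results
    # into a Pandas DataFrame for the traceback analysis.
    job = Gaia.launch_job(query)
    df = job.get_results().to_pandas()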
    

    Using Pandas, we manipulated the data and defined how we were going to do the traceback. If \(q\) is the given R.A. or Dec. in degrees, \(\mu_q\) is the proper motion in R.A. or Dec. in mas/yr, and \(T\) is the time period in years, we can trace that coordinate value (R.A. or Dec. respectively) back via the formula

    \[{\rm Traced\ Result} = \frac{(3.6 \times10^6\ {\rm mas}/1^{\circ})\,q - \mu_q T}{3.6\times10^6\ {\rm mas}/1^{\circ}}\]

    We created a Python function for this, into which we can insert \(n\) million years to trace the result back. We then used Pandas Series analysis to create new columns of the traced motion, and simultaneously traced the location of the cluster center using Simbad data and the same traceback function. Based on Wikipedia, IC4665 is expected to be roughly 55 million years old, so we recognize that we shouldn’t trace back past that amount, and that 1-million-year timesteps are adequate. After tracing back 10 million years, we used Pandas analysis to identify objects close to the cluster center and label those as cluster members. This allowed us to create the animations in the next section, separating cluster members from non-members.
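
    A minimal sketch of what that function and the new columns can look like (the new column names here are illustrative, not necessarily those used in the notebook):

    MAS_PER_DEG = 3.6e6  # milliarcseconds per degree
    
    def traceback(q, mu_q, t_myr):
        """Trace a coordinate back in time.
    
        q     : R.A. or Dec. in degrees
        mu_q  : proper motion in that coordinate, in mas/yr
        t_myr : time to rewind, in millions of years
        """
        t_years = t_myr * 1e6
        return (MAS_PER_DEG * q - mu_q * t_years) / MAS_PER_DEG
    
    # New columns with positions traced back 10 Myr (illustrative names)
    df["ra_traced"] = traceback(df["ra"], df["pmra"], 10)
    df["dec_traced"] = traceback(df["dec"], df["pmdec"], 10)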

    Animations

    We made animations of the system using code adapted from Physics 5300 (Theoretical Mechanics). This lets us create two styles of animation of the cluster: one keeps the axes fixed, and one dynamically updates the axes to follow the cluster. The cyan points are cluster members, while the rest are background objects. This is the animation with the fixed axes:

    The following is the dynamic axes animation:
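
    For reference, a minimal sketch of how such an animation can be assembled with Matplotlib (this is not the Physics 5300 code; the per-timestep traced positions are assumed precomputed, and the axis bounds are placeholders):

    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation
    
    # traced[t] is assumed to hold the (ra, dec) arrays for timestep t (Myr)
    fig, ax = plt.subplots()
    scat = ax.scatter([], [], s=5)
    ax.set_xlim(260, 280)  # fixed-axes style: set the view once (placeholder bounds)
    ax.set_ylim(0, 12)
    
    def update(t):
        ra, dec = traced[t]
        scat.set_offsets(list(zip(ra, dec)))
        ax.set_title(f"{t} Myr ago")
        # Dynamic-axes style: instead re-center the view each frame, e.g.
        # ax.set_xlim(ra.mean() - 2, ra.mean() + 2)
        return (scat,)
    
    anim = FuncAnimation(fig, update, frames=range(11), interval=300)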

    Limitations

    Looking at the above animations, we can do some more careful analysis. We see that the cluster, in general, moves together throughout time, which is what we expect. From the first animation, we can see that the cluster is somewhat more compact in the past, though not as compact as we would generally expect. In the second animation especially, some cluster members appear to get much further apart as you go back in time. This, however, is due to the crude way we traced the cluster members back, and to the fact that we consider no gravitational interactions, which would pull the objects together. Also, due to a lack of Gaia data, we did not use measurements like radial velocity, as most Gaia sources did not have this value readily available. Despite that, seeing that the cluster generally moves together is a good result for this loose analysis.

    Data Notice

    Because this cluster and its data and properties are still actively being investigated, this project is not publicly hosted on Github or other hosting platforms. However, the project .ipynb file submitted for the course may be distributed upon reasonable request.

    Kapitza Pendulum Analysis

    The following project involves the numerical analysis and Manim animation of a Kapitza pendulum system as part of a theoretical mechanics course final project. Specifically, we use Manim animations and Hopf bifurcation analysis to inspect the stability of the inverted position of the pendulum in the high-driving, low-amplitude regime. This is done with a mix of analytical and numerical methods.

    Implementations

    Reading the Project notebook file provides a comprehensive step-by-step breakdown of the project. Inspecting the various .py files shows the different numerical solving methods used to solve this system.

    The project does the following:

    • Derives the Lagrangian, equations of motion, and effective potential for the Kapitza pendulum system, while inspecting the high-driving, low-amplitude regime.
    • Implements numerical methods to solve the system and checks that turning off the driving components recovers the simple pendulum (see the sketch after this list).
    • Provides dynamic widget analysis of the system with varying parameters.
    • Compares \(U(\phi,t)\) and \(U_{\rm eff}(\phi)\) in the high-driving, low-amplitude regime, with a separate PDF derivation.
    • Renders Manim animations of the system, with a normalized state-space axis as well.
    • Performs chaos and bifurcation analysis: uses the Liapunov exponent to judge chaos; renders Manim animations of two pendula for chaos, with an animated Liapunov exponent graph; carries out a discrete Hopf bifurcation analysis with checks on stability vs. instability of the vertical position using the bifurcation plot; and further analyzes the low-amplitude regime, showing periodic motion but no chaos.
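
    A minimal sketch of such a numerical solve, assuming SciPy and the standard Kapitza equation of motion \(\ddot\phi = -\frac{1}{L}\left(g - a\omega^2\cos\omega t\right)\sin\phi\), with \(\phi\) measured from the downward vertical (the project's .py files may organize this differently):

    import numpy as np
    from scipy.integrate import solve_ivp
    
    g, L = 9.81, 1.0        # gravity, pendulum length
    a, omega = 0.05, 100.0  # drive amplitude and frequency (a*omega^2 >> g)
    
    def rhs(t, y):
        phi, phidot = y
        # Kapitza pendulum: the driven pivot modifies the effective gravity
        phiddot = -((g - a * omega**2 * np.cos(omega * t)) / L) * np.sin(phi)
        return [phidot, phiddot]
    
    # Start near the inverted position (phi = pi) and integrate
    sol = solve_ivp(rhs, (0.0, 20.0), [np.pi - 0.1, 0.0],
                    max_step=1e-3, dense_output=True)
    
    # Sanity check: setting a = 0 reduces this to the simple pendulum

    With these (illustrative) parameters the standard criterion \(a^2\omega^2 > 2gL\) for stability of the inverted position is satisfied, so the solution should oscillate around \(\phi = \pi\) rather than fall over.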

    Included in the Github repository:

    • All project files: the project Jupyter notebook, the PDF derivation file, any .py numerical files, and the outputted images.
    • The Conda environment used for the project, built for Apple-silicon (M-series) macOS.
    • The presentation file given as a part of this project.

    Because the Manim animations are large, they are hosted in the releases, in the videos.zip file attached to each release description. One example is shown below; to get it to display, it was converted to a gif, which greatly reduces its overall quality. It is recommended to download the zip for higher-fidelity animations. The zip file is here.

    Here is the (low quality) gif:
    Example Animation as Low-Quality Gif

    See the full project on Github

    AI Docking Port Locator and Distance Regressor for the ISS

    Goals: Create a project for HackAI 2025 at OSU regarding AI docking port location and distance regression for the International Space Station.

    Results: Utilizing the data, a multi-headed network (MHN) was built as the top layer on the MobileNetV3Small architecture. The MHN had three heads, each regressing one of three values: the distance to the ISS, the x-coordinate of the docking port, or the y-coordinate of the docking port. After training, the best model was saved and used to create a three-dimensional visualization showing the line the SpaceX Dragon capsule would need to take to dock with the ISS. An animation was made of this visualization, which utilized public-access STL files of the ISS and Dragon capsule.

    This project won 3rd place in the competition.

    Motivation

    We wanted to create an open-source docking system that can be generalized to different spacecraft. This would aid in the process of docking capsules to space stations and would allow astronauts to direct their attention to aspects other than docking. This project also shows that docking can be generalized: with the right data, similar tasks would be possible (e.g., automated parking, ship docking, airplane-to-gate travel).


    Data

    The data is a set of 10,000 images of the ISS labeled with distance values and the location of the docking port (albeit in a less-usable format). The data came from Kaggle and was originally part of an AICrowd challenge.


    The Model

    After tuning the model, we arrived at the following MHN top layer on the MobileNetV3Small architecture.

    Model Top Layer
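
    A rough sketch of what such a three-headed top and its weighted-loss compile step can look like in Keras (layer sizes, head names, and weights here are illustrative, not the tuned values):

    import tensorflow as tf
    
    base = tf.keras.applications.MobileNetV3Small(
        input_shape=(224, 224, 3), include_top=False, pooling="avg")
    
    x = tf.keras.layers.Dense(128, activation="relu")(base.output)
    # One regression head per target (illustrative names)
    distance = tf.keras.layers.Dense(1, name="distance")(x)
    port_x = tf.keras.layers.Dense(1, name="port_x")(x)
    port_y = tf.keras.layers.Dense(1, name="port_y")(x)
    
    model = tf.keras.Model(base.input, [distance, port_x, port_y])
    model.compile(
        optimizer="adam",
        loss={"distance": "mae", "port_x": "mae", "port_y": "mae"},
        # total loss = weighted sum of the three MAEs (weights tuned by hand)
        loss_weights={"distance": 1.0, "port_x": 1.0, "port_y": 1.0})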

    We trained the model for 50 epochs, implementing early-stopping and model-checkpoint callbacks to stop the training early if the validation loss stopped decreasing (i.e., the model started to overfit) and to save only the best model based on validation loss. Since this was an MHN, we had three loss functions to minimize, one for each head. We chose MAE for all three heads because we are working with distances. The total loss of the model was a weighted sum of these three individual losses, with the weights tuned through trial and error. The result was the following loss graph:

    Loss Graph


    Visualization

    Training yielded the best model to use for visualization and prediction. Utilizing public-access STL files of the ISS and SpaceX Dragon capsule, we created a 3D visualization in Python of the capsule, the ISS, and the line the capsule needs to take to dock, using the predictions from the model. From this visualization, we made an animation showcasing the results of the model. The animation is shown below.

    Visualization Animation

    See the full project on Github

    Book Recommendation System Using Goodreads Dataset

    Goals: Create an AI book recommendation system based on the Goodreads raw review dataset, including spoiler detection.

    Results: See below. Project work is still ongoing.

    Motivation

    This project was part of Physics 5680 (Big Data Analytics for Physics). The project was to create an AI book recommendation system based on the Goodreads dataset; this was a pre-defined project choice out of many offered in the course. Below is a pipeline of the project.

    Pipeline

    Report

    A report was required as part of the project; it outlines all aspects of the work, including the training of BERT-Tiny, the confusion matrix, the pipeline, the collaborative filtering, and the results. It is attached below.
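
    As one illustration of the collaborative-filtering piece, here is a minimal item-based sketch using cosine similarity over a user-by-book rating matrix (a generic sketch under assumed column names, not the project's exact implementation):

    import pandas as pd
    from sklearn.metrics.pairwise import cosine_similarity
    
    # ratings is assumed to have columns: user_id, book_id, rating
    matrix = ratings.pivot_table(index="user_id", columns="book_id",
                                 values="rating").fillna(0)
    
    # Similarity between books, based on which users rated them alike
    sim = pd.DataFrame(cosine_similarity(matrix.T),
                       index=matrix.columns, columns=matrix.columns)
    
    def recommend(book_id, n=5):
        """Return the n books most similar to the given one."""
        return sim[book_id].drop(book_id).nlargest(n)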

    Project IPYNB

    Below is the PDF version of the project's Jupyter notebook, attached for viewing.

    See the full project on Github

    Dementia Classification AI

    Goals: Revisit the project from OSU Hack AI 2024 and make the model not overfit.

    Results: Stacked the 61 images of each MRI scan into 3D tensors, built a TensorFlow functional model, and trained it using GPU acceleration with an NVIDIA RTX 3060 for a test accuracy of 97.31%.

    Motivation

    We wanted to create a classification model that detects whether a patient has dementia based on a set of 61 layers of an MRI scan. If successful, this model could be a support tool for doctors. It may not be the definitive way to diagnose dementia, but it could help doctors see what the model thinks the patient has, and with what confidence.


    Data

    The data comes from the OASIS Alzheimer’s dataset, a public dataset consisting of 80,000 MRI images. Since each patient’s scan comprises 61 MRI images, the dataset covers over 1,300 patients. The data was downloaded from Kaggle, but can also be accessed through the OASIS website.


    The HackAI 2024 Model

    As a part of the OSU AI Club Hack AI 2024, this project was selected by the team to attempt in the 24-hour hackathon. We downsized the OASIS images and converted them to grayscale to decrease training time and complexity. We stacked each patient’s 61 MRI images into a 3D tensor as a NumPy ndarray and saved each patient’s 3D MRI scan as an .npz file. We then amplified some of the moderate-dementia and mild-dementia samples through duplication, as there weren’t many patients with those classifications. We then created a TensorFlow functional model to train on the sample data.

    Although we had good training accuracy, we had relatively low testing accuracy, indicating overfitting. Unfortunately, without access to a GPU at the hackathon, we could not GPU-accelerate training, and we did not have enough time to test a simpler model. The training accuracy was in the high 90s, but the testing accuracy hovered in the high 70s.

    Although this repository is being integrated with the Hack AI 2024 repository (meaning the files here will now also be there), the older version of the model can still be found by looking back in the commits here.


    This Model & Results

    We revisited this model to try to create a better one. We continued to use the stacked 3D tensors; the creation file for them is npz_generation.py.
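
    A minimal sketch of that stacking step (a simplified version of what npz_generation.py does; the file paths and names here are illustrative):

    import numpy as np
    from PIL import Image
    
    def build_patient_tensor(paths):
        """Stack one patient's 61 grayscale slices into a 3D tensor."""
        slices = [np.asarray(Image.open(p).convert("L")) for p in paths]
        return np.stack(slices, axis=0)  # shape: (61, height, width)
    
    # paths_for_patient is an assumed list of 61 image paths for one patient
    tensor = build_patient_tensor(paths_for_patient)
    np.savez_compressed("patient_0001.npz", scan=tensor)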

    We created a simpler TensorFlow functional model and GPU-accelerated training with an NVIDIA RTX 3060 to decrease training time. We added ModelCheckpoint and EarlyStopping training callbacks to save the best possible model. We got a training accuracy of 100% (not truly 100%; it is rounded) with a testing accuracy of 97.31%, indicating that our model generalizes much better this time around.
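
    A rough sketch of a functional model over such 3D inputs with those callbacks (the layer choices and slice size here are illustrative assumptions, not the project's exact architecture):

    import tensorflow as tf
    
    inputs = tf.keras.Input(shape=(61, 128, 128, 1))  # assumed slice size
    x = tf.keras.layers.Conv3D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPool3D(2)(x)
    x = tf.keras.layers.Flatten()(x)
    outputs = tf.keras.layers.Dense(4, activation="softmax")(x)  # 4 classes
    
    model = tf.keras.Model(inputs, outputs)
    model.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
    
    callbacks = [
        tf.keras.callbacks.ModelCheckpoint("best.keras", save_best_only=True),
        tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True),
    ]
    # model.fit(train_ds, validation_data=val_ds, epochs=..., callbacks=callbacks)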

    Loss and accuracy graphs:

    Accuracy Graph Loss Graph

    Confusion matrix:

    Confusion Matrix

    See the full project on Github

    Starr Programming Language

    Goals: Create a custom programming language written in C++, Flex, and Bison, using LLVM and CMake.

    Results: Work on this project is ongoing, but the preliminary basic phase has been completed.

    Example Syntax:

    int one(int a) {
      int x = a * 5
      return x + 3
    }
    
    int two() {
      return 5 % 3
    }
    
    out(one(12)) // Outputs 63
    out(two()) // Outputs 2
    

    See the current project on Github

    Dot on Wheel Animation

    Goals: Use Mathematica to create an animation of a dot on a wheel that is rolling at a constant angular velocity.

    Problem:

    A wheel with radius R has a constant angular velocity in the -z direction (so the wheel rolls in +x). There is a dot on the wheel a distance p from the center, where p>0. The standard case has R>p, but we also explore p>R and even p=R. We wish to animate the motion of this dot.
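
    For reference, the dot traces a trochoid. With rolling angle \(\phi = \omega t\) and the wheel center at height R, the dot's position is (up to a choice of starting phase):

    \[x(t) = R\omega t - p\sin(\omega t), \qquad y(t) = R - p\cos(\omega t)\]

    For p<R this is a curtate cycloid, for p=R the ordinary cycloid, and for p>R a prolate cycloid whose path loops back on itself.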

    Result:

    We go through many different results and iterations of the animation. The entire process is on the Github tied to the project. Below are the final results for R=2 and angular velocity -2, for p=1, p=2, and p=4.

    p=1: Result Gif

    p=2: Result Gif 2

    p=4: Result Gif 3

    Note that the original project used a .mov file; the resulting .gif is of lower quality than the original video.

    See the full project on Github

    Graphene Unsupervised AI Clustering

    Goals: Create a custom graphene layer identification model through AI.

    Results: Incomplete, but preliminary results are promising. Work on the project has currently stalled.

    Example Output:

    Graphene Clustered

    See the current project on Github

    There was a poster presentation associated with this project. Out of discretion for the others involved, it has not been included on this website.

    Chocolate - Vanilla Minecraft Extension

    Goals: Use FabricAPI and Gradle to implement custom behavior in the popular game Minecraft. Work with multiple collaborators throughout the project.

    Implementations:

    This project implements a lot of custom behavior; to see it all, check the wiki of the repository.

    Addition Example:

    Quartz Spike

    See the full project on Github

    House Price AI, Revisited

    Goals: Revisit the Scikit-Learn house price prediction AI and redo it with TensorFlow, excluding location data.

    Results: A model with an r2 of 0.9946 that is better generalized than the previous one.

    Process:

    Use Pandas to read a csv into a DataFrame. Drop all date and location information to ensure we are only working with numerical data for regression. Deal with null values accordingly, and then split the data into training and testing sets. Define a model that normalizes the data and feeds it through four 512-unit ReLU Dense layers with L2 regularization, finishing with a single-unit Dense layer. Train the model with an Adam optimizer at a 0.001 learning rate, using MAE loss, and compare the predictions to the true values.
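
    A minimal sketch of that model definition in Keras, following the description above (the exact notebook code may differ):

    import tensorflow as tf
    from tensorflow.keras import layers, regularizers
    
    # x_train is assumed to be the numeric feature matrix described above
    norm = layers.Normalization()
    norm.adapt(x_train)
    
    model = tf.keras.Sequential([norm])
    for _ in range(4):  # four 512-unit ReLU layers with L2 regularization
        model.add(layers.Dense(512, activation="relu",
                               kernel_regularizer=regularizers.l2()))
    model.add(layers.Dense(1))  # single-unit output: the predicted price
    
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="mae")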

    Project File:

    See the full project on Github

    Hemoglobin Binding Project

    Goals: Use Jupyter Notebook, LaTeX, and a Matlab kernel to do data fitting on experimental hemoglobin binding data. Curve-fit different models to the data, including the "Non-Cooperative", Pauling, and Adair models. Run a Monte Carlo simulation on the Adair model, but use brute force for the "Non-Cooperative" and Pauling models. Learn how to derive the models using the grand partition function. Create a poster and present it in LaTeX. Part of the Polaris Program at Ohio State.
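
    As one representative step of the grand-partition-function derivations, the "Non-Cooperative" model treats hemoglobin's four binding sites as independent and identical. With an effective binding constant K and oxygen partial pressure p, this gives the standard saturation curve (stated here for illustration):

    \[\Xi = (1 + Kp)^4, \qquad Y = \frac{p}{4}\,\frac{\partial \ln \Xi}{\partial p} = \frac{Kp}{1 + Kp}\]

    The cooperative models (Pauling, Adair) modify this by letting the binding constant depend on how many sites are already occupied.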

    Results: We see experimentally that hemoglobin oxygen binding is cooperative to some extent.

    Project File:

    Derivations:

    Poster (Obscured):

    See the full project on Github

    PG Stock Price Prediction RNN w/Tensorflow

    Goals: Learn about neural networks and RNNs. Get comfortable with TensorFlow. Predict Procter & Gamble stock (closing) price.

    Results: A model whose predictions are pretty close to the true stock prices over the predicted range (the shape matches well; exact values are not attained).

    Process:

    Collect csv data into a Pandas DataFrame. Enumerate the data to make it all numeric, then visualize the data to see what we’re working with. We then reshape and compile the closing price data, and do the same to the high and low price values. We concatenate these using NumPy into one array and scale the data. We split the data into training and testing sets, then use TensorFlow and Keras to generate a Sequential model with an LSTM and a Dense layer. After compiling and training it on the training set, we predict the closing price of the test set and graph the results to inspect the accuracy.
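
    A minimal sketch of such a Sequential LSTM model (the window length and unit counts here are illustrative assumptions):

    import tensorflow as tf
    
    window, n_features = 30, 3  # assumed: 30-day windows of close/high/low
    
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(64, input_shape=(window, n_features)),
        tf.keras.layers.Dense(1),  # next-day scaled closing price
    ])
    model.compile(optimizer="adam", loss="mse")
    # model.fit(x_train, y_train, epochs=..., validation_split=0.1)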

    Snippets:

    PG Stock Price Graph: PG Price

    The RNN Summary: PG Price

    The Prediction: PG Price

    The Prediction w/ More Data: PG Price

    Full Scope of Prediction (Vertical Line Where Prediction Starts): PG Price

    See the full project on Github

    Scikit-Learn House Price AI

    Goals: Get accustomed to Jupyter Notebooks, Scikit-Learn, and simple regression AI modeling. Learn concepts such as normalization, imputation, enumeration, the foundations of CRISP-DM, and the basics of AI modeling.

    Results: Model with an r2 of 0.9999968, but it does not generalize to arbitrary houses because location data is factored into the model.

    Process:

    Use Pandas to read a csv into a DataFrame. Enumerate the data to get a frame with only numbers. Check for unusable data and use imputation, if needed, to fill it in. After inspecting graphs of the data, normalize the data and filter it accordingly. Split the data into training and test sets, then fit a KNeighborsRegressor model to the training data. Then we predict on the test set and measure the error. Finally, we fiddle with the model a bit to find the most accurate one, and then we’re done.
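
    A minimal sketch of that pipeline with Scikit-Learn (the column handling and hyperparameters are illustrative):

    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.metrics import r2_score
    
    # X, y are assumed: the enumerated/normalized features and sale prices
    x_train, x_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    
    model = KNeighborsRegressor(n_neighbors=5)  # "fiddle" with n_neighbors
    model.fit(x_train, y_train)
    
    print(r2_score(y_test, model.predict(x_test)))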

    Snippets:

    Information Gain on Parameters (VarianceThreshold Not Pictured): Information Gain

    The r2 Values: R-Squared Values

    The Final Model: Final Model

    See the full project on Github

    PandoraPvP

    About: PandoraPvP is a collection of Spigot Minecraft plugins for a designated modded Minecraft public server. The plugins range from economy to utility to moderation. My personal work was on the development side, where I worked on server optimizations regarding instant block placement, world border creation, staff command logs, and moderator utility commands. Although my time on the project only spanned a couple of months, I enjoyed the time I worked on it.

    What is Spigot?: Minecraft is a video game made in Java (for non-console systems). Spigot is one of many server-hosting options for Minecraft, like Bukkit, Paper, Yatopia, and more. Spigot allows programmers to create plugins that are loaded with the server as enhancements. Using Spigot’s API, more optimizations and customizability are available to developers.

    This project helped me learn how to organize many tasks into smaller projects, and taught me how to work with more complicated programming systems, like APIs, documentation, and Maven.

    See the full project on Github

    Discord Bot Applications

    About: Discord is a messaging platform with many quality-of-life features. Any developer can create a Discord bot to add enhancements to the application, which can be fun things (like games) or more useful things (moderation abilities). Starting May 2020, I worked on a Discord bot called Source Code. Later, I worked on a larger bot called Psyduck. Both bots were for personal use only and never became publicly verified Discord bots.

    Through this project I learned a lot about Python development and asynchronous programming. Additionally, I learned about cloud hosting environments like Heroku and Google Cloud VMs to host the bots 24/7. I also worked a lot with JSON and SQLite to store important data. This project helped fuel my love for computer science and taught me how to handle a large-scale project.

    Screenshots:
    Uptime Bug Ping Status

    See an obscured version of one of my bots (Psyduck) on Github
