Projects



    CURE: Crude Traceback of IC4665 Cluster

    CURE: Class-based Undergraduate Research Experience.

    Throughout this project, we investigated the traceback of the IC4665 star cluster to characterize what the cluster looked like millions of years ago. This was done primarily by leveraging Gaia SQL queries on proper motion and “rewinding” those proper motions to find the right ascension and declination of where each star was previously located. We also made use of Simbad data in the traceback analysis. Our initial prediction was that the cluster members would be closer together in the past than they are today. We do not see precisely this: the cluster tends to stay together over time, but some members spread out slightly in the past. This can be understood from the constraints on our traceback calculations and from the fact that we do not consider any gravitational forces. We also made an animation of the cluster traceback using prior knowledge and open-source code from Drs. Furnstahl and Brandenburg for Physics 5300.

    Description

    As a part of the Astronomy 3350 course (Methods of Astronomical Observation and Data Analysis), our last computational essay involved doing an actual research project (of our choosing) on the IC4665 cluster. Our group chose the “Traceback” project, which involved identifying cluster members by tracing the motion of the cluster back in time using Right Ascension and Declination values. Since this project was intended to be a small undergraduate research experience, the results were unknown in advance.

    Theory

    The traceback of the cluster members was done very crudely. We used public Simbad data to bound the Gaia data to the rough area of the cluster using parallax, R.A., and Dec. We used an SQL query of the following form:

    top_n = 10000  # how many rows we want to fetch from the archive
    
    # The parallax/ra/dec bounds come from the Simbad lookup above
    query = """SELECT TOP %d
    source_id, ra, ra_error, dec, dec_error, parallax, parallax_error, pmra, pmra_error, pmdec, pmdec_error
    FROM gaiadr3.gaia_source
    WHERE parallax >= %.2f AND parallax <= %.2f AND ra >= %.2f
    AND ra <= %.2f AND dec >= %.2f AND dec <= %.2f
    """ % (top_n, parallax_min, parallax_max, ra_min, ra_max, dec_min, dec_max)
    print(query)  # let's see what the query looks like
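
    The query can then be submitted to the Gaia archive. A minimal sketch of one way to do that, assuming the astroquery package (the notebook's exact submission code may differ):

    from astroquery.gaia import Gaia
    
    # Submit the query to the Gaia TAP service and pull the results
    # into a Pandas DataFrame for the traceback analysis.
    job = Gaia.launch_job(query)
    df = job.get_results().to_pandas()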
    

    Using Pandas, we manipulated the data and defined how we were going to do the traceback. If \(q\) is the given R.A. or Dec. in degrees, \(\mu_q\) is the proper motion in R.A. or Dec. in mas/yr, and \(T\) is the time period in years, we can trace that coordinate value (R.A. or Dec. respectively) back via the formula

    \[{\rm Traced\ Result} = \frac{(3.6 \times10^6\ {\rm mas}/1^{\circ})\,q - \mu_q T}{3.6\times10^6\ {\rm mas}/1^{\circ}}\]

    We created a Python function for this, into which we can insert \(n\) million years to trace the result back. We then used Pandas Series analysis to create new columns of the traced motion, and simultaneously traced the location of the cluster center using Simbad data and the same traceback function. Based on Wikipedia, IC4665 is expected to be roughly 55 million years old, so we recognize that we shouldn’t trace back past that amount, and that 1-million-year timesteps are adequate. After tracing back 10 million years, we used Pandas analysis to identify objects close to the cluster center and label those as cluster members. This allowed us to create the animations in the next section, separating cluster members from non-members.
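
    A minimal sketch of what that function and the new columns can look like (the new column names here are illustrative, not necessarily those used in the notebook):

    MAS_PER_DEG = 3.6e6  # milliarcseconds per degree
    
    def traceback(q, mu_q, t_myr):
        """Trace a coordinate back in time.
    
        q     : R.A. or Dec. in degrees
        mu_q  : proper motion in that coordinate, in mas/yr
        t_myr : time to rewind, in millions of years
        """
        t_years = t_myr * 1e6
        return (MAS_PER_DEG * q - mu_q * t_years) / MAS_PER_DEG
    
    # New columns with positions traced back 10 Myr (illustrative names)
    df["ra_traced"] = traceback(df["ra"], df["pmra"], 10)
    df["dec_traced"] = traceback(df["dec"], df["pmdec"], 10)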

    Animations

    We made animations of the system using code adapted from Physics 5300 (Theoretical Mechanics). This lets us create two styles of animation of the cluster: one keeps the axes fixed, and one dynamically updates the axes to follow the cluster. The cyan points are cluster members, while the rest are background objects. This is the animation with the fixed axes:

    The following is the dynamic axes animation:
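
    For reference, a minimal sketch of how such an animation can be assembled with Matplotlib (this is not the Physics 5300 code; the per-timestep traced positions are assumed precomputed, and the axis bounds are placeholders):

    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation
    
    # traced[t] is assumed to hold the (ra, dec) arrays for timestep t (Myr)
    fig, ax = plt.subplots()
    scat = ax.scatter([], [], s=5)
    ax.set_xlim(260, 280)  # fixed-axes style: set the view once (placeholder bounds)
    ax.set_ylim(0, 12)
    
    def update(t):
        ra, dec = traced[t]
        scat.set_offsets(list(zip(ra, dec)))
        ax.set_title(f"{t} Myr ago")
        # Dynamic-axes style: instead re-center the view each frame, e.g.
        # ax.set_xlim(ra.mean() - 2, ra.mean() + 2)
        return (scat,)
    
    anim = FuncAnimation(fig, update, frames=range(11), interval=300)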

    Limitations

    Looking at the above animations, we can do some more careful analysis. We see that the cluster, in general, moves together throughout time, which is what we expect. From the first animation, we can see that the cluster is somewhat more compact in the past, though not as compact as we would generally expect. In the second animation especially, some cluster members appear to get much further apart as you go back in time. This, however, is due to the crude way we traced the cluster members back, and to the fact that we consider no gravitational interactions, which would pull the objects together. Also, due to a lack of Gaia data, we did not use measurements like radial velocity, as most Gaia sources did not have this value readily available. Despite that, seeing that the cluster generally moves together is a good result for this loose analysis.

    Data Notice

    Because this cluster and its data and properties are still actively being investigated, this project is not publicly hosted on Github or other hosting platforms. However, the project .ipynb file submitted for the course may be distributed upon reasonable request.

    Kapitza Pendulum Analysis

    The following project involves the numerical analysis and Manim animation of a Kapitza pendulum system as part of a theoretical mechanics course final project. Specifically, we use Manim animations and Hopf bifurcation analysis to inspect the stability of the inverted position of the pendulum in the high-driving, low-amplitude regime. This is done with a mix of analytical and numerical methods.

    Implementations

    Reading the Project notebook file provides a comprehensive step-by-step breakdown of the project. Inspecting the various .py files shows the different numerical solving methods used to solve this system.

    The project does the following:

    • Derives the Lagrangian, equations of motion, and effective potential for the Kapitza pendulum system, while inspecting the high-driving, low-amplitude regime.
    • Implements numerical methods to solve the system and checks that turning off the driving components recovers the simple pendulum (see the sketch after this list).
    • Provides dynamic widget analysis of the system with varying parameters.
    • Compares \(U(\phi,t)\) and \(U_{\rm eff}(\phi)\) in the high-driving, low-amplitude regime, with a separate PDF derivation.
    • Renders Manim animations of the system, with a normalized state-space axis as well.
    • Performs chaos and bifurcation analysis: uses the Liapunov exponent to judge chaos; renders Manim animations of two pendula for chaos, with an animated Liapunov exponent graph; carries out a discrete Hopf bifurcation analysis with checks on stability vs. instability of the vertical position using the bifurcation plot; and further analyzes the low-amplitude regime, showing periodic motion but no chaos.
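
    A minimal sketch of such a numerical solve, assuming SciPy and the standard Kapitza equation of motion \(\ddot\phi = -\frac{1}{L}\left(g - a\omega^2\cos\omega t\right)\sin\phi\), with \(\phi\) measured from the downward vertical (the project's .py files may organize this differently):

    import numpy as np
    from scipy.integrate import solve_ivp
    
    g, L = 9.81, 1.0        # gravity, pendulum length
    a, omega = 0.05, 100.0  # drive amplitude and frequency (a*omega^2 >> g)
    
    def rhs(t, y):
        phi, phidot = y
        # Kapitza pendulum: the driven pivot modifies the effective gravity
        phiddot = -((g - a * omega**2 * np.cos(omega * t)) / L) * np.sin(phi)
        return [phidot, phiddot]
    
    # Start near the inverted position (phi = pi) and integrate
    sol = solve_ivp(rhs, (0.0, 20.0), [np.pi - 0.1, 0.0],
                    max_step=1e-3, dense_output=True)
    
    # Sanity check: setting a = 0 reduces this to the simple pendulum

    With these (illustrative) parameters the standard criterion \(a^2\omega^2 > 2gL\) for stability of the inverted position is satisfied, so the solution should oscillate around \(\phi = \pi\) rather than fall over.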

    Included in the Github repository:

    • All project files: the project Jupyter notebook, the PDF derivation file, any .py numerical files, and the outputted images.
    • The Conda environment used for the project, built for Apple-silicon (M-series) macOS.
    • The presentation file given as a part of this project.

    Because the Manim animations are large, they are hosted in the releases, in the videos.zip file attached to each release description. One example is shown below; to get it to display, it was converted to a gif, which greatly reduces its overall quality. It is recommended to download the zip for higher-fidelity animations. The zip file is here.

    Here is the (low quality) gif:
    Example Animation as Low-Quality Gif

    See the full project on Github

    AI Docking Port Locator and Distance Regressor for the ISS

    Goals: Create a project for HackAI 2025 at OSU regarding AI docking port location and distance regression for the International Space Station.

    Results: Utilizing the data, a multi-headed network (MHN) was built as the top layer on the MobileNetV3Small architecture. The MHN had three heads, each regressing one of three values: the distance to the ISS, the x-coordinate of the docking port, or the y-coordinate of the docking port. After training, the best model was saved and used to create a three-dimensional visualization showing the line the SpaceX Dragon capsule would need to take to dock with the ISS. An animation was made of this visualization, which utilized public-access STL files of the ISS and Dragon capsule.

    This project won 3rd place in the competition.

    Motivation

    We wanted to create an open-source docking system that can be generalized to different spacecraft. This would aid in the process of docking capsules to space stations and would allow astronauts to direct their attention to aspects other than docking. This project also shows that docking can be generalized: with the right data, similar tasks would be possible (e.g., automated parking, ship docking, airplane-to-gate travel).


    Data

    The data is a set of 10,000 images of the ISS labeled with distance values and the location of the docking port (albeit in a less-usable format). The data came from Kaggle and was originally part of an AICrowd challenge.


    The Model

    After tuning the model, we arrived at the following MHN top layer on the MobileNetV3Small architecture.

    Model Top Layer
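
    A rough sketch of what such a three-headed top and its weighted-loss compile step can look like in Keras (layer sizes, head names, and weights here are illustrative, not the tuned values):

    import tensorflow as tf
    
    base = tf.keras.applications.MobileNetV3Small(
        input_shape=(224, 224, 3), include_top=False, pooling="avg")
    
    x = tf.keras.layers.Dense(128, activation="relu")(base.output)
    # One regression head per target (illustrative names)
    distance = tf.keras.layers.Dense(1, name="distance")(x)
    port_x = tf.keras.layers.Dense(1, name="port_x")(x)
    port_y = tf.keras.layers.Dense(1, name="port_y")(x)
    
    model = tf.keras.Model(base.input, [distance, port_x, port_y])
    model.compile(
        optimizer="adam",
        loss={"distance": "mae", "port_x": "mae", "port_y": "mae"},
        # total loss = weighted sum of the three MAEs (weights tuned by hand)
        loss_weights={"distance": 1.0, "port_x": 1.0, "port_y": 1.0})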

    We trained the model for 50 epochs, implementing early-stopping and model-checkpoint callbacks to stop the training early if the validation loss stopped decreasing (i.e., the model started to overfit) and to save only the best model based on validation loss. Since this was an MHN, we had three loss functions to minimize, one for each head. We chose MAE for all three heads because we are working with distances. The total loss of the model was a weighted sum of these three individual losses, with the weights tuned through trial and error. The result was the following loss graph:

    Loss Graph


    Visualization

    Training yielded the best model to use for visualization and prediction. Utilizing public-access STL files of the ISS and SpaceX Dragon capsule, we created a 3D visualization in Python of the capsule, the ISS, and the line the capsule needs to take to dock, using the predictions from the model. From this visualization, we made an animation showcasing the results of the model. The animation is shown below.

    Visualization Animation

    See the full project on Github

    Book Recommendation System Using Goodreads Dataset

    Goals: Create an AI book recommendation system based on the Goodreads raw review dataset, including spoiler detection.

    Results: See below. Project work is still ongoing.

    Motivation

    This project was part of Physics 5680 (Big Data Analytics for Physics). The project was to create an AI book recommendation system based on the Goodreads dataset; this was a pre-defined project choice out of many offered in the course. Below is a pipeline of the project.

    Pipeline

    Report

    A report was required as part of the project; it outlines all aspects of the work, including the training of BERT-Tiny, the confusion matrix, the pipeline, the collaborative filtering, and the results. It is attached below.
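
    As one illustration of the collaborative-filtering piece, here is a minimal item-based sketch using cosine similarity over a user-by-book rating matrix (a generic sketch under assumed column names, not the project's exact implementation):

    import pandas as pd
    from sklearn.metrics.pairwise import cosine_similarity
    
    # ratings is assumed to have columns: user_id, book_id, rating
    matrix = ratings.pivot_table(index="user_id", columns="book_id",
                                 values="rating").fillna(0)
    
    # Similarity between books, based on which users rated them alike
    sim = pd.DataFrame(cosine_similarity(matrix.T),
                       index=matrix.columns, columns=matrix.columns)
    
    def recommend(book_id, n=5):
        """Return the n books most similar to the given one."""
        return sim[book_id].drop(book_id).nlargest(n)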

    Project IPYNB

    Below is the PDF version of the project's Jupyter notebook, attached for viewing.

    See the full project on Github

    Dementia Classification AI

    Goals: Revisit the project from OSU Hack AI 2024 and make the model not overfit.

    Results: Stacked the 61 images of each MRI scan into 3D tensors, built a TensorFlow functional model, and trained it using GPU acceleration with an NVIDIA RTX 3060 for a test accuracy of 97.31%.

    Motivation

    We wanted to create a classification model that detects whether a patient has dementia based on a set of 61 layers of an MRI scan. If successful, this model could be a support tool for doctors. It may not be the definitive way to diagnose dementia, but it could help doctors see what the model thinks the patient has, and with what confidence.


    Data

    The data comes from the OASIS Alzheimer’s dataset, a public dataset consisting of 80,000 MRI images. Since each patient’s scan comprises 61 MRI images, the dataset covers over 1,300 patients. The data was downloaded from Kaggle, but can also be accessed through the OASIS website.


    The HackAI 2024 Model

    As a part of the OSU AI Club Hack AI 2024, this project was selected by the team to attempt in the 24-hour hackathon. We downsized the OASIS images and converted them to grayscale to decrease training time and complexity. We stacked each patient’s 61 MRI images into a 3D tensor as a NumPy ndarray and saved each patient’s 3D MRI scan as an .npz file. We then amplified some of the moderate-dementia and mild-dementia samples through duplication, as there weren’t many patients with those classifications. We then created a TensorFlow functional model to train on the sample data.

    Although we had good training accuracy, we had relatively low testing accuracy, indicating overfitting. Unfortunately, without access to a GPU at the hackathon, we could not GPU-accelerate training, and we did not have enough time to test a simpler model. The training accuracy was in the high 90s, but the testing accuracy hovered in the high 70s.

    Although this repository is being integrated with the Hack AI 2024 repository (meaning the files here will now also be there), the older version of the model can still be found by looking back in the commits here.


    This Model & Results

    We revisited this model to try to create a better one. We continued to use the stacked 3D tensors; the creation file for them is npz_generation.py.
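
    A minimal sketch of that stacking step (a simplified version of what npz_generation.py does; the file paths and names here are illustrative):

    import numpy as np
    from PIL import Image
    
    def build_patient_tensor(paths):
        """Stack one patient's 61 grayscale slices into a 3D tensor."""
        slices = [np.asarray(Image.open(p).convert("L")) for p in paths]
        return np.stack(slices, axis=0)  # shape: (61, height, width)
    
    # paths_for_patient is an assumed list of 61 image paths for one patient
    tensor = build_patient_tensor(paths_for_patient)
    np.savez_compressed("patient_0001.npz", scan=tensor)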

    We created a simpler TensorFlow functional model and GPU-accelerated training with an NVIDIA RTX 3060 to decrease training time. We added ModelCheckpoint and EarlyStopping training callbacks to save the best possible model. We got a training accuracy of 100% (not truly 100%; it is rounded) with a testing accuracy of 97.31%, indicating that our model generalizes much better this time around.
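
    A rough sketch of a functional model over such 3D inputs with those callbacks (the layer choices and slice size here are illustrative assumptions, not the project's exact architecture):

    import tensorflow as tf
    
    inputs = tf.keras.Input(shape=(61, 128, 128, 1))  # assumed slice size
    x = tf.keras.layers.Conv3D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPool3D(2)(x)
    x = tf.keras.layers.Flatten()(x)
    outputs = tf.keras.layers.Dense(4, activation="softmax")(x)  # 4 classes
    
    model = tf.keras.Model(inputs, outputs)
    model.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
    
    callbacks = [
        tf.keras.callbacks.ModelCheckpoint("best.keras", save_best_only=True),
        tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True),
    ]
    # model.fit(train_ds, validation_data=val_ds, epochs=..., callbacks=callbacks)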

    Loss and accuracy graphs:

    Accuracy Graph Loss Graph

    Confusion matrix:

    Confusion Matrix

    See the full project on Github

    Starr Programming Language

    Goals: Create a custom programming language written in C++, Flex, and Bison, using LLVM and CMake.

    Results: Work on this project is ongoing, but the preliminary basic phase has been completed.

    Example Syntax:

    int one(int a) {
      int x = a * 5
      return x + 3
    }
    
    int two() {
      return 5 % 3
    }
    
    out(one(12)) // Outputs 63
    out(two()) // Outputs 2
    

    See the current project on Github

    Dot on Wheel Animation

    Goals: Use Mathematica to create an animation of a dot on a wheel that is rolling at a constant angular velocity.

    Problem:

    A wheel with radius R has a constant angular velocity in the -z direction (so the wheel rolls in +x). There is a dot on the wheel a distance p from the center, where p>0. The standard case has R>p, but we also explore p>R and even p=R. We wish to animate the motion of this dot.
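
    For reference, the dot traces a trochoid. With rolling angle \(\phi = \omega t\) and the wheel center at height R, the dot's position is (up to a choice of starting phase):

    \[x(t) = R\omega t - p\sin(\omega t), \qquad y(t) = R - p\cos(\omega t)\]

    For p<R this is a curtate cycloid, for p=R the ordinary cycloid, and for p>R a prolate cycloid whose path loops back on itself.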

    Result:

    We go through many different results and iterations of the animation. The entire process is on the Github tied to the project. Below are the final results for R=2 and angular velocity -2, for p=1, p=2, and p=4.

    p=1: Result Gif

    p=2: Result Gif 2

    p=4: Result Gif 3

    Note that the original project used a .mov file; the resulting .gif is of lower quality than the original video.

    See the full project on Github

    Graphene Unsupervised AI Clustering

    Goals: Create a custom graphene layer identification model through AI.

    Results: Incomplete, but preliminary results are promising. Work on the project has currently stalled.

    Example Output:

    Graphene Clustered

    See the current project on Github

    There was a poster presentation associated with this project. Out of discretion for the others involved, it has not been included on this website.

    Chocolate - Vanilla Minecraft Extension

    Goals: Use FabricAPI and Gradle to implement custom behavior in the popular game Minecraft. Work with multiple collaborators throughout the project.

    Implementations:

    This project implements a lot of custom behavior; to see it all, check the wiki of the repository.

    Addition Example:

    Quartz Spike

    See the full project on Github

    House Price AI, Revisited

    Goals: Revisit the Scikit-Learn house price prediction AI and redo it with TensorFlow, excluding location data.

    Results: A model with an r2 of 0.9946 that is better generalized than the previous one.

    Process:

    Use Pandas to read a csv into a DataFrame. Drop all date and location information to ensure we are only working with numerical data for regression. Deal with null values accordingly, and then split the data into training and testing sets. Define a model that normalizes the data and feeds it through four 512-unit ReLU Dense layers with L2 regularization, finishing with a single-unit Dense layer. Train the model with an Adam optimizer at a 0.001 learning rate, using MAE loss, and compare the predictions to the true values.
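
    A minimal sketch of that model definition in Keras, following the description above (the exact notebook code may differ):

    import tensorflow as tf
    from tensorflow.keras import layers, regularizers
    
    # x_train is assumed to be the numeric feature matrix described above
    norm = layers.Normalization()
    norm.adapt(x_train)
    
    model = tf.keras.Sequential([norm])
    for _ in range(4):  # four 512-unit ReLU layers with L2 regularization
        model.add(layers.Dense(512, activation="relu",
                               kernel_regularizer=regularizers.l2()))
    model.add(layers.Dense(1))  # single-unit output: the predicted price
    
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="mae")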

    Project File:

    See the full project on Github

    Hemoglobin Binding Project

    Goals: Use Jupyter Notebook, LaTeX, and a Matlab kernel to do data fitting on experimental hemoglobin binding data. Curve-fit different models to the data, including the "Non-Cooperative", Pauling, and Adair models. Run a Monte Carlo simulation on the Adair model, but use brute force for the "Non-Cooperative" and Pauling models. Learn how to derive the models using the grand partition function. Create a poster and present it in LaTeX. Part of the Polaris Program at Ohio State.
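
    As one representative step of the grand-partition-function derivations, the "Non-Cooperative" model treats hemoglobin's four binding sites as independent and identical. With an effective binding constant K and oxygen partial pressure p, this gives the standard saturation curve (stated here for illustration):

    \[\Xi = (1 + Kp)^4, \qquad Y = \frac{p}{4}\,\frac{\partial \ln \Xi}{\partial p} = \frac{Kp}{1 + Kp}\]

    The cooperative models (Pauling, Adair) modify this by letting the binding constant depend on how many sites are already occupied.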

    Results: We see experimentally that hemoglobin oxygen binding is cooperative to some extent.

    Project File:

    Derivations:

    Poster (Obscured):

    See the full project on Github

    PG Stock Price Prediction RNN w/Tensorflow

    Goals: Learn about neural networks and RNNs. Get comfortable with TensorFlow. Predict Procter & Gamble stock (closing) price.

    Results: A model whose predictions are pretty close to the true stock prices over the predicted range (the shape matches well; exact values are not attained).

    Process:

    Collect csv data into a Pandas DataFrame. Enumerate the data to make it all numeric, then visualize the data to see what we’re working with. We then reshape and compile the closing price data, and do the same to the high and low price values. We concatenate these using NumPy into one array and scale the data. We split the data into training and testing sets, then use TensorFlow and Keras to generate a Sequential model with an LSTM and a Dense layer. After compiling and training it on the training set, we predict the closing price of the test set and graph the results to inspect the accuracy.
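
    A minimal sketch of such a Sequential LSTM model (the window length and unit counts here are illustrative assumptions):

    import tensorflow as tf
    
    window, n_features = 30, 3  # assumed: 30-day windows of close/high/low
    
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(64, input_shape=(window, n_features)),
        tf.keras.layers.Dense(1),  # next-day scaled closing price
    ])
    model.compile(optimizer="adam", loss="mse")
    # model.fit(x_train, y_train, epochs=..., validation_split=0.1)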

    Snippets:

    PG Stock Price Graph: PG Price

    The RNN Summary: PG Price

    The Prediction: PG Price

    The Prediction w/ More Data: PG Price

    Full Scope of Prediction (Vertical Line Where Prediction Starts): PG Price

    See the full project on Github

    Scikit-Learn House Price AI

    Goals: Get accustomed to Jupyter Notebooks, Scikit-Learn, and simple regression AI modeling. Learn concepts such as normalization, imputation, enumeration, the foundations of CRISP-DM, and the basics of AI modeling.

    Results: Model with an r2 of 0.9999968, but it does not generalize to arbitrary houses because location data is factored into the model.

    Process:

    Use Pandas to read a csv into a DataFrame. Enumerate the data to get a frame with only numbers. Check for unusable data and use imputation, if needed, to fill it in. After inspecting graphs of the data, normalize the data and filter it accordingly. Split the data into training and test sets, then fit a KNeighborsRegressor model to the training data. Then we predict on the test set and measure the error. Finally, we fiddle with the model a bit to find the most accurate one, and then we’re done.
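
    A minimal sketch of that pipeline with Scikit-Learn (the column handling and hyperparameters are illustrative):

    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.metrics import r2_score
    
    # X, y are assumed: the enumerated/normalized features and sale prices
    x_train, x_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    
    model = KNeighborsRegressor(n_neighbors=5)  # "fiddle" with n_neighbors
    model.fit(x_train, y_train)
    
    print(r2_score(y_test, model.predict(x_test)))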

    Snippets:

    Information Gain on Parameters (VarianceThreshold Not Pictured): Information Gain

    The r2 Values: R-Squared Values

    The Final Model: Final Model

    See the full project on Github

    PandoraPvP

    About: PandoraPvP is a collection of Spigot Minecraft plugins for a designated modded Minecraft public server. The plugins range from economy to utility to moderation. My personal work was on the development side, where I worked on server optimizations regarding instant block placement, world border creation, staff command logs, and moderator utility commands. Although my time on the project only spanned a couple of months, I enjoyed the time I worked on it.

    What is Spigot?: Minecraft is a video game made in Java (for non-console systems). Spigot is one of many server-hosting options for Minecraft, like Bukkit, Paper, Yatopia, and more. Spigot allows programmers to create plugins that are loaded with the server as enhancements. Using Spigot’s API, more optimizations and customizability are available to developers.

    This project helped me learn how to organize many tasks into smaller projects, and taught me how to work with more complicated programming systems, like APIs, documentation, and Maven.

    See the full project on Github

    Discord Bot Applications

    About: Discord is a messaging platform with many quality-of-life features. Any developer can create a Discord bot to add enhancements to the application, which can be fun things (like games) or more useful things (moderation abilities). Starting May 2020, I worked on a Discord bot called Source Code. Later, I worked on a larger bot called Psyduck. Both bots were for personal use only and never became publicly verified Discord bots.

    Through this project I learned a lot about Python development and asynchronous programming. Additionally, I learned about cloud hosting environments like Heroku and Google Cloud VMs to host the bots 24/7. I also worked a lot with JSON and SQLite to store important data. This project helped fuel my love for computer science and taught me how to handle a large-scale project.

    Screenshots:
    Uptime Bug Ping Status

    See an obscured version of one of my bots (Psyduck) on Github
