A Comparison of tf.data.Dataset with Manual Image Loading

Loading data for AI models can be both RAM-intensive and time-intensive. For a larger project (HackAI 2025) with a big dataset, I ran an optimization test comparing a manual loader against a TensorFlow pipeline to see the differences in load time and RAM usage. As expected, TensorFlow’s pipeline is much more efficient.
While working on our HackAI 2025 project, we found that our original way of loading the image data for our multi-headed regression model was incredibly slow and was crashing some of our computers due to RAM usage. Eventually, we converted our loader to the more optimized tf.data.Dataset pipeline. For our own entertainment, we made plots comparing the old method with the new method.
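
To make the difference between the two approaches concrete, here is a minimal sketch, not our actual project code. It assumes PNG files on disk and uses hypothetical names (`IMG_SIZE`, `BATCH_SIZE`, `write_dummy_images`, `decode_image`, `load_manually`, `make_dataset`) plus random dummy images so that it runs without a real dataset. The regression targets are left out for brevity. The manual loader decodes every image up front and keeps them all in one array, while the tf.data.Dataset version decodes images on demand, one batch at a time.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

IMG_SIZE = (32, 32)  # hypothetical target size, not the project's real value
BATCH_SIZE = 4       # hypothetical batch size


def write_dummy_images(directory, count=8):
    """Write small random PNGs so the sketch runs without a real dataset."""
    rng = np.random.default_rng(0)
    paths = []
    for i in range(count):
        pixels = rng.integers(0, 256, size=(40, 40, 3), dtype=np.uint8)
        path = os.path.join(directory, f"img_{i}.png")
        tf.io.write_file(path, tf.io.encode_png(pixels))
        paths.append(path)
    return paths


def decode_image(path):
    """Read one file, decode it, resize it, and scale pixels to [0, 1]."""
    raw = tf.io.read_file(path)
    image = tf.io.decode_png(raw, channels=3)
    image = tf.image.resize(image, IMG_SIZE)
    return image / 255.0


def load_manually(paths):
    """Eager approach: every decoded image sits in RAM at the same time."""
    return np.stack([decode_image(p).numpy() for p in paths])


def make_dataset(paths):
    """Streaming approach: images are decoded batch by batch, on demand."""
    ds = tf.data.Dataset.from_tensor_slices(paths)
    ds = ds.map(decode_image, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)


tmp_dir = tempfile.mkdtemp()
image_paths = write_dummy_images(tmp_dir)

manual_array = load_manually(image_paths)  # whole dataset in memory
dataset = make_dataset(image_paths)        # nothing decoded yet
first_batch = next(iter(dataset))          # only one batch materialized
```

With the manual loader, memory grows with the number of images. With the dataset, it grows roughly with the batch size plus whatever `prefetch` buffers, which is likely why the switch stopped the crashes we were seeing.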