This project showcases an OCR (Optical Character Recognition) solution that uses multimodal LLMs (GPT-4 with vision or Claude 3). It is designed to process images, specifically focusing ...
A simple, from-scratch OCR project that recognizes digits in images from the MNIST dataset, a collection of 28x28 grayscale images of white handwritten digits centered on a black background. We're using ...
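
As a rough illustration of the MNIST image layout described above, the sketch below builds one synthetic 28x28 image by hand (black background, white stroke) and flattens it into a 784-value vector. The helper names (`blank_image`, `draw_vertical_stroke`, `to_feature_vector`) are hypothetical and are not taken from this project. Scaling pixels to [0, 1] is a common preprocessing step, assumed here for illustration and not necessarily what the project does.

```python
# Illustrative only: a hand-built 28x28 grayscale image in the MNIST layout
# (0 = black background, 255 = white stroke). Nothing is loaded from the
# real dataset, and all helper names below are hypothetical.
SIZE = 28


def blank_image():
    """Return a 28x28 grid of black pixels."""
    return [[0] * SIZE for _ in range(SIZE)]


def draw_vertical_stroke(image, col, top, bottom):
    """Paint a white vertical line, a crude stand-in for the digit '1'."""
    for row in range(top, bottom):
        image[row][col] = 255
    return image


def to_feature_vector(image):
    """Flatten row by row and scale pixel values to the range [0.0, 1.0]."""
    return [pixel / 255 for row in image for pixel in row]


image = draw_vertical_stroke(blank_image(), col=14, top=4, bottom=24)
features = to_feature_vector(image)
print(len(features))  # 784 values, one per pixel
print(sum(features))  # 20.0, because exactly 20 pixels are white
```

A flat vector like `features` is the kind of input a simple classifier can consume directly, which is one reason the small, centered MNIST images are convenient for from-scratch experiments.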