AI Research

Urdu Character GAN

Generative Model for Handwritten Urdu Script

Generative AI, NLP, Urdu Language Tech
A GAN trained to synthesize realistic handwritten Urdu characters, addressing a genuine gap in low-resource language tooling.

Urdu is written by over 70 million people, yet it is deeply underserved in machine learning tooling. Scarce training data is part of the reason: high-quality handwritten Urdu script samples are hard to come by, which makes OCR systems and educational tools difficult to develop.

This project trained a Generative Adversarial Network on handwritten Urdu characters to synthesize new, realistic samples. The model runs behind a FastAPI backend with a React interface for interactive generation. The output is usable for OCR training datasets, language learning applications, and digital preservation efforts.
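As a rough illustration of the adversarial training described above, here is a minimal PyTorch sketch of one GAN training step. It is not the project's actual model: the layer sizes, the 28x28 grayscale image size, the latent dimension, the optimizer settings, and all names (`Generator`, `Discriminator`, `train_step`) are assumptions made for the example, and the "real" batch is random placeholder data standing in for scanned Urdu characters.

```python
import torch
from torch import nn

LATENT_DIM = 64  # assumed size of the input noise vector
IMG_SIZE = 28    # assumed side length of a grayscale character image


class Generator(nn.Module):
    """Maps a noise vector to a 1 x IMG_SIZE x IMG_SIZE image in [-1, 1]."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, IMG_SIZE * IMG_SIZE),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z).view(-1, 1, IMG_SIZE, IMG_SIZE)


class Discriminator(nn.Module):
    """Scores an image with a raw logit: high means 'looks real'."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(IMG_SIZE * IMG_SIZE, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x):
        return self.net(x)


def train_step(gen, disc, opt_g, opt_d, real_batch):
    """Run one discriminator update followed by one generator update."""
    loss_fn = nn.BCEWithLogitsLoss()
    batch = real_batch.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Discriminator: push real images toward 1 and generated images toward 0.
    fake = gen(torch.randn(batch, LATENT_DIM))
    d_loss = loss_fn(disc(real_batch), real_labels) + loss_fn(
        disc(fake.detach()), fake_labels  # detach: no generator gradients here
    )
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator label its output as real.
    g_loss = loss_fn(disc(fake), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()


gen, disc = Generator(), Discriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4, betas=(0.5, 0.999))

# Placeholder batch: random pixels in [-1, 1], NOT real handwriting data.
real_batch = torch.rand(16, 1, IMG_SIZE, IMG_SIZE) * 2 - 1
d_loss, g_loss = train_step(gen, disc, opt_g, opt_d, real_batch)

# After training, new characters are sampled by feeding fresh noise.
with torch.no_grad():
    samples = gen(torch.randn(4, LATENT_DIM))
```

In a real run, `train_step` would be called over many batches of scanned character images, and a convolutional architecture would likely replace the fully connected layers used here for brevity. A backend endpoint could then call the trained generator with fresh noise to serve samples on demand.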

It started as a research apprenticeship project at Folio3 AI. It ended as a working generative system for a genuinely underrepresented language.

Technology Stack

PyTorch, FastAPI, React, Firebase, Docker

Domain

Generative AI, NLP, Urdu Language Tech

Project type

AI Research
