Building with AI

How I Built a Children's Book Generator in 3 Hours

A hackathon project about bringing the stories I grew up with to a new generation—and the product decisions that shaped it.

Vignesh
December 2024
5 min read

The Origin

Stories Lost in Translation

I grew up in India reading and listening to stories from the Ramayana and Mahabharata, epic tales my mother told me. These stories subconsciously shaped how I think about life and the complexity of good and evil.

A few weeks ago, I was at a friend's place watching their kids—ages 4 and 6—completely absorbed in picture books. But everything they read was Americanized: familiar characters, familiar settings. Nothing wrong with that, but I wondered: where are the stories I grew up with?

The existing options for Indian epics were either too academic for kids, poorly illustrated, or buried in religious framing that obscured the adventure. I wanted something that captured the wonder (Hanuman leaping across the ocean, Arjuna's impossible choice on the battlefield) in a way a 5-year-old could feel. I used to write a lot, and as a storyteller I have always looked for new and engaging ways to bring stories to more people.

“What if AI could transform any classic text into an illustrated children's book in minutes, not months?”

The Problem

More Than Just “Generate Images”

The naive approach would be: take a story, split it into pages, generate an image for each. Done. But that creates incoherent, inconsistent books that feel hollow.

Real picture books have visual continuity. Characters look the same across pages. The art style is consistent. The pacing builds tension. These aren't accidents—they're craft.

🎨

Style Consistency

How do you ensure Rama looks the same on page 1 and page 15?

📖

Narrative Adaptation

Ancient epics aren't written for 5-year-olds. How do you adapt without losing essence?

Generation Speed

A dozen images per chapter. Users won't wait 10 minutes. How do you parallelize?
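
One common answer to the parallelization question is bounded concurrency: start every image request at once, but cap how many are in flight. The sketch below is only an illustration under assumptions. `fake_generate_image`, `generate_all`, and the limit of 4 are hypothetical names and values, not the project's code and not part of any Gemini SDK.

```python
import asyncio


async def fake_generate_image(prompt: str) -> str:
    """Hypothetical stand-in for a real image API call; returns a fake result."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"image-for:{prompt}"


async def generate_all(prompts: list[str], max_concurrent: int = 4) -> list[str]:
    """Generate all page images concurrently, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(prompt: str) -> str:
        # The semaphore keeps the number of in-flight requests under the cap.
        async with semaphore:
            return await fake_generate_image(prompt)

    # gather returns results in input order, so they line up with page numbers.
    return await asyncio.gather(*(bounded(p) for p in prompts))


pages = [f"page {i}" for i in range(1, 13)]  # "a dozen images per chapter"
images = asyncio.run(generate_all(pages))
```

Because `asyncio.gather` preserves input order, each result can be matched to its page by position. The cap is there on the assumption that the image API is rate limited, which is typical.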

Key Decisions

The Choices That Shaped the Product

01

Single Model for Text and Images

I use Gemini for both story adaptation and image generation. Why? The same model that writes the scene description generates the image. It understands its own intent. This produces better text-image alignment than chaining separate models.

Source Text → Story Adaptation (Gemini) → Image Prompts → Parallel Generation (Gemini)

02

Character Anchoring for Visual Consistency

The adaptation phase generates detailed character descriptions that get prepended to every image prompt. Rama isn't just “a prince”—he's “a young man with dark skin, wearing a golden crown and yellow dhoti, carrying a divine bow.”

Result: Characters maintain visual identity across 20+ pages and chapters.
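
A minimal sketch of the anchoring idea, assuming the character descriptions live in a simple dictionary. The Rama description is the one quoted above. The `CHARACTERS` table, the `build_image_prompt` helper, and the prompt layout are hypothetical. Including only the characters named in a scene is also an assumption made here to keep prompts short, not something the project is known to do.

```python
# Hypothetical character sheet produced by the adaptation phase.
CHARACTERS = {
    "Rama": (
        "a young man with dark skin, wearing a golden crown and yellow dhoti, "
        "carrying a divine bow"
    ),
}


def build_image_prompt(
    scene: str,
    characters: dict[str, str],
    style: str = "Amar Chitra Katha comic style",
) -> str:
    """Prepend the style and the description of each character named in the scene."""
    anchors = [
        f"{name}: {description}."
        for name, description in characters.items()
        if name in scene  # simple substring match; an assumption of this sketch
    ]
    return " ".join([f"Style: {style}.", *anchors, f"Scene: {scene}"])


prompt = build_image_prompt("Rama strings the great bow", CHARACTERS)
```

Every page that mentions Rama then carries the same description text, which is what keeps the character's appearance stable from one image to the next.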

03

Graceful Degradation Over Failure

Image generation APIs fail. Content filters trigger unexpectedly. Rather than failing the entire book, I implemented a fallback: if an image fails after retries, generate a text-only page with beautiful typography instead.

User gets a book with 8/10 images instead of an error message.
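
The retry-then-fallback behavior can be sketched as follows. This is an illustration under assumptions: `Page`, `make_page`, the `generate` callable, and the default of three attempts are all hypothetical, and the real project may structure its retries differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Page:
    text: str
    image: Optional[bytes]  # None means "render this as a text-only page"


def make_page(
    text: str,
    prompt: str,
    generate: Callable[[str], bytes],
    max_attempts: int = 3,
) -> Page:
    """Try to illustrate a page; fall back to text-only after repeated failures."""
    for _ in range(max_attempts):
        try:
            return Page(text=text, image=generate(prompt))
        except Exception:
            continue  # API error or content-filter rejection: try again
    return Page(text=text, image=None)


def always_blocked(prompt: str) -> bytes:
    """Simulates a generator whose content filter rejects every request."""
    raise RuntimeError("content filter triggered")


page = make_page("Hanuman leaps across the ocean.", "Hanuman mid-leap", always_blocked)
```

The PDF step can then branch on `page.image is None` and lay out a typographic page, so one failed image never aborts the whole book.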

04

Amar Chitra Katha Style

I chose Amar Chitra Katha—the beloved Indian comic book style—as the default. It's culturally specific and instantly recognizable to anyone who grew up with these stories. Not generic “cartoon style,” but something that honors the source material.

Cultural specificity over generic aesthetics.

05

The Feature I Didn't Build

The killer feature would be obvious: upload a photo of your child, and they become the protagonist. Imagine your kid as Arjuna, or standing beside Hanuman. I thought hard about this and decided against it.

Current model providers don't offer the data protection and privacy guarantees I'd need for children's photos. I don't have the compute to run this locally. Until I can guarantee that a child's image isn't retained, logged, or used for training, I won't build it.

I'd rather ship without a feature than ship it unsafely.

Under the Hood

Architecture That Scales

Frontend

  • Next.js with App Router
  • Real-time job status via polling
  • PDF preview with react-pdf
  • Deployed on Railway

Backend

  • FastAPI with async job processing
  • Gemini API for story adaptation
  • Gemini for image generation
  • ReportLab for PDF assembly
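
The "async job processing" plus "status via polling" pairing above can be sketched with the standard library alone. This is not the project's FastAPI code. The in-memory `JOBS` table and the `submit_job`, `get_status`, and `run_job` helpers are hypothetical, and in a real service the first two would presumably sit behind HTTP endpoints.

```python
import asyncio
import uuid

JOBS: dict[str, dict] = {}        # hypothetical in-memory job table, lost on restart
_TASKS: set[asyncio.Task] = set()  # keep references so running tasks are not garbage collected


async def run_job(job_id: str, total_pages: int) -> None:
    """Background worker: updates progress as each page finishes."""
    JOBS[job_id]["status"] = "running"
    for page in range(1, total_pages + 1):
        await asyncio.sleep(0.01)  # stand-in for adapting and illustrating one page
        JOBS[job_id]["pages_done"] = page
    JOBS[job_id]["status"] = "done"


def submit_job(total_pages: int) -> str:
    """Register a job, schedule it on the running loop, and return its id at once."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "queued", "pages_done": 0, "total_pages": total_pages}
    task = asyncio.get_running_loop().create_task(run_job(job_id, total_pages))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return job_id


def get_status(job_id: str) -> dict:
    """What a polling client would receive; returns a copy of the job record."""
    return dict(JOBS[job_id])


async def demo() -> dict:
    job_id = submit_job(total_pages=5)
    while get_status(job_id)["status"] != "done":
        await asyncio.sleep(0.02)  # client-side polling interval
    return get_status(job_id)


final = asyncio.run(demo())
```

Returning the job id immediately keeps the request short, and the frontend can show progress from `pages_done` while it polls.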

The Generation Pipeline

Source Text → Gemini Adaptation → Parallel Image Gen → PDF Assembly → S3 Upload

The Build

3 Hours, Solo, Deployed

This was built for a friends-only hackathon. No prizes, no judges—just a group of builders challenging each other to ship something in a day.

I scoped aggressively. No auth system. No user accounts. No payment processing. Just the core loop: pick a story, pick a style, generate a book, download the PDF.

One more thing: I didn't write a single line of code. The entire project—FastAPI backend, Next.js frontend, Gemini integration, PDF pipeline, Railway deployment—was built using Claude Code. I directed, it implemented. That's the future of software development, and it's already here.

Hour 1

Core Pipeline

FastAPI backend, Gemini integration for text and images, basic PDF assembly. Ugly, but it worked.

Hour 2

Character Consistency

Added character reference system. Pre-parsed Ramayana, Mahabharata, Panchatantra with character databases. Images started looking coherent.

Hour 3

Frontend + Deploy

Next.js frontend, Railway deployment, S3 for PDF storage. Live at kadhai.ai before the hackathon ended.

Reflections

What I Learned

Scope Ruthlessly

No auth. No accounts. No payments. Just the core value: make a book. Everything else is a distraction when you have 3 hours.

AI Products Need Guardrails

Unconstrained AI output is inconsistent. The magic comes from thoughtful constraints: curated styles, character anchoring, age-appropriate adaptation.

Same Model, Better Alignment

Using Gemini for both text and images wasn't laziness—it was a design choice. The model that writes the scene understands how to illustrate it.

Know What Not to Build

The photo-to-protagonist feature would have been a hit. But without privacy guarantees for children's images, it stays on the shelf.

Try It Yourself

Generate an illustrated storybook from any classic tale. Pick a story, choose an art style, and watch the magic happen.
