A flashcards webapp

I recently built a flashcards web application to help people learn and review various topics. The project combines several technologies I’m passionate about (React, FastAPI, MongoDB, and AI) into a practical learning tool.

The core functionality is simple: users review flashcards on different topics, each with a multiple-choice question and an explanation. But under the hood, there are some interesting technical aspects that made this a fun project to work on.

The frontend is built with React and Tailwind CSS, and deployed on Vercel. I used custom hooks for state management and focused on making it mobile-responsive with a dynamic sidebar. The backend API is powered by FastAPI and hosted on DigitalOcean, using Uvicorn and Nginx to handle requests.

The deployment is fully automated through a GitHub Actions workflow: on every push to master, it SSHs into the DigitalOcean droplet, pulls the latest changes, and restarts the Uvicorn service, so no manual deployment steps are needed. The infrastructure itself is managed with Terraform for reproducibility.

One of the more interesting parts is the AI-powered content generation system. It uses OpenAI’s models with some careful prompt engineering to generate flashcards, explanations, and even emojis for topics. To keep costs reasonable while maintaining quality, I implemented a two-tier approach:

For flashcard generation, it uses a more capable model (gpt-4o) with comprehensive validation and duplicate detection that combines several techniques:
- TLSH (Trend Micro Locality Sensitive Hash) to compute content similarity hashes
- N-gram analysis to detect overlapping phrases
- Keyword extraction for topic similarity

For explanations, it uses a smaller, more cost-effective model (gpt-4o-mini) enhanced with RAG (Retrieval-Augmented Generation), pulling in relevant context from Wikipedia and other sources.
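
As a rough illustration of the n-gram part of the duplicate check, here is a minimal sketch using word trigrams and Jaccard similarity. The function names, the trigram size, and the 0.5 threshold are assumptions made for this example, not the values used in the app, and the TLSH and keyword-extraction checks are left out.

```python
import re


def word_ngrams(text: str, n: int = 3) -> set:
    """Return the set of word n-grams in text, lowercased and without punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < n:
        # Short texts: fall back to a single n-gram made of all the words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity between the n-gram sets of two texts (0.0 to 1.0)."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def is_probable_duplicate(new_question: str, existing: list, threshold: float = 0.5) -> bool:
    """Flag new_question if it overlaps heavily with any existing question."""
    return any(ngram_similarity(new_question, q) >= threshold for q in existing)
```

A check like this only catches near-verbatim overlap, which is presumably why it is combined with hashing and keyword comparison rather than used on its own.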

The explanation system was particularly fun to implement: it explains not only why the correct answer is right but also why each incorrect option is wrong. This helps create a more complete learning experience.

The system also has an intelligent keyword discovery mechanism that helps users dive deeper into topics. Starting with a seed topic, it progressively discovers more specific and niche keywords, moving from broad concepts to detailed subtopics. For example, with “AWS Lambda”, it might start with basic terms like “serverless” and “function as a service”, then progress to more specific concepts like “cold starts”, “execution context”, and finally to expert-level topics like “reserved concurrency limits” and “IAM execution role inline policies”. This helps ensure comprehensive coverage of topics while maintaining a natural learning progression.
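
The progression from broad to niche keywords can be pictured as a level-by-level expansion. The sketch below is only an illustration of that idea: `discover_keywords` and the `suggest` callable are hypothetical names, and `suggest` stands in for whatever model call actually proposes more specific keywords. The canned lookup table replaces the AI call so the example runs on its own.

```python
from typing import Callable


def discover_keywords(
    seed: str,
    suggest: Callable[[str, int], list],
    max_depth: int = 3,
) -> dict:
    """Expand a seed topic level by level, from broad terms to niche ones.

    suggest(keyword, depth) is a hypothetical stand-in for the model call
    that proposes more specific keywords for a given keyword.
    """
    seen = {seed.lower()}
    levels = {}
    frontier = [seed]
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for keyword in frontier:
            for candidate in suggest(keyword, depth):
                key = candidate.lower()
                if key not in seen:  # skip keywords that were already discovered
                    seen.add(key)
                    next_frontier.append(candidate)
        if not next_frontier:
            break  # nothing new at this depth, so stop early
        levels[depth] = next_frontier
        frontier = next_frontier
    return levels


# Canned suggestions standing in for the AI call (illustrative data only).
canned = {
    "AWS Lambda": ["serverless", "function as a service"],
    "serverless": ["cold starts", "execution context"],
    "cold starts": ["reserved concurrency limits"],
}
levels = discover_keywords("AWS Lambda", lambda kw, depth: canned.get(kw, []))
```

Passing the depth to `suggest` lets a real implementation ask the model for increasingly specialized terms at each level.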

For data storage, I went with MongoDB hosted on DigitalOcean. While this required a more complex setup than SQLite, it gives better scalability and performance. The document-based structure works well for storing flashcard metadata and user progress.

To make learning more engaging, I implemented several gamification features. The leaderboard system tracks user progress across different timeframes. Users can view their ranking and points for daily, weekly, monthly, or all-time periods. The leaderboard shows the top performers with medal emojis (🥇, 🥈, 🥉) for the top three positions, and highlights the current user’s position.
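
To make the timeframe logic concrete, here is a minimal in-memory sketch of how such a leaderboard could be computed. The event tuple shape, the function names, and the choice of Monday as the start of the week are assumptions for the example; the real app would more likely aggregate this in MongoDB, and highlighting the current user is left out.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Optional

MEDALS = {1: "🥇", 2: "🥈", 3: "🥉"}


def period_start(period: str, now: datetime) -> Optional[datetime]:
    """Return the start of the given leaderboard period, or None for all-time."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight
    if period == "weekly":
        return midnight - timedelta(days=now.weekday())  # Monday of this week
    if period == "monthly":
        return midnight.replace(day=1)
    if period == "all-time":
        return None
    raise ValueError(f"unknown period: {period}")


def leaderboard(events, period: str, now: datetime, top_n: int = 10) -> list:
    """events is an iterable of (user, points, timestamp) tuples."""
    start = period_start(period, now)
    totals = Counter()
    for user, points, ts in events:
        if start is None or ts >= start:
            totals[user] += points
    # Highest points first; ties are broken alphabetically by user name.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [
        {"rank": rank, "user": user, "points": points, "medal": MEDALS.get(rank, "")}
        for rank, (user, points) in enumerate(ranked[:top_n], start=1)
    ]
```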

I also added a leveling system that tracks how much users have learned. Each answered question contributes to XP (experience points), with users leveling up every 10 XP. The level badges use different colors and visual effects (like shadows and rings) as users progress, creating a sense of achievement. The system keeps track of answered questions by topic, showing users their progress and allowing them to revisit resources they’ve learned from. A progress bar indicates how close they are to reaching the next level, with animations triggering during level-up moments.
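
The level and progress-bar arithmetic is simple enough to show in a few lines. This sketch assumes users start at level 1 with 0 XP, which the app may number differently; the function names are made up for the example.

```python
XP_PER_LEVEL = 10  # from the post: a new level every 10 XP


def level_progress(xp: int) -> dict:
    """Return the current level and the progress toward the next one.

    Assumes users start at level 1 with 0 XP (an assumption for this sketch).
    """
    if xp < 0:
        raise ValueError("xp must be non-negative")
    into_level = xp % XP_PER_LEVEL
    return {
        "level": xp // XP_PER_LEVEL + 1,
        "xp_into_level": into_level,
        "xp_to_next": XP_PER_LEVEL - into_level,
        "progress_percent": into_level * 100 // XP_PER_LEVEL,
    }


def leveled_up(xp_before: int, xp_after: int) -> bool:
    """True when an XP gain crosses a level boundary (the moment to animate)."""
    return level_progress(xp_after)["level"] > level_progress(xp_before)["level"]
```

With these assumptions, a user with 27 XP is at level 3 with the progress bar 70% full, and going from 9 to 10 XP triggers the level-up animation.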

# AI-Powered Content Generation from Resources

Recently, I added a powerful feature that allows automatic flashcard generation from various resources like PDFs, documentation sites, and web pages. This system intelligently processes content and creates relevant flashcards while maintaining context.

# Context Providers

The system uses specialized “Context Providers” that know how to parse different types of content sources:

- Documentation sites: Built-in providers for major platforms like AWS, Terraform, Python docs, etc.
- Wikipedia: Custom parsers for both English and Hungarian Wikipedia
- PDFs: Direct text extraction with PyPDF2
- General websites: Intelligent HTML parsing with BeautifulSoup
- NCBI (Medical/Biology): Specialized parser for medical documentation

Each provider understands the structure of its source and can extract meaningful content while filtering out navigation elements, headers, and other non-content sections.
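
One way to picture the provider idea is a small registry where each provider says whether it can handle a URL and knows how to extract text from the raw content. The sketch below is an assumption-heavy illustration: the class names, the `can_handle`/`extract` interface, and the list of skipped tags are all hypothetical, and it uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tags whose content is treated as non-content (an assumed list).
SKIPPED_TAGS = {"nav", "header", "footer", "script", "style"}


class _TextExtractor(HTMLParser):
    """Collects text while ignoring anything nested inside SKIPPED_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


class GenericHtmlProvider:
    """Fallback provider for any web page (hypothetical name)."""

    def can_handle(self, url: str) -> bool:
        return True

    def extract(self, raw: str) -> str:
        parser = _TextExtractor()
        parser.feed(raw)
        parser.close()
        return " ".join(parser.parts)


class WikipediaProvider(GenericHtmlProvider):
    """Matches English and Hungarian Wikipedia; reuses the generic extraction here."""

    def can_handle(self, url: str) -> bool:
        return urlparse(url).hostname in {"en.wikipedia.org", "hu.wikipedia.org"}


# Order matters: specific providers first, the generic fallback last.
PROVIDERS = [WikipediaProvider(), GenericHtmlProvider()]


def pick_provider(url: str):
    """Return the first provider that claims the URL."""
    return next(p for p in PROVIDERS if p.can_handle(url))
```

A real Wikipedia provider would override `extract` as well, since it can rely on the known page layout to drop infoboxes, reference lists, and similar sections.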

# Flashcard Generation Process

When generating flashcards from a resource, the system goes through the following steps:

- Content Extraction: The appropriate context provider parses the source and extracts clean, relevant text
- Smart Chunking: Large content is split into overlapping chunks to maintain context between flashcards
- Keyword Generation: AI identifies key concepts and terms from the content
- Context-Aware Generation: Flashcards are created that reference specific parts of the source material
- Quality Control: Generated cards are validated for accuracy and relevance
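
The chunking step can be sketched as a sliding window over the words of the extracted text. This is a minimal version under assumed parameters: the function name, the word-based splitting, and the default sizes are illustrative, and the real system may chunk by characters, tokens, or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list:
    """Split text into chunks of up to chunk_size words.

    Each chunk repeats the last `overlap` words of the previous chunk,
    so context carries over across chunk boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

The overlap trades some repeated input for continuity: a concept introduced at the end of one chunk is still visible at the start of the next.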

Some other notable features include:

- Local storage optimization for better performance
- Progressive loading of content (infinite scroll)
- Celebration effects for correct answers
- Support for multiple languages
- Daily quiz functionality

The project is live at brightmind.space. I’m continuously adding new features and improvements, so feel free to try it out and let me know what you think!

Written on October 28, 2024

If you notice anything wrong with this post (factual error, rude tone, bad grammar, typo, etc.), and you feel like giving feedback, please do so by contacting me at hello@samu.space. Thank you!
