Projects
Three working systems.
Real outputs.
Each project solves a concrete problem and runs on real infrastructure.
01
DanceMap CPH
Copenhagen's dance events are scattered across Facebook, Meetup, and private pages. This system scrapes those sources daily, parses the content with AI, removes duplicates, and outputs a clean, unified dataset.
- Scrapes multiple event sources on a daily schedule
- AI parsing extracts date, time, venue, and price from raw text
- Fuzzy deduplication matches events across sources
- Structured output stored in Google Sheets
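
As a rough illustration of how fuzzy deduplication across sources can work, here is a minimal sketch using only Python's standard library. The `Event` fields, the 0.8 threshold, and the sample titles are assumptions made for this example, not the project's actual schema or settings.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Event:
    # Hypothetical record shape, chosen for illustration only.
    title: str
    date: str    # ISO date string, e.g. "2024-05-03"
    venue: str
    source: str


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def is_duplicate(a: Event, b: Event, threshold: float = 0.8) -> bool:
    # Dates must match exactly; title and venue are compared fuzzily.
    # The threshold is an assumed value and would need tuning on real data.
    if a.date != b.date:
        return False
    return (similarity(a.title, b.title) >= threshold
            and similarity(a.venue, b.venue) >= threshold)


def deduplicate(events: list[Event]) -> list[Event]:
    """Keep the first occurrence of each event and drop later fuzzy matches."""
    unique: list[Event] = []
    for event in events:
        if not any(is_duplicate(event, kept) for kept in unique):
            unique.append(event)
    return unique


# Made-up sample data: the same event listed on two sources, plus a different one.
events = [
    Event("Salsa Night at Studio K", "2024-05-03", "Studio K, Copenhagen", "facebook"),
    Event("Salsa night @ Studio K", "2024-05-03", "Studio K Copenhagen", "meetup"),
    Event("Tango Milonga", "2024-05-03", "Studio K, Copenhagen", "meetup"),
]
unique_events = deduplicate(events)
```

The pairwise comparison is quadratic in the number of events, which is acceptable for a city-sized daily batch. A larger dataset would likely group events by date before comparing.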
02
AI Event Extractor
Takes raw, unstructured event text (Facebook posts, flyers, or messages) and returns clean, validated JSON with every relevant field extracted and typed.
- Accepts any free-form text, with no formatting required
- Extracts title, date, time, location, price, event type, dance style
- Returns a confidence score with each extraction
- Handles both Danish and English text
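
To make "validated and typed" concrete, the sketch below checks a JSON string, such as an AI model might return, and converts it into a typed record. The key names, the 0 to 1 confidence range, and the ISO date and time formats are assumptions for this example. The extractor's real output schema is not shown here.

```python
import datetime as dt
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedEvent:
    # Field names mirror the list above; the exact keys are hypothetical.
    title: str
    date: dt.date
    time: Optional[dt.time]
    location: Optional[str]
    price: Optional[float]
    event_type: Optional[str]
    dance_style: Optional[str]
    confidence: float


def validate_extraction(raw_json: str) -> ExtractedEvent:
    """Parse a JSON string and return a typed event, raising on bad input."""
    data = json.loads(raw_json)

    title = data.get("title")
    if not isinstance(title, str) or not title.strip():
        raise ValueError("title must be a non-empty string")

    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    raw_time = data.get("time")
    raw_price = data.get("price")
    return ExtractedEvent(
        title=title.strip(),
        date=dt.date.fromisoformat(data["date"]),  # raises ValueError if malformed
        time=dt.time.fromisoformat(raw_time) if raw_time else None,
        location=data.get("location"),
        price=float(raw_price) if raw_price is not None else None,
        event_type=data.get("event_type"),
        dance_style=data.get("dance_style"),
        confidence=confidence,
    )


# Made-up model output, used only to exercise the validator.
sample = json.dumps({
    "title": "Salsa Social", "date": "2024-05-03", "time": "19:30",
    "location": "Studio K", "price": 80, "event_type": "social",
    "dance_style": "salsa", "confidence": 0.92,
})
event = validate_extraction(sample)
```

Rejecting malformed output at this boundary means downstream code can rely on real `date` and `float` values instead of re-checking strings.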
03
Scraper + Dedup System
A modular scraping framework with built-in deduplication. Built for collecting entity-based data from multiple sources without creating duplicate records.
- One scraper module per source, so new sources are easy to add
- Deduplication using fuzzy matching on title, date, and location
- Source attribution and timestamps on every record
- Configurable without modifying the core code
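
One way such a modular layout could look is a small registry where each source contributes a single function and the runner stamps every record with its source and a timestamp. This is a sketch under assumptions: `SCRAPERS`, `register`, `run`, and the `example_source` scraper are hypothetical names, and the `enabled` list stands in for whatever configuration the real framework reads.

```python
from datetime import datetime, timezone
from typing import Callable

# Maps a source name to a function returning raw records for that source.
SCRAPERS: dict[str, Callable[[], list[dict]]] = {}


def register(source: str):
    """Decorator that adds a scraper function to the registry."""
    def wrap(fn: Callable[[], list[dict]]) -> Callable[[], list[dict]]:
        SCRAPERS[source] = fn
        return fn
    return wrap


@register("example_source")
def scrape_example() -> list[dict]:
    # Placeholder data; a real module would fetch and parse a page here.
    return [{"title": "Salsa Social", "date": "2024-05-03", "location": "Studio K"}]


def run(enabled: list[str]) -> list[dict]:
    """Run the enabled scrapers and attach source and timestamp to each record."""
    records = []
    for source in enabled:
        for record in SCRAPERS[source]():
            records.append({
                **record,
                "source": source,
                "scraped_at": datetime.now(timezone.utc).isoformat(),
            })
    return records


# In practice the enabled list would come from a config file, not a literal.
records = run(["example_source"])
```

Adding a source then means writing one decorated function and listing its name in the configuration, with no change to `run`.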
Let's build something that works.
Have a project in mind? Let's talk.