Maharaj Systems
AI · Automation · Data Pipelines
01

DanceMap CPH

Copenhagen's dance events are scattered across Facebook, Meetup, and private pages. This system scrapes all of them daily, parses the content with AI, removes duplicates, and outputs a clean, unified dataset.

  • Scrapes multiple event sources on a daily schedule
  • AI parsing extracts date, time, venue, and price from raw text
  • Fuzzy deduplication matches events across sources
  • Structured output stored in Google Sheets
Playwright · Claude API · Google Apps Script · Google Sheets · JavaScript
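
As a rough illustration of the structured-output step, the sketch below turns parsed event objects into the header-plus-rows 2D array that a spreadsheet write typically expects. The column names, the `toSheetRows` helper, and the sample events are assumptions made for this example; the actual sheet layout is not described here.

```javascript
// Hypothetical column order; the real sheet layout is an assumption.
const COLUMNS = ["title", "date", "time", "venue", "price", "source"];

// Build a 2D array: one header row, then one row per event.
// Missing fields become empty strings so every row has the same width.
function toSheetRows(events) {
  const header = COLUMNS.slice();
  const rows = events.map((event) =>
    COLUMNS.map((key) =>
      event[key] === undefined || event[key] === null ? "" : String(event[key])
    )
  );
  return [header, ...rows];
}

// Made-up sample events, for illustration only.
const sheetRows = toSheetRows([
  {
    title: "Salsa Social",
    date: "2025-03-14",
    time: "20:00",
    venue: "Example Studio",
    price: "80 DKK",
    source: "meetup",
  },
  { title: "Tango Night", date: "2025-03-15", venue: "Example Hall", source: "facebook" },
]);

console.log(sheetRows);
```

In Google Apps Script, an array of this shape could be passed to `Range.setValues`, provided the target range has matching dimensions.
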
02

AI Event Extractor

Takes raw, unstructured event text — from Facebook posts, flyers, or messages — and returns clean, validated JSON with all relevant fields extracted and typed.

  • Accepts any free-form text — no formatting required
  • Extracts title, date, time, location, price, event type, dance style
  • Returns a confidence score with each extraction
  • Handles both Danish and English text
Claude API · JavaScript · JSON Schema
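
To give a feel for what "validated JSON" can mean in practice, here is a minimal sketch that checks a model response before it is accepted. The field names (`eventType`, `danceStyle`, `confidence`), the date and time formats, and the `validateExtraction` helper are assumptions for illustration, not the project's actual schema.

```javascript
// Hypothetical field names, based on the fields listed above.
const REQUIRED_FIELDS = [
  "title", "date", "time", "location", "price", "eventType", "danceStyle", "confidence",
];

// Parse a raw model response and check the shape and types of the result.
function validateExtraction(raw) {
  let event;
  try {
    event = JSON.parse(raw);
  } catch (err) {
    return { ok: false, errors: ["response is not valid JSON"] };
  }
  if (event === null || typeof event !== "object" || Array.isArray(event)) {
    return { ok: false, errors: ["response is not a JSON object"] };
  }

  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    if (!(field in event)) errors.push(`missing field: ${field}`);
  }
  if (typeof event.title !== "string" || event.title.trim() === "") {
    errors.push("title must be a non-empty string");
  }
  // Assumed formats: ISO date (YYYY-MM-DD) and 24-hour time (HH:MM), or null if unknown.
  if (typeof event.date !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(event.date)) {
    errors.push("date must be YYYY-MM-DD");
  }
  if (
    event.time !== null &&
    (typeof event.time !== "string" || !/^([01]\d|2[0-3]):[0-5]\d$/.test(event.time))
  ) {
    errors.push("time must be HH:MM or null");
  }
  if (typeof event.confidence !== "number" || event.confidence < 0 || event.confidence > 1) {
    errors.push("confidence must be a number between 0 and 1");
  }
  return { ok: errors.length === 0, errors, event };
}

// Made-up sample response, for illustration only.
const sampleResponse = JSON.stringify({
  title: "Bachata Friday",
  date: "2025-03-14",
  time: "20:00",
  location: "Example Studio, Copenhagen",
  price: "80 DKK",
  eventType: "social",
  danceStyle: "bachata",
  confidence: 0.92,
});

console.log(validateExtraction(sampleResponse).ok);
```

A response that fails validation could be retried or flagged for manual review rather than written to the dataset.
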
03

Scraper + Dedup System

A modular scraping framework with built-in deduplication, designed for collecting entity-based data from multiple sources without creating duplicate records.

  • Separate scraper module per source — easy to extend
  • Deduplication using fuzzy matching on title, date, and location
  • Source attribution and timestamps on every record
  • Configurable without modifying the core code
Playwright · Google Apps Script · JavaScript
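
One way fuzzy matching on title, date, and location could work is sketched below. It assumes an exact match on date plus a Sørensen-Dice similarity on character bigrams for title and location; the 0.8 threshold, the helper names, and the sample records are illustrative assumptions, not the framework's actual algorithm.

```javascript
// Lowercase and replace runs of punctuation/whitespace with a single space.
// \p{L} and \p{N} keep letters such as æ, ø, å intact.
function normalize(text) {
  return String(text || "").toLowerCase().replace(/[^\p{L}\p{N}]+/gu, " ").trim();
}

// Count character bigrams of the normalized text, ignoring spaces.
function bigrams(text) {
  const compact = normalize(text).replace(/ /g, "");
  const grams = new Map();
  for (let i = 0; i < compact.length - 1; i++) {
    const gram = compact.slice(i, i + 2);
    grams.set(gram, (grams.get(gram) || 0) + 1);
  }
  return grams;
}

// Sørensen-Dice coefficient: 0 means no shared bigrams, 1 means identical.
function similarity(a, b) {
  const gramsA = bigrams(a);
  const gramsB = bigrams(b);
  let sizeA = 0;
  let sizeB = 0;
  let shared = 0;
  for (const count of gramsA.values()) sizeA += count;
  for (const count of gramsB.values()) sizeB += count;
  // Strings too short to have bigrams fall back to exact comparison.
  if (sizeA === 0 || sizeB === 0) return normalize(a) === normalize(b) ? 1 : 0;
  for (const [gram, count] of gramsA) shared += Math.min(count, gramsB.get(gram) || 0);
  return (2 * shared) / (sizeA + sizeB);
}

// Assumed rule: same date, and both title and location are similar enough.
function isDuplicate(a, b, threshold = 0.8) {
  return (
    a.date === b.date &&
    similarity(a.title, b.title) >= threshold &&
    similarity(a.location, b.location) >= threshold
  );
}

// Keep the first occurrence of each event and record every source it was seen on.
function dedupe(records, threshold = 0.8) {
  const unique = [];
  for (const record of records) {
    const match = unique.find((kept) => isDuplicate(kept, record, threshold));
    if (match) {
      if (!match.sources.includes(record.source)) match.sources.push(record.source);
    } else {
      unique.push({ ...record, sources: [record.source] });
    }
  }
  return unique;
}

// Made-up sample records, for illustration only.
const deduped = dedupe([
  { title: "Salsa Social @ Example Studio", date: "2025-03-14", location: "Example Studio", source: "facebook" },
  { title: "Salsa social at Example Studio!", date: "2025-03-14", location: "Example Studio, CPH", source: "meetup" },
  { title: "Tango Night", date: "2025-03-14", location: "Example Studio", source: "meetup" },
]);

console.log(deduped.map((event) => `${event.title} [${event.sources.join(", ")}]`));
```

Requiring an exact date match keeps recurring events on different days apart, while the similarity threshold absorbs small differences in how each source writes the title and venue.
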

Let's build something that works.

Have a project in mind? Let's talk.

Get in touch →