Opportunity Radar
A scraper that tracks 150+ Canadian tech employers and turns their open roles into a filterable dashboard, built for my own job hunt.
Job hunting in Canadian tech is noisy. Roles get posted quietly, sit on a careers page for a week, and vanish. Checking 150 company sites by hand isn't a plan — it's a part-time job. So I built the tool I wanted: something that watches the churn for me and hands back a clean, shareable dashboard instead of forty open tabs.
What it does
Opportunity Radar scans a curated list of 150+ Canadian technology employers — Toronto, Vancouver, Montreal, Waterloo, and remote-friendly teams — pulls their open roles, and turns the results into both an Excel workbook and a single-page HTML dashboard. The dashboard flags brand-new postings, surfaces remote roles, and lets you filter by company, department, and seniority, so triaging a morning's worth of openings takes a minute instead of an hour.
The goal was for the output to feel like a small data product, not a raw scrape — something I'd actually be comfortable putting in a portfolio.
How it works
The pipeline runs end to end: fetch, extract, normalize, categorize, report.
- Fetching is concurrent, using `requests` for normal pages and falling back to Playwright for the JavaScript-heavy job boards that won't render otherwise.
- Extraction tries structured JSON-LD first, since a lot of careers pages embed clean posting data, then falls back to content heuristics when they don't.
- Normalizing cleans up titles, parses seniority and remote signals, and deduplicates the noisy boards that list the same role five times.
- Categorizing sorts roles into high-level departments — engineering, data, security, product, marketing — so I can scan by area.
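To make the extract, normalize, and deduplicate steps concrete, here is a minimal sketch using only the standard library. It is an illustration under stated assumptions, not the project's actual code: the function names (`extract_postings`, `normalize`, `dedupe`), the seniority keyword rules, and the dedup key are all hypothetical. It reads schema.org `JobPosting` objects from JSON-LD script tags and returns an empty list when a page has none, which is the point where a heuristic fallback would take over. Nested `@graph` structures are not handled.

```python
import json
import re
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_postings(html):
    """Return JobPosting dicts found in JSON-LD, or [] so the caller can fall back."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    postings = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD rather than failing the whole page
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "JobPosting":
                postings.append(item)
    return postings


# Hypothetical keyword rules; a real list would need tuning per board.
SENIORITY_RULES = [
    ("senior", re.compile(r"\b(senior|sr|staff|principal|lead)\b", re.I)),
    ("junior", re.compile(r"\b(junior|jr|intern)\b", re.I)),
]


def normalize(posting):
    """Clean the title and derive seniority and remote flags from one posting."""
    org = posting.get("hiringOrganization")
    company = org.get("name", "") if isinstance(org, dict) else str(org or "")
    title = re.sub(r"\s+", " ", str(posting.get("title", ""))).strip()
    seniority = next(
        (label for label, pattern in SENIORITY_RULES if pattern.search(title)),
        "unspecified",
    )
    # schema.org uses jobLocationType "TELECOMMUTE" for fully remote roles.
    remote = posting.get("jobLocationType") == "TELECOMMUTE" or bool(
        re.search(r"\bremote\b", title, re.I)
    )
    return {"company": company.strip(), "title": title,
            "seniority": seniority, "remote": remote}


def dedupe(roles):
    """Keep the first role seen for each (company, title) pair, case-insensitively."""
    seen = set()
    unique = []
    for role in roles:
        key = (role["company"].lower(), role["title"].lower())
        if key not in seen:
            seen.add(key)
            unique.append(role)
    return unique
```

A page would flow through it as `dedupe([normalize(p) for p in extract_postings(html)])`. Keying the dedup on company and title is the simplest choice; a board that lists the same title in several cities would need the location added to the key.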
It also has a --demo mode that runs entirely offline against local fixtures, which made it easy to show the whole pipeline without hammering anyone's site.
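One way an offline switch like this could be wired up is sketched below. Only the `--demo` flag comes from the project; the program name, the `fixtures` directory, the `.html` fixture layout, and the `load_pages` and `fetch_live` names are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def build_parser():
    """Command-line parser with a single offline switch."""
    parser = argparse.ArgumentParser(prog="opportunity-radar")  # hypothetical name
    parser.add_argument(
        "--demo",
        action="store_true",
        help="read saved pages from local fixtures instead of the network",
    )
    return parser


def load_pages(demo, fixtures_dir=Path("fixtures"), fetch_live=None):
    """Return a {source_name: html} mapping from fixtures or from a live fetcher."""
    if demo:
        # Assumed layout: one saved HTML file per employer, named after the source.
        return {
            path.stem: path.read_text(encoding="utf-8")
            for path in sorted(fixtures_dir.glob("*.html"))
        }
    if fetch_live is None:
        raise RuntimeError("live mode needs a fetch_live callable")
    return fetch_live()
```

Because both branches return the same mapping, everything downstream of fetching runs unchanged in demo mode, for example `load_pages(build_parser().parse_args().demo, fetch_live=my_fetcher)` with `my_fetcher` standing in for the real network code.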
What I learned
The interesting part wasn't the scraping — it was how messy real careers pages are. Every board structures its data differently, so the "JSON-LD first, heuristics second" approach came out of repeatedly hitting pages that looked standard and weren't. Building the offline demo mode early also paid off more than I expected: it let me iterate on the report design without depending on the network, and it doubles as a safe way to show the project. If I came back to it, the next step would be scheduling it to run nightly and commit a fresh dashboard on its own.