<Kirbo.DEV />
This portfolio sat on a back burner for nearly seven years. What finally moved it was a growing pile of things I actually wanted to write about — projects I'd shipped, learned from, and wanted to document somewhere I owned. The Electrolux-to-MQTT (Message Queuing Telemetry Transport) project had picked up real users. Other things were in the pipeline. At the same time I was getting comfortable with Agentic Coding and Claude Code — enough that the effort of getting here finally felt manageable alongside a full-time job.
The 2019 backstory
I started this project in 2019: designed the "logo" and the color palette, and implemented an almost-complete alpha of the frontend with Ant-Design — no dark theme, and missing the part that actually mattered: a backend and a decent way to manage content.
Initially I wanted to build most of the rest myself too — as an ultimate reference on what I'm capable of — using ExpressJS or similar, with the content management on top. The backend work wasn't worth my free time though: auth flows, admin panels, rate limiting — solid headless CMS (Content Management System) options handle all of that. I opted for one of those and focused on the frontend instead.
I evaluated a couple of headless CMS options, but none really spoke to me back then. I got interested in completely different stuff — Go, and the never-ending smart home rabbit hole with a Raspberry Pi 3B+ — so this project went on hiatus.
The seven-year itch
Over the years I've been cursing:
> I wish I had my own blog, the one I've been meaning to create for years...
At some point I almost gave up and seriously considered just using self-hosted WordPress or WordPress.com so I could start writing about my projects, learnings, fuck-ups, opinions, whatnot. Somewhere to write about things I find interesting, show how I spend my time, and how I think. Felt like it'd make a nice hobby. 😅
Around the same time I started diving into Agentic Coding at work and in my free time, and I wanted somewhere to write that down too — practices, experiences, how it has already changed my everyday work. My Electrolux-to-MQTT / Home Assistant project had also picked up steam (10 reported active users, including me! 🥳) and was overdue for proper write-ups. That was enough motivation to revive this, and I realized I could also use Claude Code on the project itself.
Agentic-assisted QA (Quality Assurance)
It sped things up. When I revived this project I had no tests, and I'd barely manually tested anything beyond my desktop PC (Personal Computer) and phone. Even if I'd added the laptop and tablet to the manual loop, I couldn't have covered half the surface at the level of detail I got with Claude after maybe an hour of planning and then iterating.
Of course I've set up and used Playwright before, automated it on CI/CD (Continuous Integration / Continuous Delivery) pipelines, and I know I could've handcrafted the tests with data-testid attributes or recorded them step-by-step via a browser extension. But that's a project on its own, and I'd have to keep it up-to-date manually every time I made any drastic change.
My motivation to revive this was already low. Spending my limited free time hand-rolling testing pipelines for 11 viewports × 11 pages × 2 themes wasn't going to happen — especially with the matrix also covering typography, layout consistency, a11y (Accessibility), and content density per viewport.
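For scale, the scripted side of that matrix is essentially a Cartesian product. A minimal sketch of how the cells could be enumerated (the viewport names, routes, and `Cell` shape here are placeholders I made up, not my actual config):

```typescript
// Sketch only: the viewports and routes below are made-up placeholders,
// not the real config -- just enough to show the 11 x 11 x 2 = 242 shape.
type Theme = "light" | "dark";

interface Cell {
  viewport: string;
  route: string;
  theme: Theme;
}

const viewports = [
  "iphone-se", "iphone-15", "iphone-15-landscape", "pixel-8",
  "ipad-portrait", "ipad-landscape", "laptop-13", "laptop-15",
  "desktop-1080p", "desktop-1440p", "ultrawide",
]; // 11 viewports

const routes = [
  "/", "/about", "/projects", "/blog", "/uses", "/contact",
  "/cv", "/now", "/privacy", "/blog/example-post", "/projects/example",
]; // 11 pages

const themes: Theme[] = ["light", "dark"];

// Cartesian product of viewport x route x theme.
function buildMatrix(): Cell[] {
  const cells: Cell[] = [];
  for (const viewport of viewports) {
    for (const route of routes) {
      for (const theme of themes) {
        cells.push({ viewport, route, theme });
      }
    }
  }
  return cells;
}

console.log(`Sweep: ${buildMatrix().length} cells`); // 242
```

Each cell then gets its own Playwright screenshot pass; the AI review runs on top of this scripted sweep.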
So I built it half-scripted, half-AI (Artificial Intelligence). The scripted matrix currently has 242 cells, and one run looks like this:
```
Run root: /path/to/visual-review/2026-04-29-010050/
Sweep: 242 cells, concurrency 6. 242/242 cells (35 with findings)
Report: /path/to/visual-review/2026-04-29-010050/findings-report.md
Matrix: /path/to/visual-review/2026-04-29-010050/matrix-result.md
JSON: /path/to/visual-review/2026-04-29-010050/findings.json
Severity counts: blocker 0 · major 45 · minor 13
```

After this, the AI (Artificial Intelligence) reviews the findings and matrix results, and compares the targeted screenshots of the problematic areas of the website:
- if the reported finding plus the screenshot(s) are clearly problematic, it flags them and proposes a solution for the issue at hand
- if it's uncertain whether the finding is an actual issue or by design, it flags that in the report and proposes either a fix or marking it as a false positive in future runs
- if it's more than 80% confident the finding is a false positive, it still flags it in the report as `wontfix-likely` and checks with me whether it should be suppressed in the future
- if previous-run memory already has a decision on the same false positive, the finding is dropped from the report as redundant
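The triage rules above amount to a small decision function. A hedged sketch (the field names, threshold, and memory shape are my illustration; the real pipeline is prompt-driven, not this literal function):

```typescript
// Illustration of the triage flow -- the Finding shape, verdict names, and
// the 80% threshold mirror the rules described above, but none of this is
// the actual implementation.
type Verdict = "fix-proposed" | "needs-review" | "wontfix-likely" | "dropped";

interface Finding {
  fingerprint: string;             // stable id used to match previous-run decisions
  clearlyProblematic: boolean;     // finding + screenshots obviously show a defect
  falsePositiveConfidence: number; // 0..1 estimate that it's a false positive
}

// `memory` holds suppression decisions carried over from previous runs.
function triage(f: Finding, memory: Map<string, boolean>): Verdict {
  if (memory.get(f.fingerprint)) return "dropped";  // redundant: already ruled out
  if (f.clearlyProblematic) return "fix-proposed";  // flag + propose a fix
  if (f.falsePositiveConfidence > 0.8) return "wontfix-likely"; // flag, confirm with me
  return "needs-review";                            // uncertain: verify via Playwright MCP
}
```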
It then verifies the uncertain scenarios via Playwright MCP (Model Context Protocol), checking whether it can reproduce each finding and whether it's worth keeping or omitting, then hands over a final report built with the following instruction about grouping:
```markdown
### Grouping — pick the shortest viable axis
Two findings are **identical** when severity, kind, selector (or selector pattern), Computed value, Expected value, and Proposed fix all match. Identical findings get rendered ONCE with a list of (device-orientation, theme, route) cells underneath — never repeated per cell.
Choose the report's primary group axis based on which produces the shorter file for THIS run:
- **Group by finding** when the same kind/selector/fix repeats across many cells (e.g. one CSS rule explains 22 cells across every device). The report becomes a finding-first list with cell occurrences nested under each finding.
- **Group by `<device-orientation>` then route** when findings are mostly cell-specific and don't repeat (e.g. a typography drift only on iPad portrait, an overflow only on iPhone SE). The report becomes a viewport-first list.
- **Hybrid** is allowed: a "Cross-cell" section at the top for findings that span ≥ 3 cells (rendered once with an occurrence list), then per-viewport sections for the remainder. Use only when neither pure axis is clearly shorter.
State the chosen axis in one line right under the run summary: `Grouping: by-finding` / `Grouping: by-viewport` / `Grouping: hybrid (cross-cell + by-viewport)`.
```

Then I just review the pre-reviewed/summarized/grouped report, with direct links to the specific screenshots for each finding and the proposed solutions, if any.
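The "shortest viable axis" choice can be approximated with simple counts. A sketch (the actual prompt leaves the length judgment to the model; only the ≥ 3-cell rule comes from the instruction above):

```typescript
// Approximation of the grouping heuristic: findings that repeat across many
// cells favour a finding-first report, one-off findings favour a
// viewport-first report, and a mix yields the hybrid layout.
type Axis = "by-finding" | "by-viewport" | "hybrid (cross-cell + by-viewport)";

interface GroupedFinding {
  fingerprint: string; // severity + kind + selector + computed/expected + fix
  cells: string[];     // (device-orientation, theme, route) occurrences
}

function chooseAxis(findings: GroupedFinding[]): Axis {
  const crossCell = findings.filter((f) => f.cells.length >= 3).length;
  if (crossCell === findings.length) return "by-finding"; // everything repeats widely
  if (crossCell === 0) return "by-viewport";              // everything is cell-specific
  return "hybrid (cross-cell + by-viewport)";             // mixed run
}
```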
A similar setup is doable by hand — an experienced Senior Quality Engineer would knock it out in hours. I'm not a testing professional, so my progress without help is slow and cautious.
Lighthouse + a11y (Accessibility)
On top of that, I have a separate script which runs 44 Lighthouse audits (desktop + mobile, light and dark theme) across 11 cornerstone pages, using Playwright and axe-core:
```
Done — 11 pages × 2 devices × 2 themes = 44 reports.
```

After that, AI (Artificial Intelligence) analyzes all the reports and summarizes what should be done per page/device/theme combination to improve the scores.
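Aggregating the 44 reports is straightforward, since Lighthouse writes each category score as a 0-1 value under `categories.*.score` in its JSON output. A sketch of the kind of summary that review step works from (the surrounding `Report` shape is my own plumbing, not a real schema):

```typescript
// Sketch: flag every page/device/theme combo whose Lighthouse category score
// falls below a threshold. Only `categories.*.score` (0..1) matches the real
// Lighthouse JSON; the rest of the Report shape is hypothetical plumbing.
interface Report {
  page: string;
  device: "desktop" | "mobile";
  theme: "light" | "dark";
  categories: Record<string, { score: number }>;
}

function belowThreshold(reports: Report[], threshold = 0.9): string[] {
  const flagged: string[] = [];
  for (const r of reports) {
    for (const [category, { score }] of Object.entries(r.categories)) {
      if (score < threshold) {
        // Lighthouse UIs show scores as 0-100, so scale for readability.
        flagged.push(
          `${r.page} [${r.device}/${r.theme}] ${category}: ${Math.round(score * 100)}`
        );
      }
    }
  }
  return flagged;
}
```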
Some of this is duplicate work — both pipelines test a11y (Accessibility). I could drop it from one, but the duplicated effort is negligible and the overlap gives me a useful cross-check.
Stack
Eventually I migrated the original vanilla React + webpack project to Next.js, mostly for SSG (Static Site Generation), and ditched Ant-Design as it wasn't easily customizable. Components are mine, co-written with Claude Code.
Keystatic is the Headless CMS (Content Management System). Content lives in the same repository — no servers, no databases, no Node runtime in production. GitLab CI/CD (Continuous Integration / Continuous Delivery) bumps versions, builds the static site on every commit, and deploys.
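For reference, a Keystatic setup along these lines is just a TypeScript config file committed to the repo. A hedged sketch (the collection name, paths, and fields are placeholders, not this site's actual content model; only the `@keystatic/core` imports are the library's real API):

```typescript
// keystatic.config.ts -- sketch only: the "posts" collection, its path, and
// its fields are made-up placeholders for illustration.
import { config, collection, fields } from "@keystatic/core";

export default config({
  // Local storage keeps content as files in the same repository,
  // so the static build needs no server, database, or Node runtime.
  storage: { kind: "local" },
  collections: {
    posts: collection({
      label: "Posts",
      slugField: "title",
      path: "content/posts/*",
      schema: {
        title: fields.slug({ name: { label: "Title" } }),
        content: fields.document({ label: "Content", formatting: true, links: true }),
      },
    }),
  },
});
```

Switching `storage` to the GitHub kind makes the admin UI commit straight to the repository instead, which fits the no-runtime setup described above.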
The static output could run on UpCloud (referral link separately), Cloudflare Pages, or wherever I decide. Technically I could do everything end-to-end inside GitLab or GitHub if I wanted.
Where it's at
The site you're reading this on is shipped: SSG (Static Site Generation) build, Keystatic for content, the visual + Lighthouse pipelines running on every meaningful change.
What's still on my list:
- more groundwork and wireframes before I focus fully on content
- probably a separate Project post on this site itself with code examples
- possibly a public repo, or a stripped boilerplate if anyone else wants to give it a go
The honest reason this sat unfinished wasn't procrastination — it was the effort-to-gain ratio. My own quality bar was steep: proper support across devices, viewports, and orientations; full WCAG (Web Content Accessibility Guidelines) compliance; automated visual and a11y (Accessibility) testing; a CI/CD (Continuous Integration / Continuous Delivery) pipeline; Lighthouse scores in the high 90s on both desktop and mobile. I could've done all of it by hand — I had the knowledge and experience. But the free-time hours it would've cost me, on top of a full-time job, made it hard to justify. Agentic coding changed that equation — the high-effort parts of reaching my own standards finally felt doable. So I pushed it to the finish line: a place to gather my work, document projects in real depth, and write about whatever I find worth sharing.