Solar Flare

10 June 2026
Benchmarking Local PII Redaction Tools for Meeting Transcripts

TLDR: Why can’t I just install a thing that strips PII from a markdown file?

Markdown is the default container for text right now. Transcription tools emit it, Obsidian vaults are full of it, and every AI coding agent and chat tool leaves a trail of it behind. A huge amount of text that could contain names, emails, phone...
30 January 2026
Vibe Coding Already Won

Vibe coding already won. If you are still debating whether it’s a good idea, it’s time to reframe the question.

DevOps helped us frame the question: What do I need to do in order to release our code to production multiple times a day? In the same way, we now need to ask: What do I need to do in order...
29 January 2026
Thoughts on Coding Agents and Architecture

Why agentic coding demands new tools for architectural sensemaking, not less human judgment

The software engineering industry is at a crossroads. There’s an emerging tension and increasing failure risk with the widespread adoption of agentic coding. This article argues that software engineering culture has eliminated the opportunities for new developers to learn systems thinking, so we have a...
22 January 2026
AI Psychosis and Attachment Hacking

A heavy-hitting interview from CHT about AI psychosis and the societal perils of attachment hacking, in which they specifically call out the need for evals.

How could we structure an eval to go even deeper than HumaneBench does now? What would be the most meaningful things to measure? How can we formulate the next evals we build to address these...
15 January 2026
HumaneBench Applied in Production

So awesome to see HumaneBench applied in this way!

Erika Anderson said:

Storytell.ai is now running a live dashboard for humane metrics—evaluating real production conversations using HumaneBench.

How humane is your AI? And how would you actually know?

Accuracy is easy to measure. Human impact isn’t—but that doesn’t mean we shouldn’t try. Other startups are following suit. See...
29 December 2025
Ontology is having its moment

Haha I saw that coming! Obviously LLMs (and multi-modal world models to follow) need grounding. Ontologies, knowledge graphs, formal logic, these kinds of things can provide that.

The fads of AI are pretty hilarious. ML is so not cool. ML IS AMAZING! Ontologies are the Way! Ontologies are so passé. ONTOLOGIES ARE THE FUTURE.

Another thing I think will become...
29 November 2025
HumaneBench.ai

Very excited to share that we’ve released HumaneBench.ai, a benchmark that measures not just how well LLMs respond to tough life questions a user might ask, but also how resistant or willing they are to participate in harmful types of interaction with the user when prompted to do so. This is an important field to measure. AI has great potential...
29 November 2025
Benchmarking humanity: Building the infrastructure for humane AI

The results are in from this weekend’s Building Humane Technology hack! We gathered at the Internet Archive (huge thanks for hosting us!) with an amazing team of volunteers who helped rank LLM responses. This gave us a baseline of human-as-a-judge ratings to ground the HumaneBench dataset.

Put simply, how human-friendly is your chatbot, and can we measure that...
29 October 2025
Work Life Balance for Founders

I love to hear this from successful founders. It confirms my sense that practices like 996 are not necessary. There should be room for people to burn like hell and do crazy things to get where they want to be. But there’s a line that’s crossed when it becomes a cultural norm and expectation in an industry that everyone should...
29 October 2025
The Future of SaaS

I mostly agree with this. Dynamically generated UIs and business logic, dynamically sourced data, these things are right at the edge of current agentic capability, so they’re coming soon.

There’s definitely nuance in a few areas that won’t be so easy to just replace with big generalist models. SaaS companies have a lot of value in the sophisticated domain knowledge...
