About Multiphasic Labs

We're focused on making AI systems safer in high-stakes, human-facing contexts.

Mission

Multiphasic Labs exists to build practical tooling for evaluating and improving the safety of AI—especially in domains like mental health, where poor behavior can cause real harm. We start with scripted testing and LLM-as-judge evaluation so teams can run repeatable safety checks before deployment.

What we believe

  • Safety is a process, not a one-time check. We build for pipelines: personas, runs, and structured results that fit into CI and review workflows.
  • Tooling should be open and auditable. Our Mental Health Safety Tester is open source so teams can inspect, extend, and adapt it to their context.
  • Testing is not clinical practice. Our tools are for evaluating AI systems only. They must not be used to assess or triage real people in crisis.

Where we're headed

We're early-stage. The Clinical Testing Tool is our first MVP—a building block for offline safety evaluation. We're iterating on personas, criteria, and judge design, and we're interested in feedback from teams building mental-health-oriented or other sensitive-domain AI.

Get in touch or explore the repo on GitHub.