What an Ethics Audit Taught Us About Building Wellness Software
Earlier this year, we put Daylogue through an independent ethics audit. Not because we had to. No regulation demanded it. We did it because we handle sensitive emotional data and we wanted an outside perspective on whether we were doing it responsibly.
The score was 87 out of 100. Good enough to feel validated. Not perfect enough to feel comfortable. Which is probably exactly where you want to be, because 100 out of 100 would mean you're not looking hard enough.
Here's what happened, what we changed, and what we're still working on.
Why We Did This
Most wellness apps operate in a regulatory gray zone. They're not medical devices, so they don't need FDA approval. They're not handling health records in the HIPAA sense (unless they want to be). They can collect incredibly intimate data about your emotional life and face very few formal requirements about what they do with it.
That bothers us. Not because we were doing anything wrong, but because "not doing anything wrong" is a low bar. We wanted to know: Are we doing this well? Are there places where our good intentions are getting undermined by design choices we haven't examined? Are we creating risks we haven't considered?
An outside audit answers those questions in ways that internal review can't, because you can't see your own blind spots.
What the Audit Examined
The audit covered six areas: privacy and data handling; consent and transparency; engagement design (are we using manipulative patterns?); content safety (what happens when someone is in crisis?); equity and accessibility; and language and framing (are we making clinical claims we shouldn't be making?).
Each area was evaluated on a rubric that weighed both current implementation and design intent, but we didn't get credit for what we merely planned to build. The audit looked at what was actually shipping to users.
What We Were Already Doing Right
The audit validated several things we'd prioritized from the start.
End-to-end encryption. Your check-in data is encrypted with AES-256-GCM before it leaves your device. We cannot read your entries. This isn't a marketing claim. It's an architectural decision that was built in from day one. The auditors verified the implementation.
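The shape of that architecture can be sketched as an encrypt-before-upload roundtrip. This Python example uses the `cryptography` library's AESGCM primitive and is an illustration only, not Daylogue's actual implementation: key derivation, nonce storage, and key management are all assumed away.

```python
# Sketch of client-side AES-256-GCM encryption. The key never leaves
# the device; the server stores only the nonce and ciphertext, so it
# cannot read entries. Illustrative, not a production design.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_entry(key: bytes, plaintext: str) -> tuple[bytes, bytes]:
    """Encrypt a check-in entry before it leaves the device."""
    nonce = os.urandom(12)  # 96-bit nonce, unique per message
    ciphertext = AESGCM(key).encrypt(nonce, plaintext.encode(), None)
    return nonce, ciphertext  # server stores both, reads neither

def decrypt_entry(key: bytes, nonce: bytes, ciphertext: bytes) -> str:
    """Decrypt on-device; the server never holds the key."""
    return AESGCM(key).decrypt(nonce, ciphertext, None).decode()

key = AESGCM.generate_key(bit_length=256)  # 32-byte key stays on device
nonce, ct = encrypt_entry(key, "Felt calmer after the walk today.")
```

The point of the sketch is the trust boundary: anything past `encrypt_entry` is opaque to the server, which is what makes "we cannot read your entries" an architectural fact rather than a policy promise.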
No data selling. We don't sell your data. We don't share it with advertisers. We don't use your entries to train our models without explicit consent. Our business model is subscriptions, not surveillance.
Privacy-first enterprise model. Our enterprise offering uses K-anonymity and anonymous aggregate dashboards. Organizations see trends. They never see individual entries. The individual controls their data completely.
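A K-anonymity guarantee of this kind reduces to a suppression rule: an aggregate is reported only when the group behind it has at least K members. A minimal sketch follows; the threshold, field names, and data shape are assumptions for illustration, not Daylogue's schema.

```python
# Sketch of k-anonymous aggregation: a team's average stress score is
# reported only if the team has at least K members, so no dashboard
# value can be traced back to an individual. Hypothetical schema.
from collections import defaultdict

K = 5  # assumed minimum group size

def aggregate_stress(entries: list[dict]) -> dict[str, float]:
    """entries: [{'team': str, 'stress': int}, ...] -> team averages."""
    groups: dict[str, list[int]] = defaultdict(list)
    for entry in entries:
        groups[entry["team"]].append(entry["stress"])
    # Suppress any group smaller than K entirely, rather than
    # reporting it with a caveat.
    return {team: sum(vals) / len(vals)
            for team, vals in groups.items() if len(vals) >= K}
```

Note that suppression drops small groups outright: a two-person team simply never appears on the dashboard, which is how "organizations see trends, never individuals" is enforced rather than merely promised.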
These weren't surprises. They're core to how we think about the product. But having them independently verified matters. Trust claims are cheap. Verified trust claims are expensive and worth it.
What We Changed
The audit identified several areas where we could do better. Here's what we changed.
Removed all streak mechanics. We had some streak-adjacent features. Subtle things like consecutive check-in counts and notifications that referenced consistency. The audit flagged these as potentially manipulative engagement patterns. The concern was that any gamification around consistency creates guilt for inconsistency, which is the opposite of what a wellness tool should do.
We agreed. We removed all of it. Not just the visible UI elements, but the backend tracking. The infrastructure still exists in the database for compatibility, but nothing reads from or writes to it anymore. When you come back to Daylogue after a break, there's no counter reminding you how long you were gone. Just "welcome back."
Improved crisis protocol. The audit highlighted a gap in our voice check-in experience. If someone expressed something that suggested they might be in crisis, the handoff to crisis resources wasn't immediate enough. We implemented a protocol where the conversation ends within two seconds of triggering crisis resources, and the resources (including 988 Suicide and Crisis Lifeline) are displayed prominently.
We're a wellness tool, not a crisis intervention service. But when someone is in our space and they need help we can't provide, getting them to that help fast is our responsibility.
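A protocol like this boils down to check-then-handoff: if a message trips the detector, the session ends immediately and resources are shown. The sketch below is deliberately simplified; the trigger phrases are placeholders, and keyword matching stands in for real crisis classification, which is far more nuanced.

```python
# Simplified sketch of a crisis handoff: a triggering message ends the
# conversation and surfaces resources at once, with no wind-down turn.
# Keyword matching here is a stand-in for real detection.
CRISIS_PHRASES = ("hurt myself", "end it all", "no reason to live")  # illustrative
CRISIS_RESOURCES = (
    "If you're in crisis, call or text 988 "
    "(Suicide and Crisis Lifeline)."
)

def handle_message(text: str) -> dict:
    """Route a check-in message; end the session on a crisis signal."""
    if any(phrase in text.lower() for phrase in CRISIS_PHRASES):
        return {"end_conversation": True, "show": CRISIS_RESOURCES}
    return {"end_conversation": False, "show": None}
```

The design choice worth noting is that the handoff is unconditional and immediate: no follow-up question, no soft landing, just the resource, because every extra conversational turn is time the person isn't getting real help.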
Reframed clinical-sounding language. The audit caught several places where our language sounded more clinical than it should. Words like "assess" and "intervention" had crept into internal labels that were sometimes visible to users. We went through the entire product and reframed.
"Assess" became "notice." "Therapy prep" became "session summary." "Burnout risk" became "sustained elevated stress." "Anxiety" as a tag became "worry." "Mental health" became "self-care." These aren't just word swaps. They reflect a genuine distinction: Daylogue helps you notice patterns. It doesn't diagnose conditions.
Added scope disclaimer. We added clear messaging about what Daylogue is and isn't. It's a wellness tool for self-awareness. It's not therapy. It's not a substitute for professional help. If you're struggling, talk to a human. This disclaimer now appears in onboarding and on key pages throughout the app.
Age verification and minor protections. The audit flagged that we needed better handling for younger users. We added a birthday verification step in onboarding. Users under 13 are blocked entirely. Users between 13 and 17 are flagged internally so we can build age-appropriate guardrails as we develop them.
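The gating rule reduces to a birthday comparison: under 13 blocked, 13 to 17 flagged, 18 and over standard. A sketch (the tier names are assumptions, not Daylogue's internal labels):

```python
# Sketch of birthday-based age gating. Tier names ('blocked', 'minor',
# 'adult') are illustrative.
from datetime import date

def age_gate(birthday: date, today: date) -> str:
    """Return an access tier from a verified birthday."""
    # Age in whole years, accounting for whether the birthday has
    # occurred yet this year.
    age = today.year - birthday.year - (
        (today.month, today.day) < (birthday.month, birthday.day))
    if age < 13:
        return "blocked"  # COPPA-age users can't sign up
    if age < 18:
        return "minor"    # flagged for age-appropriate guardrails
    return "adult"
```

The off-by-one subtlety is the comparison tuple: a user whose 13th birthday is tomorrow is still 12 today, and the gate has to get that right.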
What We're Still Working On
Transparency means being honest about the gaps, not just the wins.
Minor-specific content guardrails. We flag minor users but we haven't yet built differentiated AI guardrails for them. A 16-year-old's check-in experience should probably differ from an adult's in certain ways, and we're still figuring out exactly how.
AI data handling disclosure. We need to be clearer about exactly how AI processes check-in data. We encrypt everything at rest, but during the processing step, data is temporarily decrypted to generate insights. We should explain this process more transparently.
Dead streak code cleanup. The streak features are disabled, but the dead code still exists in dozens of files. It's not a user-facing issue, but messy code leads to messy thinking. We're cleaning it up over time.
Scope reminder in check-in view. The scope disclaimer appears in onboarding and on narrative pages, but not yet in the check-in chat itself. That's arguably the most important place for it, since that's where users are most actively engaging with the AI.
What This Process Taught Us
Three things stand out.
First, good intentions aren't enough. We built Daylogue with genuine care for our users' wellbeing. But care without examination creates blind spots. The streak mechanics are a perfect example. We didn't add them to manipulate anyone. They just seemed like standard engagement features. It took an outside perspective to show us that "standard" doesn't mean "ethical," especially in a wellness context.
Second, language matters more than you think. The difference between "assess your mood" and "notice your mood" seems small. It's not. One implies clinical evaluation. The other implies gentle attention. In a wellness tool, that distinction shapes how people relate to their own experience. We're not evaluating them. We're helping them pay attention.
Third, transparency builds trust. Publishing our audit score and our gaps is uncomfortable. Nobody likes admitting they didn't get a perfect grade. But the alternative is claiming to be perfect, which is obviously false, or saying nothing, which breeds suspicion. We'd rather be honest about 87 than pretend we're 100.
The Ongoing Work
An ethics audit isn't a one-time event. It's a checkpoint. The score represents a moment in time, not a permanent state. We'll do this again. We'll work to close the gaps we identified, and new gaps will probably emerge as we build new features.
Building wellness software means accepting that you're handling something precious: people's honest reflections about their inner lives. That demands a level of care that goes beyond legal compliance and into genuine ethical responsibility.
We scored 87. We're working on the other 13.