Biomarkers are standardized, deduplicated, and aggregated health metrics derived from raw data coming from HealthKit, Health Connect, and supported wearables. They’re designed to be the easiest “building blocks” for product features—dashboards, personalization, reporting, and engagement—without needing to manage raw samples yourself.
Key Takeaways
- What biomarkers are: clean, consistent metrics (with units, aggregation method, and time window) you can store and use immediately.
- Why they exist: to remove the heavy lifting—deduplicating overlapping sources and normalizing raw data into a consistent format.
- How to use them: dashboards, weekly summaries, personalization rules, CRM/CDP enrichment, segmentation, and analytics.
- When they’re most useful: when you want product-ready values (vs raw sample streams).
- How you receive them: via API (on-demand) or Webhooks (push, real-time/interval-based).
Metric Spec
| Item | Value |
|---|---|
| Product | Biomarkers |
| Data inputs | HealthKit, Health Connect, and wearables |
| Output format | Consistent JSON schema with value, unit, aggregation, periodicity, and a time window |
| Delivery | API + Webhooks |
| Best used for | Product UX, personalization, engagement automation, analytics and reporting |
| Raw alternative | Data Logs (webhook-only raw samples) |
What Biomarkers Are (and what they aren’t)
Biomarkers are best thought of as product-ready metrics, not raw sensor streams.
- Biomarkers are processed outputs: aggregated totals/averages/point-in-time values.
- Biomarkers are deduplicated across overlapping sources (phone + watch + wearable).
- Biomarkers have consistent units and definitions across sources.
If you need raw samples (timestamped, device/app provenance, per-record metadata), you want Data Logs, not biomarkers.
How Biomarkers Work
Sahha’s biomarker pipeline is intentionally simple:
- Collect: raw samples from multiple sources
- Deduplicate: remove overlapping records
- Aggregate: daily totals, averages, or point-in-time values
- Deliver: via API on demand, or via webhooks in real time or at a configured interval
This means you can build around stable “daily objects” rather than streaming, merging, and cleaning raw events.
What Biomarkers Cover
Biomarkers span common “building block” categories. The exact list is large and grows over time—use the Data Dictionary for the full inventory.
Typical categories include:
- Activity (e.g., steps, active duration, active hours, floors climbed, energy burned)
- Sleep (e.g., sleep duration, sleep latency, sleep efficiency, sleep debt, sleep regularity)
- Vitals (e.g., resting heart rate, HRV, VO₂ max, etc. depending on device/source coverage)
- Body (e.g., weight, height, BMI, body fat % where supported)
- Engagement (platform-level signals that can support personalization)
- Reproductive (documented as coming soon in product docs)
Biomarker Schema (What you receive)
Every biomarker uses a consistent shape—making storage and processing straightforward.
Core fields:
- id: idempotent identifier (an update replaces the previous entry with the same id)
- type: biomarker type (e.g., steps, sleep_duration)
- category: activity, sleep, vitals, body, or engagement
- value: the value as a string (parse it according to valueType)
- valueType: long, double, string, or datetime
- unit: e.g., count, minute, bpm, percentage, kcal
- aggregation: total, average, minimum, maximum, or none
- periodicity: daily, weekly, monthly, or none (some biomarkers may also update intraday, as documented)
- startDateTime / endDateTime: the measurement window (ISO 8601)
- createdAtUtc: when the entry was created
Example:
{
  "id": "b7c8d9e0-f1a2-3456-bcde-f78901234567",
  "type": "steps",
  "category": "activity",
  "value": "8432",
  "valueType": "long",
  "unit": "count",
  "aggregation": "total",
  "periodicity": "daily",
  "startDateTime": "2024-09-03T00:00:00+05:00",
  "endDateTime": "2024-09-03T23:59:59+05:00",
  "createdAtUtc": "2024-09-04T05:30:00Z"
}
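Because `value` always arrives as a string, a small helper can convert it according to `valueType`. The following is a minimal sketch, not part of any Sahha SDK: `parse_biomarker_value` is a hypothetical name, and the code assumes that `datetime` values are ISO 8601 strings like the window timestamps.

```python
from datetime import datetime
from typing import Union

BiomarkerValue = Union[int, float, str, datetime]


def parse_biomarker_value(value: str, value_type: str) -> BiomarkerValue:
    """Convert the string `value` into a native type based on `valueType`."""
    if value_type == "long":
        return int(value)
    if value_type == "double":
        return float(value)
    if value_type == "datetime":
        # Assumption: datetime values are ISO 8601. Python < 3.11 rejects a
        # trailing "Z", so it is normalised to an explicit UTC offset first.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    if value_type == "string":
        return value
    raise ValueError(f"Unsupported valueType: {value_type!r}")


# The steps record from the example above, reduced to the fields used here.
biomarker = {"type": "steps", "value": "8432", "valueType": "long"}
steps = parse_biomarker_value(biomarker["value"], biomarker["valueType"])
```

Raising on an unknown `valueType` is a design choice: it surfaces schema changes early instead of silently storing an unparsed string.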
Why Biomarkers Are Useful
1) They’re the easiest path to product UX
Biomarkers are ideal for:
- daily dashboards (sleep duration, steps, HRV)
- “last 7 days” charts
- weekly summaries and progress views
2) They simplify multi-device reality
Many users have multiple data sources. Biomarkers are designed to deliver one clean value per window, rather than making you decide which device “wins” each day.
3) They’re automation-friendly
Biomarkers are stable and easy to reason about in rules:
- If sleep_duration drops below baseline → trigger a recovery prompt
- If active_hours trends down → suggest “movement snacks”
- If resting_heart_rate rises for several days → soften intensity messaging
(If you want deeper decision signals, pair biomarkers with Insights—trends and comparisons.)
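One way to express a "drops below baseline" rule is to compare the latest value with the user's own recent average. This is a sketch under assumptions: `is_below_baseline`, the 85% threshold, and the 7-day minimum are illustrative choices rather than values defined by Sahha, and the history is assumed to hold already-parsed daily `sleep_duration` values in minutes.

```python
from statistics import mean
from typing import Optional, Sequence


def is_below_baseline(
    history: Sequence[float],
    latest: float,
    threshold: float = 0.85,  # hypothetical: "below baseline" means under 85% of the average
    min_days: int = 7,        # hypothetical: require a week of data before judging
) -> Optional[bool]:
    """Return True/False, or None when there is too little history to decide."""
    if len(history) < min_days:
        return None
    baseline = mean(history)
    return latest < baseline * threshold


# Seven previous nights of sleep_duration in minutes (average: 450).
previous_nights = [450, 430, 470, 440, 460, 455, 445]
short_night = is_below_baseline(previous_nights, latest=360)
normal_night = is_below_baseline(previous_nights, latest=400)
too_little_data = is_below_baseline(previous_nights[:3], latest=360)
```

Returning `None` for short histories keeps the rule from firing for new users who have no meaningful baseline yet.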
4) They’re designed for storage + querying
Most biomarker types are perfect for a time-series table keyed by:
- externalId
- type
- startDateTime / periodicity
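A storage layout along these lines can be sketched with SQLite. The table and column names below are hypothetical, and the code assumes your application supplies the `externalId` for the profile each record belongs to. The biomarker `id` serves as the primary key so that an updated record replaces the earlier one, while a secondary index supports lookups by user, type, and window.

```python
import sqlite3
from typing import Dict

SCHEMA = """
CREATE TABLE IF NOT EXISTS biomarkers (
    id              TEXT PRIMARY KEY,
    external_id     TEXT NOT NULL,
    type            TEXT NOT NULL,
    periodicity     TEXT NOT NULL,
    start_date_time TEXT NOT NULL,
    end_date_time   TEXT NOT NULL,
    value           TEXT NOT NULL,
    value_type      TEXT NOT NULL,
    unit            TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_biomarkers_lookup
    ON biomarkers (external_id, type, periodicity, start_date_time);
"""

# Requires SQLite 3.24+ for the ON CONFLICT ... DO UPDATE (upsert) clause.
UPSERT = """
INSERT INTO biomarkers (id, external_id, type, periodicity, start_date_time,
                        end_date_time, value, value_type, unit)
VALUES (:id, :externalId, :type, :periodicity, :startDateTime,
        :endDateTime, :value, :valueType, :unit)
ON CONFLICT(id) DO UPDATE SET
    value = excluded.value,
    value_type = excluded.value_type,
    unit = excluded.unit,
    end_date_time = excluded.end_date_time
"""


def upsert_biomarker(conn: sqlite3.Connection, external_id: str, biomarker: Dict[str, str]) -> None:
    """Insert a biomarker, or overwrite the stored row that has the same id."""
    conn.execute(UPSERT, {**biomarker, "externalId": external_id})


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
sample = {
    "id": "b7c8d9e0-f1a2-3456-bcde-f78901234567",
    "type": "steps",
    "periodicity": "daily",
    "startDateTime": "2024-09-03T00:00:00+05:00",
    "endDateTime": "2024-09-03T23:59:59+05:00",
    "value": "8432",
    "valueType": "long",
    "unit": "count",
}
upsert_biomarker(conn, "user-123", sample)
upsert_biomarker(conn, "user-123", {**sample, "value": "9120"})  # later update, same id
rows = conn.execute("SELECT external_id, value FROM biomarkers").fetchall()
```

After both calls the table holds a single row with the updated value, which is the behaviour you want when the same daily record is delivered more than once.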
When to Use Biomarkers (and when not to)
Use Biomarkers when you want…
- daily/weekly totals and averages for UX
- consistent units and schema
- simplified ingestion and storage
- to avoid building your own cleaning + dedup pipeline
Use Data Logs when you need…
- raw samples and full provenance (device/app, recording method)
- custom analytics at sub-daily resolution
- research/clinical auditability
Data Logs are webhook-only and higher-volume by design.
How to Use Biomarkers in Your Product
Pattern 1: Dashboards and “My Stats”
- Fetch last 7–30 days of biomarkers for charts
- Highlight the latest daily values
- Provide simple explanations (“vs your baseline” or “week-over-week”)
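For a "last N days" chart, the records usually need to be turned into one value per calendar day, with explicit gaps for days that have no data. The helper below is a sketch: `daily_series` is a hypothetical name, the input is assumed to be a list of biomarker dictionaries in the schema shown earlier, and the values are assumed to be numeric.

```python
from datetime import date, datetime, timedelta
from typing import Dict, Iterable, List, Optional


def daily_series(
    biomarkers: Iterable[Dict[str, str]],
    biomarker_type: str,
    end_day: date,
    days: int = 7,
) -> List[Optional[float]]:
    """Return one value per day, oldest first, with None for missing days."""
    by_day: Dict[date, float] = {}
    for b in biomarkers:
        if b["type"] != biomarker_type or b["periodicity"] != "daily":
            continue
        # The date is taken in the record's own UTC offset (the user's local day).
        local_day = datetime.fromisoformat(b["startDateTime"]).date()
        by_day[local_day] = float(b["value"])  # assumes a numeric valueType
    first_day = end_day - timedelta(days=days - 1)
    return [by_day.get(first_day + timedelta(days=i)) for i in range(days)]


records = [
    {"type": "steps", "periodicity": "daily", "value": "7000",
     "startDateTime": "2024-09-01T00:00:00+05:00"},
    {"type": "steps", "periodicity": "daily", "value": "8432",
     "startDateTime": "2024-09-03T00:00:00+05:00"},
    {"type": "steps", "periodicity": "daily", "value": "9100",
     "startDateTime": "2024-09-04T00:00:00+05:00"},
]
series = daily_series(records, "steps", end_day=date(2024, 9, 4), days=4)
```

Here 2 September has no record, so its slot is `None`; a chart can render that as a gap or a "Not available" state instead of a misleading zero.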
Pattern 2: Personalization and engagement
- Map biomarker types into feature flags and content routing
- Use “baseline framing” (user vs their usual) to avoid shame-based comparisons
- Add guardrails (e.g., act only after 3 days or a weekly trend)
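The "act only after 3 days" guardrail can be sketched as a streak check over recent daily values. The function names and the threshold below are hypothetical; the input is assumed to be a list of daily values ordered oldest to newest, with `None` for days without data.

```python
from typing import Optional, Sequence


def consecutive_days_below(values: Sequence[Optional[float]], threshold: float) -> int:
    """Count the most recent consecutive days whose value is below the threshold."""
    streak = 0
    for value in reversed(values):
        # A missing day or a day at/above the threshold ends the streak.
        if value is None or value >= threshold:
            break
        streak += 1
    return streak


def should_trigger(values: Sequence[Optional[float]], threshold: float, min_days: int = 3) -> bool:
    """Only act once the condition has held for `min_days` days in a row."""
    return consecutive_days_below(values, threshold) >= min_days


# Daily active_hours, oldest first; 6 is an illustrative threshold.
sustained_drop = should_trigger([9, 8, 5, 4, 4], threshold=6)
single_bad_day = should_trigger([9, 8, 5, 7, 4], threshold=6)
```

Treating a missing day as a streak breaker is a conservative choice: the rule never fires on the strength of data that was not recorded.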
Pattern 3: CRM/CDP enrichment
- Store a small set of “core biomarkers” as user attributes:
- sleep: duration, regularity, debt
- activity: steps, active hours, active duration
- vitals: resting HR, HRV (if available)
Then drive:
- lifecycle campaigns
- onboarding paths
- reactivation flows
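Flattening a handful of core biomarkers into user attributes might look like the sketch below. The `biomarker_` attribute prefix, the `to_user_attributes` name, and the chosen set of core types are assumptions for illustration; adapt them to whatever your CRM or CDP expects.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, Set

CORE_TYPES = {"sleep_duration", "steps", "resting_heart_rate"}  # illustrative selection


def _window_end(biomarker: Dict[str, str]) -> datetime:
    return datetime.fromisoformat(biomarker["endDateTime"])


def _native_value(biomarker: Dict[str, str]) -> Any:
    if biomarker["valueType"] == "long":
        return int(biomarker["value"])
    if biomarker["valueType"] == "double":
        return float(biomarker["value"])
    return biomarker["value"]


def to_user_attributes(
    biomarkers: Iterable[Dict[str, str]], core_types: Set[str] = CORE_TYPES
) -> Dict[str, Any]:
    """Keep the most recent daily record per core type as a flat attribute map."""
    latest: Dict[str, Dict[str, str]] = {}
    for b in biomarkers:
        if b["type"] not in core_types or b["periodicity"] != "daily":
            continue
        seen = latest.get(b["type"])
        if seen is None or _window_end(b) > _window_end(seen):
            latest[b["type"]] = b
    return {f"biomarker_{t}": _native_value(b) for t, b in latest.items()}


def _record(type_: str, value: str, day: str) -> Dict[str, str]:
    return {"type": type_, "periodicity": "daily", "value": value, "valueType": "long",
            "endDateTime": f"{day}T23:59:59+05:00"}


attributes = to_user_attributes([
    _record("steps", "7000", "2024-09-02"),
    _record("steps", "8432", "2024-09-03"),
    _record("sleep_duration", "412", "2024-09-03"),
    _record("active_hours", "9", "2024-09-03"),  # not in CORE_TYPES, so skipped
])
```

Keeping the attribute set small mirrors the advice above: a few well-understood values are easier to segment on than a full metric dump.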
Pattern 4: Reporting and analytics
- cohort analysis by biomarker trends
- retention vs movement/sleep patterns
- A/B test outcomes tied to objective behavior signals
Delivery: API vs Webhooks
API (on-demand)
Use the API when you need to fetch biomarkers on demand, for example for:
- profile pages
- coach dashboards
- backfills or batch jobs (via account token and externalId workflows)
Webhooks (push updates)
Use webhooks when you want your DB to stay current automatically.
Important delivery detail:
- Scores and biomarkers support a configurable webhook interval that acts like a deduplication window (batching updates and sending only the final value within the interval).
- Data Logs are always real-time (no interval batching), because each log is a new raw sample.
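To build intuition for the interval behaviour, the toy model below keeps only the last update per biomarker `id` within each interval. It illustrates the semantics described above and is not Sahha's implementation: the fixed, zero-aligned windows and the `(seconds, id, value)` tuple shape are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

Update = Tuple[int, str, str]  # (seconds since some start time, biomarker id, value)


def final_values_per_window(updates: Iterable[Update], interval_seconds: int) -> List[Update]:
    """Return, for each window and biomarker id, only the last update seen."""
    latest: Dict[Tuple[int, str], Update] = {}
    for update in sorted(updates, key=lambda u: u[0]):
        timestamp, biomarker_id, _value = update
        window = timestamp // interval_seconds  # assumption: windows aligned to zero
        latest[(window, biomarker_id)] = update  # later updates overwrite earlier ones
    return [latest[key] for key in sorted(latest)]


# Four intraday updates to the same steps record, with a 5-minute interval.
updates = [(10, "a", "100"), (200, "a", "250"), (290, "a", "400"), (310, "a", "450")]
delivered = final_values_per_window(updates, interval_seconds=300)
```

In this model the first three updates fall into one window and collapse into a single delivery carrying the final value, while the fourth lands in the next window and is delivered on its own.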
Implementation Suggestions for your Products
- Store biomarkers with upserts: use id as an idempotency key (updates replace the previous entry with the same id).
- Parse using valueType: value is a string, so always parse it based on valueType to avoid type bugs.
- Index by time window: startDateTime / endDateTime define the window; use them for daily rollups and charts.
- Start small: pick 8–15 biomarker types that directly power your product UX, then expand.
- Design for missing coverage: some biomarkers require a wearable or consistent wear time. Handle null/missing values gracefully (hide tiles, show “Not available”, or fall back to other metrics).
- Avoid overreacting to one day: use 7–14 day baselines or Insights (Trends/Comparisons) before triggering heavier interventions.
Common Pitfalls
- Treating estimates as precision: energy burned and some intensity durations vary by device—use “estimated” language.
- Overloading users with metrics: most products perform better with a small set of “hero” biomarkers + context.
- Not handling timezones: always respect the ISO 8601 timestamps in startDateTime / endDateTime.
- Using raw logs for UX: if your goal is a daily chart, biomarkers are almost always the right tool.
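The timezone pitfall is easiest to see with the window from the schema example. The sketch below (helper names are hypothetical) derives the calendar day twice: once in the record's own UTC offset and once after converting to UTC.

```python
from datetime import date, datetime, timezone


def window_local_day(start_date_time: str) -> date:
    """Calendar day in the offset carried by the timestamp (the user's day)."""
    return datetime.fromisoformat(start_date_time).date()


def window_utc_day(start_date_time: str) -> date:
    """Calendar day after converting to UTC; can differ from the user's day."""
    return datetime.fromisoformat(start_date_time).astimezone(timezone.utc).date()


start = "2024-09-03T00:00:00+05:00"
local_day = window_local_day(start)  # 3 September for the user
utc_day = window_utc_day(start)      # midnight at +05:00 is 19:00 UTC on 2 September
```

Bucketing by the UTC day would file this user's 3 September steps under 2 September, so daily charts should generally group by the day in the record's own offset.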
FAQ
Do biomarkers require a wearable?
Some do, some don’t. Many core activity and sleep metrics can be derived from phone + OS health platforms, while certain vitals and sleep efficiency/latency metrics are more wearable-dependent.
How often do biomarkers update?
Biomarkers update based on their periodicity settings (daily/weekly/monthly), and some can update intraday depending on the metric and data availability.
Can I display biomarkers to end users directly?
Yes. Biomarkers are designed to be user-facing. If you want a faster UI path, Sahha Widgets can render common data displays with minimal build effort.
How do biomarkers relate to Scores and Insights?
- Biomarkers: the building blocks (steps, sleep duration, HRV, etc.)
- Scores: higher-level outcomes that combine multiple signals (with factors)
- Insights: analytics on top (trends and comparisons for “what’s changing” and “how am I doing?”)
Notes
This guide is educational and intended for product building. It is not medical advice and should not be used to diagnose health conditions.
References
- Biomarkers (product overview, pipeline, schema, list): https://docs.sahha.ai/docs/products/biomarkers
- Data Dictionary (full list of available outputs): https://docs.sahha.ai/docs/get-started/data-dictionary
- Webhooks (delivery model, interval behavior, event types): https://docs.sahha.ai/docs/connect/webhooks
- Event Reference (payload schemas, including BiomarkerCreatedIntegrationEvent): https://docs.sahha.ai/docs/connect/webhooks/events
- SDK Biomarkers (getBiomarkers examples): https://docs.sahha.ai/docs/connect/sdk/biomarkers
- Data Logs (raw sample alternative): https://docs.sahha.ai/docs/products/logs
- Widgets (pre-built UI components): https://docs.sahha.ai/docs/products/widgets