
Understanding biomarkers: the building blocks of health data

A customer-friendly guide to Sahha Biomarkers—what they are, how they’re produced, what they cover, and how to use them to build product features, personalization, and reporting.

Biomarkers are standardized, deduplicated, and aggregated health metrics derived from raw data coming from HealthKit, Health Connect, and supported wearables. They’re designed to be the easiest “building blocks” for product features—dashboards, personalization, reporting, and engagement—without needing to manage raw samples yourself.


Key Takeaways

  • What biomarkers are: clean, consistent metrics (with units, aggregation method, and time window) you can store and use immediately.
  • Why they exist: to remove the heavy lifting—deduplicating overlapping sources and normalizing raw data into a consistent format.
  • How to use them: dashboards, weekly summaries, personalization rules, CRM/CDP enrichment, segmentation, and analytics.
  • When they’re most useful: when you want product-ready values (vs raw sample streams).
  • How you receive them: via API (on-demand) or Webhooks (push, real-time/interval-based).

Metric Spec

Item | Value
Product | Biomarkers
Data inputs | HealthKit, Health Connect, and wearables
Output format | Consistent JSON schema with value, unit, aggregation, periodicity, and a time window
Delivery | API + Webhooks
Best used for | Product UX, personalization, engagement automation, analytics and reporting
Raw alternative | Data Logs (webhook-only raw samples)

What Biomarkers Are (and what they aren’t)

Biomarkers are best thought of as product-ready metrics, not raw sensor streams.

  • Biomarkers are processed outputs: aggregated totals/averages/point-in-time values.
  • Biomarkers are deduplicated across overlapping sources (phone + watch + wearable).
  • Biomarkers have consistent units and definitions across sources.

If you need raw samples (timestamped, device/app provenance, per-record metadata), you want Data Logs, not biomarkers.


How Biomarkers Work

Sahha’s biomarker pipeline is intentionally simple:

  1. Collect — raw samples from multiple sources
  2. Deduplicate — remove overlapping records
  3. Aggregate — daily totals, averages, or point-in-time values
  4. Deliver — via API (on demand) or webhooks (real time or at a configured interval)

This means you can build around stable “daily objects” rather than streaming, merging, and cleaning raw events.


What Biomarkers Cover

Biomarkers span common “building block” categories. The exact list is large and grows over time—use the Data Dictionary for the full inventory.

Typical categories include:

  • Activity (e.g., steps, active duration, active hours, floors climbed, energy burned)
  • Sleep (e.g., sleep duration, sleep latency, sleep efficiency, sleep debt, sleep regularity)
  • Vitals (e.g., resting heart rate, HRV, VO₂ max, depending on device/source coverage)
  • Body (e.g., weight, height, BMI, body fat % where supported)
  • Engagement (platform-level signals that can support personalization)
  • Reproductive (documented as coming soon in product docs)

Biomarker Schema (What you receive)

Every biomarker uses a consistent shape—making storage and processing straightforward.

Core fields:

  • id: idempotent identifier (updates replace the previous entry with the same id)
  • type: biomarker type (e.g., steps, sleep_duration)
  • category: activity, sleep, vitals, body, engagement
  • value: string value (parse using valueType)
  • valueType: long, double, string, or datetime
  • unit: e.g., count, minute, bpm, percentage, kcal
  • aggregation: total, average, minimum, maximum, none
  • periodicity: daily, weekly, monthly, or none (some types may also update intraday, as documented)
  • startDateTime / endDateTime: the measurement window (ISO 8601)
  • createdAtUtc: when the entry was created

Example:

{
  "id": "b7c8d9e0-f1a2-3456-bcde-f78901234567",
  "type": "steps",
  "category": "activity",
  "value": "8432",
  "valueType": "long",
  "unit": "count",
  "aggregation": "total",
  "periodicity": "daily",
  "startDateTime": "2024-09-03T00:00:00+05:00",
  "endDateTime": "2024-09-03T23:59:59+05:00",
  "createdAtUtc": "2024-09-04T05:30:00Z"
}
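
Because value always arrives as a string, a consumer typically converts it according to valueType before doing any arithmetic. The following is a minimal Python sketch under that assumption; parse_biomarker_value is a hypothetical helper written for this guide, not part of any Sahha SDK.

```python
from datetime import datetime

def parse_biomarker_value(biomarker: dict):
    """Convert the string `value` to a Python type based on `valueType`."""
    raw = biomarker["value"]
    value_type = biomarker["valueType"]
    if value_type == "long":
        return int(raw)
    if value_type == "double":
        return float(raw)
    if value_type == "datetime":
        # Assumes ISO 8601 text; a trailing "Z" is rewritten so that
        # older Python versions can parse it as UTC.
        return datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if value_type == "string":
        return raw
    # Fail loudly on anything not listed in the schema above.
    raise ValueError(f"Unknown valueType: {value_type!r}")

example = {"type": "steps", "value": "8432", "valueType": "long", "unit": "count"}
steps = parse_biomarker_value(example)
```

Raising on an unrecognized valueType is a design choice for the sketch: it surfaces schema changes early instead of silently storing unparsed strings.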

Why Biomarkers Are Useful

1) They’re the easiest path to product UX

Biomarkers are ideal for:

  • daily dashboards (sleep duration, steps, HRV)
  • “last 7 days” charts
  • weekly summaries and progress views

2) They simplify multi-device reality

Many users have multiple data sources. Biomarkers are designed to deliver one clean value per window, rather than making you decide which device “wins” each day.

3) They’re automation-friendly

Biomarkers are stable and easy to reason about in rules:

  • If sleep_duration drops below baseline → trigger a recovery prompt
  • If active_hours trends down → suggest “movement snacks”
  • If resting_heart_rate rises for several days → soften intensity messaging

(If you want deeper decision signals, pair biomarkers with Insights—trends and comparisons.)
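
A rule such as "sleep_duration drops below baseline" could be sketched roughly as below. The 85% threshold, the 7-day minimum, and the function name below_baseline are illustrative assumptions, not values recommended by Sahha; the input is assumed to be already-parsed daily values in minutes, with None for days without data.

```python
from statistics import mean

def below_baseline(history, latest, threshold=0.85, min_days=7):
    """Return True when `latest` is under `threshold` times the mean of `history`.

    `history` holds prior daily values; None marks days with no data.
    With fewer than `min_days` usable days the rule stays silent
    rather than acting on a thin baseline.
    """
    usable = [v for v in history if v is not None]
    if latest is None or len(usable) < min_days:
        return False
    return latest < threshold * mean(usable)

# Eight prior days of sleep_duration in minutes, one of them missing.
sleep_minutes = [450, 430, None, 465, 440, 455, 420, 445]
should_prompt = below_baseline(sleep_minutes, latest=360)
```

Comparing against the user's own recent average, rather than a fixed target, keeps the rule consistent with the baseline framing used elsewhere in this guide.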

4) They’re designed for storage + querying

Most biomarker types are perfect for a time-series table keyed by:

  • externalId
  • type
  • startDateTime / periodicity
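
As an illustration, a time-series table along these lines could look like the SQLite sketch below. The table and column names are hypothetical, and externalId is assumed to be known to your backend and passed in alongside each biomarker. The biomarker id is used as the primary key so that a repeated delivery replaces the earlier row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE biomarkers (
        id              TEXT PRIMARY KEY,  -- idempotent identifier from the payload
        external_id     TEXT NOT NULL,     -- your own user identifier (assumed input)
        type            TEXT NOT NULL,
        periodicity     TEXT NOT NULL,
        start_date_time TEXT NOT NULL,
        end_date_time   TEXT NOT NULL,
        value           TEXT NOT NULL,     -- kept as a string; parse via value_type
        value_type      TEXT NOT NULL,
        unit            TEXT
    )
    """
)
# Supports "one user, one type, a range of days" chart queries.
conn.execute(
    "CREATE INDEX idx_biomarkers_lookup "
    "ON biomarkers (external_id, type, start_date_time)"
)

def upsert_biomarker(conn, external_id, b):
    """Insert a biomarker, replacing any existing row with the same id."""
    conn.execute(
        "INSERT OR REPLACE INTO biomarkers VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (b["id"], external_id, b["type"], b["periodicity"],
         b["startDateTime"], b["endDateTime"],
         b["value"], b["valueType"], b["unit"]),
    )

steps = {
    "id": "b7c8d9e0-f1a2-3456-bcde-f78901234567", "type": "steps",
    "value": "8432", "valueType": "long", "unit": "count",
    "periodicity": "daily",
    "startDateTime": "2024-09-03T00:00:00+05:00",
    "endDateTime": "2024-09-03T23:59:59+05:00",
}
upsert_biomarker(conn, "user-123", steps)
upsert_biomarker(conn, "user-123", {**steps, "value": "9001"})  # later update, same id
rows = conn.execute("SELECT value FROM biomarkers").fetchall()
```

After both calls the table holds a single row with the updated value, which is the behavior the id field is meant to enable.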

When to Use Biomarkers (and when not to)

Use Biomarkers when you want…

  • daily/weekly totals and averages for UX
  • consistent units and schema
  • simplified ingestion and storage
  • to avoid building your own cleaning + dedup pipeline

Use Data Logs when you need…

  • raw samples and full provenance (device/app, recording method)
  • custom analytics at sub-daily resolution
  • research/clinical auditability

Data Logs are webhook-only and higher-volume by design.


How to Use Biomarkers in Your Product

Pattern 1: Dashboards and “My Stats”

  • Fetch last 7–30 days of biomarkers for charts
  • Highlight the latest daily values
  • Provide simple explanations (“vs your baseline” or “week-over-week”)
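
For a "last 7 days" chart, one possible approach is to build a fixed window of dates and fill in whatever daily values exist, leaving gaps as None so the UI can decide how to render them. The sketch below assumes long-valued daily biomarkers such as steps; daily_series is a hypothetical helper.

```python
from datetime import date, datetime, timedelta

def daily_series(biomarkers, end_day, days=7):
    """Return [(day, value_or_None)] for the `days` days ending on `end_day`."""
    by_day = {}
    for b in biomarkers:
        # Use the calendar date in the timestamp's own offset.
        day = datetime.fromisoformat(b["startDateTime"]).date()
        by_day[day] = int(b["value"])
    start = end_day - timedelta(days=days - 1)
    window = [start + timedelta(days=i) for i in range(days)]
    # Days without a biomarker stay in the series as None.
    return [(day, by_day.get(day)) for day in window]

steps = [
    {"startDateTime": "2024-09-01T00:00:00+05:00", "value": "6120"},
    {"startDateTime": "2024-09-03T00:00:00+05:00", "value": "8432"},
]
series = daily_series(steps, end_day=date(2024, 9, 3))
```

Keeping missing days in the series, instead of dropping them, lets a chart show an honest gap rather than connecting unrelated points.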

Pattern 2: Personalization and engagement

  • Map biomarker types into feature flags and content routing
  • Use “baseline framing” (user vs their usual) to avoid shame-based comparisons
  • Add guardrails (e.g., act only after 3 days or a weekly trend)

Pattern 3: CRM/CDP enrichment

  • Store a small set of “core biomarkers” as user attributes:
    • sleep: duration, regularity, debt
    • activity: steps, active hours, active duration
    • vitals: resting HR, HRV (if available)

Then drive:

  • lifecycle campaigns
  • onboarding paths
  • reactivation flows
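
The enrichment step in Pattern 3 might be sketched as picking the most recent daily entry per core type and flattening it into attributes. The choice of core types and the latest_ attribute prefix are assumptions for illustration; your CRM or CDP will have its own naming rules.

```python
from datetime import datetime

# Hypothetical "core" set; pick whichever types your product actually uses.
CORE_TYPES = {"sleep_duration", "steps", "resting_heart_rate"}

def to_user_attributes(biomarkers):
    """Return {"latest_<type>": value} using the newest daily entry per core type."""
    latest = {}
    for b in biomarkers:
        if b["type"] not in CORE_TYPES or b["periodicity"] != "daily":
            continue
        start = datetime.fromisoformat(b["startDateTime"])
        # Keep the entry whose window starts latest.
        if b["type"] not in latest or start > latest[b["type"]][0]:
            latest[b["type"]] = (start, b)
    return {f"latest_{t}": b["value"] for t, (_, b) in latest.items()}

payload = [
    {"type": "steps", "periodicity": "daily", "value": "6120",
     "startDateTime": "2024-09-02T00:00:00+05:00"},
    {"type": "steps", "periodicity": "daily", "value": "8432",
     "startDateTime": "2024-09-03T00:00:00+05:00"},
    {"type": "sleep_duration", "periodicity": "daily", "value": "412",
     "startDateTime": "2024-09-03T00:00:00+05:00"},
    {"type": "floors_climbed", "periodicity": "daily", "value": "9",
     "startDateTime": "2024-09-03T00:00:00+05:00"},
]
attributes = to_user_attributes(payload)
```

Values are left as strings here, as delivered; convert them with valueType first if your CRM needs numeric attributes.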

Pattern 4: Reporting and analytics

  • cohort analysis by biomarker trends
  • retention vs movement/sleep patterns
  • A/B test outcomes tied to objective behavior signals

Delivery: API vs Webhooks

API (on-demand)

Use the API when you need to fetch data on demand for:

  • profile pages
  • coach dashboards
  • backfills or batch jobs (via account token and externalId workflows)

Webhooks (push updates)

Use webhooks when you want your DB to stay current automatically.

Important delivery detail:

  • Scores and biomarkers support a configurable webhook interval that acts like a deduplication window (batching updates and sending only the final value within the interval).
  • Data Logs are always real-time (no interval batching), because each log is a new raw sample.
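
To make the interval behavior concrete, the sketch below shows the general idea of collapsing several updates for the same biomarker id into the last one seen. It illustrates the concept only and is not Sahha's implementation; collapse_updates and the sample ids are hypothetical.

```python
def collapse_updates(updates):
    """Keep only the final update per biomarker id within one interval."""
    final = {}
    for update in updates:
        # Later updates overwrite earlier ones that share an id.
        final[update["id"]] = update
    return list(final.values())

# Three intraday updates to the same daily steps entry, plus one other entry.
interval_updates = [
    {"id": "steps-2024-09-03", "type": "steps", "value": "1000"},
    {"id": "steps-2024-09-03", "type": "steps", "value": "2500"},
    {"id": "sleep-2024-09-03", "type": "sleep_duration", "value": "412"},
    {"id": "steps-2024-09-03", "type": "steps", "value": "4100"},
]
delivered = collapse_updates(interval_updates)
```

In practice this means a longer interval trades freshness for fewer webhook calls, and a receiver that upserts by id ends up with the same stored state either way.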

Implementation Suggestions for Your Product

  1. Store biomarkers with upserts
    • Use id as an idempotency key (updates replace the previous entry with the same id).

  2. Parse using valueType
    • value is a string; always parse it based on valueType to avoid type bugs.

  3. Index by time window
    • startDateTime/endDateTime define the window; use them for daily rollups and charts.

  4. Start small
    • Pick 8–15 biomarker types that directly power your product UX, then expand.

  5. Design for missing coverage
    • Some biomarkers require a wearable or consistent wear time. Handle null/missing values gracefully (hide tiles, show “Not available”, or fall back to other metrics).

  6. Avoid overreacting to one day
    • Use 7–14 day baselines or Insights (Trends/Comparisons) before triggering heavier interventions.

Common Pitfalls

  • Treating estimates as precision: energy burned and some intensity durations vary by device—use “estimated” language.
  • Overloading users with metrics: most products perform better with a small set of “hero” biomarkers + context.
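  • Not handling timezones: startDateTime/endDateTime are ISO 8601 timestamps that can carry a UTC offset (e.g., +05:00); respect that offset when assigning values to calendar days instead of assuming UTC.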
  • Using raw logs for UX: if your goal is a daily chart, biomarkers are almost always the right tool.
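
The timezone pitfall is easy to reproduce. In the sketch below, the same startDateTime from the schema example lands on different calendar days depending on whether its offset is respected or the timestamp is converted to UTC first. The helper names are hypothetical.

```python
from datetime import datetime, timezone

def local_day(biomarker):
    """Calendar day of the window start, in the timestamp's own offset."""
    return datetime.fromisoformat(biomarker["startDateTime"]).date()

def utc_day(biomarker):
    """Calendar day after converting the window start to UTC."""
    start = datetime.fromisoformat(biomarker["startDateTime"])
    return start.astimezone(timezone.utc).date()

b = {"startDateTime": "2024-09-03T00:00:00+05:00"}
print(local_day(b))  # 2024-09-03
print(utc_day(b))    # 2024-09-02
```

For user-facing daily charts, the local day is usually the one the user expects to see.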

FAQ

Do biomarkers require a wearable?

Some do, some don’t. Many core activity and sleep metrics can be derived from phone + OS health platforms, while certain vitals and sleep efficiency/latency metrics are more wearable-dependent.

How often do biomarkers update?

Biomarkers update based on their periodicity settings (daily/weekly/monthly), and some can update intraday depending on the metric and data availability.

Can I display biomarkers to end users directly?

Yes. Biomarkers are designed to be user-facing. If you want a faster UI path, Sahha Widgets can render common data displays with minimal build effort.

How do biomarkers relate to Scores and Insights?

  • Biomarkers: the building blocks (steps, sleep duration, HRV, etc.)
  • Scores: higher-level outcomes that combine multiple signals (with factors)
  • Insights: analytics on top (trends and comparisons for “what’s changing” and “how am I doing?”)

Notes

This guide is educational and intended for product building. It is not medical advice and should not be used to diagnose health conditions.

