Target release:
Epic:
Document status: DRAFT
Document owner: Sally Hong
Designer: Sally Hong
Tech lead: @ lead
Technical writers: @ writers
QA:

Objective

Design the data model and the detailed database schema for storing model training data.

Success metrics

Goal: We should be able to ingest customer retrospective data into a database.
Metric: Successful ingestion of data.

Assumptions

Requirements

Requirement: The data model should be designed so that data scientists are able to train, test, and validate ML models.
Importance: HIGH
Notes:

Requirement: Every customer's data set should be logically separate. Does this data model accomplish that?
Importance: HIGH
Notes:
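
One way to reason about the logical-separation requirement above is sketched below. It assumes a single shared table in which every row carries a customer identifier, and it assumes all reads go through a helper that always filters on that identifier. The table `training_records`, the column `customer_id`, and the helper `fetch_for_customer` are hypothetical names, and SQLite stands in for whichever database is eventually chosen.

```python
import sqlite3

# Hypothetical schema: one shared table, every row tagged with its owning customer.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE training_records (
        record_id   INTEGER PRIMARY KEY,
        customer_id TEXT NOT NULL,   -- owning customer; never NULL
        payload     TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO training_records (customer_id, payload) VALUES (?, ?)",
    [("hospital_a", "a-1"), ("hospital_a", "a-2"), ("hospital_b", "b-1")],
)


def fetch_for_customer(conn, customer_id):
    """Return payloads for one customer only; the customer filter is not optional."""
    rows = conn.execute(
        "SELECT payload FROM training_records "
        "WHERE customer_id = ? ORDER BY record_id",
        (customer_id,),
    )
    return [payload for (payload,) in rows]


hospital_a_rows = fetch_for_customer(conn, "hospital_a")
hospital_b_rows = fetch_for_customer(conn, "hospital_b")
```

A shared table with a customer column is the lightest form of separation. A schema or database per customer would give stronger isolation, and which option fits is still a design decision for this document.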

User interaction and design

Open Questions

Question (Sally Hong): When we go live, we will need to ingest and store the data that we receive via APIs. Will that data be saved in the same data model, or do we need a different data model for it?
Answer:
- The data model (generic) should be the same across all hospitals.
- We can make slight tweaks depending on the production model.
- The model takes into account raw data (CV IDs and demographic data) and, separately, transformed data (post-cleaning, where we denormalized the data).
- Denormalized = consolidated data for data analysis / ML.
Date answered: 11 Apr 2025
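
The split between raw data and denormalized data described in the answer above can be illustrated with a small sketch. All table and column names here (`raw_demographics`, `raw_observations`, `training_rows`, `patient_id`) are invented for illustration and are not taken from the data model diagram. SQLite is used only so that the example runs standalone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Raw layer: data stored as received, one table per source (hypothetical names).
    CREATE TABLE raw_demographics (
        patient_id TEXT PRIMARY KEY,
        birth_year INTEGER,
        sex        TEXT
    );
    CREATE TABLE raw_observations (
        observation_id INTEGER PRIMARY KEY,
        patient_id     TEXT NOT NULL REFERENCES raw_demographics(patient_id),
        value          REAL
    );

    INSERT INTO raw_demographics VALUES ('p1', 1950, 'F'), ('p2', 1972, 'M');
    INSERT INTO raw_observations (patient_id, value) VALUES
        ('p1', 1.0), ('p1', 3.0), ('p2', 5.0);

    -- Transformed layer: one consolidated (denormalized) row per patient,
    -- built after cleaning and intended for analysis and ML.
    CREATE TABLE training_rows AS
    SELECT d.patient_id,
           d.birth_year,
           d.sex,
           COUNT(o.observation_id) AS observation_count,
           AVG(o.value)            AS mean_value
    FROM raw_demographics AS d
    LEFT JOIN raw_observations AS o ON o.patient_id = d.patient_id
    GROUP BY d.patient_id, d.birth_year, d.sex;
    """
)

training_rows = conn.execute(
    "SELECT patient_id, birth_year, sex, observation_count, mean_value "
    "FROM training_rows ORDER BY patient_id"
).fetchall()
```

Keeping the raw tables untouched means the denormalized table can be rebuilt whenever the cleaning rules change, at the cost of storing the data twice.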

Out of Scope

Add details here that are out of scope.

Data Model Design

https://drive.google.com/file/d/1xoFX3-OVtbNjq86J_Kk0B5Pn0LZcXpY0/view?usp=drive_link

Database Design

Key Elements to Include

Tables
Columns and Data Types
Indexes
Normalization
Security
Access Control
Performance
Maintenance
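
As a starting point for the Tables, Columns and Data Types, and Indexes items above, the following is a minimal sketch of what a schema definition could look like. Every name in it (`customers`, `samples`, `dataset_split`, `idx_samples_customer_split`, `try_insert`) is an assumption made for illustration, not part of the agreed design, and SQLite is used only so the sketch is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id TEXT PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE samples (
        sample_id     INTEGER PRIMARY KEY,
        customer_id   TEXT NOT NULL REFERENCES customers(customer_id),
        dataset_split TEXT NOT NULL
            CHECK (dataset_split IN ('train', 'test', 'validate')),
        ingested_at   TEXT NOT NULL   -- ISO 8601 timestamp stored as text
    );
    -- Index for the assumed common query: one customer's rows for one split.
    CREATE INDEX idx_samples_customer_split
        ON samples (customer_id, dataset_split);
    """
)
conn.execute("INSERT INTO customers VALUES ('hospital_a', 'Hospital A')")


def try_insert(customer_id, split):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO samples (customer_id, dataset_split, ingested_at) "
            "VALUES (?, ?, '2025-01-01T00:00:00Z')",
            (customer_id, split),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Column 1 of each PRAGMA index_list row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('samples')")]
```

In this sketch the foreign key rejects rows for unknown customers and the `CHECK` constraint rejects unknown split labels, so those rules live in the schema and not in application code.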