Storage Fundamentals
Data Dimensions
Velocity, Variety, Volume
- Velocity - Speed at which data is read or written, measured in reads per second (RPS) and writes per second (WPS)
- Variety - How structured the data is and how many different structures exist in the data.
- Ranges from highly structured to loosely structured, unstructured, or binary large object (BLOB) data.
- Loosely Structured Data
- Unstructured Data
- BLOB Data
- Volume - Total size of dataset
- Data is collected for two purposes
- Developing valuable insight
- Storage for later use
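As a rough illustration of how velocity and volume might be quantified, here is a minimal sketch. The `WorkloadSample` class, its field names, and the sample numbers are all hypothetical and only serve to make the RPS/WPS and dataset-size ideas concrete.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSample:
    """Hypothetical measurements for one dataset over an observation window."""
    reads: int             # read operations seen in the window
    writes: int            # write operations seen in the window
    window_seconds: float  # length of the observation window
    total_bytes: int       # total size of the dataset

    def reads_per_second(self) -> float:
        # Velocity on the read side (RPS)
        return self.reads / self.window_seconds

    def writes_per_second(self) -> float:
        # Velocity on the write side (WPS)
        return self.writes / self.window_seconds

    def volume_gib(self) -> float:
        # Volume expressed in GiB (2**30 bytes)
        return self.total_bytes / 2**30


# Made-up numbers: 12,000 reads and 3,000 writes in one minute on a 5 GiB dataset
sample = WorkloadSample(reads=12_000, writes=3_000, window_seconds=60, total_bytes=5 * 2**30)
print(sample.reads_per_second(), sample.writes_per_second(), sample.volume_gib())
```

Variety is left out of the sketch on purpose, since it describes the shape of the data rather than a single number.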
Storage Temperature
- Hot Data - Data being actively worked on with new ingests, updates, etc.
- Warm Data - Data is still being actively accessed, but less frequently than hot data
- Cold Data - Needs to be accessed occasionally, but updates to this data are rare.
- Frozen Data - Must be preserved for business continuity or regulatory reasons
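The temperature tiers above could be approximated in code by looking at how recently data was accessed. This is only a sketch: using last-access age as the signal, the 7-day and 90-day windows, and the `retention_only` flag are all assumptions made for illustration, not thresholds taken from these notes.

```python
from datetime import datetime, timedelta
from enum import Enum


class Temperature(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"
    FROZEN = "frozen"


def classify(
    last_access: datetime,
    now: datetime,
    retention_only: bool = False,               # assumed flag: kept only for continuity/regulation
    hot_window: timedelta = timedelta(days=7),    # assumed threshold
    warm_window: timedelta = timedelta(days=90),  # assumed threshold
) -> Temperature:
    """Map a dataset to a temperature tier using last-access age as a proxy."""
    if retention_only:
        return Temperature.FROZEN
    age = now - last_access
    if age <= hot_window:
        return Temperature.HOT
    if age <= warm_window:
        return Temperature.WARM
    return Temperature.COLD


now = datetime(2024, 1, 31)
print(classify(datetime(2024, 1, 30), now))  # accessed yesterday
print(classify(datetime(2023, 1, 1), now))   # untouched for over a year
```

In practice the windows would be tuned per workload, and update frequency may matter as much as read frequency.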
Data Values
- Transient Data - Short-lived data such as clickstream or Twitter data
- Reproducible Data - Contains a copy of useful information, often created to improve performance or simplify consumption.
- Authoritative Data - The source of truth. Difficult or impossible to restore or replace.
- Critical/Regulated Data - Data that a business must retain at all costs.
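One way to put the four value categories to use is to rank them and derive a protection decision from the rank. The sketch below is hypothetical: the numeric ordering and the `needs_backup` rule (back up anything that cannot be regenerated) are assumptions chosen for illustration.

```python
from enum import IntEnum


class DataValue(IntEnum):
    # Ordered from least to most costly to lose (assumed ranking)
    TRANSIENT = 1
    REPRODUCIBLE = 2
    AUTHORITATIVE = 3
    CRITICAL = 4


def needs_backup(value: DataValue) -> bool:
    """Assumed rule: back up data that cannot be regenerated from another source."""
    return value >= DataValue.AUTHORITATIVE


for value in DataValue:
    print(value.name, needs_backup(value))
```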
Block, Object, and File Storage
- Block Storage
- Object Storage
- File Storage