Data glossary

A quick reference guide for understanding the most common terms used in analytics.
Aggregation
A process that involves searching, gathering and presenting data for consumption.
Anonymization
A process that removes any association between the data and identifiable people within your database, so that the source of the records cannot be discovered.
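A minimal sketch of removing the association between records and identifiable people. The field names (`name`, `email`, `user_id`) and the `subject` token are hypothetical, chosen only for illustration. Each person is given a random token so their records can still be grouped, and the mapping is discarded when the function returns. Real anonymization usually also has to consider indirect identifiers, which this sketch ignores.

```python
import secrets

# Hypothetical field names treated as direct identifiers.
IDENTIFYING_FIELDS = {"name", "email", "user_id"}


def anonymize(records):
    """Return copies of the records with identifying fields removed."""
    tokens = {}  # user_id -> random token; discarded when the function returns
    result = []
    for record in records:
        # Keep only the fields that do not identify a person.
        cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        user = record.get("user_id")
        if user not in tokens:
            tokens[user] = secrets.token_hex(8)
        # Records from the same person share a token, but the token cannot
        # be traced back to the person once the mapping is gone.
        cleaned["subject"] = tokens[user]
        result.append(cleaned)
    return result
```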
Business Intelligence
Strategies and tools used to provide historic, current and predictive views of business operations.
Categorical Data
Data that can be sorted according to defined groups or categories.
Clickstream Analytics
The analysis of users' activity through the items they interact with.
Continuous Data
Data that can take any value within a given range.
Customer Data Platform
A system that aggregates data from multiple sources to create a richer customer profile, often before sending it to third-party tools or a data warehouse.
Dashboard
Provides a high-level overview of key performance indicators relevant to a particular objective or business purpose.
Data Architecture
How data is structured to be useful to the business. This can be broken down into business entities, the relationship between those entities, and the infrastructure required to support this operation.
Data Cleansing
The process of reviewing and revising data to delete duplicate entries, correct misspellings and other errors, add missing data and provide consistency.
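A small sketch of these cleansing steps on a list of rows. The `email` and `country` fields, and the choice of `"Unknown"` as a filler for missing values, are assumptions made for the example rather than a standard.

```python
def cleanse(rows):
    """Normalize, fill in, and de-duplicate rows keyed by email."""
    seen = set()
    cleaned = []
    for row in rows:
        # Provide consistency: trim whitespace and lower-case the email.
        email = (row.get("email") or "").strip().lower()
        # Add missing data: fall back to a placeholder country.
        country = (row.get("country") or "unknown").strip().title()
        # Delete duplicates and rows with no usable key.
        if not email or email in seen:
            continue
        seen.add(email)
        cleaned.append({"email": email, "country": country})
    return cleaned
```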
Data Integration
The process of combining data from different sources and presenting it in a single view.
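As an illustration, the sketch below combines rows from several sources into a single view per customer. The `customer_id` key and the example source names are hypothetical. When two sources provide the same field, the later source wins in this sketch.

```python
def integrate(*sources, key="customer_id"):
    """Merge rows from several sources into one record per key value."""
    view = {}
    for rows in sources:
        for row in rows:
            # Create the combined record on first sight, then add this
            # source's fields to it (later sources overwrite earlier ones).
            view.setdefault(row[key], {}).update(row)
    return view


crm = [{"customer_id": 1, "name": "Ada"}]
billing = [{"customer_id": 1, "plan": "pro"}, {"customer_id": 2, "plan": "free"}]
single_view = integrate(crm, billing)
```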
Data Integrity
The measure of trust an organization has in the accuracy, completeness, timeliness and validity of the data.
Data Lake
A storage repository with no predefined schema, usually used to store raw, unprocessed event data.
Data Modelling
The practice of defining the structure of data, either to show functional and technical people which data the business processes need, or to communicate a plan for how data will be stored and accessed.
Data Pipeline
The process in which data moves from one or more sources to one or more destinations.
Data Warehouse
A structured database used for reporting and data analysis.
Discrete Data
Data that can only take certain distinct values, such as whole-number counts.
ETL (Extract, Transform, Load)
Three database functions combined into one tool that pulls data out of one database and places it into another, often while transforming it into a more useful structure.
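The three steps can be sketched with Python's built-in `sqlite3` module. The table names (`orders`, `orders_clean`) and columns are hypothetical, and the transformation (cents to a decimal amount, upper-cased currency codes) is only one example of a "more useful structure".

```python
import sqlite3


def run_etl(source, target):
    """Copy orders from the source database to the target, reshaping them."""
    # Extract: pull the raw rows out of the source database.
    rows = source.execute(
        "SELECT id, amount_cents, currency FROM orders"
    ).fetchall()

    # Transform: convert cents to a decimal amount and normalize currency codes.
    transformed = [
        (order_id, cents / 100, currency.upper())
        for order_id, cents, currency in rows
    ]

    # Load: write the reshaped rows into the target database.
    target.execute(
        "CREATE TABLE IF NOT EXISTS orders_clean "
        "(id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    target.executemany("INSERT INTO orders_clean VALUES (?, ?, ?)", transformed)
    target.commit()
    return len(transformed)
```

Both arguments are open `sqlite3` connections, for example `sqlite3.connect(":memory:")` for a quick experiment.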
Event Tracking
The recording of user interactions within your product for later analysis.
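A tracked event is typically a small record of what happened, who did it, and when. The sketch below assumes a hypothetical `track` helper and an in-memory list as the destination. It is not the API of any particular analytics tool.

```python
import time


def track(event_name, user_id, properties=None, sink=None):
    """Build an event record and, if a sink is given, append it there."""
    event = {
        "event": event_name,
        "user_id": user_id,
        "properties": dict(properties or {}),
        "timestamp": time.time(),  # seconds since the Unix epoch
    }
    if sink is not None:
        sink.append(event)
    return event


event_log = []
track("Signup Completed", "u1", {"plan": "free"}, sink=event_log)
```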
Qualitative Data
Data that is descriptive and collected through observation. It tells you the why.
Quantitative Data
Data that is always numerical. It tells you the what.
Real Time Data
Data that is created, processed, stored, analyzed and visualized within milliseconds.
SQL (Structured Query Language)
A language for retrieving and managing data in a relational database.
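For example, the query below counts events by name. It runs against an in-memory SQLite database through Python's built-in `sqlite3` module. The `events` table and its contents are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("u1", "page_view"), ("u2", "page_view"), ("u1", "signup")],
)

# Retrieve the number of events per event name, most frequent first.
rows = conn.execute(
    "SELECT name, COUNT(*) AS n FROM events GROUP BY name ORDER BY n DESC, name"
).fetchall()
print(rows)  # [('page_view', 2), ('signup', 1)]
```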
Statistical Significance
A measure indicating that the outcome of an experiment is unlikely to have occurred by chance alone.
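One common way to check this for an A/B test on conversion rates is a two-proportion z-test, sketched below with only the standard library. The choice of test and the conventional 0.05 threshold are assumptions made for the example. Other experiments call for other tests.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rate between A and B.

    Assumes large samples and a pooled rate strictly between 0 and 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the hypothesis that A and B do not differ.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Probability of a |z| at least this large under the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


p_value = two_proportion_p_value(100, 1000, 150, 1000)
is_significant = p_value < 0.05
```

A small p-value means a difference this large would rarely appear if the two variants truly performed the same.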
Stream Processing
Lets you react to real-time streaming data by continuously querying it, using services like Kafka.
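The idea of a continuous query can be sketched without any streaming service: the result is updated as each event arrives rather than computed once over stored data. Here a plain Python iterator stands in for the stream, and no Kafka client is used. The event shape is hypothetical.

```python
from collections import Counter


def running_counts(stream):
    """Yield (event name, count so far) after every incoming event."""
    counts = Counter()
    for event in stream:
        counts[event["event"]] += 1
        # Emit the updated result immediately instead of waiting for the
        # stream to end.
        yield event["event"], counts[event["event"]]


incoming = iter([{"event": "click"}, {"event": "view"}, {"event": "click"}])
updates = list(running_counts(incoming))
```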
Structured Data
Data that is organized according to a predetermined format.
Transactional Data
Data that relates to the operation of a business. For example, an order transaction includes a value, the item ordered, and a timestamp for when the transaction occurred.
Unstructured Data
Data that has no predefined data model, for example text, media, and sensor data.