Looking for some solid data principles and tenets to guide your organization’s data decisions and the tradeoffs that come with them? Here they are, copy at will!
These tenets are derived from the principles that form the basis of many data protection and privacy laws around the world, including the European Union's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) in the United States. It's worth noting that the specifics of these laws can vary quite a bit from one jurisdiction to another, but tenets are not laws; they’re simply guiding principles for decision making.
- "Garbage In; Garbage Out": this tenet is foundational to any data-driven initiative. Simply put, the quality of your data outputs can only be as good as the quality of your data inputs. If inaccurate or low-quality data is provided, any analytics or insights derived from it will be flawed. This highlights the importance of rigorous upstream data control mechanisms, be it validation, testing, evolution policies, and so on. As much as possible, shift responsibilities left, toward the point where data is produced.
- "Privacy by Design": data privacy isn’t just a nice-to-have; it’s a legal right in many countries, so include it in your design phase. Organizations should prioritize building privacy into their data systems and processes from the outset. This means acquiring informed consent before data collection, using secure data processing methods, and ensuring that privacy safeguards are integral to your systems, not just afterthoughts.
- "Ask Why! - Data Producers": aka Data Minimization. Data collection should be purposeful and efficient, gathering only the data that is truly necessary for the task at hand. This principle guards against potential security risks and respects the privacy of individuals, while also reducing unnecessary data storage and processing costs further down the line. So ask: why do we need to collect this data?
- "Ask Why! - Data Consumers": aka Purpose Limitation. Data should be collected with specific, explicit, and legitimate purposes in mind. Repurposing or further processing data beyond the originally stated scope should require additional consent or another legal basis. Transparency about how and why data is being used also contributes to trust. So ask: why do we need to process this data?
- "See it; Say It": incorporating robust security measures into the design of systems and practices is paramount. Data security should never be an afterthought but should be an integral part of all data-related processes. This includes protecting against unauthorized access and data breaches, as well as ensuring data is stored securely. But don’t assume these safeguards are perfect. See something off? Say it.
- "Trust, but Verify": ensuring the accuracy, completeness, relevancy, and timeliness of data is not a nice-to-have, it’s the job. Data is only as valuable as it is reliable, so quality should be measured and, for better results, given explicit goals.
- "Two-way Door": if you are collecting information about individuals, then individuals should be able to access the information that is collected about them, why it is collected, and how it is used. Transparency is also crucial for complying with data privacy regulations and has the benefit of fostering trust between users and organizations.
- "Producer-Owner": entities or individuals who collect and process data are responsible for enforcing data principles, laws, and regulations. If missteps or breaches occur, it’s the producer’s responsibility to act on them. Demonstrating compliance with these standards and taking ownership of any mistakes along the way is fundamental for good data governance.
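To make "Garbage In; Garbage Out" and "Trust, but Verify" a bit more concrete, here is a minimal Python sketch of an upstream check that rejects bad records at the producer side and reports a pass rate you can hold against a goal. Everything in it is an assumption made for illustration: the field names, the `REQUIRED_FIELDS` schema, the `QUALITY_GOAL` value, and the `validate` and `split_batch` helpers are hypothetical, not part of any particular tool.

```python
from datetime import datetime, timezone

# Hypothetical schema for illustration: field name -> expected type.
REQUIRED_FIELDS = {
    "user_id": str,
    "email": str,
    "signup_ts": datetime,
}

# Assumed target: 99% of records in a batch should pass validation.
QUALITY_GOAL = 0.99


def validate(record):
    """Return a list of problems found in one record (empty list means valid)."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        value = record.get(name)
        if value is None:
            problems.append(f"missing {name}")
        elif not isinstance(value, expected):
            problems.append(f"{name} is not {expected.__name__}")
    # Deliberately crude format check, just to show a rule beyond types.
    email = record.get("email")
    if isinstance(email, str) and "@" not in email:
        problems.append("email has no @")
    return problems


def split_batch(records):
    """Separate valid records from rejected ones and compute the pass rate."""
    accepted, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    # An empty batch is treated as passing; that is a design choice, not a rule.
    pass_rate = len(accepted) / len(records) if records else 1.0
    return accepted, rejected, pass_rate


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
batch = [
    {"user_id": "u1", "email": "a@example.com", "signup_ts": now},
    {"user_id": "u2", "email": "not-an-email", "signup_ts": now},
    {"user_id": "u3", "email": "c@example.com"},  # signup_ts is missing
]
accepted, rejected, pass_rate = split_batch(batch)
print(f"pass rate: {pass_rate:.0%}, goal met: {pass_rate >= QUALITY_GOAL}")
for record, problems in rejected:
    print(record["user_id"], problems)
```

Rejected records are kept together with the reasons they failed, so the producer who owns the data has something actionable to fix rather than a silent drop.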
We use these eight tenets as a framework to guide our data organization. By following these principles, you can also ensure that your organization is not just maximizing the value of its data, but also using it in a manner that is ethical, responsible, and respectful of individual privacy. Whether you're a data scientist, a data privacy officer, or a business executive, incorporating these tenets into your organization's data strategy will contribute to a robust and trustworthy data ecosystem.
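If you want to see what the two "Ask Why!" tenets could look like in practice, one possible approach is a purpose registry: every use of the data has to name a declared purpose, and only the fields registered for that purpose get through. The sketch below is a hypothetical illustration in Python; the purposes, the field names, and the `minimize` helper are all made up for the example.

```python
# Hypothetical purpose registry: each declared purpose lists the only
# fields it is allowed to use. Purposes and field names are illustrative.
ALLOWED_FIELDS_BY_PURPOSE = {
    "order_fulfilment": {"user_id", "shipping_address"},
    "product_analytics": {"user_id", "country"},
}


def minimize(record, purpose):
    """Keep only the fields registered for a declared purpose."""
    if purpose not in ALLOWED_FIELDS_BY_PURPOSE:
        # No registered purpose means no access: someone has to ask why first.
        raise ValueError(f"no registered purpose {purpose!r}")
    allowed = ALLOWED_FIELDS_BY_PURPOSE[purpose]
    return {key: value for key, value in record.items() if key in allowed}


raw = {
    "user_id": "u1",
    "shipping_address": "1 Main St",
    "country": "FR",
    "birth_date": "1990-01-01",  # not registered for any purpose
}
analytics_view = minimize(raw, "product_analytics")
print(analytics_view)
```

A field like `birth_date` that no purpose claims never leaves the producer, which is a useful prompt to ask whether it needs to be collected at all.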