Building a Data Quality and Data Governance Framework for an Organization: A Journey of Laying the Foundation of a Data System
Shared post: Long Bui on LinkedIn
Proud to contribute alongside experts across AI, Data, and Cybersecurity in building a digitally resilient future. Stay tuned.
From the perspective of a data engineer amid the momentum of AI adoption, focusing on data quality and governance
In the age of AI adoption, people are using AI to generate and create products. Developers and builders have been developing new Gen AI models with larger token support, multi-agent setups, model communication protocols, and integration connections, and embedding Gen AI workflows into their current work.
One thing is more important than all of that: ensuring quality and governance for the foundation of the Gen AI layer.
Without data, AI is nothing.
So the problem is: how do we ensure data quality and govern data in the platform?
I was a speaker on “Build Data Quality and Data Governance Framework in Age of AI adoption”, which was part of the “Advancing Data & AI Workshop” organized by Microsoft and FPT, where I spoke honestly and shared my experience of the data governance journey. There is no one-size-fits-all solution, but using the right approach and creating a framework brings a company a step closer to reaching AI adoption in a safe and correct manner.
In detail, I split the workshop into 3 parts, which help the C-suite understand, build, and validate the idea.
Part 1: Understanding Data Quality and Governance
Data Quality
Ensuring data is accurate, complete, consistent, and reliable
Accuracy, Completeness, Consistency, Reliability
Data Governance
A system of policies, roles, and processes that manages data throughout its lifecycle
Availability, Usability, Integrity, Security
Why They Matter
Better Decisions, Regulatory Compliance, Operational Efficiency, Competitive Advantage
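To make the quality dimensions above measurable, here is a minimal sketch in plain Python of how two of them, completeness and consistency, could be scored on a small set of records. The field names (`customer_id`, `email`, `country`), the sample data, and the exact scoring rules are assumptions for illustration only; they are not taken from the workshop or from any Microsoft tool.

```python
def completeness(records, required_fields):
    """Share of required field slots that are filled (not None and not empty)."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def consistency(records, key, field):
    """Share of keys whose `field` holds a single value across all their records."""
    values_per_key = {}
    for record in records:
        values_per_key.setdefault(record[key], set()).add(record.get(field))
    if not values_per_key:
        return 1.0
    agreeing = sum(1 for values in values_per_key.values() if len(values) == 1)
    return agreeing / len(values_per_key)


# Hypothetical sample: C1 appears twice with two spellings of the same country.
records = [
    {"customer_id": "C1", "email": "a@example.com", "country": "VN"},
    {"customer_id": "C1", "email": "a@example.com", "country": "Vietnam"},
    {"customer_id": "C2", "email": "", "country": "SG"},
    {"customer_id": "C3", "email": "c@example.com", "country": None},
]

print(completeness(records, ["email", "country"]))   # 6 of 8 slots filled
print(consistency(records, "customer_id", "country"))  # 2 of 3 customers agree
```

Scores like these can be tracked over time as simple KPIs, with thresholds agreed by the data owners.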
Part 2: Building the Frameworks with Microsoft Tools
- Step 1: Assessment of Current Data Landscape
- Step 2: Define Clear Goals & Objectives
- Step 3: Engage Key Stakeholders
- Step 4: Set Policies & Standards
- Step 5: Implement Microsoft Tools
- Step 6: Monitor & Continuously Improve
This methodical approach ensures sustainable outcomes.
Part 3: Applying Frameworks into Organization
Example with Azure services:
Data Governance with Azure Purview
- Unified Data Catalog: Connect & index all data sources for a comprehensive view. Example: Map relationships between web purchases & inventory systems
- Customer Data Classification: Auto-classify PII data to meet GDPR & privacy laws. Example: Tag loyalty member addresses as PII for protection
- End-to-End Data Lineage: Track data movement from source to consumption. Example: Trace product pricing from supply chain to e-commerce
Data Quality with Azure Data Factory
Standardization, Cleansing & Validation
- Standardization Pipelines: Uniform product codes, customer IDs & transaction formats. Example: Normalize SKUs across online & physical stores
- Automated Data Cleansing: Remove duplicates, correct errors & handle missing values. Example: De-duplicate customer records from multiple channels
- Data Validation Rules: Apply business rules to ensure data accuracy. Example: Validate pricing constraints for promotional offers
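The three data quality examples above (normalizing SKUs, de-duplicating customers, validating promotional prices) can be sketched in plain Python as follows. This is only an illustration of the logic, not Azure Data Factory code; in practice such rules would typically live inside a pipeline or data flow. The function names, the email-based matching key, and the 50% maximum discount are all hypothetical assumptions.

```python
import re


def normalize_sku(raw):
    """Uppercase and strip separators, so 'ab-123 ' and 'AB 123' both become 'AB123'."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def dedupe_customers(records):
    """Keep the first record per normalized email (assumed matching key)."""
    seen = set()
    unique = []
    for record in records:
        key = record["email"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def validate_promo_price(list_price, promo_price, max_discount=0.5):
    """Return a list of rule violations; an empty list means the price passes."""
    errors = []
    if promo_price <= 0:
        errors.append("promo price must be positive")
    if promo_price > list_price:
        errors.append("promo price exceeds list price")
    elif promo_price < list_price * (1 - max_discount):
        errors.append("discount exceeds maximum allowed")
    return errors


# Hypothetical records arriving from two sales channels.
customers = [
    {"email": "A@Example.com ", "channel": "web"},
    {"email": "a@example.com", "channel": "store"},
    {"email": "b@example.com", "channel": "web"},
]

print(normalize_sku("ab-123 "))          # same result as normalize_sku("AB 123")
print(len(dedupe_customers(customers)))  # the two spellings of one email collapse
print(validate_promo_price(100, 40))     # a 60% discount violates the assumed limit
```

Returning a list of violations rather than a single boolean makes it easier to report which business rule failed when the checks feed a monitoring dashboard.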
Demo site
I created this demo to show how this works in terms of ensuring governance through measurable metrics and KPIs, using one-data-governance-preview.up.railway.app as a template for custom use cases.