Data Scientist · AI & ML Engineer · Research Data Analyst
DQA Profiler - Data Quality Assessment & Profiling System
A structured profiling workflow for auditing dataset quality and generating decision-ready quality reports for operations,
analytics, and monitoring teams.
CIGMA Data Profiler is a full-stack Data Quality Assessment (DQA) application that audits incoming datasets,
detects quality risks early, and produces outputs that analysts, data teams, and operations leads can act on directly.
File upload and automatic data nature detection
Column-level profiling with missingness, uniqueness, and anomaly flags
Data quality scoring, root cause analysis, and blast-radius summary
Auto-remediation plan generation for quality improvements
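As an illustration of the column-level profiling listed above, here is a minimal standard-library sketch. It is not the application's actual code: the function names, the list of missing-value tokens, the 20% missingness threshold, and the flag names are all assumptions chosen for the example.

```python
# Assumed placeholder strings that count as "missing" (hypothetical list).
MISSING_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def is_missing(value):
    """Treat None and common placeholder strings as missing."""
    return value is None or str(value).strip().lower() in MISSING_TOKENS


def profile_column(values, missing_threshold=0.2):
    """Return missingness, uniqueness, and simple flags for one column."""
    total = len(values)
    present = [v for v in values if not is_missing(v)]
    missing_ratio = (total - len(present)) / total if total else 0.0
    distinct = len(set(present))
    unique_ratio = distinct / len(present) if present else 0.0

    flags = []
    if missing_ratio > missing_threshold:
        flags.append("high_missingness")
    if present and distinct == 1:
        flags.append("constant")
    if len(present) > 1 and distinct == len(present):
        flags.append("all_unique")  # possible identifier column

    return {
        "count": total,
        "missing_ratio": round(missing_ratio, 3),
        "unique_ratio": round(unique_ratio, 3),
        "flags": flags,
    }


def profile_table(rows):
    """Profile every column of a list-of-dicts table."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


rows = [
    {"id": "1", "city": "Lagos", "age": "34"},
    {"id": "2", "city": "Lagos", "age": ""},
    {"id": "3", "city": "Lagos", "age": "N/A"},
    {"id": "4", "city": "Lagos", "age": "41"},
]
report = profile_table(rows)
```

In this toy table, `id` is flagged as a likely identifier, `city` as constant, and `age` as having high missingness. A real profiler would likely add type-aware checks on top of these basics.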
Core Capabilities
Data type inference: quantitative / qualitative profiling
Outlier detection and numeric distribution labeling
Analysis-fit guidance for feasible statistical/ML approaches
Server-side metrics and async job-status monitoring
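The data-nature inference, outlier detection, and distribution labeling above could look roughly like the following sketch. It assumes Tukey's IQR rule for outliers and a skewness cut-off of 0.5 for labeling; the source does not say which methods the app uses, and every function name here is hypothetical.

```python
import statistics


def infer_nature(values):
    """Label a column 'quantitative' if every non-empty value parses as a number."""
    present = [v for v in values if v not in (None, "")]
    if not present:
        return "unknown"
    try:
        for v in present:
            float(v)
    except (TypeError, ValueError):
        return "qualitative"
    return "quantitative"


def iqr_outliers(numbers, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [x for x in numbers if x < low or x > high]


def label_distribution(numbers, threshold=0.5):
    """Label shape from population skewness; the cut-off is an assumed convention."""
    mean = statistics.fmean(numbers)
    sd = statistics.pstdev(numbers)
    if sd == 0:
        return "constant"
    skew = sum(((x - mean) / sd) ** 3 for x in numbers) / len(numbers)
    if skew > threshold:
        return "right-skewed"
    if skew < -threshold:
        return "left-skewed"
    return "roughly symmetric"


amounts = [10, 12, 11, 13, 12, 11, 95]
nature = infer_nature(["1", "2.5", ""])   # numeric strings count as quantitative
outliers = iqr_outliers(amounts)          # the single extreme value, 95
shape = label_distribution(amounts)       # pulled to the right by 95
```

The IQR rule is used here because it needs no distributional assumption; a z-score or model-based detector would be an equally plausible choice.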
Data Inputs & Transformations
The transformation layer supports practical clean-up workflows before downstream analytics. Users can run duplicate
removal, missing-value treatment, text normalization, and outlier-capping operations directly through the app.
Upload + preview for rapid validation
Duplicate handling and missing-value strategy controls
Column-targeted transformation choices
Downloadable transformed dataset outputs
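The four clean-up operations described in this section can be sketched with the standard library alone. This is an illustrative pipeline under assumed conventions (median fill, Tukey fences for capping, rows as dictionaries), not the app's transformation layer; all function names are hypothetical.

```python
import statistics


def normalize_text(rows, column):
    """Trim, collapse internal whitespace, and lowercase one text column."""
    return [
        {**row, column: " ".join(row[column].split()).lower()}
        if isinstance(row.get(column), str) else row
        for row in rows
    ]


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen, kept = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def fill_missing(rows, column, strategy="median"):
    """Replace None in a numeric column with its median (default) or mean."""
    present = [row[column] for row in rows if row.get(column) is not None]
    fill = statistics.median(present) if strategy == "median" else statistics.fmean(present)
    return [{**row, column: fill} if row.get(column) is None else row for row in rows]


def cap_outliers(rows, column, k=1.5):
    """Clamp a numeric column to the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [{**row, column: min(max(row[column], low), high)} for row in rows]


raw = [
    {"city": " Lagos ", "sales": 10},
    {"city": "lagos", "sales": 10},
    {"city": "Abuja", "sales": None},
    {"city": "Kano", "sales": 12},
    {"city": "Ibadan", "sales": 11},
    {"city": "Enugu", "sales": 13},
    {"city": "Jos", "sales": 500},
]

# Normalize first so that " Lagos " and "lagos" are recognized as duplicates.
cleaned = normalize_text(raw, "city")
cleaned = drop_duplicates(cleaned)
cleaned = fill_missing(cleaned, "sales")
cleaned = cap_outliers(cleaned, "sales")
```

The order of steps matters: text normalization runs before duplicate removal so near-identical rows collapse, and missing values are filled before capping so the fences are computed on a complete column.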
Main DQA interface for upload, profiling, and quality diagnostics.
Reports & AI Insights
Exportable CSV and PDF quality reports
Missing-value summaries and outlier snapshots
AI insights (sync and async jobs) with a job-tracking endpoint
Chat-assistant mode for dataset-focused Q&A
Root cause + blast radius + remediation recommendations
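To show how async insight jobs with a job-tracking lookup might fit together, here is a small `asyncio` sketch. The in-memory `JOBS` store, the status names, and the placeholder `generate_insights` function are assumptions for illustration; the real service's endpoint and storage are not described in the source.

```python
import asyncio
import uuid

JOBS = {}  # assumed in-memory job store; a real deployment would likely persist this


async def generate_insights(summary):
    """Placeholder for the slow AI call; here it only formats the summary."""
    await asyncio.sleep(0.01)
    return f"{summary['column']}: {summary['missing_ratio']:.0%} missing"


async def _run(job_id, summary):
    """Execute one job and record its outcome in the store."""
    JOBS[job_id]["status"] = "running"
    try:
        JOBS[job_id]["result"] = await generate_insights(summary)
        JOBS[job_id]["status"] = "done"
    except Exception as exc:
        JOBS[job_id].update(status="failed", error=str(exc))


def submit_job(summary):
    """Register a job and schedule it; returns the id the client polls with."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "queued", "result": None}
    # Keep a reference to the task so it is not garbage-collected mid-run.
    JOBS[job_id]["task"] = asyncio.create_task(_run(job_id, summary))
    return job_id


def job_status(job_id):
    """What a job-tracking endpoint might return for a given id."""
    job = JOBS.get(job_id)
    if job is None:
        return {"status": "not_found"}
    return {"status": job["status"], "result": job["result"]}


async def main():
    job_id = submit_job({"column": "age", "missing_ratio": 0.5})
    first = job_status(job_id)      # polled before the task has had a chance to run
    await JOBS[job_id]["task"]      # stand-in for the client polling later
    return first, job_status(job_id)


first, final = asyncio.run(main())
```

The first poll reports `queued` and the second reports `done` with the result, which is the pattern a client would see when polling a job-status endpoint.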
Analytics panel with profiling summaries and trend views.
Data intelligence score and quality-risk prioritization output.
Architecture & API
Backend: FastAPI service (`backend/app.py`) for analysis, transform, and reporting
Frontend: static HTML/CSS/JS interface served by backend