Is your data ready for AI?
AIDRIN is open-source infrastructure that quantitatively assesses and improves your dataset's readiness for AI and machine learning, then helps you remediate what's holding it back.
How it works
Inspect. Remediate. Transform.
AIDRIN is more than an assessment tool. It closes the loop from inspection to an AI-ready dataset.
Inspect
Quantitatively assess readiness across six dimensions.
Remediate
Apply built-in remedies to fix detected issues.
Transform
Export a cleaned, AI-ready dataset.
Assessment
Seven readiness dimensions
Six core dimensions, color-coded across the readiness spectrum, plus AI Application-Specific readiness that cuts across them all.
Data Quality
Completeness, outliers, duplicates, and overall integrity.
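As a rough illustration of what these checks measure, here is a minimal standard-library sketch. It is not AIDRIN's implementation; the toy rows, the function names, and the IQR rule for outliers are all assumptions chosen for the example.

```python
from statistics import quantiles

# Toy table (hypothetical data): each row is a dict, None marks a missing value.
rows = [
    {"age": 25, "income": 40_000},
    {"age": 31, "income": None},
    {"age": 25, "income": 40_000},   # exact duplicate of the first row
    {"age": 47, "income": 52_000},
    {"age": 38, "income": 900_000},  # unusually large income
    {"age": 29, "income": 45_000},
]


def completeness(rows, column):
    """Fraction of rows where `column` is present (not None)."""
    present = sum(1 for r in rows if r[column] is not None)
    return present / len(rows)


def duplicate_fraction(rows):
    """Fraction of rows that exactly repeat an earlier row."""
    seen = set()
    repeats = 0
    for r in rows:
        key = tuple(sorted(r.items()))  # hashable, order-independent row key
        if key in seen:
            repeats += 1
        seen.add(key)
    return repeats / len(rows)


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR], using inclusive quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


incomes = [r["income"] for r in rows if r["income"] is not None]
print("income completeness:", completeness(rows, "income"))
print("duplicate fraction:", duplicate_fraction(rows))
print("income outliers:", iqr_outliers(incomes))
```

The IQR rule is only one common convention for flagging outliers; a real assessment may use a different rule or threshold.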
Data Governance
Privacy, sensitivity, and responsible-use signals.
Understandability & Usability
Documentation, metadata, and ease of reuse.
Fairness & Bias
Class imbalance and representation across groups.
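Class imbalance can be summarized in several ways. The sketch below shows two common ones, the majority-to-minority ratio and a normalized Shannon entropy, on hypothetical labels. The function names are illustrative and this is not necessarily how AIDRIN scores imbalance.

```python
from collections import Counter
from math import log


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def normalized_entropy(labels):
    """Shannon entropy of the class distribution, scaled to [0, 1].

    1.0 means perfectly balanced classes; values near 0 mean one class dominates.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        return 0.0  # a single class carries no balance information
    total = sum(counts.values())
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(len(counts))


# Hypothetical labels: 90 samples of one class, 10 of the other.
labels = ["approved"] * 90 + ["denied"] * 10
print("imbalance ratio:", imbalance_ratio(labels))
print("normalized entropy:", round(normalized_entropy(labels), 3))
```

The same two functions can be applied to a sensitive attribute (for example, a group column) to get a first look at representation across groups.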
Impact on AI
Feature relevance and correlation that shape model outcomes.
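One simple proxy for feature relevance is the absolute Pearson correlation between each numeric feature and the target. The sketch below is a minimal standard-library version on made-up data; the helper names are hypothetical and AIDRIN may compute relevance differently.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, treated as 0 here
    return cov / sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort features by absolute correlation with the target, strongest first."""
    scores = {name: pearson(column, target) for name, column in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical feature columns and target.
features = {
    "hours": [1, 2, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "constant": [7, 7, 7, 7, 7],
}
target = [2, 4, 6, 8, 10]

ranked = rank_features(features, target)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Pearson correlation only captures linear relationships, so a low score here does not prove a feature is irrelevant.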
Structure & Organization
Schema, formats, and structural consistency.
AI Application-Specific
Cross-cutting
Readiness judged against the needs of your specific AI application, cutting across all six dimensions rather than standing apart from them.
Access
Use AIDRIN your way
One engine, six ways in, from a zero-setup browser app to an AI agent driving it for you.
Web Inspector
Upload and assess datasets in your browser. No setup.
Python Library
pip install and score datasets in scripts and notebooks.
CLI
Agent-ready
Headless command-line evaluation, scriptable and CI-friendly.
MCP Server
Agent-ready
Expose AIDRIN to AI agents via the Model Context Protocol.
Globus Remote Compute
Run metrics on remote datasets without transferring files.
LLM Explanations
Generate plain-language explanations of metric results.
Built to extend
Agentic Evaluation
Let an AI agent inspect, remediate, and report autonomously via the CLI and MCP server.
Custom Metrics
Define your own metrics and remedies through an extensible framework.
OpenTelemetry
Emit traces and metrics for observability into evaluation runs with OpenTelemetry support.
APPFL
Assess data readiness inside privacy-preserving federated learning workflows.
Inputs
Bring your data
AIDRIN reads the formats scientific and ML datasets actually ship in.
- CSV
- Excel
- JSON
- Parquet
- NumPy
- HDF5
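The Python quick start on this page passes each dataset as a `file_info` tuple of `(path, name, type)`. A small helper can build that tuple from a path and reject formats outside the list above. This is only a sketch: `make_file_info` is a hypothetical helper, not part of AIDRIN, and the extension-to-format mapping is an assumption about which suffixes correspond to the listed formats.

```python
from pathlib import Path

# Assumed mapping from file suffix to the formats listed above.
# The suffixes AIDRIN actually accepts may differ.
ASSUMED_FORMATS = {
    ".csv": "CSV",
    ".xls": "Excel",
    ".xlsx": "Excel",
    ".json": "JSON",
    ".parquet": "Parquet",
    ".npy": "NumPy",
    ".npz": "NumPy",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
}


def make_file_info(path):
    """Build a (path, name, type) tuple, where type is the lowercase suffix."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix not in ASSUMED_FORMATS:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    return (str(path), p.name, suffix)


print(make_file_info("data/adult.csv"))
```

Validating the suffix up front gives a clear error before any metric runs, which is convenient in scripts and CI jobs.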
Get started
Up and running in minutes
Install the Python library, drive it from the CLI, run it from source, or self-host the full web app.
# pip install aidrin
from aidrin import calculate_completeness, calculate_outliers

# file_info = (path, name, type)
file_info = ("data/adult.csv", "adult.csv", ".csv")

calculate_completeness(file_info)
# {'Overall Completeness': 0.97, 'Completeness scores': {...}}
calculate_outliers(file_info)
# {'Outlier scores': {...}}

aidrin list                        # available metrics
aidrin data-quality data.csv       # completeness, duplicity, outliers
aidrin run completeness data.csv   # a single metric

git clone https://github.com/idtlab/AIDRIN.git
cd AIDRIN
conda create -n aidrin-env python=3.10 -y
conda activate aidrin-env
python -m pip install -e .

# 1) Redis
redis-server --port 6379
# 2) Celery worker
PYTHONPATH=. celery -A worker.make_celery worker --beat --loglevel=info
# 3) Flask app -> http://127.0.0.1:5000
flask --app 'web:create_app()' run --debug

Research
Backed by peer-reviewed research
AIDRIN grows out of published work on data readiness for AI (2024–2025).
- Data Readiness for AI: A 360-Degree Survey
  Hiniduma, S. Byna, J. L. Bez
  ACM Computing Surveys 57(9):219 · 2025
- AIDRIN: A Comprehensive Toolset for Automating Data Preparation for AI
  Hiniduma, J. L. Bez, R. Madduri, S. Byna
  SC25 (poster) · 2025
- AIDRIN 2.0: A Framework to Assess Data Readiness for AI
  Hiniduma, D. Ryan, S. Byna, J. L. Bez, R. Madduri
  SSDBM 2025 (poster) · 2025
- CADRE: Customizable Assurance of Data Readiness in Privacy-Preserving Federated Learning
  Hiniduma, Z. Li, A. Sinha, R. Madduri, S. Byna
  e-Science '25 · 2025
- AI Data Readiness Inspector (AIDRIN) for Quantitative Assessment of Data Readiness for AI
  Hiniduma, S. Byna, J. L. Bez, R. Madduri
  SSDBM '24 · 2024
The bottom line
Stop guessing whether your data is ready.
One engine, every interface. Measure, fix, and ship an AI-ready dataset from the browser, your code, the command line, or an AI agent.
