AI Data Readiness Infrastructure

Is your data ready for AI?

AIDRIN is open-source infrastructure that quantitatively assesses your dataset's readiness for AI and machine learning, then helps you remediate what's holding it back.

How it works

Inspect. Remediate. Transform.

AIDRIN is more than an assessment tool. It closes the loop from inspection to an AI-ready dataset.

Inspect

Quantitatively assess readiness across six core dimensions, plus application-specific readiness.

Remediate

Apply built-in remedies to fix detected issues.

Transform

Export a cleaned, AI-ready dataset.

Assessment

Seven readiness dimensions

Six core dimensions, color-coded across the readiness spectrum, plus AI Application-Specific readiness that cuts across them all.

Data Quality

Completeness, outliers, duplicates, and overall integrity.

Data Governance

Privacy, sensitivity, and responsible-use signals.

Understandability & Usability

Documentation, metadata, and ease of reuse.

Fairness & Bias

Class imbalance and representation across groups.

Impact on AI

Feature relevance and correlation that shape model outcomes.

Structure & Organization

Schema, formats, and structural consistency.

AI Application-Specific

Cross-cutting

Readiness judged against the needs of your specific AI application, cutting across all six dimensions rather than standing apart from them.

Access

Use AIDRIN your way

One engine, six ways in, from a zero-setup browser app to an AI agent driving it for you.

Web Inspector

Upload and assess datasets in your browser. No setup.

Python Library

Install with pip, then score datasets in scripts and notebooks.

CLI

Agent-ready

Headless command-line evaluation, scriptable and CI-friendly.

MCP Server

Agent-ready

Expose AIDRIN to AI agents via the Model Context Protocol.

Globus Remote Compute

Run metrics on remote datasets without transferring files.

LLM Explanations

Generate plain-language explanations of metric results.

Built to extend

Agentic Evaluation

Let an AI agent inspect, remediate, and report autonomously via the CLI and MCP server.

Custom Metrics

Define your own metrics and remedies through an extensible framework.

OpenTelemetry

Emit OpenTelemetry traces and metrics to observe evaluation runs.

APPFL

Assess data readiness inside privacy-preserving federated learning workflows.

Inputs

Bring your data

AIDRIN reads the formats scientific and ML datasets actually ship in.

  • CSV
  • Excel
  • JSON
  • Parquet
  • NumPy
  • HDF5

Get started

Up and running in minutes

Install the Python library, drive it from the CLI, run it from source, or self-host the full web app.

# pip install aidrin
from aidrin import calculate_completeness, calculate_outliers

# file_info = (path, name, type)
file_info = ("data/adult.csv", "adult.csv", ".csv")

calculate_completeness(file_info)
# {'Overall Completeness': 0.97, 'Completeness scores': {...}}

calculate_outliers(file_info)
# {'Outlier scores': {...}}
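
The snippet above shows a (path, name, type) tuple going in and a dictionary of scores coming out. As a minimal sketch of working around those two shapes, the helpers below build the tuple from a path and list the columns whose completeness falls under a threshold. Both helpers (make_file_info, columns_below) are hypothetical and not part of AIDRIN, the sample values are made up for illustration, and the sketch assumes that 'Completeness scores' maps column names to fractions between 0 and 1.

```python
from pathlib import PurePosixPath


def make_file_info(path: str) -> tuple[str, str, str]:
    """Hypothetical helper: build the (path, name, type) tuple from a path string."""
    p = PurePosixPath(path)
    return (path, p.name, p.suffix)


def columns_below(result: dict, threshold: float = 0.9) -> list[str]:
    """Hypothetical helper: names of columns scoring under the threshold.

    Assumes result["Completeness scores"] maps column name -> fraction in [0, 1].
    """
    scores = result.get("Completeness scores", {})
    return sorted(col for col, score in scores.items() if score < threshold)


# Made-up result in the shape shown above; not real AIDRIN output.
sample_result = {
    "Overall Completeness": 0.93,
    "Completeness scores": {"age": 1.0, "occupation": 0.94, "workclass": 0.85},
}

print(make_file_info("data/adult.csv"))
print(columns_below(sample_result))
```

In practice the dictionary would come from calculate_completeness(file_info); check the keys it actually returns before relying on this shape.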

The bottom line

Stop guessing whether your data is ready.

One engine, every interface. Measure, fix, and ship an AI-ready dataset from the browser, your code, the command line, or an AI agent.