Current main (Next Release)

Added

  • Data Quality Engine: Added profile(), DataQualityReport, and ColumnProfile for dataset diagnostics.
  • Smart Cleaning: Added suggest_cleaning() and auto_clean() with safe and strict modes.
  • Schema Validation: Added Schema, Field, validate(), validation result objects, and field builders for Int64, Float64, String, Bool, Email, and URL.
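
As a rough illustration of what schema validation of this kind involves, the standalone sketch below checks rows against per-field rules. It does not import arnio. `FieldRule`, `check_rows`, `is_int`, and `is_email` are hypothetical names invented for this example, and the actual `Schema`, `Field`, and `validate()` signatures may differ.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical stand-in for a field definition; this is NOT arnio's Field class.
@dataclass(frozen=True)
class FieldRule:
    name: str
    check: Callable[[Any], bool]
    nullable: bool = False


# Deliberately simple email pattern, for illustration only.
_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_email(value: Any) -> bool:
    return isinstance(value, str) and _EMAIL.match(value) is not None


def check_rows(rows, rules):
    """Return (row_index, field_name, message) for every rule violation."""
    errors = []
    for i, row in enumerate(rows):
        for rule in rules:
            value = row.get(rule.name)
            if value is None:
                if not rule.nullable:
                    errors.append((i, rule.name, "missing value"))
            elif not rule.check(value):
                errors.append((i, rule.name, f"invalid value: {value!r}"))
    return errors


rules = [FieldRule("id", is_int), FieldRule("email", is_email, nullable=True)]
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": "2", "email": "not-an-email"},
    {"id": None, "email": None},
]
errors = check_rows(rows, rules)
```

Collecting every violation instead of stopping at the first one is a design choice in this sketch: it lets a caller report all problems in a dataset in a single pass.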

Fixed

  • Hardened CSV parsing for quoted multiline records, unterminated quotes, duplicate headers, empty headers, and non-UTF-8 encodings.
  • Improved Python-level errors by wrapping native read failures as CsvReadError.
  • Improved pandas conversion for mixed numeric/object columns; unsafe nested objects are now rejected.
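
The kinds of checks listed above can be sketched with Python's standard `csv` module. This is an illustration only and not arnio's native parser: the `CsvReadError` class here is a local stand-in for the one arnio ships, `read_csv_text` is a hypothetical helper, and encoding detection is left out.

```python
import csv
import io


# Local stand-in for this sketch; arnio provides its own CsvReadError.
class CsvReadError(Exception):
    pass


def read_csv_text(text):
    """Parse CSV text into (header, rows), raising CsvReadError on bad input."""
    try:
        # strict=True makes the csv module raise on malformed input,
        # including a quoted field that is never closed.
        reader = csv.reader(io.StringIO(text, newline=""), strict=True)
        rows = list(reader)
    except csv.Error as exc:
        # Wrap the low-level failure and keep the original as __cause__.
        raise CsvReadError(f"malformed CSV: {exc}") from exc
    if not rows:
        raise CsvReadError("empty input")
    header, body = rows[0], rows[1:]
    if any(name.strip() == "" for name in header):
        raise CsvReadError("empty column name in header")
    if len(set(header)) != len(header):
        raise CsvReadError("duplicate column names in header")
    return header, body


# A quoted field may legitimately span several lines.
header, body = read_csv_text('name,note\nalice,"line one\nline two"\n')
```

Raising with `from exc` keeps the underlying parser error attached to the wrapper, so the message stays readable without losing the original cause.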
v1.0.2 (Latest Package)

Documentation

  • Redesigned the README with clearer product positioning, architecture notes, benchmarks, and contribution paths.
  • Added language identifiers to fenced code blocks for cleaner Markdown linting.
v1.0.1

Added

  • Automated Release Pipeline: Integrated Google Release Please for automated versioning and changelog generation.
  • Architecture Documentation: New internal documentation explaining the Python-to-C++ boundary and memory management.
  • CI/CD Resilience: Worked around GITHUB_TOKEN trigger limitations in the release workflow by using reusable workflows.

Fixed

  • Fixed hardcoded relative paths in test fixtures that caused cibuildwheel failures on CI.
  • Removed deprecated inputs in the Release Please workflow for v4 compatibility.
v1.0.0

Added

  • Cross-Platform Wheels: Full cibuildwheel automation delivering pre-compiled native wheels for Windows, Linux, and macOS (Intel & Apple Silicon).
  • Google Colab Compatibility: Linux wheels are now manylinux-compliant, so pip install arnio works out of the box on Colab and Ubuntu.
  • Production-Grade Packaging: Resolved ModuleNotFoundError by removing double-nesting issues in scikit-build-core config.
  • CI/CD Excellence: Fully automated PyPI publishing pipeline via Trusted Publishing with integrated source distributions (sdist).
  • Stable API: Officially marked arnio as stable for production workloads with "Development Status :: 5 - Production/Stable".

Fixed

  • Migrated from FetchContent to find_package(pybind11) for faster, offline, and more robust cross-platform builds.
  • Refactored cibuildwheel configuration entirely into pyproject.toml for standard and declarative packaging.
v0.1.3

Fixed

  • normalize_case() now accepts the case_type keyword argument as documented in the README (it previously accepted case=, which raised TypeError for anyone following the README).
  • to_pandas() was rewritten to use the zero-copy NumPy buffer interface, eliminating O(rows × cols) pybind11 boundary crossings and restoring the performance advantage over pandas.
  • from_pandas() implemented with correct null handling and round-trip fidelity.
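
To make the zero-copy idea concrete, the sketch below uses only the standard library: `array` stands in for a contiguous native column, and `memoryview` exposes it through the same buffer protocol that NumPy can consume. How arnio wires this up internally is not described in this changelog, so treat the example as a general illustration of the technique.

```python
from array import array

# A contiguous column of float64 values, standing in for a native column buffer.
column = array("d", [1.5, 2.5, 3.5])

# Copying path: one Python-level access per cell, producing new objects.
copied = [column[i] for i in range(len(column))]

# Zero-copy path: the view shares the column's memory via the buffer protocol.
view = memoryview(column)
nbytes = view.nbytes  # 3 items * 8 bytes each

# A write through the view is visible in the original, so no copy was made.
view[0] = 10.0

# Release the view so the underlying array can be resized again.
view.release()
```

After this runs, `column[0]` holds the value written through the view while `copied` still holds the original values, which is the difference between sharing a buffer and converting cell by cell.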

Added

  • ar.register_step(name, fn) — register pure-Python pipeline steps without C++.
  • arnio.exceptions module with ArnioError, UnknownStepError, CsvReadError, TypeCastError — replaces opaque C++ errors with actionable messages.
  • arnio.__version__ now available programmatically.
  • benchmarks/generate_data.py — deterministic 1M row test dataset generator.
  • benchmarks/benchmark_vs_pandas.py — reproducible end-to-end benchmark.
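
A step registry of the kind `register_step(name, fn)` suggests can be sketched in a few lines. This is a simplified, hypothetical model rather than arnio's implementation: `_STEPS`, `run_pipeline`, and `strip_names` are invented for the example, rows are plain dicts, and the local `UnknownStepError` only mirrors the name of the exception listed above.

```python
# Hypothetical, simplified registry; arnio's internals may look quite different.
class UnknownStepError(Exception):
    pass


_STEPS = {}


def register_step(name, fn):
    """Register a callable that takes a list of rows and returns a new list."""
    if not callable(fn):
        raise TypeError("fn must be callable")
    _STEPS[name] = fn


def run_pipeline(rows, step_names):
    """Apply registered steps in order, failing clearly on unknown names."""
    for name in step_names:
        try:
            fn = _STEPS[name]
        except KeyError:
            known = ", ".join(sorted(_STEPS)) or "none"
            raise UnknownStepError(
                f"unknown step {name!r}; registered steps: {known}"
            ) from None
        rows = fn(rows)
    return rows


def strip_names(rows):
    # Pure-Python step: trim whitespace around the "name" value of each row.
    return [{**row, "name": row["name"].strip()} for row in rows]


register_step("strip_names", strip_names)
result = run_pipeline([{"name": "  Ada "}], ["strip_names"])
```

Listing the registered step names in the error message is one way to turn a bare lookup failure into something a user can act on.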

Fixed (Internal)

  • CI now verifies compilation on Ubuntu and Windows across Python 3.9–3.12.
v0.1.2

Fixed

  • Stability improvements and initial PyPI release.