CsvPath begins its journey towards data quality
Atesta Analytics is backing a new tool for CSV data quality. CsvPath comes out of years of developing format validation tools and workflows, and it follows in the footsteps of Schematron. When we first met Schematron in 2004, it was love at first sight. Its take on XPath-driven, rules-based file validation was so simple and useful that we wanted to apply it to every problem. CSV was an early target. Structured data formats like JSON were easier to validate, but CSV obviously needed validation more.
CSV files have every quirk you could imagine: multiple headers, subdelimiters, odd quote characters, unexpected character sets, and hundreds of fields. They are data’s wild west, so wild that attempts to validate them have largely stopped at the basics, as if more can’t be done. And yet it can.
We care because most of the companies we’ve worked with have CSV data. In some cases, millions of dollars of transactions flow through CSV files. In others, the raw data that becomes the stock in trade of analytics and APIs is harvested as CSV. Why would anyone avoid CSV? We all want the data and will take it any way it comes.
That leads to validation problems as challenging as those posed by the most complex XML and JSON files. And, for many people, complex + vital == interesting!
Join us in welcoming CsvPath.