The term "Data Cleaning" is used to describe everything from outlier checking, date parsing, and missing value imputation to structuring datasets (organizing data values within a dataset) to facilitate analysis.

The latter is commonly referred to as "Data Tidying", but what about the former? I have seen outlier checking, date parsing, and missing value imputation referred to as "soft processing", but I wonder whether this is a commonly used term.

If I describe a component of data cleaning as "soft processing", is it reasonable to assume that knowledgeable people will know that I am referring to outlier checking, date parsing, and missing value imputation?

In the field of Data Science, is there a commonly accepted umbrella term for outlier checking, date parsing, and missing value imputation?

grldsndrs

1 Answer

If you're not modifying anything you can call it validation. If you are, you can call it preprocessing.
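To make the distinction concrete, here is a minimal Python sketch. The records, the field names, the fixed 0 to 100 range used as an outlier check, and the median fill are all assumptions chosen for illustration, not a standard recipe. `validate` only reports problems, while `preprocess` returns a modified copy with parsed dates and imputed values.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records; field names and values are illustrative only.
rows = [
    {"date": "2021-01-03", "value": 10.0},
    {"date": "2021-01-04", "value": None},
    {"date": "2021-01-05", "value": 12.0},
    {"date": "2021-01-06", "value": 500.0},
]


def validate(rows, low=0.0, high=100.0):
    """Validation: report problems without changing the data."""
    issues = []
    for i, row in enumerate(rows):
        if row["value"] is None:
            issues.append((i, "missing value"))
        elif not low <= row["value"] <= high:
            # Assumed rule: anything outside [low, high] counts as an outlier.
            issues.append((i, "out of range"))
    return issues


def preprocess(rows):
    """Preprocessing: return a modified copy with parsed dates and imputed values."""
    observed = [r["value"] for r in rows if r["value"] is not None]
    fill = median(observed)  # assumed imputation strategy: median of observed values
    return [
        {
            "date": datetime.strptime(r["date"], "%Y-%m-%d").date(),
            "value": fill if r["value"] is None else r["value"],
        }
        for r in rows
    ]


issues = validate(rows)     # the input rows are left untouched
cleaned = preprocess(rows)  # new records; the originals are not mutated
```

Here the outlier check sits on the validation side because it only flags the row. If you dropped or capped that row instead, that step would count as preprocessing.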

Emre