Data Preprocessing

 

Data Preprocessing in Data Mining

Data preprocessing is one of the steps in the data mining process. It converts raw data into a cleaner, more structured format that mining algorithms can work with. Preprocessing techniques such as cleaning, integration, selection, transformation, and reduction help extract the data that matters for the analysis. The main steps of data preprocessing in data mining are:

  • 1. Data cleaning: Cleaning removes or corrects irrelevant, malformed, and unusable records, for example duplicates, missing values, and noisy entries. This step improves the quality and accuracy of the data, and therefore the reliability of the mining results.
  • 2. Data integration: This step combines useful data from multiple sources into a single dataset. It may involve resolving inconsistencies between sources and matching records that refer to the same entity.
  • 3. Data transformation: This step converts the data into a format that mining algorithms can use, which makes it one of the most important steps of preprocessing. It relies on techniques such as normalization, discretization, and feature engineering.
  • 4. Data reduction: This step shrinks the dataset while preserving the information most relevant to the analysis. It involves techniques such as dimensionality reduction, sampling, and clustering.
  • 5. Data discretization: This step converts continuous values into discrete intervals or categories to simplify analysis. Data mining methods such as decision trees and association rule mining frequently rely on it.
  • 6. Data formatting: This step structures the data to suit the selected data mining technique. It may include converting the data into a specific data structure or encoding it in a specific format.
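
To make the steps above concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the toy records, the field names (id, age, city, spend), the age bins, and every helper function are hypothetical and do not come from any particular library. Filling missing ages with the mean is one of several possible cleaning strategies, and dropping columns stands in for data reduction in its simplest form.

```python
from statistics import mean

# Hypothetical toy data from two sources (all names and values are made up).
customers = [
    {"id": 1, "age": 25, "city": "Pune"},
    {"id": 2, "age": None, "city": "Delhi"},   # missing age
    {"id": 2, "age": None, "city": "Delhi"},   # exact duplicate
    {"id": 3, "age": 47, "city": "Mumbai"},
]
purchases = {1: 120.0, 2: 80.0, 3: 300.0}      # id -> total spend

# Assumed age bins as (exclusive upper bound, label), sorted by bound.
AGE_BINS = [(30, "young"), (45, "middle"), (float("inf"), "senior")]


def clean(rows):
    """Cleaning: drop exact duplicates, fill missing ages with the mean."""
    seen, unique = set(), []
    for row in rows:
        key = (row["id"], row["age"], row["city"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    fill = mean(r["age"] for r in unique if r["age"] is not None)
    for r in unique:
        if r["age"] is None:
            r["age"] = fill
    return unique


def integrate(rows, spend_by_id):
    """Integration: attach the spend from the second source by matching id."""
    return [{**r, "spend": spend_by_id.get(r["id"])} for r in rows]


def normalize(rows, field):
    """Transformation: min-max scale a numeric field into [0, 1]."""
    values = [r[field] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0                    # avoid division by zero
    return [{**r, field + "_scaled": (r[field] - lo) / span} for r in rows]


def discretize(rows, field, bins):
    """Discretization: map a continuous field to the first matching bin."""
    return [
        {**r, field + "_group": next(lbl for upper, lbl in bins if r[field] < upper)}
        for r in rows
    ]


def reduce_columns(rows, keep):
    """Reduction: keep only the columns needed for the analysis."""
    return [{c: r[c] for c in keep} for r in rows]


def to_table(rows):
    """Formatting: turn records into a header plus a list of value rows."""
    header = list(rows[0])
    return header, [[r[c] for c in header] for r in rows]


cleaned = clean(customers)
merged = integrate(cleaned, purchases)
scaled = normalize(merged, "spend")
binned = discretize(scaled, "age", AGE_BINS)
reduced = reduce_columns(binned, ["id", "age_group", "spend_scaled"])
header, table = to_table(reduced)

print(header)
for row in table:
    print(row)
```

Each function corresponds to one step of the list, so the order of the calls at the bottom mirrors the order of the steps. In practice the order and the choice of technique depend on the dataset and on the mining method that follows.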
