For someone who is new to Data mining, classification and clustering can seem similar because both data mining algorithms essentially “divide” the datasets into sub-datasets; But there is difference between them and this blog-post, we’ll see exactly that:
CLASSIFICATION | CLUSTERING |
|
|
Since a Training set exists, we describe this technique as Supervised learning | Since Training set is not used, we describe this technique as Unsupervised learning |
Example:We use training dataset which categorized customers that have churned. Now based on this training set, we can classify whether a customer will churn or not. | Example:We use a dataset of customers and split them into sub-datasets of customers with “similar” characteristics. Now this information can be used to market a product to a specific segment of customers that has been identified by clustering algorithm |
If you want to learn about Data Mining, check out the “free Book in PDF format: Mining the massive data-sets”.