Description
This beginner course is ideal for anyone interested in data mining. You should have a beginner to intermediate understanding of Python, as I don't spend a lot of time on the programming aspect.
Most data in the world (whether text, audio, visual, etc.) is raw or unlabeled. This is precisely the reason that unsupervised machine learning has become so important. By using certain approaches to unsupervised machine learning (like clustering), we can discover patterns or underlying structures in data.
This is a major component of exploratory data mining. Furthermore, exploratory data mining is used to draw hypotheses, assess the assumptions behind our statistical inferences, and serve as a basis for further research. For example, the conclusion of a cluster analysis could lead to a full-scale experiment.
The course covers two of the most important and common non-hierarchical clustering algorithms, K-means and DBSCAN, using Python.
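To give a flavor of what these two algorithms look like in Python, here is a minimal sketch. It assumes scikit-learn (this description does not name a library), and the synthetic data and the `eps` and `min_samples` values are illustrative choices rather than the course's own material.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs

# Synthetic data: 300 points around 3 well-separated centers (illustrative values).
X, y_true = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.6,
    random_state=0,
)

# K-means needs the number of clusters up front.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(X)

# DBSCAN groups points by density instead; the label -1 marks noise points.
dbscan = DBSCAN(eps=1.0, min_samples=5)
dbscan_labels = dbscan.fit_predict(X)

n_kmeans = len(set(kmeans_labels))
n_dbscan = len(set(dbscan_labels) - {-1})  # ignore the noise label
print("K-means clusters:", n_kmeans)
print("DBSCAN clusters (excluding noise):", n_dbscan)
```

The main design difference shows up in the constructors: K-means is told how many clusters to find, while DBSCAN is told how dense a region must be (`eps`, `min_samples`) to count as a cluster.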
With K-means, we start with a 'starter' (or simple) example. We then discuss the 'Completeness Score'. In the next lesson, we discuss how K-means deals with larger variances and different shapes. Then we discuss 'Color Quantization', which is used when you want to decrease the size of an image and/or see whether it has any underlying structure. Finally, we take a look at cells of the human body and do some cell segmentation.
For DBSCAN, we also look at a starter example, using Blobs. Then I will show you how DBSCAN overcomes some of the issues of K-means.
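As a rough illustration of the kind of issue DBSCAN can overcome, the sketch below compares both algorithms on two interleaved half-moons, a shape K-means tends to handle poorly. It assumes scikit-learn, and it assumes the 'Completeness Score' refers to scikit-learn's `completeness_score`; the dataset and parameter values are hypothetical choices for illustration, not taken from the lessons.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import completeness_score

# Two interleaved half-circles: clusters that are not compact, round blobs.
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)

# K-means assigns each point to the nearest centroid, so it cuts across the moons.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN follows chains of nearby points, so it can trace each curved moon.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Completeness is high when all points of a true class land in the same cluster.
# It needs ground-truth labels, which we only have because the data is synthetic.
kmeans_score = completeness_score(y_true, kmeans_labels)
dbscan_score = completeness_score(y_true, dbscan_labels)
print(f"K-means completeness: {kmeans_score:.2f}")
print(f"DBSCAN completeness:  {dbscan_score:.2f}")
```

On real, unlabeled data there is no `y_true` to score against, so a metric like this is mainly useful for learning how an algorithm behaves on data where the answer is already known.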
If you are interested in data mining and want to get a taste of how it works, this course is a great introduction!
Requirements
An understanding of Python at a beginner or intermediate level is mandatory.