Clustering, Especially K-Means, Using Python Programming and Anaconda

HW: Clustering
MCIS-6273: Data Mining

This assignment gives you a basic understanding of clustering, especially K-Means, using Python
programming and Anaconda.

Please download the dataset Mall_Customers.csv from Blackboard. You will use it to solve this
assignment.

Using K-Means

Part1: [10 points]
First, read the dataset into your code. Save two features to X: pick the fourth feature
(Annual Income (k$)) and the fifth feature (Spending Score (1-100)), so that the clusters can be
visualized in two dimensions.
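
As a starting point, reading the file and picking two columns by position could look roughly like the sketch below. The helper name load_features and the toy rows are made up for illustration (the rows are not taken from Mall_Customers.csv), and the column order is assumed to match the feature numbering used in this assignment.

```python
import io

import pandas as pd

# Toy rows invented for illustration only; they are NOT from Mall_Customers.csv.
# Assumed column order: ID, Gender, Age, Annual Income, Spending Score.
TOY_CSV = """CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)
1,Male,19,15,39
2,Female,21,15,81
3,Female,20,16,6
4,Male,35,120,79
"""


def load_features(source, columns=(3, 4)):
    """Read a CSV and return the chosen columns (0-based positions) as a 2-D float array."""
    df = pd.read_csv(source)
    # Selecting by position avoids depending on the exact header spelling.
    return df.iloc[:, list(columns)].to_numpy(dtype=float)


# For the assignment, pass the real file path instead of the toy text:
#     X = load_features("Mall_Customers.csv")
X = load_features(io.StringIO(TOY_CSV))
print(X.shape)  # (4, 2)
```

Note that positions are 0-based in pandas, so the fourth and fifth features are columns 3 and 4.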
Please do the following:

1. Use the elbow method to find the optimal number of clusters.
2. Fit K-Means to the dataset using the optimal number of clusters found by the elbow method.
3. Predict the clustering results y for dataset X.
4. Visualize the clustering results, using a different color for each cluster.
    1. The title, x label, and y label should be specified.
    2. A legend should be included.
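
Steps 2 to 4 could be sketched as follows, once a value of k has been read off an elbow plot. This is only one possible layout, not the required solution: the function name cluster_and_plot is hypothetical, the demo data comes from scikit-learn's make_blobs rather than from the mall dataset, and the k value of 3 is a placeholder that fits the demo data only.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def cluster_and_plot(X, n_clusters, title, xlabel, ylabel):
    """Fit K-Means, label every row, and draw one scatter series per cluster."""
    kmeans = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10, random_state=42)
    y = kmeans.fit_predict(X)  # fit the model, then predict a cluster for each row

    fig, ax = plt.subplots()
    for cluster in range(n_clusters):
        members = X[y == cluster]
        # Each scatter call takes the next color in matplotlib's default cycle.
        ax.scatter(members[:, 0], members[:, 1], s=30, label=f"Cluster {cluster + 1}")
    centers = kmeans.cluster_centers_
    ax.scatter(centers[:, 0], centers[:, 1], s=200, c="black", marker="X", label="Centroids")

    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.legend()
    return y, fig


# Stand-in data so the sketch runs on its own; use your own X and axis names instead.
X_demo, _ = make_blobs(n_samples=200, centers=3, random_state=0)
y_demo, fig = cluster_and_plot(X_demo, 3, "K-Means clusters (demo data)", "Feature 1", "Feature 2")
# Call plt.show() to view the figure, or fig.savefig("clusters.png") to keep a snapshot.
```

Drawing each cluster with its own scatter call is what gives every cluster a separate color and legend entry.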

Part2: [10 points]
Repeat the steps in Part1 but now pick the second feature (Gender) and the third feature (Age) in
your work to visualize the clusters. [This part may be trickier.]
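
One reason this part is trickier is that Gender is text, while K-Means needs numbers. The sketch below shows one way to handle that, under stated assumptions: the toy rows are invented (not from Mall_Customers.csv), the labels "Male" and "Female" are assumed, scaling the columns is an optional choice rather than a requirement, and the k value of 2 is only a placeholder.

```python
import io

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Toy rows invented for illustration only; replace with the real file.
TOY_CSV = """CustomerID,Gender,Age
1,Male,19
2,Female,21
3,Female,52
4,Male,60
5,Female,23
6,Male,58
"""

df = pd.read_csv(io.StringIO(TOY_CSV))  # for the assignment: pd.read_csv("Mall_Customers.csv")

# Second feature (Gender) is text: map each distinct label to an integer code,
# numbered in order of first appearance.
gender_codes, gender_labels = pd.factorize(df.iloc[:, 1])
age = df.iloc[:, 2].to_numpy(dtype=float)  # third feature (Age)
X = np.column_stack([gender_codes, age])

# Optional: standardize both columns so that Age, with its larger range,
# does not dominate the Euclidean distance used by K-Means.
X_scaled = StandardScaler().fit_transform(X)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)  # k=2 is a placeholder
y = kmeans.fit_predict(X_scaled)
print(list(gender_labels), y)
```

Because Gender takes only two values, the scatter plot will show the points lined up in two vertical or horizontal bands, which is expected.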

Guidelines:
• This assignment is to be solved in groups of two students, not more.
• You only need to deliver a PDF report that is nicely formatted with: [5 points]
    ◦ Title page: title and group names
    ◦ ToC page
    ◦ Pages should be numbered, and the numbers should show in the ToC
    ◦ A snapshot of each of the figures as described below; please see the Notes.
        ▪ Each snapshot has to have a caption (10 words) describing the picture.
◦ Only one report per group should be submitted
◦ No need to submit any code

Notes:
• To help you read and handle the data and to guide your work, you will be given the code
example_3D.py and the data file 3D_network.csv.
    ◦ You should first run the code and understand what it does.
    ◦ You will also be given a code file named practice_blobs.py. You can run it in
    Anaconda to see the output and how the different steps should be performed, so you
    know what to do.
    ◦ The code runs with no issues, so any issues running it are your responsibility to
    resolve.

• To learn more about the elbow method mentioned above for choosing the right number of
clusters, please check:
https://www.geeksforgeeks.org/elbow-method-for-optimal-value-of-k-in-kmeans/
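
In short, the elbow method fits K-Means for a range of k values, records the within-cluster sum of squares (WCSS) for each, and looks for the k where the curve bends. A minimal sketch is given below, assuming scikit-learn and matplotlib are installed; the function name elbow_curve and the make_blobs demo data are illustrative choices, not part of the provided course files.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def elbow_curve(X, k_max=10):
    """Return the WCSS (scikit-learn calls it inertia_) for k = 1 .. k_max."""
    wcss = []
    for k in range(1, k_max + 1):
        model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
        model.fit(X)
        wcss.append(model.inertia_)  # sum of squared distances to the nearest center
    return wcss


# Stand-in data so the sketch runs on its own; use your own X instead.
X_demo, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=1)
wcss = elbow_curve(X_demo)

fig, ax = plt.subplots()
ax.plot(range(1, len(wcss) + 1), wcss, marker="o")
ax.set_title("Elbow method")
ax.set_xlabel("Number of clusters k")
ax.set_ylabel("WCSS (inertia)")
# Call plt.show() to view the curve; the "elbow" is where it stops dropping sharply.
```

The choice of k from the plot is a visual judgment, so it is worth stating in the report which k you picked and why.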

• The report you submit should have the figures below.
    ◦ To give you an idea, running practice_blobs.py gives the following output: [arrows
    for output order]

predicted group: 2
distance from center 0 is: 3.731771999479638
distance from center 1 is: 6.290334770382815
distance from center 2 is: 3.382224740457218
distance from center 3 is: 7.132308122920062
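
Output of this shape can be produced by predicting the cluster of a new point and printing its distance to every cluster center; the predicted group is the center with the smallest distance (center 2 in the output above). The sketch below is a guess at that idea, not the contents of practice_blobs.py: the data, the new point, and the variable names are all assumptions, so the printed numbers will differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with four groups; this is assumed, not the course data.
X_demo, _ = make_blobs(n_samples=200, centers=4, random_state=3)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42).fit(X_demo)

new_point = np.array([[0.0, 0.0]])  # arbitrary point chosen for illustration
predicted = int(kmeans.predict(new_point)[0])

# transform() returns the Euclidean distance from each row to every cluster center.
distances = kmeans.transform(new_point)[0]

print("predicted group:", predicted)
for i, d in enumerate(distances):
    print(f"distance from center {i} is: {d}")
```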

