Clustering, Especially K-Means, Using Python Programming and Anaconda

HW: Clustering
MCIS-6273: Data Mining

This assignment gives you a basic understanding of clustering, especially K-Means, using Python
programming and Anaconda.

Please download the dataset Mall_Customers.csv from Blackboard. It will be used throughout this
assignment.

Using K-Means

Part1: [10 points]
First, read the data set into your code. Save two data features to X: pick the fourth feature
(Annual Income (k$)) and the fifth feature (Spending Score(1-100)). With only two features, the
clusters can be visualized.
Please do the following:

1. Use the elbow method to find the optimal number of clusters.
2. Fit K-Means to the dataset using the optimal number of clusters found by the elbow method.
3. Predict the clustering results y for data set X.
4. Visualize the clustering results, using a different color for each cluster.
   1. The title, x label, and y label should be specified.
   2. A legend should be included.
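
The four steps above can be sketched as follows. This is a minimal sketch, not a reference solution: it assumes scikit-learn, pandas, and matplotlib are installed (as in a standard Anaconda setup), that Mall_Customers.csv sits next to the script, and that the two features are selected by column position. The helper names (load_features, elbow_inertias, plot_clusters), the synthetic fallback data, and the value best_k = 5 are illustrative assumptions; read the actual number of clusters from your own elbow plot.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def load_features(csv_path, columns=(3, 4)):
    """Return the chosen columns (0-based positions) as a float array."""
    df = pd.read_csv(csv_path)
    return df.iloc[:, list(columns)].to_numpy(dtype=float)


def elbow_inertias(X, k_values):
    """Fit K-Means once per k and collect the within-cluster sum of squares."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
        inertias.append(model.inertia_)
    return inertias


def plot_clusters(X, labels, centers, title, xlabel, ylabel):
    """Scatter plot with one color per cluster, a title, axis labels, and a legend."""
    fig, ax = plt.subplots()
    for cluster_id in np.unique(labels):
        mask = labels == cluster_id
        ax.scatter(X[mask, 0], X[mask, 1], s=30, label=f"Cluster {cluster_id + 1}")
    ax.scatter(centers[:, 0], centers[:, 1], s=150, c="black", marker="X",
               label="Centroids")
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.legend()
    return fig


csv_path = Path("Mall_Customers.csv")  # ASSUMPTION: file is in the working directory
if csv_path.exists():
    X = load_features(csv_path)  # fourth and fifth columns (positions 3 and 4)
else:
    # ASSUMPTION: synthetic stand-in data so the sketch still runs without the file.
    X, _ = make_blobs(n_samples=200, centers=5, random_state=42)

# Step 1: elbow method, plot inertia against k and look for the bend.
k_values = list(range(1, 11))
inertias = elbow_inertias(X, k_values)
elbow_fig, elbow_ax = plt.subplots()
elbow_ax.plot(k_values, inertias, marker="o")
elbow_ax.set_title("Elbow method")
elbow_ax.set_xlabel("Number of clusters k")
elbow_ax.set_ylabel("Inertia (within-cluster sum of squares)")

# Steps 2 and 3: fit with the chosen k and predict a cluster label for every row.
best_k = 5  # ASSUMPTION: placeholder, replace with the k your elbow plot suggests
kmeans = KMeans(n_clusters=best_k, n_init=10, random_state=42)
y = kmeans.fit_predict(X)

# Step 4: one color per cluster, with title, axis labels, and legend.
cluster_fig = plot_clusters(X, y, kmeans.cluster_centers_,
                            "Clusters of customers",
                            "Annual Income (k$)", "Spending Score (1-100)")
```

Call plt.show() at the end to display both figures, or use elbow_fig.savefig(...) and cluster_fig.savefig(...) to capture the snapshots for the report.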

Part2: [10 points]
Repeat the steps in Part1 but now pick the second feature (Gender) and the third feature (Age) in
your work to visualize the clusters. [This part may be trickier.]
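
One reason this part is trickier is that Gender is a text column, and K-Means works on numeric distances. A minimal sketch of one possible approach is shown below: encode the categories as integers, then standardize both columns so that Age does not dominate the distance. The rows in the toy table are made up for illustration only, the column names are assumed, and n_clusters=2 is a placeholder; scaling is a design choice, not a requirement of the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: made-up rows for illustration; real values come from Mall_Customers.csv.
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "Age": [19, 21, 35, 52, 48, 23],
})

# Encode the text categories as integers in order of first appearance.
codes, categories = pd.factorize(toy["Gender"])

# Build the two-column feature matrix: encoded Gender and Age.
X = np.column_stack([codes, toy["Age"].to_numpy()]).astype(float)

# Standardize so both features have zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# ASSUMPTION: n_clusters=2 is a placeholder; choose k with the elbow method.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
y = kmeans.fit_predict(X_scaled)

print(dict(enumerate(categories)))  # which integer stands for which category
print(y)                            # cluster label for each row
```

Because the encoded Gender takes only two values, the scatter plot will show two parallel bands of points rather than a cloud, so the clusters mostly split along Age.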

Guidelines:
• This assignment is to be solved in groups of two students, not more.
• You only need to deliver a PDF report that is nicely formatted with: [5 points]
◦ Title page: Title and Group Names
◦ ToC page
◦ Pages should be numbered, and the page numbers should appear in the ToC
◦ A snapshot of each of the figures as described below; please see the Notes.
▪ Each snapshot has to have a caption, 10 words, describing the picture.
◦ Only one report per group should be submitted
◦ No need to submit any code

Notes:
• For reading and handling the data, and to guide your work, you will be given the code
example_3D.py and the data 3D_network.csv.
◦ You should run the code and understand what it does first.
◦ Also, you will be given a code file named practice_blobs.py. You can run the code in
Anaconda to see the output and how the different steps should be performed, so you
know what to do.
◦ The code runs with no issues, so any issues running it are your responsibility to
resolve.

• To learn more about the Elbow Method mentioned above for choosing the right number of
clusters, please check:
https://www.geeksforgeeks.org/elbow-method-for-optimal-value-of-k-in-kmeans/

• The report you will submit should have the figures below.
◦ To give you an idea, running practice_blobs.py gives the following output: [arrows for
output order]

predicted group: 2
distance from center 0 is: 3.731771999479638
distance from center 1 is: 6.290334770382815
distance from center 2 is: 3.382224740457218
distance from center 3 is: 7.132308122920062
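
Output of this shape can be produced along the following lines. This is a minimal sketch and not the contents of practice_blobs.py: the blob data, the random seeds, and the query point new_point are assumptions chosen for illustration, so the printed numbers will differ from those above. It shows that the predicted group is the center with the smallest distance, just as center 2 has the smallest distance in the sample output.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# ASSUMPTION: synthetic blobs with 4 centers, standing in for the practice data.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# ASSUMPTION: an arbitrary new point to classify.
new_point = np.array([[0.0, 0.0]])

# predict() returns the index of the nearest cluster center.
predicted = int(kmeans.predict(new_point)[0])

# transform() returns the Euclidean distance from the point to every center.
distances = kmeans.transform(new_point)[0]

print(f"predicted group: {predicted}")
for i, d in enumerate(distances):
    print(f"distance from center {i} is: {d}")
```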
