Machine Learning-Python
Machine Learning homework (using the Python language).
Fill in the three functions in the .py file, and use the sample output file below for verification.
dataset.csv (excerpt): each row contains two feature values in [0, 1] and a ±1 class label, e.g.
2.999999999999999889e-01, 2.000000000000000111e-01, -1.000000000000000000e+00
(remaining rows omitted)
Testing your Homework 2 Solutions…
In [1]: import numpy as np
In [2]: import homework2solution as hw
In [3]: #Testing the find_closest_example function
In [4]: #We'll place a data point in each quadrant of the 2-D plane
In [5]: tdata = np.array([ [-1,1], [1,1], [1, -1], [-1, -1] ])
In [6]: hw.find_closest_example(tdata, np.array([-6,9]) )
Out[6]: 0
In [7]: hw.find_closest_example(tdata, np.array([4,9]) )
Out[7]: 1
In [8]: hw.find_closest_example(tdata, np.array([-3,-4]) )
Out[8]: 3
In [9]: hw.find_closest_example(tdata, np.array([3,-4]) )
Out[9]: 2
In [10]: #Note that 0 means the first example, i.e. the example in the first row of the training data
In [11]: #Now testing a three-dimensional problem
In [12]: tdata = np.array([ [1, -5, 1], [3, 5, 0], [-5, -6, -7] ])
In [13]: hw.find_closest_example(tdata, np.array([-2, -4,-8]) )
Out[13]: 2
In [14]: # Now testing the calculate_centroid_pos function
In [15]: mydata = np.array( [ [1, 2], [3, 4], [5, 6] ])
In [16]: hw.calculate_centroid_pos(mydata)
Out[16]: array([3., 4.])
In [17]: # Now testing higher dimensions
In [18]: mydata = np.array([ [ 1, 2, 3], [4, 5, 6], [7, 8, 9] ])
In [19]: hw.calculate_centroid_pos(mydata)
Out[19]: array([4., 5., 6.])
In [20]: # Now simulating perceptron classification (using examples from our lecture slides)
In [21]: w = np.array([-5, 1, 1])
In [22]: x = np.array([ [ 1, 5] ])
In [23]: hw.classify_examples(w, x)
Out[23]: array([1.])
In [24]: x = np.array([ [1, 3] ])
In [25]: hw.classify_examples(w, x)
Out[25]: array([-1.])
In [26]: x = np.array([ [1, 1] ])
In [27]: hw.classify_examples(w, x)
Out[27]: array([-1.])
In [28]: x = np.array([ [4, 3] ])
In [29]: hw.classify_examples(w, x)
Out[29]: array([1.])
In [30]: # Now we pass them as a batch 2D numpy array
In [31]: x = np.array([ [1,5], [1, 3], [1, 1], [4, 3] ])
In [32]: hw.classify_examples(w, x)
Out[32]: array([ 1., -1., -1., 1.])
In [33]: # Good luck! 🙂
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
CMPE 471 – CMPS 497
Homework 2
DUE: 21/2/2021 at 11:59pm (Sunday midnight)

Q1: What is your name?

Q2: What is your QUID?

Q3: (4 pts)
Name two advantages and two disadvantages of KNN when compared to
Decision Trees:
A3:

Q4: (5 pts)
Identify which of the classifiers learned so far (KNN, DT, or Perceptron)
will not be suitable for the 2-D binary classification problem described by
the data inside the file dataset.csv. In your own words, justify why you chose
that classifier in your answer.
Hint: Use scatter with two different colors for the positive and negative
examples (see the sketch helper just below).
A4:

Fill in the bodies of the functions below following their descriptions.
"""
import numpy as np
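
# For the Q4 hint, a possible visualization helper (not graded). It assumes
# dataset.csv has two comma-separated feature columns followed by a +/-1
# label column; adjust the delimiter or columns if the file differs.
# (_sketch_plot_dataset is an illustrative name, not part of the assignment.)
def _sketch_plot_dataset(path="dataset.csv"):
    import matplotlib.pyplot as plt
    d = np.loadtxt(path, delimiter=",")
    # Split the rows by their label in the last column.
    pos, neg = d[d[:, -1] == 1], d[d[:, -1] == -1]
    plt.scatter(pos[:, 0], pos[:, 1], c="tab:blue", label="positive")
    plt.scatter(neg[:, 0], neg[:, 1], c="tab:red", label="negative")
    plt.legend()
    plt.show()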
def find_closest_example(data, test_example):
    """
    (5 pts)
    Using the Euclidean distance, this function finds the position of the
    closest example in the "data" parameter to "test_example". So, if the
    closest example is the third one inside data, then it returns 2. If it
    is the fifth example, it returns the number 4, etc.

    Parameters:
    ----------
    data: 2-D numpy array of continuous data with N x M dimensions
    test_example: 1-D array with M values for a test example

    Returns:
    ----------
    The "index" of the closest example (one integer).
    """
    return None
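
# For reference only: a minimal sketch of one possible implementation
# (the name _sketch_find_closest is illustrative, not part of the assignment).
def _sketch_find_closest(data, test_example):
    # Euclidean distance from every row of data to the test example ...
    distances = np.linalg.norm(data - test_example, axis=1)
    # ... and the row index with the smallest distance.
    return int(np.argmin(distances))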
def calculate_centroid_pos(data):
    """
    (5 pts)
    This function receives a 2-D numpy array "data" (N x M dimensions) and
    calculates the new "updated" centroid position for these data.
    Note: assume all the examples given in "data" belong to one cluster.
    The function should return a numpy array with 1 x M dimensions (the
    centroid position).

    Parameters:
    ----------
    data: 2-D numpy array (N x M dim) with continuous numerical values.

    Returns:
    ---------
    A numpy array with 1 x M dimensions: one centroid position in M dimensions.
    """
    return None
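
# For reference only: a minimal sketch of one possible implementation
# (_sketch_centroid_pos is an illustrative name, not part of the assignment).
def _sketch_centroid_pos(data):
    # The centroid is the per-feature (column-wise) mean of the examples.
    return data.mean(axis=0)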
def classify_examples(weights, test_examples):
    """
    (5 pts)
    This function classifies examples using the perceptron algorithm.
    To do that, it receives a one-dimensional numpy array of weights and a
    2-D numpy array (test_examples), then returns a 1-D numpy array with the
    classification result for each example (1 if it is positive, and -1 if
    it is negative).

    Parameters:
    ----------
    weights: 1-D numpy array with the perceptron weights and bias (the bias
    is in position 0)
    test_examples: 2-D numpy array with the feature values of the test examples.
    Note: the width (number of columns) of test_examples should be the length
    of weights - 1.
    Note: assume the weights and the feature vectors follow the same order:
    e.g. w0, w1, w2, w3, ...
             x1, x2, x3, ...

    Returns:
    ----------
    A 1-D numpy array with the classification results of the test_examples
    (each is either 1 or -1).
    """
    return None
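
# For reference only: a minimal sketch of one possible implementation. It
# assumes the convention that an activation of exactly 0 is classified as
# positive (_sketch_classify is an illustrative name, not part of the
# assignment).
def _sketch_classify(weights, test_examples):
    # Activation = bias + dot product of the remaining weights with each row.
    activations = weights[0] + test_examples @ weights[1:]
    return np.where(activations >= 0, 1.0, -1.0)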
# BELOW IS "EXTRA PRACTICE" (FOR FUN ONLY AND NOT GRADED)
def kmeans_cluster(data, k_init_centroids):
    """
    (extra practice)
    Can you combine the two parts above to create the K-Means algorithm?
    This is where you can try it. Use the two methods above
    (find_closest_example, calculate_centroid_pos) to implement K-Means.
    It should terminate either when no examples move across clusters, or
    when the number of iterations reaches 10.

    Parameters:
    ----------
    data: 2-D numpy array of the data to be clustered (the data can have
    more than 2 features)
    k_init_centroids: 2-D numpy array containing the initial centroid
    positions (the number of rows determines "K").

    Returns:
    ----------
    The final centroid positions (same shape as k_init_centroids).
    """
    return None
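
# For reference only: a minimal K-Means sketch built from the two sketch
# helpers above (all _sketch_* names are illustrative, not part of the
# assignment).
def _sketch_kmeans(data, k_init_centroids):
    centroids = np.array(k_init_centroids, dtype=float)
    assignments = None
    for _ in range(10):  # hard cap of 10 iterations
        # Assign each example to its nearest centroid.
        new_assignments = np.array(
            [_sketch_find_closest(centroids, x) for x in data])
        # Stop early once no example changes clusters.
        if assignments is not None and np.array_equal(assignments, new_assignments):
            break
        assignments = new_assignments
        # Move each centroid to the mean of the examples assigned to it.
        for k in range(len(centroids)):
            members = data[assignments == k]
            if len(members) > 0:
                centroids[k] = _sketch_centroid_pos(members)
    return centroids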