Unstable Spiral (In Development) STATUS

Unstable Spiral is a legacy project dedicated to Howard Blumenfeld, author of Mental Architecture and mathematics professor at Las Positas College. Unstable Spiral was a website that he worked on in college, and for his birthday I'm recreating it with a modern look. I'm using Bootstrap for the styling and PHP for the back end. The website's homepage is done, and I am currently working on database integration. This website is still in development.

WEBSITE URL: https://repl.it/@Snakebiking49/Unstable-Spiral

Unstable Spiral (In Development)

Unstable Spiral is a legacy project dedicated to Howard Blumenfeld, author of Mental Architecture and mathematics professor at Las Positas College. Unstable Spiral was a website that he worked on in college, and for his birthday I'm recreating it with a modern look. I'm using Bootstrap for the styling and PHP for the back end. The website's homepage is almost done and the rest of the website is in development. Follow the link below to see the website in development.

@AdamBlumenfeld The link to the website is https://unstable-spiral--snakebiking49.repl.co/ IN DEVELOPMENT

Linear Regression with Any Number of Variables

# Multiple linear regression with any number of input variables,
# trained with gradient descent.
hm = int(input("Enter how many categories or inputs you want: "))
boo = int(input("How many inputs per category do you want: "))

# dataSet holds one list of values per category, followed by the list of outputs.
dataSet = []
for be in range(hm):
    print("This is input set number " + str(be + 1) + ".")
    inputer = []
    for i in range(boo):
        point = float(input("Enter value: "))
        inputer.append(point)
    dataSet.append(inputer)

outputs = []
for i in range(boo):
    print("Enter your output.")
    output = float(input("Enter value: "))
    outputs.append(output)
dataSet.append(outputs)

# One slope per category, all starting at 0.
values = [0.0] * hm
yIntercept = 0.0


def predict(dataSet, values, yIntercept, f):
    # Predicted output for data point f: each slope times its input, plus the intercept.
    innerTotal = yIntercept
    for i in range(len(values)):
        innerTotal += values[i] * dataSet[i][f]
    return innerTotal


def slopeDerivativeOne(dataSet, values, hm, boo, yIntercept, wanted):
    # Partial derivative of the mean squared error with respect to slope number `wanted`.
    total = 0
    for f in range(boo):
        difference = (dataSet[-1][f] - predict(dataSet, values, yIntercept, f)) * dataSet[wanted][f]
        total += difference
    return (-2 / boo) * total


def DerivativeOne(dataSet, values, hm, boo, yIntercept):
    # Partial derivative of the mean squared error with respect to the y-intercept.
    total = 0
    for f in range(boo):
        total += dataSet[-1][f] - predict(dataSet, values, yIntercept, f)
    return (-2 / boo) * total


def getCost(dataSet, values, hm, boo, yIntercept):
    # Absolute value of the summed differences between outputs and predictions.
    total = 0
    for f in range(boo):
        total += dataSet[-1][f] - predict(dataSet, values, yIntercept, f)
    return abs(total)


l = 0.0001            # learning rate
iterations = 100000

for j in range(iterations):
    # Compute every derivative first, then update all slopes and the intercept together.
    ds = [slopeDerivativeOne(dataSet, values, hm, boo, yIntercept, i) for i in range(len(values))]
    d1 = DerivativeOne(dataSet, values, hm, boo, yIntercept)
    for i in range(len(values)):
        values[i] = values[i] - (l * ds[i])
    yIntercept = yIntercept - (l * d1)

for i in range(len(values)):
    print(round(values[i], 2))
print("y-intercept " + str(round(yIntercept, 2)))
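Since the program above needs typed input, here is a minimal non-interactive sketch of the same idea that can be run as-is. The function name `fit_linear`, the sample-by-sample row layout, the learning rate, and the toy data are all assumptions made for illustration; they are not part of the original post.

```python
def fit_linear(rows, targets, learning_rate=0.05, iterations=5000):
    """Gradient descent for y = w1*x1 + ... + wk*xk + b (hypothetical helper)."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    intercept = 0.0
    for _ in range(iterations):
        # Difference between each real output and the current prediction
        errors = [
            t - (sum(w * v for w, v in zip(weights, row)) + intercept)
            for row, t in zip(rows, targets)
        ]
        # Partial derivatives of the mean squared error
        weight_gradients = [
            (-2 / n) * sum(e * row[j] for e, row in zip(errors, rows))
            for j in range(len(weights))
        ]
        intercept_gradient = (-2 / n) * sum(errors)
        # Update every weight and the intercept together
        weights = [w - learning_rate * g for w, g in zip(weights, weight_gradients)]
        intercept -= learning_rate * intercept_gradient
    return weights, intercept


# Toy data generated from y = 2*x1 + 3*x2 + 1 (each row is one data point)
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
targets = [1, 3, 4, 6, 8, 9]
weights, intercept = fit_linear(rows, targets)
print([round(w, 2) for w in weights], round(intercept, 2))
```

Because the toy data lies exactly on a plane, the fitted weights should come out very close to 2 and 3 with an intercept close to 1. Larger or unscaled inputs would likely need a smaller learning rate.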

Linear Regression

This is simple linear regression (one input variable) fitted with gradient descent.


x = [15, 9, 12, 1, 10, 11, 4, 16, 2, 30, 4, 15, 18, 12, 14]
y = [13, 19, 16, 15, 24, 7, 18, 23, 28, 2, 26, 12, 18, 24, 17]


def DerivativeSlope(x, y, slope, yInt):
    length = len(x)
    total = 0
    for i in range(length):
        predicted = (slope * x[i]) + yInt
        difference = (y[i] - predicted) * x[i]
        total += difference
    returnValue = (-2/length) * total
    return returnValue


def DerivativeIntercept(x, y, slope, yInt):
    length = len(x)
    total = 0
    for i in range(length):
        predicted = (slope * x[i]) + yInt
        difference = (y[i] - predicted)
        total += difference
    returnValue = (-2/length) * total
    return returnValue


m = 0
c = 0
l = 0.000001
iterations = 99999

for i in range(iterations):
    slope = DerivativeSlope(x, y, m, c)
    intercept = DerivativeIntercept(x, y, m, c)
    m = m - (l * slope)
    c = c - (l * intercept)

print(m, c)
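A quick way to sanity-check a gradient descent result for one input variable is to compare it with the closed-form least-squares line. The sketch below is an illustration only; `least_squares_fit` is a hypothetical helper name and the four points are made-up data on the line y = 2x + 1.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # prints 2.0 1.0
```

If gradient descent has run long enough with a suitable learning rate, its m and c should approach the values this formula gives for the same data.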

Linear Regression Using Gradient Descent

This is Linear Regression Using Gradient Descent, optimized to 50 milliseconds (average of 12).

"""
@nexclap/AdamBlumenfeld
"""

# Imports
import numpy as np
from random import randint
from matplotlib import pyplot as plt

# Define style of Matplotlib graphs
plt.style.use("ggplot")

# Define data
X = np.array([1, 3, 5, 6, 8, 10, 11, 18, 19, 20, 24, 26, 30, 32, 36, 38, 39, 40, 43, 46, 52, 55, 56, 58, 59])
y = np.array([3, 4, 5, 7, 8, 9, 10, 12, 14, 15, 21, 36, 37, 38, 39, 40, 43, 46, 49, 51, 54, 56, 58, 60, 69])

# Plot data
plt.scatter(X, y)
plt.show()


# Regressor class
class Regressor:

    # Training function
    def fit(self, X, y, learning_rate=0.00001, converge=0.001, cst=False):
        # cst is whether or not to keep a history of cost for further analysis
        self.cst_b = cst
        if cst:
            self.cst = [[], []]
        # Dataset
        self.X = X
        self.y = y
        # Learning rate, or "a" in the gradient descent formula
        self.learning_rate = learning_rate
        # The M and B values in the hypothesis function
        self.theta = [0, 0]
        # Cost, which initiates at infinity
        self.cost = float('inf')
        # The iterator of the gradient descent algorithm, mine is recursive (Lol, I just had to add that flex)
        self.gradient_decent_step(converge)

    # isub for theta, basically saying theta -= (whatever); for practical reasons I had to make it a separate function
    def theta_isub(self, i, other):
        self.theta[i] -= other
        return self.theta[i]

    # Calculate and update (or store if cst is True) cost
    def _cost(self, iteration=None):
        # Convergence measure: the sum of (error * x), squared, scaled by 1/(2n).
        # This is not the usual mean squared error; it shrinks toward zero as the slope gradient vanishes.
        self.cost = (1/(2*len(self.X))*sum([(self.h(self.X[index]) - self.y[index])*self.X[index] for index in range(len(self.X))])**2)
        if self.cst_b:
            # Update cst
            self.cst[0].append(self.cost)
            self.cst[1].append(iteration)

    # Hypothesis function
    def h(self, x):
        # h_θ(x) = θ₁ + θ₀x (Yes, I know that my hypothesis function is switched around)
        return x*self.theta[0] + self.theta[1]

    # Gradient descent iterator
    # Note: every iteration adds a stack frame, so long runs can hit Python's recursion limit
    def gradient_decent_step(self, converge, iteration=1):
        # Base case: if the cost is less than the set convergence point, then accept the current theta values
        if self.cost <= converge:
            return None
        # Do one iteration of gradient descent
        self._step()
        # Compute cost
        self._cost(iteration)
        return self.gradient_decent_step(converge, iteration+1)

    # All the math of gradient descent (Now you know why I made the theta_isub function)
    def _step(self):
        n = len(self.X)
        return [
            self.theta_isub(0, self.learning_rate * (1/n*sum([(self.h(self.X[index]) - self.y[index])*self.X[index] for index in range(n)]))),
            self.theta_isub(1, self.learning_rate * (1/n*sum([self.h(self.X[index]) - self.y[index] for index in range(n)]))),
        ]


# Define a model
model = Regressor()
# Train model (with cst=True for graphing)
model.fit(X, y, cst=True)
# Get the theta (M and B values) and the cst variable (or history of cost to iterations)
theta = model.theta
cst = model.cst

# Nerd plot stuff (Plot linear regression graph)
x = np.linspace(0, 60, 100)
y1 = theta[0]*x + theta[1]
plt.title("Linear Regression")
plt.scatter(X, y, c='teal')
plt.plot(x, y1)
#plt.savefig("linear_regression.png") (Saves graph to file)
plt.show()

# More nerd plot stuff (Plot cost graph (cst))
plt.title("Cost")
plt.plot(cst[1], cst[0])
#plt.savefig("cost.png") (Saves graph to file)
plt.show()
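For reference, the updates performed in `_step` can be written out as below, using the post's convention that theta[0] is the slope and theta[1] is the intercept. The symbols α (learning rate) and n (number of data points) are notation chosen here, and the cost J shown is the textbook squared-error cost that these updates are usually derived from, which is an assumption about intent rather than something the post states.

```latex
h_\theta(x) = \theta_0 x + \theta_1
\qquad
J(\theta_0, \theta_1) = \frac{1}{2n} \sum_{i=1}^{n} \bigl(h_\theta(x_i) - y_i\bigr)^2

\theta_0 \leftarrow \theta_0 - \alpha \, \frac{1}{n} \sum_{i=1}^{n} \bigl(h_\theta(x_i) - y_i\bigr)\, x_i
\qquad
\theta_1 \leftarrow \theta_1 - \alpha \, \frac{1}{n} \sum_{i=1}^{n} \bigl(h_\theta(x_i) - y_i\bigr)
```

Each update subtracts α times the partial derivative of J with respect to that parameter, which is where the factor of x_i in the slope update comes from.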

Linear Regression

This is linear regression from scratch. I used a Coursera course for help with this.

import matplotlib.pyplot as plt

x = [1, 5, 3, 4, 7, 9, 12, 13, 15, 16, 17, 4, 5, 2, 10, 23, 25]
y = [5, 12, 23, 14, 17, 8, 20, 21, 25, 38, 42, 10, 13, 7, 23, 50, 55]

# Scatter plot of the raw data (the points are not sorted, so a line plot would zigzag)
plt.scatter(x, y)
plt.show()


# Partial derivative of the mean squared error with respect to the slope
def slopeDerivative(x, y, slope, yInt):
    length = len(x)
    total = 0
    for i in range(length):
        predicted = (slope * x[i]) + yInt
        difference = (y[i] - predicted) * x[i]
        total += difference
    returnValue = (-2/length) * total
    return returnValue


# Partial derivative of the mean squared error with respect to the y-intercept
def interceptDerivative(x, y, slope, yInt):
    length = len(x)
    total = 0
    for i in range(length):
        predicted = (slope * x[i]) + yInt
        difference = (y[i] - predicted)
        total += difference
    returnValue = (-2/length) * total
    return returnValue


m = 0
c = 0
l = 0.0001            # learning rate
iterations = 1000000

for i in range(iterations):
    derivativeSlope = slopeDerivative(x, y, m, c)
    derivativeIntercept = interceptDerivative(x, y, m, c)
    m = m - (l * derivativeSlope)
    c = c - (l * derivativeIntercept)

print(m, c)

Naive Bayes Algorithm

This is a program demonstrating the Naive Bayes Algorithm. It is used to see if it will be a good day to golf.



# Set up the preset data
outlook = ['sunny','sunny','overcast','rainy','rainy','rainy','overcast','sunny',
           'sunny','rainy','sunny','overcast','overcast','rainy']
temperature = ['hot','hot','hot','mild','cool','cool','cool','mild','cool',
               'mild','mild','mild','hot','mild']
humidity = ['high','high','high','high','normal','normal','normal','high',
            'normal','normal','normal','high','normal','high']
windy = ['false','true','false','false','false','true','true','false','false',
         'false','true','true','false','true']
play = ['no','no','yes','yes','yes','no','yes','no','yes','yes','yes','yes',
        'yes','no']

print("Hi, welcome to the Naive Bayes Algorithm, which will help calculate if it will be a good day to golf!")
print("This program has some preset data that will be inputted into the algorithm\n")

# The day being classified is sunny, cool, high humidity, and windy

yescount = 0
nocount = 0


# Finds the number of good and bad golf days out of all the days tested
def findyes():
    global yescount, nocount
    for x in range(len(play)):
        if play[x] == 'no':
            nocount += 1
        else:
            yescount += 1
    print("There were", yescount, "/", len(play), "days you could golf on and",
          "there were", nocount, "/", len(play), "days you couldn't golf on")


findyes()


# Counts the sunny, cool, humid, and windy days among the days whose
# play value matches label ('yes' for good golf days, 'no' for bad ones)
def countconditions(label):
    sunny = 0
    cool = 0
    highhumidity = 0
    wind = 0
    for x in range(len(play)):
        if play[x] == label:
            if outlook[x] == 'sunny':
                sunny += 1
            if temperature[x] == 'cool':
                cool += 1
            if humidity[x] == 'high':
                highhumidity += 1
            if windy[x] == 'true':
                wind += 1
    return sunny, cool, highhumidity, wind


yessunny, yescool, yeshumid, yeswind = countconditions('yes')
nosunny, nocool, nohumid, nowind = countconditions('no')

print("\nProbability we can play the game:")
print("Probability of sunny given a good golf day is", yessunny, "/", yescount)
print("Probability of cool given a good golf day is", yescool, "/", yescount)
print("Probability of humid given a good golf day is", yeshumid, "/", yescount)
print("Probability of windy given a good golf day is", yeswind, "/", yescount, "\n")

print("Probability we cannot play a game:")
print("Probability of sunny given a stay home day is", nosunny, "/", nocount)
print("Probability of cool given a stay home day is", nocool, "/", nocount)
print("Probability of humid given a stay home day is", nohumid, "/", nocount)
print("Probability of windy given a stay home day is", nowind, "/", nocount)

# Calculate the evidence P(x), which is the denominator of the equation
total = len(play)
evidences = (((yeshumid + nohumid)/total)*((yessunny + nosunny)/total)*((yeswind + nowind)/total)*((yescool + nocool)/total))


# Multiplies all the yes probabilities and all the no probabilities, then compares them
# to see if it will be a good day to golf or a good day to stay home and watch a movie
def compare():
    yesequation = (((yessunny/yescount)*(yeshumid/yescount)*(yeswind/yescount)*(yescool/yescount)*(yescount/len(play)))/evidences)
    noequation = (((nosunny/nocount)*(nohumid/nocount)*(nowind/nocount)*(nocool/nocount)*(nocount/len(play)))/evidences)
    print("\nThe probability it will be a good day is", yesequation)
    print("The probability it will be a bad day is", noequation)
    if yesequation > noequation:
        print("\nThere is a higher probability it will be a good day for golf.", "Let's go play some!")
    else:
        print("\nThere is a higher probability it will be a bad day for golf.", "Let's go watch a movie!")


compare()
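The same calculation can be sketched more compactly by storing each day as one row and normalizing the two scores so they sum to 1, which sidesteps estimating the evidence term separately. This is only an illustrative sketch: `naive_bayes_posterior` is a hypothetical helper name, and the normalization step is a swapped-in alternative to the post's evidence product, not the author's method.

```python
from collections import Counter

# Rows are (outlook, temperature, humidity, windy, play), copied from the post's lists.
days = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]


def naive_bayes_posterior(rows, query):
    """Return {label: probability} for the query, normalized to sum to 1."""
    labels = Counter(row[-1] for row in rows)
    scores = {}
    for label, count in labels.items():
        score = count / len(rows)  # prior P(label)
        for position, value in enumerate(query):
            matches = sum(1 for row in rows if row[-1] == label and row[position] == value)
            score *= matches / count  # likelihood P(value | label)
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


posterior = naive_bayes_posterior(days, ("sunny", "cool", "high", "true"))
print(posterior)
```

For the sunny, cool, humid, windy day used in the post, this gives roughly 0.80 for 'no' and 0.20 for 'yes', so the conclusion matches: stay home. The sketch assumes at least one label ends up with a non-zero score; otherwise the final division would fail.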

Hangman

I made a program for Hangman, where the user is allowed 5 wrong guesses while trying to guess a random word.


import random

wordbank = ["dinosaur", "peanut", "pencil", "apple", "pineapple"]
usedletters = []

# chooses a word from the list randomly
word = random.choice(wordbank)
guesses = 0
amountwrong = 5

# welcomes user
print("Hi, welcome to Royce's Hangman Game\nRules: you have 5 wrong guesses. Good luck :)")


# asks for a letter until the user enters one that has not been guessed yet
def askletter():
    userguess = input("\nGuess a letter: ")
    while userguess in usedletters:
        print("Oops, you have already guessed that letter")
        userguess = input("\nGuess a letter: ")
    return userguess


# separates the word up into a list by letter
separatedword = list(word)
letterleft = len(separatedword)

print("For testing purposes I'm including the random word", word)
print("The random word is", letterleft, "letters long")

# the loop continues as long as the player hasn't run out of
# chances and hasn't guessed every letter yet
while amountwrong != 0 and letterleft != 0:
    # asks user for a new letter
    userguess = askletter()
    usedletters.append(userguess)
    guesses += 1

    # counts how many times the guessed letter appears in the word
    matches = separatedword.count(userguess)
    if matches > 0:
        print("You got the letter right!", "The word had a(n)", userguess)
        letterleft -= matches
    else:
        # a wrong guess takes away a chance and tells the user
        amountwrong -= 1
        print("Oops, that letter isn't in the word, try again")

    # prints out the stats
    print("You can still get", amountwrong, "wrong")
    print("There are still", letterleft, "letters left to guess")

# end of game
print("\nYou took", guesses, "guesses")
if amountwrong == 0:
    print("Too bad, you lost all your chances. Better luck next time!")
else:
    print("Congrats! You guessed the word!", "The word was", word)

K Means Clustering

This program first assigns initial clusters to a set of data. It then takes the mean of each of these clusters and then uses them to make new clusters. If the new cluster is the same as the previous loop's cluster then the program ends.


import math
import random

print("Hi, welcome to my K Means Cluster program")

clusterpoints = [(2,0),(2.3,0),(2.5,0),(2.7,0),(2.9,0),(2.2,0),
                 (3.1,0),(3.3,0),(3.4,0),(3.6,0),(3.9,0),(3.2,0),
                 (4,0),(4.5,0),(4.1,0),(4.3,0),(4.7,0),(4.5,0)]

# select value of K
kvalue = int(input('What value of K? '))


# find the mean of a cluster along the x-axis
def findmean(cluster):
    adder = 0
    for point in cluster:
        adder += point[0]
    return adder / len(cluster)


# function for calculating the distance between two points
def distancecalculator(first, second):
    return math.sqrt(((first[0] - second[0]) ** 2) + ((first[1] - second[1]) ** 2))


# select K different points at random as the first reference points
referencepoints = random.sample(sorted(set(clusterpoints)), kvalue)

previousclusters = None
counter = 0
same = False

# loop until the clusters are set
while not same:
    counter += 1
    clusters = [[] for _ in range(kvalue)]

    # assign each point to the cluster with the nearest reference point
    for point in clusterpoints:
        distances = [distancecalculator(point, reference) for reference in referencepoints]
        clusters[distances.index(min(distances))].append(point)

    # if every cluster is the same as in the previous loop, then end the loop
    if previousclusters is not None and all(sorted(new) == sorted(old) for new, old in zip(clusters, previousclusters)):
        print("\nThese are the previous clusters")
        for number, cluster in enumerate(previousclusters, 1):
            print("cluster", number, cluster)
        print("\nThese are the clusters on the next loop")
        for number, cluster in enumerate(clusters, 1):
            print("cluster", number, cluster)
        print("\nSince they are the same, the clusters are set")
        print("The clusters were set in", counter, "loops")
        same = True

    # keep the clusters for comparison on the next loop
    previousclusters = clusters

    # after each loop the mean of every cluster becomes its new reference point
    # (an empty cluster keeps its old reference point)
    referencepoints = [(findmean(cluster), 0) if cluster else reference
                       for cluster, reference in zip(clusters, referencepoints)]

Implementing Hierarchical Clustering into a dataset

This code applies hierarchical clustering to a real dataset of credit card information. It outputs the centroids of however many clusters the user wants the data split into.


import pandas as pd
import math

csv = pd.read_csv('Card.csv', header = 0)

# Build one tuple of six account features per row of the dataset
data = []
for x in range(len(csv)):
    balance = csv.iloc[x].iloc[1]
    balanceFrequency = csv.iloc[x].iloc[2]
    purchases = csv.iloc[x].iloc[3]
    oneOffPurchases = csv.iloc[x].iloc[4]
    installmentsPurchases = csv.iloc[x].iloc[5]
    cashAdvance = csv.iloc[x].iloc[6]
    accountTuple = (balance, balanceFrequency, purchases, oneOffPurchases, installmentsPurchases, cashAdvance)
    data.append(accountTuple)

k = int(input("How many final clusters do you want (max is 97): "))


# Euclidean distance between two points
def distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Find the two closest points among the current clusters
def minDistance(clusters):
    minDist = float('inf')
    minPoints = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            dist = distance(clusters[i], clusters[j])
            if dist < minDist:
                minDist = dist
                minPoints = (clusters[i], clusters[j])
    return minPoints


# The centroid of a pair is the midpoint of the two points
def findCentroid(minPoints):
    x = minPoints[0]
    y = minPoints[1]
    return tuple((x[d] + y[d]) / 2 for d in range(len(x)))


# Keep merging the two closest points until only k clusters remain
while len(data) > max(k, 1):
    minPoints = minDistance(data)
    data.remove(minPoints[0])
    data.remove(minPoints[1])
    data.append(findCentroid(minPoints))

print("Centroids:", data)
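Because the program above depends on Card.csv, here is a small self-contained sketch of the same merge-the-closest-pair idea on made-up 2-D points. The helper names `merge_closest` and `cluster` and the toy data are assumptions for illustration. As in the post's `findCentroid`, a merged pair is replaced by its plain midpoint, without weighting by how many points each side already represents.

```python
import math


def merge_closest(points):
    """Replace the two closest points with their midpoint (one merge step)."""
    best = None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    _, i, j = best
    midpoint = tuple((a + b) / 2 for a, b in zip(points[i], points[j]))
    remaining = [p for index, p in enumerate(points) if index not in (i, j)]
    return remaining + [midpoint]


def cluster(points, k):
    """Merge closest pairs until only k centroids are left."""
    points = list(points)
    while len(points) > k:
        points = merge_closest(points)
    return points


toy = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (50.0, 50.0)]
centroids = cluster(toy, 3)
print(sorted(centroids))  # prints [(0.0, 0.5), (10.0, 10.5), (50.0, 50.0)]
```

The two tight pairs are merged first and the distant point is left alone, which is the behavior the full program aims for on the credit card data. `math.dist` requires Python 3.8 or newer.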

Challenge: Program A Graphing Calculator: Prize Money!

Challenge Description: Create a program that simulates a graphing calculator with numerical, graphical, and simple algebraic functions. Submit your results to Nexclap with the post tag graphingcalculatorchallange. Be sure to include instructions on what platform and/or language the program is in. I will periodically update the state of the challenge and will post the final results on Nexclap on 8/8/2019.

Challenge rules: No Built-in packages allowed! The program must have an arithmetic function and must include these mathematical functions:

Sine: (Such as: sin(3))
Cosine: (Such as: cos(4))
Tangent: (Such as: tan(89))
Square root: (Such as: √4)
Nth root: (Such as: ∛8)
Square of x: (Such as: x²)
Nth exponent of x: (Such as: x³)
Order of operations: (Must obey order of operations)
Parenthetical support: (Such as: (1+3)/2)
Graphing equations: (Such as: x³ + 2x² - 1 or y = √x)

Challenge tips: A good model for your graphing calculator would be the TI-83 graphing calculator by Texas Instruments.

Feel free to post your questions on this challenge in the comments!

Best Of Luck! - ค๔ค๓

PRIZE MONEY! 0 dollars! It's a free challenge! You really expect me to give you prize money!!!
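One possible starting point for the order-of-operations and parenthetical-support requirements is a small recursive descent parser, sketched below in Python with no imports at all. The function names `tokenize` and `evaluate` are made up for this illustration, the sketch covers only +, -, *, /, ^, and parentheses, and it is not an official or complete solution to the challenge.

```python
def tokenize(expression):
    """Split an expression into number strings and single-character operators."""
    tokens, number = [], ""
    for character in expression:
        if character.isdigit() or character == ".":
            number += character
            continue
        if number:
            tokens.append(number)
            number = ""
        if character in "+-*/^()":
            tokens.append(character)
    if number:
        tokens.append(number)
    return tokens


def evaluate(expression):
    """Evaluate +, -, *, /, ^ and parentheses with the usual precedence."""
    tokens = tokenize(expression)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = tokens[position]
        position += 1
        return token

    def parse_sum():  # lowest precedence: + and -
        value = parse_product()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += parse_product()
            else:
                value -= parse_product()
        return value

    def parse_product():  # * and / bind tighter than + and -
        value = parse_power()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= parse_power()
            else:
                value /= parse_power()
        return value

    def parse_power():  # ^ binds tightest and is right-associative
        base = parse_atom()
        if peek() == "^":
            advance()
            return base ** parse_power()
        return base

    def parse_atom():  # a number, a negated value, or a parenthesized sum
        token = advance()
        if token == "(":
            value = parse_sum()
            advance()  # consume the closing ")"
            return value
        if token == "-":
            return -parse_power()
        return float(token)

    return parse_sum()


print(evaluate("(1+3)/2"))  # prints 2.0
print(evaluate("2+3*4"))    # prints 14.0
```

Each precedence level gets its own function, so multiplication is resolved before addition simply because `parse_sum` calls `parse_product` for its operands. Malformed input (for example a missing closing parenthesis) is not handled and raises an error; functions such as sin or √ would need extra token types and their own numeric routines.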

/* @nexclap/AdamBlumenfeld */ The Deadline Of This Challenge: 8/8/2019! Good Luck
Adam Blumenfeld Jul 08

I will periodically update the status of this challenge

Akhil Yeleswar Jul 08

Awesome. Should add a rule - no built in packages allowed

Implementing K Means Clustering into a dataset

This code takes a real dataset of credit card user information and uses K Means Clustering to categorize the users into separate clusters. It outputs the lowest variation and the centroids of the clusters.


import pandas as pd
import math
import random

csv = pd.read_csv('Card.csv', header = 0)

# Build one tuple of six account features per row of the dataset
data = []
for x in range(len(csv)):
    balance = csv.iloc[x].iloc[1]
    balanceFrequency = csv.iloc[x].iloc[2]
    purchases = csv.iloc[x].iloc[3]
    oneOffPurchases = csv.iloc[x].iloc[4]
    installmentsPurchases = csv.iloc[x].iloc[5]
    cashAdvance = csv.iloc[x].iloc[6]
    accountTuple = (balance, balanceFrequency, purchases, oneOffPurchases, installmentsPurchases, cashAdvance)
    data.append(accountTuple)

clstr1 = []
clstr2 = []
clstr3 = []
centroids = []
lowestVarCentroids = []


# Pick three different random data points as the starting centroids
def orignalCentroids():
    centroids.clear()
    centroids.extend(random.sample(data, 3))


# Euclidean distance between two points
def findDistance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Assign every data point to the cluster with the nearest centroid
def sortIntoClstr():
    clstr1.clear()
    clstr2.clear()
    clstr3.clear()
    clusters = [clstr1, clstr2, clstr3]
    for d in data:
        distances = [findDistance(centroid, d) for centroid in centroids]
        clusters[distances.index(min(distances))].append(d)


# Mean of each coordinate; an empty cluster keeps its previous centroid
def newCentroid(cluster, previous):
    if len(cluster) == 0:
        return previous
    return tuple(sum(c[i] for c in cluster) / len(cluster) for i in range(6))


def findNewCentroids():
    updated = [newCentroid(clstr1, centroids[0]),
               newCentroid(clstr2, centroids[1]),
               newCentroid(clstr3, centroids[2])]
    centroids.clear()
    centroids.extend(updated)


# Total distance from every point in a cluster to its centroid
def varCluster(cluster, centroid):
    var = 0
    for c in cluster:
        var += findDistance(centroid, c)
    return var


def findVariation():
    variation = 0
    variation += varCluster(clstr1, centroids[0])
    variation += varCluster(clstr2, centroids[1])
    variation += varCluster(clstr3, centroids[2])
    return variation


# One run of K means: 100 rounds of assigning points and moving centroids
def oneCycle():
    for x in range(0, 100, 1):
        sortIntoClstr()
        findNewCentroids()
    return findVariation()


# Run K means 21 times from random starting centroids and keep the best run
lowestVariation = float('inf')
for c in range(21):
    orignalCentroids()
    variation = oneCycle()
    if variation < lowestVariation:
        lowestVariation = variation
        lowestVarCentroids = list(centroids)

print('Lowest Variation: ')
print(lowestVariation)
print('Centroids: ')
print(lowestVarCentroids)