Big Data

Machine Learning


Data Analysis

Heaven Scent


We sourced a dataset about Las Vegas hotels to find out which factors affect hotel ratings. We used a generalized linear model (GLM) to identify which of the categorical factors had the most effect on a hotel's score. The dataset had 20 features.

import statsmodels.api as sm

# data.endog holds the response, data.exog the predictors
glm_binom = sm.GLM(data.endog, data.exog, family=sm.families.Binomial())
res = glm_binom.fit()

A nonlinear decision tree was fitted over 8 iterations, which gave a maximum depth of 4 and a minimum impurity decrease of 0.01.
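The write-up does not say how the iterations were run. One common approach is to try a small grid of candidate settings and keep the pair with the best validation accuracy. The sketch below does this on synthetic data; the candidate values, the 60/20/20 split, and the variable names are assumptions, not the original procedure.

```python
from itertools import product

from sklearn import tree
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in with 20 features; not the hotel data.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
x_train, x_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
x_val, x_test, y_val, y_test = train_test_split(x_rest, y_rest, test_size=0.5, random_state=0)

# Hypothetical grid: 4 depths x 2 impurity thresholds = 8 candidate trees.
depths = [2, 4, 7, 10]
min_decreases = [0.005, 0.01]

best_params, best_val_acc = None, -1.0
for depth, min_dec in product(depths, min_decreases):
    clf = tree.DecisionTreeClassifier(
        max_depth=depth, min_impurity_decrease=min_dec, random_state=0
    )
    clf.fit(x_train, y_train)
    # Score on the validation set so the test set stays untouched during tuning.
    val_acc = accuracy_score(y_val, clf.predict(x_val))
    if val_acc > best_val_acc:
        best_params, best_val_acc = (depth, min_dec), val_acc

print("Best (max_depth, min_impurity_decrease):", best_params)
print("Validation accuracy: %.3f" % best_val_acc)
```

Selecting on validation accuracy instead of training accuracy matters here, because a deeper tree almost always fits the training set better whether or not it generalizes.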

import numpy as np
from sklearn import tree
from sklearn.metrics import accuracy_score

# Convert predictors and target to NumPy arrays for the train, validation and test sets
x_train_h, x_val_h, x_test_h = np.array(train_h[predictors]), np.array(val_h[predictors]), np.array(test_h[predictors])
y_train_h, y_val_h, y_test_h = np.array(train_h[target]), np.array(val_h[target]), np.array(test_h[target])

r8 = tree.DecisionTreeClassifier(max_depth=7, min_impurity_decrease=0.005)  # Our classification tree
r8 = r8.fit(x_train_h, y_train_h)
print('1. Train set accuracy: %.3f' % accuracy_score(y_train_h, r8.predict(x_train_h)))