Machine Learning - Decision Tree Example

  • October 5, 2019
  • Notes

Background

This is one of my favorite algorithms, and I use it quite frequently. It is a supervised learning algorithm that is mostly used for classification problems. Surprisingly, it works for both categorical and continuous dependent variables. In this algorithm, we split the population into two or more homogeneous sets, based on the most significant attributes/independent variables, so as to make the groups as distinct as possible. For more details, see Decision Tree Simplified: https://www.analyticsvidhya.com/blog/2016/04/complete-tutorial-tree-based-modeling-scratch-in-python/

In the image above, you can see that the population is classified into four different groups based on multiple attributes, to identify whether or not they will play. To split the population into different heterogeneous groups, decision trees use various techniques such as Gini, Information Gain, Chi-square, and entropy.
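To make these splitting criteria concrete, here is a minimal sketch (my own addition, not from the original article) of how Gini impurity, entropy, and information gain can be computed for a candidate split; the sample labels below are hypothetical:

import numpy as np

def gini(labels):
    # Gini impurity: 1 - sum(p_k^2) over the class proportions p_k
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    # Entropy: -sum(p_k * log2(p_k)) over the class proportions p_k
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Hypothetical labels: 1 = "will play", 0 = "will not play"
parent = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1])
left   = np.array([1, 1, 1, 1, 1])   # one side of a candidate split
right  = np.array([0, 0, 0, 0, 0])   # the other side

# Information gain = parent entropy - weighted average entropy of the children
n = len(parent)
gain = entropy(parent) - (len(left) / n * entropy(left)
                          + len(right) / n * entropy(right))
print('Gini of parent :', gini(parent))     # 0.5: a perfectly mixed node
print('Information gain of split :', gain)  # 1.0: the split is perfectly pure

For two classes, a perfectly mixed node has a Gini impurity of 0.5 and an entropy of 1.0 bit, while a pure node scores 0 on both; the tree greedily prefers the split with the largest gain.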

The best way to understand how a decision tree works is to play Jezzball, a classic game from Microsoft (shown below). Essentially, you have a room with moving walls, and you need to create walls so that as much area as possible gets cleared of balls.

So, every time you split the room with a wall, you are trying to create two different populations within the same room. Decision trees work in a very similar fashion, by dividing a population into groups that are as different as possible.
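As a rough illustration of that wall-placing search, the following sketch (again my own addition, with hypothetical age/plays data) scans every candidate threshold on a single feature and keeps the split whose two sides are purest by weighted Gini impurity:

import numpy as np

def gini(labels):
    # Gini impurity of a set of class labels
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(feature, labels):
    # Try every midpoint between consecutive sorted feature values as a "wall"
    # and keep the threshold whose two sides are purest on weighted average.
    order = np.argsort(feature)
    feature, labels = feature[order], labels[order]
    best_t, best_score = None, np.inf
    for i in range(1, len(feature)):
        if feature[i] == feature[i - 1]:
            continue  # no threshold can separate identical values
        t = (feature[i] + feature[i - 1]) / 2
        left, right = labels[:i], labels[i:]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Hypothetical data: age of a player vs. whether they play (1) or not (0)
age   = np.array([10, 12, 15, 18, 25, 30, 35, 40])
plays = np.array([ 1,  1,  1,  1,  0,  0,  0,  0])
print(best_split(age, plays))  # best threshold 21.5, weighted Gini 0.0 (pure split)

A real decision tree repeats this search over every feature at every node, then recurses into the two resulting groups until a stopping condition is met.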

Next, let's look at a decision tree example using Python and scikit-learn:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# read the train and test dataset
train_data = pd.read_csv('train-data.csv')
test_data = pd.read_csv('test-data.csv')

# shape of the dataset
print('Shape of training data :', train_data.shape)
print('Shape of testing data :', test_data.shape)

# separate the independent variables from the 'Survived' target variable
train_x = train_data.drop(columns=['Survived'])
train_y = train_data['Survived']

test_x = test_data.drop(columns=['Survived'])
test_y = test_data['Survived']

# fit a decision tree classifier with default parameters
model = DecisionTreeClassifier()
model.fit(train_x, train_y)

# depth of the decision tree
print('Depth of the Decision Tree :', model.get_depth())

# predict the target on the train dataset
predict_train = model.predict(train_x)
print('Target on train data', predict_train)

# Accuracy Score on train dataset
accuracy_train = accuracy_score(train_y, predict_train)
print('accuracy_score on train dataset : ', accuracy_train)

# predict the target on the test dataset
predict_test = model.predict(test_x)
print('Target on test data', predict_test)

# Accuracy Score on test dataset
accuracy_test = accuracy_score(test_y, predict_test)
print('accuracy_score on test dataset : ', accuracy_test)

Running the code above produces the following output:

Shape of training data : (712, 25)
Shape of testing data : (179, 25)
Depth of the Decision Tree : 19
Target on train data [0 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0
 1 0 0 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 1 1 1 0 1 0 0 1 0 0 0 0 0
 0 1 1 0 0 1 0 0 1 1 1 0 0 0 1 0 1 0 0 1 0 0 0 1 1 0 0 1 0 1 1 1 0 0 0 0 0
 0 0 0 0 1 0 0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 0 0 1 0 1 1 0
 0 0 0 1 1 0 0 1 0 0 1 0 1 1 0 1 1 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1
 0 1 1 1 1 0 0 1 0 1 0 0 1 1 1 1 0 0 0 1 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0
 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 1 1 0 1 0 0 0 1 0
 0 1 1 0 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 0 0 1 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0
 0 0 1 1 0 1 1 0 1 0 1 1 1 0 0 0 1 0 1 1 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 1
 1 0 0 1 1 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0
 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 1 1 1 0 1 1 0 1 1 1
 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 0 0 1 0 0 1 0 1 0 1 1 1 1 0 0 1 0 0
 0 1 0 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0
 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 1 0 1 0 0 1
 0 0 0 1 1 0 0 1 0 0 1 0 1 0 0 1 0 0 0 1 0 0 1 1 0 1 0 0 0 0 1 1 0 1 1 1 0
 1 0 1 0 1 1 0 1 0 1 0 0 1 0 0 1 0 1 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1
 0 0 0 1 0 1 1 1 1 0 1 1 0 0 1 0 1 0 0 1 0 0 1 1 1 1 0 1 0 0 0 1 0 1 0 1 0
 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0
 1 0 1 1 1 0 0 1 0]
accuracy_score on train dataset :  0.9859550561797753
Target on test data [0 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 1 1 0 0 1 0 1 1 0 1 1 1 1 0
 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 0 1 1 1 0 1 1 1 0 1 1 1 0 0 0 0 0 1 0 0 0 0
 1 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 0 1 1 0 1
 0 1 0 0 0 0 1 1 1 1 0 1 1 1 1 1 0 0 1 1 0 0 1 1 0 0 0 1 0 1 0 1 0 0 0 1 0
 0 0 0 1 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 0]
accuracy_score on test dataset :  0.770949720670391
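One thing worth noting in this output: the tree grows to a depth of 19 and fits the training data almost perfectly (98.6% accuracy) but reaches only 77.1% on the test set, a classic sign of overfitting. A minimal follow-up sketch (my own addition, reusing the train_x/train_y/test_x/test_y variables from the code above) that compares a few max_depth settings would look like this:

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# limiting the depth of the tree usually trades a little training accuracy
# for better generalization; random_state fixes tie-breaking for repeatability
for depth in [3, 5, 10, None]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(train_x, train_y)
    train_acc = accuracy_score(train_y, model.predict(train_x))
    test_acc = accuracy_score(test_y, model.predict(test_x))
    print(f'max_depth={depth}: train={train_acc:.3f}, test={test_acc:.3f}')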