A Hands-On Guide to Implementing 30 Mainstream Machine Learning Algorithms in Python

  • October 4, 2019
  • Notes

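Last week I shared an article on how deeply you need to master machine learning algorithms:

Three Levels of Mastering Machine Learning Algorithms, with Recommended Resources!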

The third level is implementing mainstream machine learning models yourself in Python.

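Today I want to recommend a project recently open-sourced by David Bourgin, a postdoc at Princeton: mainstream ML models written by hand in NumPy. Having read through it, I find the code extremely readable.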

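For each group of code, the author provides reference material for the implementations, such as example plots of model behavior, the papers the code is based on, and reference links.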

Take linear regression as an example. In 500 lines of code, the author implements OLS, Ridge, Logistic, and Bayesian linear regression:

```python
import numpy as np

# Package-relative helpers from numpy-ml; not used by the class shown here.
from ..utils.testing import is_symmetric_positive_definite, is_number


class LinearRegression:
    def __init__(self, fit_intercept=True):
        """
        An ordinary least squares regression model fit via the normal equation.

        Parameters
        ----------
        fit_intercept : bool
            Whether to fit an additional intercept term in addition to the
            model coefficients. Default is True.
        """
        self.beta = None
        self.fit_intercept = fit_intercept

    def fit(self, X, y):
        """
        Fit the regression coefficients via maximum likelihood.

        Parameters
        ----------
        X : :py:class:`ndarray <numpy.ndarray>` of shape `(N, M)`
            A dataset consisting of `N` examples, each of dimension `M`.
        y : :py:class:`ndarray <numpy.ndarray>` of shape `(N, K)`
            The targets for each of the `N` examples in `X`, where each target
            has dimension `K`.
        """
        # convert X to a design matrix if we're fitting an intercept
        if self.fit_intercept:
            X = np.c_[np.ones(X.shape[0]), X]

        # beta = (X^T X)^{-1} X^T y
        pseudo_inverse = np.dot(np.linalg.inv(np.dot(X.T, X)), X.T)
        self.beta = np.dot(pseudo_inverse, y)

    def predict(self, X):
        """
        Use the trained model to generate predictions on a new collection of
        data points.

        Parameters
        ----------
        X : :py:class:`ndarray <numpy.ndarray>` of shape `(Z, M)`
            A dataset consisting of `Z` new examples, each of dimension `M`.

        Returns
        -------
        y_pred : :py:class:`ndarray <numpy.ndarray>` of shape `(Z, K)`
            The model predictions for the items in `X`.
        """
        # convert X to a design matrix if we're fitting an intercept
        if self.fit_intercept:
            X = np.c_[np.ones(X.shape[0]), X]
        return np.dot(X, self.beta)
```
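To see what the normal-equation fit does in practice, here is a minimal standalone sketch on synthetic data. It is not taken from the project: the helper name `fit_ols`, the random seed, and the coefficients are all made up for illustration, and it uses `np.linalg.solve` rather than an explicit matrix inverse.

```python
import numpy as np


def fit_ols(X, y, fit_intercept=True):
    """Hypothetical helper: solve the normal equation (X^T X) beta = X^T y."""
    if fit_intercept:
        # prepend a column of ones so the first coefficient is the intercept
        X = np.c_[np.ones(X.shape[0]), X]
    # solve the linear system directly instead of forming the inverse
    return np.linalg.solve(X.T @ X, X.T @ y)


# synthetic, noiseless data with known coefficients (illustrative values)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_beta = np.array([1.5, -2.0, 0.5, 3.0])  # intercept first, then 3 slopes
y = true_beta[0] + X @ true_beta[1:]

beta_hat = fit_ols(X, y)
recovered = np.allclose(beta_hat, true_beta)
print(beta_hat, recovered)
```

Because the targets here contain no noise, the fitted coefficients should match the ones used to generate the data up to floating-point error.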

The author also plots a comparison between the handwritten implementations and their sklearn counterparts.
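A comparison of this kind can also be done numerically. The sketch below is not the author's comparison script: it uses synthetic data, and `np.linalg.lstsq` stands in for sklearn as the reference solver so that the example depends only on NumPy.

```python
import numpy as np

# synthetic noisy data (illustrative values, not from the project)
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 2))
y = 2.0 + X @ np.array([0.7, -1.3]) + rng.normal(scale=0.1, size=100)

# design matrix with an intercept column, as in the excerpt above
X_design = np.c_[np.ones(X.shape[0]), X]

# handwritten normal-equation solution: (X^T X)^{-1} X^T y
beta_manual = np.linalg.inv(X_design.T @ X_design) @ X_design.T @ y

# reference least-squares solution (stand-in for a library estimator)
beta_ref, *_ = np.linalg.lstsq(X_design, y, rcond=None)

max_abs_diff = np.max(np.abs(beta_manual - beta_ref))
print(max_abs_diff)
```

On a well-conditioned problem like this one the two solutions should agree to within floating-point error; on ill-conditioned data the explicit inverse is the less stable of the two.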

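There is much more in the project worth digging into. If you work through the full set of implementations yourself, your grasp of machine learning fundamentals should become very solid. I also recommend reading the code alongside the documentation the author provides.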

Project: https://github.com/ddbourgin/numpy-ml

Documentation: https://numpy-ml.readthedocs.io/