AutoKeras: Automated Machine Learning
- October 6, 2019
- Notes
Note: this project comes from
https://github.com/jhfjhfj1/autokeras
Auto-Keras is an open source software library for automated machine learning (AutoML). It is developed by DATA Lab at Texas A&M University and community contributors. The ultimate goal of AutoML is to provide easily accessible deep learning tools to domain experts with limited data science or machine learning background. Auto-Keras provides functions to automatically search for the architecture and hyperparameters of deep learning models.
Getting started begins with installation.
A simple install with pip:
pip install autokeras
Note: at the time of writing, the autokeras library only supports Python 3.6.
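Because the note ties the library to a single interpreter version, a script can check for it before importing anything heavy. This is only a minimal sketch: the helper name `is_supported_python` and the `SUPPORTED` constant are illustrative and come from the note above, not from autokeras itself.

```python
import sys

# Version named in the note above; adjust it if the release you install
# documents a different requirement.
SUPPORTED = (3, 6)


def is_supported_python(version_info=None):
    """Return True when the interpreter's major.minor equals SUPPORTED."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == SUPPORTED


if __name__ == "__main__":
    if not is_supported_python():
        print("Warning: running Python %d.%d, expected %d.%d"
              % (sys.version_info[0], sys.version_info[1],
                 SUPPORTED[0], SUPPORTED[1]))
```

Running the check at the top of a training script gives an early, readable warning instead of an import error later on.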
Basic usage:
import autokeras as ak

clf = ak.ImageClassifier()
clf.fit(x_train, y_train)
results = clf.predict(x_test)
If you are new to the library, the examples are a good way to get a feel for it:
https://github.com/jhfjhfj1/autokeras/tree/master/examples
In the load_raw_image folder, load.py unzips load_raw_image_data.zip, extracts the training and test sets into the same directory, and then runs prediction.
# Assumed imports: these paths follow the autokeras 0.x layout and may
# differ between releases.
from autokeras import ImageClassifier
from autokeras.image.image_supervised import load_image_dataset


def load_images():
    x_train, y_train = load_image_dataset(csv_file_path="train/label.csv",
                                          images_path="train")
    print(x_train.shape)
    print(y_train.shape)

    x_test, y_test = load_image_dataset(csv_file_path="test/label.csv",
                                        images_path="test")
    print(x_test.shape)
    print(y_test.shape)
    return x_train, y_train, x_test, y_test


def run():
    x_train, y_train, x_test, y_test = load_images()
    # After loading, train and evaluate the classifier.
    clf = ImageClassifier(verbose=True, augment=False)
    clf.fit(x_train, y_train, time_limit=12 * 60 * 60)
    clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
    y = clf.evaluate(x_test, y_test)
    print(y * 100)
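The two preparatory steps in this example (unpack the archive, then pair each image file with its label) can be sketched with the standard library alone. This is not how `load_image_dataset` is implemented. It assumes that `label.csv` has a header row followed by two-column rows of file name and label, and the helper names `extract_archive` and `read_labels` are hypothetical.

```python
import csv
import os
import zipfile


def extract_archive(zip_path, target_dir=None):
    """Unzip zip_path into target_dir (default: the archive's own directory)."""
    if target_dir is None:
        target_dir = os.path.dirname(os.path.abspath(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    return target_dir


def read_labels(csv_file_path):
    """Return (file_names, labels) from a two-column CSV with a header row."""
    file_names, labels = [], []
    with open(csv_file_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed rows
            file_names.append(row[0])
            labels.append(row[1])
    return file_names, labels
```

Under those assumptions, calling `extract_archive("load_raw_image_data.zip")` and then `read_labels` on the extracted `train/label.csv` would give the file names and labels that an image loader then has to turn into arrays.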
labeledTrainData.tsv is the dataset used for training.
mnist.py
Train on the MNIST dataset with autokeras's ImageClassifier:
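# Assumed imports: the excerpt omits them, and the paths may differ
# between autokeras releases.
from keras.datasets import mnist
from autokeras import ImageClassifier

(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Add a trailing channel axis: (n, 28, 28) -> (n, 28, 28, 1).
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))

clf = ImageClassifier(verbose=True, augment=False)
clf.fit(x_train, y_train, time_limit=2 * 60)
# clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
y = clf.evaluate(x_test, y_test)
print(y * 100)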
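Two small details in this script are easy to check by hand. `x_train.shape + (1,)` is plain tuple concatenation that appends a channel axis, and `time_limit` is an arithmetic expression, taken here to be in seconds (an assumption). The sketch uses the standard MNIST array shapes rather than the real data, so it needs nothing beyond the standard library.

```python
# Standard MNIST array shapes: 60000 training and 10000 test images,
# each 28x28 pixels.
train_shape = (60000, 28, 28)
test_shape = (10000, 28, 28)

# Tuple concatenation appends a channel axis of size 1 (grayscale).
train_shape_with_channel = train_shape + (1,)
test_shape_with_channel = test_shape + (1,)
print(train_shape_with_channel)  # (60000, 28, 28, 1)
print(test_shape_with_channel)   # (10000, 28, 28, 1)


def element_count(shape):
    """Multiply all dimensions; a valid reshape must preserve this number."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# Adding an axis of size 1 leaves the element count unchanged.
assert element_count(train_shape) == element_count(train_shape_with_channel)

# Search budgets used in the examples, assuming the unit is seconds.
short_budget = 2 * 60        # 120 s, i.e. 2 minutes
long_budget = 12 * 60 * 60   # 43200 s, i.e. 12 hours
print(short_budget, long_budget)  # 120 43200
```

The two-minute budget in mnist.py is therefore a quick smoke test, while the twelve-hour budget in the other examples is a full search run.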
mnist_regression.py
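# Assumed imports: the excerpt omits them, and the paths may differ
# between autokeras releases.
from keras.datasets import mnist
from autokeras import ImageRegressor

(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Add a trailing channel axis: (n, 28, 28) -> (n, 28, 28, 1).
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))

clf = ImageRegressor(verbose=True, augment=False)
clf.fit(x_train, y_train, time_limit=12 * 60 * 60)
clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
y = clf.evaluate(x_test, y_test)
print(y * 100)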
Next, let's look at an autokeras demo:
from functools import reduce

import torch
import numpy as np
from torch.utils.data import DataLoader
from torchvision.transforms import Compose

from autokeras.nn.loss_function import classification_loss
from autokeras.nn.metric import Accuracy
from autokeras.nn.model_trainer import ModelTrainer
from autokeras.preprocessor import OneHotEncoder, MultiTransformDataset


class Net(torch.nn.Module):
    """A two-layer fully connected network."""

    def __init__(self, input_size, hidden_size, num_classes):
        super(Net, self).__init__()
        self.fc1 = torch.nn.Linear(input_size, hidden_size)
        self.relu = torch.nn.ReLU()
        self.fc2 = torch.nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out


model = Net(50, 100, 10)
n_instance = 100
batch_size = 32

# Random features and labels stand in for a real dataset.
train_x = np.random.random((n_instance, 50))
test_x = np.random.random((n_instance, 50))
# randint's upper bound is exclusive, so labels fall in 0..8.
train_y = np.random.randint(0, 9, n_instance)
test_y = np.random.randint(0, 9, n_instance)
print(train_x.shape)
print(train_y.shape)

# One-hot encode the labels.
encoder = OneHotEncoder()
encoder.fit(train_y)
train_y = encoder.transform(train_y)
test_y = encoder.transform(test_y)

# Wrap the arrays in DataLoaders; the transform list is empty.
compose_list = Compose([])
train_data = DataLoader(MultiTransformDataset(
    torch.Tensor(train_x), torch.Tensor(train_y), compose_list),
    batch_size=batch_size, shuffle=False)
test_data = DataLoader(MultiTransformDataset(
    torch.Tensor(test_x), torch.Tensor(test_y), compose_list),
    batch_size=batch_size, shuffle=False)

model_trainer = ModelTrainer(model,
                             loss_function=classification_loss,
                             metric=Accuracy,
                             train_data=train_data,
                             test_data=test_data,
                             verbose=True)
model_trainer.train_model(2, 1)

# Run inference batch by batch and collect the raw outputs.
model.eval()
outputs = []
with torch.no_grad():
    for index, (inputs, _) in enumerate(test_data):
        outputs.append(model(inputs).numpy())

# Join the per-batch outputs and map them back to labels.
output = reduce(lambda x, y: np.concatenate((x, y)), outputs)
predicted = encoder.inverse_transform(output)
print(predicted)
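The label handling in the demo can be illustrated without PyTorch. The sketch below is a pure-Python stand-in for the idea behind a one-hot encoder (fit, transform, and an inverse transform that takes the arg-max of each row) and for the `reduce` call that joins per-batch outputs. The class `TinyOneHotEncoder` is hypothetical and is not autokeras's implementation; the scores are made up.

```python
from functools import reduce


class TinyOneHotEncoder:
    """Hypothetical stand-in: maps labels to one-hot rows and back."""

    def fit(self, labels):
        # Sorted unique labels define the column order.
        self.classes = sorted(set(labels))
        self.index = {label: i for i, label in enumerate(self.classes)}
        return self

    def transform(self, labels):
        rows = []
        for label in labels:
            row = [0.0] * len(self.classes)
            row[self.index[label]] = 1.0
            rows.append(row)
        return rows

    def inverse_transform(self, scores):
        # Pick the column with the highest score in each row (arg-max).
        return [self.classes[max(range(len(row)), key=row.__getitem__)]
                for row in scores]


encoder = TinyOneHotEncoder().fit([2, 0, 1, 2])
encoded = encoder.transform([2, 0])
print(encoded)  # [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]

# Per-batch model outputs (made-up scores), joined with reduce in the
# same way the demo concatenates its batches.
batches = [[[0.1, 0.7, 0.2]], [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]]
scores = reduce(lambda x, y: x + y, batches)
print(encoder.inverse_transform(scores))  # [1, 0, 2]
```

One consequence worth noting: the inverse transform can only return labels the encoder saw during `fit`, so the number of score columns should match the number of fitted classes.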
See the original source for the full code.