diff --git a/01 Introduction/Google_Colab_Tutorial.ipynb b/01 Introduction/Google_Colab_Tutorial.ipynb new file mode 100644 index 0000000..0257bc1 --- /dev/null +++ b/01 Introduction/Google_Colab_Tutorial.ipynb @@ -0,0 +1,299 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Google Colab Tutorial", + "provenance": [], + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "ca2CpPPUvO-h" + }, + "source": [ + "# **Google Colab Tutorial**\n", + "\n", + "\n", + "If you have any questions, contact the TAs via
ntu-ml-2021spring-ta@googlegroups.com\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xIN7RF4wjgHk" + }, + "source": [ + "
\r\n", + "\r\n", + "

What is Colaboratory?

\r\n", + "\r\n", + "Colaboratory, or \"Colab\" for short, allows you to write and execute Python in your browser, with \r\n", + "- Zero configuration required\r\n", + "- Free access to GPUs\r\n", + "- Easy sharing\r\n", + "\r\n", + "Whether you're a **student**, a **data scientist** or an **AI researcher**, Colab can make your work easier. Watch [Introduction to Colab](https://www.youtube.com/watch?v=inN8seMm7UI) to learn more, or just get started below!\r\n", + "\r\n", + "You can type Python code in a code block, or start a line with an exclamation mark (!) to run it as a Linux shell command instead." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "IrAxlhp3VBMD" + }, + "source": [ + "To use the free GPU provided by Google, click on \"Runtime\"(執行階段) -> \"Change Runtime Type\"(變更執行階段類型). There are three options under \"Hardware Accelerator\"(硬體加速器); select \"GPU\". \r\n", + "* Doing this will restart the session, so make sure you change to the desired runtime before executing any code.\r\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "CLUWxZKbvQpx" + }, + "source": [ + "import torch\n", + "torch.cuda.is_available() # check whether a GPU is available\n", + "# Outputs True if running with GPU" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "EAM_tPQAELh0" + }, + "source": [ + "**1. Download Files via Google Drive**\n", + "\n", + " A file stored in Google Drive has the following sharing link:\n", + "\n", + " https://drive.google.com/open?id=1duQU7xqXRsOSPYeOR0zLiSA8g_LCFzoV\n", + " \n", + " The random string after \"open?id=\" is the **file_id**
\n", + "![](https://i.imgur.com/33SW1WZ.png)\n", + "\n", + " Given the **file_id**, you can download the file in Colab with the following command.\n", + "\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "XztYEj0oD7J3" + }, + "source": [ + "# Download the file with file_id \"1duQU7xqXRsOSPYeOR0zLiSA8g_LCFzoV\", and rename it to Minori.jpg\n", + "!gdown --id '1duQU7xqXRsOSPYeOR0zLiSA8g_LCFzoV' --output Minori.jpg" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Gg3T23LXG-eL" + }, + "source": [ + "# List all the files under the working directory\n", + "!ls" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "38dcGQujOVWM" + }, + "source": [ + "An exclamation mark (!) starts a new shell, runs the command, and then kills that shell, while a percent sign (%) affects the process associated with the notebook, so a directory change made with `%cd` persists across cells." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dOQxjfAZAsys" + }, + "source": [ + "It can be seen that `Minori.jpg` is saved to the current working directory. \r\n", + "\r\n", + "The workspace is temporary: once you close the browser, the file will be gone.\r\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wLUPcHuNHF8u" + }, + "source": [ + "Clicking on the folder icon shows a visualization of the file structure\n", + "
\n", + "  ![image.png]()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MXp98PijHkrk" + }, + "source": [ + "There should be a file named `Minori.jpg`, if you do not see it, click the icon in the middle (refresh button)
\n", + "  ![](https://i.imgur.com/CNBTH23.png)\n", + "
\n", + "You can double click on the file to view the image.\n", + "\n", + "\n", + "   \n", + "![](https://i.imgur.com/h2PLMrq.png)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "k_gmTo9NKtu9" + }, + "source": [ + "**2. Mounting Google Drive**\n", + "\n", + " One advantage of using Google Colab is that connecting to other Google services such as Google Drive is simple. By mounting Google Drive, the working files can be stored permanently. After executing the following code block, log in to your Google account and copy the authentication code into the input box to finish the process." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ImETTQKkL2l4" + }, + "source": [ + "from google.colab import drive # Import the drive module from the google.colab package\n", + "drive.mount('/content/drive', force_remount=True) # Mount Google Drive at the directory `/content/drive`" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "BmvzTF5IJ6TL" + }, + "source": [ + "from google.colab import drive\n", + "drive.mount('/content/drive')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AkmayCmGMD03" + }, + "source": [ + "After mounting the drive, the contents of your Google Drive will be under a directory named `MyDrive`. Check the file structure for this folder to confirm that the code ran successfully." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kDrO_DjBMW5D" + }, + "source": [ + "There is also an icon for mounting Google Drive. The icon will automatically generate the code above.\n", + "\n", + "![](https://i.imgur.com/hM9Jgi7.png) \n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UhKhwipoMvXF" + }, + "source": [ + "After mounting the drive, all changes will be synced with your Google Drive.\n", + "Since models can be quite large, make sure that your Google Drive has enough space. 
You can apply for a gsuite drive which has unlimited space using your studentID (until 2022/07). \n", + "https://www.cc.ntu.edu.tw/chinese/services/serv_i06.asp\n", + "http://www.cc.ntu.edu.tw/english/spotlight/2016/a105038.asp" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "UT0TEPRS7KF6" + }, + "source": [ + "%cd /content/drive/MyDrive \r\n", + "#change directory to google drive\r\n", + "!mkdir ML2021 #make a directory named ML2021\r\n", + "%cd ./ML2021 \r\n", + "#change directory to ML2021" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Oj13Q58QerAx" + }, + "source": [ + "Use bash command pwd to output the current directory" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "-S8l1-ReepkS" + }, + "source": [ + "!pwd #output the current directory" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qSSvrDaBiDrP" + }, + "source": [ + "Repeat the downloading process, this time, the file will be stored permanently in your google drive." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "b39YMYicASvP" + }, + "source": [ + "# Download the file with file_id \"1duQU7xqXRsOSPYeOR0zLiSA8g_LCFzoV\", and rename it to Minori.jpg\r\n", + "!gdown --id '1duQU7xqXRsOSPYeOR0zLiSA8g_LCFzoV' --output Minori.jpg" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D0URgikZXl5I" + }, + "source": [ + "TA will provide the homework data using code similar to the code above. The data could also be stored in the google drive and loaded from there." 
+ ] + } + ] +} \ No newline at end of file diff --git a/01 Introduction/ML2021Spring_HW1.ipynb b/01 Introduction/ML2021Spring_HW1.ipynb new file mode 100644 index 0000000..94b7712 --- /dev/null +++ b/01 Introduction/ML2021Spring_HW1.ipynb @@ -0,0 +1,874 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "ML2021Spring - HW1.ipynb", + "provenance": [], + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "mz0_QVkxCrX3" + }, + "source": [ + "# **Homework 1: COVID-19 Cases Prediction (Regression)**" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZeZnPAiwDRWG" + }, + "source": [ + "Author: Heng-Jui Chang\n", + "\n", + "Slides: https://github.com/ga642381/ML2021-Spring/blob/main/HW01/HW01.pdf \n", + "Video: TBA\n", + "\n", + "Objectives:\n", + "* Solve a regression problem with deep neural networks (DNN).\n", + "* Understand basic DNN training tips.\n", + "* Get familiar with PyTorch.\n", + "\n", + "If you have any questions, please contact the TAs via TA hours, NTU COOL, or email.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Jx3x1nDkG-Uy" + }, + "source": [ + "# **Download Data**\n", + "\n", + "\n", + "If the Google Drive links are dead, you can download the data from [Kaggle](https://www.kaggle.com/c/ml2021spring-hw1/data) and upload it manually to the workspace." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "tMj55YDKG6ch", + "outputId": "fc40ecc9-4756-48b1-d5c6-c169a8b453b2" + }, + "source": [ + "tr_path = 'covid.train.csv' # path to training data\n", + "tt_path = 'covid.test.csv' # path to testing data\n", + "\n", + "!gdown --id '19CCyCgJrUxtvgZF53vnctJiOJ23T5mqF' --output covid.train.csv\n", + "!gdown --id '1CE240jLm2npU-tdz81-oVKEF3T2yfT1O' --output covid.test.csv" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Downloading...\n", + "From: https://drive.google.com/uc?id=19CCyCgJrUxtvgZF53vnctJiOJ23T5mqF\n", + "To: /content/covid.train.csv\n", + "100% 2.00M/2.00M [00:00<00:00, 31.7MB/s]\n", + "Downloading...\n", + "From: https://drive.google.com/uc?id=1CE240jLm2npU-tdz81-oVKEF3T2yfT1O\n", + "To: /content/covid.test.csv\n", + "100% 651k/651k [00:00<00:00, 10.2MB/s]\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wS_4-77xHk44" + }, + "source": [ + "# **Import Some Packages**" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "k-onQd4JNA5H" + }, + "source": [ + "# PyTorch\n", + "import torch\n", + "import torch.nn as nn\n", + "from torch.utils.data import Dataset, DataLoader\n", + "\n", + "# For data preprocess\n", + "import numpy as np\n", + "import csv\n", + "import os\n", + "\n", + "# For plotting\n", + "import matplotlib.pyplot as plt\n", + "from matplotlib.pyplot import figure\n", + "\n", + "myseed = 42069 # set a random seed for reproducibility\n", + "torch.backends.cudnn.deterministic = True\n", + "torch.backends.cudnn.benchmark = False\n", + "np.random.seed(myseed)\n", + "torch.manual_seed(myseed)\n", + "if torch.cuda.is_available():\n", + " torch.cuda.manual_seed_all(myseed)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BtE3b6JEH7rw" + }, + "source": [ + "# **Some Utilities**\n", 
+ "\n", + "You do not need to modify this part." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "FWMT3uf1NGQp" + }, + "source": [ + "def get_device():\n", + " ''' Get device (if GPU is available, use GPU) '''\n", + " return 'cuda' if torch.cuda.is_available() else 'cpu'\n", + "\n", + "def plot_learning_curve(loss_record, title=''):\n", + " ''' Plot learning curve of your DNN (train & dev loss) '''\n", + " total_steps = len(loss_record['train'])\n", + " x_1 = range(total_steps)\n", + " x_2 = x_1[::len(loss_record['train']) // len(loss_record['dev'])]\n", + " figure(figsize=(6, 4))\n", + " plt.plot(x_1, loss_record['train'], c='tab:red', label='train')\n", + " plt.plot(x_2, loss_record['dev'], c='tab:cyan', label='dev')\n", + " plt.ylim(0.0, 5.)\n", + " plt.xlabel('Training steps')\n", + " plt.ylabel('MSE loss')\n", + " plt.title('Learning curve of {}'.format(title))\n", + " plt.legend()\n", + " plt.show()\n", + "\n", + "\n", + "def plot_pred(dv_set, model, device, lim=35., preds=None, targets=None):\n", + " ''' Plot prediction of your DNN '''\n", + " if preds is None or targets is None:\n", + " model.eval()\n", + " preds, targets = [], []\n", + " for x, y in dv_set:\n", + " x, y = x.to(device), y.to(device)\n", + " with torch.no_grad():\n", + " pred = model(x)\n", + " preds.append(pred.detach().cpu())\n", + " targets.append(y.detach().cpu())\n", + " preds = torch.cat(preds, dim=0).numpy()\n", + " targets = torch.cat(targets, dim=0).numpy()\n", + "\n", + " figure(figsize=(5, 5))\n", + " plt.scatter(targets, preds, c='r', alpha=0.5)\n", + " plt.plot([-0.2, lim], [-0.2, lim], c='b')\n", + " plt.xlim(-0.2, lim)\n", + " plt.ylim(-0.2, lim)\n", + " plt.xlabel('ground truth value')\n", + " plt.ylabel('predicted value')\n", + " plt.title('Ground Truth v.s. 
Prediction')\n", + " plt.show()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "39U_XFX6KOoj" + }, + "source": [ + "# **Preprocess**\n", + "\n", + "We have three kinds of datasets:\n", + "* `train`: for training\n", + "* `dev`: for validation\n", + "* `test`: for testing (w/o target value)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TQ-MdwpLL7Dt" + }, + "source": [ + "## **Dataset**\n", + "\n", + "The `COVID19Dataset` below does:\n", + "* read `.csv` files\n", + "* extract features\n", + "* split `covid.train.csv` into train/dev sets\n", + "* normalize features\n", + "\n", + "Finishing `TODO` below might make you pass medium baseline." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "0zlpIp9ANJRU" + }, + "source": [ + "class COVID19Dataset(Dataset):\n", + " ''' Dataset for loading and preprocessing the COVID19 dataset '''\n", + " def __init__(self,\n", + " path,\n", + " mode='train',\n", + " target_only=False):\n", + " self.mode = mode\n", + "\n", + " # Read data into numpy arrays\n", + " with open(path, 'r') as fp:\n", + " data = list(csv.reader(fp))\n", + " data = np.array(data[1:])[:, 1:].astype(float)\n", + " \n", + " if not target_only:\n", + " feats = list(range(93))\n", + " else:\n", + " # TODO: Using 40 states & 2 tested_positive features (indices = 57 & 75)\n", + " pass\n", + "\n", + " if mode == 'test':\n", + " # Testing data\n", + " # data: 893 x 93 (40 states + day 1 (18) + day 2 (18) + day 3 (17))\n", + " data = data[:, feats]\n", + " self.data = torch.FloatTensor(data)\n", + " else:\n", + " # Training data (train/dev sets)\n", + " # data: 2700 x 94 (40 states + day 1 (18) + day 2 (18) + day 3 (18))\n", + " target = data[:, -1]\n", + " data = data[:, feats]\n", + " \n", + " # Splitting training data into train & dev sets\n", + " if mode == 'train':\n", + " indices = [i for i in range(len(data)) if i % 10 != 0]\n", + " elif mode == 'dev':\n", + " indices = [i 
for i in range(len(data)) if i % 10 == 0]\n", + " \n", + " # Convert data into PyTorch tensors\n", + " self.data = torch.FloatTensor(data[indices])\n", + " self.target = torch.FloatTensor(target[indices])\n", + "\n", + " # Normalize features (you may remove this part to see what will happen)\n", + " self.data[:, 40:] = \\\n", + " (self.data[:, 40:] - self.data[:, 40:].mean(dim=0, keepdim=True)) \\\n", + " / self.data[:, 40:].std(dim=0, keepdim=True)\n", + "\n", + " self.dim = self.data.shape[1]\n", + "\n", + " print('Finished reading the {} set of COVID19 Dataset ({} samples found, each dim = {})'\n", + " .format(mode, len(self.data), self.dim))\n", + "\n", + " def __getitem__(self, index):\n", + " # Returns one sample at a time\n", + " if self.mode in ['train', 'dev']:\n", + " # For training\n", + " return self.data[index], self.target[index]\n", + " else:\n", + " # For testing (no target)\n", + " return self.data[index]\n", + "\n", + " def __len__(self):\n", + " # Returns the size of the dataset\n", + " return len(self.data)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AlhTlkE7MDo3" + }, + "source": [ + "## **DataLoader**\n", + "\n", + "A `DataLoader` loads data from a given `Dataset` into batches.\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "hlhLk5t6MBX3" + }, + "source": [ + "def prep_dataloader(path, mode, batch_size, n_jobs=0, target_only=False):\n", + " ''' Generates a dataset, then is put into a dataloader. 
'''\n", + " dataset = COVID19Dataset(path, mode=mode, target_only=target_only) # Construct dataset\n", + " dataloader = DataLoader(\n", + " dataset, batch_size,\n", + " shuffle=(mode == 'train'), drop_last=False,\n", + " num_workers=n_jobs, pin_memory=True) # Construct dataloader\n", + " return dataloader" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SGuycwR0MeQB" + }, + "source": [ + "# **Deep Neural Network**\n", + "\n", + "`NeuralNet` is an `nn.Module` designed for regression.\n", + "The DNN consists of 2 fully-connected layers with ReLU activation.\n", + "This module also includes a function `cal_loss` for calculating loss.\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "49-uXYovOAI0" + }, + "source": [ + "class NeuralNet(nn.Module):\n", + " ''' A simple fully-connected deep neural network '''\n", + " def __init__(self, input_dim):\n", + " super(NeuralNet, self).__init__()\n", + "\n", + " # Define your neural network here\n", + " # TODO: How to modify this model to achieve better performance?\n", + " self.net = nn.Sequential(\n", + " nn.Linear(input_dim, 64),\n", + " nn.ReLU(),\n", + " nn.Linear(64, 1)\n", + " )\n", + "\n", + " # Mean squared error loss\n", + " self.criterion = nn.MSELoss(reduction='mean')\n", + "\n", + " def forward(self, x):\n", + " ''' Given input of size (batch_size x input_dim), compute output of the network '''\n", + " return self.net(x).squeeze(1)\n", + "\n", + " def cal_loss(self, pred, target):\n", + " ''' Calculate loss '''\n", + " # TODO: you may implement L2 regularization here\n", + " return self.criterion(pred, target)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DvFWVjZ5Nvga" + }, + "source": [ + "# **Train/Dev/Test**" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MAM8QecJOyqn" + }, + "source": [ + "## **Training**" + ] + }, + { + "cell_type": "code", + "metadata": { + 
"id": "lOqcmYzMO7jB" + }, + "source": [ + "def train(tr_set, dv_set, model, config, device):\n", + " ''' DNN training '''\n", + "\n", + " n_epochs = config['n_epochs'] # Maximum number of epochs\n", + "\n", + " # Setup optimizer\n", + " optimizer = getattr(torch.optim, config['optimizer'])(\n", + " model.parameters(), **config['optim_hparas'])\n", + "\n", + " min_mse = 1000.\n", + " loss_record = {'train': [], 'dev': []} # for recording training loss\n", + " early_stop_cnt = 0\n", + " epoch = 0\n", + " while epoch < n_epochs:\n", + " model.train() # set model to training mode\n", + " for x, y in tr_set: # iterate through the dataloader\n", + " optimizer.zero_grad() # set gradient to zero\n", + " x, y = x.to(device), y.to(device) # move data to device (cpu/cuda)\n", + " pred = model(x) # forward pass (compute output)\n", + " mse_loss = model.cal_loss(pred, y) # compute loss\n", + " mse_loss.backward() # compute gradient (backpropagation)\n", + " optimizer.step() # update model with optimizer\n", + " loss_record['train'].append(mse_loss.detach().cpu().item())\n", + "\n", + " # After each epoch, test your model on the validation (development) set.\n", + " dev_mse = dev(dv_set, model, device)\n", + " if dev_mse < min_mse:\n", + " # Save model if your model improved\n", + " min_mse = dev_mse\n", + " print('Saving model (epoch = {:4d}, loss = {:.4f})'\n", + " .format(epoch + 1, min_mse))\n", + " torch.save(model.state_dict(), config['save_path']) # Save model to specified path\n", + " early_stop_cnt = 0\n", + " else:\n", + " early_stop_cnt += 1\n", + "\n", + " epoch += 1\n", + " loss_record['dev'].append(dev_mse)\n", + " if early_stop_cnt > config['early_stop']:\n", + " # Stop training if your model stops improving for \"config['early_stop']\" epochs.\n", + " break\n", + "\n", + " print('Finished training after {} epochs'.format(epoch))\n", + " return min_mse, loss_record" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { 
+ "id": "0hSd4Bn3O2PL" + }, + "source": [ + "## **Validation**" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "yrxrD3YsN3U2" + }, + "source": [ + "def dev(dv_set, model, device):\n", + " model.eval() # set model to evaluation mode\n", + " total_loss = 0\n", + " for x, y in dv_set: # iterate through the dataloader\n", + " x, y = x.to(device), y.to(device) # move data to device (cpu/cuda)\n", + " with torch.no_grad(): # disable gradient calculation\n", + " pred = model(x) # forward pass (compute output)\n", + " mse_loss = model.cal_loss(pred, y) # compute loss\n", + " total_loss += mse_loss.detach().cpu().item() * len(x) # accumulate loss\n", + " total_loss = total_loss / len(dv_set.dataset) # compute averaged loss\n", + "\n", + " return total_loss" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "g0pdrhQAO41L" + }, + "source": [ + "## **Testing**" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "aSBMRFlYN5tB" + }, + "source": [ + "def test(tt_set, model, device):\n", + " model.eval() # set model to evaluation mode\n", + " preds = []\n", + " for x in tt_set: # iterate through the dataloader\n", + " x = x.to(device) # move data to device (cpu/cuda)\n", + " with torch.no_grad(): # disable gradient calculation\n", + " pred = model(x) # forward pass (compute output)\n", + " preds.append(pred.detach().cpu()) # collect prediction\n", + " preds = torch.cat(preds, dim=0).numpy() # concatenate all predictions and convert to a numpy array\n", + " return preds" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SvckkF5dvf0j" + }, + "source": [ + "# **Setup Hyper-parameters**\n", + "\n", + "`config` contains hyper-parameters for training and the path to save your model." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "NPXpdumwPjE7" + }, + "source": [ + "device = get_device() # get the currently available device ('cpu' or 'cuda')\n", + "os.makedirs('models', exist_ok=True) # The trained model will be saved to ./models/\n", + "target_only = False # TODO: Using 40 states & 2 tested_positive features\n", + "\n", + "# TODO: How to tune these hyper-parameters to improve your model's performance?\n", + "config = {\n", + " 'n_epochs': 3000, # maximum number of epochs\n", + " 'batch_size': 270, # mini-batch size for dataloader\n", + " 'optimizer': 'SGD', # optimization algorithm (optimizer in torch.optim)\n", + " 'optim_hparas': { # hyper-parameters for the optimizer (depends on which optimizer you are using)\n", + " 'lr': 0.001, # learning rate of SGD\n", + " 'momentum': 0.9 # momentum for SGD\n", + " },\n", + " 'early_stop': 200, # early stopping epochs (the number of epochs since your model's last improvement)\n", + " 'save_path': 'models/model.pth' # your model will be saved here\n", + "}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6j1eOV3TOH-j" + }, + "source": [ + "# **Load data and model**" + ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "eNrYBMmePLKm", + "outputId": "fcd4f175-4f7e-4306-f33c-5f8285f11dce" + }, + "source": [ + "tr_set = prep_dataloader(tr_path, 'train', config['batch_size'], target_only=target_only)\n", + "dv_set = prep_dataloader(tr_path, 'dev', config['batch_size'], target_only=target_only)\n", + "tt_set = prep_dataloader(tt_path, 'test', config['batch_size'], target_only=target_only)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Finished reading the train set of COVID19 Dataset (2430 samples found, each dim = 93)\n", + "Finished reading the dev set of COVID19 Dataset (270 samples found, each dim = 93)\n", + "Finished reading 
the test set of COVID19 Dataset (893 samples found, each dim = 93)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "FHylSirLP9oh" + }, + "source": [ + "model = NeuralNet(tr_set.dataset.dim).to(device) # Construct model and move to device" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sX2B_zgSOPTJ" + }, + "source": [ + "# **Start Training!**" + ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "GrEbUxazQAAZ", + "outputId": "f4f3bd74-2d97-4275-b69f-6609976b91f9" + }, + "source": [ + "model_loss, model_loss_record = train(tr_set, dv_set, model, config, device)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Saving model (epoch = 1, loss = 74.9742)\n", + "Saving model (epoch = 2, loss = 50.5313)\n", + "Saving model (epoch = 3, loss = 29.1148)\n", + "Saving model (epoch = 4, loss = 15.8134)\n", + "Saving model (epoch = 5, loss = 9.5430)\n", + "Saving model (epoch = 6, loss = 6.8086)\n", + "Saving model (epoch = 7, loss = 5.3892)\n", + "Saving model (epoch = 8, loss = 4.5267)\n", + "Saving model (epoch = 9, loss = 3.9454)\n", + "Saving model (epoch = 10, loss = 3.5560)\n", + "Saving model (epoch = 11, loss = 3.2303)\n", + "Saving model (epoch = 12, loss = 2.9920)\n", + "Saving model (epoch = 13, loss = 2.7737)\n", + "Saving model (epoch = 14, loss = 2.6181)\n", + "Saving model (epoch = 15, loss = 2.3987)\n", + "Saving model (epoch = 16, loss = 2.2712)\n", + "Saving model (epoch = 17, loss = 2.1349)\n", + "Saving model (epoch = 18, loss = 2.0210)\n", + "Saving model (epoch = 19, loss = 1.8848)\n", + "Saving model (epoch = 20, loss = 1.7999)\n", + "Saving model (epoch = 21, loss = 1.7510)\n", + "Saving model (epoch = 22, loss = 1.6787)\n", + "Saving model (epoch = 23, loss = 1.6450)\n", + "Saving model (epoch = 24, loss = 1.6030)\n", + "Saving model (epoch = 
26, loss = 1.5052)\n", + "Saving model (epoch = 27, loss = 1.4486)\n", + "Saving model (epoch = 28, loss = 1.4069)\n", + "Saving model (epoch = 29, loss = 1.3733)\n", + "Saving model (epoch = 30, loss = 1.3533)\n", + "Saving model (epoch = 31, loss = 1.3335)\n", + "Saving model (epoch = 32, loss = 1.3011)\n", + "Saving model (epoch = 33, loss = 1.2711)\n", + "Saving model (epoch = 35, loss = 1.2331)\n", + "Saving model (epoch = 36, loss = 1.2235)\n", + "Saving model (epoch = 38, loss = 1.2180)\n", + "Saving model (epoch = 39, loss = 1.2018)\n", + "Saving model (epoch = 40, loss = 1.1651)\n", + "Saving model (epoch = 42, loss = 1.1631)\n", + "Saving model (epoch = 43, loss = 1.1394)\n", + "Saving model (epoch = 46, loss = 1.1129)\n", + "Saving model (epoch = 47, loss = 1.1107)\n", + "Saving model (epoch = 49, loss = 1.1091)\n", + "Saving model (epoch = 50, loss = 1.0838)\n", + "Saving model (epoch = 52, loss = 1.0692)\n", + "Saving model (epoch = 53, loss = 1.0681)\n", + "Saving model (epoch = 55, loss = 1.0537)\n", + "Saving model (epoch = 60, loss = 1.0457)\n", + "Saving model (epoch = 61, loss = 1.0366)\n", + "Saving model (epoch = 63, loss = 1.0359)\n", + "Saving model (epoch = 64, loss = 1.0111)\n", + "Saving model (epoch = 69, loss = 1.0072)\n", + "Saving model (epoch = 72, loss = 0.9760)\n", + "Saving model (epoch = 76, loss = 0.9672)\n", + "Saving model (epoch = 79, loss = 0.9584)\n", + "Saving model (epoch = 80, loss = 0.9526)\n", + "Saving model (epoch = 82, loss = 0.9494)\n", + "Saving model (epoch = 83, loss = 0.9426)\n", + "Saving model (epoch = 88, loss = 0.9398)\n", + "Saving model (epoch = 89, loss = 0.9223)\n", + "Saving model (epoch = 95, loss = 0.9111)\n", + "Saving model (epoch = 98, loss = 0.9034)\n", + "Saving model (epoch = 101, loss = 0.9014)\n", + "Saving model (epoch = 105, loss = 0.9011)\n", + "Saving model (epoch = 106, loss = 0.8933)\n", + "Saving model (epoch = 110, loss = 0.8893)\n", + "Saving model (epoch = 117, loss = 0.8867)\n", + 
"Saving model (epoch = 118, loss = 0.8867)\n", + "Saving model (epoch = 121, loss = 0.8790)\n", + "Saving model (epoch = 126, loss = 0.8642)\n", + "Saving model (epoch = 130, loss = 0.8627)\n", + "Saving model (epoch = 137, loss = 0.8616)\n", + "Saving model (epoch = 139, loss = 0.8534)\n", + "Saving model (epoch = 147, loss = 0.8467)\n", + "Saving model (epoch = 154, loss = 0.8463)\n", + "Saving model (epoch = 155, loss = 0.8408)\n", + "Saving model (epoch = 167, loss = 0.8354)\n", + "Saving model (epoch = 176, loss = 0.8314)\n", + "Saving model (epoch = 191, loss = 0.8267)\n", + "Saving model (epoch = 200, loss = 0.8212)\n", + "Saving model (epoch = 226, loss = 0.8190)\n", + "Saving model (epoch = 230, loss = 0.8144)\n", + "Saving model (epoch = 244, loss = 0.8136)\n", + "Saving model (epoch = 258, loss = 0.8095)\n", + "Saving model (epoch = 269, loss = 0.8076)\n", + "Saving model (epoch = 285, loss = 0.8064)\n", + "Saving model (epoch = 330, loss = 0.8055)\n", + "Saving model (epoch = 347, loss = 0.8053)\n", + "Saving model (epoch = 359, loss = 0.7992)\n", + "Saving model (epoch = 410, loss = 0.7989)\n", + "Saving model (epoch = 442, loss = 0.7966)\n", + "Saving model (epoch = 447, loss = 0.7966)\n", + "Saving model (epoch = 576, loss = 0.7958)\n", + "Saving model (epoch = 596, loss = 0.7929)\n", + "Saving model (epoch = 600, loss = 0.7893)\n", + "Saving model (epoch = 683, loss = 0.7825)\n", + "Saving model (epoch = 878, loss = 0.7817)\n", + "Saving model (epoch = 904, loss = 0.7794)\n", + "Saving model (epoch = 931, loss = 0.7790)\n", + "Saving model (epoch = 951, loss = 0.7781)\n", + "Saving model (epoch = 965, loss = 0.7771)\n", + "Saving model (epoch = 1018, loss = 0.7717)\n", + "Saving model (epoch = 1168, loss = 0.7653)\n", + "Saving model (epoch = 1267, loss = 0.7645)\n", + "Saving model (epoch = 1428, loss = 0.7644)\n", + "Saving model (epoch = 1461, loss = 0.7635)\n", + "Saving model (epoch = 1484, loss = 0.7629)\n", + "Saving model (epoch = 1493, loss 
= 0.7590)\n", + "Finished training after 1694 epochs\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 295 + }, + "id": "hsNO9nnXQBvP", + "outputId": "1626def6-94c7-4a87-9447-d939f827c8eb" + }, + "source": [ + "plot_learning_curve(model_loss_record, title='deep model')" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "display_data", + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "tags": [], + "needs_background": "light" + } + } + ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 350 + }, + "id": "3iZTVn5WQFpX", + "outputId": "a2d5e118-559d-45c6-b644-6792af54663d" + }, + "source": [ + "del model\n", + "model = NeuralNet(tr_set.dataset.dim).to(device)\n", + "ckpt = torch.load(config['save_path'], map_location='cpu') # Load your best model\n", + "model.load_state_dict(ckpt)\n", + "plot_pred(dv_set, model, device) # Show prediction on the validation set" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "display_data", + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "tags": [], + "needs_background": "light" + } + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aQikz3IPiyPf" + }, + "source": [ + "# **Testing**\n", + "The predictions of your model on testing set will be stored at `pred.csv`." + ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "O8cTuQjQQOon", + "outputId": "6bc5de07-4c5a-4e87-9ae3-d09f539c5f2c" + }, + "source": [ + "def save_pred(preds, file):\n", + " ''' Save predictions to specified file '''\n", + " print('Saving results to {}'.format(file))\n", + " with open(file, 'w') as fp:\n", + " writer = csv.writer(fp)\n", + " writer.writerow(['id', 'tested_positive'])\n", + " for i, p in enumerate(preds):\n", + " writer.writerow([i, p])\n", + "\n", + "preds = test(tt_set, model, device) # predict COVID-19 cases with your model\n", + "save_pred(preds, 'pred.csv') # save prediction file to pred.csv" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Saving results to pred.csv\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nfrVxqJanGpE" + }, + "source": [ + "# **Hints**\n", + "\n", + "## **Simple Baseline**\n", + "* Run sample code\n", + "\n", + "## **Medium Baseline**\n", + "* Feature selection: 40 states + 2 `tested_positive` (`TODO` in dataset)\n", + "\n", + "## **Strong Baseline**\n", + "* Feature selection (what other features are useful?)\n", + "* DNN architecture (layers? dimension? activation function?)\n", + "* Training (mini-batch? optimizer? learning rate?)\n", + "* L2 regularization\n", + "* There are some mistakes in the sample code, can you find them?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9tmCwXgpot3t" + }, + "source": [ + "# **Reference**\n", + "This code is completely written by Heng-Jui Chang @ NTUEE. 
\n", + "If you copy or reuse this code, you must credit the original author. \n", + "\n", + "E.g. \n", + "Source: Heng-Jui Chang @ NTUEE (https://github.com/ga642381/ML2021-Spring/blob/main/HW01/HW01.ipynb)\n" + ] + } + ] +} \ No newline at end of file diff --git a/01 Introduction/Pytorch_Tutorial.ipynb b/01 Introduction/Pytorch_Tutorial.ipynb new file mode 100644 index 0000000..80e1be8 --- /dev/null +++ b/01 Introduction/Pytorch_Tutorial.ipynb @@ -0,0 +1,614 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Pytorch Tutorial", + "provenance": [], + "collapsed_sections": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "tHILOGjOQbsQ" + }, + "source": [ + "# **PyTorch Tutorial**\r\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "C1zA7GupxdJv" + }, + "source": [ + "import torch" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6Eqj90EkWbWx" + }, + "source": [ + "**1. Reading the PyTorch Documentation: torch.max**\r\n", + "\r\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JCXOg-iSQuk7" + }, + "source": [ + "x = torch.randn(4,5)\r\n", + "y = torch.randn(4,5)\r\n", + "z = torch.randn(4,5)\r\n", + "print(x)\r\n", + "print(y)\r\n", + "print(z)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "EEqa9GFoWF78" + }, + "source": [ + "# 1. max of entire tensor (torch.max(input) → Tensor)\r\n", + "m = torch.max(x)\r\n", + "print(m)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "wffThGDyWKxJ" + }, + "source": [ + "# 2. 
max along a dimension (torch.max(input, dim, keepdim=False, *, out=None) → (Tensor, LongTensor))\r\n", + "m, idx = torch.max(x,0)\r\n", + "print(m)\r\n", + "print(idx)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "oKDQW3tIXKg-" + }, + "source": [ + "# 2-2\r\n", + "m, idx = torch.max(input=x,dim=0)\r\n", + "print(m)\r\n", + "print(idx)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "6QZ6WRLyX3De" + }, + "source": [ + "# 2-3\r\n", + "m, idx = torch.max(x,0,False)\r\n", + "print(m)\r\n", + "print(idx)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nqGuctkKbUEn" + }, + "source": [ + "# 2-4\r\n", + "m, idx = torch.max(x,dim=0,keepdim=True)\r\n", + "print(m)\r\n", + "print(idx)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9OMzxuMlZPIu" + }, + "source": [ + "# 2-5\r\n", + "p = (m,idx)\r\n", + "torch.max(x,0,False,out=p)\r\n", + "print(p[0])\r\n", + "print(p[1])\r\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "uhd4TqGTbD2c" + }, + "source": [ + "# 2-6\r\n", + "p = (m,idx)\r\n", + "torch.max(x,0,False,p)\r\n", + "print(p[0])\r\n", + "print(p[1])" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "wbxjUSOXxN0n" + }, + "source": [ + "# 2-7\r\n", + "m, idx = torch.max(x,True)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "iMwhGLlGWYaR" + }, + "source": [ + "# 3. max(choose max) operators on two tensors (torch.max(input, other, *, out=None) → Tensor)\r\n", + "t = torch.max(x,y)\r\n", + "print(t)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nFxRKu2Dedwb" + }, + "source": [ + "**2. 
Common errors**\r\n", + "\r\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "KMcRyMxGwhul" + }, + "source": [ + "The following code blocks show some common errors while using the torch library. First, execute the code with error, and then execute the next code block to fix the error. You need to change the runtime to GPU.\r\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "eX-kKdi6ynFf" + }, + "source": [ + "import torch" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "-muJ4KKreoP2", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 363 + }, + "outputId": "c1d5c3a5-9540-4145-d80c-3cbca18a1deb" + }, + "source": [ + "# 1. different device error\r\n", + "model = torch.nn.Linear(5,1).to(\"cuda:0\")\r\n", + "x = torch.Tensor([1,2,3,4,5]).to(\"cpu\")\r\n", + "y = model(x)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "error", + "ename": "RuntimeError", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mmodel\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mLinear\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mto\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"cuda:0\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mto\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"cpu\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmodel\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m_call_impl\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m 725\u001b[0m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_slow_forward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 726\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 727\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 728\u001b[0m for hook in itertools.chain(\n\u001b[1;32m 729\u001b[0m 
\u001b[0m_global_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/modules/linear.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, input)\u001b[0m\n\u001b[1;32m 91\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 92\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mTensor\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0mTensor\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 93\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mF\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinear\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mweight\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbias\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 94\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 95\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mextra_repr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py\u001b[0m in \u001b[0;36mlinear\u001b[0;34m(input, weight, bias)\u001b[0m\n\u001b[1;32m 1690\u001b[0m \u001b[0mret\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0maddmm\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbias\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mweight\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1691\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1692\u001b[0;31m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmatmul\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mweight\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1693\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mbias\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1694\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0mbias\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mRuntimeError\u001b[0m: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "a54PqxJLe9-c", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "909d3693-236f-4419-f269-8fb443ef7534" + }, + "source": [ + "# 1. different device error (fixed)\r\n", + "x = torch.Tensor([1,2,3,4,5]).to(\"cuda:0\")\r\n", + "y = model(x)\r\n", + "print(y.shape)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "torch.Size([1])\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "n7OHtZwbi7Qw", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 201 + }, + "outputId": "2a7d2dd0-6498-4da0-9591-3554c1739046" + }, + "source": [ + "# 2. 
mismatched dimensions error\r\n", + "x = torch.randn(4,5)\r\n", + "y= torch.randn(5,4)\r\n", + "z = x + y" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "error", + "ename": "RuntimeError", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mRuntimeError\u001b[0m: The size of tensor a (5) must match the size of tensor b (4) at non-singleton dimension 1" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "qVynzvrskFCD", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "926dc01c-be6f-48e1-ad39-a5bcecebc513" + }, + "source": [ + "# 2. 
mismatched dimensions error (fixed)\r\n", + "y= y.transpose(0,1)\r\n", + "z = x + y\r\n", + "print(z.shape)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "torch.Size([4, 5])\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Hgzgb9gJANod", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 398 + }, + "outputId": "21b58850-b3f1-4f2a-db5d-cc45e47ccbea" + }, + "source": [ + "# 3. cuda out of memory error\n", + "import torch\n", + "import torchvision.models as models\n", + "resnet18 = models.resnet18().to(\"cuda:0\") # Neural Networks for Image Recognition\n", + "data = torch.randn(2048,3,244,244) # Create fake data (512 images)\n", + "out = resnet18(data.to(\"cuda:0\")) # Use Data as Input and Feed to Model\n", + "print(out.shape)\n" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "error", + "ename": "RuntimeError", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mresnet18\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmodels\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mresnet18\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mto\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"cuda:0\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Neural Networks for Image Recognition\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2048\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m244\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m244\u001b[0m\u001b[0;34m)\u001b[0m 
\u001b[0;31m# Create fake data (512 images)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mout\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mresnet18\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mto\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"cuda:0\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Use Data as Input and Feed to Model\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mout\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m_call_impl\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m 725\u001b[0m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_slow_forward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 726\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 727\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 728\u001b[0m for hook in itertools.chain(\n\u001b[1;32m 729\u001b[0m 
\u001b[0m_global_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torchvision/models/resnet.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, x)\u001b[0m\n\u001b[1;32m 218\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 219\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 220\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_forward_impl\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 221\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 222\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torchvision/models/resnet.py\u001b[0m in \u001b[0;36m_forward_impl\u001b[0;34m(self, x)\u001b[0m\n\u001b[1;32m 202\u001b[0m \u001b[0;31m# See note [TorchScript super()]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 203\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconv1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 204\u001b[0;31m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbn1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 205\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrelu\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 206\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmaxpool\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m_call_impl\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m 725\u001b[0m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_slow_forward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 726\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 727\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 728\u001b[0m for hook in itertools.chain(\n\u001b[1;32m 729\u001b[0m \u001b[0m_global_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/modules/batchnorm.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, input)\u001b[0m\n\u001b[1;32m 134\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrunning_mean\u001b[0m 
\u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtraining\u001b[0m \u001b[0;32mor\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrack_running_stats\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 135\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrunning_var\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtraining\u001b[0m \u001b[0;32mor\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrack_running_stats\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 136\u001b[0;31m self.weight, self.bias, bn_training, exponential_average_factor, self.eps)\n\u001b[0m\u001b[1;32m 137\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 138\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py\u001b[0m in \u001b[0;36mbatch_norm\u001b[0;34m(input, running_mean, running_var, weight, bias, training, momentum, eps)\u001b[0m\n\u001b[1;32m 2056\u001b[0m return torch.batch_norm(\n\u001b[1;32m 2057\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbias\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrunning_mean\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrunning_var\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2058\u001b[0;31m \u001b[0mtraining\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmomentum\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0meps\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbackends\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcudnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0menabled\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2059\u001b[0m )\n\u001b[1;32m 2060\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mRuntimeError\u001b[0m: CUDA out of memory. Tried to allocate 7.27 GiB (GPU 0; 14.76 GiB total capacity; 8.74 GiB already allocated; 4.42 GiB free; 9.42 GiB reserved in total by PyTorch)" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "VPksKnB_w343", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "fbee46ad-e63e-4bfc-8971-452895dd7a15" + }, + "source": [ + "# 3. cuda out of memory error (fixed)\n", + "for d in data:\n", + " out = resnet18(d.to(\"cuda:0\").unsqueeze(0))\n", + "print(out.shape)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "torch.Size([1, 1000])\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "vqszlxEE0Bk0", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 346 + }, + "outputId": "a698b34d-00a8-4067-ddc5-180cb4c8eeaa" + }, + "source": [ + "# 4. 
mismatched tensor type\n", + "import torch.nn as nn\n", + "L = nn.CrossEntropyLoss()\n", + "outs = torch.randn(5,5)\n", + "labels = torch.Tensor([1,2,3,4,0])\n", + "lossval = L(outs,labels) # Calculate CrossEntropyLoss between outs and labels" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "error", + "ename": "RuntimeError", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mouts\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mlabels\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mlossval\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mL\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mouts\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mlabels\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Calculate CrossEntropyLoss between outs and labels\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py\u001b[0m in \u001b[0;36m_call_impl\u001b[0;34m(self, *input, **kwargs)\u001b[0m\n\u001b[1;32m 725\u001b[0m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_slow_forward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 726\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 727\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 728\u001b[0m for hook in itertools.chain(\n\u001b[1;32m 729\u001b[0m \u001b[0m_global_forward_hooks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py\u001b[0m in \u001b[0;36mforward\u001b[0;34m(self, input, target)\u001b[0m\n\u001b[1;32m 960\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mforward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mTensor\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mTensor\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0mTensor\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 961\u001b[0m return F.cross_entropy(input, target, weight=self.weight,\n\u001b[0;32m--> 962\u001b[0;31m ignore_index=self.ignore_index, reduction=self.reduction)\n\u001b[0m\u001b[1;32m 963\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 964\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + 
"\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py\u001b[0m in \u001b[0;36mcross_entropy\u001b[0;34m(input, target, weight, size_average, ignore_index, reduce, reduction)\u001b[0m\n\u001b[1;32m 2466\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0msize_average\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m \u001b[0;32mor\u001b[0m \u001b[0mreduce\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2467\u001b[0m \u001b[0mreduction\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_Reduction\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlegacy_get_string\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msize_average\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreduce\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2468\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mnll_loss\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlog_softmax\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mignore_index\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreduction\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2469\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2470\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py\u001b[0m in \u001b[0;36mnll_loss\u001b[0;34m(input, target, weight, size_average, ignore_index, reduce, reduction)\u001b[0m\n\u001b[1;32m 2262\u001b[0m .format(input.size(0), target.size(0)))\n\u001b[1;32m 2263\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mdim\u001b[0m 
\u001b[0;34m==\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2264\u001b[0;31m \u001b[0mret\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_nn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnll_loss\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0m_Reduction\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_enum\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mreduction\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mignore_index\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2265\u001b[0m \u001b[0;32melif\u001b[0m \u001b[0mdim\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2266\u001b[0m \u001b[0mret\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_nn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnll_loss2d\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minput\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mweight\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0m_Reduction\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_enum\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mreduction\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mignore_index\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mRuntimeError\u001b[0m: expected scalar type Long but found Float" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "CZwgwup_1dgS", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "aaf1de76-7ef2-4ca4-b87d-8482a3117249" + }, + "source": [ + "# 4. 
mismatched tensor type (fixed)\n", + "labels = labels.long()\n", + "lossval = L(outs,labels)\n", + "print(lossval)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "text": [ + "tensor(2.6215)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dSuNdA8F06dK" + }, + "source": [ + "**3. More on dataset and dataloader**\r\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "in84z_xu1rE6" + }, + "source": [ + "A dataset is a collection of data organized in a structured way. A dataloader is a loader that iterates through the dataset." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "34zfh-c22Qqs" + }, + "source": [ + "Let the dataset be the letters of the English alphabet, \"abcdefghijklmnopqrstuvwxyz\"." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "TaiHofty1qKA" + }, + "source": [ + "dataset = \"abcdefghijklmnopqrstuvwxyz\"" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "h0jwhVa12h3a" + }, + "source": [ + "A simple dataloader could be implemented with a Python `for` loop." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bWC5Wwbv2egy" + }, + "source": [ + "for datapoint in dataset:\r\n", + "    print(datapoint)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "n33VKzkG2y2U" + }, + "source": [ + "When iterating over a dataset, we often want to shuffle the data. This is where torch.utils.data.DataLoader comes in handy. From the point of view of torch.utils.data.DataLoader, each data point is identified by an index (0, 1, 2, ...), so shuffling can be done simply by shuffling an array of indices. \r\n", + "\r\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9MXUUKQ65APf" + }, + "source": [ + "torch.utils.data.DataLoader needs two pieces of information to fulfill its role. First, it needs to know the length of the data. 
Second, once torch.utils.data.DataLoader produces the shuffled indices, the dataset needs to return the data corresponding to each index." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BV5txsjK5j4j" + }, + "source": [ + "Therefore, torch.utils.data.Dataset provides this information through two methods, `__len__()` and `__getitem__()`, to support torch.utils.data.DataLoader." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "A0IEkemJ5ajD" + }, + "source": [ + "import torch\r\n", + "import torch.utils.data\r\n", + "class ExampleDataset(torch.utils.data.Dataset):\r\n", + "    def __init__(self):\r\n", + "        self.data = \"abcdefghijklmnopqrstuvwxyz\"\r\n", + "    \r\n", + "    def __getitem__(self, idx): # if the index is idx, what will be the data?\r\n", + "        return self.data[idx]\r\n", + "    \r\n", + "    def __len__(self): # What is the length of the dataset\r\n", + "        return len(self.data)\r\n", + "\r\n", + "dataset1 = ExampleDataset() # create the dataset\r\n", + "dataloader = torch.utils.data.DataLoader(dataset=dataset1, shuffle=True, batch_size=1)\r\n", + "for datapoint in dataloader:\r\n", + "    print(datapoint)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nTt-ZTid9S2n" + }, + "source": [ + "Simple data augmentation can be done by changing the code in `__len__()` and `__getitem__()`. Suppose we want to double the length of the dataset by adding the uppercase letters while still storing only the lowercase data. You can change the dataset as follows." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "7Wn3BA2j-NXl" + }, + "source": [ + "import torch.utils.data\r\n", + "class ExampleDataset(torch.utils.data.Dataset):\r\n", + "    def __init__(self):\r\n", + "        self.data = \"abcdefghijklmnopqrstuvwxyz\"\r\n", + "    \r\n", + "    def __getitem__(self, idx): # if the index is idx, what will be the data?\r\n", + "        if idx >= len(self.data): # if the index >= 26, return upper case letter\r\n", + "            return self.data[idx % len(self.data)].upper()\r\n", + "        else: # if the index < 26, return lower case letter\r\n", + "            return self.data[idx]\r\n", + "    \r\n", + "    def __len__(self): # What is the length of the dataset\r\n", + "        return 2 * len(self.data) # The length is now twice as large\r\n", + "\r\n", + "dataset1 = ExampleDataset() # create the dataset\r\n", + "dataloader = torch.utils.data.DataLoader(dataset=dataset1, shuffle=True, batch_size=1)\r\n", + "for datapoint in dataloader:\r\n", + "    print(datapoint)" + ], + "execution_count": null, + "outputs": [] + } + ] +} \ No newline at end of file