diff --git a/.gitignore_global b/.gitignore_global
new file mode 100644
index 0000000..e43b0f9
--- /dev/null
+++ b/.gitignore_global
@@ -0,0 +1 @@
+.DS_Store
diff --git a/0-Perceptron-Gradient-Descent.ipynb b/0-Perceptron-Gradient-Descent.ipynb
index d0bef90..0d9221b 100644
--- a/0-Perceptron-Gradient-Descent.ipynb
+++ b/0-Perceptron-Gradient-Descent.ipynb
@@ -842,7 +842,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# 5. (Stochastic) gradinet descent"
+    "# 5. (Stochastic) gradient descent"
    ]
   },
   {
@@ -1231,7 +1231,7 @@
     "    # extract the weights for each gradient step\n",
     "    training_w = np.array(perceptron.training_w)\n",
     "    n_steps = len(training_w)\n",
-    "    steps = np.array([0,1,2,50,n_steps-1])\n",
+    "    steps = np.array([0,1,2,round((n_steps-1)/2),n_steps-1])\n",
     "\n",
     "    # compute the values of the loss function for a grid of w-values, given our learned bias term\n",
     "    b = float(perceptron.b)\n",
@@ -1260,8 +1260,8 @@
     "    ax.scatter(training_w[0,0], training_w[0,1], color='black', s=200, zorder=99)\n",
     "    ax.plot(training_w[steps,0], training_w[steps,1], color='white', lw=1)\n",
     "    # add line for final weights\n",
-    "    ax.axvline(training_w[s,0], color='red', lw=1, ls='--')\n",
-    "    ax.axhline(training_w[s,1], color='red', lw=1, ls='--')\n",
+    "    ax.axvline(training_w[-1,0], color='red', lw=1, ls='--')\n",
+    "    ax.axhline(training_w[-1,1], color='red', lw=1, ls='--')\n",
     "    # label axes\n",
     "    cbar.set_label('Loss')\n",
     "    ax.set_title('Final loss: {}'.format(perceptron.training_loss[-1]))\n",
diff --git a/1-Neural-Networks-Backpropagation.ipynb b/1-Neural-Networks-Backpropagation.ipynb
index 559c29c..55abc35 100644
--- a/1-Neural-Networks-Backpropagation.ipynb
+++ b/1-Neural-Networks-Backpropagation.ipynb
@@ -27,7 +27,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# 1. Example 2: What if the data is not linearly separable?"
+    "# 1. Example 2: What if the data are not linearly separable?"
    ]
   },
   {
@@ -36,7 +36,7 @@
    "source": [
     "In the previous notebook (0-Perceptron-Gradient-Descent.iypnb), we have learned that we can use the perceptron algorithm to build a binary classifier for linearly separable data (by learning a hyperplane).\n",
     "\n",
-    "Yet, not all data is linearly separable. Take a look at the figure below and try to find a line that separates the blue and red dots:"
+    "Yet, not all data are linearly separable. Take a look at the figure below and try to find a line that separates the blue and red dots:"
    ]
   },
   {
@@ -641,9 +641,9 @@
     "\n",
     "- An *output* layer, containing one output neuron for each target class in the dataset\n",
     "\n",
-    "In the illustration below, the input data is an image of a single handwritten digit. This image has 8x8 pixels; To serve it to our ANN, we would flatten it to be a single vector of 64 values. The input layer therefore would have 64 input neurons (each representing one of the 64 input features).\n",
+    "In the illustration below, the input data are an image of a single handwritten digit. This image has 8x8 pixels; to serve it to our ANN, we would flatten it into a single vector of 64 values. The input layer therefore would have 64 input neurons (each representing one of the 64 input features).\n",
-    "As the data consists of single handwritten digits, we would set the output layer to contain 10 neurons (one for each digit between 0 and 9)."
+    "As the data are single handwritten digits, we would set the output layer to contain 10 neurons (one for each digit between 0 and 9)."
    ]
   },
   {
diff --git a/figures/.DS_Store b/figures/.DS_Store
deleted file mode 100644
index 5008ddf..0000000
Binary files a/figures/.DS_Store and /dev/null differ
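Note on the two `0-Perceptron-Gradient-Descent.ipynb` hunks: `steps` is an index *array*, which is fine for fancy indexing in `ax.plot`, but `ax.axvline`/`ax.axhline` expect a single scalar coordinate, so the "final weights" marker should index only the last step. A minimal, self-contained sketch of that plotting logic, using a fabricated random-walk `training_w` as a stand-in for the notebook's `perceptron.training_w`:

```python
# Sketch of the trajectory-plotting logic touched above; `training_w` is
# fabricated stand-in data (the notebook uses perceptron.training_w).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
training_w = np.cumsum(rng.normal(size=(101, 2)), axis=0)  # fake descent path
n_steps = len(training_w)

# highlight the start, the first few steps, the midpoint, and the end
steps = np.array([0, 1, 2, round((n_steps - 1) / 2), n_steps - 1])

fig, ax = plt.subplots()
ax.plot(training_w[:, 0], training_w[:, 1], color='lightgray', lw=1)  # full path
ax.scatter(training_w[steps, 0], training_w[steps, 1], zorder=99)     # array indexing is fine here
ax.axvline(training_w[-1, 0], color='red', lw=1, ls='--')             # axvline/axhline take a scalar,
ax.axhline(training_w[-1, 1], color='red', lw=1, ls='--')             # so index the final step only
plt.show()
```

Using `round((n_steps-1)/2)` instead of the hardcoded `50` keeps the midpoint marker valid for any training run length.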
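For the `1-Neural-Networks-Backpropagation.ipynb` hunk describing the 8x8 digit input, the layer-sizing arithmetic can be checked against scikit-learn's bundled digits data (an assumption for illustration; the notebook may load its images differently):

```python
# Check the sizing described above: an 8x8 image flattens to 64 input
# neurons, and digit classes 0-9 call for 10 output neurons.
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()
image = digits.images[0]   # shape (8, 8): one handwritten digit
x = image.reshape(-1)      # flattened to shape (64,): one value per input neuron

n_input_neurons = x.size                           # 64
n_output_neurons = np.unique(digits.target).size   # 10, one per digit class
print(n_input_neurons, n_output_neurons)           # -> 64 10
```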