From 9e4e12f83c199b904cf3acb942851584cee0655f Mon Sep 17 00:00:00 2001 From: alexsquires Date: Tue, 19 May 2020 12:32:22 +0100 Subject: [PATCH 1/3] documentation --- content/good_practice/documentation.ipynb | 402 +++++++++++++++++++++- 1 file changed, 397 insertions(+), 5 deletions(-) diff --git a/content/good_practice/documentation.ipynb b/content/good_practice/documentation.ipynb index 91554f8..c0c0f29 100644 --- a/content/good_practice/documentation.ipynb +++ b/content/good_practice/documentation.ipynb @@ -4,15 +4,407 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Documentation" + "# Documentation\n", + "\n", + "prerequistes:\n", + "- types \n", + "- collections\n", + "\n", + "Effective communication is essential to research: if no one can understand your work, you may as well have never have done it. Any code produced as part of your research also needs to communicated effectively, if for no other reason that _you_ need to understand what you were trying to do when you revisit your code, days, weeks or months after you originally wrote it. The cell below contains 'functional' code, but is not instantly recognisable as to what it is doing. In this section, we will work through this code, adding layers of simple 'documentation' to communicate what it is doing. \n", + "\n", + "See if you can work out what the code is doing before we write any documentation." ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 31, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "28.7093380112038\n" + ] + } + ], + "source": [ + "import math\n", + "\n", + "h = 1\n", + "k = 1\n", + "l = 1\n", + "a = 6\n", + "b = 3\n", + "c = 2 \n", + "\n", + "d = math.sqrt(1/((h*h)/(a*a) + (k*k)/(b*b) + (l*l)/(c*c)))\n", + "\n", + "w = 1.5406\n", + "s = math.asin(w/(2*d))\n", + "ang = math.degrees(s)\n", + "\n", + "print(ang)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you couldn't work it out, don't worry. Let's start communicating the code more effectivly by stating the problem we are trying to solve: the code below solves Bragg's law for an orthorombic crystal to determine the Bragg angle associated with a given miller plane. Perhaps the easiest thing we can do is re-write the code, with clearer variable names." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.3400663935848004\n", + "19.484369106643797\n" + ] + } + ], + "source": [ + "import math\n", + "\n", + "h = 1\n", + "k = 1\n", + "l = 1\n", + "a = 4\n", + "b = 4\n", + "c = 4 \n", + "\n", + "d_hkl = math.sqrt(1/((h*h)/(a*a) + (k*k)/(b*b) + (l*l)/(c*c)))\n", + "\n", + "xray_wavelength = 1.5406\n", + "bragg_angle_rads = math.asin(xray_wavelength/(2*d_hkl))\n", + "bragg_angle_degrees = math.degrees(bragg_angle_rads)\n", + "\n", + "print(bragg_angle_rads)\n", + "print(bragg_angle_degrees)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "While not exactly documentation, using descriptive variable names does improve the readability of the code. Anyone versed in crystallography would likely read the above code and be able to more quickly surmise what it is trying to achieve. That being said, we can be even more helpful buy using 'comments'.\n", + "\n", + "Comments begin with a hash symbol '#' and will not be read as python in a code cell, for example:" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "3\n" + ] + } + ], + "source": [ + "# You don't have to comment out a whole line\n", + "print(1+2) # you can just comment out the end of a line" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The cell above still returns 3, no matter what is included as a comment. We can add some comments to our example to improve the clarity." + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.2701123839807746\n", + "15.47629959631549\n" + ] + } + ], + "source": [ + "import math\n", + "\n", + "# calculate scattering angle of an miller plane in an orthorhombic Bravias lattice\n", + "\n", + "# define miller indicies\n", + "\n", + "h = 1\n", + "k = 1\n", + "l = 1\n", + "\n", + "# define cell lengths in Angstroms\n", + "\n", + "a = 5\n", + "b = 5\n", + "c = 5 \n", + "\n", + "d_hkl = math.sqrt(1/((h*h)/(a*a) + (k*k)/(b*b) + (l*l)/(c*c)))\n", + "\n", + "xray_wavelength = 1.5406 # wavelength of incident radition in Angstroms\n", + "\n", + "# apply Bragg's law - the math module works in radians as a default, crystallographers in degrees, and so we need to \n", + "# add an additional step to convert from radians to degrees.\n", + "\n", + "bragg_angle_rads = math.asin(xray_wavelength/(2*d_hkl))\n", + "bragg_angle_degrees = math.degrees(bragg_angle_rads)\n", + "\n", + "print(bragg_angle_rads)\n", + "print(bragg_angle_degrees)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Hopefully the above code is now straightforward to comprehend for anyone with an interest in either the mechanics of the code, or the scientific problem it attempts to solve. It is important to bear in mind that commenting should be used sparingly, and the context should be considered to avoid redundancy. Ultimately, we are trying to convey the function of the code, and not distract from that. Some points in the above text about possible redundancies, and contextual commenting:\n", + "\n", + "- The first comment `# calculate scattering angle of an miller plane in an orthorhombic Bravias lattice` is redundent given the context of the code, i.e. in this jupyter book, we have already explained what we are trying to to do, but if this were a stand-alone script, a concise explanation of what the code does - such as this comment - is very helpful. Note this could also come from the name of the script or notebook.\n", + "\n", + "- Consider the line ```xray_wavelength = 1.5406 # wavelength of incident radition in Angstroms```. If it read ```xray_wavelength = 1.5406 # wavelength of incident radition``` this would be a redundant comment: the variable name conveys the same message as the comment itself. As scientists, however, it's imperative to convey the units we are working in, and so the addition of 'in Angstroms' confers meaning to the value. \n", + "\n", + "- Finally, the comment `the math module works in radians as a default, crystallographers in degrees, and so we need to add an additional step to convert from radians to degrees.` is purely for instructive purposes, most experienced python users would know this: it is not always necessary to include such descriptive commenting. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Documentation 2: docstrings" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "prequisites:\n", + "- functions\n", + "\n", + "We're going to re-write the above using functions to discuss the importance of docstrings." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.2701123839807746\n", + "15.47629959631549\n" + ] + } + ], + "source": [ + "import math\n", + "\n", + "def calculate_d_hkl(abc,hkl):\n", + " return math.sqrt(1/((hkl[0]**2)/(abc[0]**2) + (hkl[1]**2)/(abc[1]**2) + (hkl[2]**2)/(abc[2]**2)))\n", + "\n", + "def braggs_law(d_hkl,wavelength=1.5406):\n", + " return math.asin(wavelength/(2*d_hkl))\n", + "\n", + "d_hkl = calculate_d_hkl([5,5,5],[1,1,1])\n", + "bragg_angle_rads = braggs_law(d_hkl)\n", + "bragg_angle_degrees = math.degrees(bragg_angle_rads)\n", + "\n", + "print(bragg_angle_rads)\n", + "print(bragg_angle_degrees)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If it weren't for the fact the cell above produces the same ouptut as the cells in the previous section, you may be forgiven for thinking, on first glance, that this reformulation of the code might do something slightly different. This misconception could have been avoided using docstrings, which were breifly discussed in the functions section. We revisit the concept here, to emphasis that even though they are a non-essential part of the function (as shown by the functioning code above), they offer a great deal of clarity. The description of docstrings given in the function section is quoted below:\n", + "\n", + "> The docstring is an important (although not essential) component of any function. Describing the purpose of a function is valuable of many reasons, it helps to clarify what the function will do, it offers guidence for others on how to use the function, and it acts to remind future you why it is that you have a particular function and what is does. You may read this last point and roll your eyes, however I promise you that code you write today will not stay present in your memory forever.\n", + "\n", + "> In addition to a description of the function, a docstring also typically includes information about the arguments taken by the function and objects that are returned. There are a few common way that these may be formatted, within this book we will use Google Style, however, other styles exist, such as NumPy and Sphinx. Using a standard style of docstring is useful when writting large software packages, due to the availablity of tools to automatically generate software documentation from these docstrings. The Google style indicates the arguments with the keyword Args, before listing the arguments, the expected type and a short description (we find that it can be helpful to include information about the expected units in this section). The Returns keyword is followed by a list of the values returned by the function, since these do not necessary has a name this is omitted.\n", + "\n", + "So, let's write some docstrings for our example above\n" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.2701123839807746\n", + "15.47629959631549\n" + ] + } + ], + "source": [ + "import math\n", + "\n", + "def calculate_d_hkl(abc,hkl):\n", + " \"\"\"\n", + " Calculates d_hkl for a given miller plane in an orthorhombic Bravais lattice\n", + " \n", + " Args:\n", + " abc (list): lattice_parameters\n", + " hkl (list): miller indices\n", + " \n", + " Returns:\n", + " (float): d_hkl\n", + " \n", + " \"\"\"\n", + " return math.sqrt(1/((hkl[0]**2)/(abc[0]**2) + (hkl[1]**2)/(abc[1]**2) + (hkl[2]**2)/(abc[2]**2)))\n", + "\n", + "def braggs_law(d_hkl,wavelength=1.5406):\n", + " \"\"\"\n", + " Calculates the Bragg angle for a given value of d_hkl and x-ray wavelength\n", + " \n", + " Args:\n", + " d_hkl (float): d_hkl spacing (Angstroms)\n", + " wavelength (float, optional): wavelength of incident radiation (Angstroms). \n", + " Defaults to 1.5406 - Cu source.\n", + " \n", + " Returns:\n", + " (float): Bragg angle (float)\n", + " \n", + " \"\"\"\n", + " return math.asin(wavelength/(2*d_hkl))\n", + "\n", + "d_hkl = calculate_d_hkl([5,5,5],[1,1,1])\n", + "bragg_angle_rads = braggs_law(d_hkl)\n", + "bragg_angle_degrees = math.degrees(bragg_angle_rads)\n", + "\n", + "print(bragg_angle_rads)\n", + "print(bragg_angle_degrees)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Not only do the docstrings included in the code above explain the utility of the functions, it also allows us to question what the functions do should we ever forget by executing the following code:" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "\u001b[0;31mSignature:\u001b[0m \u001b[0mcalculate_d_hkl\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mabc\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhkl\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mDocstring:\u001b[0m\n", + "Calculates d_hkl for a given miller plane in an orthorhombic Bravais lattice\n", + "\n", + "Args:\n", + " abc (list): lattice_parameters\n", + " hkl (list): miller indices\n", + "\n", + "Returns:\n", + " (float): d_hkl\n", + "\u001b[0;31mFile:\u001b[0m /mnt/a/materials_2/intro_python_chemists/content/good_practice/\n", + "\u001b[0;31mType:\u001b[0m function\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "calculate_d_hkl?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is a really helpful trick that can be applied to any python function, should you forget what it does/what arguemnts are required, i.e." + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "\u001b[0;31mSignature:\u001b[0m \u001b[0mones\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0morder\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'C'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mDocstring:\u001b[0m\n", + "Return a new array of given shape and type, filled with ones.\n", + "\n", + "Parameters\n", + "----------\n", + "shape : int or sequence of ints\n", + " Shape of the new array, e.g., ``(2, 3)`` or ``2``.\n", + "dtype : data-type, optional\n", + " The desired data-type for the array, e.g., `numpy.int8`. Default is\n", + " `numpy.float64`.\n", + "order : {'C', 'F'}, optional, default: C\n", + " Whether to store multi-dimensional data in row-major\n", + " (C-style) or column-major (Fortran-style) order in\n", + " memory.\n", + "\n", + "Returns\n", + "-------\n", + "out : ndarray\n", + " Array of ones with the given shape, dtype, and order.\n", + "\n", + "See Also\n", + "--------\n", + "ones_like : Return an array of ones with shape and type of input.\n", + "empty : Return a new uninitialized array.\n", + "zeros : Return a new array setting values to zero.\n", + "full : Return a new array of given shape filled with value.\n", + "\n", + "\n", + "Examples\n", + "--------\n", + ">>> np.ones(5)\n", + "array([1., 1., 1., 1., 1.])\n", + "\n", + ">>> np.ones((5,), dtype=int)\n", + "array([1, 1, 1, 1, 1])\n", + "\n", + ">>> np.ones((2, 1))\n", + "array([[1.],\n", + " [1.]])\n", + "\n", + ">>> s = (2,2)\n", + ">>> np.ones(s)\n", + "array([[1., 1.],\n", + " [1., 1.]])\n", + "\u001b[0;31mFile:\u001b[0m ~/.pyenv/versions/3.8.2/lib/python3.8/site-packages/numpy/core/numeric.py\n", + "\u001b[0;31mType:\u001b[0m function\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "from numpy import ones\n", + "# note numpy does not use the 'Google' style docstring.\n", + "ones?" + ] } ], "metadata": { @@ -31,7 +423,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.3" + "version": "3.8.2" } }, "nbformat": 4, From de6bd54f259d4e67d26176dafa077fe3318fb37f Mon Sep 17 00:00:00 2001 From: alexsquires Date: Thu, 21 May 2020 11:42:31 +0100 Subject: [PATCH 2/3] Drew changes --- content/good_practice/documentation.ipynb | 26 +++++++++++++++-------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/content/good_practice/documentation.ipynb b/content/good_practice/documentation.ipynb index c0c0f29..5b182d1 100644 --- a/content/good_practice/documentation.ipynb +++ b/content/good_practice/documentation.ipynb @@ -4,13 +4,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Documentation\n", + "# Writing documentation\n", "\n", "prerequistes:\n", "- types \n", "- collections\n", + "- reading and finding documentation\n", "\n", - "Effective communication is essential to research: if no one can understand your work, you may as well have never have done it. Any code produced as part of your research also needs to communicated effectively, if for no other reason that _you_ need to understand what you were trying to do when you revisit your code, days, weeks or months after you originally wrote it. The cell below contains 'functional' code, but is not instantly recognisable as to what it is doing. In this section, we will work through this code, adding layers of simple 'documentation' to communicate what it is doing. \n", + "Effective communication is essential to learning and research: if no one can understand your work, you may as well have never have done it. Any code produced as part of your research also needs to communicated effectively, if for no other reason that _you_ need to understand what you were trying to do when you revisit your code, days, weeks or months after you originally wrote it. The cell below contains _functional_ code, but is not instantly recognisable as to what it is doing. In this section, we will work through this code, adding layers of simple 'documentation' to communicate what it is doing. \n", "\n", "See if you can work out what the code is doing before we write any documentation." ] @@ -51,7 +52,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If you couldn't work it out, don't worry. Let's start communicating the code more effectivly by stating the problem we are trying to solve: the code below solves Bragg's law for an orthorombic crystal to determine the Bragg angle associated with a given miller plane. Perhaps the easiest thing we can do is re-write the code, with clearer variable names." + "If you couldn't work it out, don't worry. Let's start communicating the code more effectivly by stating the problem we are trying to solve: **the code below solves Bragg's law for an orthorombic crystal to determine the Bragg angle associated with a given miller plane.** Perhaps the easiest thing we can do is re-write the code, with clearer variable names." ] }, { @@ -94,7 +95,7 @@ "source": [ "While not exactly documentation, using descriptive variable names does improve the readability of the code. Anyone versed in crystallography would likely read the above code and be able to more quickly surmise what it is trying to achieve. That being said, we can be even more helpful buy using 'comments'.\n", "\n", - "Comments begin with a hash symbol '#' and will not be read as python in a code cell, for example:" + "Comments begin with a hash symbol `#` and will not be read as python in a code cell, for example:" ] }, { @@ -141,7 +142,7 @@ "\n", "# calculate scattering angle of an miller plane in an orthorhombic Bravias lattice\n", "\n", - "# define miller indicies\n", + "# define miller indices\n", "\n", "h = 1\n", "k = 1\n", @@ -173,7 +174,7 @@ "source": [ "Hopefully the above code is now straightforward to comprehend for anyone with an interest in either the mechanics of the code, or the scientific problem it attempts to solve. It is important to bear in mind that commenting should be used sparingly, and the context should be considered to avoid redundancy. Ultimately, we are trying to convey the function of the code, and not distract from that. Some points in the above text about possible redundancies, and contextual commenting:\n", "\n", - "- The first comment `# calculate scattering angle of an miller plane in an orthorhombic Bravias lattice` is redundent given the context of the code, i.e. in this jupyter book, we have already explained what we are trying to to do, but if this were a stand-alone script, a concise explanation of what the code does - such as this comment - is very helpful. Note this could also come from the name of the script or notebook.\n", + "- The first comment `# calculate scattering angle of an miller plane in an orthorhombic Bravias lattice` is redundant given the context of the code, i.e. in this jupyter book: we have already explained what we are trying to to do, but if this were a stand-alone script, a concise explanation of what the code does - such as this comment - is very helpful. Note this could also come from the name of the script or notebook.\n", "\n", "- Consider the line ```xray_wavelength = 1.5406 # wavelength of incident radition in Angstroms```. If it read ```xray_wavelength = 1.5406 # wavelength of incident radition``` this would be a redundant comment: the variable name conveys the same message as the comment itself. As scientists, however, it's imperative to convey the units we are working in, and so the addition of 'in Angstroms' confers meaning to the value. \n", "\n", @@ -232,7 +233,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "If it weren't for the fact the cell above produces the same ouptut as the cells in the previous section, you may be forgiven for thinking, on first glance, that this reformulation of the code might do something slightly different. This misconception could have been avoided using docstrings, which were breifly discussed in the functions section. We revisit the concept here, to emphasis that even though they are a non-essential part of the function (as shown by the functioning code above), they offer a great deal of clarity. The description of docstrings given in the function section is quoted below:\n", + "If it weren't for the fact the cell above produces the same ouptut as the cells in the previous section, you may be forgiven for thinking, on first glance, that this reformulation of the code might do something slightly different. This misconception could have been avoided using docstrings, which were breifly discussed in the functions section. We revisit the concept here, to emphasise that even though they are a non-essential part of the function (as shown by the functioning code above), they offer a great deal of clarity. The description of docstrings given in the function section is quoted below:\n", "\n", "> The docstring is an important (although not essential) component of any function. Describing the purpose of a function is valuable of many reasons, it helps to clarify what the function will do, it offers guidence for others on how to use the function, and it acts to remind future you why it is that you have a particular function and what is does. You may read this last point and roll your eyes, however I promise you that code you write today will not stay present in your memory forever.\n", "\n", @@ -336,12 +337,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This is a really helpful trick that can be applied to any python function, should you forget what it does/what arguemnts are required, i.e." + "This is a really helpful trick that can be applied to any python function, should you forget what it does/what arguments are required, i.e." ] }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 1, "metadata": {}, "outputs": [ { @@ -405,6 +406,13 @@ "# note numpy does not use the 'Google' style docstring.\n", "ones?" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { From ffae1685a7e69b73649cbcf83a91ae21bd5d823f Mon Sep 17 00:00:00 2001 From: alexsquires Date: Thu, 21 May 2020 11:59:54 +0100 Subject: [PATCH 3/3] merge docstrings and comments --- content/good_practice/documentation.ipynb | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/content/good_practice/documentation.ipynb b/content/good_practice/documentation.ipynb index 5b182d1..03e8400 100644 --- a/content/good_practice/documentation.ipynb +++ b/content/good_practice/documentation.ipynb @@ -7,6 +7,7 @@ "# Writing documentation\n", "\n", "prerequistes:\n", + "- functions\n", "- types \n", "- collections\n", "- reading and finding documentation\n", @@ -185,16 +186,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Documentation 2: docstrings" + "## Docstrings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "prequisites:\n", - "- functions\n", - "\n", "We're going to re-write the above using functions to discuss the importance of docstrings." ] },