Introduction to Python#
Although there are several programming languages to choose from when working in data science, Python is the most common (other useful languages include JavaScript, R, SQL and Julia). The Python language (named after Monty Python!) is extremely versatile, has an extensive library ecosystem and is easily readable. Large open-source libraries like Pandas, NumPy, Scikit-learn and Matplotlib have made data analysis and machine learning accessible even to beginner programmers.
In this lesson we’ll cover the basics of Python, the principles of which could just as easily be applied to any programming language.
Variables#
As you saw in the introduction to notebooks, we can run code directly within a notebook. By default the output will be shown immediately below the cell, but what if we wanted to store the output for later use? That’s where variables come in.
In the cell below, we set a variable named x to have the value 3. Variable names must contain only letters, digits and underscores, and must not begin with a digit. Certain words, like for and if, are reserved and cannot be used as names. It’s most common to write variable names in all lower case. It’s also best practice to use a descriptive name for a variable (so our name of x below doesn’t fit this!).
x = 3
Notice that there’s no output from that cell. But we can check that x has indeed been set to 3 by running the cell below.
x
If 3 is displayed above, everything’s good! Jupyter notebooks will display the value of any variable or calculation that is last in a cell.
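The naming rules from the start of this section can be checked in code. The sketch below is only illustrative: the candidate names and the helper is_valid_variable_name are made up for this example, while str.isidentifier() and keyword.iskeyword() come from Python and its standard library.

```python
import keyword

# Hypothetical candidate names, chosen only for illustration.
candidates = ["x", "total_sales", "_private", "2nd_value", "my-var", "for"]


def is_valid_variable_name(name):
    """Return True if name is a legal identifier and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Map each candidate to whether Python would accept it as a variable name.
results = {name: is_valid_variable_name(name) for name in candidates}

for name, valid in results.items():
    print(f"{name!r}: {valid}")
```

Names that start with a digit, contain a hyphen, or clash with a reserved word are reported as invalid.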
So what is a variable and why do we use them?
Variables are used to store objects, and everything in Python is an object. To use the correct terminology, Python is a strongly typed, dynamically typed language. This means that an object won’t change its type in strange ways, but you do not need to specify a variable’s type when you declare it.
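A minimal sketch of what those two terms mean in practice is shown here. The variable name value is an assumption made for this example, so that it does not overwrite the x used elsewhere in the lesson.

```python
# Dynamic typing: no type is declared, and the same name can later
# refer to an object of a different type.
value = 3
print(type(value))  # <class 'int'>

value = "three"
print(type(value))  # <class 'str'>

# Strong typing: Python will not silently combine a string and an integer.
try:
    result = "3" + 3
except TypeError as error:
    result = None
    print(f"TypeError: {error}")
```

The last step raises a TypeError rather than guessing whether "33" or 6 was intended.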
Let’s examine our declaration above, x = 3. We are storing the integer 3 in a variable named x. We can then access x at any point in the code to use this value in our calculations.
Below, we’ll use x in a few calculations and print the results to the screen.
#This is a comment
#We can use comments to explain why we do something
#e.g. Set y to a number to test the mathematics below
y = 2
print(x*y)
print(x**y)
print(x/y)
print(x%y)
print(x+y)
print(x-y)
print(x==y)
There are a couple of points to make from the above code.
First, comments. Comments should only really be used when you’re explaining why you’ve done something a little strange in the code. However, variable names should go a long way toward explaining what they’ll be used for. With this knowledge, do you think the above variable names and comments are useful?
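As a rough illustration of that point, the sketch below repeats the multiplication with descriptive names. The names unit_price, quantity and total_cost, and their values, are hypothetical and chosen only for this example.

```python
# Hypothetical values, chosen only for illustration.
unit_price = 3
quantity = 2

# No comment is needed to explain what this line does:
# the variable names already describe it.
total_cost = unit_price * quantity
print(total_cost)  # 6
```

Here the names carry the meaning that x and y needed a comment to explain.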
Mathematical Operations#
The second point we need to address is the operators used above: *, **, /, %, +, - and ==.
* multiplies 2 numbers together.
** raises the first number to the power of the second.
/ divides the first number by the second.
% returns the result of a modulo operation, dividing the first number by the second and returning the remainder.
+ adds 2 numbers.
- subtracts the second number from the first.
== compares the 2 numbers; if they’re equal it returns True, otherwise it returns False.
NOTE: a single = sets a value, and a double == compares two values.
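As a worked example, the sketch below applies each operator to x = 3 and y = 2, the same values used in the earlier cell, and notes the result in a comment. The dictionary results is just a convenient container for this illustration.

```python
x = 3  # a single = assigns a value
y = 2

results = {
    "x * y": x * y,    # 6
    "x ** y": x ** y,  # 9
    "x / y": x / y,    # 1.5
    "x % y": x % y,    # 1 (3 divided by 2 leaves a remainder of 1)
    "x + y": x + y,    # 5
    "x - y": x - y,    # 1
    "x == y": x == y,  # False (a double == compares the values)
}

for expression, result in results.items():
    print(f"{expression} -> {result}")
```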
Functions#
Now we can assign numbers to variables and perform all sorts of mathematical operations on them! But what if we want to run 1,000,000 calculations? We don’t want to have to type out our code for every step.
So we break our code down into reusable sections that we can call again and again with different inputs. In Python, we call these functions. They can be thought of as little Lego bricks: all sorts of amazing creations can be built by combining different types of bricks. Although it comes down to personal preference, good coding practice is usually to name functions in all lower case, with words separated by underscores (_), and to start a function name with a verb.
Let’s make a function below that will return the product of 2 numbers.
def multiply_numbers(number1, number2):
    """
    Multiply two numbers and return the result.

    Args:
        number1 (float): The first number to be multiplied.
        number2 (float): The second number to be multiplied.

    Returns:
        float: The product of number1 and number2.
    """
    return number1 * number2
We’ll go through the code above in a second. For now, play around with it; we can call it using the code below.
multiply_numbers(2, 3)
Try to:
Change the above numbers
Declare some variables and use them as inputs
Use fractions as inputs
Use negative numbers
When you’re finished playing, try to change the function to return a different mathematical operation (you may want to change the function name too!)
The sections of the function above are as follows:
def - defines a function.
multiply_numbers - is the name of our function.
(number1, number2) - these specify the arguments; we could have more (number1, number2, number3), fewer (number1) or none ().
"""...""" - this is the docstring for the function. It should explain what the function does, its arguments, their types and what the function returns. It can be viewed by calling help(name_of_function).
return number1 * number2 - this is what the function actually feeds back into our code; in this case, we multiply the 2 numbers together.
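The parts listed above can be seen working together in the short sketch below. The functions multiply_three_numbers and get_default_number are hypothetical, added only to show a function with more arguments and one with none.

```python
def multiply_numbers(number1, number2):
    """Multiply two numbers and return the result."""
    return number1 * number2


def multiply_three_numbers(number1, number2, number3):
    """Hypothetical variant that takes three arguments."""
    return number1 * number2 * number3


def get_default_number():
    """Hypothetical function that takes no arguments."""
    return 3


# The value after `return` is what the call hands back to our code.
product = multiply_numbers(2, 3)
print(product)  # 6

print(multiply_three_numbers(2, 3, 4))  # 24
print(get_default_number())             # 3

# help() displays the function's signature and docstring.
help(multiply_numbers)
```

Storing the returned value in a variable, as with product, lets it be reused in later calculations.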
Task 1#
Create a function that takes in any 2 numbers and returns the result of a modulo operation.
Task 2#
Can you improve the code below?
v = 2 + 5 * 1.4
x = 2 + 2 * 3
y = 2 + 4 * 5
z = 2 + 10 * -1
HINT: You may wish to look for the common behaviour and put it into a function.