02. A Quick Tour of The Notebook

Modified

October 19, 2023

Welcome to the Jupyter Notebook!

If you’re reading this properly, then you’ve installed Anaconda correctly. You are looking at a Jupyter notebook, which is a user-friendly combination of code and text that helps you perform data analysis and visualization easily!

Introducing The Notebook

A Jupyter notebook is made up of cells. Each cell holds either Markdown (formatted text and images) or code.

You can include images as a weblink:

[Image: claridge-chang-lab-logo]

If you’re not connected to the internet, the image above won’t render.

Double-click on this chunk of text. You should enter “edit mode” which allows you to change the text.

Hit Shift+Enter to render the change.

Running Cells

First, we need to explain how to run cells. Try to run the cell below!

Notice how the cell bracket on the left first becomes an asterisk “*” to indicate the cell is running.

Once the cell has completed running, the asterisk is replaced by a number (in this case, “1”). This is a running count of the number of cells you have run in this notebook.

import pandas as pd # Don't worry about this line yet. We'll explain it below!
print("Hi! This is a cell. Press the ▶ button in the toolbar above to run it.")
Hi! This is a cell. Press the ▶ button in the toolbar above to run it.

You can also run a cell with Shift+Enter.

One of the most useful features of the Jupyter notebook is its tab completion, together with its pop-up documentation.

Try this: remove the “#” in the cell below (this is called uncommenting), click just after read_csv(, and press Shift+Tab.

# pd.read_csv(

You should see this: [Image: Using Shift-Tab to get documentation]

This is the documentation for the function pd.read_csv(). You should be able to scroll within the box.
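The text in that pop-up is the function's documentation string, which can also be read from code. A minimal sketch, using the built-in `len` as a stand-in so that it runs even without pandas; the variable name `doc_text` is only illustrative:

```python
# Every Python function can carry a documentation string (a "docstring").
# Shift+Tab displays it in a pop-up; the `__doc__` attribute returns it as text.
doc_text = len.__doc__
print(doc_text)
```

In a notebook cell, `help(len)` or `len?` should show the same documentation in a more readable layout.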

You can also perform tab completion for function names.

After removing the #, click just after pd.r in the cell below and press Tab.

# pd.r

You should see this:

[Image: Using Tab completion]

Saving Your Work

As of the latest stable version, JupyterLab will autosave your notebook! Even so, you should still click the 💾 button or hit Ctrl+S regularly.

Introducing Python

# This is a code cell. You can change the cell type in the dropdown menu above.
# In Python, anything following a `#` is a comment; it is ignored. 
# Below, we demonstrate a simple addition of two variables.

a = 5
b = 6
print(a + b)
print(a * b)
11
30

Try changing the values of a and b, and hit Shift+Enter or the Play button ▶ above.

Check to see if the values of a + b and a * b are as expected.

Congratulations! 🎉 🎊

You are now officially a programmer!!!

FOR loops

One of the key advantages of coding is the ability to automate repetitive processes with loops.

for i in range(10):
    print(i, i*2, i*3)
0 0 0
1 2 3
2 4 6
3 6 9
4 8 12
5 10 15
6 12 18
7 14 21
8 16 24
9 18 27
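Loops are not limited to printing. As a small illustrative sketch (the variable name `total` is chosen here and is not part of the tutorial), a loop can also build up a result step by step:

```python
total = 0
for i in range(10):
    # Add the current value of i (0, 1, ..., 9) to the running total.
    total = total + i

print(total)  # prints 45
```

Note that `range(10)` stops before 10, so the loop body runs exactly ten times.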

Quick Introduction to Lists

Python has several different types of data structures. An important data structure is the list.

my_list = [1, 2, 3, 4, "John", "Mary", 1984]

You can access items of this list using 0-indexed notation. That is, the first item has an index of 0.

my_list[0]
1
my_list[1]
2

Python also allows negative indexing, which counts from the end of the list: -1 is the last item, -2 the second-to-last, and so on.

my_list[-1]
1984
my_list[-2]
'Mary'
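One way to think about a negative index is as the length of the list plus that index. The sketch below re-creates the list so that it runs on its own; the name `n` is just an illustrative helper:

```python
my_list = [1, 2, 3, 4, "John", "Mary", 1984]
n = len(my_list)  # the list has 7 items

# A negative index -k refers to the same item as index n - k.
print(my_list[-1] == my_list[n - 1])  # True
print(my_list[-2] == my_list[n - 2])  # True
```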

You can append items to the list.

my_list.append("new item")

my_list
[1, 2, 3, 4, 'John', 'Mary', 1984, 'new item']

You can also remove items from the list.

my_list.remove(4)

my_list.remove("John")

my_list
[1, 2, 3, 'Mary', 1984, 'new item']
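Two details of `remove` are worth knowing: it deletes only the first matching item, and it raises an error if the item is not in the list. A short sketch, using a separate illustrative list called `numbers`:

```python
numbers = [4, 7, 4, 9]

# Only the first occurrence of 4 is removed.
numbers.remove(4)
print(numbers)  # [7, 4, 9]

# Checking with `in` first avoids a ValueError for missing items.
if 100 in numbers:
    numbers.remove(100)
else:
    print("100 is not in the list, so there is nothing to remove.")
```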

You can also combine lists.

[1, 2, 3] + ["Four", "cinco", "六"]
[1, 2, 3, 'Four', 'cinco', '六']
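The `+` operator builds a new list and leaves the originals untouched, whereas the `extend` method changes a list in place. A brief sketch; the variable names `first`, `second`, and `combined` are illustrative:

```python
first = [1, 2, 3]
second = ["Four", "cinco", "六"]

# `+` creates a new list; `first` itself is unchanged.
combined = first + second
print(first)     # [1, 2, 3]
print(combined)  # [1, 2, 3, 'Four', 'cinco', '六']

# `extend` adds the items of `second` to `first` directly.
first.extend(second)
print(first)     # [1, 2, 3, 'Four', 'cinco', '六']
```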

Quick Introduction to Dictionaries

Another important data structure is the dictionary. It is often referred to as a dict. As the name suggests, you can look up values with keywords.

my_dict = {"John": 99,
           "Mary": 100,
           "Address": "8 College Road, S(169857)",
           "Fragments": ["attagagacca", "ggctttcta", "ttctcaatggt"]}
my_dict["John"]
99
my_dict["Address"]
'8 College Road, S(169857)'
my_dict["Fragments"]
['attagagacca', 'ggctttcta', 'ttctcaatggt']

You can add a value to a dictionary by assigning it to a new keyword.

my_dict["Susan"] = 1000

my_dict
{'John': 99,
 'Mary': 100,
 'Address': '8 College Road, S(169857)',
 'Fragments': ['attagagacca', 'ggctttcta', 'ttctcaatggt'],
 'Susan': 1000}
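Assigning to a keyword that already exists overwrites its value instead of adding a second entry. The sketch below uses a smaller, illustrative dictionary called `scores` to show this, along with two common ways to look things up safely:

```python
scores = {"John": 99, "Mary": 100}

scores["Susan"] = 1000  # new keyword: an entry is added
scores["John"] = 98     # existing keyword: the old value is replaced

# `in` checks whether a keyword exists.
print("Susan" in scores)    # True

# `.get` returns None instead of raising an error for a missing keyword.
print(scores.get("Peter"))  # None

# Loop over every keyword and value in the dictionary.
for name, score in scores.items():
    print(name, score)
```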

You can remove dictionary entries with the del statement.

del my_dict["John"]

my_dict
{'Mary': 100,
 'Address': '8 College Road, S(169857)',
 'Fragments': ['attagagacca', 'ggctttcta', 'ttctcaatggt'],
 'Susan': 1000}

Quiz

How do I remove ‘ggctttcta’ from my_dict["Fragments"]?

Importing Libraries

Python has a large selection of libraries for scientific computing and data visualization. These libraries have to be manually imported into your session.

import pandas as pd
# This translates into English as 'Hey Python, load the library `pandas`, and call it `pd` for short!'

import matplotlib.pyplot as plt
# In English: 'Python, there's a submodule `pyplot` in the library `matplotlib`. Import this submodule as `plt`.'

import seaborn as sns
# Now, can you translate this line of Python into English?

## For some reason, this gave an error in nbdev_test
# %matplotlib inline
## This is known as an "IPython magic command". Some magic commands control how the notebook behaves. 
## This particular line tells the notebook to render any output by `matplotlib` as an inline image.
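The same import patterns work for any library. As a self-contained sketch, the example below uses `math` from Python's standard library, so it runs without installing anything; the alias `m` is arbitrary and chosen only for illustration:

```python
import math            # load the library under its full name
import math as m       # load it and give it a short alias
from math import sqrt  # load just one function from the library

# All three lines call the same function.
print(math.sqrt(16))  # 4.0
print(m.sqrt(16))     # 4.0
print(sqrt(16))       # 4.0
```

Aliases such as `pd` and `plt` are community conventions rather than requirements, but sticking to them makes code easier for others to read.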

Further Reading

Read more about the JupyterLab interface.

Jupyter follows in the tradition of literate programming, the idea that code should be a piece of literature that incorporates prose, images, and code all in a single document.

More about magic commands in Jupyter.

Why is this named “Jupyter”? From this link:

The name has its origins in a few different places. First, the name comes from the planet Jupiter. We wanted to pick a name that evoked the traditions and ideas of science. Second, the core programming languages supported by Jupyter are Julia, Python and R. While the name Jupyter is not a direct acronym for these languages, it nods its head in those directions. In particular, the “y” in the middle of Jupyter was chosen to honor our Python heritage. Third, Galileo was the first person to discover the moons of Jupiter. His publication on the moons of Jupiter is an early example of research that includes the underlying data in the publication. This is one of the core ideas and requirements for scientific reproducibility. Reproducibility is one of the main focuses of our project.

Moons of Jupyter