01. Introduction

Modified

October 19, 2023

As research techniques and data collection have become almost completely digital and analysis methods have grown more sophisticated, it is critical that scientists develop three skills: data visualization, statistics, and coding. Unfortunately, many undergraduate biology programs emphasize the memorization of numerous facts while failing to offer courses in data graphics, estimation statistics, or scientific programming. In this session, we offer a basic orientation on these topics.

Before class

If any issue can’t be resolved with the steps below, we can work on it together in class.

Getting the necessary software

  1. You’ll need to get set up with a version-control system. Go to GitHub and get an account. Download and install GitHub Desktop.

  2. Retrieve the course materials from GitHub. Go to the course repository (“repo”) at https://github.com/ACCLAB/moda. Click the green Code button and then select Open with GitHub Desktop.
    You will be prompted to select a directory for the local repository. If you are using a PC, see the example path in the screenshot (clonedir.png).
    If you are using a Mac, it can be something like /Users/YOURUSERNAME/Documents/GitHub/moda.

  3. To get set up with Python and Jupyter notebooks, install the Anaconda Distribution on your laptop.

  4. Open Anaconda Navigator, then open a terminal window: click Environments > base (root), click the green triangle, and select Open Terminal.


  5. In the terminal, go to your moda directory (replace the path below with your own) and install it with pip:
    cd Documents/GitHub/moda
    pip install .
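
Once the install finishes, it can be reassuring to confirm that the packages used later in the course are visible to Python. The snippet below is a minimal sketch, not part of the course materials: the package list is taken from the reading list further down this page, and `check_packages` is a hypothetical helper name. It uses only the standard library, so it runs even if something is missing.

```python
# Sketch: report which of the packages named on this page can be imported.
from importlib.util import find_spec

# Assumption: these are the packages from the reading list; adjust as needed.
COURSE_PACKAGES = ["pandas", "matplotlib", "seaborn"]


def check_packages(names):
    """Return a dict mapping each package name to True if Python can find it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(COURSE_PACKAGES).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Save it to a file and run it with `python` in the same terminal you opened from Anaconda Navigator, so that it checks the same environment. If a package is reported as missing, bring your laptop to class and we can sort it out together.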

Checking out the notebooks

  1. Launch JupyterLab by clicking on it in Anaconda Navigator.

JupyterLab will open in a browser tab.

  2. In the File Browser panel in JupyterLab, navigate to the folder where you cloned the course repo (see step 2 under “Getting the necessary software”). Double-click on ‘nbs’. You should see a list of notebook files. Open “02_Quick_tour_of_the_Notebook.ipynb” by double-clicking on its icon in the JupyterLab browser window.

  3. Work through the notebook. Familiarize yourself with basic Python, and with working in the JupyterLab environment.
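
If the ‘nbs’ folder looks empty or you are unsure you are in the right place, you can list the notebooks from Python instead. This is a rough sketch under stated assumptions: `list_notebooks` is a hypothetical helper, and the path in the usage section is only the Mac example from the setup steps, so replace it with the directory you chose when cloning.

```python
# Sketch: list the notebook files inside a cloned repo's 'nbs' folder.
from pathlib import Path


def list_notebooks(repo_dir):
    """Return the sorted names of the .ipynb files in <repo_dir>/nbs."""
    nbs_dir = Path(repo_dir).expanduser() / "nbs"
    return sorted(p.name for p in nbs_dir.glob("*.ipynb"))


if __name__ == "__main__":
    # Assumed location; replace with your own clone directory.
    for name in list_notebooks("~/Documents/GitHub/moda"):
        print(name)
```

An empty result usually means the path points somewhere other than the cloned repo.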

Reading about the Python packages we will use

  1. Read about pandas, matplotlib, and seaborn.

  2. Read our papers on estimation statistics here and here.
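
For a first feel of what estimation statistics involves, the sketch below computes an effect size (a difference between two group means) and a confidence interval around it. It is only an illustration: it uses a plain percentile bootstrap written with the standard library rather than the method of any particular package, the function names are hypothetical, and the two samples are invented numbers, not real data.

```python
# Sketch: mean difference between two groups with a percentile-bootstrap CI.
import random
import statistics


def mean_difference(control, test):
    """Effect size: mean of the test group minus mean of the control group."""
    return statistics.mean(test) - statistics.mean(control)


def bootstrap_ci(control, test, n_resamples=2000, alpha=0.05, seed=0):
    """Return (lower, upper) percentile-bootstrap bounds for the mean difference."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(test, k=len(test))
        diffs.append(mean_difference(c, t))
    diffs.sort()
    lower_idx = int(n_resamples * alpha / 2)
    upper_idx = n_resamples - lower_idx - 1
    return diffs[lower_idx], diffs[upper_idx]


if __name__ == "__main__":
    # Invented example measurements, for illustration only.
    control = [4.1, 3.8, 5.0, 4.6, 4.3, 3.9, 4.8, 4.4]
    test = [5.2, 4.9, 5.8, 5.1, 6.0, 5.5, 4.7, 5.6]
    low, high = bootstrap_ci(control, test)
    print(f"mean difference = {mean_difference(control, test):.2f}")
    print(f"95% CI = [{low:.2f}, {high:.2f}]")
```

The point of the exercise is the shape of the answer: an effect size with an interval around it, rather than a single yes/no verdict.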

In class

  1. An overview of data analysis.

  2. A tour of the estimationstats web app.

  3. Presentation of a Jupyter notebook that introduces techniques in data analysis using Python.

  4. Try JupyterLite, an experimental web version of JupyterLab, with a class notebook here.