.. _Tutorial:

================
Tutorial: Basics
================

Load Libraries
--------------

.. code-block:: python3
    :linenos:

    import numpy as np
    import pandas as pd
    import dabest

    print("We're using DABEST v{}".format(dabest.__version__))

.. parsed-literal::

    We're using DABEST v2023.02.14

Create dataset for demo
-----------------------

Here, we create a dataset to illustrate how ``dabest`` functions. In this
dataset, each column corresponds to a group of observations.

.. code-block:: python3
    :linenos:

    from scipy.stats import norm  # Used in generation of populations.

    np.random.seed(9999)  # Fix the seed so the results are replicable.
    # pop_size = 10000    # Size of each population.
    Ns = 20               # The number of samples taken from each population.

    # Create samples.
    c1 = norm.rvs(loc=3, scale=0.4, size=Ns)
    c2 = norm.rvs(loc=3.5, scale=0.75, size=Ns)
    c3 = norm.rvs(loc=3.25, scale=0.4, size=Ns)

    t1 = norm.rvs(loc=3.5, scale=0.5, size=Ns)
    t2 = norm.rvs(loc=2.5, scale=0.6, size=Ns)
    t3 = norm.rvs(loc=3, scale=0.75, size=Ns)
    t4 = norm.rvs(loc=3.5, scale=0.75, size=Ns)
    t5 = norm.rvs(loc=3.25, scale=0.4, size=Ns)
    t6 = norm.rvs(loc=3.25, scale=0.4, size=Ns)

    # Add a `gender` column for coloring the data.
    # Integer division keeps the repeat count an int.
    females = np.repeat('Female', Ns // 2).tolist()
    males = np.repeat('Male', Ns // 2).tolist()
    gender = females + males

    # Add an `id` column for paired data plotting.
    id_col = pd.Series(range(1, Ns + 1))

    # Combine samples and gender into a DataFrame.
    df = pd.DataFrame({'Control 1' : c1,  'Test 1' : t1,
                       'Control 2' : c2,  'Test 2' : t2,
                       'Control 3' : c3,  'Test 3' : t3,
                       'Test 4'    : t4,  'Test 5' : t5,
                       'Test 6'    : t6,  'Gender' : gender,
                       'ID'        : id_col
                      })

Note that we have 9 groups (3 Control samples and 6 Test samples). Our
dataset also has a non-numerical column indicating gender, and another
column indicating the identity of each observation.

This is known as a ‘wide’ dataset. See this `writeup `__ for more
details.

.. code-block:: python3
    :linenos:

    df.head()
.. parsed-literal::

       Control 1    Test 1  Control 2    Test 2  Control 3    Test 3    Test 4    Test 5    Test 6  Gender  ID
    0   2.793984  3.420875   3.324661  1.707467   3.816940  1.796581  4.440050  2.937284  3.486127  Female   1
    1   3.236759  3.467972   3.685186  1.121846   3.750358  3.944566  3.723494  2.837062  2.338094  Female   2
    2   3.019149  4.377179   5.616891  3.301381   2.945397  2.832188  3.214014  3.111950  3.270897  Female   3
    3   2.804638  4.564780   2.773152  2.534018   3.575179  3.048267  4.968278  3.743378  3.151188  Female   4
    4   2.858019  3.220058   2.550361  2.796365   3.692138  3.276575  2.662104  2.977341  2.328601  Female   5
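If you want to convince yourself of the layout before going further, a
couple of lines of ordinary pandas will do it. This is a quick sketch,
nothing dabest-specific; the expected counts follow from the generation
code above.

.. code-block:: python3
    :linenos:

    # 20 observations (rows) x 11 columns: 9 numerical groups plus Gender and ID.
    print(df.shape)

    # The nine group columns are floats; Gender is text and ID is an integer.
    print(df.dtypes.value_counts())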
Loading Data
------------

Before we create estimation plots and obtain confidence intervals for our
effect sizes, we need to load the data and the relevant groups.

We simply supply the DataFrame to ``dabest.load()``, and pass the two
groups we want to compare to the ``idx`` argument as a tuple or list.

.. code-block:: python3
    :linenos:

    two_groups_unpaired = dabest.load(df, idx=("Control 1", "Test 1"),
                                      resamples=5000)

Calling this ``Dabest`` object gives you a gentle greeting, as well as the
comparisons that can be computed.

.. code-block:: python3
    :linenos:

    two_groups_unpaired

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Sun Aug 29 18:00:54 2021.

    Effect size(s) with 95% confidence intervals will be computed for:
    1. Test 1 minus Control 1

    5000 resamples will be used to generate the effect size bootstraps.

Changing statistical parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can change the width of the confidence interval that will be produced
by manipulating the ``ci`` argument.

.. code-block:: python3
    :linenos:

    two_groups_unpaired_ci90 = dabest.load(df, idx=("Control 1", "Test 1"),
                                           ci=90)

.. code-block:: python3
    :linenos:

    two_groups_unpaired_ci90

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good afternoon! The current time is Mon Oct 19 17:12:44 2020.

    Effect size(s) with 90% confidence intervals will be computed for:
    1. Test 1 minus Control 1

    5000 resamples will be used to generate the effect size bootstraps.

Effect sizes
------------

``dabest`` now features a range of effect sizes:

-  the mean difference (``mean_diff``)
-  the median difference (``median_diff``)
-  `Cohen’s d `__ (``cohens_d``)
-  `Hedges’ g `__ (``hedges_g``)
-  `Cliff’s delta `__ (``cliffs_delta``)

Each of these effect sizes is an attribute of the ``Dabest`` object.

.. code-block:: python3
    :linenos:

    two_groups_unpaired.mean_diff

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Sun Aug 29 18:10:44 2021.

    The unpaired mean difference between Control 1 and Test 1 is 0.48
    [95%CI 0.221, 0.768].
    The p-value of the two-sided permutation t-test is 0.001,
    calculated for legacy purposes only.

    5000 bootstrap samples were taken; the confidence interval is
    bias-corrected and accelerated.
    Any p-value reported is the probability of observing the effect size
    (or greater), assuming the null hypothesis of zero difference is true.
    For each p-value, 5000 reshuffles of the control and test labels were
    performed.

    To get the results of all valid statistical tests, use
    `.mean_diff.statistical_tests`

For each comparison, the type of effect size is reported (here, it’s the
“unpaired mean difference”). The confidence interval is reported as:
[*confidenceIntervalWidth* *LowerBound*, *UpperBound*]

This confidence interval is generated through bootstrap resampling. See
:doc:`bootstraps` for more details.

Since v0.3.0, DABEST will report the p-value of the `non-parametric
two-sided approximate permutation t-test `__. This is also known as the
Monte Carlo permutation test.

For unpaired comparisons, the p-values and test statistics of `Welch’s t
test `__, `Student’s t test `__, and `Mann-Whitney U test `__ are also
reported. For paired comparisons, the p-values and test statistics of the
`paired Student’s t `__ and `Wilcoxon `__ tests are presented.

.. code-block:: python3
    :linenos:

    pd.options.display.max_columns = 50
    two_groups_unpaired.mean_diff.results
.. parsed-literal::

         control    test  control_N  test_N      effect_size is_paired  difference  ci
    0  Control 1  Test 1         20      20  mean difference      None     0.48029  95

        bca_low  bca_high bca_interval_idx   pct_low  pct_high pct_interval_idx
    0  0.220869  0.767721      (140, 4889)  0.215697  0.761716      (125, 4875)

                                               bootstraps  resamples  random_seed
    0  [0.6686169333655454, 0.4382051534234943, 0.665...       5000        12345

                                             permutations  pvalue_permutation  permutation_count
    0  [-0.17259843762502491, 0.03802293852634886, -0...               0.001               5000

                                         permutations_var  pvalue_welch  statistic_welch
    0  [0.026356588154404337, 0.027102495439046997, 0...      0.002094        -3.308806

       pvalue_students_t  statistic_students_t  pvalue_mann_whitney  statistic_mann_whitney
    0           0.002057             -3.308806             0.001625                    83.0
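``.results`` is an ordinary pandas DataFrame, so the numbers quoted in the
text summary can be pulled out with standard indexing. A minimal sketch,
using only the column names visible in the table above:

.. code-block:: python3
    :linenos:

    res = two_groups_unpaired.mean_diff.results

    # Point estimate of the mean difference and its BCa 95% confidence limits.
    print(res.loc[0, 'difference'])
    print(res.loc[0, 'bca_low'], res.loc[0, 'bca_high'])

    # The permutation p-value reported "for legacy purposes only".
    print(res.loc[0, 'pvalue_permutation'])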
.. code-block:: python3
    :linenos:

    two_groups_unpaired.mean_diff.statistical_tests
.. parsed-literal::

         control    test  control_N  test_N      effect_size is_paired  difference  ci
    0  Control 1  Test 1         20      20  mean difference      None     0.48029  95

        bca_low  bca_high  pvalue_permutation  pvalue_welch  statistic_welch
    0  0.220869  0.767721               0.001      0.002094        -3.308806

       pvalue_students_t  statistic_students_t  pvalue_mann_whitney  statistic_mann_whitney
    0           0.002057             -3.308806             0.001625                    83.0
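Because ``.statistical_tests`` is also a plain DataFrame, you can save it
alongside your figures, for example as a CSV for a supplementary table.
A short sketch using standard pandas; the filename is arbitrary:

.. code-block:: python3
    :linenos:

    stats_table = two_groups_unpaired.mean_diff.statistical_tests

    # Write the effect size, confidence limits, and test statistics to disk.
    stats_table.to_csv("control1_vs_test1_stats.csv", index=False)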
Let’s compute the Hedges’ *g* for our comparison.

.. code-block:: python3
    :linenos:

    two_groups_unpaired.hedges_g

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Sun Aug 29 18:12:17 2021.

    The unpaired Hedges' g between Control 1 and Test 1 is 1.03
    [95%CI 0.349, 1.62].
    The p-value of the two-sided permutation t-test is 0.001,
    calculated for legacy purposes only.

    5000 bootstrap samples were taken; the confidence interval is
    bias-corrected and accelerated.
    Any p-value reported is the probability of observing the effect size
    (or greater), assuming the null hypothesis of zero difference is true.
    For each p-value, 5000 reshuffles of the control and test labels were
    performed.

    To get the results of all valid statistical tests, use
    `.hedges_g.statistical_tests`

.. code-block:: python3
    :linenos:

    two_groups_unpaired.hedges_g.results
.. parsed-literal::

         control    test  control_N  test_N  effect_size is_paired  difference  ci
    0  Control 1  Test 1         20      20    Hedges' g      None    1.025525  95

        bca_low  bca_high bca_interval_idx   pct_low  pct_high pct_interval_idx
    0  0.349394  1.618579       (42, 4724)  0.472844   1.74166      (125, 4875)

                                               bootstraps  resamples  random_seed
    0  [1.1337301267831184, 0.8311210968422604, 1.539...       5000        12345

                                             permutations  pvalue_permutation  permutation_count
    0  [-0.3295089865590538, 0.07158401210924781, -0....               0.001               5000

                                         permutations_var  pvalue_welch  statistic_welch
    0  [0.026356588154404337, 0.027102495439046997, 0...      0.002094        -3.308806

       pvalue_students_t  statistic_students_t  pvalue_mann_whitney  statistic_mann_whitney
    0           0.002057             -3.308806             0.001625                    83.0
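Since every effect size carries the same kind of ``.results`` table, you
can also tabulate the mean difference and Hedges’ *g* side by side. A
small sketch using only the attributes shown above:

.. code-block:: python3
    :linenos:

    for effect in (two_groups_unpaired.mean_diff, two_groups_unpaired.hedges_g):
        row = effect.results.loc[0]
        # e.g. "mean difference 0.48 (0.221, 0.768)" and "Hedges' g 1.026 (0.349, 1.619)"
        print(row['effect_size'],
              round(row['difference'], 3),
              (round(row['bca_low'], 3), round(row['bca_high'], 3)))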
Producing estimation plots
--------------------------

To produce a **Gardner-Altman estimation plot**, simply use the
``.plot()`` method. You can read more about its genesis and design
inspiration at :doc:`robust-beautiful`.

Every effect size instance has access to the ``.plot()`` method, so you
can quickly create plots for different effect sizes.

.. code-block:: python3
    :linenos:

    two_groups_unpaired.mean_diff.plot();

.. image:: _images/tutorial_27_0.png

.. code-block:: python3
    :linenos:

    two_groups_unpaired.hedges_g.plot();

.. image:: _images/tutorial_28_0.png

Instead of a Gardner-Altman plot, you can produce a **Cumming estimation
plot** by setting ``float_contrast=False`` in the ``plot()`` method. This
plots the bootstrap effect sizes below the raw data, and also displays
the mean (gap) and ± standard deviation of each group (vertical ends) as
gapped lines. This design was inspired by Edward Tufte’s dictum to
maximise the data-ink ratio.

.. code-block:: python3
    :linenos:

    two_groups_unpaired.hedges_g.plot(float_contrast=False);

.. image:: _images/tutorial_30_0.png

The ``dabest`` package also implements a range of estimation plot designs
aimed at depicting common experimental designs.

The **multi-two-group estimation plot** tiles two or more Cumming plots
horizontally, and is created by passing a *nested tuple* to ``idx`` when
``dabest.load()`` is first invoked. The lower axes of such a Cumming plot
are effectively a `forest plot `__, used in meta-analyses to aggregate
and compare data from different experiments.

.. code-block:: python3
    :linenos:

    multi_2group = dabest.load(df, idx=(("Control 1", "Test 1",),
                                        ("Control 2", "Test 2")
                                       ))

    multi_2group.mean_diff.plot();

.. image:: _images/tutorial_35_0.png

The **shared control plot** displays another common experimental
paradigm, where several test samples are compared against a common
reference sample.

This type of Cumming plot is automatically generated if the tuple passed
to ``idx`` has more than two data columns.

.. code-block:: python3
    :linenos:

    shared_control = dabest.load(df, idx=("Control 1", "Test 1",
                                          "Test 2", "Test 3",
                                          "Test 4", "Test 5", "Test 6")
                                 )

.. code-block:: python3
    :linenos:

    shared_control

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Tue Aug 31 23:39:22 2021.

    Effect size(s) with 95% confidence intervals will be computed for:
    1. Test 1 minus Control 1
    2. Test 2 minus Control 1
    3. Test 3 minus Control 1
    4. Test 4 minus Control 1
    5. Test 5 minus Control 1
    6. Test 6 minus Control 1

    5000 resamples will be used to generate the effect size bootstraps.

.. code-block:: python3
    :linenos:

    shared_control.mean_diff

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Tue Aug 31 23:42:39 2021.

    The unpaired mean difference between Control 1 and Test 1 is 0.48
    [95%CI 0.221, 0.768].
    The p-value of the two-sided permutation t-test is 0.001,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 1 and Test 2 is -0.542
    [95%CI -0.914, -0.211].
    The p-value of the two-sided permutation t-test is 0.0042,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 1 and Test 3 is 0.174
    [95%CI -0.295, 0.628].
    The p-value of the two-sided permutation t-test is 0.479,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 1 and Test 4 is 0.79
    [95%CI 0.306, 1.31].
    The p-value of the two-sided permutation t-test is 0.0042,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 1 and Test 5 is 0.265
    [95%CI 0.0137, 0.497].
    The p-value of the two-sided permutation t-test is 0.0404,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 1 and Test 6 is 0.288
    [95%CI -0.00441, 0.515].
    The p-value of the two-sided permutation t-test is 0.0324,
    calculated for legacy purposes only.

    5000 bootstrap samples were taken; the confidence interval is
    bias-corrected and accelerated.
    Any p-value reported is the probability of observing the effect size
    (or greater), assuming the null hypothesis of zero difference is true.
    For each p-value, 5000 reshuffles of the control and test labels were
    performed.

    To get the results of all valid statistical tests, use
    `.mean_diff.statistical_tests`

.. code-block:: python3
    :linenos:

    shared_control.mean_diff.plot();

.. image:: _images/tutorial_42_0.png

``dabest`` thus empowers you to robustly perform and elegantly present
complex visualizations and statistics.

.. code-block:: python3
    :linenos:

    multi_groups = dabest.load(df, idx=(("Control 1", "Test 1",),
                                        ("Control 2", "Test 2", "Test 3"),
                                        ("Control 3", "Test 4", "Test 5", "Test 6")
                                       ))

.. code-block:: python3
    :linenos:

    multi_groups

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Tue Aug 31 23:47:40 2021.

    Effect size(s) with 95% confidence intervals will be computed for:
    1. Test 1 minus Control 1
    2. Test 2 minus Control 2
    3. Test 3 minus Control 2
    4. Test 4 minus Control 3
    5. Test 5 minus Control 3
    6. Test 6 minus Control 3

    5000 resamples will be used to generate the effect size bootstraps.

.. code-block:: python3
    :linenos:

    multi_groups.mean_diff

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Tue Aug 31 23:48:17 2021.

    The unpaired mean difference between Control 1 and Test 1 is 0.48
    [95%CI 0.221, 0.768].
    The p-value of the two-sided permutation t-test is 0.001,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 2 and Test 2 is -1.38
    [95%CI -1.93, -0.895].
    The p-value of the two-sided permutation t-test is 0.0,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 2 and Test 3 is -0.666
    [95%CI -1.3, -0.103].
    The p-value of the two-sided permutation t-test is 0.0352,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 3 and Test 4 is 0.362
    [95%CI -0.114, 0.887].
    The p-value of the two-sided permutation t-test is 0.161,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 3 and Test 5 is -0.164
    [95%CI -0.404, 0.0742].
    The p-value of the two-sided permutation t-test is 0.208,
    calculated for legacy purposes only.

    The unpaired mean difference between Control 3 and Test 6 is -0.14
    [95%CI -0.398, 0.102].
    The p-value of the two-sided permutation t-test is 0.282,
    calculated for legacy purposes only.

    5000 bootstrap samples were taken; the confidence interval is
    bias-corrected and accelerated.
    Any p-value reported is the probability of observing the effect size
    (or greater), assuming the null hypothesis of zero difference is true.
    For each p-value, 5000 reshuffles of the control and test labels were
    performed.

    To get the results of all valid statistical tests, use
    `.mean_diff.statistical_tests`

.. code-block:: python3
    :linenos:

    multi_groups.mean_diff.plot();

.. image:: _images/tutorial_47_0.png

Using long (aka ‘melted’) data frames
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``dabest`` can also work with ‘melted’ or ‘long’ data.
The data is described as ‘long’ because each row corresponds to a single
datapoint, with one column carrying the value and the other columns
carrying ‘metadata’ describing that datapoint.

More details on wide vs. long or ‘melted’ data can be found in this
`Wikipedia article `__. The `pandas documentation `__ gives recipes for
melting dataframes.

.. code-block:: python3
    :linenos:

    x = 'group'
    y = 'metric'

    value_cols = df.columns[:-2]  # Select all but the "Gender" and "ID" columns.

    df_melted = pd.melt(df.reset_index(),
                        id_vars=["Gender", "ID"],
                        value_vars=value_cols,
                        value_name=y,
                        var_name=x)

    df_melted.head()  # Gives the first five rows of `df_melted`.
.. parsed-literal::

       Gender  ID      group    metric
    0  Female   1  Control 1  2.793984
    1  Female   2  Control 1  3.236759
    2  Female   3  Control 1  3.019149
    3  Female   4  Control 1  2.804638
    4  Female   5  Control 1  2.858019
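As a quick sanity check of the reshape (plain pandas again, just a
sketch): each of the 9 groups should contribute ``Ns`` = 20 rows, and only
the two ``id_vars`` plus the new ``group`` and ``metric`` columns should
remain.

.. code-block:: python3
    :linenos:

    print(df_melted.shape)               # (180, 4): 9 groups x 20 observations.
    print(df_melted['group'].nunique())  # 9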
When your data is in this format, you will need to specify the ``x`` and
``y`` columns in ``dabest.load()``.

.. code-block:: python3
    :linenos:

    analysis_of_long_df = dabest.load(df_melted, idx=("Control 1", "Test 1"),
                                      x="group", y="metric")

    analysis_of_long_df

.. parsed-literal::

    DABEST v2023.02.14
    ==================

    Good evening! The current time is Tue Aug 31 23:51:12 2021.

    Effect size(s) with 95% confidence intervals will be computed for:
    1. Test 1 minus Control 1

    5000 resamples will be used to generate the effect size bootstraps.

.. code-block:: python3
    :linenos:

    analysis_of_long_df.mean_diff.plot();

.. image:: _images/tutorial_52_0.png
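Finally, a practical note on keeping a plot for a manuscript: ``.plot()``
should hand back a matplotlib ``Figure``, which you can write to disk in
the usual matplotlib way. The snippet below is a hedged sketch rather than
part of the dabest API proper; the defensive ``gcf()`` fallback covers the
case where your version returns ``None`` instead.

.. code-block:: python3
    :linenos:

    import matplotlib.pyplot as plt

    fig = analysis_of_long_df.mean_diff.plot()
    if fig is None:      # Defensive: fall back to the current active figure.
        fig = plt.gcf()

    # Save a high-resolution copy; the filename is arbitrary.
    fig.savefig("estimation_plot.png", dpi=300, bbox_inches="tight")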