Loading Data

Loading data and relevant groups

load

 load (data, idx=None, x=None, y=None, paired=None, id_col=None, ci=95,
       resamples=5000, random_seed=12345, proportional=False,
       delta2=False, experiment=None, experiment_label=None,
       x1_level=None, mini_meta=False)

Loads data in preparation for estimation statistics.

This is designed to work with pandas DataFrames.

Parameter Type Default Details
data pandas DataFrame
idx NoneType None List of column names (if ‘x’ is not supplied) or of category names
(if ‘x’ is supplied). This can be expressed as a tuple of tuples,
with each individual tuple producing its own contrast plot
x NoneType None Column name(s) of the independent variable. This can be expressed as
a list of 2 elements if and only if ‘delta2’ is True; otherwise it
can only be a string.
y NoneType None Column name of the dependent variable, i.e. the data to be plotted on the y-axis.
paired NoneType None The pairing scheme of the experiment from which the data were obtained. If
‘paired’ is None, the data are not treated as paired in the subsequent
calculations. If ‘paired’ is ‘baseline’, then within each tuple of idx, every
other group is paired with the first group (as control). If ‘paired’ is
‘sequential’, then within each tuple of idx, each group is paired with
the group before it (as control).
id_col NoneType None Name of the column that identifies each subject. Required if ‘paired’ is
‘baseline’ or ‘sequential’.
ci int 95 The confidence interval width. The default of 95 produces 95%
confidence intervals.
resamples int 5000 The number of resamples taken to generate the bootstraps which are used
to generate the confidence intervals.
random_seed int 12345 This integer is used to seed the random number generator during
bootstrap resampling, ensuring that the confidence intervals
reported are replicable.
proportional bool False Whether the data are binary. When True, the data must consist only of
the values 0 and 1; proportion data coded as non-numeric values, such as
the strings ‘yes’ and ‘no’, are not supported. When False or not provided,
the data are treated as continuous.
delta2 bool False Whether the data come from a delta-delta experiment.
experiment NoneType None Name of the DataFrame column that contains the experiment labels.
experiment_label NoneType None
x1_level NoneType None A list of strings specifying the order of subplots for delta-delta plots.
This can be expressed as a list of 2 elements if and only if ‘delta2’
is True; otherwise it can only be a string.
mini_meta bool False Indicator of weighted delta calculation.
Returns A Dabest object.
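
The x, y and id_col parameters describe long-format data, where one column holds the group label and another holds the measurement. As a rough sketch, a wide table can be reshaped with pandas before loading. The column names "ID", "group" and "value" are made up for illustration and are not required by dabest; the load call is shown only as a comment, using parameter names from the signature above.

```python
import numpy as np
import pandas as pd

# Wide format: one column per group, one row per subject.
rng = np.random.default_rng(12345)
wide_df = pd.DataFrame({
    "ID": np.arange(1, 11),  # hypothetical subject identifier
    "Control 1": rng.normal(loc=100, scale=5, size=10),
    "Test 1": rng.normal(loc=115, scale=5, size=10),
})

# Long format: "group" holds the category names, "value" the measurements.
long_df = wide_df.melt(id_vars="ID", var_name="group", value_name="value")

print(long_df.head())

# With data in this shape, a paired load would look something like this
# (assumed usage, not executed here):
# import dabest
# paired_data = dabest.load(long_df, x="group", y="value",
#                           idx=("Control 1", "Test 1"),
#                           paired="baseline", id_col="ID")
```

Here idx lists category names found in the "group" column rather than column names, because x is supplied.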

prop_dataset

 prop_dataset (group:Union[list,tuple,numpy.ndarray,dict],
               group_names:Optional[list]=None)

Convenience function to generate a DataFrame of binary data.

Parameter Type Default Details
group Union[list, tuple, np.ndarray, dict] Accepts lists, tuples, or numpy ndarrays of numeric types.
group_names Optional[list] None

Example

import numpy as np
import pandas as pd
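import scipy as sp
import scipy.stats  # load the stats submodule explicitly so that sp.stats resolves on older SciPy versions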
import dabest

Create dummy data for demonstration.

np.random.seed(88888)
N = 10
c1 = sp.stats.norm.rvs(loc=100, scale=5, size=N)
t1 = sp.stats.norm.rvs(loc=115, scale=5, size=N)
df = pd.DataFrame({"Control 1": c1, "Test 1": t1})

Load the data.

my_data = dabest.load(df, idx=("Control 1", "Test 1"))
my_data
DABEST v2024.03.29
==================
                  
Good afternoon!
The current time is Tue Mar 19 15:34:58 2024.

Effect size(s) with 95% confidence intervals will be computed for:
1. Test 1 minus Control 1

5000 resamples will be used to generate the effect size bootstraps.
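
The summary above mentions resamples and confidence intervals. To give a feel for how resamples, ci and random_seed interact, here is a simplified percentile bootstrap of the mean difference, written with NumPy only. It is a sketch for intuition and is not dabest's implementation, which may differ (for example, by applying bias correction). The function name bootstrap_mean_diff is hypothetical.

```python
import numpy as np

def bootstrap_mean_diff(control, test, ci=95, resamples=5000, random_seed=12345):
    """Percentile-bootstrap CI for mean(test) - mean(control) (illustrative only)."""
    rng = np.random.default_rng(random_seed)  # fixed seed -> replicable interval
    control = np.asarray(control, dtype=float)
    test = np.asarray(test, dtype=float)

    diffs = np.empty(resamples)
    for i in range(resamples):
        # Resample each group independently, with replacement.
        c = rng.choice(control, size=control.size, replace=True)
        t = rng.choice(test, size=test.size, replace=True)
        diffs[i] = t.mean() - c.mean()

    # A 95% interval spans the 2.5th to 97.5th percentiles of the resampled differences.
    alpha = (100 - ci) / 2
    low, high = np.percentile(diffs, [alpha, 100 - alpha])
    return test.mean() - control.mean(), low, high

# Dummy data similar in spirit to the example above.
rng = np.random.default_rng(88888)
c1 = rng.normal(loc=100, scale=5, size=10)
t1 = rng.normal(loc=115, scale=5, size=10)

diff, low, high = bootstrap_mean_diff(c1, t1)
print(f"mean difference = {diff:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Calling the function twice with the same random_seed returns the same interval, which is the replicability that the random_seed parameter is meant to provide.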

For a proportion plot, load binary (0/1) data with proportional=True.

np.random.seed(88888)
N = 10
c1 = np.random.binomial(1, 0.2, size=N)
t1 = np.random.binomial(1, 0.5, size=N)
df = pd.DataFrame({"Control 1": c1, "Test 1": t1})
my_data = dabest.load(df, idx=("Control 1", "Test 1"), proportional=True)
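
Because proportional=True expects values of 0 and 1, string-coded outcomes need converting first. One possible approach with pandas, assuming a hypothetical table whose columns hold 'yes'/'no' answers:

```python
import pandas as pd

# Hypothetical raw data coded as strings.
raw_df = pd.DataFrame({
    "Control 1": ["no", "no", "yes", "no", "no"],
    "Test 1": ["yes", "no", "yes", "yes", "no"],
})

# Map each answer to 0/1; any unexpected value becomes NaN.
mapping = {"no": 0, "yes": 1}
binary_df = raw_df.apply(lambda col: col.map(mapping))

# Fail early if any value was neither "yes" nor "no".
if binary_df.isna().any().any():
    raise ValueError("Found values other than 'yes'/'no'")
binary_df = binary_df.astype(int)

print(binary_df)
```

The resulting binary_df contains only 0s and 1s and could then be passed to dabest.load with proportional=True, as in the example above.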