Loading Data

Loading data and relevant groups

load

 load (data, idx=None, x=None, y=None, paired=None, id_col=None, ci=95,
       resamples=5000, random_seed=12345, proportional=False,
       delta2=False, experiment=None, experiment_label=None,
       x1_level=None, mini_meta=False, ps_adjust=False)

Loads data in preparation for estimation statistics.

This function is designed to work with pandas DataFrames.

	Type	Default	Details
data	pandas DataFrame
idx	NoneType	None	List of column names (if ‘x’ is not supplied) or of category names (if ‘x’ is supplied). This can be expressed as a tuple of tuples, with each individual tuple producing its own contrast plot.
x	NoneType	None	Column name(s) of the independent variable. This can be expressed as a list of 2 elements if and only if ‘delta2’ is True; otherwise it can only be a string.
y	NoneType	None	Column name of the dependent variable, whose values are plotted on the y-axis.
paired	NoneType	None	The type of experiment under which the data were obtained. If ‘paired’ is None, the data are not treated as paired in subsequent calculations. If ‘paired’ is ‘baseline’, then within each tuple of ‘idx’, every other group is paired with the first group (the control). If ‘paired’ is ‘sequential’, then within each tuple of ‘idx’, each group is paired with the group before it (as its control).
id_col	NoneType	None	Required if `paired` is not None (that is, ‘baseline’ or ‘sequential’).
ci	int	95	The confidence interval width. The default of 95 produces 95% confidence intervals.
resamples	int	5000	The number of resamples taken to generate the bootstraps which are used to generate the confidence intervals.
random_seed	int	12345	This integer is used to seed the random number generator during bootstrap resampling, ensuring that the confidence intervals reported are replicable.
proportional	bool	False	Whether the data are binary. If True, the values must be numeric 0s and 1s; proportion data encoded as non-numeric values, such as the strings ‘yes’ and ‘no’, is not supported. If False or omitted, the data are assumed to be continuous and a non-proportional representation is used.
delta2	bool	False	Whether the data come from a delta-delta experiment.
experiment	NoneType	None	Name of the DataFrame column that contains the experiment labels.
experiment_label	NoneType	None
x1_level	NoneType	None	A list of strings specifying the order of subplots for delta-delta plots. This can be expressed as a list of 2 elements if and only if ‘delta2’ is True; otherwise it can only be a string.
mini_meta	bool	False	Whether to calculate the weighted delta.
ps_adjust	bool	False	Whether to adjust the calculated p-value according to Phipson & Smyth (2010), https://doi.org/10.2202/1544-6115.1585
Returns	A `Dabest` object.
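
To make the roles of `ci`, `resamples`, and `random_seed` concrete, the following is a minimal sketch of a plain percentile bootstrap for a difference in means, written with NumPy only. It is an illustration under stated assumptions, not DABEST's internal implementation: the helper name `bootstrap_mean_diff` is hypothetical, and the library may use a different interval method.

```python
import numpy as np


def bootstrap_mean_diff(control, test, ci=95, resamples=5000, random_seed=12345):
    """Percentile-bootstrap CI for mean(test) - mean(control).

    Hypothetical helper for illustration only; not part of DABEST.
    """
    rng = np.random.default_rng(random_seed)  # seeding makes the CI replicable
    control = np.asarray(control, dtype=float)
    test = np.asarray(test, dtype=float)

    diffs = np.empty(resamples)
    for i in range(resamples):
        # Resample each group with replacement, keeping its original size.
        c = rng.choice(control, size=control.size, replace=True)
        t = rng.choice(test, size=test.size, replace=True)
        diffs[i] = t.mean() - c.mean()

    # A ci of 95 leaves 2.5% in each tail of the bootstrap distribution.
    alpha = (100 - ci) / 2
    low, high = np.percentile(diffs, [alpha, 100 - alpha])
    return test.mean() - control.mean(), low, high


# Made-up demonstration data.
data_rng = np.random.default_rng(0)
control = data_rng.normal(100, 5, size=10)
test = data_rng.normal(115, 5, size=10)

diff, low, high = bootstrap_mean_diff(control, test)
print(f"mean difference = {diff:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Calling the helper twice with the same `random_seed` returns identical bounds, which is the replicability that the `random_seed` parameter describes. Lowering `ci` narrows the interval, and raising `resamples` makes the estimated bounds more stable at the cost of run time.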

prop_dataset

 prop_dataset (group:Union[list,tuple,numpy.ndarray,dict],
               group_names:Optional[list]=None)

Convenience function to generate a DataFrame of binary data.

	Type	Default	Details
group	Union
group_names	Optional	None	Accepts lists, tuples, or numpy ndarrays of numeric types.

Example

import numpy as np
import pandas as pd
import scipy as sp
import scipy.stats  # explicitly load the stats submodule used below
import dabest

Create dummy data for demonstration.

np.random.seed(88888)
N = 10
c1 = sp.stats.norm.rvs(loc=100, scale=5, size=N)
t1 = sp.stats.norm.rvs(loc=115, scale=5, size=N)
df = pd.DataFrame({"Control 1": c1, "Test 1": t1})

Load the data.

my_data = dabest.load(df, idx=("Control 1", "Test 1"))
my_data

DABEST v2024.03.29
==================
                  
Good afternoon!
The current time is Tue Mar 19 15:34:58 2024.

Effect size(s) with 95% confidence intervals will be computed for:
1. Test 1 minus Control 1

5000 resamples will be used to generate the effect size bootstraps.
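
The example above uses wide-format data, where each group is its own column. When `x` and `y` are supplied instead, the data are expected in long format, with one column of group labels and one column of values. The sketch below reshapes a wide frame into that layout using pandas only. The column names `ID`, `group`, and `value` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Made-up wide-format data: one column per group.
rng = np.random.default_rng(88888)
N = 10
df = pd.DataFrame({
    "Control 1": rng.normal(100, 5, size=N),
    "Test 1": rng.normal(115, 5, size=N),
})

# Hypothetical subject identifier, needed only for paired analyses.
df["ID"] = np.arange(N)

# Reshape to long format: one row per (subject, group) observation.
df_long = df.melt(id_vars="ID", var_name="group", value_name="value")
print(df_long.head())
```

Under these assumptions, the long frame would be loaded with `dabest.load(df_long, x="group", y="value", idx=("Control 1", "Test 1"))`. A paired analysis would additionally pass `paired="baseline"` and `id_col="ID"`.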

For a proportion plot, load binary data and set `proportional=True`.

np.random.seed(88888)
N = 10
c1 = np.random.binomial(1, 0.2, size=N)
t1 = np.random.binomial(1, 0.5, size=N)
df = pd.DataFrame({"Control 1": c1, "Test 1": t1})
my_data = dabest.load(df, idx=("Control 1", "Test 1"), proportional=True)
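
Because `proportional=True` expects numeric 0s and 1s, outcomes recorded as strings need to be converted before loading. The following is a minimal sketch using pandas only. The data and the `mapping` dictionary are made up for illustration.

```python
import pandas as pd

# Made-up raw outcomes recorded as strings.
raw = pd.DataFrame({
    "Control 1": ["no", "no", "yes", "no", "no"],
    "Test 1": ["yes", "no", "yes", "yes", "no"],
})

# Map each label to 0 or 1; labels missing from the mapping become NaN.
mapping = {"no": 0, "yes": 1}
binary = raw.apply(lambda col: col.map(mapping))

# Fail loudly on unexpected labels rather than silently passing NaN along.
if binary.isna().any().any():
    raise ValueError("Found labels that are not in the mapping.")

binary = binary.astype(int)
print(binary)
```

The resulting `binary` frame has the same shape as the binomial example above and could be passed to `dabest.load` with `proportional=True` in the same way.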