API

import greatpy as gp

Tools : tl

Create regulatory domain

tl.create_regdom(tss_file, chr_sizes_file, ...)

Create regulatory domains (regdoms) according to the three association rules, optionally write the result to a file, and return it as a pd.DataFrame
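To make the idea of an association rule concrete, here is a minimal sketch of a single-nearest-gene style rule, in which each gene's regulatory domain runs to the midpoint between its TSS and the neighbouring TSS, capped by a maximum extension. This is not greatpy's implementation: the function name, the dict-based input, and the max_extension parameter are all hypothetical and only illustrate the concept for one chromosome.

```python
def nearest_gene_regdoms(tss_by_gene, chr_size, max_extension=1_000_000):
    """Sketch of a single-nearest-gene rule on one chromosome (hypothetical helper).

    tss_by_gene: dict mapping gene name -> TSS position (assumed input shape).
    Returns a dict mapping gene name -> (start, end) of its regulatory domain.
    """
    genes = sorted(tss_by_gene.items(), key=lambda item: item[1])
    regdoms = {}
    for i, (gene, tss) in enumerate(genes):
        # Start from the maximum extension, clipped to the chromosome.
        start = max(0, tss - max_extension)
        end = min(chr_size, tss + max_extension)
        # Stop at the midpoint to the upstream neighbour, if any.
        if i > 0:
            start = max(start, (genes[i - 1][1] + tss) // 2)
        # Stop at the midpoint to the downstream neighbour, if any.
        if i < len(genes) - 1:
            end = min(end, (tss + genes[i + 1][1]) // 2)
        regdoms[gene] = (start, end)
    return regdoms


example_regdoms = nearest_gene_regdoms(
    {"A": 1000, "B": 3000, "C": 10000}, chr_size=20000, max_extension=2000
)
```

With these toy inputs, adjacent domains meet at TSS midpoints where genes are close together and are limited by max_extension where they are far apart.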

greatpy computation

Main functions

tl.enrichment(test_file, regdom_file, ...[, ...])

Compute the GO term enrichment for the test genomic regions

tl.enrichment_multiple(tests, regdom_file, ...)

Compute the GO term enrichment for multiple test sets, using bindome or a list of file paths.

Additional functions

tl.loader(test_data, regdom_file, ...)

Load all datasets needed for the enrichment calculation

tl.set_bonferroni(enrichment_df[, alpha])

Create new columns in the dataframe with the Bonferroni correction
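As an illustration of what a Bonferroni correction computes, the sketch below multiplies each raw p-value by the number of tests and caps the result at 1. It works on a plain list rather than a dataframe and does not call greatpy; the function name and return shape are assumptions made for the example.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-values, significance flags) for a list of raw p-values."""
    m = len(p_values)
    # Multiply each p-value by the number of tests, capping at 1.
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, significant


bonf_adjusted, bonf_significant = bonferroni([0.001, 0.02, 0.04, 0.5])
```

Only the first p-value stays below alpha = 0.05 after correction in this toy input.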

tl.set_fdr(enrichment_df[, alpha])

Create new columns in the dataframe with the FDR correction
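A common way to control the false discovery rate is the Benjamini-Hochberg procedure. The sketch below assumes that procedure; the page does not state which FDR method tl.set_fdr uses, so treat this as an illustration of the idea and not a description of greatpy's internals. The function name is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (BH-adjusted p-values, significance flags), in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    significant = [q <= alpha for q in adjusted]
    return adjusted, significant


bh_adjusted, bh_significant = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Compared with Bonferroni, the scaling factor shrinks with the rank of each p-value, so the correction is less conservative for the smaller p-values.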

tl.set_threshold(enrichment_df, colname[, alpha])

Delete rows according to the p-value of the column taken as argument.

utils additional tools

tl.get_nb_asso_per_region(test, regdom)

Determine the number of peaks associated with each gene in the regulatory domain.

tl.get_dist_to_tss(test, regdom)

Determine the distance from peaks to the transcription start site of the associated gene
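One plausible way to define the peak-to-TSS distance is sketched below: take the midpoint of the peak and subtract the TSS position, flipping the sign on the minus strand so that negative values mean upstream. Both the use of the midpoint and the strand handling are assumptions for illustration; the helper is hypothetical and is not greatpy's code.

```python
def signed_dist_to_tss(peak_start, peak_end, tss, strand="+"):
    """Signed distance from a peak midpoint to a TSS (negative = upstream)."""
    midpoint = (peak_start + peak_end) // 2
    dist = midpoint - tss
    # On the minus strand, transcription runs toward lower coordinates.
    return dist if strand == "+" else -dist


dist_plus = signed_dist_to_tss(100, 200, 1000, strand="+")
dist_minus = signed_dist_to_tss(100, 200, 1000, strand="-")
```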

tl.get_association(test, regdom)

Determine the names of genes associated with at least one genomic region

tl.len_regdom(regdom)

Calculate for each gene name the size of the regulatory region in the genome

tl.number_of_hits(test, regdom)

Calculate the number of hits from several genomic regions and the file describing the regulatory regions

tl.get_binom_pval(n, k, p)

Calculate the binomial probability of obtaining k successes in n trials, each with success probability p
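The binomial quantities involved can be sketched with the standard library alone. The page does not say whether tl.get_binom_pval returns the point probability P(X = k) or the upper tail P(X >= k), so both are shown; the function names are hypothetical.

```python
from math import comb


def binom_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), the usual enrichment-style p-value."""
    return sum(binom_pmf(n, i, p) for i in range(k, n + 1))


point_prob = binom_pmf(4, 2, 0.5)        # 6/16
tail_prob = binom_upper_tail(4, 2, 0.5)  # (6 + 4 + 1)/16
```

For large n, a library routine such as a survival function from a statistics package is numerically safer than summing terms directly.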

tl.hypergeom_pmf(N, K, n, k)

Calculate the probability mass function for the hypergeometric distribution

tl.hypergeom_cdf(N, K, n, k)

Calculate the cumulative distribution function for the hypergeometric distribution
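The two hypergeometric quantities above can be sketched as follows. The meaning of the arguments is an assumption here: N is taken as the population size, K as the number of successes in the population, n as the number of draws, and k as the number of observed successes. The _sketch names are hypothetical and distinct from the greatpy functions.

```python
from math import comb


def hypergeom_pmf_sketch(N, K, n, k):
    """P(X = k) when drawing n items without replacement from N, K of them successes."""
    # Outside the support the probability is zero.
    if k < max(0, n - (N - K)) or k > min(K, n):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def hypergeom_cdf_sketch(N, K, n, k):
    """P(X <= k), obtained by summing the PMF from 0 to k."""
    return sum(hypergeom_pmf_sketch(N, K, n, i) for i in range(0, k + 1))


pmf_value = hypergeom_pmf_sketch(10, 4, 3, 1)  # 4 * 15 / 120
cdf_value = hypergeom_cdf_sketch(10, 4, 3, 1)  # (20 + 60) / 120
```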

Plotting : pl

pl.scatterplot(great_df, colname_x, colname_y)

Create a scatterplot from a pandas dataframe between two columns.

pl.graph_nb_asso_per_peaks(test, regdom[, ...])

Create a barplot representing the percentage of peaks for all possible association numbers

pl.graph_dist_tss(test, regdom[, ax, color])

Create a barplot of the distance between the peaks and the TSS of the associated gene(s).

pl.graph_absolute_dist_tss(test, regdom[, ...])

Create a barplot of the absolute distance between the peaks and the TSS of the associated gene(s).

pl.plot_enrich(data[, n_terms, color, save])

Create a dotplot of the enriched GO terms in the input data

pl.make_bubble_heatmap(p_val_df, odd_ratio_df)

Generate a dotplot with multiple categories

pl.dotplot_multi_sample(test_data[, n_row, ...])

Dotplot of enrichment GO terms for a given list of example genomic regions.

pl.get_all_comparison(results[, out_dir, ...])

Plot the comparison between greatpy and GREAT from files computed by great.tl.enrichment_multiple.