greatpy.pl.get_all_comparison
- greatpy.pl.get_all_comparison(results, out_dir='../data/tests/test_data/output/', information_folder='../data/human/', good_gene_associations=True, disp_scatterplot=True, stats=True)
Plot the comparison between greatpy and GREAT from files computed by
great.tl.enrichment_multiple.
- Parameters:
- results : dict
Dictionary of results from
great.tl.enrichment_multiple
- out_dir : str
Path of the output directory containing the results of the GREAT web server.
Default is
../data/tests/test_data/output/
- information_folder : str
Path of the folder containing the information files for the tests.
Default is
../data/human/
The input folder should contain the files:
information_folder/assembly_eg_hg38/regulatory_domain.bed
information_folder/assembly_eg_hg38/chr_size.bed
- good_gene_associations : bool
If True, the function returns the number of good gene associations.
- disp_scatterplot : bool
If True, the function displays the scatterplot of the comparison.
- stats : bool
If True, the function returns the statistics of the comparison.
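Because information_folder must contain two specific files, it can help to check for them before calling the function. The following is a minimal sketch, not part of greatpy: the helper name `missing_information_files` is hypothetical, and the relative paths are taken from the parameter description above.

```python
from pathlib import Path

# Relative paths listed in the information_folder description above.
REQUIRED_FILES = (
    "assembly_eg_hg38/regulatory_domain.bed",
    "assembly_eg_hg38/chr_size.bed",
)


def missing_information_files(information_folder):
    """Return the required files that are absent from information_folder.

    Hypothetical helper for illustration only; it is not a greatpy function.
    """
    root = Path(information_folder)
    return [
        str(root / rel)
        for rel in REQUIRED_FILES
        if not (root / rel).is_file()
    ]
```

Calling `missing_information_files("../data/human/")` returns an empty list when both files are present, and the list of missing paths otherwise.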
- Returns:
pp (pd.DataFrame) – DataFrame of the number of rows lost between before and after preprocessing
asso (pd.DataFrame) – DataFrame of the number of good gene associations for each file
stats_df (pd.DataFrame) – DataFrame of the statistics of the comparison for each file
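The example output further down shows pearson_binom and pearson_hypergeom columns in the statistics DataFrame. The source does not say which values are correlated; the sketch below assumes they are paired p-values for the same terms from greatpy and from GREAT, and shows how a Pearson coefficient is computed in general. The function name and the input values are illustrative, not greatpy's implementation or real results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two deviation norms.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


# Made-up p-values for four terms, for illustration only.
greatpy_p = [0.01, 0.20, 0.05, 0.60]
great_p = [0.02, 0.25, 0.04, 0.55]
print(pearson(greatpy_p, great_p))
```

A value close to 1 would indicate that the two tools rank terms similarly under this assumption.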
Example
>>> import greatpy as great
>>> test = [
...     '../data/tests/test_data/input/09_ERF.bed',
...     '../data/tests/test_data/input/10_MAX.bed',
...     '../data/tests/test_data/input/01_random.bed',
...     '../data/tests/test_data/input/04_ultra_hg38.bed',
...     '../data/tests/test_data/input/02_srf_hg38.bed',
...     '../data/tests/test_data/input/08_FOXO3.bed',
...     '../data/tests/test_data/input/06_height_snps_hg38.bed',
... ]
>>> # Assumption: regdom and size point at the files listed under information_folder.
>>> regdom = '../data/human/assembly_eg_hg38/regulatory_domain.bed'
>>> size = '../data/human/assembly_eg_hg38/chr_size.bed'
>>> results = great.tl.enrichment_multiple(
...     tests=test,
...     regdom_file=regdom,
...     chr_size_file=size,
...     annotation_file="../data/human/ontologies.csv",
...     annpath=None,
...     binom=True,
...     hypergeom=True,
... )
>>> pp, asso, stat = great.pl.get_all_comparison(results)
>>> pp
... |    | name   |   before_pp_greatpy_size |   before_pp_great_size |   final_size |   %_of_GO_from_great_lost |
... |---:|:-------|-------------------------:|-----------------------:|-------------:|--------------------------:|
... |  0 | ERF    |                     6014 |                   2410 |         1833 |                     23.94 |
... |  1 | MAX    |                     2996 |                   2395 |         1481 |                     38.16 |
... |  2 | random |                      579 |                    197 |          117 |                     40.61 |
... |  3 | ultra  |                     3265 |                   2175 |         1393 |                     35.95 |
... |  4 | srf    |                     4810 |                   2681 |         1854 |                     30.85 |
>>> asso
... |    | name   |   number_good_gene_asso |   number_genes_asso_lost |   number_gene_asso_excess |
... |---:|:-------|------------------------:|-------------------------:|--------------------------:|
... |  0 | ERF    |                    1456 |                        0 |                        36 |
... |  1 | MAX    |                     428 |                        0 |                         4 |
... |  2 | random |                      57 |                        0 |                         1 |
... |  3 | ultra  |                     496 |                        0 |                         2 |
... |  4 | srf    |                     923 |                        0 |                         7 |
>>> stat
... |    | name   |   pearson_binom |   pearson_hypergeom |
... |---:|:-------|----------------:|--------------------:|
... |  0 | ERF    |        0.57644  |            0.609552 |
... |  1 | MAX    |        0.601492 |            0.670499 |
... |  2 | random |        0.240765 |            0.124707 |
... |  3 | ultra  |        0.52949  |            0.675438 |
... |  4 | srf    |        0.631909 |            0.597787 |
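In the pp output above, the %_of_GO_from_great_lost column appears to be the share of before_pp_great_size rows that are absent from final_size. This relation is inferred from the numbers shown, not stated by the source; the short check below recomputes it from the rows of the table as a sanity test.

```python
# Rows copied from the pp table in the example output:
# (name, before_pp_great_size, final_size, %_of_GO_from_great_lost)
PP_ROWS = [
    ("ERF", 2410, 1833, 23.94),
    ("MAX", 2395, 1481, 38.16),
    ("random", 197, 117, 40.61),
    ("ultra", 2175, 1393, 35.95),
    ("srf", 2681, 1854, 30.85),
]


def percent_lost(before, after):
    """Percentage of rows dropped between two sizes, rounded to 2 decimals."""
    return round(100 * (before - after) / before, 2)


# Compare the recomputed percentage with the reported one for each file.
for name, before, after, reported in PP_ROWS:
    recomputed = percent_lost(before, after)
    matches = abs(recomputed - reported) < 0.005
    print(f"{name}: recomputed={recomputed} reported={reported} match={matches}")
```

All five rows match under this assumed formula, which suggests the column is measured relative to the GREAT result size rather than the greatpy one.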