greatpy.tl.get_association

greatpy.tl.get_association(test, regdom)

Determine the names of the genes associated with at least one test genomic region.

Parameters:
test : pd.DataFrame

DataFrame of the test peaks, with columns: ["chr", "chr_start", "chr_end"].

regdom : pd.DataFrame

DataFrame of the regulatory domains, with columns: ["chr", "chr_start", "chr_end", "name", "tss", "strand"].

Returns:

res – list of the genes associated with at least one test peak

Return type:

list

Examples

>>> import pandas as pd
>>> import greatpy
>>> test = pd.DataFrame(
...     {
...         "chr": ["chr1"],
...         "chr_start": [1052028],
...         "chr_end": [1052049],
...     }
... )
>>> regdom = pd.DataFrame(
...     {
...         "chr": ["chr1", "chr1"],
...         "chr_start": [1034992, 1079306],
...         "chr_end": [1115089, 1132016],
...         "name": ["RNF223", "C1orf159"],
...         "tss": [1074306, 1116089],
...         "strand": ["-", "-"],
...     }
... )
>>> greatpy.tl.get_association(test, regdom)
['RNF223']
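
Notes

The result in the example can be understood with a small interval-overlap sketch. It is not greatpy's implementation: the helper name `overlapping_genes` is hypothetical, and it assumes that a peak is associated with a gene when the peak overlaps that gene's regulatory domain on the same chromosome. The actual function may use a different rule or return the genes in a different order.

```python
import pandas as pd


def overlapping_genes(test: pd.DataFrame, regdom: pd.DataFrame) -> list:
    """Hypothetical helper (not part of greatpy): return the names of the
    genes whose regulatory domain overlaps at least one test peak."""
    genes = set()
    for peak in test.itertuples(index=False):
        # Only domains on the same chromosome can overlap the peak.
        same_chr = regdom[regdom["chr"] == peak.chr]
        # Two intervals overlap when each one starts before the other ends.
        hits = same_chr[
            (same_chr["chr_start"] <= peak.chr_end)
            & (same_chr["chr_end"] >= peak.chr_start)
        ]
        genes.update(hits["name"])
    # Sort so the output order is deterministic (an assumption of this sketch).
    return sorted(genes)


test = pd.DataFrame(
    {"chr": ["chr1"], "chr_start": [1052028], "chr_end": [1052049]}
)
regdom = pd.DataFrame(
    {
        "chr": ["chr1", "chr1"],
        "chr_start": [1034992, 1079306],
        "chr_end": [1115089, 1132016],
        "name": ["RNF223", "C1orf159"],
        "tss": [1074306, 1116089],
        "strand": ["-", "-"],
    }
)
print(overlapping_genes(test, regdom))  # ['RNF223']
```

The peak 1052028-1052049 lies inside the RNF223 domain (1034992-1115089) but ends before the C1orf159 domain starts (1079306), so only RNF223 is reported.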