greatpy.tl.create_regdom
- greatpy.tl.create_regdom(tss_file, chr_sizes_file, association_rule, max_extension=1000000, basal_upstream=5000, basal_downstream=1000, out_path=None)
Create regulatory domains (regdoms) according to one of the three association rules. The result is returned as a pd.DataFrame and can optionally be written to a file.
- Parameters:
- tss_file : str
The path of the TSS file.
- chr_sizes_file : str
The path of the chromosome size file.
- association_rule : str
The association rule to use. One of: “one_closet”, “two_closet”, “basal_plus_extention”.
Documentation available at https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655443/Association+Rules.
- max_extension : int
The maximum extension of the regulatory domain.
Default is 1000000.
- basal_upstream : int
The basal upstream extension of the regulatory domain.
Default is 5000.
- basal_downstream : int
The basal downstream extension of the regulatory domain.
Default is 1000.
- out_path : str or NoneType
The path of the output file.
If None, the result is only returned as a pd.DataFrame.
Default is None.
- Returns:
out – The regulatory domains.
- Return type:
pd.DataFrame
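As an illustration of what the basal upstream and downstream values mean, here is a minimal sketch of a strand-aware basal domain, following the rule description in the GREAT documentation linked above. The helper `basal_domain` and the chromosome size used below are hypothetical and are not part of greatpy; the library's own implementation may differ in details such as boundary handling.

```python
def basal_domain(tss, strand, chr_size, basal_upstream=5000, basal_downstream=1000):
    """Return (start, end) of a basal domain around a TSS (hypothetical helper).

    Upstream and downstream are relative to the direction of transcription,
    so the two offsets swap sides on the minus strand.
    """
    if strand == "+":
        start, end = tss - basal_upstream, tss + basal_downstream
    elif strand == "-":
        # On the minus strand, "upstream" lies at higher coordinates.
        start, end = tss - basal_downstream, tss + basal_upstream
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    # Clamp the domain to the chromosome boundaries.
    return max(0, start), min(chr_size, end)


# Illustrative chromosome size, not taken from a real genome file.
print(basal_domain(29370, "-", chr_size=1_000_000))  # (28370, 34370)
print(basal_domain(3000, "+", chr_size=1_000_000))   # (0, 4000)
```

The second call shows the clamping: 5000 bases upstream of position 3000 would fall before the start of the chromosome, so the domain starts at 0.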
Examples
>>> regdom = create_regdom(
...     tss_file="../../data/human/tss.bed",
...     chr_sizes_file="../../data/human/chr_size.bed",
...     sep=" ",
...     names=["chr","size"],
...     association_rule="one_closet"
... )
>>> regdom.head()
...    |    | chr   |   chr_start |   chr_end | name      |   tss | strand   |
...    |---:|:------|------------:|----------:|:----------|------:|:---------|
...    |  0 | chr1  |           0 |     17436 | MIR6859-1 | 17436 | -        |
...    |  1 | chr1  |       17436 |     17436 | MIR6859-2 | 17436 | -        |
...    |  2 | chr1  |       17436 |     17436 | MIR6859-3 | 17436 | -        |
...    |  3 | chr1  |       17436 |     23403 | MIR6859-4 | 17436 | -        |
...    |  4 | chr1  |       23403 |     29867 | WASH7P    | 29370 | -        |
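The boundaries in the output above are consistent with the single-nearest-gene rule from the GREAT documentation: each domain runs from the midpoint with the previous TSS to the midpoint with the next TSS, capped by the maximum extension. The following is a minimal sketch of that rule, not greatpy's implementation; `one_closest_domains` is a hypothetical helper, and the chromosome size is an illustrative value.

```python
def one_closest_domains(tss_positions, chr_size, max_extension=1_000_000):
    """Return (start, end, tss) tuples for one chromosome (hypothetical helper).

    Each domain extends to the midpoint between neighbouring TSSs,
    but never further than max_extension from its own TSS.
    """
    tss_sorted = sorted(tss_positions)
    last = len(tss_sorted) - 1
    domains = []
    for i, tss in enumerate(tss_sorted):
        # The first and last genes extend toward the chromosome ends.
        left = 0 if i == 0 else (tss_sorted[i - 1] + tss) // 2
        right = chr_size if i == last else (tss + tss_sorted[i + 1]) // 2
        start = max(left, tss - max_extension, 0)
        end = min(right, tss + max_extension, chr_size)
        domains.append((start, end, tss))
    return domains


# TSS values of the five genes shown in the example output above.
tss_values = [17436, 17436, 17436, 17436, 29370]
for start, end, tss in one_closest_domains(tss_values, chr_size=1_000_000)[:4]:
    print(start, end, tss)
# 0 17436 17436
# 17436 17436 17436
# 17436 17436 17436
# 17436 23403 17436
```

Only the first four domains are printed because the right edge of the fifth depends on the next gene's TSS, which is not part of this toy input. Genes sharing a TSS produce zero-length domains, as in rows 1 and 2 of the table.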