attc
-
integron_finder.attc.find_attc_max(integrons, replicon, distance_threshold, model_attc_path, max_attc_size, min_attc_size, evalue_attc=1.0, circular=True, out_dir='.', cpu=1)
Look for attC sites with the cmsearch --max option, which removes all heuristic filters. Because this option makes the algorithm much slower, it is only run in the region around a hit. This mode is called local_max or eagle_eyes.
Default hit

                     attC
__________________-->____-->_________-->_____________
______<--------______________________________________
        intI
              ^-------------------------------------^
              Search-space with --local_max

Updated hit

                     attC      ***         ***
__________________-->____-->___-->___-->___-->_______
______<--------______________________________________
        intI
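The idea of restricting the slow search to a window around an existing hit can be sketched as follows. This is a minimal illustration, not the integron_finder implementation: the function name `local_window`, its arguments, the 0-based end-exclusive coordinates, and the choice to pad the hit by `distance_threshold` on each side are all assumptions made for the example.

```python
def local_window(hit_start, hit_end, distance_threshold, replicon_size, circular=True):
    """Return the (start, end) intervals to search around a hit.

    Hypothetical helper: coordinates are 0-based and end-exclusive.
    A window that crosses the origin of a circular replicon is
    returned as two intervals.
    """
    left = hit_start - distance_threshold
    right = hit_end + distance_threshold

    if not circular:
        # Linear replicon: clip the window at both ends.
        return [(max(0, left), min(replicon_size, right))]

    if right - left >= replicon_size:
        # The window covers the whole replicon.
        return [(0, replicon_size)]
    if left < 0:
        # Wraps around the origin on the left side.
        return [(left % replicon_size, replicon_size), (0, right)]
    if right > replicon_size:
        # Wraps around the origin on the right side.
        return [(left, replicon_size), (0, right - replicon_size)]
    return [(left, right)]


print(local_window(100, 200, 50, 1000))
print(local_window(10, 60, 50, 1000, circular=True))
print(local_window(10, 60, 50, 1000, circular=False))
```

Only the returned intervals would then be passed to the expensive search, which is what keeps the cost bounded.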
Parameters:
- integrons (list of Integron objects) – the integrons, which may or may not contain attC sites or intI.
- replicon (Bio.Seq.SeqRecord object) – replicon where the integrons were found (genomic fasta file).
- distance_threshold (int) – the maximal distance between 2 elements to aggregate them.
- evalue_attc (float) – E-value threshold; hits above it are filtered out.
- model_attc_path (str) – path to the attC model (covariance model).
- max_attc_size (int) – maximum value for the attC size.
- min_attc_size (int) – minimum value for the attC size.
- circular (bool) – True if the replicon is circular, False otherwise.
- out_dir (str) – the directory where results are written; used indirectly by some called functions such as infernal.local_max() or infernal.expand.
- cpu (int) – number of CPUs to use when calling local_max.
Return type: pd.DataFrame object
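For orientation, several of these parameters correspond to options of Infernal's cmsearch program. The sketch below only builds an argument list and does not run anything. The helper `build_cmsearch_cmd` and the `fasta_path` and `tblout_path` arguments are hypothetical, and the exact set of options integron_finder passes is not stated in this page, so treat the mapping as an assumption.

```python
def build_cmsearch_cmd(model_attc_path, fasta_path, tblout_path,
                       evalue_attc=1.0, cpu=1):
    """Build a cmsearch argument list (hypothetical helper, nothing is executed)."""
    return [
        "cmsearch",
        "--max",                    # turn off all heuristic filters (slow)
        "--cpu", str(cpu),          # number of worker threads
        "-E", str(evalue_attc),     # E-value reporting threshold
        "--tblout", tblout_path,    # tabular output file
        model_attc_path,            # covariance model file
        fasta_path,                 # sequence file to search
    ]


cmd = build_cmsearch_cmd("attc.cm", "region.fasta", "hits.tbl", evalue_attc=1.0, cpu=4)
print(" ".join(cmd))
```

A list like this could be handed to `subprocess.run(cmd, check=True)` on a machine where Infernal is installed.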
-
integron_finder.attc.search_attc(attc_df, keep_palindromes, dist_threshold, replicon_size)
Parse the attC data set (sorted by start position) for the given replicon and return a list of arrays. Each array is composed of attC sites that are on the same strand and separated by a distance less than dist_threshold.
Parameters:
- attc_df (pandas.DataFrame)
- keep_palindromes (bool) – True if palindromes must be kept in the attC result, False otherwise.
- dist_threshold (int) – the maximal distance between 2 elements to aggregate them.
- replicon_size (int) – the size of the replicon in base pairs.
Returns: a list of attC sites found on the replicon
Return type: list of pandas.DataFrame objects
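The grouping rule described for search_attc (same strand, neighbours closer than dist_threshold) can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: it uses plain named tuples instead of a pandas.DataFrame, measures distance as the start of one site minus the end of the previous one, and ignores both palindrome handling and wrap-around on circular replicons. The names `AttC` and `group_attc` are hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for one row of the attC table.
AttC = namedtuple("AttC", ["start", "end", "strand"])


def group_attc(sites, dist_threshold):
    """Group attC sites into arrays of same-strand, nearby sites."""
    arrays = []
    for strand in (1, -1):
        # Handle each strand separately, ordered by start position.
        same_strand = sorted(
            (s for s in sites if s.strand == strand), key=lambda s: s.start
        )
        current = []
        for site in same_strand:
            # Start a new array when the gap to the previous site is too large.
            if current and site.start - current[-1].end >= dist_threshold:
                arrays.append(current)
                current = []
            current.append(site)
        if current:
            arrays.append(current)
    return arrays


sites = [
    AttC(100, 160, 1),
    AttC(200, 260, 1),
    AttC(5000, 5060, 1),
    AttC(300, 360, -1),
]
for array in group_attc(sites, dist_threshold=1000):
    print([(s.start, s.end, s.strand) for s in array])
```

With a threshold of 1000, the first two forward-strand sites form one array, the distant forward-strand site forms its own, and the reverse-strand site is kept separate.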