integron

class integron_finder.integron.Integron(replicon, cfg)[source]

An Integron object represents an integron composed of an integrase, attC sites, and gene cassettes. Each element is characterized by its coordinates on the replicon, its strand (+ or -), and the ID of the gene (except for attC sites). The Integron object is also characterized by the ID of the replicon.

__init__(replicon, cfg)[source]
Parameters:
  • replicon (a Bio.SeqRecord.SeqRecord object) – the replicon where integrons have been found
  • cfg (an integron_finder.config.Config object) – the configuration

add_attC(pos_beg_attC, pos_end_attC, strand, evalue, model)[source]

Adds an attC site to the Integron object.

Parameters:
  • pos_beg_attC (int) – the start position of the attC site on the replicon
  • pos_end_attC (int) – the end position of the attC site on the replicon
  • strand (int) – the strand on which the attC site is found: 1 for forward, -1 for reverse
  • evalue (float) – the E-value associated with this attC site
  • model (str) – the name of the attC model (for instance attc4)
add_attI()[source]

Looks for attI sites and adds them to this integron.

add_integrase(pos_beg_int, pos_end_int, id_int, strand_int, evalue, model)[source]

Adds an integrase to the integron. Should be called only once.

Parameters:
  • pos_beg_int (int) – the start position of the integrase on the replicon
  • pos_end_int (int) – the end position of the integrase on the replicon
  • id_int (str) – the protein ID corresponding to the integrase
  • strand_int (int) – the strand on which the integrase is found: 1 for forward, -1 for reverse
  • evalue (float) – the E-value associated with this integrase
  • model (str) – the name of the integrase model (for instance intersection_tyr_intI)
add_promoter()[source]

Looks for known promoters, if any exist, within the elements of the integron. It takes about 1 s for roughly 13 kb of sequence.

add_proteins(prot_db)[source]
Parameters: prot_db (an integron_finder.prot_db.ProteinDB object) – the protein database corresponding to the translation of the replicon
describe()[source]
Returns: a DataFrame describing the integron object. The columns are:

"pos_beg", "pos_end", "strand", "evalue", "type_elt", "model", "distance_2attC", "annotation", "considered_topology"
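
To make the column layout concrete, here is a minimal sketch that arranges plain dictionaries into rows following the column order listed above. It uses only the standard library rather than a DataFrame; the helper as_rows and the sample values (including the type_elt value "attC") are assumptions made for illustration, not part of integron_finder.

```python
# Column names as listed for describe(); the order is taken from the text above.
COLUMNS = (
    "pos_beg", "pos_end", "strand", "evalue", "type_elt",
    "model", "distance_2attC", "annotation", "considered_topology",
)


def as_rows(records):
    """Arrange dict records into tuples that follow COLUMNS.

    Hypothetical helper: missing columns are filled with None.
    """
    return [tuple(rec.get(col) for col in COLUMNS) for rec in records]


# Invented sample record, for illustration only.
example = [{"pos_beg": 100, "pos_end": 160, "strand": 1, "type_elt": "attC"}]
rows = as_rows(example)
print(dict(zip(COLUMNS, rows[0])))
```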

draw_integron(file=None)[source]

Represents the different elements of the integron. If file is provided, the drawing is saved to that file; otherwise it is displayed on screen.

Parameters: file (str) – the path where the integron schema is saved (in PDF format)
has_attC()[source]
Returns: True if the integron has attC sites, False otherwise.
has_integrase()[source]
Returns: True if the integron has an integrase, False otherwise.
type()[source]
Returns: the type of the integron:
  • 'complete': has one integrase and at least one attC site
  • 'CALIN': has at least one attC site and no integrase
  • 'In0': has only an integrase (intI)
Return type: str
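
The three types listed above amount to a small decision table over the results of has_integrase() and has_attC(). The sketch below illustrates that table with a standalone function; integron_type is a hypothetical name, it is not the library's implementation, and returning None when neither element is present is an assumption, since that case is not documented here.

```python
def integron_type(has_integrase, has_attc):
    """Hypothetical stand-in mirroring the documented types of Integron.type()."""
    if has_integrase and has_attc:
        return "complete"  # integrase plus at least one attC site
    if has_attc:
        return "CALIN"     # attC sites only
    if has_integrase:
        return "In0"       # integrase only
    return None            # assumption: undocumented case, neither element present


for flags in [(True, True), (False, True), (True, False)]:
    print(flags, integron_type(*flags))
```
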
integron_finder.integron.find_integron(replicon, prot_db, attc_file, intI_file, phageI_file, cfg)[source]
Looks for integrons according to the following rules:
  • presence of intI
  • presence of attC
  • d(intI-attC) <= 4 kb
  • d(attC-attC) <= 4 kb
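
The distance rules above can be illustrated with a simple grouping pass. This is a minimal sketch, not the algorithm used by find_integron: the helper cluster_by_distance is hypothetical, it measures the gap from the end of one site to the start of the next, and it ignores strand and replicon topology.

```python
MAX_GAP = 4000  # the 4 kb threshold from the rules above, in base pairs


def cluster_by_distance(sites, max_gap=MAX_GAP):
    """Group (pos_beg, pos_end) sites whose consecutive gaps are <= max_gap.

    Hypothetical helper for illustration only.
    """
    clusters = []
    for beg, end in sorted(sites):
        # Compare with the end of the last site in the current cluster.
        if clusters and beg - clusters[-1][-1][1] <= max_gap:
            clusters[-1].append((beg, end))
        else:
            clusters.append([(beg, end)])
    return clusters


# Invented coordinates: the first two sites are 1040 bp apart, the third is 7740 bp away.
sites = [(9000, 9060), (100, 160), (1200, 1260)]
print(cluster_by_distance(sites))
```

With these invented coordinates, the first two sites fall into one cluster and the third forms its own.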

It returns the list of all integrons, complete or not, found in the attC file and the integrase file, which are formatted as follows:

  • intI_file: Accession_number ID_prot strand pos_beg pos_end evalue
  • attc_file: Accession_number attC cm_debut cm_fin pos_beg pos_end sens evalue
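
As a rough illustration of the two column layouts, the sketch below maps one row of each table onto the column names given above. It assumes the fields are whitespace-separated, which the text does not state; parse_row is a hypothetical helper, the sample rows are invented, and the values are kept as strings.

```python
INTI_COLUMNS = ("Accession_number", "ID_prot", "strand",
                "pos_beg", "pos_end", "evalue")
ATTC_COLUMNS = ("Accession_number", "attC", "cm_debut", "cm_fin",
                "pos_beg", "pos_end", "sens", "evalue")


def parse_row(line, columns):
    """Map one whitespace-separated row onto the given column names.

    Hypothetical helper; the separator is an assumption.
    """
    fields = line.split()
    if len(fields) != len(columns):
        raise ValueError(
            "expected %d fields, got %d" % (len(columns), len(fields))
        )
    return dict(zip(columns, fields))


# Invented rows, for illustration only.
print(parse_row("ACC_1 prot_12 1 100 1100 1e-20", INTI_COLUMNS))
print(parse_row("ACC_1 attc_4 1 47 1200 1260 + 1e-05", ATTC_COLUMNS))
```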

Parameters:
  • replicon (a Bio.SeqRecord.SeqRecord object) – the replicon in which integrons are searched
  • prot_db (an integron_finder.prot_db.ProteinDB object) – the protein database corresponding to the replicon translation
  • attc_file (path to cmsearch output or pd.DataFrame) – the output of cmsearch, or the result of parsing this file with read_infernal
  • intI_file (str) – the output of hmmsearch with the integrase model
  • phageI_file (str) – the output of hmmsearch with the phage model
  • cfg (an integron_finder.config.Config object) – the configuration
Returns: the list of all integrons, complete or not
Return type: list of Integron objects