exonize_analysis module

Expansion

The Expansion class represents the expansion graph for a single gene expansion.

Attributes:
  • graph (Graph) –

    A NetworkX graph representing the expansion.

__init__(expansion_id, nodes, edges)

Initializes an Expansion instance.

Parameters:
  • expansion_id (int) –

    The unique identifier for the expansion.

  • nodes (list of tuples) –

    A list of tuples representing the nodes in the form (coord, node_type).

  • edges (list of tuples) –

    A list of tuples representing the edges in the form (q_coord, t_coord, mode).
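
To illustrate the input shapes described above, the following minimal sketch builds a small graph from node and edge tuples. It uses plain dictionaries as a stand-in for the NetworkX graph, so it is not the exonize implementation. The helper name build_graph_sketch, the coordinate tuples, and the "example_type" and "example_mode" labels are all hypothetical.

```python
def build_graph_sketch(nodes, edges):
    """Build a dict-based graph from (coord, node_type) and
    (q_coord, t_coord, mode) tuples. Stand-in for a NetworkX graph."""
    graph = {"nodes": {}, "edges": []}
    for coord, node_type in nodes:
        graph["nodes"][coord] = {"type": node_type}
    for q_coord, t_coord, mode in edges:
        # Assumption of this sketch: both endpoints must be declared nodes.
        if q_coord not in graph["nodes"] or t_coord not in graph["nodes"]:
            raise KeyError(f"edge endpoint missing from nodes: {q_coord}, {t_coord}")
        graph["edges"].append((q_coord, t_coord, {"mode": mode}))
    return graph


# Hypothetical values; real node types and modes come from the Exonize database.
example_nodes = [((100, 200), "example_type"), ((300, 400), "example_type")]
example_edges = [((100, 200), (300, 400), "example_mode")]
example_graph = build_graph_sketch(example_nodes, example_edges)
```

Each node is keyed by its coordinate, so an edge can refer to its query and target nodes by the same coord values that appear in the nodes list.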

Gene

The Gene class is a container for the expansion graphs of a single gene.

Attributes:
  • id (str) –

    The unique identifier for the gene.

  • coordinates (Interval) –

    The start and end coordinates of the gene on the chromosome.

  • strand (str) –

    The DNA strand ('+' or '-') on which the gene is located.

  • chromosome (str) –

    The chromosome on which the gene is located.

  • expansions (dict) –

    A dictionary where keys are expansion IDs and values are expansion objects.

__getitem__(expansion_id)

Retrieves the expansion graph for the specified expansion ID.

Parameters:
  • expansion_id (int) –

    The ID of the expansion to retrieve.

Returns:
  • Graph (networkx.Graph) –

    The expansion graph associated with the given expansion ID.

Examples:

>>> gene[1]  # Retrieves the expansion graph for expansion ID 1

__init__(gene_id, coordinates, strand, chromosome)

Initializes a Gene instance.

Parameters:
  • gene_id (str) –

    The unique identifier for the gene.

  • coordinates (Interval) –

    The start and end coordinates of the gene on the chromosome.

  • strand (str) –

    The DNA strand ('+' or '-') on which the gene is located.

  • chromosome (str) –

    The chromosome on which the gene is located.

__iter__()

Returns an iterator over the expansion graphs.

Returns:
  • iterator (iter) –

    An iterator yielding each expansion graph.

Examples:

>>> for graph in gene:
...     print(graph)

__len__()

Returns the number of expansions associated with the gene.

Returns:
  • int –

    The number of expansions.

Examples:

>>> len(gene)
0

__repr__()

Returns a string representation of the Gene object.

Returns:
  • str –

    A string describing the gene's ID and number of expansions.

Examples:

>>> repr(gene)
'<Gene GENE123 with 0 expansions (iterable of expansion graphs)>'
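
The container behaviour documented above for Gene (lookup by expansion ID, iteration, length, and repr) can be sketched with a small stand-in class. This is only an illustration of the protocol, not the exonize source: the class name GeneSketch is hypothetical, and for brevity it stores placeholder graphs directly instead of expansion objects.

```python
class GeneSketch:
    """Hypothetical stand-in for a gene container; not the exonize class."""

    def __init__(self, gene_id):
        self.id = gene_id
        # Maps expansion ID -> expansion graph (a placeholder object here).
        self.expansions = {}

    def __getitem__(self, expansion_id):
        return self.expansions[expansion_id]

    def __iter__(self):
        return iter(self.expansions.values())

    def __len__(self):
        return len(self.expansions)

    def __repr__(self):
        return (
            f"<Gene {self.id} with {len(self)} expansions "
            "(iterable of expansion graphs)>"
        )


sketch_gene = GeneSketch("GENE123")
empty_repr = repr(sketch_gene)  # taken before any expansion is added
sketch_gene.expansions[1] = {"nodes": [], "edges": []}  # placeholder graph
```

Defining __iter__ over the dictionary values rather than its keys is what makes a plain for loop yield graphs instead of expansion IDs.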

build_gene_graph()

Builds and returns a consolidated gene graph containing nodes and edges from all expansion graphs.

Returns:
  • Graph (networkx.Graph) –

    A combined graph with nodes and edges from all expansions.

Examples:

>>> combined_graph = gene.build_gene_graph()
>>> print(combined_graph.nodes)
>>> print(combined_graph.edges)
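
The consolidation step can be pictured as a union of the nodes and edges of every expansion graph. The sketch below shows that idea with dictionary-based graphs as a stand-in for NetworkX. The function name combine_graphs_sketch and all coordinates and labels are hypothetical, and the actual method may handle attributes or duplicates differently.

```python
def combine_graphs_sketch(expansion_graphs):
    """Merge dict-based graphs into one graph holding all nodes and edges."""
    combined = {"nodes": {}, "edges": []}
    for graph in expansion_graphs:
        # A node that appears in several graphs keeps the attributes seen last.
        combined["nodes"].update(graph["nodes"])
        combined["edges"].extend(graph["edges"])
    return combined


# Two hypothetical expansion graphs with disjoint nodes.
graph_a = {
    "nodes": {(1, 50): {"type": "example_type"}, (60, 110): {"type": "example_type"}},
    "edges": [((1, 50), (60, 110), {"mode": "example_mode"})],
}
graph_b = {
    "nodes": {(200, 250): {"type": "example_type"}, (300, 350): {"type": "example_type"}},
    "edges": [((200, 250), (300, 350), {"mode": "example_mode"})],
}
combined_sketch = combine_graphs_sketch([graph_a, graph_b])
```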

draw_expansions_multigraph(expansion_id=None, figure_path=None, figure_size=(8.0, 8.0), legend=True, connect_overlapping_nodes=False, color_tandem_pair_edges=False, full_expansion=False, tandem_edges_color='blue')

Draws a multi-graph of gene expansions.

Parameters:
  • expansion_id (int, default: None ) –

    The ID of a specific expansion to draw. If None, the gene graph is drawn.

  • figure_path (Path, default: None ) –

    The path to save the figure. If None, the figure is not saved.

  • figure_size (tuple of float, default: (8.0, 8.0) ) –

    The size of the figure in inches. Default is (8.0, 8.0).

  • legend (bool, default: True ) –

    Whether to display a legend on the plot. Default is True.

  • connect_overlapping_nodes (bool, default: False ) –

    Whether to draw edges connecting overlapping nodes in the graph. Default is False.

  • color_tandem_pair_edges (bool, default: False ) –

    Whether to color edges between tandem exon nodes. Default is False.

  • full_expansion (bool, default: False ) –

    Whether to show the full expansion graph only. Default is False.

  • tandem_edges_color (str, default: 'blue' ) –

    The color to use for tandem pair edges. Default is 'blue'.

draw_gene_structure(expansion_id=None, save_path=None)

Visualize the gene structure, highlighting coding exons and expansion events.

This method uses the dna_features_viewer library to plot the gene structure, showing the locations of coding exons and expansion events within the specified gene. The plot can be saved to a file if save_path is provided.

Parameters:
  • expansion_id (int, default: None ) –

    The ID of the expansion to visualize. Defaults to None, in which case the default expansion (if any) will be used.

  • save_path (Path, default: None ) –

    The file path where the plot will be saved. If not provided, the plot will be displayed but not saved.

Returns:
  • None

GenomeExpansions

A container for managing gene expansions across an entire genome.

Attributes:
  • exonize_db_path (str) –

    The file path to the Exonize database.

genes: list property

Returns a list of gene IDs.

Returns:
  • list –

    A list of gene IDs in the GenomeExpansions.

Examples:

>>> genome_expansions.genes
['GENE123', 'GENE456', 'GENE789']

__contains__(n)

Checks if a gene ID exists in the GenomeExpansions.

Parameters:
  • n (str) –

    The gene ID to check for existence.

Returns:
  • bool –

    True if the gene ID exists, False otherwise.

Examples:

>>> "GENE123" in genome_expansions
True

__getitem__(gene_id)

Retrieves a Gene object by gene ID.

Parameters:
  • gene_id (str) –

    The ID of the gene to retrieve.

Returns:
  • Gene –

    The Gene object associated with the specified gene ID.

Examples:

>>> gene = genome_expansions["GENE123"]
>>> print(gene)
<Gene GENE123 with 0 expansions (iterable of expansion graphs)>

__init__(exonize_db_path)

Initializes a GenomeExpansions instance and builds expansions from the database.

Parameters:
  • exonize_db_path (str) –

    The file path to the Exonize database.

__iter__()

Returns an iterator over the Gene objects.

Returns:
  • iterator (iter) –

    An iterator yielding each Gene object.

Examples:

>>> for gene in genome_expansions:
...     print(gene)

__len__()

Returns the number of genes in the GenomeExpansions.

Returns:
  • int –

    The number of genes in the GenomeExpansions.

Examples:

>>> len(genome_expansions)
18
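
The genome-level container behaviour described above (the genes property, membership test, lookup, iteration, and length) follows the usual Python container protocol. A minimal sketch, assuming the genes are held in a dictionary keyed by gene ID: the class name GenomeExpansionsSketch is hypothetical, it takes a ready-made mapping instead of a database path, and the string values are placeholders for Gene objects.

```python
class GenomeExpansionsSketch:
    """Hypothetical stand-in for a genome-wide container of genes."""

    def __init__(self, genes_by_id):
        # Assumption: genes are stored in a dict keyed by gene ID.
        self._genes = dict(genes_by_id)

    @property
    def genes(self):
        return list(self._genes)  # gene IDs, in insertion order

    def __contains__(self, gene_id):
        return gene_id in self._genes

    def __getitem__(self, gene_id):
        return self._genes[gene_id]

    def __iter__(self):
        return iter(self._genes.values())  # yields gene objects, not IDs

    def __len__(self):
        return len(self._genes)


sketch_genome = GenomeExpansionsSketch(
    {"GENE123": "placeholder gene 1", "GENE456": "placeholder gene 2"}
)
```

Note the asymmetry this design implies: the genes property returns IDs, while iterating over the container yields the gene objects themselves.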

build_expansions()

Constructs the gene expansions from the Exonize database.

This method initializes each Gene object and populates its expansions based on data from the Exonize database. Each expansion consists of nodes and edges, forming a graph for each gene.

Examples:

>>> genome_expansions.build_expansions()
>>> print(len(genome_expansions))
18
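
The population step described for build_expansions amounts to grouping database records first by gene and then by expansion. The sketch below shows that grouping on in-memory rows. The row layout (gene_id, expansion_id, q_coord, t_coord, mode) is an assumption made for illustration, since the Exonize database schema is not described here, and group_rows_sketch is a hypothetical helper.

```python
from collections import defaultdict


def group_rows_sketch(rows):
    """Group hypothetical (gene_id, expansion_id, q_coord, t_coord, mode)
    rows into gene -> expansion -> {"nodes", "edges"}."""
    genes = defaultdict(lambda: defaultdict(lambda: {"nodes": set(), "edges": []}))
    for gene_id, expansion_id, q_coord, t_coord, mode in rows:
        expansion = genes[gene_id][expansion_id]
        # Both edge endpoints become nodes of the expansion.
        expansion["nodes"].update([q_coord, t_coord])
        expansion["edges"].append((q_coord, t_coord, mode))
    return genes


# Hypothetical rows; real values would be read from the Exonize database.
example_rows = [
    ("GENE123", 0, (1, 50), (60, 110), "example_mode"),
    ("GENE123", 1, (200, 250), (300, 350), "example_mode"),
    ("GENE456", 0, (10, 40), (70, 100), "example_mode"),
]
grouped = group_rows_sketch(example_rows)
```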