Biology
- class polars_extensions.biology.BioExtensionNamespace(expr: Expr)
Bases: object
Polars expression namespace for biological sequence conversions and analysis.
Methods
Return AT content as a percentage.
contains_motif(motif): Return True if sequence contains the given motif.
Return the number of codons (sequence length / 3).
count_motif(motif): Count occurrences of a motif in the sequence.
Return a struct of counts for A, T, G, and C.
delete_sequence(start, end): Delete a segment of the sequence from start to end.
Return the DNA complement (A↔T, C↔G).
Return the reverse complement of DNA sequence.
Convert DNA sequences (A, T, G, C) to RNA sequences (A, U, G, C).
Transcribe DNA to RNA (same as dna_to_rna).
Return GC content as a percentage.
gc_skew(): Compute GC skew = (G - C) / (G + C).
hamming_distance(other): Return Hamming distance between two equal-length sequences.
insert_sequence(position, subseq): Insert a subsequence at the given position.
Return True if the sequence only contains valid DNA bases (A, T, G, C, N).
Return True if the sequence only contains valid RNA bases (A, U, G, C, N).
mutate_sequence(position, new_base): Mutate a sequence by replacing one base at a given position (0-indexed).
Repeat the sequence n times.
Reverse a sequence string.
Convert RNA sequences (A, U, G, C) to DNA sequences (A, T, G, C).
Return the sequence length.
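The string operations behind a few of the entries above can be illustrated in plain Python. This is a minimal sketch of the semantics described in the summaries, not the library's implementation: the helper names are hypothetical stand-ins that work on single strings rather than Polars expressions, and returning 0.0 for a sequence with no G or C is an assumption.

```python
# Plain-Python illustration only; these helpers are hypothetical and do not
# call polars_extensions.

# Translation table for the DNA complement (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ATCG", "TAGC")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(_COMPLEMENT)[::-1]


def dna_to_rna(seq: str) -> str:
    """Replace thymine (T) with uracil (U)."""
    return seq.replace("T", "U")


def gc_skew(seq: str) -> float:
    """Compute (G - C) / (G + C)."""
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        return 0.0  # assumption: the library may handle this case differently
    return (g - c) / (g + c)


print(reverse_complement("ATGC"))  # GCAT
print(dna_to_rna("ATGC"))          # AUGC
print(gc_skew("GGGC"))             # 0.5
```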
- delete_sequence(start: int, end: int) → Expr
Delete a segment of the sequence from start to end.
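As a rough illustration of what deleting a segment means, the sketch below works on a single Python string. It assumes 0-indexed positions and an exclusive end, as in Python slicing; the page does not state whether the library treats end as inclusive, so check the source before relying on this.

```python
# Plain-Python illustration only; the half-open [start, end) convention is an
# assumption, not something the documentation confirms.
def delete_sequence(seq: str, start: int, end: int) -> str:
    """Remove the characters at positions start .. end - 1."""
    return seq[:start] + seq[end:]


print(delete_sequence("ATGCGT", 1, 3))  # ACGT
```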
- hamming_distance(other: Expr)
Return Hamming distance between two equal-length sequences.
- Parameters:
- other : pl.Expr
Another expression containing sequences of equal length.
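The Hamming distance is the number of positions at which two equal-length sequences differ. The sketch below shows that definition on a pair of Python strings. It is not the library's implementation, and raising ValueError on unequal lengths is an assumption about how such input might be handled.

```python
# Plain-Python illustration only; error handling here is an assumption.
def hamming_distance(a: str, b: str) -> int:
    """Count the positions where the two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("GATTACA", "GACTATA"))  # 2
```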
- insert_sequence(position: int, subseq: str)
Insert a subsequence at the given position.
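Insertion can be pictured as splitting the sequence at the given position and placing the subsequence in the gap. The sketch below assumes a 0-indexed position, with the subsequence landing before the base currently at that index. The page does not confirm this convention, and the function is a plain-Python stand-in rather than the library method.

```python
# Plain-Python illustration only; 0-indexed insertion is an assumption.
def insert_sequence(seq: str, position: int, subseq: str) -> str:
    """Insert subseq so that it starts at index `position` of the result."""
    return seq[:position] + subseq + seq[position:]


print(insert_sequence("ATGC", 2, "NN"))  # ATNNGC
```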
- is_valid_dna() → Expr
Return True if the sequence only contains valid DNA bases (A, T, G, C, N).
- is_valid_rna() → Expr
Return True if the sequence only contains valid RNA bases (A, U, G, C, N).
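Both validity checks amount to testing whether every character belongs to an allowed alphabet. The sketch below shows that idea with Python sets. It assumes uppercase input and treats an empty string as valid; neither behavior is confirmed by this page, and the functions are hypothetical stand-ins for the library methods.

```python
# Plain-Python illustration only; case handling and the empty-string result
# are assumptions.
DNA_BASES = frozenset("ATGCN")
RNA_BASES = frozenset("AUGCN")


def is_valid_dna(seq: str) -> bool:
    """True if every character is one of A, T, G, C, N."""
    return set(seq) <= DNA_BASES


def is_valid_rna(seq: str) -> bool:
    """True if every character is one of A, U, G, C, N."""
    return set(seq) <= RNA_BASES


print(is_valid_dna("ATGN"))  # True
print(is_valid_dna("AUGC"))  # False
print(is_valid_rna("AUGC"))  # True
```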