Utilities

fac.utils.parse_sequence_as_one_hot(nt_sequence: str) → ndarray

Parse a genomic sequence string into a one-hot encoding. The sequence may contain only the characters A, C, G, T, and N.

Parameters:

nt_sequence – The nucleotide sequence to parse.

Returns:

The one-hot encoding of the sequence, with shape (len(nt_sequence), 4). Columns are in the order A, C, G, T; N is encoded as all zeros.
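The encoding described above can be illustrated with a minimal sketch. This is not the fac implementation; the name one_hot_sketch, the float32 dtype, and the choice to raise KeyError on unexpected characters are assumptions made for illustration.

```python
import numpy as np

# Hypothetical stand-in for illustration only; not fac's own code.
_COLUMN = {"A": 0, "C": 1, "G": 2, "T": 3}


def one_hot_sketch(nt_sequence: str) -> np.ndarray:
    """Encode a nucleotide string as a (len, 4) array in A, C, G, T order."""
    out = np.zeros((len(nt_sequence), 4), dtype=np.float32)  # dtype is an assumption
    for pos, base in enumerate(nt_sequence):
        if base == "N":
            continue  # N stays as an all-zero row
        out[pos, _COLUMN[base]] = 1.0  # raises KeyError for any other character
    return out


encoded = one_hot_sketch("ACGTN")
print(encoded.shape)
```

Each of the first four rows has a single 1 in the column for its base, and the final row (N) is all zeros.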

fac.utils.bootstrap_series(ys: ndarray) → Tuple[ndarray, ndarray]

Compute a 95% confidence interval for the mean of the given series at each time point, using a bootstrap. Bootstrap resampling is done at each time point independently.

This is done deterministically with a fixed random seed.

Parameters:

ys – The series to compute the confidence interval for. Shape: (N, T) where T is the number of time points and N is the number of samples at each point.

Returns:

lo: The lower bound of the confidence interval at each time point. Shape: (T,).

hi: The upper bound of the confidence interval at each time point. Shape: (T,).
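A per-time-point bootstrap of the mean can be sketched as follows. This is an illustration under stated assumptions, not the fac implementation: the percentile method, the number of resamples (1000), the seed value (0), and the name bootstrap_ci_sketch are all hypothetical choices.

```python
import numpy as np


def bootstrap_ci_sketch(ys, n_resamples=1000, seed=0):
    """Percentile-bootstrap 95% CI for the mean of each column of an (N, T) array.

    n_resamples and seed are assumed values, not taken from fac.
    """
    ys = np.asarray(ys, dtype=float)
    n, t = ys.shape
    rng = np.random.default_rng(seed)  # fixed seed makes the result deterministic
    lo = np.empty(t)
    hi = np.empty(t)
    for j in range(t):
        # Resample the N observations at time point j with replacement,
        # independently of every other time point.
        idx = rng.integers(0, n, size=(n_resamples, n))
        means = ys[idx, j].mean(axis=1)
        lo[j], hi[j] = np.percentile(means, [2.5, 97.5])
    return lo, hi


ys_demo = np.array([[1.0, 2.0, 3.0],
                    [2.0, 2.0, 5.0],
                    [3.0, 2.0, 7.0],
                    [4.0, 2.0, 9.0]])
lo_demo, hi_demo = bootstrap_ci_sketch(ys_demo)
print(lo_demo.shape, hi_demo.shape)
```

Because the generator is seeded, repeated calls on the same input return identical bounds. A column with no variation, such as the middle column here, yields an interval of zero width.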

fac.utils.draw_bases(xs: ndarray)

Draw the bases from the given array, converting encoded bases into nucleotide strings.

Parameters:

xs – The array to draw the bases from. Should be of shape (…, 4) and one-hot encoded, or of shape (…,) and integer encoded with all values in (0, 1, 2, 3).

Returns:

A nested list of strings: an input of shape (N, 4) is returned as a string of length N, and an input of shape (N, M, 4) is returned as a list of N strings of length M.
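The decoding described above can be sketched as below. This is not the fac implementation. The explicit one_hot flag is a hypothetical way to tell the two encodings apart, and rendering an all-zero row as "N" is an assumption that mirrors the parsing convention; neither is stated for draw_bases itself.

```python
import numpy as np

# Index 4 is used only for all-zero one-hot rows (assumed to mean "N").
_ALPHABET = np.array(list("ACGTN"))


def _join(chars):
    """Collapse the last axis into strings, keeping outer axes as nested lists."""
    chars = np.asarray(chars)
    if chars.ndim <= 1:
        return "".join(np.atleast_1d(chars))
    return [_join(row) for row in chars]


def draw_bases_sketch(xs, one_hot=True):
    """Hypothetical decoder: one-hot (..., 4) or integer (...,) arrays to strings."""
    xs = np.asarray(xs)
    if one_hot:
        idx = xs.argmax(axis=-1)
        idx = np.where(xs.sum(axis=-1) == 0, 4, idx)  # all-zero row -> "N"
    else:
        idx = xs.astype(int)
    return _join(_ALPHABET[idx])


single = draw_bases_sketch(np.eye(4))
batch = draw_bases_sketch(np.array([[0, 1], [2, 3]]), one_hot=False)
print(single, batch)
```

An (N, 4) one-hot input yields one string, while a two-dimensional integer input yields a list of strings, one per row.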

fac.utils.all_3mers() → ndarray

The 64 possible 3mers. We always use this order for consistency.

Returns:

The 64 possible 3mers. Will be of shape (64, 4).
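For orientation, the 64 possible 3-mers can be enumerated as plain strings with a short sketch. The lexicographic order used here is an assumption; the order and array encoding that fac.utils.all_3mers actually returns are not specified in this section.

```python
from itertools import product

# Enumerate every length-3 combination of the four bases.
# Lexicographic order (AAA, AAC, ..., TTT) is an assumed ordering.
three_mers = ["".join(p) for p in product("ACGT", repeat=3)]

print(len(three_mers), three_mers[:4])
```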

Plotting utilities

fac.plotting.line_color(i: int) → str

Compute a color for the i-th line in a plot. Only works for up to 6 lines.

Parameters:

i – The index of the line.

Returns:

The color for the line.

fac.plotting.bar_color(i: int) → str

Compute a color for the i-th bar in a plot. Only works for up to 6 bars. The result is somewhat lighter than the line color.

Parameters:

i – The index of the bar.

Returns:

The color for the bar.
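The relationship between the two color helpers can be illustrated with a sketch. Everything here is hypothetical: the six hex values, the blend fraction, the IndexError on out-of-range indices, and the function names are assumptions for illustration, not the palette or behavior fac uses.

```python
# Hypothetical six-color palette; fac's actual colors are not given in this section.
_PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b"]


def line_color_sketch(i: int) -> str:
    """Return the i-th palette color; only indices 0 through 5 are valid."""
    if not 0 <= i < len(_PALETTE):
        raise IndexError(f"only {len(_PALETTE)} colors are available, got index {i}")
    return _PALETTE[i]


def bar_color_sketch(i: int, blend: float = 0.4) -> str:
    """Return a lighter version of the i-th line color by blending toward white."""
    base = line_color_sketch(i)
    rgb = [int(base[k:k + 2], 16) for k in (1, 3, 5)]
    light = [round(c + (255 - c) * blend) for c in rgb]  # blend is an assumed value
    return "#{:02x}{:02x}{:02x}".format(*light)


print(line_color_sketch(0), bar_color_sketch(0))
```

Deriving the bar color from the line color keeps bars and lines for the same index visually related.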