DelaySSA
February 7, 2025 · View on GitHub
DelaySSA is an R package designed to simulate non-Markovian models of gene expression by extending the Gillespie algorithm to reactions with delay. This package is particularly useful for researchers and practitioners in computational biology and systems biology who are interested in modeling the dynamics of gene expression with intrinsic delays.
Features
- Implementation of the delayed Gillespie algorithm [1-4].
- Support for modeling gene expression with various delay distributions.
- Easy-to-use interface for defining reactions with delays.
- Capability to handle complex gene regulatory networks.
- Visualization tools for analyzing simulation results.
Installation
Download the installation package from GitHub
devtools::install_github("Zoey-JIN/DelaySSA")
Or download and install locally
devtools::install("~/DelaySSA")
Then load the package
library("DelaySSA")
Tutorial Example : Delayed Production and Annihilation System
This tutorial is designed to demonstrate how to use DelaySSA for defining chemical reaction models, solving related problems, and visualizing the results. The source code for all examples and figures can be found in the GitHub examples folder. We study the system with two non-delay channels and one delay channel. This model describes that molecule binds and then degrade with the reaction rate . Once the reaction occurs, the molecule will be generated after a fixed time delay , and will degrade with the rate . This procedure can be described by
The species are . Let
Initialization Part
Assume reactions occur in this system from t_initial=0 to tmax=150, repeating this process for sample=1000 times. The initial quantities of and are 1000, 1000, and 1, respectively.
sample <- 1000
tmax <- 150
n_initial <- matrix(c(1000,1000,0),nrow = 3)
t_initial <- 0
According to the reactions, the counting of reactant and product molecules rows is arranged with rows indexed as , and columns indexed in order of the reactions. and denote numbers of reactant and product molecules, respectively. Then, we define as reactant_matrix and . Therefore, we can define as S_matrix. For delay part, we can define as S_matrix_delay, which can be expressed as
S_matrix <- c(-1,-1,0,0,0,-1)
S_matrix <- matrix(S_matrix,nrow = 3)
> S_matrix
[,1] [,2]
[1,] -1 0
[2,] -1 0
[3,] 0 -1
S_matrix_delay <- c(0,0,1,0,0,0)
S_matrix_delay <- matrix(S_matrix_delay,nrow = 3)
> S_matrix_delay
[,1] [,2]
[1,] 0 0
[2,] 0 0
[3,] 1 0
k <- c(0.001,0.001)
reactant_matrix <- matrix(c(1,1,0,0,0,1),nrow = 3)
> reactant_matrix
[,1] [,2]
[1,] 1 0
[2,] 1 0
[3,] 0 1
The reactions are categorized as ICD and ND, with delay_type corresponding to 2 and 1 respectively. delaytime_list records the delay times of the reactions. The first reaction will trigger a delay reaction that will happen after seconds. The second reaction will not trigger a delay reaction, thus a 0 is added to delaytime_list.
tau = 0.1
delay_type <- matrix(c(2,0),nrow = 1)
delaytime_list <- list()
delaytime_list <- append(delaytime_list,c(tau,0))
Simulation and Visualization Part
Next, use the function simulation_DelaySSA from the package to calculate the quantities of species after reactions occur and the corresponding times for each reaction
result <- simulation_DelaySSA(algorithm = "DelayMNR", sample_size=sample, tmax=tmax, n_initial=n_initial, t_initial=t_initial, S_matrix=S_matrix, S_matrix_delay=S_matrix_delay, k=k, reactant_matrix=reactant_matrix, delay_type=delay_type , delaytime_list=delaytime_list)
result is a list whose length is determined by the number of simulations sample_size. Each element of this list is also a list, consisting of:
- A vector representing the times at which reactions occurred.
- A matrix showing the changes in species over time.
Sampling times are taken as seq(0, tmax, by = 1). Use plot_SSA_mean to calculate and plot the mean values in the quantities of each species at these time points. At time tmax, use plot_SSA_density to calculate and plot the probability distribution of the quantities of each species. Here the number of is the same as the number of .
plot_SSA_mean(result = result,t=seq(0, tmax, by = 1) ,n_initial = n_initial,t_initial = 0)
plot_SSA_density(result = result,t_pick = tmax)
To be more specific, result is the output list from simulation. Then use picksample(result,i,t) to sample the species i at time t. convert_pdf(n) can convert the vector to a list containing the probability density. The list has two elements, where the first element is a vecter representing number of the species, and the second element is a table representing the probability for each quantity. plot_mean(result, i, t) returns the mean value of species i at time t after the simulation, using the result data. plot_mean relies on the picksample and mean functions to perform the calculation. All in all, we can use picksample and convert_pdf functions to obtain the probability distribution of each species at each time point and use plot_mean function to obtain the mean value of each species over repeated simulations.
> a <- c(1,2,3,4,5)
> convert_pdf(a)
[[1]]
[1] 1 2 3 4 5
[[2]]
a_vector
1 2 3 4 5
0.2 0.2 0.2 0.2 0.2
library(ggplot2)
Species <- c("S1 S2","S1 S2","S3")
svg <- svglite("TwoChannels_mean.svg", width = 5, height = 5)
t=seq(0, tmax, by = 1)
t_initial = 0
num_columns <- nrow(result[[1]]$n_values)
data_list <- lapply(c(2,3), function(i) {
n <- lapply(t, function(x) plot_mean(result, i, x))
n <- unlist(n)
if (t[1] == t_initial)
n[1] <- n_initial[i, ]
data.frame(t = t, quantity = n, Species = Species[i])
})
plot_data <- do.call(rbind, data_list)
ggplot(plot_data, aes(x = t, y = quantity, color = Species)) +
geom_line(linewidth = 0.7) +
labs(x = "T", y = "Mean Value") +
scale_color_brewer(palette = "Set1") +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
axis.title = element_text(size = 12, face = "bold"),
axis.text = element_text(size = 10),
legend.title = element_blank(),
legend.text = element_text(size = 10),
panel.grid = element_blank(),
panel.background = element_blank(),
axis.line = element_line(color = "black")
)
dev.off()
svg <- svglite("TwoChannels_S1andS2_density_150s.svg", width = 5, height = 5)
num_columns <- nrow(result[[1]]$n_values)
data_list <- lapply(c(2), function(i) {
n <- lapply(result, function(x) picksample(x, i, t=tmax))
n <- unlist(n)
plot_xy <- convert_pdf(n)
if (!all(data.frame(percentage = plot_xy)[, 1] == data.frame(percentage = plot_xy)[,2])) {
warning("Error in Calculating Density Table")
}
data.frame(quantity = data.frame(percentage = plot_xy)[,1], percentage = data.frame(percentage = plot_xy)[,3], Species = Species[i])
})
plot_data <- do.call(rbind, data_list)
ggplot(plot_data, aes(x = quantity, y = percentage, color = Species)) +
geom_line(linewidth = 0.7) +
labs(x = "# of Products", y = "Probability") +
scale_color_brewer(palette = "Set1") +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
axis.title = element_text(size = 12, face = "bold"),
axis.text = element_text(size = 10),
legend.title = element_blank(),
legend.text = element_text(size = 10),
panel.grid = element_blank(),
panel.background = element_blank(),
axis.line = element_line(color = "black")
)
dev.off()
svg <- svglite("TwoChannels_S3_density_150s.svg", width = 5, height = 5)
num_columns <- nrow(result[[1]]$n_values)
data_list <- lapply(c(3), function(i) {
n <- lapply(result, function(x) picksample(x, i, t=tmax))
n <- unlist(n)
plot_xy <- convert_pdf(n)
if (!all(data.frame(percentage = plot_xy)[, 1] == data.frame(percentage = plot_xy)[,2])) {
warning("Error in Calculating Density Table")
}
data.frame(quantity = data.frame(percentage = plot_xy)[,1], percentage = data.frame(percentage = plot_xy)[,3], Species= Species[i])
})
plot_data <- do.call(rbind, data_list)
ggplot(plot_data, aes(x = quantity, y = percentage, color = Species)) +
geom_line(linewidth = 0.7) +
labs(x = "# of Products", y = "Probability") +
scale_color_brewer(palette = "Set1") +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
axis.title = element_text(size = 12, face = "bold"),
axis.text = element_text(size = 10),
legend.title = element_blank(),
legend.text = element_text(size = 10),
panel.grid = element_blank(),
panel.background = element_blank(),
axis.line = element_line(color = "black")
)
dev.off()
Main API
algorithm
The alternative algorithm must be one of DelayMNR (default), DelayReject, or DelayDirect. Traditional stochastic simulation including Direct, MNR and NR are recommended and the parameters of S_matrix_dalay delay_type and delaytime_list can be omitted.
sample_size
A numeric value representing trajectories for the system.
tmax
A numeric value representing the cut-off time for the system.
n_initial
An N-by-1 matrix representing the initial quantity of species, where N corresponds to the number of molecular species.
t_initial
A numeric value representing the initial time for the system.
S_matrix
An N-by-R matrix, where N corresponds to the number of species types and R corresponds to the number of reactions. It represents the stoichiometric matrix at the initiation time , denoting the change in the number of molecules per reaction.
S_matrix_dalay
An N-by-R matrix, representing the stoichiometric matrix at the completion time , which is only relevant for the delayed type of CD and ICD reactions. If it is not a delay reaction namely ND reactions, the corresponding column is filled with zeros.
k
An R-dimensional vector representing the reaction constant rate or a function representing the variable rate.
reactant_matrix
An N-by-R matrix, representing the quantity of reactants.
reactant_matrix_delay
An N-by-R matrix, representing the quantity of reactants in delay part.
f_r
A function determined by and . f_r represents the propensity function.
delay_type
An R-dimensional vector or a 1-by-R matrix representing the type of the reactions. It is a numerical vector, where each element can take on the values 0, 1, or 2. Here 0 represents ND, 1 represents CD and 2represents ICD.
delaytime_list
An R-dimensional list representing the delay time of the reactions. Every element can be a fixed number or a stochastic value governed by a function.
delay_effect_matrix
A 2-row matrix. The first row represents the reaction index of S_matrix, the second row represents the reaction index of S_matrix_dalay. If not empty, each column of the matrix links S_matrix and S_matrix_delay: the first row shows the column index from S_matrix, while the second row shows the corresponding column index ' from S_matrix_delay, where both columns are the same. This will cause the '-th reaction in delay part to be randomly eliminated when the -th reaction in S_matrix occurs called interruption.
Function
| Function | Description |
|---|---|
| check_delay_relative* | Delay Effect Matrix |
| convert_pdf | Probability Density Functions Conversion |
| fun_fr | Propensity Function |
| picksample | Results Sampling |
| plot_mean | The Mean Value of Species i at Time t |
| plot_SSA_density* | Density at Time t_pick |
| plot_SSA_mean* | Mean of Species over Time |
| propensity_n | For Propensity Calculation |
| simulate_reaction* | Gillespie Algorithm |
| simulate_reaction_delay_direct* | Delay Direct Method Algorithm |
| simulate_reaction_delay_modifiednextreaction* | Delay Modified Next Reaction Method Algorithm |
| simulate_reaction_delay_rejection* | Delay Rejection Method Algorithm |
| simulate_reaction_modifiednextreaction* | Modified Next Reaction Method Algorithm |
| simulate_reaction_nextreaction* | Next Reaction Method Algorithm |
| simulation_DelaySSA* | Simulation Algorithm (SSA) with Delays |
| tau_element | Delay Time |
* This function can be used independently.
Basic Concepts
Biological processes encompass a multitude of intricate mechanisms involving various molecules and physical interactions. By conceptualizing these processes as a series of discrete chemical reactions, we can mathematically formalize them. This mathematical representation enables precise modeling of the reaction dynamics and facilitates accurate predictions of reactant quantities.
Stochastic Simulation Algorithm (SSA) [1] is a method used to simulate stochastic processes in chemical reaction systems. This algorithm is particularly suitable for systems with a small number of molecules.
However, some chemical reactions, such as gene transcription and translation within living cells, necessitate a defined temporal duration to reach completion following their initiation. As a result, the products of these reactions will manifest only after a time delay which makes the traditional SSA algorithm unsuitable for these reactions. DelaySSA implements a stochastic simulation algorithm (SSA) with delays in R language. It can simulate chemical reaction systems both with and without delays. now DelaySSA supports three exact delay stochastic simulation algorithms, namely, delay direct method DelayDirect [2], delay rejection method DelayRejction DelayReject [3] and delay modified next reaction method DelayMNR [4]. Here we give the basic concepts of these algorithms.
Given a finite set of chemical species and chemical reactions, we define the reactions by the notation
where and denote numbers of reactant and product molecules, respectively. is the reaction rate constant of the -th reacion. And the stoichiometric matrix is given by
According to [5], propensity function are in the form of mass-action kinetics type
where , is the number of species , is the volume of a closed compartment.
The time delay could be a fixed number or a stochastic value. According to [3], reactions with delays are categorized into consuming and nonconsuming reactions. If a delayed reaction is a nonconsuming reaction, it initiates at and will finish until , then the of the number of species will change only at . If a delayed reaction is a consuming reactions, it initiates at and will finish until , then the of the number of species will change both at and .
We can categorize reactions into the following three cases.
Case 1: If reaction loses the reactant species and gains the product species at the initiation time , we define the reaction with no delays as ND.
Case 2: If reaction loses the reactant species and gains the product species at the completion time , we define the reaction with delays as CD.
Case 3: If reaction loses the reactant species and gains the product species at the initiation time and the completion time , respectively, we define the reaction with delays as ICD.
Delay Direct Method Algorithm
Consider that delay reactions are ongoing at the time . The delay reactions will complete at , where . As in the derivation of Gillespie’s exact Stochastic Simulation Algorithm (SSA), can be found from the fundamental assumption as , where is the probability that no reaction will happen in the time interval . The delay effects the propensity function. So comes to , where the exponent assumes equal to zero when .
According this two equations, and can be generated as,
where respectively.
Algorithm
Suppose that at time there are ongoing delayed reactions set to complete at times . Define and .
Define Tstruct, whose -th row stores and the index, , of the reaction that is associated with.
-
Initialize. Set and set species number . Create a empty Tstruct.
-
Calculate propensity functions .
-
Generate .
-
Generate an independent random number .
-
If Tstruct is empty, it means there is no ongoing delayed reaction.
- Set .
-
Else
-
Set .
-
Set , and .
-
While
-
Calculate .
-
Calculate propensity due to the finish of the delayed reaction at and calculate .
-
Set . Update .
-
If , update species number due to the delay reaction at
-
-
EndWhile
-
Set .
-
Calculate .
-
-
EndIf
-
-
If , delete the columns $1,\ldots,iT_iT_j=T_j-\tau$.
-
Generate from a Uniform(0,1) random variable, and find the integer which satisfies .
-
Update according to the type of reaction belongs to: if the reaction belongs to type ND, update species number ; if the reaction belongs to type CD, store the time ; if the reaction belongs to type ICD, update species number and store the time . If it is a delay reaction, insert and the reaction into Tstruct, ensuring that the times in Tstruct remain in ascending order.
-
Set .
-
Return to step 2 or quit.
Remark. Notice that in the above pseudocode, we modified the Step 4 in the original algorithm for computational efficiency, but both are equivalent. More details are illustrated in [2].
Delay Modified Next Reaction Method Algorithm
Delay Modified Next Reaction Method Algorithm is modified Next Reaction Method to systems with delays. According to [4], if is the current internal time of , the first internal time after at which fires, and the propensity function for the -th reaction channel is given by , then the time until the next initiation of reaction (assuming no other reactions initiate or complete) is still given by . To each delayed reaction channel we therefore assign a vector, , that stores the completion times of that reaction in ascending order. Thus, the time until there is a change in the state of the system, be it an initiation or a completion, will be given by
where is the current time of the system.
Algorithm
-
Initialize. Set and set species number . For each , set , , and .
-
Calculate the propensity function, , for each reaction.
-
Generate independent, random numbers, , and set .
-
Set .
-
Set .
-
Set .
-
If choose the completion of the delayed reaction :
-
Update species number based upon the completion of the reaction .
-
Delete the first element of .
-
-
Elseif reaction initiated and
- Update species number according to reaction .
-
Elseif reaction initiated and
- Update by inserting into ensuring that the times in remain in ascending order.
-
Elseif reaction initiated and
-
Update species number based upon the initiation of reaction .
-
Update by inserting into ensuring that the times in remain in ascending order.
-
-
For each r, set .
-
If reaction initiated, let be and set .
-
Recalculate the propensity functions, .
-
Return to step 4 or quit.
Delay Rejection Method Algorithm
Based upon the non-delay algorithm, we see that simulation methods for systems with delays need to calculate when reactions initiate and store when they complete. However, because of the delayed reactions, the propensity functions can change between initiation times. [3] used an algorithm for computing the initiation times that is exactly like the original Gillespie Algorithm except that if there is a stored delayed reaction set to finish within a computed timestep, then the computed timestep is discarded, and the system is updated to incorporate the stored delayed reaction. The algorithm then attempts another step starting at its new state. This algorithm will be referred to as the Rejection Method.
Algorithm
-
Initialize. Set and set species number .
-
Calculate propensity functions .
-
Generate from a Uniform(0,1) random variable, and set .
-
If there is a delayed reaction to finish in
- Discard .
- Update to be the time of the first next delayed reaction and update the species number.
- Return to step 2 or quit
-
Else if there is no delayed reaction in .
- Generate from a Uniform(0,1) random variable, and find the integer which satisfies .
-
Update according to the type of reaction belongs to: if the reaction belongs to type ND, update species number ; if the reaction belongs to type CD, store the time ; if the reaction belongs to type ICD, update species number and store the time .
-
Set the time .
-
Return to step 2 or quit.
Gillespie Algorithm
Consider a system of chemical species with ongoing chemical reactions, each of which has a corresponding tendency function . The Gillespie algorithm assumes that the time from start to finish for each reaction is negligible. Through random simulations, calculate 1) how much time will pass before the next reaction occurs (i.e. starts and finishes), and 2) which reaction will occur at that future point in time. The following assumptions, sometimes referred to as the basic premises of chemical dynamics, are based on physical principles and serve as the underlying assumptions for methods of simulating chemical reaction systems [1]:
Based on this fundamental assumptions, and are two independent random variables and following the probability density functions as:
According this two equations, and can be generated as,
where respectively.
Algorithm
-
Initialize. Set and set species number .
-
Calculate the propensity function, , for each reaction.
-
Generate two independent, Uniform random numbers, and .
-
Set .
-
Find the integer which satisfies .
-
Set .
-
Update species number based upon the completion of the reaction .
-
Return to step 2 or quit.
The Next Reaction Method Algorithm
Let , be the vectors representing the number of each species consumed and created in the rth reaction, respectively. Then, if is the number of initiations of reaction by time , the state of the system at time is
However, based upon the fundamental assumption of stochastic chemical kinetics, is a counting process with intensity such that , for small . Therefore, we have
where the are independent, unit rate Poisson processes.
Each Poisson process brings its own time frame. If we define for each , then it is relevant for us to consider . We will call the "internal time" for reaction .
notes the gap time which the rth reaction needs. . For the moment we denote and the updated propensity functions by .
The internal time of the next firing of has not changed and is still given by . We also know that the updated internal time of is given by . Therefore, the amount of internal time that must pass before the rth reaction fires is given as the difference:
Thus, the amount of absolute time that must pass before the rth reaction channel fires, , is given as the solution to , and we can get:
We have therefore found the absolute times of the next firings of reactions without having to generate any new random numbers.
Note that after the first timestep is taken in the Next Reaction Method, all subsequent timesteps only demand one random number to be generated. The Next Reaction Method developed with the notion of a dependency graph and a priority queue in order to increase computational efficiency [6]. This is similar with random numbers needed for each step of the original Gillespie Algorithm.
Algorithm
-
Initialize. Set and set species number .
-
Calculate the propensity function, , for each reaction.
-
Generate independent, random numbers, .
-
set .
-
Set . Here we assume that is the minimum.
-
Update species number based upon the completion of the reaction .
-
Recalculate the propensity function, , for each reaction.
-
For each , set .
-
For reaction , let be Uniform and set . If is either or , it also needs to be recalculated in this manner.
-
For each , set .
-
Return to step 5 or quit.
Modified Next Reaction Method Algorithm
According to [4], Modified Next Reaction Method Algorithm that is completely equivalent to Next Reaction Method Algorithm. But makes more explicit use of the internal times . The main idea of this algorithm is .
Algorithm
-
Initialize. Set and set species number . For each , set and .
-
Calculate the propensity function, , for each reaction.
-
Generate independent, Uniform random numbers, , and set .
-
Set .
-
Set . Here we assume that is the minimum.
-
Set . And update species number based upon the completion of the reaction .
-
For each , set .
-
For reaction , let be Uniform and set .
-
Recalculate the propensity function, , for each reaction.
-
Return to step 4 or quit.
References
[1]Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The journal of physical chemistry, 81(25), 2340-2361.
[2] Cai, X. (2007). Exact stochastic simulation of coupled chemical reactions with delays. The Journal of chemical physics, 126(12).
[3] Barrio, M., Burrage, K., Leier, A., & Tian, T. (2006). Oscillatory regulation of Hes1: discrete stochastic delay modelling and simulation. PLoS computational biology, 2(9), e117.
[4] Anderson, D. F. (2007). A modified next reaction method for simulating chemical systems with time dependent propensities and delays. The Journal of chemical physics, 127(21).
[5] Van Kampen, N. G. (1992). Stochastic processes in physics and chemistry (Vol. 1). Elsevier.
[6] Gibson, M. A., & Bruck, J. (2000). Efficient exact stochastic simulation of chemical systems with many species and many channels. The journal of physical chemistry A, 104(9), 1876-1889.