This document is a companion to our submission to the 2020 Artificial Life conference, ‘Genetic regulation facilitates the evolution of signal-response plasticity in digital organisms’. Here, we compare the gene regulatory networks that evolved to solve the repeated signal task with regulatory networks that evolved to solve the directional signal task. See our paper or other documents in this supplemental material for more details on either the repeated signal or directional signal tasks. All of our source code for statistical analyses and data visualizations is embedded in this document. Please file an issue or make a pull request on github to report any mistakes or request more explanation.
While both the repeated signal and directional signal tasks require signal-response plasticity to solve, the directional signal task is more challenging, requiring a higher degree of response flexibility to solve. We evolved 100 replicate populations of 1000 SignalGP digital organisms for 10000 generations to solve each of the directional signal and repeated signal tasks. We expect that the simple regulation that evolved to solve the repeated signal task will be insufficient for coping with the branching nature of the directional signal task.
Load the required R libraries.
library(ggplot2) # (Wickham, 2016)
library(tidyr) # (Wickham and Henry, 2020)
library(dplyr) # (Wickham et al., 2020)
library(reshape2) # (Wickham, 2007)
library(cowplot) # (Wilke, 2019)
library(viridis) # (Garnier, 2018)
library(igraph) # (Csardi and Nepusz, 2006)
library(patchwork) # (Pedersen, 2020)
These analyses were conducted in the following computing environment:
print(version)
## _
## platform x86_64-apple-darwin15.6.0
## arch x86_64
## os darwin15.6.0
## system x86_64, darwin15.6.0
## status
## major 3
## minor 6.2
## year 2019
## month 12
## day 12
## svn rev 77560
## language R
## version.string R version 3.6.2 (2019-12-12)
## nickname Dark and Stormy Night
Configure a few global variables.
theme_set(theme_cowplot())
alpha <- 0.05
correction_method <- "bonferroni"
Load repeated signal task data.
alt_sig_org_data_loc <- "../data/balanced-reg-mult/reg-graph-comps/alt-sig/max_fit_orgs.csv"
alt_sig_org_data <- read.csv(alt_sig_org_data_loc, na.strings="NONE")
alt_sig_org_data <- subset(alt_sig_org_data, select = -c(program))
# Specify factors (not all are relevant to these runs).
alt_sig_org_data$matchbin_thresh <-
factor(alt_sig_org_data$matchbin_thresh,
levels=c(0, 25, 50, 75))
alt_sig_org_data$NUM_SIGNAL_RESPONSES <-
factor(alt_sig_org_data$NUM_SIGNAL_RESPONSES,
levels=c(2, 4, 8, 16, 32))
alt_sig_org_data$NUM_ENV_CYCLES <-
factor(alt_sig_org_data$NUM_ENV_CYCLES,
levels=c(2, 4, 8, 16, 32))
alt_sig_org_data$TAG_LEN <- factor(alt_sig_org_data$TAG_LEN,
levels=c(32, 64, 128))
# Define function to summarize regulation/memory configurations.
get_con <- function(reg, mem) {
if (reg == "0" && mem == "0") {
return("none")
} else if (reg == "0" && mem=="1") {
return("memory")
} else if (reg=="1" && mem=="0") {
return("regulation")
} else if (reg=="1" && mem=="1") {
return("both")
} else {
return("UNKNOWN")
}
}
# Specify experimental condition for each datum.
alt_sig_org_data$condition <-
mapply(get_con,
alt_sig_org_data$USE_FUNC_REGULATION,
alt_sig_org_data$USE_GLOBAL_MEMORY)
alt_sig_org_data$condition <-
factor(alt_sig_org_data$condition,
levels=c("regulation", "memory", "none", "both"))
# Does this program rely on a stochastic strategy?
alt_sig_org_data$stochastic <- 1 - alt_sig_org_data$consistent
alt_sig_network_data_loc <-
"../data/balanced-reg-mult/reg-graph-comps/alt-sig/reg_graphs_summary.csv"
alt_sig_network_data <- read.csv(alt_sig_network_data_loc, na.strings="NA")
Load directional signal task data.
dir_sig_org_data_loc <-
"../data/balanced-reg-mult/reg-graph-comps/dir-sig/max_fit_orgs.csv"
dir_sig_org_data <- read.csv(dir_sig_org_data_loc, na.strings="NONE")
dir_sig_org_data <- subset(dir_sig_org_data, select = -c(program))
# Specify factors.
dir_sig_org_data$matchbin_thresh <-
factor(dir_sig_org_data$matchbin_thresh,
levels=c(0, 25, 50, 75))
dir_sig_org_data$NUM_ENV_UPDATES <-
factor(dir_sig_org_data$NUM_ENV_UPDATES,
levels=c(2, 4, 8, 16, 32))
dir_sig_org_data$NUM_ENV_STATES <-
factor(dir_sig_org_data$NUM_ENV_STATES,
levels=c(2, 4, 8, 16, 32))
dir_sig_org_data$TAG_LEN <-
factor(dir_sig_org_data$TAG_LEN,
levels=c(32, 64, 128))
dir_sig_org_data$MUT_RATE__FUNC_DUP <-
factor(dir_sig_org_data$MUT_RATE__FUNC_DUP,
levels=c("0.01", "0.05", "0.1"))
# Define function to summarize regulation/memory configurations.
get_con <- function(reg, mem) {
if (reg == "0" && mem == "0") {
return("none")
} else if (reg == "0" && mem=="1") {
return("memory")
} else if (reg=="1" && mem=="0") {
return("regulation")
} else if (reg=="1" && mem=="1") {
return("both")
} else {
return("UNKNOWN")
}
}
# Specify experimental condition for each datum.
dir_sig_org_data$condition <-
mapply(get_con,
dir_sig_org_data$USE_FUNC_REGULATION,
dir_sig_org_data$USE_GLOBAL_MEMORY)
dir_sig_org_data$condition <-
factor(dir_sig_org_data$condition,
levels=c("regulation", "memory", "none", "both"))
dir_sig_network_data_loc <-
"../data/balanced-reg-mult/reg-graph-comps/dir-sig/reg_graphs_summary.csv"
dir_sig_network_data <- read.csv(dir_sig_network_data_loc, na.strings="NA")
dir_sig_network_data$test_id <-
factor(dir_sig_network_data$test_id,
levels(factor(dir_sig_network_data$test_id)))
sol_alt_sigs <- filter(alt_sig_org_data, solution=="1")
alt_sig_sol_plot <- ggplot(sol_alt_sigs, aes(x=condition, fill=condition)) +
geom_bar() +
geom_text(stat="count", aes(label=..count..), position=position_dodge(0.9), vjust=0) +
xlab("Repeated signal task") +
scale_y_continuous(name="# solutions",
limits=c(0, 100)) +
theme(legend.position="none",
axis.text.x = element_blank())
sol_dir_sigs <- filter(dir_sig_org_data, solution=="1")
dir_sig_sol_plot <- ggplot(sol_dir_sigs, aes(x=condition, fill=condition)) +
geom_bar() +
geom_text(stat="count", aes(label=..count..), position=position_dodge(0.9), vjust=0) +
xlab("Directional signal task") +
scale_y_continuous(name="# solutions",
limits=c(0, 100)) +
scale_fill_discrete(direction=-1)+
theme(legend.position = "none",
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.title.y=element_blank())
alt_sig_sol_plot | dir_sig_sol_plot
Extract regulatory networks of solutions.
# Extract successful runs (solution) of given mutation rate.
# -- directional signal solutions --
dir_sig_solutions <- filter(dir_sig_org_data, solution=="1")
dir_sig_sol_networks <- filter(dir_sig_network_data,
run_id %in% dir_sig_solutions$SEED)
dir_sig_sol_networks$exp <- "Directional Signal"
# -- repeated signal solutions --
alt_sig_solutions <- filter(alt_sig_org_data, solution=="1")
alt_sig_sol_networks <- filter(alt_sig_network_data, run_id %in% alt_sig_solutions$SEED)
alt_sig_sol_networks$exp <- "Repeated Signal"
# -- our powers rbind! --
all_sol_networks <- rbind(alt_sig_sol_networks, dir_sig_sol_networks)
A wall of comparisons!
# Melt all of the network metrics together, allowing us to facet over them.
network_measures_melted <- melt(all_sol_networks,
variable.name = "network_measure",
value.name = "network_measure_value",
measure.vars=c("node_cnt",
"edge_cnt",
"promoted_edges_cnt",
"repressed_edges_cnt",
"non_self_promoting_edges",
"non_self_repressing_edges",
"non_self_edges",
"self_promoting_edges",
"self_repressing_edges",
"self_edges",
"graph_density",
"total_degree",
"total_out_degree",
"total_in_degree",
"flow_hierarchy",
"reciprocity"))
ggplot(network_measures_melted, aes(x=exp, y=network_measure_value, color=exp)) +
geom_boxplot() +
facet_wrap(~network_measure, scales="free") +
theme(legend.position = "bottom", axis.text.x=element_blank()) +
ggsave("./imgs/network_measures_all.pdf", width=10, height=10)
# Plot only one test_id for directional signal task
ds_test_id <- 15
ggplot(filter(network_measures_melted,
exp=="Repeated Signal" |
(exp=="Directional Signal" & test_id==ds_test_id)),
aes(x=exp, y=network_measure_value, color=exp)) +
geom_boxplot() +
facet_wrap(~network_measure, scales="free") +
theme(legend.position = "bottom", axis.text.x=element_blank()) +
ggtitle(paste("Just one input sequence (", ds_test_id ,") for dir. signal task", sep="")) +
ggsave("./imgs/network_measures_all_single_test.pdf", width=10, height=10)
Okay, from all of this our intuition (see other analysis files) that networks that solve the directional signal task have more function-function interactions (i.e., more edges). We’ll take a closer look at network vertices and network edges.
## --- All environment sequences ---
# node_cnt
plot_nodes <-
ggplot(all_sol_networks, aes(x=exp, y=node_cnt, color=exp)) +
geom_boxplot() +
ylab("# Functions") +
scale_x_discrete(name="Task",
limits=c("Directional Signal", "Repeated Signal"),
labels=c("D", "R")) +
ylim(0, 40) +
coord_flip() +
theme(legend.position = "none",
axis.text.x = element_text(size=9),
axis.title.x = element_text(size=10))
# edge_cnt
plot_edges <-
ggplot(all_sol_networks, aes(x=exp, y=edge_cnt, color=exp)) +
geom_boxplot() +
ylab("# Reg. Connections") +
scale_x_discrete(name="Task",
limits=c("Directional Signal", "Repeated Signal"),
labels=c("D", "R")) +
coord_flip() +
theme(legend.position = "none",
axis.title.y=element_blank(),
axis.text.x = element_text(size=9),
axis.title.x = element_text(size=10))
plots <- plot_nodes | plot_edges
plots
plots + ggsave("./imgs/reg_networks_vertices_and_edges.pdf", width=4, height=2)
Let’s compare node counts and edge counts (wilcoxon rank sum test)
Vertices
wilcox.test(formula=node_cnt ~ exp, data=all_sol_networks, conf.int=TRUE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: node_cnt by exp
## W = 75806, p-value = 0.3736
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -9.99966e-01 6.94339e-05
## sample estimates:
## difference in location
## -3.034882e-05
ds_test_id=15 # all 'right' environment shifts
wilcox.test(formula=node_cnt ~ exp,
data=filter(all_sol_networks,
exp=="Repeated Signal" |
(exp=="Directional Signal" & test_id==ds_test_id)))
##
## Wilcoxon rank sum test with continuity correction
##
## data: node_cnt by exp
## W = 5479, p-value = 0.2378
## alternative hypothesis: true location shift is not equal to 0
We find no evidence that networks evolved to solve the directional signal task have more vertices than networks evolved to solve the repeated signal task. I.e., directional signal networks do not necessarily involve more total functions than networks evolved to solve the repeated signal task. This result makes sense given that both tasks require the same number of responses.
Edges
wilcox.test(formula=edge_cnt ~ exp, data=all_sol_networks, conf.int=TRUE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: edge_cnt by exp
## W = 121722, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## 4.000009 6.999983
## sample estimates:
## difference in location
## 5.000017
wilcox.test(formula=edge_cnt ~ exp, data=filter(all_sol_networks,
exp=="Repeated Signal" |
(exp=="Directional Signal" & test_id==ds_test_id)), conf.int=TRUE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: edge_cnt by exp
## W = 8301.5, p-value = 6.759e-16
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## 6.000052 9.000002
## sample estimates:
## difference in location
## 7.999961
There is, however, a significant difference in the number of edges in networks evolved to solve the repeated signal task compared to the networks evolved to solve the directional signal task.
We used knockout experiments to determine whether or not evolved gene regulatory networks rely on up-regulation, down-regulation, or both up- and down-regulation.
Repeated signal task
alt_sig_org_sols <- filter(alt_sig_org_data, solution=="1")
# How many rely only on down-regulation, only on up-regulation, and on both?
get_reg_relies_on <- function(uses_down, uses_up, uses_reg) {
if (uses_down == "0" && uses_up == "0" & uses_reg == "0") {
return("neither")
} else if (uses_down == "0" && uses_up == "0" & uses_reg == "1") {
return("either")
} else if (uses_down == "0" && uses_up == "1") {
return("up-regulation-only")
} else if (uses_down == "1" && uses_up == "0") {
return("down-regulation-only")
} else if (uses_down == "1" && uses_up == "1") {
return("up-and-down-regulation")
} else {
return("UNKNOWN")
}
}
alt_sig_org_sols$regulation_type_usage <-
mapply(get_reg_relies_on,
alt_sig_org_sols$relies_on_down_reg,
alt_sig_org_sols$relies_on_up_reg,
alt_sig_org_sols$relies_on_regulation)
alt_sig_org_sols$regulation_type_usage <-
factor(alt_sig_org_sols$regulation_type_usage,
levels=c("neither",
"either",
"up-regulation-only",
"down-regulation-only",
"up-and-down-regulation"))
ggplot(alt_sig_org_sols, aes(x=regulation_type_usage, fill=regulation_type_usage)) +
geom_bar() +
geom_text(stat="count",
mapping=aes(label=..count..),
position=position_dodge(0.9), vjust=0) +
ylim(0, 100) +
scale_x_discrete(name="Regulation Usage",
limits=c("neither",
"either",
"up-regulation-only",
"down-regulation-only",
"up-and-down-regulation"),
labels=c("None",
"Either",
"Only up",
"Only down",
"Both")) +
theme(legend.position="none")
The majority of networks evolved to solve the repeated signal task rely exclusively on down-regulation, and all other networks evolved to solve the repeated signal task rely on both down- and up-regulation.
Directional signal task
dir_sig_org_sols <- filter(dir_sig_org_data, solution=="1")
dir_sig_org_sols$regulation_type_usage <-
mapply(get_reg_relies_on,
dir_sig_org_sols$relies_on_down_reg,
dir_sig_org_sols$relies_on_up_reg,
dir_sig_org_sols$relies_on_regulation)
dir_sig_org_sols$regulation_type_usage <-
factor(dir_sig_org_sols$regulation_type_usage,
levels=c("neither",
"either",
"up-regulation-only",
"down-regulation-only",
"up-and-down-regulation"))
ggplot(dir_sig_org_sols, aes(x=regulation_type_usage, fill=regulation_type_usage)) +
geom_bar() +
geom_text(stat="count",
mapping=aes(label=..count..),
position=position_dodge(0.9), vjust=0) +
ylim(0, 100) +
scale_x_discrete(name="Regulation Usage",
limits=c("neither",
"either",
"up-regulation-only",
"down-regulation-only",
"up-and-down-regulation"),
labels=c("None", "Either", "Only up", "Only down", "Both")) +
theme(legend.position="none")
All networks that evolved to solve the directional signal task rely on both up- and down-regulation.
We can also look at the distributions of edge types (promoting or repressing) in evolved regulatory networks.
melted <- melt(all_sol_networks,
variable.name = "reg_edge_type",
value.name = "reg_edge_cnt",
measure.vars=c("repressed_edges_cnt", "promoted_edges_cnt"))
ggplot(melted, aes(x=exp, y=reg_edge_cnt, color=reg_edge_type)) +
geom_boxplot() +
ylab("# Edges") +
scale_color_discrete(name="Edge type:",
limits=c("repressed_edges_cnt", "promoted_edges_cnt"),
labels=c("Repressing edges", "Promoting edges")) +
theme(legend.position = "bottom")
Repeated signal task
wilcox.test(formula=reg_edge_cnt ~ reg_edge_type, data=filter(melted, exp=="Repeated Signal"), conf.int=TRUE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: reg_edge_cnt by reg_edge_type
## W = 8036, p-value = 8.657e-14
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## 2.000001 3.999993
## sample estimates:
## difference in location
## 3.000074
Gene regulatory networks that solve the repeated signal task have significantly more repressing edges than promoting edges.
Directional signal task
wilcox.test(formula=reg_edge_cnt ~ reg_edge_type,
data=filter(melted, exp=="Directional Signal"), conf.int=TRUE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: reg_edge_cnt by reg_edge_type
## W = 1599599, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## 1.000030 1.999975
## sample estimates:
## difference in location
## 1.000098
Likewise, gene regulatory networks evolved to solve the directional signal task have significantly more repressing edges than promoting edges, but the effect is smaller than in the networks evolved to solve the repeated signal task.