How can we learn about principles of cortical circuit organization from the wealth of data generated by modern connectomics approaches? How could these structural principles help us to understand brain function? The goal of this project is to address these two major challenges for the mammalian neocortex. Specifically, we propose the development of a powerful statistical framework for discovering and testing which ‘laws’ of synaptic organization could in principle underlie measurements of both network connectivity and function. Via web-based tools, this framework will enable neuroscientists to (1) mathematically formulate hypotheses of synapse formation strategies, (2) discover the impact of such ‘wiring rules’ on network architecture, (3) reveal the rules’ relevance for function by constraining simulations with rule-predicted connectivity patterns, and (4) test simulation predictions against empirical connectivity and activity measurements. To achieve this goal, we will combine complementary expertise in data science (Baum) and in Bayesian statistics and machine learning (Macke) with in vivo measurements of neuroanatomy and neurophysiology (Oberlaender). Our collaboration thereby seeks to identify which structural and/or functional parameters are (or are not) predictive of synapse formation, quantify how well a wiring rule is constrained by empirical data, reveal which additional data would be most useful in constraining it, and perform quantitative model comparison to determine which of a set of rules is most consistent with all of the available empirical data. We will apply and validate our approaches by comparing the in silico predictions against novel in vivo-based connectivity and activity measurements for the major output cell type of the neocortex: pyramidal tract neurons in layer 5.
Our preliminary data provide first evidence that the envisioned approaches have the potential to reveal the local rules that underlie the global organization of neocortical circuits. The proposed project will thus provide the foundation to explore fundamental relationships between structural properties of neuronal networks, their underlying principles/rules of synaptic organization, and cortical functions. This insight will ultimately allow investigating the developmental mechanisms leading to these relationships, and the structural origin and/or correlates of cortical malfunction in disease and other pathological conditions. In line with the goals of the SPP Computational Connectomics, we want to develop general and powerful computational methods to discover, compare, and statistically test different theories of synapse organization against empirical observations.

**For detailed information, please also refer to our web portal** BarrelCortexInSilico

### Efficient Computation of Connectomes

Our approach to investigate synapse formation rules consists of the following steps:

- Derive local and/or global structural features from the anatomical network (e.g., cell type, location within a layer, pre- and postsynaptic target density, inter-somatic distance).
- Formulate a synapse formation rule in which the relative weight of each structural feature is parametrized.
- Use Bayesian inference to efficiently sample the parameter space and assess the predictiveness of the rule. This entails for each sample:
  - Computing the connectome based on the specified parameter values.
  - Computing aggregate network statistics (e.g., connection probability between selected neuron subpopulations) that are compared against empirical observations and steer the inference algorithm to a new parameter set.
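
The steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the project's actual method: the two toy features (`log_pre`, `log_post`), the log-linear rule with weights `theta`, the Poisson link from expected synapse count to connection probability, and the use of plain rejection sampling (a simple form of approximate Bayesian computation) as a stand-in for the inference algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy features for every pre/post neuron pair (not real anatomy).
n_pre, n_post = 40, 50
log_pre = rng.normal(0.0, 1.0, size=(n_pre, 1))    # log presynaptic target density
log_post = rng.normal(0.0, 1.0, size=(1, n_post))  # log postsynaptic target density


def connection_probability(theta):
    """Assumed log-linear rule: expected synapse count per pair, then
    P(at least one synapse) under a Poisson assumption."""
    rate = np.exp(theta[0] + theta[1] * log_pre + theta[2] * log_post)
    return 1.0 - np.exp(-rate)


def summary_statistic(theta):
    """Aggregate network statistic: mean connection probability over all pairs."""
    return float(connection_probability(theta).mean())


def rejection_abc(observed, n_samples=2000, tolerance=0.02):
    """Draw parameters from a uniform prior and keep those whose predicted
    statistic lies within `tolerance` of the observed value."""
    prior_samples = rng.uniform(-2.0, 2.0, size=(n_samples, 3))
    predicted = np.array([summary_statistic(t) for t in prior_samples])
    keep = np.abs(predicted - observed) <= tolerance
    return prior_samples[keep], predicted[keep]


# Synthetic "empirical" observation generated from known parameters.
true_theta = np.array([-1.0, 1.0, 1.0])
observed = summary_statistic(true_theta)
posterior, predicted = rejection_abc(observed)

# The spread of accepted samples indicates how well each weight is constrained.
print(len(posterior), posterior.std(axis=0))
```

A wide spread of accepted values for a weight suggests that the chosen statistic constrains it poorly, which is the kind of question the inference step is meant to answer. Rejection sampling wastes most evaluations, which is one reason a more efficient sampler matters when each evaluation requires computing a full connectome.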

The last step poses algorithmic challenges because the size of the connectome grows quadratically with the number of neurons (>500k), resulting in a connectivity matrix with more than 250 billion entries. Likewise, the cost of computing aggregate network statistics scales directly with the size of the connectome. We address this by:

- Exploiting the sparsity of the connectome, in which many neuron pairs are not connected because their pre- and postsynaptic structures do not overlap.
- Offering informed trade-offs between accuracy and speed when determining the network statistics through sampling the connectivity matrix.
- Developing efficient algorithms that integrate the computation of both connectome and selected network statistics.
- Devising partitioning schemes that enable large-scale parallelism and utilizing the high-performance computing infrastructure at ZIB.
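
To make the scale and the accuracy/speed trade-off concrete, the following sketch first works out the size of a dense matrix for 500,000 neurons and then estimates a connection probability from a random sample of neuron pairs instead of the full matrix. The toy rule `pair_probability`, the one-dimensional soma positions, and the 4-byte entry size are assumptions for illustration only.

```python
import numpy as np

# Size of a dense connectivity matrix for 500,000 neurons.
n_neurons = 500_000
dense_entries = n_neurons ** 2             # 250,000,000,000 entries
dense_gigabytes = dense_entries * 4 / 1e9  # 1000 GB at 4 bytes per entry

rng = np.random.default_rng(1)


def pair_probability(pre_pos, post_pos, length_scale=200.0):
    """Toy rule (assumption): connection probability decays with the
    distance between one-dimensional soma positions."""
    return 0.5 * np.exp(-np.abs(pre_pos - post_pos) / length_scale)


def sampled_connection_probability(pre_pos, post_pos, n_pairs, rng):
    """Estimate the mean connection probability from randomly drawn pairs.
    Returns the estimate and its standard error."""
    i = rng.integers(0, len(pre_pos), size=n_pairs)
    j = rng.integers(0, len(post_pos), size=n_pairs)
    p = pair_probability(pre_pos[i], post_pos[j])
    return p.mean(), p.std(ddof=1) / np.sqrt(n_pairs)


# Two small subpopulations, so the exact value is still cheap to compute.
pre_pos = rng.uniform(0.0, 1000.0, size=2000)
post_pos = rng.uniform(0.0, 1000.0, size=3000)

exact = pair_probability(pre_pos[:, None], post_pos[None, :]).mean()
estimate, stderr = sampled_connection_probability(pre_pos, post_pos, 20_000, rng)
print(f"exact={exact:.4f} estimate={estimate:.4f} stderr={stderr:.4f}")
```

The standard error shrinks with the square root of the number of sampled pairs, so the sample size can be chosen to match the accuracy a given analysis needs.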

### Web-based Analysis Framework

As part of the project, we create a web-based analysis framework that allows anyone to (re-)calculate network statistics and compare results between different connectomes. In particular, we want to meet the following requirements:

- *Pairwise network statistics*: Free selection of pre- and postsynaptic neuron subpopulations based on structural features (e.g., cell type, soma location) and computation of aggregate network statistics, such as connection probability or innervation between the specified subpopulations.
- *Higher order network statistics*: Free definition of triplet motifs that represent the 16 possible connection configurations between 3 neurons; and computation of the respective occurrence statistics.
- *Multiple connectomes*: Uploading and comparing different connectome datasets.
- *Extensibility*: Users can extend the framework by implementing their own network statistics.
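
The count of 16 triplet motifs can be checked by enumeration: three neurons have six possible directed connections, giving 64 wiring patterns, which fall into 16 classes once patterns that differ only in the labelling of the neurons are merged. The sketch below does this in plain Python; the function names are hypothetical and not part of the framework.

```python
from itertools import permutations, product

# The six possible directed connections between three neurons 0, 1, 2.
EDGES = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1)]


def canonical_form(edge_set):
    """Return the smallest relabelled edge tuple over all 6 node
    permutations, so isomorphic patterns map to the same key."""
    forms = []
    for perm in permutations(range(3)):
        forms.append(tuple(sorted((perm[a], perm[b]) for a, b in edge_set)))
    return min(forms)


def motif_classes():
    """Group all 64 wiring patterns by their canonical form."""
    classes = {}
    for bits in product([0, 1], repeat=6):
        edge_set = frozenset(e for e, b in zip(EDGES, bits) if b)
        classes.setdefault(canonical_form(edge_set), []).append(edge_set)
    return classes


def triplet_motif(adjacency, i, j, k):
    """Classify the motif formed by neurons i, j, k in an adjacency
    matrix where adjacency[a][b] is truthy if a connects to b."""
    nodes = (i, j, k)
    edge_set = [(a, b) for a in range(3) for b in range(3)
                if a != b and adjacency[nodes[a]][nodes[b]]]
    return canonical_form(edge_set)


CLASSES = motif_classes()
print(len(CLASSES))  # number of distinct triplet motifs

# A feed-forward cycle 0 -> 1 -> 2 -> 0 gives the same motif in any order.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
assert triplet_motif(cycle, 0, 1, 2) == triplet_motif(cycle, 1, 0, 2)
```

Counting how often each canonical form occurs over sampled neuron triplets would give the occurrence statistics mentioned above.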

The ability to compare connectomes is crucial for the following intended use cases:

- *Rule vs. rule*: Two connectomes generated by different synapse formation rules (or parameter sets) are compared.
- *Rule vs. empirical observation*: A connectome generated by a synapse formation rule is compared to a connectome acquired experimentally through electron-microscopy.
- *In silico slicing vs. in vitro slicing*: In many experimental in vitro settings, axons and dendrites are physically cut, which leads to deviations compared to the in silico model. To investigate these effects, we create a virtually truncated network model for comparison.
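
A comparison of this kind can be sketched as computing the same pairwise statistic on two connectomes and taking the difference. The example below assumes binary adjacency matrices and uses placeholder cell-type labels `"A"` and `"B"`; the function name and data are hypothetical and do not reflect the framework's actual interface.

```python
import numpy as np


def connection_probability_by_type(adjacency, labels, types):
    """Fraction of connected ordered pairs for each (pre type, post type).
    Self-pairs are excluded; combinations without any pair yield NaN."""
    adjacency = np.asarray(adjacency, dtype=bool)
    labels = np.asarray(labels)
    not_self = ~np.eye(len(labels), dtype=bool)
    result = np.full((len(types), len(types)), np.nan)
    for a, pre_type in enumerate(types):
        for b, post_type in enumerate(types):
            block = np.outer(labels == pre_type, labels == post_type) & not_self
            if block.any():
                result[a, b] = adjacency[block].mean()
    return result


# Two tiny connectomes over the same three neurons (rows: pre, columns: post).
labels = ["A", "A", "B"]
types = ["A", "B"]
connectome_a = [[0, 1, 1], [0, 0, 0], [1, 0, 0]]
connectome_b = [[0, 1, 1], [1, 0, 1], [0, 0, 0]]

prob_a = connection_probability_by_type(connectome_a, labels, types)
prob_b = connection_probability_by_type(connectome_b, labels, types)

# Positive entries mean connectome_b is more densely connected for that type pair.
difference = prob_b - prob_a
print(difference)
```

The same difference matrix applies to all three use cases; only the origin of the second connectome changes (another rule, an empirical dataset, or a virtually truncated model).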

### Interactive Visual Analysis

To better interpret the implications of different synapse formation rules, we develop interactive visual analysis methods that intuitively convey the following aspects:

- *Differences between connectomes*: Deviations in network connectivity as reflected in the following types of statistics:
  - first order statistics: e.g., number of synapses
  - second order statistics: e.g., connection probability between subpopulations
  - higher order statistics: e.g., distribution of triplet motifs
- *Multi-scale connectivity*: Exploration of connectivity from local synapse counts to high level connection strength patterns
- *Uncertainty*: Prediction uncertainty in connection probability and related network statistics
- *Structural connectivity patterns*: Facilitating analysis of network dynamics through matrix-based visualizations of the network graph

We plan to integrate the new visualization methods into the web-based analysis framework.

### Management of Data and Software

The anatomical network model and the derived structural features and connectomes represent large datasets. We expect that the following datasets will be generated during the project:

- anatomical networks (~3.3 GB each, ca. 10 in total)
- derived structural features (100 GB - 1 TB per anatomical network)
- generated connectomes (10 - 50 GB per rule evaluation)
- aggregate network statistics (< 500 MB)

We take advantage of the technical infrastructure at ZIB to store, publish, and archive these datasets. The source code of our software tools can be obtained via GitHub.