# Predicting Anatomically Realistic Cortical Connectomes using Statistical Inference

Synapses link neurons in the brain, forming a network called the *connectome*. The goal of this project is to infer mathematical rules that predict the formation of the connectome. Our study is based on an anatomical network model of the rat vibrissal cortex that contains more than 500k neurons with axon and dendrite morphologies. In a preceding project, we showed that a synapse formation rule based on structural features of the anatomical network, namely the pre- and postsynaptic target density, generates synapse innervation patterns that are in line with empirical observations. We now generalize this approach by formulating freely parametrized connectivity rules that are based on extended structural features of the anatomical network. Using statistical inference, we can constrain and efficiently sample the parameter spaces of these hypothesized rules and assess their predictiveness. In parallel, we develop interactive visual analysis methods that enable us to better interpret the impact of different synapse formation rules on the connectome. We also provide a web-based framework for the analysis and comparison of connectomes.

The project is part of the DFG-funded research program Computational Connectomics (SPP 2041) and is carried out in cooperation with the research groups of Marcel Oberlaender (caesar institute, Bonn) and Jakob Macke (Technische Universität München). Our own group focuses primarily on the tasks outlined below.

*Connectivity matrix of 831 neurons.*

### Efficient Computation of Connectomes

Our approach to investigating synapse formation rules consists of the following steps:

- Derive local and/or global structural features from the anatomical network (e.g., cell type, location within a layer, pre- and postsynaptic target density, inter-somatic distance).
- Formulate a synapse formation rule in which the relative weight of each structural feature is parametrized.
- Use Bayesian inference to efficiently sample the parameter space and assess the predictiveness of the rule. This entails for each sample:
  - Computing the connectome based on the specified parameter values.
  - Computing aggregate network statistics (e.g., connection probability between selected neuron subpopulations) that are compared against empirical observations and steer the inference algorithm to a new parameter set.
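
The steps above can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the project's implementation: the log-linear rule, the Poisson assumption for synapse counts, the parameter ranges, the random feature values, and all function names are hypothetical. Plain rejection sampling stands in for the inference algorithm, which the text does not specify.

```python
import numpy as np

def connection_probability(theta, pre, post):
    """Hypothetical log-linear rule; theta = (bias, w_pre, w_post)."""
    log_mean = (theta[0]
                + theta[1] * np.log(pre)[:, None]
                + theta[2] * np.log(post)[None, :])
    mean_synapses = np.exp(log_mean)
    # Assuming Poisson synapse counts: P(at least one synapse) = 1 - exp(-mean).
    return 1.0 - np.exp(-mean_synapses)

def summary_statistic(theta, pre, post):
    """Aggregate network statistic: mean connection probability over all pairs."""
    return float(connection_probability(theta, pre, post).mean())

def rejection_sample(observed, pre, post, n_draws=2000, tol=0.02, seed=0):
    """Keep parameter draws whose statistic lies within tol of the observation."""
    rng = np.random.default_rng(seed)
    accepted = []
    for _ in range(n_draws):
        # Uniform prior over an assumed (made-up) parameter range.
        theta = rng.uniform(low=[-3.0, 0.0, 0.0], high=[1.0, 2.0, 2.0])
        if abs(summary_statistic(theta, pre, post) - observed) < tol:
            accepted.append(theta)
    return np.array(accepted)

rng = np.random.default_rng(1)
pre = rng.uniform(0.1, 1.0, size=50)    # made-up presynaptic target densities
post = rng.uniform(0.1, 1.0, size=40)   # made-up postsynaptic target densities
observed = summary_statistic(np.array([-1.0, 1.0, 1.0]), pre, post)
posterior = rejection_sample(observed, pre, post)
```

Each row of `posterior` is a parameter set compatible with the observed statistic. A single aggregate statistic usually leaves the weights poorly constrained, which is one reason to compare several statistics against empirical observations.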

The last step, evaluating each sample, poses algorithmic challenges because the size of the connectome grows quadratically with the number of neurons (>500k), resulting in a connectivity matrix with more than 250 billion entries. Likewise, the cost of computing aggregate network statistics depends directly on the size of the connectome. We address this by:

- Exploiting the sparsity of the connectome, in which many neuron pairs are not connected because their pre- and postsynaptic structures do not overlap.
- Offering informed trade-offs between accuracy and speed when determining the network statistics through sampling the connectivity matrix.
- Developing efficient algorithms that integrate the computation of both connectome and selected network statistics.
- Devising partitioning schemes that enable large-scale parallelism and utilizing the high-performance computing infrastructure at ZIB.
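
To illustrate the first two points, the sketch below stores a small random connectome as a set of connected index pairs and estimates the connection probability between two subpopulations from randomly sampled pairs. The network size, the edge density, and all names are assumptions chosen for illustration; the sample count is the knob that trades accuracy for speed.

```python
import numpy as np

# A dense matrix for 500k neurons would hold 500_000 ** 2 = 2.5e11 entries,
# so only existing connections are stored, as (pre, post) index pairs.
DENSE_ENTRIES = 500_000 ** 2

def exact_connection_probability(edges, pre_ids, post_ids):
    """Fraction of (pre, post) pairs that are connected; one pass over the edges."""
    pre_set, post_set = set(pre_ids.tolist()), set(post_ids.tolist())
    hits = sum(1 for i, j in edges if i in pre_set and j in post_set)
    return hits / (len(pre_ids) * len(post_ids))

def sampled_connection_probability(edges, pre_ids, post_ids, n_samples, rng):
    """Monte Carlo estimate from randomly drawn pairs, with its standard error."""
    pre = rng.choice(pre_ids, size=n_samples).tolist()
    post = rng.choice(post_ids, size=n_samples).tolist()
    hits = sum((i, j) in edges for i, j in zip(pre, post))
    p = hits / n_samples
    return p, float(np.sqrt(p * (1.0 - p) / n_samples))

rng = np.random.default_rng(0)
n_neurons = 2000                                   # toy size, not 500k
src = rng.integers(0, n_neurons, size=80_000).tolist()
dst = rng.integers(0, n_neurons, size=80_000).tolist()
edges = set(zip(src, dst))                         # sparse toy connectome

pre_ids = np.arange(0, 1000)                       # hypothetical subpopulation A
post_ids = np.arange(1000, 2000)                   # hypothetical subpopulation B
p_exact = exact_connection_probability(edges, pre_ids, post_ids)
p_est, p_err = sampled_connection_probability(edges, pre_ids, post_ids, 20_000, rng)
```

The standard error shrinks with the square root of the number of samples, so halving the error costs roughly four times as many pair lookups.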

### Web-based Analysis Framework

As part of the project, we create a web-based analysis framework that allows anyone to (re-)calculate network statistics and compare results between different connectomes. In particular, we want to meet the following requirements:

- *Pairwise network statistics*: Free selection of pre- and postsynaptic neuron subpopulations based on structural features (e.g., cell type, soma location) and computation of aggregate network statistics, such as connection probability or innervation between the specified subpopulations.
- *Higher order network statistics*: Free definition of triplet motifs that represent the 16 possible connection configurations between 3 neurons, and computation of the respective occurrence statistics.
- *Multiple connectomes*: Uploading and comparing different connectome datasets.
- *Extensibility*: Users can extend the framework by implementing their own network statistics.
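
The count of 16 triplet motifs can be checked by enumeration: three neurons admit 6 directed connections and therefore 64 wiring diagrams, which collapse into 16 classes once diagrams that differ only by relabeling the neurons are treated as the same motif. The sketch below is a small illustration of that count; the function names and the motif ordering are assumptions, not the framework's API.

```python
from itertools import permutations, product

# The six possible directed connections between neurons 0, 1, and 2.
PAIRS = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1)]

def canonical_form(edges):
    """Smallest sorted edge tuple over all relabelings of the three neurons."""
    forms = []
    for perm in permutations(range(3)):
        forms.append(tuple(sorted((perm[a], perm[b]) for a, b in edges)))
    return min(forms)

def all_motif_classes():
    """Enumerate all 2**6 wiring diagrams and collect their canonical forms."""
    classes = set()
    for mask in product([0, 1], repeat=len(PAIRS)):
        edges = [pair for pair, bit in zip(PAIRS, mask) if bit]
        classes.add(canonical_form(edges))
    # Order by number of connections, then lexicographically (arbitrary choice).
    return sorted(classes, key=lambda c: (len(c), c))

MOTIFS = all_motif_classes()

def motif_index(edges):
    """Index of the motif class that a given triplet wiring belongs to."""
    return MOTIFS.index(canonical_form(edges))

n_motifs = len(MOTIFS)
```

Occurrence statistics then reduce to counting how often each `motif_index` value appears over the neuron triplets of interest.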

The ability to compare connectomes is crucial for the following intended use cases:

- *Rule vs. rule*: Two connectomes generated by different synapse formation rules (or parameter sets) are compared.
- *Rule vs. empirical observation*: A connectome generated by a synapse formation rule is compared to a connectome acquired experimentally through electron microscopy.
- *In silico slicing vs. in vitro slicing*: In many experimental in vitro settings, axons and dendrites are physically cut, which leads to deviations compared to the in silico model. To investigate these effects, we create a virtually truncated network model for comparison.

### Interactive Visual Analysis

To better interpret the implications of different synapse formation rules, we develop interactive visual analysis methods that intuitively convey the following aspects:

- *Differences between connectomes*: Deviations in network connectivity as reflected in the following types of statistics:
  - first order statistics: e.g., number of synapses
  - second order statistics: e.g., connection probability between subpopulations
  - higher order statistics: e.g., distribution of triplet motifs
- *Multi-scale connectivity*: Exploration of connectivity from local synapse counts to high-level connection strength patterns
- *Uncertainty*: Prediction uncertainty in connection probability and related network statistics
- *Structural connectivity patterns*: Facilitating analysis of network dynamics through matrix-based visualizations of the network graph

We plan to integrate the new visualization methods into the web-based analysis framework.

### Management of Data and Software

The anatomical network model and the derived structural features and connectomes represent large datasets. We expect that the following datasets will be generated during the project:

- anatomical networks (~3.3 GB each, ca. 10 in total)
- derived structural features (100 GB - 1 TB per anatomical network)
- generated connectomes (10 - 50 GB per rule evaluation)
- aggregate network statistics (< 500 MB)

We take advantage of the technical infrastructure at ZIB to store, publish, and archive these datasets. All raw datasets used in publications or considered useful for other purposes are made available for download through the ZIB website. The source code of our software tools can be obtained via GitHub.