## New ideas for making sense of big data

### October 18, 2013

Okay, so suppose you’ve just measured responses in hundreds of neurons, over time, during a complex behavioral task. Now what? My lab members and I attended a conference at Columbia this week focused on this question. The conference, organized by Mark Churchland, Larry Abbott, John Cunningham, and Liam Paninski, and sponsored by Sandy Grossman, addressed a timely topic: advances in recording and imaging technology have made large neural datasets the norm, and understanding how to analyze such datasets is nontrivial.

The talks included one from our lab, in which I described our recent ideas about the posterior parietal cortex and its responses during a high-dimensional decision task. Our work dovetailed with several other talks at the meeting. For example, Chris Machens spoke about his demixed principal component analysis, a method we have been using on our own data. Chris, along with his student Wieland Brendel, developed this analysis to ask whether parameters that are mixed at the level of single neurons might be orthogonal at the level of the population. Observing an orthogonal representation in the population is important because it suggests that task parameters are represented in a way that could be trivially decoded by a downstream area.
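To make the orthogonality idea concrete, here is a minimal sketch (not the actual dPCA algorithm, and using made-up data): we simulate a population in which every neuron mixes two hypothetical task parameters, then estimate each parameter's population coding axis by marginalizing over the other parameter, and finally measure the angle between the two axes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_trials = 100, 200

# Hypothetical task with two binary parameters: stimulus and decision.
stim = rng.choice([-1.0, 1.0], n_trials)
dec = rng.choice([-1.0, 1.0], n_trials)

# Every neuron mixes both parameters with random weights ("mixed selectivity"
# at the single-neuron level).
w_stim = rng.normal(size=n_neurons)
w_dec = rng.normal(size=n_neurons)
rates = (np.outer(stim, w_stim) + np.outer(dec, w_dec)
         + rng.normal(scale=0.5, size=(n_trials, n_neurons)))

def coding_axis(labels):
    # Marginalize: average responses for each level of one parameter,
    # collapsing over the other; the difference of means is that
    # parameter's population coding axis.
    axis = rates[labels > 0].mean(0) - rates[labels < 0].mean(0)
    return axis / np.linalg.norm(axis)

ax_stim, ax_dec = coding_axis(stim), coding_axis(dec)

# Even though every single neuron mixes both parameters, the two
# population-level axes come out close to orthogonal, so each parameter
# could be read out separately by a downstream area.
angle = np.degrees(np.arccos(np.clip(ax_stim @ ax_dec, -1.0, 1.0)))
print(f"angle between stimulus and decision axes: {angle:.1f} deg")
```

The point of the sketch is the contrast: selectivity is mixed neuron by neuron (each row of weights touches both parameters), yet the population-level axes are nearly orthogonal because the random weight vectors are close to uncorrelated in a high-dimensional space.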

In another talk, Jonathan Pillow described recent work from his lab on Bayesian nonparametric models for spike patterns in large datasets. The basic idea in “Bayesian nonparametrics” is to define models whose complexity grows gracefully with the amount of data available. Jonathan described an approach for modeling binary spike patterns using a Dirichlet process, which marries the parsimony of a simple parametric model (e.g., each neuron fires independently with probability “p”) with the flexibility of a “histogram” model that can describe arbitrarily complex distributions over binary spike patterns. These models, which Jonathan’s group calls “universal binary models”, strike a happy medium between overly complex models and those that are so simple they fail to capture key features of spike data.
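A rough sketch of the flavor of this compromise (not Jonathan's actual model, and on simulated data): count each observed binary spike word as in a histogram model, but smooth the counts toward an independent-Bernoulli base model, Dirichlet-style. A concentration parameter `alpha` (my label, an assumption) sets the trade-off: large `alpha` recovers the parsimonious parametric model, small `alpha` recovers the raw histogram.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_samples = 5, 300

# Simulated correlated spike words: a shared input drives neurons to fire
# together more often than an independent model would predict.
common = rng.random(n_samples) < 0.3
words = ((rng.random((n_samples, n_neurons)) < 0.15) | common[:, None]).astype(int)

# Parametric base model: each neuron fires independently with its own rate p.
p = words.mean(0)
def p_independent(w):
    return np.prod(np.where(np.asarray(w) == 1, p, 1 - p))

# Histogram model: empirical frequency of each observed spike word.
counts = {}
for w in map(tuple, words):
    counts[w] = counts.get(w, 0) + 1

# Dirichlet-style compromise: counts smoothed toward the independent model.
alpha = 20.0
def p_smoothed(w):
    return (counts.get(tuple(w), 0) + alpha * p_independent(w)) / (n_samples + alpha)

# Sanity check: the smoothed probabilities over all 2^5 words sum to one.
total = sum(p_smoothed(w) for w in itertools.product([0, 1], repeat=n_neurons))
print(f"total probability: {total:.6f}")
```

With few observations the estimate leans on the simple independent model; as data accumulate, the histogram term dominates and arbitrarily complex correlation structure can be captured, which is the "complexity grows gracefully with data" property the talk emphasized.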