Shoot, why’d I just do that?

June 20, 2019

We’ve all had the experience of botching an easy decision. Laboratory subjects, both human and animal, also sometimes make the wrong choice when categorizing stimuli that should be really easy to judge. We recently wrote a paper about this, which is on bioRxiv. We argued that these lapses are not goof-ups but instead reflect the need for subjects to explore an environment to better understand its rules and rewards. We also made a cake about this finding, which was delicious.

We were happy to hear that Jonathan Pillow’s lab picked our paper to discuss in their lab meeting. Pillow’s team have, like us, been enthusiastic about new ways to characterize lapses and in fact have a rather interesting (and complementary) account, which you can read if interested. We really enjoyed reading this thoughtful blog post by Zoe Ashwood about their lab meeting discussion.

They raised a few concerns which we address below:

Concern #1: The first concern had to do with the probability of attending ( $p_{attend}$ ), the parameter that determined the overall rate of lapses in the traditional inattention model. We would have liked to see further justification for keeping $p_{attend}$ the same across the matched and neutral experiments and we question if this is a fair test for the inattention model of lapses. Previous work such as Körding et al. (2007) makes us question whether the animal uses different strategies to solve the task for the matched and neutral experiments. In particular, in the matched experiment, the animal may infer that the auditory and visual stimuli are causally related; whereas in the neutral experiment, the animal may detect the two stimuli as being unrelated. If this is true, then it seems strange to assume that $p_{attend}$ and $p_{bias}$ for the inattention model should be the same for the matched and neutral experiments.

Our response: In the inattention model, $p_{attend}$ represents the probability of not missing the stimulus, so it should be influenced by (a) the animal’s attentional state before experiencing the stimulus & (b) the bottom-up salience that allows the stimulus to “pop” into the animal’s attention. Since the matched & neutral stimuli are interleaved and both consist of equally salient multisensory events, we reasoned that $p_{attend}$ should be the same on these trials. Also note that even on matched trials, the auditory and visual events are not presented synchronously (the two event streams are independently generated). Surprisingly, this does little to deter a causal inference: animals integrate nonetheless (see Raposo, 2012). So from the point of view of the animal, a trial isn’t obviously a neutral trial right from the outset. In keeping with that, we found that animals were influenced by stimuli over the entire course of the trials, even for neutral trials (see psychophysical kernels below).
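For readers who want the inattention model in concrete form, here is a minimal sketch. It assumes the textbook form of the model (attend with probability $p_{attend}$ and respond according to a cumulative Gaussian, otherwise guess right with probability $p_{bias}$); the function names and parameter values are made up for illustration and are not fitted values from the paper.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_right_inattention(stim, mu, sigma, p_attend, p_bias):
    """P(choose right) under a standard inattention (lapse) model.

    With probability p_attend the animal attends and responds according to a
    cumulative Gaussian in the stimulus; otherwise it guesses "right" with
    probability p_bias, regardless of the stimulus.
    """
    attended = normal_cdf((stim - mu) / sigma)
    return p_attend * attended + (1.0 - p_attend) * p_bias


# Illustrative (hypothetical) parameters, not values from the paper.
params = dict(mu=0.0, sigma=1.0, p_attend=0.8, p_bias=0.5)

# Asymptotes at very easy stimuli: these are set by p_attend and p_bias alone.
low_asymptote = p_right_inattention(-100.0, **params)
high_asymptote = p_right_inattention(100.0, **params)
print(low_asymptote, high_asymptote)
```

In this form the asymptotes depend only on $p_{attend}$ and $p_{bias}$, which is why fixing both across interleaved matched and neutral trials constrains the lapse rates on those trials to be equal.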

But we agree about the different strategies: *after* the animal attends to the stimuli & estimates their rates, it could potentially use this information to infer that a trial is neutral and discard the irrelevant visual information (a “causal inference” strategy akin to Körding et al.) rather than integrating it (a “forced fusion” strategy). However, this retrospective discarding differs from inattention because it requires knowledge of the rates and doesn’t produce lapses; instead it affects $\sigma$. Causal inference predicts comparable neutral and auditory values of $\sigma$, while forced fusion predicts neutral values of $\sigma$ that are higher than auditory, due to inappropriately integrated noise. Indeed we see comparable neutral and auditory values of $\sigma$ (and values of $\beta$ too), suggesting causal inference.
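To make the two predictions for $\sigma$ concrete, here is a rough sketch under simplifying assumptions that are ours for illustration: Gaussian sensory noise, reliability-weighted averaging under forced fusion, and complete discarding of the visual stream under causal inference. The function names and noise values are hypothetical.

```python
import math


def forced_fusion_sigma(sigma_aud, sigma_vis):
    """Effective sigma (in units of auditory rate) on neutral trials when the
    uninformative visual estimate is fused using reliability weights."""
    rel_aud = 1.0 / sigma_aud ** 2
    rel_vis = 1.0 / sigma_vis ** 2
    w_aud = rel_aud / (rel_aud + rel_vis)
    w_vis = 1.0 - w_aud
    # Noise of the fused estimate: both streams contribute noise.
    noise_sd = math.sqrt((w_aud * sigma_aud) ** 2 + (w_vis * sigma_vis) ** 2)
    # Only the auditory stream carries signal, and it is scaled by w_aud,
    # so the psychometric sigma w.r.t. auditory rate is noise_sd / w_aud.
    return noise_sd / w_aud


def causal_inference_sigma(sigma_aud, sigma_vis):
    """Idealized case: the irrelevant visual stream is discarded entirely,
    so only auditory noise remains."""
    return sigma_aud


# Hypothetical noise levels, chosen only for illustration.
sigma_aud, sigma_vis = 1.0, 1.0
print(forced_fusion_sigma(sigma_aud, sigma_vis))     # larger than sigma_aud
print(causal_inference_sigma(sigma_aud, sigma_vis))  # equal to sigma_aud
```

Under these assumptions the forced-fusion value works out to $\sigma_{aud}\sqrt{1 + \sigma_{aud}^2/\sigma_{vis}^2}$, which always exceeds the auditory $\sigma$, whereas discarding leaves it unchanged.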

The second concern had to do with how asymmetric lapses could be accounted for in our new exploration model:

Concern #2: When there are equal rewards for left and right stimuli, is there only a single free parameter determining the lapse rates in the exploration model (namely $\beta$)? If so, how do the authors allow for asymmetric left and right lapse rates for the exploration model curves of Figure 3e (that is, the upper and lower asymptotes look different for both the matched and neutral curves despite equal left and right reward, yet the exploration model seems able to handle this – how does the model do this?).

Our response: In the exploration model, in addition to $\beta$, the lapse rates on either side are determined by the *subjective* values of left and right actions (rL & rR), which must be learnt from experience and hence could be different even when the true rewards are equal, permitting asymmetric lapse rates. When one of the rewards is manipulated, we only allow the corresponding subjective value to change. Since there is an arbitrary scale factor on rR & rL and we only ever manipulate one of the rewards, we can set the un-manipulated reward (say rL) to unity & fit 2 parameters to capture lapses: $\beta$ & rR in units of rL.
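Here is a small sketch of how two parameters can produce asymmetric asymptotes. It assumes a softmax (logistic) choice rule over expected action values with inverse temperature $\beta$, which is one common way to write an exploration policy; the exact functional form, the function names, and the numbers are illustrative assumptions rather than the fitted model.

```python
import math


def p_right_exploration(p_right_belief, beta, r_right, r_left=1.0):
    """P(choose right) from a softmax over expected action values.

    p_right_belief: the animal's belief that "right" is the correct category.
    r_right, r_left: subjective values of the two actions (r_left fixed to 1
    to remove the arbitrary scale factor).
    """
    q_right = p_right_belief * r_right
    q_left = (1.0 - p_right_belief) * r_left
    return 1.0 / (1.0 + math.exp(-beta * (q_right - q_left)))


def lapse_rates(beta, r_right, r_left=1.0):
    """Error rates at the two extremes, where the belief is certain."""
    lapse_on_left_stimuli = p_right_exploration(0.0, beta, r_right, r_left)
    lapse_on_right_stimuli = 1.0 - p_right_exploration(1.0, beta, r_right, r_left)
    return lapse_on_left_stimuli, lapse_on_right_stimuli


# Hypothetical values: equal subjective values give symmetric lapses...
print(lapse_rates(beta=2.0, r_right=1.0))
# ...while a higher subjective value for "right" lowers lapses on right stimuli only.
print(lapse_rates(beta=2.0, r_right=1.5))
```

In this sketch, changing rR moves only the asymptote for right stimuli, while changing $\beta$ moves both, which is the sense in which the two parameters jointly capture asymmetric lapses.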

The final concern had to do with $\beta$ , the parameter that determined the overall rate of lapses in the exploration model:

Concern #3: How could uncertainty $\beta$ be calculated by the rat? Can the empirically determined values of $\beta$ be predicted from, for example, the number of times that the animal has seen the stimulus in the past? And what were some typical values for the parameter $\beta$ when the exploration model was fit with data? How exploratory were the rats for the different experimental paradigms considered in this paper?

Our response: From the rat’s perspective, $\beta$ can arise naturally as a consequence of Thompson sampling from action value beliefs (Supplementary Fig. 2, also see Gershman 2018), yielding a $\beta$ inversely proportional to the square root of the summed variances of the action value beliefs. This should also naturally depend on the history of feedback: if the animal receives unambiguous feedback (as on sure-bet trials), then these beliefs should be well separated, yielding a higher $\beta$. Supplementary Fig. 2 simulates this for 3 levels of sensory noise for a particular sequence of stimuli & a Thompson sampling policy.
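The link between Thompson sampling and $\beta$ can be checked with a quick simulation. This sketch assumes independent Gaussian beliefs over the two action values, in which case sampling one value from each belief and picking the larger gives a probit choice rule with $\beta = 1/\sqrt{\sigma_R^2 + \sigma_L^2}$. The function names and numbers are hypothetical, and this is not the simulation from the supplementary figure.

```python
import math
import random


def thompson_p_right(q_right, q_left, sd_right, sd_left, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(choose right) under Thompson sampling:
    draw one value from each Gaussian belief and choose the larger draw."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        if rng.gauss(q_right, sd_right) > rng.gauss(q_left, sd_left):
            wins += 1
    return wins / n_samples


def probit_p_right(q_right, q_left, sd_right, sd_left):
    """Closed form for the same policy: a probit in the value difference,
    with beta equal to the inverse root of the summed variances."""
    beta = 1.0 / math.sqrt(sd_right ** 2 + sd_left ** 2)
    z = beta * (q_right - q_left)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical beliefs: means 1.0 and 0.6, S.D.s 0.3 and 0.4 (so beta = 2).
simulated = thompson_p_right(1.0, 0.6, 0.3, 0.4)
analytic = probit_p_right(1.0, 0.6, 0.3, 0.4)
print(simulated, analytic)  # the two should agree up to sampling error
```

Narrower beliefs (for example after unambiguous feedback) shrink the summed variance and so raise $\beta$, making choices less exploratory.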


On unisensory trials the $\beta$ values were ~5, meaning that the average uncertainty (S.D.) for unit expected rewards was ~0.2; this was reduced to ~0.14 on multisensory trials. The near-perfect performance on sure-bet trials suggests negligible uncertainty/exploration on those trials. These values remained unchanged for reward/neural manipulations, which only affected rR/rL depending on the side.
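As a quick check on the arithmetic, here is the conversion between $\beta$ and uncertainty, assuming (as the numbers above imply) that the uncertainty in units of expected reward is simply $1/\beta$. The helper names are ours.

```python
def uncertainty_from_beta(beta):
    """Uncertainty (S.D., in units of expected reward), assuming S.D. = 1 / beta."""
    return 1.0 / beta


def beta_from_uncertainty(sd):
    """Inverse of the conversion above."""
    return 1.0 / sd


print(uncertainty_from_beta(5.0))   # unisensory: beta of ~5 gives an S.D. of ~0.2
print(beta_from_uncertainty(0.14))  # multisensory: an S.D. of ~0.14 implies beta of ~7
```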

Posted by churchlandlab
