Variance of expectation when updating from binary signals

Consider the following problem: There's a stream of signals $s=(s_1,\dots,s_T)$ with $s_t \in \{0,1\}$ and $P(s_t = 1) = p_t$. Now, $p_1\sim U[0,1]$ and, for every $t>1$, $p_t = p_{t-1}$ with probability $1-q$; with probability $q$, $p_t$ is drawn again from $U[0,1]$. So this is a process where the probability that the signal is positive can change over time or can be persistent. The larger $q$, the less persistent the process.
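To make the setup concrete, here is a minimal simulation sketch. It assumes that $q$ is the per-period probability that $p_t$ is redrawn (the reading under which a larger $q$ means less persistence); the function name `simulate_stream` is only illustrative.

```python
import random

def simulate_stream(T, q, rng):
    """Draw one path (p_1..p_T) and the signals (s_1..s_T).

    Assumption: q is the per-period probability that p_t is redrawn
    from U[0,1]; with probability 1 - q it stays at p_{t-1}.
    """
    p = rng.random()              # p_1 ~ U[0,1]
    ps, ss = [], []
    for t in range(T):
        if t > 0 and rng.random() < q:
            p = rng.random()      # redraw from U[0,1]
        ps.append(p)
        ss.append(1 if rng.random() < p else 0)
    return ps, ss

# Example: one stream of length 10 with a 20% redraw probability.
ps, ss = simulate_stream(10, 0.2, random.Random(0))
```

With $q=0$ the path of $p_t$ is constant; with $q=1$ every period gets a fresh draw.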

Suppose now that we generate a large number of signal streams $s$ and feed them into a computer, which then calculates the expected value of $p_T$ given the signal stream, $E[p_T|s]$. Then, we take the variance of the expected values calculated by the computer. My conjecture is that this variance decreases in $q$. The reasoning is the following: If $q$ is large, then there's little persistence in the process. This means that the expectation of $p_T$ is, most of the time, not so different from $1/2$. If $q$ is small, then there's lots of persistence; in the extreme case $q=0$ and $T=\infty$, the distribution of the expected values of $p_T$ will simply be $U[0,1]$, which has a large variance.

I have a formula for $E[p_T|s]$. For example, when $q=0$, then $E[p_T|s] = \frac{\sum_{t=1}^{T}s_t + 1}{T+2}$ and when $q=1$ then $E[p_T|s] = \frac{s_T+1}{3}$. For the general case,

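$$\begin{aligned} E[p_T|s] &= w_T \frac{s_T+1}{3} + (1-w_T)\Bigl[w_{T-1}\frac{s_T +s_{T-1}+1}{4} + (1-w_{T-1})\Bigl[w_{T-2}\frac{s_T +s_{T-1} + s_{T-2}+1}{5}+\dots\Bigr]\Bigr] \\ &= \sum_{t=1}^T \tilde{w}_t \frac{\sum_{t'=t}^{T}s_{t'} + 1}{T-t+3}, \end{aligned}$$
where $w_t = \frac{q/2}{q/2 + (1-q)E[p_{t-1}|(s_1,\dots,s_{t-1})]}$ if $s_t=1$ and $w_t = \frac{q/2}{q/2 + (1-q)(1-E[p_{t-1}|(s_1,\dots,s_{t-1})])}$ if $s_t=0$. Expanding the nested brackets gives $\tilde{w}_t = w_t\prod_{u=t+1}^{T}(1-w_u)$, with $w_1 = 1$ because $p_1$ is always a fresh draw.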
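One way to sanity-check a formula for $E[p_T|s]$ numerically is a forward recursion over the time of the last redraw, in the style of Bayesian online changepoint detection. This is a swapped-in technique, not the nested-weight formula above, and it assumes that $q$ is the redraw probability (the reading under which the $q=0$ and $q=1$ cases above hold). The name `posterior_mean` is only illustrative.

```python
def posterior_mean(s, q):
    """Return (E[p_T | s], P(s)) by tracking the time of the last redraw.

    Assumption: q is the per-period redraw probability. Each hypothesis
    "p last changed at time tau" is stored as (joint weight, n, k), where
    n signals have been seen since tau and k of them equal 1. Under a
    U[0,1] prior the posterior of p is Beta(k + 1, n - k + 1), so its
    mean and the predictive probability of a 1 are both (k + 1)/(n + 2).
    """
    hyps = [(0.5, 1, s[0])]       # P(s_1) = 1/2 for either value of s_1
    for x in s[1:]:
        new = []
        total = 0.0
        for w, n, k in hyps:
            total += w
            pred1 = (k + 1) / (n + 2)
            like = pred1 if x == 1 else 1.0 - pred1
            # No redraw this period: the run continues with one more signal.
            new.append((w * (1.0 - q) * like, n + 1, k + x))
        # Redraw this period: a new run starts, and x has probability 1/2.
        new.append((total * q * 0.5, 1, x))
        hyps = new
    prob_s = sum(w for w, _, _ in hyps)
    mean = sum(w * (k + 1) / (n + 2) for w, n, k in hyps) / prob_s
    return mean, prob_s

# Example: posterior mean and stream probability for a short stream.
mean, prob_s = posterior_mean([1, 0, 1, 1], 0.3)
```

At $q=0$ this returns $\frac{\sum_t s_t+1}{T+2}$ and at $q=1$ it returns $\frac{s_T+1}{3}$, matching the two special cases.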

Essentially, the expectation is a weighted average of the estimates conditional on $p_t$ having last changed at time $t$. Given this formula, the variance of the expectation is

$$ Var = \sum_{s\in S} P(s) (E[p_T|s] - 1/2)^2. $$ ($S$ is the set of possible signal streams)

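The probability of a stream is $P(s) = \frac{1}{2}\prod_{t=2}^T \bigl(q/2 + (1-q)E[p_{t-1}|(s_1,\dots,s_{t-1})]\bigr)$, where for every $t$ with $s_t=0$ the conditional expectation is replaced by $1-E[p_{t-1}|(s_1,\dots,s_{t-1})]$, as in the definition of $w_t$.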
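For small $T$ the variance can be computed exactly by brute force, which gives a way to test the conjecture numerically before attempting a proof. The sketch below enumerates every signal stream and every pattern of redraws, again assuming that $q$ is the redraw probability; the helper names are only illustrative.

```python
from itertools import product
from math import factorial

def beta_int(k, n):
    """Integral of p^k (1-p)^(n-k) over [0,1], i.e. k!(n-k)!/(n+1)!."""
    return factorial(k) * factorial(n - k) / factorial(n + 1)

def variance_of_posterior_mean(T, q):
    """Exact Var(E[p_T | s]) by enumerating streams and redraw patterns.

    Assumption: q is the per-period redraw probability. The cost grows
    like 4^T, so this is only meant for small T.
    """
    var = 0.0
    for s in product((0, 1), repeat=T):
        prob_s = 0.0  # P(s)
        num = 0.0     # sum over patterns c of P(s, c) * E[p_T | s, c]
        for c in product((0, 1), repeat=T - 1):  # c[i] = 1: redraw at period i + 2
            prior, like, start = 1.0, 1.0, 0
            for i, ci in enumerate(c):
                prior *= q if ci else 1.0 - q
                if ci:  # close the segment that ends at period i + 1
                    seg = s[start:i + 1]
                    like *= beta_int(sum(seg), len(seg))
                    start = i + 1
            seg = s[start:]  # signals since the last redraw
            like *= beta_int(sum(seg), len(seg))
            joint = prior * like
            prob_s += joint
            num += joint * (sum(seg) + 1) / (len(seg) + 2)
        var += prob_s * (num / prob_s - 0.5) ** 2
    return var

# Example: scan a grid of q values for T = 5.
for q in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(q, variance_of_posterior_mean(5, q))
```

Working the $T=2$ case by hand gives $Var = \frac{(2-q)^2}{24(4-q)} + \frac{q^2}{24(2+q)}$, which equals $1/24$ at $q=0$ and $1/36$ at $q=1$ and agrees with the enumeration.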

The intuition for why the variance decreases as $q$ increases is that increasing $q$ will increase the weight that the expectation puts on $p_t$ having changed more recently. This means that the estimate of $p_T$ is based on fewer signals. Therefore, it should be closer to $1/2$.

Despite this fairly clear intuition, I find it difficult to derive an analytical statement. The problem seems to be that this is a variance of a weighted average, which is tricky. Looking forward to hints on how to approach this problem. If there are sources which study such a process, I would also be interested in links to them.

