Starting at the case where there is no future looks over-simplistic, but still:
0. No more future
At the end of the horizon, there is no use for future stock. You reinvest nothing, because that maximizes the utility ln((1-u)bx) = ln(bx) + ln(1-u), which is largest at u = 0.
Often, one would not even formulate this "optimization" step: the model would say that at this stage, there is no reinvestment.
Anyway: the value is ln b + ln x where x is whatever stock you might have at this stage.
1. Time ends tomorrow.
Choosing u today yields ln((1-u)bx) today. But tomorrow, you get a payoff from tomorrow's stock g = (1-?+bu)x, namely ln b + ln g.
Inserting, you get to maximize ln((1-u)bx) + ln b + ln((1-?+bu)x).
In this particular case, because logs behave so nicely, we get 2 ln x + 2 ln b + ln((1-u)(1-?+bu)) to maximize. Already here, we see that the maximizer will not depend on x, so we will end up with 2 ln x + some constant. (The maximizer u* equals (b+?-1)/(2b): an interior maximum, by the assumptions made. Insert this, and the constant is determined.)
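As a quick numerical sanity check of the maximizer above, here is a minimal sketch that compares a brute-force grid search with the closed form. The parameter shown as "?" in the text is called `delta` here; that name, the function names and the values b = 1.5, delta = 0.2 are assumptions made for this illustration only, chosen so that the maximizer is interior.

```python
import math

def two_period_objective(u, b, delta):
    """The u-dependent part of the objective: ln((1-u)(1-delta+b*u))."""
    return math.log((1 - u) * (1 - delta + b * u))

def grid_maximizer(b, delta, n_grid=10_000):
    """Brute-force argmax over u = 0, 1/n, ..., (n-1)/n (u = 1 is excluded: ln 0)."""
    candidates = [i / n_grid for i in range(n_grid)]
    return max(candidates, key=lambda u: two_period_objective(u, b, delta))

def closed_form_maximizer(b, delta):
    """u* = (b + delta - 1) / (2b), valid when this lies strictly inside (0, 1)."""
    return (b + delta - 1) / (2 * b)

b, delta = 1.5, 0.2  # illustrative values only, not taken from the text
u_grid = grid_maximizer(b, delta)
u_star = closed_form_maximizer(b, delta)
print(u_grid, u_star)
```

The two numbers should agree up to the grid spacing, since the objective is strictly concave in u. Note that x never enters the search, which is the point made above.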
What we just did.
We maximized to get what? Today's value = today's direct utility + tomorrow's value.
(If there were discounting: "present value of tomorrow's value".)
Note, "tomorrow's value" is a function of tomorrow's state, which depends on today's state and today's choice: g(x,u).
(We could have had time-dependence too.)
So if we let f be today's running utility and V be tomorrow's value, then we have today's value \(= v(x) = \displaystyle\max_{u\in[0,1]} \Big\{f(x,u)+V(g(x,u))\Big\}.\)
General principle
Value depends on the horizon. Call the time of the "0" case T. So the value ln b + ln x should be indexed with time. It is not uncommon to use the letter "J" for value (why not? It isn't used that much for other things) - so we index it by time and write \(J_T(x) = \ln b + \ln x\).
Recursively, we then have \({J_{t-1}(x)= \displaystyle\max_{u\in[0,1]} \Big\{f(t,x,u)+J_t(g(t,x,u))\Big\}}\)
(Here we have allowed both running utility f and the dynamics g to depend on time.)
In words: To get the optimal value today, optimize the sum of
- today's direct utility, and
- value from tomorrow.
Note: "value" from tomorrow assumes that we behave optimally from tomorrow on. A half-assed attempt at implementation here in Norwegian: https://www.youtube.com/watch?v=BERRCrSBNdk - with an English translation at http://www.jakobsande.no/?info=12&dikt=822
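The recursion above can be transcribed almost literally into code. The following is a rough sketch for the example problem, not an efficient implementation: it recurses directly and maximizes over a grid of u values, so the cost grows exponentially with the number of remaining periods. The names `value`, `n_grid` and `delta` (standing for the parameter shown as "?" in the text), as well as the numerical values, are assumptions made for this illustration.

```python
import math

def value(t, x, T, b, delta, n_grid=200):
    """Approximate J_t(x) by brute-force backward recursion.

    Terminal case:  J_T(x) = ln b + ln x   (nothing is reinvested).
    Otherwise:      J_t(x) = max over u of  ln((1-u) b x) + J_{t+1}((1-delta+b u) x),
    with u restricted to the grid 0, 1/n, ..., (n-1)/n.
    """
    if t == T:
        return math.log(b) + math.log(x)
    best = -math.inf
    for i in range(n_grid):
        u = i / n_grid
        running = math.log((1 - u) * b * x)      # today's direct utility
        next_state = (1 - delta + b * u) * x     # tomorrow's stock
        candidate = running + value(t + 1, next_state, T, b, delta, n_grid)
        best = max(best, candidate)
    return best

b, delta = 1.5, 0.2  # illustrative values only, not taken from the text
print(value(0, 2.0, 1, b, delta))  # one period before the end
print(value(0, 2.0, 2, b, delta))  # two periods before the end
```

Each call relies on the next period's value already being the optimal one, which is exactly the assumption that we behave optimally from tomorrow on. For longer horizons one would tabulate each J_t on a grid of states instead of recursing.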
Exercise:
With time t left, call the value \(J_{T-t}(x)\). For the above problem: prove by induction that this value is of the form \(C_{T-t} + A_{T-t} \ln x\) (understood: where the A and C do not depend on x).
A note on the language: phrases like "T period model" could lead to a bit of confusion. Is a static model one of zero periods or one? I intended "With time t left" to mean that the "0" case above is t = 0, i.e. \(J_{T-t} = J_{T-0} = J_T\).