(***********************************************************************
Mathematica-Compatible Notebook
This notebook can be used on any computer system with Mathematica 3.0,
MathReader 3.0, or any compatible application. The data for the notebook
starts with the line of stars above.
To get the notebook into a Mathematica-compatible application, do one of
the following:
* Save the data starting with the line of stars above into a file
with a name ending in .nb, then open the file inside the application;
* Copy the data starting with the line of stars above to the
clipboard, then use the Paste menu command inside the application.
Data for notebooks contains only printable 7-bit ASCII and can be
sent directly in email or through ftp in text mode. Newlines can be
CR, LF or CRLF (Unix, Macintosh or MS-DOS style).
NOTE: If you modify the data for this notebook not in a Mathematica-
compatible application, you must delete the line below containing the
word CacheID, otherwise Mathematica-compatible applications may try to
use invalid cache data.
For more information on notebooks and Mathematica-compatible
applications, contact Wolfram Research:
web: http://www.wolfram.com
email: info@wolfram.com
phone: +1-217-398-0700 (U.S.)
Notebook reader applications are available free of charge from
Wolfram Research.
***********************************************************************)
(*NotebookFileLineBreakTest
NotebookFileLineBreakTest*)
(*NotebookOptionsPosition[ 56180, 1620]*)
(*NotebookOutlinePosition[ 65623, 1906]*)
(* CellTagsIndexPosition[ 64784, 1869]*)
(*WindowFrame->Normal*)
Notebook[{
Cell[CellGroupData[{
Cell["Bernoulli Bandits", "Title",
CellTags->"0.1"],
Cell["\<\
Methods are presented for using arrays to solve a class of \
statistical decision problems.\
\>", "Subtitle",
CellTags->{"S0.0.1", "1.1"}],
Cell["by Leida M. Bonilla M. and Martin L. Jones", "Subsubtitle",
CellTags->{"S0.0.1", "1.2"}],
Cell[TextData[{
"A bandit problem is a statistical decision problem in which an observer \
must choose one of finitely many random processes to be observed at each of a \
possibly infinite number of stages. The parameters governing the randomness \
in each process are not completely known to the observer in advance, \
so selections may be made for the purpose of gaining information as \
well as for reward. In this article we present a method of solving two\
\[Hyphen]point Bernoulli bandits using arrays. The ideas are extendible to \
distributions with more than two points. \nSuppose that two medicines of \
unknown effectiveness are to be used to treat patients with a particular \
disease. As patients arrive requesting treatment, the practitioner must \
select one of the two medicines to be administered based on the available \
information at that time. The results of each treatment will increase the \
available information thereby aiding the practitioner in treating future \
patients. His objective is to maximize the number of successful treatments. \
Based on the preliminary information on the two medicines, does there exist a \
decision strategy for the selection process at each stage that would be \
optimal in the sense that on average this strategy would maximize the number \
of successful treatments? The answer, although not an unqualified one, is \
yes, and pertains to most situations of interest such as the one considered \
in this article. Problems of this type are called bandit problems, the \
name coming from the many\[Hyphen]armed slot machines in which a player must \
choose an arm at each stage. Bandit problems are statistical decision \
problems in sequential experimental design. The practitioner not only \
performs an experiment (observes a process) at each stage in a sequential \
manner, but also must choose the experiment to be run at that stage. \
Moreover, because of the sequential nature of the design, he may use \
information from earlier experiments to aid in the selection of future ones. \
Throughout the article we will refer to the experiments or processes as ",
StyleBox["arms", "TI"],
" and a bandit problem with two possible experiments at each stage as a ",
StyleBox["two\[Hyphen]armed bandit", "TI"],
", etc. An excellent treatise on the subject is given by [Berry and \
Fristedt 1985]. \nFor simplicity, in this article we will concentrate on the \
special case of two\[Hyphen]armed Bernoulli bandits although the ideas are \
easily extendible to more complicated bandits. The name ",
StyleBox["Bernoulli", "TI"],
" refers to the fact that each experiment or selection results in either a \
",
StyleBox["success", "TI"],
" or a ",
StyleBox["failure", "TI"],
". For the medical trial example we would regard the patients' recovery as \
a success and continued illness or death as a failure. Each success will be \
counted as a ",
StyleBox["reward", "TI"],
" of ",
Cell[BoxData[
\(TraditionalForm\`1\)], "InlineFormula",
CellTags->"S0.0.1"],
" unit and each failure as a reward of ",
Cell[BoxData[
\(TraditionalForm\`0\)], "InlineFormula",
CellTags->"S0.0.1"],
" units. The parameters ",
Cell[BoxData[
\(TraditionalForm\`\[Theta]\_1\)], "InlineFormula",
CellTags->"S0.0.1"],
" and ",
Cell[BoxData[
\(TraditionalForm\`\[Theta]\_2\)], "InlineFormula",
CellTags->"S0.0.1"],
" will be used to represent the probability of a success on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.1"],
" and ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.1"],
" respectively. At least one of these two parameters will be unknown to the \
observer at the beginning of the selection process although he will have some \
prior information regarding the likelihoods of different values for the \
parameters. This prior information will be represented by a distribution ",
Cell[BoxData[
\(TraditionalForm\`G\)], "InlineFormula",
CellTags->"S0.0.1"],
" on the parameter space ",
Cell[BoxData[
\(TraditionalForm\`\([0, 1]\)\[Cross]\([0, 1]\)\)], "InlineFormula",
CellTags->"S0.0.1"],
". In this article we will assume that ",
Cell[BoxData[
\(TraditionalForm\`G\)], "InlineFormula",
CellTags->"S0.0.1"],
" is supported by two points, that is, the observer believes the parameters \
to be either ",
Cell[BoxData[
\(TraditionalForm\`\((\[Theta]\_11, \[Theta]\_21)\)\)],
"InlineFormula",
CellTags->"S0.0.1"],
" or ",
Cell[BoxData[
\(TraditionalForm\`\((\[Theta]\_12, \[Theta]\_22)\)\)],
"InlineFormula",
CellTags->"S0.0.1"],
". Letting ",
Cell[BoxData[
\(TraditionalForm\`p \[Element] \([0, 1]\)\)], "InlineFormula",
CellTags->"S0.0.1"],
" represent the observer's belief that the parameters are equal to ",
Cell[BoxData[
\(TraditionalForm\`\((\[Theta]\_11, \[Theta]\_21)\)\)],
"InlineFormula",
CellTags->"S0.0.1"],
", we may represent ",
Cell[BoxData[
\(TraditionalForm\`G\)], "InlineFormula",
CellTags->"S0.0.1"],
" by ",
Cell[BoxData[
\(TraditionalForm
\`G = p \((\[CenterDot]\[Delta])\)\_\((\[Theta]\_11, \[Theta]\_21)\) +
\((1 - p)\)
\((\[CenterDot]\[Delta])\)\_\((\[Theta]\_12, \[Theta]\_22)\)\)],
"InlineFormula",
CellTags->"S0.0.1"],
". We will further assume that the observer has only finitely many \
selections. This so\[Hyphen]called ",
StyleBox["finite horizon uniform", "TI"],
" case is often represented by ",
Cell[BoxData[
\(TraditionalForm\`A = {1, 1, ... , 1, 0, \( ... }\)\)],
"InlineFormula",
CellTags->"S0.0.1"],
" which indicates that for the first ",
Cell[BoxData[
\(TraditionalForm\`n\)], "InlineFormula",
CellTags->"S0.0.1"],
" selections the rewards are multiplied by a factor of ",
Cell[BoxData[
\(TraditionalForm\`1\)], "InlineFormula",
CellTags->"S0.0.1"],
" and thereafter by a factor of ",
Cell[BoxData[
\(TraditionalForm\`0\)], "InlineFormula",
CellTags->"S0.0.1"],
". The observer may continue making selections after stage ",
Cell[BoxData[
\(TraditionalForm\`n\)], "InlineFormula",
CellTags->"S0.0.1"],
" but receives no further rewards. Once the prior distribution ",
Cell[BoxData[
\(TraditionalForm\`G\)], "InlineFormula",
CellTags->"S0.0.1"],
" and the discount sequence ",
Cell[BoxData[
\(TraditionalForm\`A\)], "InlineFormula",
CellTags->"S0.0.1"],
" have been specified, the problem is referred to as the ",
Cell[BoxData[
\(TraditionalForm\`\((G, A)\)\)], "InlineFormula",
CellTags->"S0.0.1"],
" bandit. A strategy ",
Cell[BoxData[
\(TraditionalForm\`\[Tau]\)], "InlineFormula",
CellTags->"S0.0.1"],
" is a function of the partial histories of the observations which \
indicates the arm to be selected at the next stage. For example, ",
Cell[BoxData[
\(TraditionalForm\`\[Tau] \((\[EmptySet])\) = 2\)], "InlineFormula",
CellTags->"S0.0.1"],
" means that ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.1"],
" should be selected initially while ",
Cell[BoxData[
\(TraditionalForm\`\[Tau] \((0)\) = 1\)], "InlineFormula",
CellTags->"S0.0.1"],
" indicates that ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.1"],
" should be selected after a failure is observed on the first selection. \
Similarly, ",
Cell[BoxData[
\(TraditionalForm\`\[Tau] \((0, 0, 1)\) = 2\)], "InlineFormula",
CellTags->"S0.0.1"],
" indicates that ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.1"],
" should be selected at stage ",
Cell[BoxData[
\(TraditionalForm\`4\)], "InlineFormula",
CellTags->"S0.0.1"],
" after a pattern of two failures and one success in the first three \
observations on the arms indicated at these stages by ",
Cell[BoxData[
\(TraditionalForm\`\[Tau]\)], "InlineFormula",
CellTags->"S0.0.1"],
". The expected gain from using strategy ",
Cell[BoxData[
\(TraditionalForm\`\[Tau]\)], "InlineFormula",
CellTags->"S0.0.1"],
" for the ",
Cell[BoxData[
\(TraditionalForm\`\((G, A)\)\)], "InlineFormula",
CellTags->"S0.0.1"],
" bandit is denoted ",
Cell[BoxData[
\(TraditionalForm\`W \((G, A, \[Tau])\)\)], "InlineFormula",
CellTags->"S0.0.1"],
" (called the ",
StyleBox["worth", "TI"],
") and the supremum of ",
Cell[BoxData[
\(TraditionalForm\`W \((G, A, \[Tau])\)\)], "InlineFormula",
CellTags->"S0.0.1"],
" over all possible strategies is called the ",
StyleBox["value", "TI"],
" of the ",
Cell[BoxData[
\(TraditionalForm\`\((G, A)\)\)], "InlineFormula",
CellTags->"S0.0.1"],
" bandit and is denoted ",
Cell[BoxData[
\(TraditionalForm\`V \((G, A)\)\)], "InlineFormula",
CellTags->"S0.0.1"],
". \nFor Bernoulli bandits with a finite horizon discount sequence there \
always exists an ",
StyleBox["optimal", "TI"],
" strategy ",
Cell[BoxData[
\(TraditionalForm\`\(\[Tau]\^*\)\)], "InlineFormula",
CellTags->"S0.0.1"],
" for which ",
Cell[BoxData[
\(TraditionalForm\`V \((G, A)\) = W \((G, A, \(\[Tau]\^*\))\)\)],
"InlineFormula",
CellTags->"S0.0.1"],
". Moreover, both the value and an optimal strategy can be constructed \
using a procedure known as dynamic programming. The purpose of this article \
is to show how this can be accomplished efficiently using ",
StyleBox["Mathematica", "TI"],
". To begin we will give a short example taken from [Berry and Fristedt \
1985] in which we construct by hand the value and an optimal strategy using \
the dynamic programming algorithm. Afterwards we will show how to implement \
the algorithm in ",
StyleBox["Mathematica", "TI"],
". "
}], "Text",
CellTags->{"S0.0.2", "1.3"}],
Cell[CellGroupData[{
Cell["A Two\[Hyphen]point Finite Horizon Bernoulli Bandit", "Section",
CellTags->{"S0.0.2", "2.1"}],
Cell[TextData[{
"In this example, ",
Cell[BoxData[
\(TraditionalForm
\`G = \((4/5)\) \[Delta]\_\((1/10, 0)\) + \((1/5)\)
\[Delta]\_\((9/10, 1)\)\)], "InlineFormula",
CellTags->"S0.0.2"],
" and ",
Cell[BoxData[
\(TraditionalForm\`A = {1, 1, 0, \( ... }\)\)], "InlineFormula",
CellTags->"S0.0.2"],
". That is, initially with probability ",
Cell[BoxData[
\(TraditionalForm\`4/5\)], "InlineFormula",
CellTags->"S0.0.2"],
" the observer believes that the parameters are both small and with \
probability ",
Cell[BoxData[
\(TraditionalForm\`1/5\)], "InlineFormula",
CellTags->"S0.0.2"],
" that both are large, and he has only two selections. Notice that a \
selection on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.2"],
" gives complete information about the distributions since ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.2"],
" either always yields a success or always yields a failure. However, \
despite this information advantage on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.2"],
", a selection of ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
" offers a positive probability of a success regardless of which of the two \
parameter sets represents the actual parameters. For a problem of this type \
with only two selections we could write down all of the possible strategies \
one by one and check the worth of each until we found an optimal one; \
however, for illustrative purposes we will use our dynamic programming algorithm. We \
solve the problem by working backwards from the penultimate stage. That is, \
faced with the last (in this case the second) selection, the observer will \
have already witnessed one of four possible histories which we will denote ",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
", ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
", ",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\_2\)], "InlineFormula",
CellTags->"S0.0.2"],
", and ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\_2\)], "InlineFormula",
CellTags->"S0.0.2"],
" representing a success or failure (",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\)], "InlineFormula",
CellTags->"S0.0.2"],
" or ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\)], "InlineFormula",
CellTags->"S0.0.2"],
") and the arm which was selected. Using Bayes' Theorem we can calculate \
the updated distributions for each of the four histories. For example, after \
witnessing a success on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
", the new probability that the parameters are ",
Cell[BoxData[
\(TraditionalForm\`\((1/10, 0)\)\)], "InlineFormula",
CellTags->"S0.0.2"],
" is "
}], "Text",
CellTags->{"S0.0.2", "2.2"}],
Cell[BoxData[
FormBox[
RowBox[{GridBox[{
{
StyleBox[\(p\_\(\[Sigma]\_1\)\),
"InlineFormula"],
StyleBox[
\( = P \((
\((\[Theta]\_1, \[Theta]\_2)\) = \((1/10, 0)\)\
\( \[VerticalSeparator] \ \) \[Sigma]\_1)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\( = \(P
\((\[Sigma]\_1\ \( \[VerticalSeparator] \ \)
\[Theta]\_1 = 1/10)\)\[CenterDot]\((4/5)\)\)\/\(P
\((\[Sigma]\_1\ \( \[VerticalSeparator] \ \)
\[Theta]\_1 = 1/10)\)\[CenterDot]\((4/5)\) + P
\((\[Sigma]\_1\ \( \[VerticalSeparator] \ \)
\[Theta]\_1 = 9/10)\)\[CenterDot]\((1/5)\)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\( = \(\((1/10)\)\[CenterDot]\((4/5)\)\)\/\(\((1/10)
\)\[CenterDot]\((4/5)\) +
\((9/10)\)\[CenterDot]\((1/5)\)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\( = \(4\/13\) .
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \((1)\)\),
"InlineFormula"]}
},
ColumnAlignments->{Right, Left}], " "}], TraditionalForm]],
"DisplayFormula",
CellTags->"S0.0.2"],
Cell["\<\
Using similar calculations we can obtain the four updated \
distributions \
\>", "Text",
CellTags->{"S0.0.2", "2.4"}],
Cell[BoxData[
FormBox[GridBox[{
{
StyleBox[\(\(\[Sigma]\_1\) G\),
"InlineFormula"],
StyleBox[
\( = \((4/13)\) \[Delta]\_\((1/10, 0)\) + \((9/13)\)
\[Delta]\_\((9/10, 1)\)\),
"InlineFormula"]},
{
StyleBox[\(\(\[Phi]\_1\) G\),
"InlineFormula"],
StyleBox[
\( = \((36/37)\) \[Delta]\_\((1/10, 0)\) + \((1/37)\)
\[Delta]\_\((9/10, 1)\)\),
"InlineFormula"]},
{
StyleBox[\(\(\[Sigma]\_2\) G\),
"InlineFormula"],
StyleBox[
\( = 0 \((\[CenterDot]\[Delta])\)\_\((1/10, 0)\) + 1
\((\[CenterDot]\[Delta])\)\_\((9/10, 1)\)\),
"InlineFormula"]},
{
StyleBox[\(\(\[Phi]\_2\) G\),
"InlineFormula"],
StyleBox[
\( = 1 \((\[CenterDot]\[Delta])\)\_\((1/10, 0)\) + 0
\((\[CenterDot]\[Delta])\)\_\((9/10, 1)\) . \),
"InlineFormula"]}
},
ColumnAlignments->{Right, Left}], TraditionalForm]], "DisplayFormula",\
CellTags->"S0.0.2"],
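Cell["\<\
The four updated distributions can be checked with a short computation. \
The following is a minimal sketch rather than part of the algorithm \
developed later; the names exTheta, exLikelihood and exPosterior are \
assumptions introduced only for this illustration. Here exTheta[[arm, j]] \
holds the success probability of the arm under the j-th parameter pair, and \
an outcome of 1 or 0 denotes a success or a failure. The last line should \
return the weights 4/13, 36/37, 0 and 1 that the four histories place on the \
pair (1/10, 0).\
\>", "Text"],
Cell["\<\
(* exTheta[[arm, j]]: success probability of the arm under parameter pair j *)
exTheta = {{1/10, 9/10}, {0, 1}};
(* probability of the outcome (1 = success, 0 = failure) under pair j *)
exLikelihood[arm_, j_, outcome_] :=
  If[outcome == 1, exTheta[[arm, j]], 1 - exTheta[[arm, j]]];
(* Bayes update of the weight p on the first parameter pair, as in (1) *)
exPosterior[p_, arm_, outcome_] :=
  p exLikelihood[arm, 1, outcome]/
    (p exLikelihood[arm, 1, outcome] + (1 - p) exLikelihood[arm, 2, outcome]);
(* histories sigma1, phi1, sigma2, phi2, starting from the prior weight 4/5 *)
{exPosterior[4/5, 1, 1], exPosterior[4/5, 1, 0],
 exPosterior[4/5, 2, 1], exPosterior[4/5, 2, 0]}\
\>", "Input"],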
Cell[TextData[{
"We will represent the observer's one remaining selection at this stage by \
",
Cell[BoxData[
\(TraditionalForm\`A\^\((1)\) = {1, 0, 0, \( ... }\)\)],
"InlineFormula",
CellTags->"S0.0.2"],
", the original discount sequence with the first term deleted. For each \
history we calculate the expected gain on the last observation, which is \
the maximum (denoted below by `",
Cell[BoxData[
\(TraditionalForm\`\[Vee]\)], "InlineFormula",
CellTags->"S0.0.2"],
"') expected gain between the two arms. Using ",
Cell[BoxData[
\(TraditionalForm\`V \((\[CenterDot], A\^\((1)\))\)\)],
"InlineFormula",
CellTags->"S0.0.2"],
" to represent this ",
StyleBox["partial history value", "TI"],
" we have "
}], "Text",
CellTags->{"S0.0.2", "2.6"}],
Cell[BoxData[
FormBox[GridBox[{
{
StyleBox[\(V \((\(\[Sigma]\_1\) G, A\^\((1)\))\)\),
"InlineFormula"],
StyleBox[
\( = \((4\/13\[CenterDot]1\/10 + 9\/13\[CenterDot]9\/10)
\)\[Vee]\((4\/13\[CenterDot]0 + 9\/13\[CenterDot]1)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
RowBox[{"=", \(17\/26\), "\[Vee]",
FractionBox[
StyleBox["\<\"9\"\>",
"TB"],
StyleBox["\<\"13\"\>",
"TB"]], "=", \(9\/13\)}],
"InlineFormula"]},
{
StyleBox[\(V \((\(\[Phi]\_1\) G, A\^\((1)\))\)\),
"InlineFormula"],
StyleBox[
\( = \((36\/37\[CenterDot]1\/10 + 1\/37\[CenterDot]9\/10)
\)\[Vee]\((36\/37\[CenterDot]0 + 1\/37\[CenterDot]1)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
RowBox[{"=",
FractionBox[
StyleBox["\<\"9\"\>",
"TB"],
StyleBox["\<\"74\"\>",
"TB"]], "\[Vee]", \(1\/37\), "=", \(9\/74\)}],
"InlineFormula"]},
{
StyleBox[\(V \((\(\[Sigma]\_2\) G, A\^\((1)\))\)\),
"InlineFormula"],
StyleBox[
RowBox[{
"=", \((1\[CenterDot]9\/10)\), "\[Vee]", \((1\[CenterDot]1)\),
"=", \(9\/10\), "\[Vee]",
StyleBox["\<\"1\"\>",
"TB"]}],
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[\( = 1\),
"InlineFormula"]},
{
StyleBox[\(V \((\(\[Phi]\_2\) G, A\^\((1)\))\)\),
"InlineFormula"],
StyleBox[
RowBox[{"=",
RowBox[{\((1\[CenterDot]1\/10)\), "\[Vee]",
RowBox[{\((1\[CenterDot]0)\), "=",
RowBox[{
RowBox[{
FractionBox[
StyleBox["\<\"1\"\>",
"TB"],
StyleBox["\<\"10\"\>",
"TB"]], "\[Vee]", "0"}], "=",
\(\(1\/10\) .
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
\((2)\)\)}]}]}]}],
"InlineFormula"]}
},
ColumnAlignments->{Right, Left}], TraditionalForm]], "DisplayFormula",\
CellTags->"S0.0.2"],
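Cell["\<\
The partial history values in (2) can be reproduced in a few lines. This is \
a sketch under stated assumptions, not the array method of the next section: \
the names exTheta, exArmGain and exLastValue are hypothetical, and each \
updated distribution is represented only by the weight it places on the pair \
(1/10, 0). The last line should return 9/13, 9/74, 1 and 1/10.\
\>", "Text"],
Cell["\<\
(* exTheta[[arm, j]]: success probability of the arm under parameter pair j *)
exTheta = {{1/10, 9/10}, {0, 1}};
(* expected gain of one selection on the arm when the weight on pair 1 is p *)
exArmGain[p_, arm_] := p exTheta[[arm, 1]] + (1 - p) exTheta[[arm, 2]];
(* with one selection remaining, take the better of the two arms *)
exLastValue[p_] := Max[exArmGain[p, 1], exArmGain[p, 2]];
(* weights after the histories sigma1, phi1, sigma2, phi2 *)
Map[exLastValue, {4/13, 36/37, 0, 1}]\
\>", "Input"],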
Cell[TextData[{
"We can see from the above calculations that after observing a success on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
" it is wise to switch to ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.2"],
" for the last selection, whereas after observing a failure on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
" it is wise to continue with ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
". Similar interpretations hold for initial selections on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.2"],
". Now that we know how to select at the penultimate stage we may continue \
to work backwards and average these stage two possibilities over the \
distribution ",
Cell[BoxData[
\(TraditionalForm\`G\)], "InlineFormula",
CellTags->"S0.0.2"],
" to obtain an optimal initial selection. The value obtained is "
}], "Text",
CellTags->{"S0.0.2", "2.8"}],
Cell[BoxData[
FormBox[GridBox[{
{
StyleBox[\(V \((G, A)\)\),
"InlineFormula"],
StyleBox[
\( = \(4\/5\)
\((\(1\/10\) \((1 + 9\/13)\) + \(9\/10\) \((0 + 9\/74)\))
\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\(1\/5\)
\((\(9\/10\) \((1 + 9\/13)\) + \(1\/10\) \((0 + 9\/74)\))
\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(\[Vee]\(4\/5\) \((0 \((1 + 1)\) + 1 \((0 + 1\/10)\))\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[\(+\(1\/5\) \((1 \((1 + 1)\) + 0 \((0 + 1\/10)\))\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
RowBox[{
RowBox[{"=",
RowBox[{
RowBox[{
FractionBox[
StyleBox["\<\"53\"\>",
"TB"],
StyleBox["\<\"100\"\>",
"TB"]], "\[Vee]", \(12\/25\)}], "=",
\(53\/100\)}]}], ",",
" \
", \((3)\)}],
"InlineFormula"]}
},
ColumnAlignments->{Right, Left}], TraditionalForm]], "DisplayFormula",\
CellTags->"S0.0.2"],
Cell[TextData[{
"and we can see that the optimal initial selection is ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
". To explain the calculations in (3), consider the term "
}], "Text",
CellTags->{"S0.0.2", "2.10"}],
Cell[BoxData[
\(TraditionalForm
\`\(4\/5\) \((\(1\/10\) \((1 + 9\/13)\) + \(9\/10\) \((0 + 9\/74)\))\) .
\)], "DisplayFormula",
CellTags->"S0.0.2"],
Cell[TextData[{
"With probability ",
Cell[BoxData[
\(TraditionalForm\`4/5\)], "InlineFormula",
CellTags->"S0.0.2"],
" the parameters will be ",
Cell[BoxData[
\(TraditionalForm\`\((1/10, 0)\)\)], "InlineFormula",
CellTags->"S0.0.2"],
", so that a selection of ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
" yields a success with probability ",
Cell[BoxData[
\(TraditionalForm\`1/10\)], "InlineFormula",
CellTags->"S0.0.2"],
" and a failure with probability ",
Cell[BoxData[
\(TraditionalForm\`9/10\)], "InlineFormula",
CellTags->"S0.0.2"],
". Therefore, if a success is observed, a reward of ",
Cell[BoxData[
\(TraditionalForm\`1\)], "InlineFormula",
CellTags->"S0.0.2"],
" will be earned plus the expected gain on the second selection which from \
(2) is seen to be on the average ",
Cell[BoxData[
\(TraditionalForm\`9/13\)], "InlineFormula",
CellTags->"S0.0.2"],
". Similarly a failure on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.2"],
" will earn a reward of ",
Cell[BoxData[
\(TraditionalForm\`0\)], "InlineFormula",
CellTags->"S0.0.2"],
" plus, from (2), an expected gain on the second selection of ",
Cell[BoxData[
\(TraditionalForm\`9/74\)], "InlineFormula",
CellTags->"S0.0.2"],
". An optimal strategy may be pieced together starting from the optimal \
initial selection forward to yield ",
Cell[BoxData[
\(TraditionalForm
\`\[Tau] \((\[EmptySet])\) = 1, \[Tau] \((0)\) = 1, \[Tau] \((1)\) = 2
\)], "InlineFormula",
CellTags->"S0.0.2"],
". In the next section we will demonstrate how the dynamic programming \
method can be implemented in an efficient manner using arrays. "
}], "Text",
CellTags->{"S0.0.3", "2.12"}]
}, Open ]],
Cell[CellGroupData[{
Cell["An Algorithm for Solving Two\[Hyphen]point Bernoulli Bandits", "Section",
CellTags->{"S0.0.3", "3.1"}],
Cell[TextData[{
"The equations (2) and (3) of the previous example are known as the \
fundamental equations of dynamic programming, adjusted here to the two\
\[Hyphen]point Bernoulli bandit setting. Let * represent a partial history of \
observations (e.g. ",
Cell[BoxData[
\(TraditionalForm\`\(\[Sigma]\_1\) \(\[Phi]\_1\) \[Sigma]\_2\)],
"InlineFormula",
CellTags->"S0.0.3"],
") and ",
Cell[BoxData[
\(TraditionalForm\`\(\[Sigma]\_2*\)\)], "InlineFormula",
CellTags->"S0.0.3"],
", for example, represent the same partial history followed by a success on \
",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.3"],
". Then using ",
Cell[BoxData[
\(TraditionalForm\`A\^\((k)\)\)], "InlineFormula",
CellTags->"S0.0.3"],
" to denote the original discount sequence with the first ",
Cell[BoxData[
\(TraditionalForm\`k\)], "InlineFormula",
CellTags->"S0.0.3"],
" terms deleted and ",
Cell[BoxData[
\(TraditionalForm\`\(q\_*\) = 1 - \(p\_*\)\)], "InlineFormula",
CellTags->"S0.0.3"],
" we can represent the general form of the fundamental equation of dynamic \
programming as \n"
}], "Text",
CellTags->{"S0.0.3", "3.2"}],
Cell[BoxData[
FormBox[GridBox[{
{
StyleBox[\(V \((*G, A\^\((k)\))\)\),
"InlineFormula"],
StyleBox[
\( = \([\(p\_*\)
\((\(\[Theta]\_11\)
\((1 + V \((\[Sigma]\_1*G, A\^\((\(k\ \) + 1)\))\))
\)\)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\((1 - \[Theta]\_11)\)
\((0 + V \((\[Phi]\_1*G, A\^\((\(k\ \) + 1)\))\))\))\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\(q\_*\)
\((\((\(\[Theta]\_12\)
\((1 + V \((\[Sigma]\_1*G, A\^\((\(k\ \) + 1)\))\))
\)\)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\((1 - \[Theta]\_12)\)
\((0 + V \((\[Phi]\_1*G, A\^\((\(k\ \) + 1)\))\))\))]\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(\[Vee]\([
\(p\_*\)
\((\(\[Theta]\_21\)
\((1 + V \((\[Sigma]\_2*G, A\^\((\(k\ \) + 1)\))\))
\)\)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\((1 - \[Theta]\_21)\)
\((0 + V \((\[Phi]\_2*G, A\^\((\(k\ \) + 1)\))\))\))\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\(q\_*\)
\((\((\(\[Theta]\_22\)
\((1 + V \((\[Sigma]\_2*G, A\^\((\(k\ \) + 1)\))\))
\)\)\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(\(\(\(+\((1 - \[Theta]\_22)\)\)
\((0 + V \((\[Phi]\_2*G, A\^\((\(k\ \) + 1)\))\))\))\)]
\) . \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
\ \ \ \ \ \ \ \ \ \ \ \((4)\)\),
"InlineFormula"]}
},
ColumnAlignments->{Right, Left}], TraditionalForm]], "DisplayFormula",\
CellTags->"S0.0.3"],
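Cell["\<\
Equation (4) can also be read directly as a recursion on the current weight p \
and the number n of selections remaining. The following is a minimal sketch \
of that reading, not the array algorithm developed in this section; the names \
exTheta, exSuccess, exArmWorth and exValue are assumptions introduced only \
for illustration. Branches of probability zero are skipped so that no updated \
weight of the form 0/0 is ever formed. For the example of the previous \
section the last line should return the two arm worths 53/100 and 12/25 \
followed by the value 53/100, in agreement with (3). Because every partial \
history is visited separately, the work grows quickly with the horizon.\
\>", "Text"],
Cell["\<\
(* exTheta[[arm, j]]: success probability of the arm under parameter pair j *)
exTheta = {{1/10, 9/10}, {0, 1}};
(* marginal probability of a success on the arm when the weight on pair 1 is p *)
exSuccess[p_, arm_] := p exTheta[[arm, 1]] + (1 - p) exTheta[[arm, 2]];
(* one bracket of (4): expected gain of selecting the arm with n selections left *)
exArmWorth[p_, n_, arm_] := Module[{s = exSuccess[p, arm]},
  If[s == 0, 0, s (1 + exValue[p exTheta[[arm, 1]]/s, n - 1])] +
  If[s == 1, 0, (1 - s) exValue[p (1 - exTheta[[arm, 1]])/(1 - s), n - 1]]];
(* no selections left: no further reward *)
exValue[p_, 0] := 0;
(* the maximum over the two arms, written with the vee symbol in (4) *)
exValue[p_, n_] := Max[exArmWorth[p, n, 1], exArmWorth[p, n, 2]];
{exArmWorth[4/5, 2, 1], exArmWorth[4/5, 2, 2], exValue[4/5, 2]}\
\>", "Input"],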
Cell[TextData[{
"At the penultimate stage, the partial history values in the \
right\[Hyphen]hand side of (4) are equal to zero since there is no further \
chance of rewards. Thus (4) simplifies at that stage as was seen in the \
example in equations (2). The difficulty in programming lies in the rapid \
growth of the size of the problem as the horizon increases. In the horizon \
two example of the previous section, we calculated four partial history \
values ",
Cell[BoxData[
\(TraditionalForm\`V \((\(\[Sigma]\_1\) G, A\^\((1)\))\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
", ",
Cell[BoxData[
\(TraditionalForm\`V \((\(\[Phi]\_1\) G, A\^\((1)\))\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
", ",
Cell[BoxData[
\(TraditionalForm\`V \((\(\[Sigma]\_2\) G, A\^\((1)\))\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
", and ",
Cell[BoxData[
\(TraditionalForm\`V \((\(\[Phi]\_2\) G, A\^\((1)\))\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
" which were used to then calculate ",
Cell[BoxData[
\(TraditionalForm\`V \((G, A)\)\)], "InlineFormula",
CellTags->"S0.0.3"],
". In a horizon three problem we would need to calculate sixteen partial \
history values at the penultimate stage. To avoid an unwieldy number of loops \
we must find some method of storing the partial history values in arrays and \
then manipulate the arrays in such a way as to perform the operations in (4). \
In this algorithm there is a single loop. Each pass through the loop takes an \
array containing ",
Cell[BoxData[
\(TraditionalForm\`4\^k\)], "InlineFormula",
CellTags->"S0.0.3"],
" partial history values (for some ",
Cell[BoxData[
\(TraditionalForm\`k\)], "InlineFormula",
CellTags->"S0.0.3"],
") and applies equation (4) in a sequence of manipulations to create an \
array of ",
Cell[BoxData[
\(TraditionalForm\`4\^\(k - 1\)\)], "InlineFormula",
CellTags->"S0.0.3"],
" partial history values. We will demonstrate how this may be done by \
showing the manipulations required to create the four partial history values \
in (2). For simplicity we will drop the ",
Cell[BoxData[
\(TraditionalForm\`A\^\((k)\)\)], "InlineFormula",
CellTags->"S0.0.3"],
" notation and write ",
Cell[BoxData[
\(TraditionalForm\`V\_\(\[Phi]\_1\)\)], "InlineFormula",
CellTags->"S0.0.3"],
" for ",
Cell[BoxData[
\(TraditionalForm\`V \((\(\[Phi]\_1\) G, A\^\((1)\))\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
", and write ",
Cell[BoxData[
\(TraditionalForm\`V\_\(\(\[Sigma]\_1\) \[Phi]\_1\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
" for ",
Cell[BoxData[
\(TraditionalForm
\`V \((\(\[Sigma]\_1\) \(\[Phi]\_1\) G, A\^\((2)\))\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
", where ",
Cell[BoxData[
\(TraditionalForm\`\(\[Sigma]\_1\) \[Phi]\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
" is the history that begins with a failure on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
" followed by a success on ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
". We will assume for now that we are able to calculate arrays of the \
desired dimension containing the conditional probabilities in (1) and then \
show later how these may be generated. The sequence of manipulations begins \
with an array which we shall call ",
StyleBox["value", "TI"],
" which has been generated by earlier passes through the loop and now \
contains the partial history values for the sixteen partial histories of \
length two. In what follows, we pictorially represent the output of the ",
StyleBox["Mathematica", "TI"],
" manipulations required to produce the first term (",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
") in the maximum on the right\[Hyphen]hand side of (4). The ",
StyleBox["Mathematica", "TI"],
" commands appear in ",
StyleBox["bold", "MB"],
" font and the pictorial output appears after each command. "
}], "Text",
CellTags->{"S0.0.3", "3.4"}],
Cell[BoxData[
RowBox[{"value", " ", "=", " ",
RowBox[{"(", GridBox[{
{
RowBox[{"(", GridBox[{
{\(V\_\(\(\[Sigma]\_1\) \[Sigma]\_1\)\),
\(V\_\(\(\[Sigma]\_1\) \[Phi]\_1\)\)},
{\(V\_\(\(\[Sigma]\_1\) \[Sigma]\_2\)\),
\(V\_\(\(\[Sigma]\_1\) \[Phi]\_2\)\)}
}], ")"}],
RowBox[{"(", GridBox[{
{\(V\_\(\(\[Phi]\_1\) \[Sigma]\_1\)\),
\(V\_\(\(\[Phi]\_1\) \[Phi]\_1\)\)},
{\(V\_\(\(\[Phi]\_1\) \[Sigma]\_2\)\),
\(V\_\(\(\[Phi]\_1\) \[Phi]\_2\)\)}
}], ")"}]},
{
RowBox[{"(", GridBox[{
{\(V\_\(\(\[Sigma]\_2\) \[Sigma]\_1\)\),
\(V\_\(\(\[Sigma]\_2\) \[Phi]\_1\)\)},
{\(V\_\(\(\[Sigma]\_2\) \[Sigma]\_2\)\),
\(V\_\(\(\[Sigma]\_2\) \[Phi]\_2\)\)}
}], ")"}],
RowBox[{"(", GridBox[{
{\(V\_\(\(\[Phi]\_2\) \[Sigma]\_1\)\),
\(V\_\(\(\[Phi]\_2\) \[Phi]\_1\)\)},
{\(V\_\(\(\[Phi]\_2\) \[Sigma]\_2\)\),
\(V\_\(\(\[Phi]\_2\) \[Phi]\_2\)\)}
}], ")"}]}
}], ")"}]}]], "Input"],
Cell[CellGroupData[{
Cell[BoxData[
\(valueplus\ = \ Transpose[{1, 0}\ + \ Transpose[value]]\)], "Input"],
Cell[BoxData[
FormBox[
InterpretationBox[
RowBox[{"(", GridBox[{
{
RowBox[{"(", GridBox[{
{\(V\_\(\[Sigma]\_1\%2\) + 1\),
\(V\_\(\[Sigma]\_1\ \[Phi]\_1\) + 1\)},
{\(V\_\(\[Sigma]\_1\ \[Sigma]\_2\) + 1\),
\(V\_\(\[Sigma]\_1\ \[Phi]\_2\) + 1\)}
}], ")"}],
RowBox[{"(", GridBox[{
{\(V\_\(\[Sigma]\_1\ \[Phi]\_1\)\),
\(V\_\(\[Phi]\_1\%2\)\)},
{\(V\_\(\[Sigma]\_2\ \[Phi]\_1\)\),
\(V\_\(\[Phi]\_1\ \[Phi]\_2\)\)}
}], ")"}]},
{
RowBox[{"(", GridBox[{
{\(V\_\(\[Sigma]\_1\ \[Sigma]\_2\) + 1\),
\(V\_\(\[Sigma]\_2\ \[Phi]\_1\) + 1\)},
{\(V\_\(\[Sigma]\_2\%2\) + 1\),
\(V\_\(\[Sigma]\_2\ \[Phi]\_2\) + 1\)}
}], ")"}],
RowBox[{"(", GridBox[{
{\(V\_\(\[Sigma]\_1\ \[Phi]\_2\)\),
\(V\_\(\[Phi]\_1\ \[Phi]\_2\)\)},
{\(V\_\(\[Sigma]\_2\ \[Phi]\_2\)\),
\(V\_\(\[Phi]\_2\%2\)\)}
}], ")"}]}
}], ")"}],
MatrixForm[ {{{{
Plus[ 1,
Subscript[ V,
Power[
Subscript[ \[Sigma], 1], 2]]],
Plus[ 1,
Subscript[ V,
Times[
Subscript[ \[Sigma], 1],
Subscript[ \[Phi], 1]]]]}, {
Plus[ 1,
Subscript[ V,
Times[
Subscript[ \[Sigma], 1],
Subscript[ \[Sigma], 2]]]],
Plus[ 1,
Subscript[ V,
Times[
Subscript[ \[Sigma], 1],
Subscript[ \[Phi], 2]]]]}}, {{
Subscript[ V,
Times[
Subscript[ \[Sigma], 1],
Subscript[ \[Phi], 1]]],
Subscript[ V,
Power[
Subscript[ \[Phi], 1], 2]]}, {
Subscript[ V,
Times[
Subscript[ \[Sigma], 2],
Subscript[ \[Phi], 1]]],
Subscript[ V,
Times[
Subscript[ \[Phi], 1],
Subscript[ \[Phi], 2]]]}}}, {{{
Plus[ 1,
Subscript[ V,
Times[
Subscript[ \[Sigma], 1],
Subscript[ \[Sigma], 2]]]],
Plus[ 1,
Subscript[ V,
Times[
Subscript[ \[Sigma], 2],
Subscript[ \[Phi], 1]]]]}, {
Plus[ 1,
Subscript[ V,
Power[
Subscript[ \[Sigma], 2], 2]]],
Plus[ 1,
Subscript[ V,
Times[
Subscript[ \[Sigma], 2],
Subscript[ \[Phi], 2]]]]}}, {{
Subscript[ V,
Times[
Subscript[ \[Sigma], 1],
Subscript[ \[Phi], 2]]],
Subscript[ V,
Times[
Subscript[ \[Phi], 1],
Subscript[ \[Phi], 2]]]}, {
Subscript[ V,
Times[
Subscript[ \[Sigma], 2],
Subscript[ \[Phi], 2]]],
Subscript[ V,
Power[
Subscript[ \[Phi], 2], 2]]}}}}, TableDepth -> 2]],
TraditionalForm]], "Output"]
}, Open ]],
Cell[CellGroupData[{
Cell[BoxData[
\(temp1\ = \
Dot[{\[Theta]\_11, \((1 - \[Theta]\_11)\)}, \ valueplus[\([1]\)]]\)],
"Input"],
Cell[BoxData[
FormBox[
RowBox[{"(", GridBox[{
{
\(V\_\(\[Sigma]\_1\ \[Phi]\_1\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\%2\) + 1)\)\ \[Theta]\_11\),
\(V\_\(\[Phi]\_1\%2\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\ \[Phi]\_1\) + 1)\)\ \[Theta]\_11\)},
{
\(V\_\(\[Sigma]\_2\ \[Phi]\_1\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\ \[Sigma]\_2\) + 1)\)\ \[Theta]\_11\),
\(V\_\(\[Phi]\_1\ \[Phi]\_2\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\ \[Phi]\_2\) + 1)\)\ \[Theta]\_11\)}
}], ")"}], TraditionalForm]], "Output"]
}, Open ]],
Cell[CellGroupData[{
Cell[BoxData[
\(temp2 = Times[probability[2], \ temp1]\)], "Input"],
Cell[BoxData[
FormBox[
RowBox[{"(", GridBox[{
{
\(\(probability(2)\)\
\((V\_\(\[Sigma]\_1\ \[Phi]\_1\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\%2\) + 1)\)\ \[Theta]\_11)\)\),
\(\(probability(2)\)\
\((V\_\(\[Phi]\_1\%2\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\ \[Phi]\_1\) + 1)\)\ \[Theta]\_11)
\)\)},
{
\(\(probability(2)\)\
\((V\_\(\[Sigma]\_2\ \[Phi]\_1\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\ \[Sigma]\_2\) + 1)\)\ \[Theta]\_11)
\)\), \(\(probability(2)\)\
\((V\_\(\[Phi]\_1\ \[Phi]\_2\)\ \((1 - \[Theta]\_11)\) +
\((V\_\(\[Sigma]\_1\ \[Phi]\_2\) + 1)\)\ \[Theta]\_11)
\)\)}
}], ")"}], TraditionalForm]], "Output"]
}, Open ]],
Cell[BoxData[
RowBox[{\(probability[2]\), " ", "=", " ",
RowBox[{"(", GridBox[{
{\(p\_\(\[Sigma]\_1\)\), \(p\_\(\[Phi]\_1\)\)},
{\(p\_\(\[Sigma]\_2\)\), \(p\_\(\[Phi]\_2\)\)}
}], ")"}]}]], "Input"],
Cell["Similarly, letting ", "Text",
CellTags->{"S0.0.3", "3.6"}],
Cell[BoxData[
RowBox[{\(minusprobability[2]\), " ", "=", " ",
RowBox[{"(", GridBox[{
{\(1 - p\_\(\[Sigma]\_1\)\), \(1 - p\_\(\[Phi]\_1\)\)},
{\(1 - p\_\(\[Sigma]\_2\)\), \(1 - p\_\(\[Phi]\_2\)\)}
}], ")"}]}]], "Input"],
Cell["the manipulations ", "Text",
CellTags->{"S0.0.3", "3.7"}],
Cell[BoxData[
\(\(temp3 = \ {\[Theta]\_12, \ 1 - \[Theta]\_12} . valueplus[\([1]\)];
\)\)], "Input"],
Cell[BoxData[
\(\(temp4\ = \ minusprobability[2]\ *\ temp3; \)\)], "Input"],
Cell[TextData[{
"yield with ",
Cell[BoxData[
\(TraditionalForm\`\(q\_*\) = 1 - \(p\_*\)\)], "InlineFormula",
CellTags->"S0.0.3"],
" \nFinally, "
}], "Text",
CellTags->{"S0.0.3", "3.10"}],
Cell[BoxData[
\(arm1\ = \ temp2\ + \ temp4\)], "Input"],
Cell["\<\
where, for example, \
\>", "Text"],
Cell[BoxData[
FormBox[GridBox[{
{
StyleBox[\(\(\((a r m)\)\_1\) \[Phi]\_1\),
"InlineFormula"],
StyleBox[
\( = \(p\_\(\[Phi]\_1\)\)
\((\(\[Theta]\_11\)
\((1 + V\_\(\(\[Sigma]\_1\) \[Phi]\_1\))\) +
\((1 - \[Theta]\_11)\) V\_\(\(\[Phi]\_1\) \[Phi]\_1\))
\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\((1 - p\_\(\[Phi]\_1\))\)
\((\(\[Theta]\_12\)
\((1 + V\_\(\(\[Sigma]\_1\) \[Phi]\_1\))\) +
\((1 - \[Theta]\_12)\) V\_\(\(\[Phi]\_1\) \[Phi]\_1\))
\) . \),
"InlineFormula"]}
},
ColumnAlignments->{Right, Left}], TraditionalForm]], "DisplayFormula",\
CellTags->"S0.0.3"],
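Cell["\<\
As a rough numerical check of these manipulations, the following sketch \
evaluates the first-arm array in the final stage, where the value array is \
all zeros. The numbers and the names th11, th12, post, v, vplus, and a1 are \
hypothetical and chosen only for illustration; post stands in for \
probability[2].\
\>", "Text"],
Cell["\<\
(* hypothetical success probabilities of arm 1 under the two points *)
th11 = 3/4; th12 = 1/4;
(* hypothetical posterior array standing in for probability[2] *)
post = {{3/4, 1/4}, {1/4, 3/4}};
(* terminal value array: no selections remain, so every entry is 0 *)
v = Table[0, {2}, {2}, {2}, {2}];
(* add 1 to the success blocks, as in the definition of valueplus *)
vplus = Transpose[{1, 0} + Transpose[v]];
a1 = post*({th11, 1 - th11} . vplus[[1]]) +
  (1 - post)*({th12, 1 - th12} . vplus[[1]])
(* gives {{5/8, 3/8}, {3/8, 5/8}} *)\
\>", "Input"],
Cell["\<\
With one selection remaining, each entry is simply the posterior-weighted \
probability of a success on the first arm.\
\>", "Text"],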
Cell[TextData[{
"The array ",
StyleBox["arm1", "MR"],
" represents the first term in the maximum of equation (4) for each of the \
partial histories ",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
", ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
", ",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\_2\)], "InlineFormula",
CellTags->"S0.0.3"],
", and ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\_2\)], "InlineFormula",
CellTags->"S0.0.3"],
". The manipulations on the lower half of the ",
StyleBox["valueplus", "MR"],
" array "
}], "Text",
CellTags->{"S0.0.3", "3.12"}],
Cell[BoxData[
\(\(temp5 = {\[Theta]\_21, 1 - \[Theta]\_21} .
valueplus\[LeftDoubleBracket]2\[RightDoubleBracket]; \)\)], "Input",
CellTags->"S0.0.3"],
Cell["temp6 = probability[2]*temp5;", "Input",
CellTags->"S0.0.3"],
Cell[BoxData[
\(\(temp7 = {\[Theta]\_22, 1 - \[Theta]\_22} .
valueplus\[LeftDoubleBracket]2\[RightDoubleBracket]; \)\)], "Input",
CellTags->"S0.0.3"],
Cell["temp8 = minusprobability[2]*temp7;", "Input",
CellTags->"S0.0.3"],
Cell["arm2 = temp6 + temp8", "Input",
CellTags->"S0.0.3"],
Cell["where, for example, ", "Text",
CellTags->{"S0.0.3", "3.13"}],
Cell[BoxData[
FormBox[GridBox[{
{
StyleBox[\(\(\((a r m)\)\_2\) \[Phi]\_1\),
"InlineFormula"],
StyleBox[
\( = \(p\_\(\[Phi]\_1\)\)
\((\(\[Theta]\_21\)
\((1 + V\_\(\(\[Sigma]\_2\) \[Phi]\_1\))\) +
\((1 - \[Theta]\_21)\) V\_\(\(\[Phi]\_2\) \[Phi]\_1\))
\)\),
"InlineFormula"]},
{
StyleBox["\[Null]",
"InlineFormula"],
StyleBox[
\(+\((1 - p\_\(\[Phi]\_1\))\)
\((\(\[Theta]\_22\)
\((1 + V\_\(\(\[Sigma]\_2\) \[Phi]\_1\))\) +
\((1 - \[Theta]\_22)\) V\_\(\(\[Phi]\_2\) \[Phi]\_1\))
\) . \),
"InlineFormula"]}
},
ColumnAlignments->{Right, Left}], TraditionalForm]], "DisplayFormula",\
CellTags->"S0.0.3"],
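Cell["\<\
The two arm arrays are then compared entry by entry. As a minimal sketch, \
with hypothetical 2 x 2 arrays a1 and a2 standing in for arm1 and arm2, \
threading Max and Order at level 2 behaves as follows.\
\>", "Text"],
Cell["\<\
(* hypothetical stand-ins for arm1 and arm2 *)
a1 = {{5/8, 3/8}, {1/2, 5/8}};
a2 = {{3/8, 5/8}, {1/2, 3/8}};
(* entrywise maximum of the two arrays *)
MapThread[Max, {a1, a2}, 2]
(* gives {{5/8, 5/8}, {1/2, 5/8}} *)
(* entrywise comparison: -1 if a1 is larger, 1 if a2 is larger, 0 if tied *)
MapThread[Order, {a1, a2}, 2]
(* gives {{-1, 1}, {0, -1}} *)\
\>", "Input"],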
Cell[TextData[{
"The new ",
StyleBox["value", "MR"],
" array is created by the commands "
}], "Text",
CellTags->{"S0.0.3", "3.15"}],
Cell["\<\
x = Dimensions[Dimensions[arm1]];
value = MapThread[Max, {arm1, arm2}, x[[1]]]\
\>", "Input",
CellTags->"S0.0.3"],
Cell[TextData[{
"representing equation (4) for each of the partial histories ",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
", ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
", ",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\_2\)], "InlineFormula",
CellTags->"S0.0.3"],
", and ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\_2\)], "InlineFormula",
CellTags->"S0.0.3"],
". Since the value array at any stage is created by threading the ",
StyleBox["Max", "MR"],
" command across the two arrays ",
StyleBox["arm1", "MR"],
" and ",
StyleBox["arm2", "MR"],
", it is fairly straightforward to save the information as to which arm \
produced the maximum for each partial history. This information can then be \
used to reconstruct an optimal strategy. For example, we can produce an array \
",
StyleBox["strategy", "MR"],
" consisting of ",
Cell[BoxData[
\(TraditionalForm\`\(-1\)\)], "InlineFormula",
CellTags->"S0.0.3"],
"'s, ",
Cell[BoxData[
\(TraditionalForm\`0\)], "InlineFormula",
CellTags->"S0.0.3"],
"'s, and ",
Cell[BoxData[
\(TraditionalForm\`1\)], "InlineFormula",
CellTags->"S0.0.3"],
"'s for each stage using the command "
}], "Text",
CellTags->{"S0.0.3", "3.18"}],
Cell["\<\
strategy[stage] = MapThread[Order, {arm1, arm2}, 2*(stage - 1)]\
\>", "Input",
CellTags->"S0.0.3"],
Cell["for any stage greater than one, and ", "Text",
CellTags->{"S0.0.3", "3.19"}],
Cell["strategy[1] = Order[arm1, arm2]", "Input",
CellTags->"S0.0.3"],
Cell[TextData[{
"otherwise. A ",
Cell[BoxData[
\(TraditionalForm\`\(-1\)\)], "InlineFormula",
CellTags->"S0.0.3"],
" indicates that ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
" produced the maximum, a ",
Cell[BoxData[
\(TraditionalForm\`1\)], "InlineFormula",
CellTags->"S0.0.3"],
" indicates that ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.3"],
" produced the maximum, and ",
Cell[BoxData[
\(TraditionalForm\`0\)], "InlineFormula",
CellTags->"S0.0.3"],
" indicates that the two arms give equal values, so that either may be taken \
as an optimal selection. In the example from the previous section, the \
strategy arrays are as follows: "
}], "Text",
CellTags->{"S0.0.3", "3.20"}],
Cell[BoxData[
\(TraditionalForm\`\(s t r a t e g y\)\ \([1]\)\ = \(-1\)\)],
"DisplayFormula",
CellTags->"S0.0.3"],
Cell["and ", "Text",
CellTags->{"S0.0.3", "3.22"}],
Cell[BoxData[
\(TraditionalForm
\`\(s t r a t e g y\)\ \([2]\)\ = \((\[Null]\ \ )\)\)],
"DisplayFormula",
CellTags->"S0.0.3"],
Cell[TextData[{
"indicating that initially we should choose ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
" and then at stage two we should choose ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_2\)], "InlineFormula",
CellTags->"S0.0.3"],
" if a success was observed initially (the ",
Cell[BoxData[
\(TraditionalForm\`\[Sigma]\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
" position in the upper left\[Hyphen]hand corner of the ",
StyleBox["strategy", "TI"],
"[2] array) and should continue with ",
Cell[BoxData[
\(TraditionalForm\`\((a r m)\)\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
" if a failure was observed (the ",
Cell[BoxData[
\(TraditionalForm\`\[Phi]\_1\)], "InlineFormula",
CellTags->"S0.0.3"],
" position in the upper right\[Hyphen]hand corner of the ",
StyleBox["strategy", "TI"],
"[2] array). \nIn the initial pass through the main loop, the beginning ",
StyleBox["value", "MR"],
" array is an array of the appropriate dimension containing all zeros. For \
any partial history of length equal to the horizon, the expected gain to the \
observer from this point onward is zero since no more selections are \
available. In the example from the previous section the initial ",
StyleBox["value", "MR"],
" array is given by "
}], "Text",
CellTags->{"S0.0.3", "3.24"}],
Cell[BoxData[
\(TraditionalForm\`v a l u e\ \ = \((\[Null]\ \ )\) . \)],
"DisplayFormula",
CellTags->"S0.0.3"],
Cell["\<\
Arrays of this type for any horizon can be generated recursively as \
follows: \
\>", "Text",
CellTags->{"S0.0.3", "3.26"}],
Cell["\<\
initial[1] = 0;
initial[n_] := Table[initial[n - 1], {i, 1, 2}, {j, 1, 2}]\
\>", "Input",
CellTags->"S0.0.3"],
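Cell["\<\
As a quick check of the recursion, the sketch below repeats it under the \
hypothetical name zeros and inspects the result for n = 3. Each level of the \
recursion adds two indices of length 2, so the array for n has 2 (n - 1) \
indices.\
\>", "Text"],
Cell["\<\
(* same recursion as initial, under a hypothetical name *)
zeros[1] = 0;
zeros[n_] := Table[zeros[n - 1], {i, 1, 2}, {j, 1, 2}];
Dimensions[zeros[3]]
(* gives {2, 2, 2, 2} *)\
\>", "Input"],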
Cell[TextData[{
"The array used to start the calculations is ",
StyleBox["initial[horizon + 1]", "MR"],
" where ",
StyleBox["horizon", "TI"],
" is the number of selections available to the observer. \nFinally, the ",
StyleBox["probability", "MR"],
" and ",
StyleBox["minusprobability", "MR"],
" arrays containing the conditional probabilities of the points ",
Cell[BoxData[
\(TraditionalForm\`\((\[Theta]\_11, \[Theta]\_21)\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
" and ",
Cell[BoxData[
\(TraditionalForm\`\((\[Theta]\_12, \[Theta]\_22)\)\)],
"InlineFormula",
CellTags->"S0.0.3"],
" respectively may be calculated recursively for any stage using Bayes' \
Theorem as in (1) in the following manner: "
}], "Text",
CellTags->{"S0.0.3", "3.27"}],
Cell[BoxData[
\(\(m1 = {{\[Theta]\_11, 1 - \[Theta]\_11}, {\[Theta]\_21,
1 - \[Theta]\_21}}; \)\)], "Input",
CellTags->"S0.0.3"],
Cell[BoxData[
\(\(m2 = {{\[Theta]\_12, 1 - \[Theta]\_12}, {\[Theta]\_22,
1 - \[Theta]\_22}}; \)\)], "Input",
CellTags->"S0.0.3"],
Cell["above[1] = p;", "Input",
CellTags->"S0.0.3"],
Cell["above[n_] := m1*Table[above[n - 1], {i, 1, 2}, {j, 1, 2}]", "Input",
CellTags->"S0.0.3"],
Cell["below[1] = 1 - p;", "Input",
CellTags->"S0.0.3"],
Cell["below[n_] := m2*Table[below[n - 1], {i, 1, 2}, {j, 1, 2}]", "Input",
CellTags->"S0.0.3"],
Cell["denominator[1] = 1;", "Input",
CellTags->"S0.0.3"],
Cell["denominator[n_] := above[n] + below[n];", "Input",
CellTags->"S0.0.3"],
Cell["y[n_] := Dimensions[Dimensions[above[n]]];", "Input",
CellTags->"S0.0.3"],
Cell["\<\
probability[n_] := MapThread[Divide, {above[n], denominator[n]}, \
y[n][[1]]];\
\>", "Input",
CellTags->"S0.0.3"],
Cell["minusprobability[n_] := 1 - probability[n]", "Input",
CellTags->"S0.0.3"]
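,
Cell["\<\
As a rough numerical illustration of the Bayes update at stage two, the \
sketch below uses a hypothetical prior and hypothetical success \
probabilities; the names prior, lik1, lik2, num, den, and post are \
assumptions made only for this example.\
\>", "Text"],
Cell["\<\
(* hypothetical prior weight on the first point *)
prior = 1/2;
(* hypothetical likelihood arrays playing the roles of m1 and m2 *)
lik1 = {{3/4, 1/4}, {1/4, 3/4}};
lik2 = {{1/4, 3/4}, {3/4, 1/4}};
(* numerator and denominator of Bayes' Theorem, entry by entry *)
num = lik1*prior;
den = num + lik2*(1 - prior);
post = num/den
(* gives {{3/4, 1/4}, {1/4, 3/4}} *)\
\>", "Input"]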
}, Open ]],
Cell[CellGroupData[{
Cell["Conclusions", "Section",
CellTags->{"S0.0.4", "4.1"}],
Cell[TextData[{
"In this article we have described a method using ",
StyleBox["Mathematica",
FontSlant->"Italic"],
" commands to solve two-point Bernoulli bandit problems in a program \
containing a single loop. The ideas used can be extended to distributions ",
StyleBox["G",
FontSlant->"Italic"],
" involving more than two points by changing the dimensions of the arrays in \
an appropriate manner and modifying the equations in (4). The interested \
reader should consult [Berry and Fristedt 1985] for the generalized form of \
these equations."
}], "Text"]
}, Open ]],
Cell[CellGroupData[{
Cell["Acknowledgements", "Section",
CellTags->{"S0.0.1", "1.1"}],
Cell["\<\
The authors would like to thank the University of the Andes, M\
\[EAcute]rida, Venezuela for their hospitality and the second author would \
like to thank the Fulbright Program for the grant which enabled the author to \
spend a year working in Venezuela. \
\>", "Text",
CellTags->{"S0.0.2", "1.2"}]
}, Open ]],
Cell[CellGroupData[{
Cell["References", "Section",
CellTags->{"S0.0.2", "2.1"}],
Cell[TextData[{
"Abell, M. L. and Braselton, J. P. (1994) ",
StyleBox["Mathematica by Example", "TI"],
", revised edition, Academic Press, Boston. \n\nBerry, D. A. and Fristedt, \
B. (1985) ",
StyleBox["Bandit Problems: Sequential Allocation of Experiments", "TI"],
", Chapman and Hall, London. "
}], "Text",
CellTags->{"S0.0.3", "2.2"}]
}, Open ]],
Cell[CellGroupData[{
Cell["About the Authors", "Section",
CellTags->{"S0.0.3", "3.1"}],
Cell["\<\
Leida M. Bonilla M. is a graduate student at the University of the \
Andes in M\[EAcute]rida, Venezuela.
lbonilla@ciens.ula.ve
Martin Jones is an Associate Professor of Mathematics at the University of \
Charleston, S.C. where he has been since 1989. His research interests are in \
statistical decision theory, gambling theory, and optimal stopping. He spent \
a year working at the University of the Andes under the auspices of a \
Fulbright Grant.
jonesm@math.cofc.edu \
\>", "Text",
CellTags->{"S0.0.4", "3.2"}]
}, Open ]]
}, Open ]]
},
FrontEndVersion->"X 3.0",
ScreenRectangle->{{0, 1280}, {0, 1024}},
WindowSize->{520, 600},
WindowMargins->{{2, Automatic}, {Automatic, 124}},
ShowCellLabel->False
]
(***********************************************************************
Cached data follows. If you edit this Notebook file directly, not using
Mathematica, you must remove the line containing CacheID at the top of
the file. The cache data will then be recreated when you save this file
from within Mathematica.
***********************************************************************)
(*CellTagsOutline
CellTagsIndex->{
"0.1"->{
Cell[1731, 51, 53, 1, 107, "Title",
CellTags->"0.1"]},
"S0.0.1"->{
Cell[1787, 54, 150, 4, 93, "Subtitle",
CellTags->{"S0.0.1", "1.1"}],
Cell[1940, 60, 96, 1, 53, "Subsubtitle",
CellTags->{"S0.0.1", "1.2"}],
Cell[54686, 1569, 66, 1, 54, "Section",
CellTags->{"S0.0.1", "1.1"}]},
"1.1"->{
Cell[1787, 54, 150, 4, 93, "Subtitle",
CellTags->{"S0.0.1", "1.1"}],
Cell[54686, 1569, 66, 1, 54, "Section",
CellTags->{"S0.0.1", "1.1"}]},
"1.2"->{
Cell[1940, 60, 96, 1, 53, "Subsubtitle",
CellTags->{"S0.0.1", "1.2"}],
Cell[54755, 1572, 311, 6, 68, "Text",
CellTags->{"S0.0.2", "1.2"}]},
"S0.0.2"->{
Cell[2039, 63, 10077, 243, 1077, "Text",
CellTags->{"S0.0.2", "1.3"}],
Cell[12141, 310, 101, 1, 54, "Section",
CellTags->{"S0.0.2", "2.1"}],
Cell[12245, 313, 3139, 81, 302, "Text",
CellTags->{"S0.0.2", "2.2"}],
Cell[15387, 396, 1887, 47, 104, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[17277, 445, 129, 4, 32, "Text",
CellTags->{"S0.0.2", "2.4"}],
Cell[17409, 451, 1258, 37, 80, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[18670, 490, 807, 22, 104, "Text",
CellTags->{"S0.0.2", "2.6"}],
Cell[19480, 514, 2792, 76, 172, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[22275, 592, 1184, 30, 122, "Text",
CellTags->{"S0.0.2", "2.8"}],
Cell[23462, 624, 1719, 52, 129, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[25184, 678, 283, 7, 50, "Text",
CellTags->{"S0.0.2", "2.10"}],
Cell[25470, 687, 164, 4, 41, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[54755, 1572, 311, 6, 68, "Text",
CellTags->{"S0.0.2", "1.2"}],
Cell[55103, 1583, 60, 1, 54, "Section",
CellTags->{"S0.0.2", "2.1"}]},
"1.3"->{
Cell[2039, 63, 10077, 243, 1077, "Text",
CellTags->{"S0.0.2", "1.3"}]},
"2.1"->{
Cell[12141, 310, 101, 1, 54, "Section",
CellTags->{"S0.0.2", "2.1"}],
Cell[55103, 1583, 60, 1, 54, "Section",
CellTags->{"S0.0.2", "2.1"}]},
"2.2"->{
Cell[12245, 313, 3139, 81, 302, "Text",
CellTags->{"S0.0.2", "2.2"}],
Cell[55166, 1586, 346, 8, 104, "Text",
CellTags->{"S0.0.3", "2.2"}]},
"2.4"->{
Cell[17277, 445, 129, 4, 32, "Text",
CellTags->{"S0.0.2", "2.4"}]},
"2.6"->{
Cell[18670, 490, 807, 22, 104, "Text",
CellTags->{"S0.0.2", "2.6"}]},
"2.8"->{
Cell[22275, 592, 1184, 30, 122, "Text",
CellTags->{"S0.0.2", "2.8"}]},
"2.10"->{
Cell[25184, 678, 283, 7, 50, "Text",
CellTags->{"S0.0.2", "2.10"}]},
"S0.0.3"->{
Cell[25637, 693, 1893, 52, 158, "Text",
CellTags->{"S0.0.3", "2.12"}],
Cell[27567, 750, 110, 1, 54, "Section",
CellTags->{"S0.0.3", "3.1"}],
Cell[27680, 753, 1244, 33, 140, "Text",
CellTags->{"S0.0.3", "3.2"}],
Cell[28927, 788, 2634, 80, 172, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[31564, 870, 4234, 106, 428, "Text",
CellTags->{"S0.0.3", "3.4"}],
Cell[43033, 1174, 66, 1, 32, "Text",
CellTags->{"S0.0.3", "3.6"}],
Cell[43370, 1184, 65, 1, 32, "Text",
CellTags->{"S0.0.3", "3.7"}],
Cell[43633, 1194, 204, 7, 50, "Text",
CellTags->{"S0.0.3", "3.10"}],
Cell[43952, 1211, 956, 27, 44, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[44911, 1240, 715, 24, 50, "Text",
CellTags->{"S0.0.3", "3.12"}],
Cell[45629, 1266, 164, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[45796, 1271, 68, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[45867, 1274, 164, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[46034, 1279, 75, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[46112, 1282, 59, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[46174, 1285, 68, 1, 32, "Text",
CellTags->{"S0.0.3", "3.13"}],
Cell[46245, 1288, 956, 27, 44, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[47204, 1317, 139, 5, 32, "Text",
CellTags->{"S0.0.3", "3.15"}],
Cell[47346, 1324, 126, 5, 57, "Input",
CellTags->"S0.0.3"],
Cell[47475, 1331, 1366, 42, 122, "Text",
CellTags->{"S0.0.3", "3.18"}],
Cell[48844, 1375, 112, 4, 27, "Input",
CellTags->"S0.0.3"],
Cell[48959, 1381, 84, 1, 32, "Text",
CellTags->{"S0.0.3", "3.19"}],
Cell[49046, 1384, 70, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[49119, 1387, 847, 25, 86, "Text",
CellTags->{"S0.0.3", "3.20"}],
Cell[49969, 1414, 130, 3, 24, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[50102, 1419, 52, 1, 32, "Text",
CellTags->{"S0.0.3", "3.22"}],
Cell[50157, 1422, 146, 4, 24, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[50306, 1428, 1424, 34, 176, "Text",
CellTags->{"S0.0.3", "3.24"}],
Cell[51733, 1464, 124, 3, 24, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[51860, 1469, 135, 4, 32, "Text",
CellTags->{"S0.0.3", "3.26"}],
Cell[51998, 1475, 121, 4, 42, "Input",
CellTags->"S0.0.3"],
Cell[52122, 1481, 803, 22, 104, "Text",
CellTags->{"S0.0.3", "3.27"}],
Cell[52928, 1505, 145, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[53076, 1510, 145, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[53224, 1515, 52, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53279, 1518, 95, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53377, 1521, 56, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53436, 1524, 95, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53534, 1527, 58, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53595, 1530, 78, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53676, 1533, 72, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53751, 1536, 125, 4, 42, "Input",
CellTags->"S0.0.3"],
Cell[53879, 1542, 81, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[55166, 1586, 346, 8, 104, "Text",
CellTags->{"S0.0.3", "2.2"}],
Cell[55549, 1599, 67, 1, 54, "Section",
CellTags->{"S0.0.3", "3.1"}]},
"2.12"->{
Cell[25637, 693, 1893, 52, 158, "Text",
CellTags->{"S0.0.3", "2.12"}]},
"3.1"->{
Cell[27567, 750, 110, 1, 54, "Section",
CellTags->{"S0.0.3", "3.1"}],
Cell[55549, 1599, 67, 1, 54, "Section",
CellTags->{"S0.0.3", "3.1"}]},
"3.2"->{
Cell[27680, 753, 1244, 33, 140, "Text",
CellTags->{"S0.0.3", "3.2"}],
Cell[55619, 1602, 533, 14, 212, "Text",
CellTags->{"S0.0.4", "3.2"}]},
"3.4"->{
Cell[31564, 870, 4234, 106, 428, "Text",
CellTags->{"S0.0.3", "3.4"}]},
"3.6"->{
Cell[43033, 1174, 66, 1, 32, "Text",
CellTags->{"S0.0.3", "3.6"}]},
"3.7"->{
Cell[43370, 1184, 65, 1, 32, "Text",
CellTags->{"S0.0.3", "3.7"}]},
"3.10"->{
Cell[43633, 1194, 204, 7, 50, "Text",
CellTags->{"S0.0.3", "3.10"}]},
"3.12"->{
Cell[44911, 1240, 715, 24, 50, "Text",
CellTags->{"S0.0.3", "3.12"}]},
"3.13"->{
Cell[46174, 1285, 68, 1, 32, "Text",
CellTags->{"S0.0.3", "3.13"}]},
"3.15"->{
Cell[47204, 1317, 139, 5, 32, "Text",
CellTags->{"S0.0.3", "3.15"}]},
"3.18"->{
Cell[47475, 1331, 1366, 42, 122, "Text",
CellTags->{"S0.0.3", "3.18"}]},
"3.19"->{
Cell[48959, 1381, 84, 1, 32, "Text",
CellTags->{"S0.0.3", "3.19"}]},
"3.20"->{
Cell[49119, 1387, 847, 25, 86, "Text",
CellTags->{"S0.0.3", "3.20"}]},
"3.22"->{
Cell[50102, 1419, 52, 1, 32, "Text",
CellTags->{"S0.0.3", "3.22"}]},
"3.24"->{
Cell[50306, 1428, 1424, 34, 176, "Text",
CellTags->{"S0.0.3", "3.24"}]},
"3.26"->{
Cell[51860, 1469, 135, 4, 32, "Text",
CellTags->{"S0.0.3", "3.26"}]},
"3.27"->{
Cell[52122, 1481, 803, 22, 104, "Text",
CellTags->{"S0.0.3", "3.27"}]},
"S0.0.4"->{
Cell[53997, 1548, 61, 1, 54, "Section",
CellTags->{"S0.0.4", "4.1"}],
Cell[55619, 1602, 533, 14, 212, "Text",
CellTags->{"S0.0.4", "3.2"}]},
"4.1"->{
Cell[53997, 1548, 61, 1, 54, "Section",
CellTags->{"S0.0.4", "4.1"}]}
}
*)
(*CellTagsIndex
CellTagsIndex->{
{"0.1", 56782, 1637},
{"S0.0.1", 56861, 1640},
{"1.1", 57113, 1647},
{"1.2", 57284, 1652},
{"S0.0.2", 57458, 1657},
{"1.3", 58584, 1686},
{"2.1", 58677, 1689},
{"2.2", 58849, 1694},
{"2.4", 59020, 1699},
{"2.6", 59109, 1702},
{"2.8", 59200, 1705},
{"2.10", 59293, 1708},
{"S0.0.3", 59386, 1711},
{"2.12", 62776, 1802},
{"3.1", 62869, 1805},
{"3.2", 63041, 1810},
{"3.4", 63213, 1815},
{"3.6", 63306, 1818},
{"3.7", 63395, 1821},
{"3.10", 63485, 1824},
{"3.12", 63577, 1827},
{"3.13", 63670, 1830},
{"3.15", 63761, 1833},
{"3.18", 63853, 1836},
{"3.19", 63948, 1839},
{"3.20", 64039, 1842},
{"3.22", 64132, 1845},
{"3.24", 64223, 1848},
{"3.26", 64318, 1851},
{"3.27", 64410, 1854},
{"S0.0.4", 64506, 1857},
{"4.1", 64678, 1862}
}
*)
(*NotebookFileOutline
Notebook[{
Cell[CellGroupData[{
Cell[1731, 51, 53, 1, 107, "Title",
CellTags->"0.1"],
Cell[1787, 54, 150, 4, 93, "Subtitle",
CellTags->{"S0.0.1", "1.1"}],
Cell[1940, 60, 96, 1, 53, "Subsubtitle",
CellTags->{"S0.0.1", "1.2"}],
Cell[2039, 63, 10077, 243, 1077, "Text",
CellTags->{"S0.0.2", "1.3"}],
Cell[CellGroupData[{
Cell[12141, 310, 101, 1, 54, "Section",
CellTags->{"S0.0.2", "2.1"}],
Cell[12245, 313, 3139, 81, 302, "Text",
CellTags->{"S0.0.2", "2.2"}],
Cell[15387, 396, 1887, 47, 104, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[17277, 445, 129, 4, 32, "Text",
CellTags->{"S0.0.2", "2.4"}],
Cell[17409, 451, 1258, 37, 80, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[18670, 490, 807, 22, 104, "Text",
CellTags->{"S0.0.2", "2.6"}],
Cell[19480, 514, 2792, 76, 172, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[22275, 592, 1184, 30, 122, "Text",
CellTags->{"S0.0.2", "2.8"}],
Cell[23462, 624, 1719, 52, 129, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[25184, 678, 283, 7, 50, "Text",
CellTags->{"S0.0.2", "2.10"}],
Cell[25470, 687, 164, 4, 41, "DisplayFormula",
CellTags->"S0.0.2"],
Cell[25637, 693, 1893, 52, 158, "Text",
CellTags->{"S0.0.3", "2.12"}]
}, Open ]],
Cell[CellGroupData[{
Cell[27567, 750, 110, 1, 54, "Section",
CellTags->{"S0.0.3", "3.1"}],
Cell[27680, 753, 1244, 33, 140, "Text",
CellTags->{"S0.0.3", "3.2"}],
Cell[28927, 788, 2634, 80, 172, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[31564, 870, 4234, 106, 428, "Text",
CellTags->{"S0.0.3", "3.4"}],
Cell[35801, 978, 1381, 29, 88, "Input"],
Cell[CellGroupData[{
Cell[37207, 1011, 89, 1, 27, "Input"],
Cell[37299, 1014, 3563, 100, 98, "Output"]
}, Open ]],
Cell[CellGroupData[{
Cell[40899, 1119, 120, 3, 27, "Input"],
Cell[41022, 1124, 704, 13, 56, "Output"]
}, Open ]],
Cell[CellGroupData[{
Cell[41763, 1142, 71, 1, 27, "Input"],
Cell[41837, 1145, 934, 19, 106, "Output"]
}, Open ]],
Cell[42786, 1167, 244, 5, 48, "Input"],
Cell[43033, 1174, 66, 1, 32, "Text",
CellTags->{"S0.0.3", "3.6"}],
Cell[43102, 1177, 265, 5, 48, "Input"],
Cell[43370, 1184, 65, 1, 32, "Text",
CellTags->{"S0.0.3", "3.7"}],
Cell[43438, 1187, 109, 2, 27, "Input"],
Cell[43550, 1191, 80, 1, 27, "Input"],
Cell[43633, 1194, 204, 7, 50, "Text",
CellTags->{"S0.0.3", "3.10"}],
Cell[43840, 1203, 61, 1, 27, "Input"],
Cell[43904, 1206, 45, 3, 50, "Text"],
Cell[43952, 1211, 956, 27, 44, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[44911, 1240, 715, 24, 50, "Text",
CellTags->{"S0.0.3", "3.12"}],
Cell[45629, 1266, 164, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[45796, 1271, 68, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[45867, 1274, 164, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[46034, 1279, 75, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[46112, 1282, 59, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[46174, 1285, 68, 1, 32, "Text",
CellTags->{"S0.0.3", "3.13"}],
Cell[46245, 1288, 956, 27, 44, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[47204, 1317, 139, 5, 32, "Text",
CellTags->{"S0.0.3", "3.15"}],
Cell[47346, 1324, 126, 5, 57, "Input",
CellTags->"S0.0.3"],
Cell[47475, 1331, 1366, 42, 122, "Text",
CellTags->{"S0.0.3", "3.18"}],
Cell[48844, 1375, 112, 4, 27, "Input",
CellTags->"S0.0.3"],
Cell[48959, 1381, 84, 1, 32, "Text",
CellTags->{"S0.0.3", "3.19"}],
Cell[49046, 1384, 70, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[49119, 1387, 847, 25, 86, "Text",
CellTags->{"S0.0.3", "3.20"}],
Cell[49969, 1414, 130, 3, 24, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[50102, 1419, 52, 1, 32, "Text",
CellTags->{"S0.0.3", "3.22"}],
Cell[50157, 1422, 146, 4, 24, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[50306, 1428, 1424, 34, 176, "Text",
CellTags->{"S0.0.3", "3.24"}],
Cell[51733, 1464, 124, 3, 24, "DisplayFormula",
CellTags->"S0.0.3"],
Cell[51860, 1469, 135, 4, 32, "Text",
CellTags->{"S0.0.3", "3.26"}],
Cell[51998, 1475, 121, 4, 42, "Input",
CellTags->"S0.0.3"],
Cell[52122, 1481, 803, 22, 104, "Text",
CellTags->{"S0.0.3", "3.27"}],
Cell[52928, 1505, 145, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[53076, 1510, 145, 3, 27, "Input",
CellTags->"S0.0.3"],
Cell[53224, 1515, 52, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53279, 1518, 95, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53377, 1521, 56, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53436, 1524, 95, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53534, 1527, 58, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53595, 1530, 78, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53676, 1533, 72, 1, 27, "Input",
CellTags->"S0.0.3"],
Cell[53751, 1536, 125, 4, 42, "Input",
CellTags->"S0.0.3"],
Cell[53879, 1542, 81, 1, 27, "Input",
CellTags->"S0.0.3"]
}, Open ]],
Cell[CellGroupData[{
Cell[53997, 1548, 61, 1, 54, "Section",
CellTags->{"S0.0.4", "4.1"}],
Cell[54061, 1551, 588, 13, 158, "Text"]
}, Open ]],
Cell[CellGroupData[{
Cell[54686, 1569, 66, 1, 54, "Section",
CellTags->{"S0.0.1", "1.1"}],
Cell[54755, 1572, 311, 6, 68, "Text",
CellTags->{"S0.0.2", "1.2"}]
}, Open ]],
Cell[CellGroupData[{
Cell[55103, 1583, 60, 1, 54, "Section",
CellTags->{"S0.0.2", "2.1"}],
Cell[55166, 1586, 346, 8, 104, "Text",
CellTags->{"S0.0.3", "2.2"}]
}, Open ]],
Cell[CellGroupData[{
Cell[55549, 1599, 67, 1, 54, "Section",
CellTags->{"S0.0.3", "3.1"}],
Cell[55619, 1602, 533, 14, 212, "Text",
CellTags->{"S0.0.4", "3.2"}]
}, Open ]]
}, Open ]]
}
]
*)
(***********************************************************************
End of Mathematica Notebook file.
***********************************************************************)