Solve Optimization Problems in Density Estimation
Leverage the symbolic capabilities of KernelMixtureDistribution to solve for the least-squares cross-validation (LSCV) bandwidth of a Gaussian kernel density estimate. This method uses leave-one-out cross-validation to select the bandwidth that minimizes an estimate of the integrated squared error of the resulting density estimate.
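For reference, the criterion being minimized can be written out explicitly. What follows is the standard LSCV objective for a Gaussian kernel, reconstructed from the definitions of Rk, Ro, and LSCV in this example rather than quoted from a source. The symbols \hat f_h (the kernel estimate with bandwidth h), \hat f_{h,-i} (the estimate built without the i-th point), and x_i (the data) are notation introduced here.

```latex
\begin{align*}
\mathrm{LSCV}(h) &= \int \hat f_h(x)^2\,dx \;-\; \frac{2}{n}\sum_{i=1}^{n} \hat f_{h,-i}(x_i),\\
\int \hat f_h(x)^2\,dx &= \frac{1}{h\sqrt{\pi}}\left[\frac{1}{n^2}\sum_{i<j} e^{-(x_i-x_j)^2/(4h^2)} + \frac{1}{2n}\right],\\
\sum_{i=1}^{n}\hat f_{h,-i}(x_i) &= \frac{1}{(n-1)\,h\sqrt{2\pi}}\sum_{i=1}^{n}\sum_{j\neq i} e^{-(x_i-x_j)^2/(2h^2)}.
\end{align*}
```

The second line corresponds to Rk and the third to Ro. Up to a term that does not depend on h, the expected value of this criterion equals the mean integrated squared error, which is why its minimizer is a reasonable bandwidth choice.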
d = BlockRandom[SeedRandom[12];
RandomVariate[NormalDistribution[], 25]];
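(* Rk: closed form of the integral of the squared Gaussian kernel estimate; the sum runs over the n (n - 1)/2 distinct pairs *)
Rk[h_, data_] := With[{n = Length[data]}, 1/(h Sqrt[π]) (Exp[-((Subtract @@@ Subsets[data, {2}])^2/(4 h^2))] . ConstantArray[1/n^2, Total[Range[1, n - 1]]] + 1/(2 n))]
(* Ro: sum over i of the leave-one-out estimate evaluated at data[[i]] *)
Ro[h_, data_] := Total[1/((Length[data] - 1) h Sqrt[2 π]) Table[Plus @@ Exp[-(data[[i]] - Delete[data, {i}])^2/(2 h^2)], {i, Length[data]}]]
(* LSCV objective as a function of the bandwidth h *)
LSCV[h_, data_] := With[{n = Length[data]}, Rk[h, data] - 2/n Ro[h, data]]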
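One way to gain confidence in the closed forms is to compare them with direct computations on KernelMixtureDistribution. The snippet below is a minimal sketch, assuming d, Rk, and Ro are defined as above and that the default kernel is Gaussian; the trial bandwidth h0 = 0.5 is an arbitrary value chosen for illustration.

```wolfram
(* Sanity check; assumes d, Rk, and Ro are already defined. *)
(* h0 = 0.5 is an arbitrary trial bandwidth, not a recommended value. *)
With[{h0 = 0.5},
 {
  (* closed-form integral of the squared estimate vs. numerical integration *)
  {Rk[h0, d],
   NIntegrate[PDF[KernelMixtureDistribution[d, h0], x]^2, {x, -Infinity, Infinity}]},
  (* closed-form leave-one-out sum vs. refitting without each point *)
  {Ro[h0, d],
   Sum[PDF[KernelMixtureDistribution[Delete[d, i], h0], d[[i]]], {i, Length[d]}]}
  }]
```

Each inner pair should agree up to numerical integration and rounding error. A visible discrepancy would suggest that the kernel or bandwidth convention differs from the one assumed here.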
(* Minimize the LSCV objective and build the estimate with the selected bandwidth *)
bw = h /. FindMinimum[LSCV[h, d], {h}][[2]];
𝒟 = KernelMixtureDistribution[d, bw];
(* Plot the objective, mark the minimizer, and inset the resulting density estimate *)
Show[
 Plot[LSCV[h, d], {h, 0.03, 2},
  PlotLabel -> Text[Style[Row[{"h -> ", bw}], Bold, FontFamily -> "Verdana", FontSize -> 14]],
  Frame -> True, Axes -> None, Filling -> None, PlotStyle -> Thick,
  PlotRange -> {{0, 1.99}, {-.39, 0}}, ImageSize -> {570, 374}],
 Graphics[{Lighter[Blend[{Red, Orange}], 0.3], Dashed, Thick, Line[{{bw, -.385}, {bw, .005}}]}],
 Epilog -> Inset[
   Plot[PDF[𝒟, x], {x, -3.5, 3}, Filling -> Axis,
    FillingStyle -> Lighter[Blend[{Red, Orange}], 0.4],
    Axes -> {True, False}, ImageSize -> Medium],
   {1.3, -.24}]]