# dynamic programming – How to create a subset with a given length and mean?

For every $$i in {0,ldots,|P|}$$, $$j in {0,ldots,|S|}$$, and $$T in {0,ldots,mathcal{T}}$$ (I discuss $$mathcal{T}$$ below), compute whether there is a subset of size $$j$$ of $$p_1,ldots,p_i$$ which sums to $$T$$. You should take $$mathcal{T}$$ to be your maximum allowable answer – $$max(P) cdot |S|$$ would definitely do, but you can probably pick something which is closer to $$mu_S |S|$$, say $$mu_P |S|$$. The running time is $$O(|P| |S| mathcal{T})$$. In your case, assuming you choose $$mathcal{T} approx mu_p |S|$$, then $$|P| |S| mathcal{T} approx 2^{40}$$, so this is barely feasible.

This is a naive dynamic programming algorithm, which probably can be improved, especially since you’re only looking for an approximation.

For example, you could only look for solutions such that $$S cap {0,ldots,i} approx frac{|S|}{|P|} i$$ and $$sum(S cap {0,ldots,i}) approx frac{|S|}{|P|} mu_S$$, which will reduce the running time significantly. While this is only a heuristic, it might be possible to show that you get a decent approximation on average, if you randomize the order of $$P$$.

Another possible optimization is to quantize the partial sums, quantizing more aggressively as $$i$$ gets larger. If done carefully, you won’t lose much in accuracy but will have a significant gain in running time.