You are playing your favorite video game and have just entered a reward level. In this level the system throws a treasure at random k times, and each time you may choose to eat it or not (you must decide before the next treasure is thrown, and a treasure you decide not to eat cannot be eaten later). There are n types of treasure. On every throw each of the n types is equally likely, and the throws are independent of each other. That is, even if the first k-1 throws were all treasure 1 (which can happen, although the probability is very small), the k-th throw is still each type with probability 1/n. Eating a treasure of type i gives you pi points, but not every treasure is freely available. Type i has a prerequisite set Si: it can be eaten only if every treasure in Si has already been eaten at least once (if the system throws a treasure that cannot currently be eaten, that chance is simply wasted). Note that pi can be negative; if such a treasure is a prerequisite for many high-scoring treasures, giving up the short-term benefit and eating it can bring a greater long-term benefit. Assuming you follow the optimal strategy, how many points do you get on average in the reward level?
Actually, there was one point I did not understand at first...
The statement says you must make your choice before the next treasure is thrown, and I thought that contradicted following the optimal strategy...
In fact, what we are asked for is the expected score.
The choice has to be made before the next treasure is thrown because the course of the game is controlled by the system, and we cannot know in advance what it will throw.
The optimal strategy therefore means: at each step, given the probabilities of what the system may do next, perform the action with the maximum expected score.
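To make "maximum expected score" concrete, here is a tiny hand-made instance (an assumption for illustration, not taken from the problem data): k = 1, n = 2, no prerequisites. With scores p1 = 1 and p2 = 2 both treasures are worth eating. With p1 = -1 the best response to treasure 1 is to decline it, which contributes 0 instead of -1.

```latex
\begin{align*}
p_1 = 1,\ p_2 = 2:  &\quad E = \tfrac12 \max(0, 1) + \tfrac12 \max(0, 2) = 1.5 \\
p_1 = -1,\ p_2 = 2: &\quad E = \tfrac12 \max(0, -1) + \tfrac12 \max(0, 2) = 1
\end{align*}
```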
We use a DP that runs backwards over the rounds.
Enumerate the current round and, for each of the n treasure types, whether it has already been taken before this round's choice is made (a bitmask state).
Then enumerate which treasure the system throws in this round.
If that treasure can be taken, pick the better of taking it and not taking it; if it cannot be taken, the only option is not to take it.
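The steps above can be written as a recurrence. This is a sketch in notation chosen here, not the author's: f_i(A) stands for the expected score collected from round i to round k when the set A of types has been eaten before round i (it plays the role of the array f in the program), and p_t, S_t are the pi and Si of the statement.

```latex
\begin{align*}
f_{k+1}(A) &= 0, \\
f_i(A) &= \frac{1}{n} \sum_{t=1}^{n}
  \begin{cases}
    \max\bigl( f_{i+1}(A),\ f_{i+1}(A \cup \{t\}) + p_t \bigr) & \text{if } S_t \subseteq A, \\
    f_{i+1}(A) & \text{otherwise,}
  \end{cases}
  \qquad i = k, k-1, \dots, 1 .
\end{align*}
```

The answer is f_1 of the empty set. Each f_i depends only on f_{i+1}, which is why the rounds are processed from k down to 1.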
Finally, one more question comes to mind.
The statement contains the sentence "a treasure you decide not to eat cannot be eaten later", and this restriction is not reflected anywhere in the DP.
Note that the restriction only matters when the current treasure can be eaten, that is, when we actually have a free choice.
But on reflection, there is no problem.
If the current treasure can be eaten now, the same type can also be eaten whenever it is thrown later, because its prerequisite set has already been satisfied.
However, once it has been eaten now, we will also be able to eat, in later rounds, the treasures whose prerequisite sets contain it.
So eating it now is at least as good as a strategy of eating it later.
So when the optimal decisions are carried out, the situation of declining a treasure now and eating it afterwards never arises.
program bzoj1076;
// Bounds marked "assumed" were unreadable in the source listing.
const
  maxn = 101;     // assumed; must be at least k + 1, since f[k + 1, *] is read
  maxm = 32768;   // 2^15 states, so n <= 15
var
  i, k, n, x, j, p: longint;
  w: array[-1..maxn] of longint;             // w[i]: score of treasure i
  a: array[-1..20, -1..20] of longint;       // a[i,0]: prerequisite count, a[i,1..]: the list (bound 20 assumed)
  vis: array[-1..20, -1..maxm] of boolean;   // vis[p,j]: treasure p can be eaten in state j
  f: array[-1..maxn, -1..maxm] of extended;  // f[i,j]: expected score from round i on, in state j

// True if every prerequisite of treasure x is present in state y.
function ok(x, y: longint): boolean;
var
  i: longint;
  tmp: array[-1..20] of longint;   // bound 20 assumed
begin
  for i := 1 to n do tmp[i] := (y shr (n - i)) and 1;   // treasure i is bit n - i
  for i := 1 to a[x, 0] do
    if tmp[a[x, i]] = 0 then exit(false);
  exit(true);
end;

function max(a, b: extended): extended;
begin
  if a > b then exit(a) else exit(b);
end;

begin
  readln(k, n);
  fillchar(a, sizeof(a), 0);
  for i := 1 to n do
  begin
    read(w[i]);
    read(x);
    while x <> 0 do   // the prerequisite list is terminated by 0
    begin
      inc(a[i, 0]);
      a[i, a[i, 0]] := x;
      read(x);
    end;
    readln;
  end;
  for i := 1 to n do
    for j := 0 to (1 shl n) - 1 do vis[i, j] := ok(i, j);
  for i := k downto 1 do             // i is the current round
    for j := 0 to (1 shl n) - 1 do   // j records which of the n types have been taken
    begin
      for p := 1 to n do
        if vis[p, j] then
          f[i, j] := f[i, j] + max(f[i + 1, j], f[i + 1, j or (1 shl (n - p))] + w[p])
        else
          f[i, j] := f[i, j] + f[i + 1, j];
      f[i, j] := f[i, j] / n;
    end;
  writeln(f[1, 0]:0:6);
end.
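As a possible variation on the ok helper in the program, the prerequisite list of each treasure could be folded into a single bitmask, so that the edibility test becomes one "and" instead of a loop. The sketch below is not the author's code: the names need, bitOf and canEat and the hard-coded three-treasure instance are assumptions made for illustration. It keeps the program's convention that treasure t occupies bit n - t.

```pascal
program prereq_mask_sketch;
// Illustrative sketch only: need, bitOf, canEat and the sample data are hypothetical.
const
  n = 3;
var
  need: array[1..n] of longint;   // need[p]: bitmask of the prerequisites of treasure p

// Bit used for treasure t, following the convention "treasure t is bit n - t".
function bitOf(t: longint): longint;
begin
  bitOf := 1 shl (n - t);
end;

// Treasure p can be eaten in state mask when every prerequisite bit is set in mask.
function canEat(p, mask: longint): boolean;
begin
  canEat := (mask and need[p]) = need[p];
end;

begin
  need[1] := 0;                      // treasure 1 has no prerequisites
  need[2] := bitOf(1);               // treasure 2 requires treasure 1
  need[3] := bitOf(1) or bitOf(2);   // treasure 3 requires treasures 1 and 2
  writeln(canEat(1, 0));                      // nothing eaten yet, no prerequisites needed
  writeln(canEat(2, 0));                      // treasure 1 not eaten yet
  writeln(canEat(2, bitOf(1)));               // treasure 1 eaten
  writeln(canEat(3, bitOf(1)));               // treasure 2 still missing
  writeln(canEat(3, bitOf(1) or bitOf(2)));   // both prerequisites eaten
end.
```

In the full program, need[p] would be built while reading the input, and vis[p, j] could then be replaced by a call such as canEat(p, j).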
[BZOJ1076] [SCOI2008] Reward Level: solution report | bitmask (state-compression) DP