[BZOJ1076] [SCOI2008] Bonus Round: solution report | state-compression (bitmask) DP

Source: Internet
Author: User

You are playing your favorite video game and have just entered a bonus round. In this round the system throws out treasures one at a time, k times in total, each chosen at random. Each time you may eat the treasure or not (you must decide before the next treasure is thrown, and a treasure you decide not to eat cannot be eaten later). There are n types of treasure, and on every throw each of the n types is equally likely, independently of all other throws. That is, even if the first k-1 throws were all treasure 1 (which can happen, although the probability is very small), each type still has probability 1/n on the k-th throw. Eating a treasure of type i gives Pi points, but not every treasure can be taken freely. Type i has a set Si of prerequisite treasures: only when every treasure in Si has been eaten at least once can you eat a treasure of type i (if the system throws a treasure you cannot currently eat, that throw is simply wasted). Note that Pi can be negative, but if such a treasure is a prerequisite for many high-scoring treasures, giving up the short-term benefit and eating the negative-score treasure can bring a larger long-term gain. Assuming you play the optimal strategy, how many points do you get on average in the bonus round?

Actually, there was one point I did not understand at first ...

I initially thought that having to choose before the next treasure is thrown contradicted the idea of an optimal strategy ...

In fact, all we are asked for is the expected score.

We must choose before the next treasure is thrown because the course of the game is driven by the system, and we cannot know it in advance.

The optimal strategy is therefore: at each step, given the probabilities of what the system may throw next, take the action with the maximum expected score.

So we consider a DP that runs backwards, from the last round to the first.

Enumerate which round we are in, and the taken / not-taken state of the n treasure types before this round's choice is made.
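Since n is small, the taken / not-taken state of the n types fits in a single integer, and the prerequisite test becomes one bitwise AND. Below is a minimal sketch of that idea in Python rather than Pascal. The helper names type_bit, prereq_mask and can_eat are my own, and I assume type p is stored in bit p-1 (the Pascal listing in this post stores it in bit n-p instead, which works equally well).

```python
def type_bit(p: int) -> int:
    """Bit for treasure type p (1-based); assumption: type 1 is the lowest bit."""
    return 1 << (p - 1)


def prereq_mask(prereqs) -> int:
    """Pack a list of prerequisite type numbers into one integer."""
    mask = 0
    for q in prereqs:
        mask |= type_bit(q)
    return mask


def can_eat(collected: int, need: int) -> bool:
    """True when every prerequisite bit in `need` is also set in `collected`."""
    return (collected & need) == need


collected = type_bit(1) | type_bit(3)            # types 1 and 3 already eaten
print(can_eat(collected, prereq_mask([1, 3])))   # True
print(can_eat(collected, prereq_mask([2])))      # False
print(can_eat(0, prereq_mask([])))               # True: no prerequisites at all
```

With this encoding there are 2^n possible states, which is what makes enumerating all of them in every round affordable.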

Then enumerate which treasure the system throws in this round.

If the treasure can be taken, choose the better of taking it and not taking it; if it cannot be taken, the only option is to skip it.
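Written as a recurrence, the enumeration above looks roughly like this. The notation is my own: f_i(S) is the expected score obtainable from round i onward when S is the set of types eaten before round i, P_p is the score of type p (w[p] in the code), and S_p is its prerequisite set. After the last round nothing more can be gained, so f_{k+1} is zero everywhere.

```latex
f_{k+1}(S) = 0, \qquad
f_i(S) = \frac{1}{n} \sum_{p=1}^{n}
\begin{cases}
\max\bigl(f_{i+1}(S),\; f_{i+1}(S \cup \{p\}) + P_p\bigr), & S_p \subseteq S,\\[2pt]
f_{i+1}(S), & \text{otherwise,}
\end{cases}
\qquad i = k, k-1, \dots, 1 .
```

The answer is f_1(\emptyset). Running the rounds backwards is what makes this work: the value of a state depends only on later rounds, so every f_{i+1} is already known when f_i is computed, and we never need the probability of actually reaching a state.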

Finally, there is one more question to think about.

The statement says "a treasure you decide not to eat now cannot be eaten later", yet this restriction does not appear anywhere in the DP.

Note also that the restriction only matters when the current treasure can be eaten, that is, when we actually have a free choice.

But on reflection, there is no problem.

A treasure that can be eaten now can certainly also be eaten later, because its prerequisite set has already been satisfied.

Moreover, once it has been eaten now, the treasures that have it in their prerequisite set can be eaten in the future as well.

So eating it now is never worse than a strategy that eats it later.

So when following the optimal decisions, the situation "skip it now, eat it later" never arises.
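The intuition that having eaten more can never hurt can be stated as a monotonicity property of the DP values. This is only a sketch in my own notation, not something taken from the problem statement: f_i(S) denotes the optimal expected score from round i onward when the set S of types has already been eaten, with f_{k+1} = 0, and P_p is the score of type p.

```latex
S \subseteq T \;\Longrightarrow\; f_i(S) \le f_i(T) \qquad \text{for every round } i .
```

The argument is a backward induction on i. It holds trivially for i = k+1. If it holds for i+1, then for each thrown type p the option "skip" is worth f_{i+1}(S) \le f_{i+1}(T), and whenever "take" is allowed in S it is also allowed in T and is worth f_{i+1}(S \cup \{p\}) + P_p \le f_{i+1}(T \cup \{p\}) + P_p; averaging over p gives the claim for i. One consequence is that f_{i+1}(S \cup \{p\}) \ge f_{i+1}(S), so an edible treasure with P_p \ge 0 is never worse to take than to skip.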

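program bzoj1076;
// NOTE: the array bounds of the original listing were lost in extraction.
// MAXK and MAXN are assumptions sized for k <= 100 and n <= 15;
// MAXM = 32768 = 2^15 is the value that survives from the original.
const
  MAXK = 100;
  MAXN = 15;
  MAXM = 32768;
var
  i, k, n, x, j, p: longint;
  w: array[-1..MAXN] of longint;                    // w[i]: score of type i
  a: array[-1..MAXN + 1, -1..MAXN + 1] of longint;  // a[i,0]: number of prerequisites of i; a[i,1..]: the prerequisites
  vis: array[-1..MAXN, -1..MAXM] of boolean;        // vis[i,j]: type i can be eaten in state j
  f: array[-1..MAXK + 1, -1..MAXM] of extended;     // f[i,j]: expected score from round i on, in state j

// Returns true if every prerequisite of type x has been eaten in state y.
function ok(x, y: longint): boolean;
var
  i: longint;
  tmp: array[-1..MAXN] of longint;
begin
  // Type i is stored in bit n-i of the state.
  for i := 1 to n do tmp[i] := (y shr (n - i)) and 1;
  for i := 1 to a[x, 0] do
    if tmp[a[x, i]] = 0 then exit(false);
  exit(true);
end;

function max(a, b: extended): extended;
begin
  if a > b then exit(a) else exit(b);
end;

begin
  readln(k, n);
  fillchar(a, sizeof(a), 0);
  for i := 1 to n do
  begin
    read(w[i]);
    read(x);
    while x <> 0 do   // the prerequisite list of type i ends with 0
    begin
      inc(a[i, 0]);
      a[i, a[i, 0]] := x;
      read(x);
    end;
    readln;
  end;
  for i := 1 to n do
    for j := 0 to (1 shl n) - 1 do vis[i, j] := ok(i, j);
  // f is a global array, so f[k+1, j] = 0 for every state j.
  for i := k downto 1 do            // i: the current round
    for j := 0 to (1 shl n) - 1 do  // j: taken / not-taken state of the n types
    begin
      for p := 1 to n do
        if vis[p, j] then
          f[i, j] := f[i, j] + max(f[i + 1, j], f[i + 1, j or (1 shl (n - p))] + w[p])
        else
          f[i, j] := f[i, j] + f[i + 1, j];
      f[i, j] := f[i, j] / n;
    end;
  writeln(f[1, 0]:0:6);
end.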
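For readers who would rather experiment without a Pascal compiler, here is a compact sketch of the same backward DP in Python. It is not a translation of the listing: the function name expected_score and its argument layout are my own, types are 0-based, type p is stored in bit p, and only two rows (the next round and the current round) are kept instead of the whole f table.

```python
def expected_score(k, values, prereqs):
    """Backward DP over (round, bitmask of eaten types).

    values[p]  : score of type p (0-based)
    prereqs[p] : list of 0-based types that must be eaten before p
    """
    n = len(values)
    # need[p]: bitmask of the prerequisites of type p.
    need = [sum(1 << q for q in set(ps)) for ps in prereqs]
    nxt = [0.0] * (1 << n)          # row for round k+1: nothing left to gain
    for _ in range(k):              # rounds k, k-1, ..., 1
        cur = [0.0] * (1 << n)
        for mask in range(1 << n):
            total = 0.0
            for p in range(n):      # the system throws type p with probability 1/n
                skip = nxt[mask]
                if (mask & need[p]) == need[p]:
                    take = nxt[mask | (1 << p)] + values[p]
                    total += max(skip, take)
                else:
                    total += skip   # not edible: the throw is wasted
            cur[mask] = total / n
        nxt = cur
    return nxt[0]                   # round 1, nothing eaten yet


# One round, two free treasures worth 1 and 2: (1 + 2) / 2.
print(f"{expected_score(1, [1, 2], [[], []]):.6f}")      # 1.500000
# Two rounds, type 0 costs 1 point but unlocks type 1 worth 10.
print(f"{expected_score(2, [-1, 10], [[], [0]]):.6f}")   # 2.000000
```

The second call is a small instance of the remark in the statement about negative scores: with two rounds left, eating the -1 treasure in round 1 is worth it, because it leaves a 1/2 chance of scoring 10 in round 2 (expected 5 - 1 = 4 instead of 0).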
