[BZOJ1076] [SCOI2008] Bonus Stage: solution report | bitmask (state-compression) DP

Last Update:2015-05-05 Source: Internet

Author: User


You are playing your favorite video game and have just entered a bonus stage. In this stage the system throws k treasures at random, one after another. Each time you may eat the treasure or not; you must decide before the next treasure is thrown, and a treasure you decline now cannot be eaten later. There are n types of treasure, and each throw picks one of the n types with equal probability, independently of all other throws. That is, even if the first k-1 throws were all treasure 1 (which can happen, although the probability is very small), each type still has probability 1/n on the k-th throw. Eating a treasure of type i gives Pi points, but not every treasure can be taken freely. Type i has a set of prerequisite treasures Si: only when every treasure in Si has been eaten at least once can a treasure of type i be eaten (if the system throws a treasure you cannot currently eat, that chance is simply wasted). Note that Pi can be negative, but if such a treasure is a prerequisite for many high-scoring ones, giving up the short-term gain and eating it can pay off more in the long run. Assuming you play the optimal strategy, how many points do you get on average in the bonus stage?

Actually, there was one point I did not understand at first ...

The statement says you must make your choice before the next treasure is thrown, and at first I thought that contradicted the idea of an optimal strategy ...

In fact, what we are asked for is an expected score.

We must choose before the next treasure is thrown because the course of the game is controlled by the system, and we cannot know it in advance.

The optimal strategy is to pick, at each step, the action with the maximum expected score given the probabilities of what the system will do.

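Consider a backward DP, from the last round to the first.

Enumerate the current round and the state of the n treasure types (eaten or not yet eaten) before this round's choice is made.

Then enumerate which treasure the system throws in this round.

If the treasure can be eaten, take the better of eating it and not eating it; if it cannot be eaten, the only option is not to eat it. Averaging these values over the n equally likely treasures gives the expected score of the state.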
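The steps above can be sketched as follows. This is a minimal Python sketch, not the author's program: the function name `expected_score`, the 0-based type indices and the list-based `prereqs` argument are assumptions made for illustration. It also keeps only two rounds of the table at a time rather than one row per round.

```python
def expected_score(k, scores, prereqs):
    """Backward DP over (round, set of treasure types eaten so far).

    scores[i]  : points for treasure type i (0-based, an assumption of this sketch)
    prereqs[i] : type indices that must have been eaten before type i can be eaten
    """
    n = len(scores)
    # need[i] is the bitmask of the prerequisite set of type i
    need = [sum(1 << q for q in set(prereqs[i])) for i in range(n)]
    full = 1 << n
    nxt = [0.0] * full  # after the last round nothing more can be gained
    for _ in range(k):  # process rounds k, k-1, ..., 1
        cur = [0.0] * full
        for mask in range(full):
            total = 0.0
            for i in range(n):  # treasure thrown in this round
                skip = nxt[mask]
                if (mask & need[i]) == need[i]:
                    take = nxt[mask | (1 << i)] + scores[i]
                    total += max(skip, take)
                else:
                    total += skip  # not edible: the chance is wasted
            cur[mask] = total / n  # each type is thrown with probability 1/n
        nxt = cur
    return nxt[0]  # first round, nothing eaten yet
```

For instance, `expected_score(1, [1, 2], [[], []])` evaluates to 1.5, and `expected_score(2, [-1, 10], [[], [0]])` evaluates to 2.0: in the second case it pays to eat the negative treasure in round 1 because it unlocks the 10-point one.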

Finally, one more question came to mind.

The statement says "a treasure you decide not to eat now cannot be eaten later", and this restriction is not reflected anywhere in the DP.

Note also that the restriction only matters when the current treasure can be eaten, that is, when we actually have a free choice.

But on reflection, there is no problem.

A treasure that can be eaten now could also be eaten later, because its prerequisite set is already satisfied.

However, once it is eaten now, the treasures that have it in their prerequisite sets also become available in the future.

So eating it now is never worse than a strategy that postpones eating it.

So when the optimal decisions are carried out, the situation "skip it now, eat it later" never arises.

program bzoj1076;
const
  maxn = 101;   // original value lost in extraction; assumed, must cover round index k+1
  maxm = 32768; // 2^15 states
var
  i, k, n, x, j, p: longint;
  w: array[-1..maxn] of longint;             // w[i]: score of treasure i
  a: array[-1..20, -1..20] of longint;       // a[i,0]: number of prerequisites of i; bounds assumed
  vis: array[-1..20, -1..maxm] of boolean;   // vis[i,j]: treasure i is edible in state j
  f: array[-1..maxn, -1..maxm] of extended;  // f[i,j]: expected score from round i on, in state j

// true if every prerequisite of treasure x is already eaten in state y
function ok(x, y: longint): boolean;
var
  i: longint;
  tmp: array[-1..20] of longint;
begin
  for i := 1 to n do tmp[i] := (y shr (n - i)) and 1;
  for i := 1 to a[x, 0] do
    if tmp[a[x, i]] = 0 then exit(false);
  exit(true);
end;

function max(u, v: extended): extended;
begin
  if u > v then exit(u) else exit(v);
end;

begin
  readln(k, n);
  fillchar(a, sizeof(a), 0);
  for i := 1 to n do
  begin
    read(w[i]);
    read(x);
    while x <> 0 do   // the prerequisite list ends with 0
    begin
      inc(a[i, 0]);
      a[i, a[i, 0]] := x;
      read(x);
    end;
    readln;
  end;
  for i := 1 to n do
    for j := 0 to 1 shl n - 1 do vis[i, j] := ok(i, j);
  for i := k downto 1 do            // i is the current round
    for j := 0 to 1 shl n - 1 do    // j is the eaten / not eaten state of the n treasures
    begin
      for p := 1 to n do
        if vis[p, j] then
          f[i, j] := f[i, j] + max(f[i + 1, j], f[i + 1, j or (1 shl (n - p))] + w[p])
        else
          f[i, j] := f[i, j] + f[i + 1, j];
      f[i, j] := f[i, j] / n;
    end;
  writeln(f[1, 0]:0:6);
end.
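The listing stores treasure i (numbered from 1) in bit n - i of the state and tests prerequisites by unpacking the state into an array. The same test can be done directly on bitmasks. Below is a small Python sketch of that idea; the helper names `bit_of` and `is_edible` and the dictionary-based `prereqs` argument are hypothetical and only mirror the bit order used in the listing.

```python
def bit_of(i, n):
    """Bit assigned to treasure type i (1-based), using the shift by n - i from the listing."""
    return 1 << (n - i)


def is_edible(x, state, prereqs, n):
    """True if every prerequisite of type x is already set in `state`."""
    need = 0
    for q in prereqs[x]:
        need |= bit_of(q, n)
    return (state & need) == need
```

With n = 3 and `prereqs = {1: [], 2: [1], 3: [1, 2]}`, the state `0b100` (only treasure 1 eaten) makes type 2 edible but not type 3.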


