6. Value Function Approximation: LSPI Code (5)

Source: Internet
Author: User

This article covers sample.py, which defines the Sample class that holds a single LSPI sample.

# -*- coding: utf-8 -*-
"""Contains class representing an LSPI sample."""


class Sample(object):

    """Represents an LSPI sample tuple ``(s, a, r, s', absorb)``.

    # Expresses one LSPI sample in tuple form.

    Parameters  # Input parameters
    ----------

    state : numpy.array  # State vector
        State of the environment at the start of the sample.
        ``s`` in the sample tuple.
        (The usual type is a numpy array.)
    action : int  # Index of the action performed
        Index of the action that was executed.
        ``a`` in the sample tuple.
    reward : float  # Reward from the environment
        Reward received from the environment.
        ``r`` in the sample tuple.
    next_state : numpy.array  # State after taking the sample's action
        State of the environment after executing the sample's action.
        ``s'`` in the sample tuple.
        (The type should match that of state.)
    absorb : bool, optional  # True if this sample ends the episode
        True if this sample ended the episode. False otherwise.
        ``absorb`` in the sample tuple.
        (The default is False, which implies a non-episode-ending
        sample.)

    Assumes that this is a non-absorbing sample (as the vast majority
    of samples will be non-absorbing).
    # By default the sample is assumed not to end the episode.
    # It is written as a class so it can be used conveniently in
    # different ways.

    This class is just a dumb data holder, so the types of the different
    fields can be anything convenient for the problem domain.

    For states represented by vectors, a numpy array works well.

    """

    def __init__(self, state, action, reward, next_state, absorb=False):  # Initialize
        """Initialize Sample instance."""
        self.state = state
        self.action = action
        self.reward = reward
        self.next_state = next_state
        self.absorb = absorb

    def __repr__(self):  # Called when the sample is printed
        """Create string representation of tuple."""
        return 'Sample(%s, %s, %s, %s, %s)' % (self.state,
                                               self.action,
                                               self.reward,
                                               self.next_state,
                                               self.absorb)
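As a rough illustration of how such a data holder might be used, the sketch below builds two samples and reads their fields back. It repeats a trimmed copy of the Sample class so that it runs on its own. The transition values are made up for the example, and the states are plain lists rather than numpy arrays (an assumption made only to avoid the dependency; the class does not inspect the state type).

```python
class Sample(object):
    """Trimmed copy of the Sample data holder so this sketch runs alone."""

    def __init__(self, state, action, reward, next_state, absorb=False):
        self.state = state
        self.action = action
        self.reward = reward
        self.next_state = next_state
        self.absorb = absorb

    def __repr__(self):
        return 'Sample(%s, %s, %s, %s, %s)' % (
            self.state, self.action, self.reward,
            self.next_state, self.absorb)


# Hypothetical transitions, invented for illustration only.
# absorb is omitted for the first sample, so it defaults to False.
step = Sample([0.0, 1.0], 1, 0.5, [1.0, 0.0])
# The second sample ends the episode, so absorb is set explicitly.
last = Sample([1.0, 0.0], 0, 1.0, [1.0, 1.0], absorb=True)

samples = [step, last]

# Fields are ordinary attributes, so a batch can be scanned directly.
episode_ended = any(s.absorb for s in samples)
total_reward = sum(s.reward for s in samples)

# __repr__ is what print() and repr() use for the object.
step_repr = repr(step)
print(step_repr)
```

Because absorb defaults to False, only the episode-ending sample needs to pass it explicitly, which matches the docstring's remark that most samples are non-absorbing.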
