raft.Node has a memoryStore member (defined in manager/state/raft/raft.go):
    // Node represents the Raft Node useful
    // configuration.
    type Node struct {
        ......
        memoryStore *store.MemoryStore
        ......
    }
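It is very important, because the store member of the Server that the manager leader in the cluster uses to respond to swarmctl commands actually points to the memoryStore in the manager's Node struct: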
    // Server is the Cluster API gRPC server.
    type Server struct {
        store  *store.MemoryStore
        raft   *raft.Node
        rootCA *ca.RootCA
    }
store.MemoryStore is defined in manager/state/store/memory.go:
    // MemoryStore is a concurrency-safe, in-memory implementation of the Store
    // interface.
    type MemoryStore struct {
        // updateLock must be held during an update transaction.
        updateLock sync.Mutex
        memDB      *memdb.MemDB
        queue      *watch.Queue
        proposer   state.Proposer
    }
The in-memory database that actually holds the data is provided by the go-memdb project. A store.MemoryStore is initialized with the NewMemoryStore() function:
    // NewMemoryStore returns an in-memory store. The argument is an optional
    // Proposer which will be used to propagate changes to other members in a
    // cluster.
    func NewMemoryStore(proposer state.Proposer) *MemoryStore {
        memDB, err := memdb.NewMemDB(schema)
        if err != nil {
            // This shouldn't fail
            panic(err)
        }
        return &MemoryStore{
            memDB:    memDB,
            queue:    watch.NewQueue(0),
            proposer: proposer,
        }
    }
Here schema is a variable of type *memdb.DBSchema:
    schema = &memdb.DBSchema{
        Tables: map[string]*memdb.TableSchema{},
    }
Tables are added to schema with the register function (defined in manager/state/store/memory.go):
    func register(os ObjectStoreConfig) {
        objectStorers = append(objectStorers, os)
        schema.Tables[os.Name] = os.Table
    }
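The register function is called from the init() functions of the files in the store package (cluster.go, networks.go, nodes.go, services.go and tasks.go, which correspond exactly to the 5 subcommands of swarmctl) to register how each kind of object is handled.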
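The registration pattern can be illustrated with a minimal, stdlib-only sketch. It assumes nothing about swarmkit beyond what is described above: tableConfig and registry are hypothetical stand-ins for ObjectStoreConfig and schema.Tables, and the two init() functions play the role of the per-file init() functions.

```go
package main

import "fmt"

// tableConfig is a hypothetical, stripped-down stand-in for ObjectStoreConfig.
type tableConfig struct {
	Name string
}

// registry plays the role of schema.Tables in this sketch.
var registry = map[string]tableConfig{}

// register records one table configuration under its name.
func register(c tableConfig) {
	registry[c.Name] = c
}

// In the real package each file has its own init(); Go also allows
// several init() functions in one file and runs all of them before main().
func init() { register(tableConfig{Name: "service"}) }
func init() { register(tableConfig{Name: "task"}) }

func main() {
	// Both tables are already registered when main() starts.
	fmt.Println(len(registry)) // prints 2
}
```

Because package-level variables are initialized before any init() runs, the registry map is ready by the time the first register call happens.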
ObjectStoreConfig is defined in manager/state/store/object.go:
    // ObjectStoreConfig provides the necessary methods to store a particular object
    // type inside MemoryStore.
    type ObjectStoreConfig struct {
        Name             string
        Table            *memdb.TableSchema
        Save             func(ReadTx, *api.StoreSnapshot) error
        Restore          func(Tx, *api.StoreSnapshot) error
        ApplyStoreAction func(Tx, *api.StoreAction) error
        NewStoreAction   func(state.Event) (api.StoreAction, error)
    }
It defines how one kind of object is stored.
Take services.go as an example:
    const tableService = "service"
    func init() {
        register(ObjectStoreConfig{
            Name: tableService,
            Table: &memdb.TableSchema{
                Name: tableService,
                Indexes: map[string]*memdb.IndexSchema{
                    indexID: {
                        Name:    indexID,
                        Unique:  true,
                        Indexer: serviceIndexerByID{},
                    },
                    indexName: {
                        Name:    indexName,
                        Unique:  true,
                        Indexer: serviceIndexerByName{},
                    },
                },
            },
            Save: func(tx ReadTx, snapshot *api.StoreSnapshot) error {
                var err error
                snapshot.Services, err = FindServices(tx, All)
                return err
            },
            Restore: func(tx Tx, snapshot *api.StoreSnapshot) error {
                services, err := FindServices(tx, All)
                if err != nil {
                    return err
                }
                for _, s := range services {
                    if err := DeleteService(tx, s.ID); err != nil {
                        return err
                    }
                }
                for _, s := range snapshot.Services {
                    if err := CreateService(tx, s); err != nil {
                        return err
                    }
                }
                return nil
            },
            ApplyStoreAction: func(tx Tx, sa *api.StoreAction) error {
                switch v := sa.Target.(type) {
                case *api.StoreAction_Service:
                    obj := v.Service
                    switch sa.Action {
                    case api.StoreActionKindCreate:
                        return CreateService(tx, obj)
                    case api.StoreActionKindUpdate:
                        return UpdateService(tx, obj)
                    case api.StoreActionKindRemove:
                        return DeleteService(tx, obj.ID)
                    }
                }
                return errUnknownStoreAction
            },
            NewStoreAction: func(c state.Event) (api.StoreAction, error) {
                var sa api.StoreAction
                switch v := c.(type) {
                case state.EventCreateService:
                    sa.Action = api.StoreActionKindCreate
                    sa.Target = &api.StoreAction_Service{
                        Service: v.Service,
                    }
                case state.EventUpdateService:
                    sa.Action = api.StoreActionKindUpdate
                    sa.Target = &api.StoreAction_Service{
                        Service: v.Service,
                    }
                case state.EventDeleteService:
                    sa.Action = api.StoreActionKindRemove
                    sa.Target = &api.StoreAction_Service{
                        Service: v.Service,
                    }
                default:
                    return api.StoreAction{}, errUnknownStoreAction
                }
                return sa, nil
            },
        })
    }
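NewStoreAction builds the api.StoreAction for the service table from a state event; ApplyStoreAction performs the operation the action asks for (create, update or remove); Save reads all services from the database and stores them in a snapshot; Restore replaces the services in the database with the ones in the snapshot, by deleting every existing service and then recreating each service recorded in the snapshot.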
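The Save/Restore pair can be sketched without any swarmkit types. In this simplified illustration, snapshot and table are hypothetical stand-ins for api.StoreSnapshot and the service table, and a service is reduced to its name; only the delete-everything-then-recreate behaviour of Restore is taken from the code above.

```go
package main

import (
	"fmt"
	"sort"
)

// snapshot and table are hypothetical stand-ins for api.StoreSnapshot
// and the service table; a service is reduced to its name here.
type snapshot struct{ Services []string }
type table map[string]bool

// save copies every service currently in the table into a snapshot.
func save(t table) snapshot {
	var s snapshot
	for name := range t {
		s.Services = append(s.Services, name)
	}
	sort.Strings(s.Services) // map iteration order is random; sort for stable output
	return s
}

// restore deletes everything in the table, then recreates the
// services recorded in the snapshot.
func restore(t table, s snapshot) {
	for name := range t {
		delete(t, name)
	}
	for _, name := range s.Services {
		t[name] = true
	}
}

func main() {
	t := table{"web": true, "db": true}
	snap := save(t)

	// Changes made after the snapshot was taken.
	t["cache"] = true
	delete(t, "web")

	restore(t, snap)
	fmt.Println(save(t).Services) // prints [db web]
}
```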
Next, look at the function the manager leader uses to create a service (manager/controlapi/service.go):
    // CreateService creates and return a Service based on the provided ServiceSpec.
    // - Returns `InvalidArgument` if the ServiceSpec is malformed.
    // - Returns `Unimplemented` if the ServiceSpec references unimplemented features.
    // - Returns `AlreadyExists` if the ServiceID conflicts.
    // - Returns an error if the creation fails.
    func (s *Server) CreateService(ctx context.Context, request *api.CreateServiceRequest) (*api.CreateServiceResponse, error) {
        ......
        err := s.store.Update(func(tx store.Tx) error {
            return store.CreateService(tx, service)
        })
        if err != nil {
            return nil, err
        }
        ......
    }
The s.store.Update() function is the core part (manager/state/store/memory.go):
    // Update executes a read/write transaction.
    func (s *MemoryStore) Update(cb func(Tx) error) error {
        return s.update(s.proposer, cb)
    }
Now look at the MemoryStore.update() function (manager/state/store/memory.go):
    func (s *MemoryStore) update(proposer state.Proposer, cb func(Tx) error) error {
        s.updateLock.Lock()
        memDBTx := s.memDB.Txn(true)
        var curVersion *api.Version
        if proposer != nil {
            curVersion = proposer.GetVersion()
        }
        var tx tx
        tx.init(memDBTx, curVersion)
        err := cb(&tx)
        if err == nil {
            if proposer == nil {
                memDBTx.Commit()
            } else {
                var sa []*api.StoreAction
                sa, err = tx.changelistStoreActions()
                if err == nil {
                    if sa != nil {
                        err = proposer.ProposeValue(context.Background(), sa, func() {
                            memDBTx.Commit()
                        })
                    } else {
                        memDBTx.Commit()
                    }
                }
            }
        }
        if err == nil {
            for _, c := range tx.changelist {
                s.queue.Publish(c)
            }
            if len(tx.changelist) != 0 {
                s.queue.Publish(state.EventCommit{})
            }
        } else {
            memDBTx.Abort()
        }
        s.updateLock.Unlock()
        return err
    }
Let's walk through this function step by step:
(1)
memDBTx := s.memDB.Txn(true)
This is standard go-memdb usage; true means that a write transaction is created.
(2)
    if proposer != nil {
        curVersion = proposer.GetVersion()
    }
proposer is the raft.Node member of the manager; its job is to notify the other follower managers in the cluster of the changes that have taken place:
    // ProposeValue calls Propose on the raft and waits
    // on the commit log action before returning a result
    func (n *Node) ProposeValue(ctx context.Context, storeAction []*api.StoreAction, cb func()) error {
        _, err := n.processInternalRaftRequest(ctx, &api.InternalRaftRequest{Action: storeAction}, cb)
        if err != nil {
            return err
        }
        return nil
    }
    // GetVersion returns the sequence information for the current raft round.
    func (n *Node) GetVersion() *api.Version {
        n.stopMu.RLock()
        defer n.stopMu.RUnlock()
        if !n.IsMember() {
            return nil
        }
        status := n.Node.Status()
        return &api.Version{Index: status.Commit}
    }
(3)
    var tx tx
    tx.init(memDBTx, curVersion)
    err := cb(&tx)
Here tx is defined as follows:
    // Tx is a read/write transaction. Note that transaction does not imply
    // any internal batching. The purpose of this transaction is to give the
    // user a guarantee that its changes won't be visible to other transactions
    // until the transaction is over.
    type Tx interface {
        ReadTx
        create(table string, o Object) error
        update(table string, o Object) error
        delete(table, id string) error
    }
    type tx struct {
        readTx
        curVersion *api.Version
        changelist []state.Event
    }
tx implements a read/write transaction.
tx.init() is a plain field-by-field assignment:
    func (tx *tx) init(memDBTx *memdb.Txn, curVersion *api.Version) {
        tx.memDBTx = memDBTx
        tx.curVersion = curVersion
        tx.changelist = nil
    }
cb is the callback passed in by CreateService:
    func(tx store.Tx) error {
        return store.CreateService(tx, service)
    }
The store.CreateService() function:
    // CreateService adds a new service to the store.
    // Returns ErrExist if the ID is already taken.
    func CreateService(tx Tx, s *api.Service) error {
        // Ensure the name is not already in use.
        if tx.lookup(tableService, indexName, strings.ToLower(s.Spec.Annotations.Name)) != nil {
            return ErrNameConflict
        }
        return tx.create(tableService, serviceEntry{s})
    }
The code above first makes sure that the service name is not already in use, and then creates the service:
    // create adds a new object to the store.
    // Returns ErrExist if the ID is already taken.
    func (tx *tx) create(table string, o Object) error {
        if tx.lookup(table, indexID, o.ID()) != nil {
            return ErrExist
        }
        copy := o.Copy()
        meta := copy.Meta()
        if err := touchMeta(&meta, tx.curVersion); err != nil {
            return err
        }
        copy.SetMeta(meta)
        err := tx.memDBTx.Insert(table, copy)
        if err == nil {
            tx.changelist = append(tx.changelist, copy.EventCreate())
            o.SetMeta(meta)
        }
        return err
    }
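The function above stores a copy of the Object (here a serviceEntry struct) in the database and appends a state.EventCreateService to tx.changelist.
In functions like these that take a callback as a parameter, the callback is what does the real work; the rest of the function only provides common plumbing, such as obtaining the transaction and committing it.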
(4)
    if err == nil {
        if proposer == nil {
            memDBTx.Commit()
        } else {
            var sa []*api.StoreAction
            sa, err = tx.changelistStoreActions()
            if err == nil {
                if sa != nil {
                    err = proposer.ProposeValue(context.Background(), sa, func() {
                        memDBTx.Commit()
                    })
                } else {
                    memDBTx.Commit()
                }
            }
        }
    }
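This commits the data to the database. Without a proposer the transaction is committed directly; with one, the changelist is converted into store actions and handed to proposer.ProposeValue(), with memDBTx.Commit() passed in as the callback.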
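The commit decision in this step can be sketched as follows. This is a simplified illustration under stated assumptions: the proposer interface is a cut-down, hypothetical version of state.Proposer, actions are plain strings, and fakeProposer simply accepts or rejects a proposal instead of running raft. It only shows that the commit callback runs when there is no proposer, when there is nothing to propose, or when the proposal succeeds.

```go
package main

import (
	"errors"
	"fmt"
)

// proposer is a hypothetical, simplified version of state.Proposer.
type proposer interface {
	ProposeValue(actions []string, commit func()) error
}

// fakeProposer stands in for the raft node: it calls commit only
// when the proposal is accepted.
type fakeProposer struct{ accept bool }

func (p fakeProposer) ProposeValue(actions []string, commit func()) error {
	if !p.accept {
		return errors.New("proposal rejected")
	}
	commit()
	return nil
}

// commitVia mirrors the branching of step (4) and reports whether
// the commit callback ran.
func commitVia(p proposer, actions []string) (committed bool, err error) {
	commit := func() { committed = true }
	if p == nil || actions == nil {
		commit() // no proposer, or nothing to propose: commit locally
		return committed, nil
	}
	err = p.ProposeValue(actions, commit)
	return committed, err
}

func main() {
	actions := []string{"create service"}
	fmt.Println(commitVia(nil, actions))                         // prints true <nil>
	fmt.Println(commitVia(fakeProposer{accept: true}, actions))  // prints true <nil>
	fmt.Println(commitVia(fakeProposer{accept: false}, actions)) // prints false proposal rejected
}
```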
(5)
    if err == nil {
        for _, c := range tx.changelist {
            s.queue.Publish(c)
        }
        if len(tx.changelist) != 0 {
            s.queue.Publish(state.EventCommit{})
        }
    } else {
        memDBTx.Abort()
    }
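The s.queue.Publish() function notifies the other goroutines (for example m.globalOrchestrator.Run()) that a service has been created; those goroutines then carry out the actual work of creating the service. If any earlier step failed, the transaction is aborted instead.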
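The publish step can be illustrated with a tiny channel-based queue. This is only a sketch of the idea: queue, watch and publish are hypothetical names rather than the API of the watch package, and events are plain strings instead of state.Event values.

```go
package main

import "fmt"

// queue is a hypothetical, minimal stand-in for the watch queue.
type queue struct {
	subs []chan string
}

// watch registers a new subscriber and returns its event channel.
func (q *queue) watch() <-chan string {
	ch := make(chan string, 8) // buffered so publish does not block in this sketch
	q.subs = append(q.subs, ch)
	return ch
}

// publish delivers the event to every subscriber.
func (q *queue) publish(ev string) {
	for _, ch := range q.subs {
		ch <- ev
	}
}

func main() {
	q := &queue{}
	events := q.watch()

	// A consumer goroutine, playing the role of an orchestrator.
	done := make(chan []string)
	go func() {
		var seen []string
		for ev := range events {
			seen = append(seen, ev)
			if ev == "commit" {
				break
			}
		}
		done <- seen
	}()

	// Publish the changelist, then a final commit event.
	for _, c := range []string{"create service"} {
		q.publish(c)
	}
	q.publish("commit")

	fmt.Println(<-done) // prints [create service commit]
}
```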
In addition, MemoryStore provides the View function, which performs a read transaction:
    // ReadTx is a read transaction. Note that transaction does not imply
    // any internal batching. It only means that the transaction presents a
    // consistent view of the data that cannot be affected by other
    // transactions.
    type ReadTx interface {
        lookup(table, index, id string) Object
        get(table, id string) Object
        find(table string, by By, checkType func(By) error, appendResult func(Object)) error
    }
    type readTx struct {
        memDBTx *memdb.Txn
    }
    // View executes a read transaction.
    func (s *MemoryStore) View(cb func(ReadTx)) {
        memDBTx := s.memDB.Txn(false)
        readTx := readTx{
            memDBTx: memDBTx,
        }
        cb(readTx)
        memDBTx.Commit()
    }