OpenStack Swift Source Code Introduction: Overall Business Architecture and the Proxy Process
Source code analyses of OpenStack are widespread on the Internet, and each part has been interpreted in great detail. Here I record some key points from my own reading of the Swift source code, based on my own understanding, in the hope that they help those who need them.
I. Overall Framework of Swift
As shown in the source code directory structure of Swift: proxy is the front-end service access process; the account, container, and object directories hold the business logic of the account, container, and object processes respectively; the common directory contains shared utility code, and common/ring contains the consistent-hash ring logic. Next, we will introduce the source code logic of each process and some key mechanisms.
For the logical relationships between the business processes and modules, refer to the architecture diagram in the OpenStack Swift introduction.
II. Proxy Process Business Processing
First, you must understand the PasteDeploy-based, layered WSGI pipeline architecture. Following the layers defined by PasteDeploy in the configuration file, the code flow can be traced quickly from middleware to server: the outermost middleware is the service entry point. For the proxy process, the flow can be given as a simple business sequence diagram:
The division of labor at each layer is very clear. For example, in the default configuration file of the proxy process, the exception-handling middleware sits at the top of the pipeline, so any unhandled exception thrown by the business layers below it is caught and handled there.
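As a rough illustration (middleware names and their order vary across Swift releases and deployments), a proxy-server.conf pipeline could look like the fragment below, with the exception-catching filter placed first so it wraps everything to its right:

```ini
# Illustrative /etc/swift/proxy-server.conf fragment; the exact set of
# filters and their order are examples only, not a definitive pipeline.
[pipeline:main]
pipeline = catch_errors healthcheck proxy-logging cache authtoken keystoneauth proxy-server

[filter:catch_errors]
# Outermost layer: unhandled exceptions from the layers to its right
# are turned into error responses here.
use = egg:swift#catch_errors
```

Requests enter at the left end of the pipeline and pass through each filter in turn until they reach the proxy-server app at the right end; responses unwind back through the same layers in reverse.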
The proxy process parses the resource path in the requested URI (account, container, object) together with the request method (PUT, DELETE, and so on) to determine the type of resource being requested, then finds the controller responsible for that resource type, and the controller distributes the request to the specific resource servers. Distribution is based on the consistent hash ring, which is generated by tools during system initialization; the specific steps are covered in the Swift and Keystone standalone installation summary document.
The OpenStack Swift introduction describes the node lookup process from a theoretical perspective: take the md5 hash of the path, shift it to determine the partition, and then find all the nodes holding that partition. The specific code is:
```python
container_partition, containers = self.app.container_ring.get_nodes(
    self.account_name, self.container_name)
```

```python
def get_nodes(self, account, container=None, obj=None):
    """
    Get the partition and nodes for an account/container/object.
    If a node is responsible for more than one replica, it will
    only appear in the output once.

    :param account: account name
    :param container: container name
    :param obj: object name
    :returns: a tuple of (partition, list of node dicts)

    Each node dict will have at least the following keys:

    ======  ===============================================================
    id      unique integer identifier amongst devices
    weight  a float of the relative weight of this device as compared to
            others; this indicates how many partitions the builder will try
            to assign to this device
    zone    integer indicating which zone the device is in; a given
            partition will not be assigned to multiple devices within the
            same zone
    ip      the ip address of the device
    port    the tcp port of the device
    device  the device's name on disk (sdb1, for example)
    meta    general use 'extra' field; for example: the online date, the
            hardware description
    ======  ===============================================================
    """
    part = self.get_part(account, container, obj)
    return part, self._get_part_nodes(part)

def get_part(self, account, container=None, obj=None):
    """
    Get the partition for an account/container/object.

    :param account: account name
    :param container: container name
    :param obj: object name
    :returns: the partition number
    """
    key = hash_path(account, container, obj, raw_digest=True)
    if time() > self._rtime:
        self._reload()
    part = struct.unpack_from('>I', key)[0] >> self._part_shift
    return part

def _get_part_nodes(self, part):
    part_nodes = []
    seen_ids = set()
    for r2p2d in self._replica2part2dev_id:
        if part < len(r2p2d):
            dev_id = r2p2d[part]
            if dev_id not in seen_ids:
                part_nodes.append(self.devs[dev_id])
                seen_ids.add(dev_id)
    return part_nodes
```
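The md5-plus-shift step in `get_part` above can be demonstrated standalone. The sketch below is a simplified illustration, not Swift's actual `hash_path` (which salts the path with a prefix/suffix from swift.conf); `PART_POWER`, `get_partition`, and the empty suffix are all assumptions for the example:

```python
import hashlib
import struct

PART_POWER = 10               # 2**10 = 1024 partitions (illustrative value)
PART_SHIFT = 32 - PART_POWER  # drop the low bits of a 32-bit hash prefix

def get_partition(account, container=None, obj=None, suffix=''):
    # Build the storage path, md5 it, take the first 4 bytes as a
    # big-endian integer, then shift right to get the partition number.
    path = '/' + '/'.join(p for p in (account, container, obj) if p)
    key = hashlib.md5((path + suffix).encode('utf-8')).digest()
    return struct.unpack_from('>I', key)[0] >> PART_SHIFT

part = get_partition('AUTH_test', 'photos', 'cat.jpg')
print(part)  # a deterministic integer in [0, 1024)
```

Because the partition depends only on the md5 of the path, any proxy node computes the same partition for the same object without coordination; the ring then maps that partition to its devices.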
Then, based on the quorum principle, the proxy determines how many nodes must return success for the current request to succeed. For example, with N/W/R = 3/2/2, at least two nodes must be written successfully for the write to be considered successful. This is reflected in the shared make_requests method:
```python
def make_requests(self, req, ring, part, method, path, headers,
                  query_string=''):
    """
    Sends an HTTP request to multiple nodes and aggregates the results.
    It attempts the primary nodes concurrently, then iterates over the
    handoff nodes as needed.

    :param req: a request sent by the client
    :param ring: the ring used for finding backend servers
    :param part: the partition number
    :param method: the method to send to the backend
    :param path: the path to send to the backend
                 (full path ends up being /<$device>/<$part>/<$path>)
    :param headers: a list of dicts, where each dict represents one
                    backend request that should be made.
    :param query_string: optional query string to send to the backend
    :returns: a swob.Response object
    """
    start_nodes = ring.get_part_nodes(part)
    nodes = GreenthreadSafeIterator(self.app.iter_nodes(ring, part))
    pile = GreenAsyncPile(len(start_nodes))
    for head in headers:
        pile.spawn(self._make_request, nodes, part, method, path,
                   head, query_string, self.app.logger.thread_locals)
    response = []
    statuses = []
    for resp in pile:
        if not resp:
            continue
        response.append(resp)
        statuses.append(resp[0])
        if self.have_quorum(statuses, len(start_nodes)):
            break
    # give any pending requests *some* chance to finish
    pile.waitall(self.app.post_quorum_timeout)
    while len(response) < len(start_nodes):
        response.append((HTTP_SERVICE_UNAVAILABLE, '', '', ''))
    statuses, reasons, resp_headers, bodies = zip(*response)
    return self.best_response(req, statuses, reasons, bodies,
                              '%s %s' % (self.server_type, req.method),
                              headers=resp_headers)
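The `have_quorum` check used above can be sketched with simple majority semantics. This is an assumed, simplified version for illustration, not Swift's exact implementation (which lives on the controller and has evolved across releases):

```python
def have_quorum(statuses, node_count):
    # Majority quorum: more than half of the replica nodes must have
    # returned a 2xx success status.
    quorum = node_count // 2 + 1
    successes = sum(1 for s in statuses if 200 <= s < 300)
    return successes >= quorum

# With 3 replicas, two successful writes (201 Created) are enough:
print(have_quorum([201, 201], 3))        # True
print(have_quorum([201, 503, 503], 3))   # False
```

This is why `make_requests` can break out of its response loop early: once a majority of backends have succeeded, the overall result is already decided, and the remaining requests only get a short grace period (`post_quorum_timeout`) to finish.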