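IOStream in Tornado wraps the non-blocking read and write operations of a socket. The part I find most interesting is the read_until() interface: you give it a delimiter string and a callback, and nothing else is needed. When IOStream reads the delimiter, it invokes the callback automatically. The whole interface is friendly, concise, and convenient.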
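Attributes:
self.socket: the wrapped socket, in non-blocking mode;
self._read_buffer: the read buffer, a collections.deque; self._write_buffer is similar
self.io_loop: the event loop, needed to add or modify the read/write events watched on the fd
self._state: the events (read/write/error) that the event loop watches on the socket
self._read_callback: the callback to run once the requested number of bytes, or the requested delimiter, has been read
self._write_callback: the callback to run once all the data in _write_buffer has been sent
self._connect_callback: used when self.socket (non-blocking) acts as a client connecting to a server; the callback to run once the connection is established
self._connecting: True while self.socket (non-blocking), acting as a client, is waiting for the connection to be established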
Public interface:
1. Constructor: initializes the attributes of the IOStream instance and registers the socket's ERROR event with the ioloop (epoll). The handler is self._handle_events.
def __init__(self, socket, io_loop=None, max_buffer_size=104857600,
             read_chunk_size=4096):
    ....
    self.io_loop.add_handler(
        self.socket.fileno(), self._handle_events, self._state)
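2. connect: here the IOStream is used on the client side. Note that when connect returns, the connection is not necessarily established yet, because the socket is non-blocking.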
def connect(self, address, callback=None):
    self._connecting = True
    try:
        self.socket.connect(address)
    except socket.error, e:
        if e.args[0] not in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            raise
    self._connect_callback = stack_context.wrap(callback)
    self._add_io_state(self.io_loop.WRITE)
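The same pattern can be tried outside Tornado with only the standard library. The following is a minimal Python 3 sketch, not Tornado's implementation: start_connect and finish_connect are hypothetical names, and select() stands in for the WRITE event that the ioloop would deliver.

```python
import errno
import select
import socket


def start_connect(address):
    """Begin a non-blocking connect and return the socket immediately."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # connect_ex reports the result as an errno value instead of raising.
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, "connect failed")
    return sock


def finish_connect(sock, timeout=5.0):
    """Wait until the socket is writable, then check SO_ERROR.

    Returns True if the connection was established, False otherwise.
    """
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        return False
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
```

Writability signals that the connect attempt has finished, not that it succeeded, which is why the sketch reads SO_ERROR before reporting success.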
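3. read_until: its main job is to record the delimiter string and the corresponding callback. Along the way it scans the current buffer for the delimiter and tries to read new data from the socket. The details are covered in the analysis of _handle_events.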
def read_until(self, delimiter, callback):
    assert not self._read_callback, "Already reading"
    self._read_delimiter = delimiter
    self._read_callback = stack_context.wrap(callback)
    while True:
        # See if we've already got the data from a previous read
        if self._read_from_buffer():
            return
        self._check_closed()
        # hp: keep reading data into _read_buffer; == 0 means EAGAIN or closed
        if self._read_to_buffer() == 0:
            break
    # hp: when does the callback run? In _handle_events
    # hp: keep watching the socket's READ event via ioloop.update_handler()
    # hp: the read handler of self.socket is self._handle_events
    self._add_io_state(self.io_loop.READ)
4. write: similar to read_until. Its main job is to queue the data to send and record the callback to run once that data has been sent.
def write(self, data, callback=None):
    self._check_closed()
    self._write_buffer.append(data)
    self._add_io_state(self.io_loop.WRITE)  # hp: start watching the socket's write event
    self._write_callback = stack_context.wrap(callback)
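To make the "queue now, send when writable" idea concrete, here is a small Python 3 sketch. WriteQueue and its methods are hypothetical names invented for illustration; it only assumes a non-blocking socket and mimics what a WRITE-event handler might do, without claiming to match Tornado's _handle_write.

```python
import collections
import socket


class WriteQueue(object):
    """Hypothetical write side of a stream: buffer data, flush on demand."""

    def __init__(self, sock):
        self.sock = sock                    # must be non-blocking
        self.buffer = collections.deque()
        self.callback = None

    def write(self, data, callback=None):
        # Only queue the data; the actual send happens in handle_write().
        self.buffer.append(data)
        self.callback = callback

    def handle_write(self):
        """Call when the socket is writable. Returns True once drained."""
        while self.buffer:
            chunk = self.buffer[0]
            try:
                sent = self.sock.send(chunk)
            except BlockingIOError:
                return False    # kernel buffer full; wait for the next event
            if sent < len(chunk):
                self.buffer[0] = chunk[sent:]   # keep the unsent tail
            else:
                self.buffer.popleft()
        # Everything was sent: fire the completion callback exactly once.
        callback, self.callback = self.callback, None
        if callback is not None:
            callback()
        return True
```

As in the write() shown above, a later write replaces the stored callback, so only the most recent callback runs once the buffer is empty.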
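5. The remaining methods, reading, writing, and closed, are all very simple.
The focus of our IOStream analysis is self._handle_events, the handler for read and write events on self.socket.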
def _handle_events(self, fd, events):
    ...
    try:
        if events & self.io_loop.READ:
            self._handle_read()
        if not self.socket:
            return
        if events & self.io_loop.WRITE:
            if self._connecting:  # hp: establishing the connection to the server
                self._handle_connect()  # hp: invokes the hook set by connect, self._connect_callback
            self._handle_write()  # hp: writes the data in _write_buffer to the socket; runs the callback once everything is written
        if not self.socket:
            return
        if events & self.io_loop.ERROR:  # hp: epoll reported an error, close the connection
            self.close()
            return
        # hp: update the events epoll watches
        state = self.io_loop.ERROR
        if self.reading():  # self._read_callback is not None
            state |= self.io_loop.READ
        if self.writing():
            state |= self.io_loop.WRITE
        if state != self._state:
            self._state = state
            self.io_loop.update_handler(self.socket.fileno(), self._state)
    except:
        logging.error("Uncaught exception, closing connection.",
                      exc_info=True)
        self.close()
        raise
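The last part of the handler recomputes the interest mask and only talks to the event loop when the mask actually changed. A stripped-down Python 3 sketch of that step follows. The Registration class is hypothetical, and the flag values are assumptions chosen to resemble epoll's bits; the updates list merely stands in for a call such as io_loop.update_handler.

```python
# Illustrative flag values (assumed, chosen to resemble epoll's bits).
READ = 0x001
WRITE = 0x004
ERROR = 0x018


class Registration(object):
    """Hypothetical stand-in for one fd's registration with an event loop."""

    def __init__(self):
        self.state = ERROR      # errors are always watched
        self.updates = []       # records every mask pushed to the "loop"

    def refresh(self, reading, writing):
        """Recompute the mask; notify the loop only if it changed."""
        state = ERROR
        if reading:
            state |= READ
        if writing:
            state |= WRITE
        if state != self.state:
            self.state = state
            self.updates.append(state)
```

Skipping the update when the mask is unchanged avoids one system call per event in the common case where a stream keeps reading.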
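When the socket is readable, self._handle_read() handles the read event. It first calls _read_to_buffer to move the data that the protocol stack has ready into the _read_buffer, then calls _read_from_buffer to inspect the buffered data and see whether the condition set by read_until/read_bytes is satisfied.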
def _handle_read(self):
    while True:
        try:
            # Read from the socket until we get EWOULDBLOCK or equivalent.
            # SSL sockets do some internal buffering, and if the data is
            # sitting in the SSL object's buffer select() and friends
            # can't see it; the only way to find out if it's there is to
            # try to read it.
            result = self._read_to_buffer()
        except Exception:  # hp: a real error occurred (EWOULDBLOCK/EAGAIN do not count)
            self.close()
            return
        if result == 0:  # hp: closed or EAGAIN
            break
        else:  # hp: checks self._read_delimiter/self._read_callback inside
            if self._read_from_buffer():
                return
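Because the socket is ready for reading, _read_from_socket simply calls recv(). The surrounding loop keeps reading until an EAGAIN/EWOULDBLOCK error occurs or the connection has been closed.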
def _read_to_buffer(self):
    try:
        chunk = self._read_from_socket()  # hp: this is where recv() is called
    except socket.error, e:
        # ssl.SSLError is a subclass of socket.error
        logging.warning("Read error on %d: %s",
                        self.socket.fileno(), e)
        self.close()
        raise
    if chunk is None:  # hp: EAGAIN errno or peer has closed
        return 0
    self._read_buffer.append(chunk)
    if self._read_buffer_size() >= self.max_buffer_size:
        logging.error("Reached maximum read buffer size")
        self.close()
        raise IOError("Reached maximum read buffer size")
    return len(chunk)
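The "read until it would block" loop can be shown on its own with a plain non-blocking socket. This is a Python 3 sketch under that assumption; drain is a hypothetical helper, and unlike the code above it reports a closed peer separately instead of folding it into the same return value as EAGAIN.

```python
import socket


def drain(sock, chunk_size=4096):
    """Read from a non-blocking socket until it would block or closes.

    Returns (data, closed), where closed is True if the peer shut down.
    """
    chunks = []
    closed = False
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            break               # EAGAIN/EWOULDBLOCK: nothing more for now
        if not chunk:
            closed = True       # an empty read means the peer closed
            break
        chunks.append(chunk)
    return b"".join(chunks), closed
```

With socket.socketpair() one end can be set non-blocking and drained after the other end sends, which makes the loop easy to try without a network.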
_read_from_buffer inspects the buffered data to see whether the condition set by read_until/read_bytes is satisfied, and if so runs the corresponding callback.
def _read_from_buffer(self):
    if self._read_bytes:
        ...
    elif self._read_delimiter:  # hp: set by read_until
        # hp: isn't this limit on _read_buffer too large (2^32)? Under high concurrency it could exhaust memory!
        # hp: effectively all the data ends up in _read_buffer[0]
        _merge_prefix(self._read_buffer, sys.maxint)
        loc = self._read_buffer[0].find(self._read_delimiter)
        if loc != -1:
            callback = self._read_callback
            delimiter_len = len(self._read_delimiter)
            self._read_callback = None
            self._read_delimiter = None
            # hp: _consume returns the data up to and including the delimiter
            self._run_callback(callback,
                               self._consume(loc + delimiter_len))
            return True
    return False
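The delimiter logic is easy to isolate from the socket handling. The Python 3 sketch below is a simplified illustration rather than Tornado's code: DelimiterBuffer, feed, and _try_deliver are hypothetical names, and a single bytes object replaces the deque to keep the example short.

```python
class DelimiterBuffer(object):
    """Hypothetical buffer that fires a callback once a delimiter arrives."""

    def __init__(self):
        self.data = b""
        self.delimiter = None
        self.callback = None

    def read_until(self, delimiter, callback):
        assert self.callback is None, "Already reading"
        self.delimiter = delimiter
        self.callback = callback
        self._try_deliver()         # the data may already be buffered

    def feed(self, chunk):
        """Append newly received bytes, as a read handler would."""
        self.data += chunk
        self._try_deliver()

    def _try_deliver(self):
        if self.callback is None:
            return False
        loc = self.data.find(self.delimiter)
        if loc == -1:
            return False
        end = loc + len(self.delimiter)
        # Hand over everything up to and including the delimiter.
        message, self.data = self.data[:end], self.data[end:]
        callback, self.callback, self.delimiter = self.callback, None, None
        callback(message)
        return True
```

Clearing the callback before invoking it mirrors the code above and lets the callback itself start the next read_until without tripping the "Already reading" assertion.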
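The _merge_prefix(deque, size) function is the interesting part: it gathers the first size bytes of the deque into the deque's first element. My own feeling is that this allocates and frees memory frequently, which hurts performance. The upside is convenience: there is no need to hand-write a dedicated buffer structure.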
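One way such a helper could be written is sketched below in Python 3. It is modeled on the behavior described here and is not guaranteed to match the library's exact code; merge_prefix is a name chosen for this illustration, and the chunks are assumed to be bytes.

```python
import collections


def merge_prefix(chunks, size):
    """Coalesce leading chunks so chunks[0] holds min(size, total) bytes.

    chunks is a collections.deque of bytes objects, modified in place.
    """
    # Fast path: the first element already satisfies the request.
    if len(chunks) == 1 and len(chunks[0]) <= size:
        return
    prefix = []
    remaining = size
    while chunks and remaining > 0:
        chunk = chunks.popleft()
        if len(chunk) > remaining:
            # Split the chunk and push the unused tail back to the front.
            chunks.appendleft(chunk[remaining:])
            chunk = chunk[:remaining]
        prefix.append(chunk)
        remaining -= len(chunk)
    chunks.appendleft(b"".join(prefix))
```

Each call that actually merges copies the affected bytes once, which is the allocation cost mentioned above. When the data is already in a single element, the fast path returns without copying.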
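Overall, IOStream uses a deque together with _merge_prefix(deque, size) to buffer and reassemble data read through non-blocking IO. Combined with read_until, this neatly encapsulates the handling of fragmented non-blocking data inside the IOStream class, so the upper layers can deal with socket reads and writes conveniently.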