ZG manual: Python 2.7.7 Source Code Analysis (3) -- list object and dict object

Source: Internet
Author: User

List Object

Definition of a List object

Internally, a list object is implemented as an array of pointers, each pointing to an object stored in the list.

allocated is the number of slots allocated for the array, and ob_size is the number of slots currently in use.

typedef struct {
    // variable-length object header: contains ob_size, the number of array slots currently in use
    PyObject_VAR_HEAD
    PyObject **ob_item;     // pointer to the array
    Py_ssize_t allocated;   // allocated array length
} PyListObject;
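To make the difference between the two counters concrete, here is a minimal sketch of a stand-in structure. ToyList and toy_list_spare are hypothetical names invented for illustration; they are not part of CPython, and the real header macro is replaced by a plain size field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyListObject; not the CPython definition. */
typedef struct {
    size_t size;        /* slots in use (plays the role of ob_size) */
    void **items;       /* array of element pointers (plays the role of ob_item) */
    size_t allocated;   /* slots allocated */
} ToyList;

/* Number of further appends that fit without reallocating the array. */
static size_t toy_list_spare(const ToyList *l)
{
    assert(l != NULL && l->size <= l->allocated);
    return l->allocated - l->size;
}
```

A list with 4 allocated slots and 1 element in use would report 3 spare slots, which is the situation in which an append needs no new memory.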


Cache for List objects

The list object has a caching mechanism: when a list object is released, it is saved to a free cache pool and reused the next time a list object is requested. The cache pool can hold 80 list objects; when the pool is full, a released list object is freed directly.

The caching mechanism can be understood from the creation and destruction of a list object (the code is simplified for the sake of focus).

// Cache pool size definition
#define PyList_MAXFREELIST 80

// Create a new list object
PyObject *
PyList_New(Py_ssize_t size)
{
    PyListObject *op;
    size_t nbytes = size * sizeof(PyObject *);

    // If the cache holds a free list object, take the list object's memory from the cache
    if (numfree) {
        numfree--;
        op = free_list[numfree];
        _Py_NewReference((PyObject *)op);
    } else {
        op = PyObject_GC_New(PyListObject, &PyList_Type);
        if (op == NULL)
            return NULL;
    }
    if (size <= 0)
        op->ob_item = NULL;
    else {
        op->ob_item = (PyObject **) PyMem_MALLOC(nbytes);
        if (op->ob_item == NULL) {
            Py_DECREF(op);
            return PyErr_NoMemory();
        }
        memset(op->ob_item, 0, nbytes);
    }
    Py_SIZE(op) = size;
    op->allocated = size;
    _PyObject_GC_TRACK(op);
    return (PyObject *) op;
}

// Destroy a list object
static void
list_dealloc(PyListObject *op)
{
    Py_ssize_t i;
    PyObject_GC_UnTrack(op);
    Py_TRASHCAN_SAFE_BEGIN(op)
    if (op->ob_item != NULL) {
        i = Py_SIZE(op);
        while (--i >= 0) {
            Py_XDECREF(op->ob_item[i]);
        }
        PyMem_FREE(op->ob_item);
    }
    // Save the list object to the free cache if there is room
    if (numfree < PyList_MAXFREELIST && PyList_CheckExact(op))
        free_list[numfree++] = op;
    else
        Py_TYPE(op)->tp_free((PyObject *)op);
    Py_TRASHCAN_SAFE_END(op)
}
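The free-list pattern used above can be reduced to a few lines. The following is a sketch under simplifying assumptions, not CPython code: ToyObj, toy_new and toy_release are hypothetical names, there is no reference counting or garbage-collector tracking, and only the pool capacity of 80 is taken from the article.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAXFREELIST 80   /* same capacity the article gives for the list cache */

typedef struct { int payload; } ToyObj;   /* hypothetical object type */

static ToyObj *toy_free_list[TOY_MAXFREELIST];
static int toy_numfree = 0;

/* Reuse a cached object if one is available, otherwise allocate a new one. */
static ToyObj *toy_new(void)
{
    ToyObj *op;
    if (toy_numfree > 0)
        op = toy_free_list[--toy_numfree];
    else
        op = malloc(sizeof *op);
    if (op != NULL)
        op->payload = 0;   /* reset state, since a reused object carries old contents */
    return op;
}

/* Park the object in the pool if there is room, otherwise free it. */
static void toy_release(ToyObj *op)
{
    assert(op != NULL);
    if (toy_numfree < TOY_MAXFREELIST)
        toy_free_list[toy_numfree++] = op;
    else
        free(op);
}
```

Releasing an object and then requesting a new one hands back the same memory, so the allocator is not called on that path. The pool is a stack: the most recently released object is reused first.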


List insert, delete, and append operations

A list is implemented internally as an array, so inserting or deleting an element moves the elements that follow it. An append writes the new element directly at the end if the memory already allocated for the list object is not used up.

Look at the insert and append operations of the list.

// Insert operation
int
PyList_Insert(PyObject *op, Py_ssize_t where, PyObject *newitem)
{
    if (!PyList_Check(op)) {
        PyErr_BadInternalCall();
        return -1;
    }
    return ins1((PyListObject *)op, where, newitem);
}

static int
ins1(PyListObject *self, Py_ssize_t where, PyObject *v)
{
    Py_ssize_t i, n = Py_SIZE(self);
    PyObject **items;
    if (v == NULL) {
        PyErr_BadInternalCall();
        return -1;
    }
    if (n == PY_SSIZE_T_MAX) {
        PyErr_SetString(PyExc_OverflowError,
            "cannot add more objects to list");
        return -1;
    }
    // Reallocate the array if its length is insufficient
    if (list_resize(self, n+1) == -1)
        return -1;
    // Find the insertion point
    if (where < 0) {
        where += n;
        if (where < 0)
            where = 0;
    }
    if (where > n)
        where = n;
    // Move the elements
    items = self->ob_item;
    for (i = n; --i >= where; )
        items[i+1] = items[i];
    Py_INCREF(v);
    items[where] = v;
    return 0;
}

// Append operation
int
PyList_Append(PyObject *op, PyObject *newitem)
{
    if (PyList_Check(op) && (newitem != NULL))
        return app1((PyListObject *)op, newitem);
    PyErr_BadInternalCall();
    return -1;
}

static int
app1(PyListObject *self, PyObject *v)
{
    Py_ssize_t n = PyList_GET_SIZE(self);
    assert (v != NULL);
    if (n == PY_SSIZE_T_MAX) {
        PyErr_SetString(PyExc_OverflowError,
            "cannot add more objects to list");
        return -1;
    }
    if (list_resize(self, n+1) == -1)
        return -1;
    Py_INCREF(v);
    PyList_SET_ITEM(self, n, v);
    return 0;
}
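The index handling and element shifting in ins1 can be tried in isolation. The sketch below applies the same clamping and the same shift loop to a plain int array; toy_insert is a hypothetical helper, resizing and reference counting are left out, and the caller is assumed to have reserved room for one more element.

```c
#include <assert.h>
#include <stddef.h>

/* Insert v at index `where` in items[0..n-1]. The caller guarantees that
   items has room for n + 1 elements. Returns the index actually used. */
static ptrdiff_t toy_insert(int *items, ptrdiff_t n, ptrdiff_t where, int v)
{
    ptrdiff_t i;
    assert(items != NULL && n >= 0);
    /* A negative index counts from the end and is clamped to 0. */
    if (where < 0) {
        where += n;
        if (where < 0)
            where = 0;
    }
    /* An index past the end behaves like an append. */
    if (where > n)
        where = n;
    /* Shift the tail one slot to the right, starting from the last element. */
    for (i = n; --i >= where; )
        items[i + 1] = items[i];
    items[where] = v;
    return where;
}
```

Inserting at index 0 moves all n elements, while inserting at index n moves none. That is why the append path in the article can skip the loop entirely.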


Summary of List objects
    1. The list object keeps a fixed-size cache internally, which speeds up the creation of list objects.

    2. Insertion and deletion on a list object are expensive because elements have to be moved, so they are not suitable for frequent use.

    3. The append operation is fast.


Dict Object

Definition of Dict Object

The Dict object is implemented as a hash table that resolves collisions with open addressing; the average time complexity of a lookup is O(1).

When the hash table is small (a table of 8 slots), the Dict object uses the memory of its own internal array, ma_smalltable.

// Size of the internal array; a small hash table of this length uses it
#define PyDict_MINSIZE 8

// Hash table entry
typedef struct {
    Py_ssize_t me_hash;
    PyObject *me_key;
    PyObject *me_value;
} PyDictEntry;

// Dict object
struct _dictobject {
    PyObject_HEAD
    Py_ssize_t ma_fill;     // count of used + dummy (pseudo-deleted) entries
    Py_ssize_t ma_used;     // count of used entries
    Py_ssize_t ma_mask;
    PyDictEntry *ma_table;  // pointer to the hash table memory
    PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, long hash);
    PyDictEntry ma_smalltable[PyDict_MINSIZE];  // internal optimization: memory for a small hash table
};
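The small-table optimization comes down to a pointer that initially refers to storage embedded in the object itself. A minimal sketch, assuming hypothetical names (ToyEntry, ToyDict, toy_dict_init, toy_dict_uses_smalltable) and a string key in place of a Python object:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MINSIZE 8   /* mirrors PyDict_MINSIZE */

typedef struct { long hash; const char *key; int value; } ToyEntry;

typedef struct {
    size_t mask;                       /* table size minus 1 */
    ToyEntry *table;                   /* points at smalltable or at heap memory */
    ToyEntry smalltable[TOY_MINSIZE];  /* storage embedded in the object */
} ToyDict;

/* Point the dict at its embedded table, the way an empty dict starts out. */
static void toy_dict_init(ToyDict *d)
{
    size_t i;
    assert(d != NULL);
    for (i = 0; i < TOY_MINSIZE; i++) {
        d->smalltable[i].hash = 0;
        d->smalltable[i].key = NULL;
        d->smalltable[i].value = 0;
    }
    d->table = d->smalltable;
    d->mask = TOY_MINSIZE - 1;
}

/* True while no separate table has been allocated. */
static int toy_dict_uses_smalltable(const ToyDict *d)
{
    return d->table == d->smalltable;
}
```

The same pointer comparison appears in the article's dict_dealloc, which frees ma_table only when it differs from ma_smalltable.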


Caching of Dict objects

The Dict object also has a caching mechanism: when a Dict object is released, it is saved to a cache pool and reused for the next request. The cache pool can hold 80 Dict objects; when the pool is full, a released Dict object is freed directly.

Learn about its caching mechanism from the Dict object creation and destruction process (the code is simplified for the sake of focus).

// Cache pool size definition
#define PyDict_MAXFREELIST 80

// Create a dict object
PyObject *
PyDict_New(void)
{
    register PyDictObject *mp;
    // Create the dummy object, used as a placeholder for deleted entries
    if (dummy == NULL) { /* Auto-initialize dummy */
        dummy = PyString_FromString("<dummy key>");
        if (dummy == NULL)
            return NULL;
    }
    // If the cache holds a free dict object, use it
    if (numfree) {
        mp = free_list[--numfree];
        assert (mp != NULL);
        assert (Py_TYPE(mp) == &PyDict_Type);
        _Py_NewReference((PyObject *)mp);
        if (mp->ma_fill) {
            EMPTY_TO_MINSIZE(mp);
        } else {
            INIT_NONZERO_DICT_SLOTS(mp);
        }
        assert (mp->ma_used == 0);
        assert (mp->ma_table == mp->ma_smalltable);
        assert (mp->ma_mask == PyDict_MINSIZE - 1);
    } else {
        mp = PyObject_GC_New(PyDictObject, &PyDict_Type);
        if (mp == NULL)
            return NULL;
        EMPTY_TO_MINSIZE(mp);
    }
    mp->ma_lookup = lookdict_string;
    return (PyObject *)mp;
}

// Release a dict object
static void
dict_dealloc(register PyDictObject *mp)
{
    register PyDictEntry *ep;
    Py_ssize_t fill = mp->ma_fill;
    PyObject_GC_UnTrack(mp);
    Py_TRASHCAN_SAFE_BEGIN(mp)
    for (ep = mp->ma_table; fill > 0; ep++) {
        if (ep->me_key) {
            --fill;
            Py_DECREF(ep->me_key);
            Py_XDECREF(ep->me_value);
        }
    }
    if (mp->ma_table != mp->ma_smalltable)
        PyMem_DEL(mp->ma_table);
    // If the cache has free space, cache the released dict object
    if (numfree < PyDict_MAXFREELIST && Py_TYPE(mp) == &PyDict_Type)
        free_list[numfree++] = mp;
    else
        Py_TYPE(mp)->tp_free((PyObject *)mp);
    Py_TRASHCAN_SAFE_END(mp)
}


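The main lookup algorithm of the open-addressing hash table

The Dict lookup algorithm first checks the slot selected by the hash value. If the key stored there is not the one being searched for, it probes the next position in the probe sequence, and continues until the element is found or the lookup fails on an empty slot. When a lookup fails, it returns the first available slot: a dummy (deleted) slot seen along the way if there was one, otherwise the empty slot.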

static PyDictEntry *
lookdict(PyDictObject *mp, PyObject *key, register long hash)
{
    register size_t i;
    register size_t perturb;
    register PyDictEntry *freeslot;
    register size_t mask = (size_t)mp->ma_mask;
    PyDictEntry *ep0 = mp->ma_table;
    register PyDictEntry *ep;
    register int cmp;
    PyObject *startkey;

    // Find the hash position
    i = (size_t)hash & mask;
    ep = &ep0[i];
    if (ep->me_key == NULL || ep->me_key == key)
        return ep;

    // Check whether the hash position holds the placeholder object left by a deletion
    if (ep->me_key == dummy)
        freeslot = ep;
    else {
        // Hash values match, compare the keys
        if (ep->me_hash == hash) {
            startkey = ep->me_key;
            Py_INCREF(startkey);
            cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
            Py_DECREF(startkey);
            if (cmp < 0)
                return NULL;
            if (ep0 == mp->ma_table && ep->me_key == startkey) {
                if (cmp > 0)
                    return ep;
            }
            else {
                return lookdict(mp, key, hash);
            }
        }
        freeslot = NULL;
    }

    // Search for a match along the probe chain
    for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
        i = (i << 2) + i + perturb + 1;
        ep = &ep0[i & mask];
        if (ep->me_key == NULL)
            return freeslot == NULL ? ep : freeslot;
        if (ep->me_key == key)
            return ep;
        if (ep->me_hash == hash && ep->me_key != dummy) {
            startkey = ep->me_key;
            Py_INCREF(startkey);
            cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
            Py_DECREF(startkey);
            if (cmp < 0)
                return NULL;
            if (ep0 == mp->ma_table && ep->me_key == startkey) {
                if (cmp > 0)
                    return ep;
            }
            else {
                return lookdict(mp, key, hash);
            }
        }
        else if (ep->me_key == dummy && freeslot == NULL)
            freeslot = ep;
    }
    return 0;
}
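The probe sequence and the role of the dummy placeholder can be reproduced with a much smaller table. The following is a sketch under stated assumptions rather than the CPython routine: ToySlot and toy_lookup are hypothetical names, keys are unsigned integers that serve as their own hash, slot state is an explicit tag instead of a NULL or dummy key pointer, and the table is assumed to always contain at least one empty slot. The recurrence and the shift of 5 follow lookdict above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8          /* a power of two, so (i & mask) stays in range */
#define TOY_PERTURB_SHIFT 5  /* the value CPython 2.7 defines for PERTURB_SHIFT */

enum { TOY_EMPTY, TOY_DUMMY, TOY_USED };

typedef struct { int state; unsigned long key; int value; } ToySlot;

/* Return the slot holding key, or the first reusable slot if key is absent.
   Assumes the table has at least one TOY_EMPTY slot. */
static ToySlot *toy_lookup(ToySlot *table, unsigned long key)
{
    size_t mask = TOY_SLOTS - 1;
    size_t i = (size_t)key & mask;
    size_t perturb = (size_t)key;   /* in this sketch the hash is the key itself */
    ToySlot *freeslot = NULL;

    assert(table != NULL);
    for (;;) {
        ToySlot *ep = &table[i & mask];
        if (ep->state == TOY_EMPTY)
            return freeslot != NULL ? freeslot : ep;
        if (ep->state == TOY_DUMMY) {
            if (freeslot == NULL)
                freeslot = ep;      /* remember the first deleted slot */
        } else if (ep->key == key) {
            return ep;
        }
        /* Same recurrence as lookdict: i = 5*i + perturb + 1 */
        i = (i << 2) + i + perturb + 1;
        perturb >>= TOY_PERTURB_SHIFT;
    }
}
```

With 8 slots, keys 1 and 9 both start at slot 1 (9 & 7 is 1), so the second one is placed by probing. If key 1 is later marked as a dummy, a lookup for 9 still passes over that slot and finds 9, which is why deleted entries are replaced by a placeholder instead of being cleared.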


Dict Object Summary
    1. The Dict object is a hash table that uses open addressing.

    2. The Dict object keeps a fixed-size cache internally, which speeds up the creation of Dict objects.

    3. A small Dict object uses the memory inside the object directly, so no second memory allocation is needed.

