Overview
As Python has become more widely used in machine learning and data science, the related Python libraries have grown very quickly. But Python has one serious problem: Python 2 and Python 3 are incompatible, and many open-source Python 2 libraries on GitHub do not work with Python 3, which makes a large number of Python 2 users unwilling to migrate.
Python 3 changes many things: it fixes a number of Python 2's shortcomings and greatly expands the standard library, for example with coroutine-related libraries. This article lists some of the features available in Python 3 and the reasons you should move from Python 2 to Python 3.
File system path handling: pathlib
If you use Python 2, you have probably used os.path to handle all kinds of path problems, for example joining path components with os.path.join(). In Python 3, pathlib does this very conveniently:
from pathlib import Path
dataset = 'wiki_images'
datasets_root = Path('/path/to/datasets/')
train_path = datasets_root / dataset / 'train'
test_path = datasets_root / dataset / 'test'
for image_path in train_path.iterdir():
    with image_path.open() as f:  # note, open is a method of the Path object
        pass  # do something with an image
Compared with os.path.join(), pathlib is safer, more convenient, and more readable. Path objects offer many other methods and properties:
p.exists()
p.is_dir()
p.parts  # a property, not a method
p.with_name('sibling.png')  # only change the name, but keep the folder
p.with_suffix('.jpg')  # only change the extension, but keep the folder and the name
p.chmod(mode)
p.rmdir()
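To make the effect of these calls concrete, here is a small sketch using PurePosixPath, which manipulates paths without touching the file system. The path itself is a made-up example, not something from the original article.

```python
from pathlib import PurePosixPath

# Hypothetical example path; PurePosixPath never touches the disk.
p = PurePosixPath('/path/to/datasets/wiki_images/train/cat.png')

parts = p.parts                        # tuple of path components
sibling = p.with_name('sibling.png')   # same folder, different file name
as_jpg = p.with_suffix('.jpg')         # same folder and stem, new extension

print(parts)
print(sibling)
print(as_jpg)
```

Because with_name and with_suffix return new objects, the original path p is left unchanged.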
Type hinting
In PyCharm, type hints look like this:
![](http://7xpx6h.com1.z0.glb.clouddn.com/bd231966d3936432ed6ef90b416e4a19)
Type hints help us avoid slips and type errors in a complex project. In Python 2 they were left to the IDE: each IDE recognized its own format, and the hints were only recognized, never strictly enforced. Take the following code, whose parameter can be a numpy.array, astropy.Table, astropy.Column, bcolz, cupy, mxnet.ndarray, and so on.
def repeat_each_entry(data):
    """ Each entry in the data is doubled
    <blah blah nobody reads the documentation till the end>
    """
    index = numpy.repeat(numpy.arange(len(data)), 2)
    return data[index]
The same code also accepts a pandas.Series argument, but then it does not behave as intended:
repeat_each_entry(pandas.Series(data=[0, 1, 2], index=[3, 4, 5]))  # returns Series with Nones inside
This is just one function. A large project has many such functions, and the code easily gets out of control. Explicit parameter types are therefore very important for large projects; Python 2 has no such capability, but Python 3 does.
def repeat_each_entry(data: Union[numpy.ndarray, bcolz.carray]):
Currently, for example, JetBrains PyCharm already supports checking type hint syntax, so if you use this IDE, its built-in features do the job. If, like me, you use the Sublime Text editor, the third-party tool mypy can help.
PS: type hints currently do not support ndarrays/tensors very well.
Run-time type checking
Normally, function annotations only help readers understand the code and have no other effect. You can use the enforce library to check the types at run time.
@enforce.runtime_validation
def foo(text: str) -> None:
    print(text)
foo('Hi')  # ok
foo(5)     # fails
@enforce.runtime_validation
def any2(x: List[bool]) -> bool:
    return any(x)
any([False, False, True, False])   # True
any2([False, False, True, False])  # True
any(['False'])   # True
any2(['False'])  # fails
any([False, None, "", 0])   # False
any2([False, None, "", 0])  # fails
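To illustrate the idea behind run-time checking without relying on a third-party package, here is a minimal sketch built only on the standard library. The decorator name check_types and the function shout are hypothetical; this is not how enforce is implemented, and it only handles annotations that are plain classes.

```python
import functools
import inspect

def check_types(func):
    """Hypothetical decorator: validate arguments against plain-class annotations."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue  # parameter has no annotation
            # Only plain classes are checked; generics such as List[bool] are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f'{name} must be {expected.__name__}, got {type(value).__name__}'
                )
        return func(*args, **kwargs)

    return wrapper

@check_types
def shout(text: str) -> str:
    return text.upper()

print(shout('hi'))   # passes the check
```

Calling shout(5) raises TypeError before the function body runs, which is the behavior the annotations alone do not give you.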
Using the @ operator for matrix multiplication
Consider the following code:
# l2-regularized linear regression: || AX - b ||^2 + alpha * ||x||^2 -> min
# Python 2
X = np.linalg.inv(np.dot(A.T, A) + alpha * np.eye(A.shape[1])).dot(A.T.dot(b))
# Python 3
X = np.linalg.inv(A.T @ A + alpha * np.eye(A.shape[1])) @ (A.T @ b)
With the @ operator, the code becomes more readable and easier to port between libraries such as NumPy and TensorFlow.
Recursive file search with the ** wildcard
In Python 2, finding files recursively is not easy, even with the glob library. In Python 3, a wildcard is all it takes.
import glob
# Python 2
found_images = \
    glob.glob('/path/*.jpg') \
  + glob.glob('/path/*/*.jpg') \
  + glob.glob('/path/*/*/*.jpg') \
  + glob.glob('/path/*/*/*/*.jpg') \
  + glob.glob('/path/*/*/*/*/*.jpg')
# Python 3
found_images = glob.glob('/path/**/*.jpg', recursive=True)
It works even better with the pathlib mentioned earlier:
# Python 3
found_images = pathlib.Path('/path/').glob('**/*.jpg')
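The recursive pattern can be tried safely in a temporary directory. The sketch below builds a small made-up folder layout and searches it; the file and folder names are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout: one image at the top level, one two levels down.
    (root / 'a' / 'b').mkdir(parents=True)
    (root / 'top.jpg').touch()
    (root / 'a' / 'b' / 'deep.jpg').touch()
    (root / 'a' / 'notes.txt').touch()

    # '**' matches the root itself and any depth of subdirectories.
    found = sorted(p.name for p in root.glob('**/*.jpg'))
    # rglob('*.jpg') is shorthand for glob('**/*.jpg').
    found_rglob = sorted(p.name for p in root.rglob('*.jpg'))

print(found)
```

Both calls return lazy generators, so results are collected into lists before the temporary directory is removed.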
Print function
Print to the specified file
print >>sys.stderr, "critical error"      # Python 2
print("critical error", file=sys.stderr)  # Python 3
Printing tab-separated values without str.join
# Python 3
print(*array, sep='\t')
print(batch, epoch, loss, accuracy, time, sep='\t')
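The sep and file arguments shown above can be combined. In this sketch an io.StringIO buffer stands in for a log file or sys.stderr so the result can be inspected; the values are made up.

```python
import io

buffer = io.StringIO()   # stands in for a log file or sys.stderr
values = [1, 2, 3]

# Unpack the list into separate arguments and join them with tabs.
print(*values, sep='\t', file=buffer)
# sep and end can be customized independently.
print('batch', 7, 0.25, sep=', ', end='!\n', file=buffer)

output = buffer.getvalue()
print(repr(output))
```

Since print converts each argument with str(), there is no need to convert numbers to strings first, which str.join would require.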
Overriding the print function
# Python 3
_print = print  # store the original print function
def print(*args, **kargs):
    pass  # do something useful, e.g. store output to some file
The print function can also be replaced temporarily, for example with a context manager:
@contextlib.contextmanager
def replace_print():
    import builtins
    _print = print  # saving old print function
    # or use some other function here
    builtins.print = lambda *args, **kwargs: _print('new printing', *args, **kwargs)
    yield
    builtins.print = _print
with replace_print():
    <code here will invoke other print function>
Although the code above can also be used to override the print function, this approach is not recommended.
String formatting
The string formatting system in Python 2 is still not good enough and is too verbose. We usually write code like this to output log information:
# Python 2
print('{batch:3} {epoch:3} / {total_epochs:3}  accuracy: {acc_mean:0.4f}±{acc_std:0.4f} time: {avg_time:3.2f}'.format(
    batch=batch, epoch=epoch, total_epochs=total_epochs,
    acc_mean=numpy.mean(accuracies), acc_std=numpy.std(accuracies),
    avg_time=time / len(data_batch)
))
# Python 2 (too error-prone during fast modifications, please avoid):
print('{:3} {:3} / {:3}  accuracy: {:0.4f}±{:0.4f} time: {:3.2f}'.format(
    batch, epoch, total_epochs, numpy.mean(accuracies), numpy.std(accuracies),
    time / len(data_batch)
))
The output is:
12 / 300  accuracy: 0.8180±0.4649 time: 56.60
Python 3.6's f-strings make this much simpler:
# Python 3.6+
print(f'{batch:3} {epoch:3} / {total_epochs:3}  accuracy: {numpy.mean(accuracies):0.4f}±{numpy.std(accuracies):0.4f} time: {time / len(data_batch):3.2f}')
f-strings are also convenient for writing a query or generating a code fragment:
query = f"INSERT INTO station VALUES ('{city}', '{state}', {latitude}, {longitude})"
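The format specifiers after the colon behave the same way in f-strings as in str.format. The following sketch uses made-up values and plain arithmetic instead of numpy so that it runs on its own.

```python
batch, epoch, total_epochs = 120, 12, 300   # made-up values
accuracies = [0.5, 0.75, 1.0]               # made-up values

mean = sum(accuracies) / len(accuracies)

# ':3' pads to width 3, ':0.4f' prints four digits after the decimal point.
line = f'{batch:3} {epoch:3} / {total_epochs:3}  accuracy: {mean:0.4f}'
same = '{:3} {:3} / {:3}  accuracy: {:0.4f}'.format(batch, epoch, total_epochs, mean)

print(line)
```

The f-string keeps each expression next to its format specifier, which is what makes it harder to mix up the argument order.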
Strict ordering
The following comparisons are illegal in Python 3.
# All these comparisons are illegal in Python 3
3 < '3'
2 < None
(3, 4) < (3, None)
(4, 5) < [4, 5]
# False in both Python 2 and Python 3
(4, 5) == [4, 5]
Values of different types cannot be sorted together
sorted([2, '1', 3])  # invalid for Python 3, in Python 2 returns [2, 3, '1']
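When mixed data really does need sorting in Python 3, the comparison rule has to be stated explicitly. A minimal sketch, assuming every element can be converted with int():

```python
mixed = [2, '1', 3]

try:
    sorted(mixed)
    raised = False
except TypeError:
    raised = True   # Python 3 refuses to order int against str

# An explicit key says how the elements should be compared.
as_numbers = sorted(mixed, key=int)

print(raised, as_numbers)
```

The key function is only used for comparison, so the original elements (including the string '1') are kept unchanged in the result.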
Unicode for NLP
s = '您好'
print(len(s))
print(s[:2])
Output:
Python 2: 6\n��
Python 3: 2\n您好
x = u'со'
x += 'co'  # ok
x += 'со'  # fail
The code above fails in Python 2 but runs successfully in Python 3. Python 3 strings are Unicode, which is convenient for NLP. Python 2 also behaves oddly in cases like these:
'a' < type < u'a'  # Python 2: True
'a' < u'a'         # Python 2: False
from collections import Counter
Counter('Möbelstück')
# Python 2: Counter({'\xc3': 2, 'b': 1, 'e': 1, 'c': 1, 'k': 1, 'M': 1, 'l': 1, 's': 1, 't': 1, '\xb6': 1, '\xbc': 1})
# Python 3: Counter({'M': 1, 'ö': 1, 'b': 1, 'e': 1, 'l': 1, 's': 1, 't': 1, 'ü': 1, 'c': 1, 'k': 1})
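The Python 2 result above is what you get when counting bytes rather than characters. Python 3 makes that distinction explicit, as this sketch shows; the escapes \u00f6 and \u00fc are just ö and ü written so that the example does not depend on the file's encoding.

```python
from collections import Counter

text = 'M\u00f6belst\u00fcck'   # 'Möbelstück'
raw = text.encode('utf-8')      # explicit conversion from str to bytes

char_count = len(text)          # counts characters
byte_count = len(raw)           # counts bytes; ö and ü take two bytes each

# Counting over bytes reproduces the Python 2 behavior: 0xC3 appears twice.
lead_byte_count = Counter(raw)[0xC3]

print(char_count, byte_count, lead_byte_count)
```

Decoding the bytes with raw.decode('utf-8') gives back the original string.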
Dictionaries
In CPython 3.6+, dict preserves insertion order by default, behaving much like OrderedDict (from Python 3.7 this is guaranteed by the language).
import json
x = {str(i): i for i in range(5)}
json.loads(json.dumps(x))
# Python 2
{u'1': 1, u'0': 0, u'3': 3, u'2': 2, u'4': 4}
# Python 3
{'0': 0, '1': 1, '2': 2, '3': 3, '4': 4}
Similarly, the **kwargs dictionary keeps its entries in the same order in which the arguments were passed.
from collections import OrderedDict
from torch import nn
# Python 2
model = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(20, 64, 5)),
    ('relu2', nn.ReLU())
]))
# Python 3.6+, how it *can* be done, not supported right now in PyTorch
model = nn.Sequential(
    conv1=nn.Conv2d(1, 20, 5),
    relu1=nn.ReLU(),
    conv2=nn.Conv2d(20, 64, 5),
    relu2=nn.ReLU()
)
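The ordering guarantee can be checked without PyTorch. In this sketch, layer_names is a hypothetical function and the string values are placeholders for real layers.

```python
import json

def layer_names(**layers):
    # Since Python 3.6, **kwargs keeps the order in which arguments were passed.
    return list(layers)

names = layer_names(conv1='c1', relu1='r1', conv2='c2', relu2='r2')

# A dict also keeps insertion order through a JSON round trip.
x = {str(i): i for i in range(5)}
round_trip_keys = list(json.loads(json.dumps(x)))

print(names)
print(round_trip_keys)
```

This is what would let an API accept named layers as keyword arguments and still build them in the written order.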
Iterable unpacking
# handy when amount of additional stored info may vary between experiments, but the same code can be used in all cases
model_parameters, optimizer_parameters, *other_params = load(checkpoint_name)
# picking two last values from a sequence
*prev, next_to_last, last = values_history
# This also works with any iterables, so if you have a function that yields e.g. qualities,
# below is a simple way to take only last two values from a list
*prev, next_to_last, last = iter_train(args)
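A runnable version of the same idea, with a hypothetical generator losses standing in for a training loop:

```python
def losses():
    # Hypothetical generator standing in for a training loop.
    yield from [0.9, 0.7, 0.5, 0.4, 0.35]

# The starred name collects whatever the other names do not take.
first, *middle, last = losses()
*prev, next_to_last, final = losses()

# The starred part may also end up empty.
head, *rest = [1]

print(first, middle, last)
print(prev, next_to_last, final)
print(head, rest)
```

The starred target is always a list, even when the right-hand side is a generator or a tuple.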
Higher-performance default pickle engine
# Python 2
import cPickle as pickle
import numpy
print len(pickle.dumps(numpy.random.normal(size=[1000, 1000])))
# result: 23691675
# Python 3
import pickle
import numpy
len(pickle.dumps(numpy.random.normal(size=[1000, 1000])))
# result: 8000162
The pickled size drops to about one third of the Python 2 result.
Safer list comprehensions
labels = <initial_value>
predictions = [model.predict(data) for data, labels in dataset]
# labels are overwritten in Python 2
# labels are not affected by comprehension in Python 3
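The scoping difference is easy to verify. This sketch uses made-up (data, labels) pairs and also shows that an ordinary for loop still rebinds the outer name in Python 3.

```python
labels = 'initial value'
dataset = [('x1', 'cat'), ('x2', 'dog')]   # made-up (data, labels) pairs

# The comprehension has its own scope, so the outer `labels` survives.
upper = [data.upper() for data, labels in dataset]
after_comprehension = labels

# A plain for loop has no scope of its own and overwrites the name.
for data, labels in dataset:
    pass
after_loop = labels

print(after_comprehension, after_loop)
```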
Simpler super()
# Python 2
class MySubClass(MySuperClass):
    def __init__(self, name, **options):
        super(MySubClass, self).__init__(name='subclass', **options)
# Python 3
class MySubClass(MySuperClass):
    def __init__(self, name, **options):
        super().__init__(name='subclass', **options)
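A self-contained version of the Python 3 form, with hypothetical Base and Child classes, shows that the zero-argument call resolves the class and instance on its own:

```python
class Base:
    def __init__(self, name, **options):
        self.name = name
        self.options = options

class Child(Base):
    def __init__(self, **options):
        # No need to repeat the class name or pass self explicitly.
        super().__init__(name='subclass', **options)

child = Child(verbose=True)
print(child.name, child.options)
```

Renaming Child later requires no change inside __init__, which is one practical benefit over the Python 2 spelling.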
Multiple unpacking
Merging two dicts
x = dict(a=1, b=2)
y = dict(b=3, d=4)
# Python 3.5+
z = {**x, **y}
# z = {'a': 1, 'b': 3, 'd': 4}, note that value for `b` is taken from the latter dict.
Python 3.5+ makes it convenient to merge not only dicts but also lists and tuples:
[*a, *b, *c]  # list, concatenating
(*a, *b, *c)  # tuple, concatenating
# Python 3.5+
do_something(**{**default_settings, **custom_settings})
# Also possible, this code also checks there is no intersection between keys of dictionaries
do_something(**first_args, **second_args)
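The difference between the two calling styles can be seen with a hypothetical do_something that simply returns its keyword arguments; the settings are made up.

```python
def do_something(**settings):
    # Hypothetical function: returns the keyword arguments it received.
    return settings

default_settings = {'lr': 0.1, 'epochs': 10}   # made-up settings
custom_settings = {'lr': 0.01}

# Merging first: the later dict wins for the shared key 'lr'.
merged = do_something(**{**default_settings, **custom_settings})

# Unpacking both directly: a shared key is rejected with TypeError.
try:
    do_something(**default_settings, **custom_settings)
    duplicate_rejected = False
except TypeError:
    duplicate_rejected = True

print(merged, duplicate_rejected)
```

So the first form is for overriding defaults, while the second is useful when overlapping keys would indicate a mistake.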
Integer type
Python 2 provides two integer types, int and long. Python 3 provides only one, int:
isinstance(x, numbers.Integral)  # Python 2, the canonical way
isinstance(x, (long, int))       # Python 2
isinstance(x, int)               # Python 3, easier to remember
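A brief sketch of what the single int type means in practice, using arbitrary example values:

```python
import numbers

small = 7
big = 2 ** 100   # far beyond 64 bits; Python 2 would have used long here

same_type = type(small) is type(big) is int
is_integral = isinstance(big, numbers.Integral)
exact = big // 2 ** 98   # integer arithmetic stays exact at any size

print(same_type, is_integral, exact)
```

The numbers.Integral check still works in Python 3 and remains the more general test when other integer-like types may appear.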
Summary
Python 3 offers many new features that make coding easier, while also bringing better safety and higher performance. The Python developers have long recommended moving to Python 3 as soon as possible. Of course, the cost of migration varies from system to system; hopefully this article helps you migrate from Python 2 to Python 3.
Author's blog: Snake catchers say