
Advanced Python (in progress)

Generators

def countdown(num):
    print('Starting')
    while num > 0:
        yield num
        num -= 1

# this will not print 'Starting'
cd = countdown(3)

# this will print 'Starting' and the first value
print(next(cd))

# will print the next values
print(next(cd))
print(next(cd))

# this will raise a StopIteration
print(next(cd))
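
When running off the end of a generator is expected, the built-in next() accepts a second argument that is returned instead of raising StopIteration. A minimal sketch, assuming a print-free variant of the generator above (the name countdown_quiet is made up here so the block is self-contained):

```python
def countdown_quiet(num):
    # same logic as countdown above, without the print call
    while num > 0:
        yield num
        num -= 1

cd = countdown_quiet(2)
first = next(cd, None)    # 2
second = next(cd, None)   # 1
third = next(cd, None)    # generator is exhausted, so the default None is returned
print(first, second, third)

# catching StopIteration explicitly works as well
cd = countdown_quiet(0)
try:
    next(cd)
except StopIteration:
    print('generator exhausted')
```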

Using a generator as an iterator

# you can iterate over a generator object with a for in loop
cd = countdown(3)
for x in cd:
    print(x)

# you can use it for functions that take iterables as input
cd = countdown(3)
sum_cd = sum(cd)
print(sum_cd)

cd = countdown(3)
sorted_cd = sorted(cd)
print(sorted_cd)

Generator expressions

import sys

# generator expression
mygenerator = (i for i in range(1000) if i % 2 == 0)
print(sys.getsizeof(mygenerator), "bytes")

# list comprehension
mylist = [i for i in range(1000) if i % 2 == 0]
print(sys.getsizeof(mylist), "bytes")

# Example output (exact sizes vary by Python version and platform):
# 120 bytes
# 4272 bytes

Generator concepts

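A class can provide generator-like behavior by acting as an iterable object. It must implement the __iter__ and __next__ methods so that its instances can be iterated. It also has to keep track of the iteration state itself and raise StopIteration once the sequence is exhausted.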

class firstn:
    def __init__(self, n):
        self.n = n
        self.num = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.num < self.n:
            cur = self.num
            self.num += 1
            return cur
        else:
            raise StopIteration()

firstn_object = firstn(1000000)
print(sum(firstn_object))
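
For comparison, the same sequence can be written as a generator function, where Python keeps the state and raises StopIteration automatically. This is only a sketch; the name firstn_gen is chosen here for illustration:

```python
def firstn_gen(n):
    num = 0
    while num < n:
        yield num      # execution pauses here and resumes on the next request
        num += 1
    # falling off the end of the function raises StopIteration for the caller

print(sum(firstn_gen(1000000)))  # 499999500000
```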

Decorators

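Typical use cases for decorators include:

  • Measuring a function's execution time
  • Debugging: printing a function's arguments and call information
  • Validating function arguments
  • Registering functions as plugins
  • Slowing code down to test network behavior, for example with sleep
  • Caching results (memoization)
  • Attaching information or updating state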

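A decorator is just syntactic sugar: the decorated definition below is equivalent to target = somedecorator(target).

One characteristic of decorators is that they can replace the decorated function with a different function.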

@somedecorator
def target():
    print("running target")
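
To make the equivalence concrete, here is a small sketch with a deliberately simple somedecorator (its body is an assumption made for illustration). Both spellings bind the name to whatever the decorator returns:

```python
def somedecorator(func):
    # hypothetical decorator: ignores func and returns a different function
    def inner():
        print("running inner")
    return inner

@somedecorator
def target():
    print("running target")

def target2():
    print("running target")

target2 = somedecorator(target2)   # what the @ syntax does, written by hand

print(target.__name__)   # inner
print(target2.__name__)  # inner
target()                 # prints "running inner"
```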

A simple decorator template

import functools

def my_decorator(func):
    @functools.wraps(func)      # preserve the decorated function's metadata
    def wrapper(*args, **kwargs):
        # Do something before
        result = func(*args, **kwargs)
        # Do something after
        return result
    return wrapper

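A decorator that takes arguments can be seen as a simple decorator wrapped in one more outer function, which extends the decorator's behavior.

In other words, it is built from nested closures; a closure is a function that keeps access to variables from its enclosing scope.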

def repeat(num_times):
    def decorator_repeat(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for _ in range(num_times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator_repeat

@repeat(num_times=3)
def greet(name):
    print(f"Hello {name}")

greet('Alex')

# Output
"""
    Hello Alex
    Hello Alex
    Hello Alex
"""

Stacked decorators

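A decorator is applied as decorator(func). When several decorators are stacked, the one closest to the function is applied first and the outermost one last, which gives decorator2(decorator1(func)).

Written out step by step:

  • func = decorator1(func)
  • func = decorator2(func)
  • func()

When several decorators decorate a function, the rule is that they wrap it from bottom to top. During decoration, the code in each decorator's body that sits outside the inner wrapper closure is executed; the closure itself is only returned as an object and is not run at that point.

When the decorated function is later called, the code inside the wrapper functions runs from top to bottom.

So in the code below, with multiple decorators, the order of decoration is the reverse of the order in which the wrappers start executing.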

# Decoration step
say_hello = start_end_decorator_2(start_end_decorator_1(say_hello))

say_hello(name='Alex')

Experiment with multiple decorators

import functools
# a decorator function that prints debug information about the wrapped function
def start_end_decorator_2(func): 
    print('Start decorator2')
    @functools.wraps(func)
    def wrapper2(*args, **kwargs):
        print('Exec wrapper 2')
        result = func(*args, **kwargs)
        print('End wrapper 2')
        return result
    return wrapper2


def start_end_decorator_1(func):
    print('Start decorator1')
    @functools.wraps(func)
    def wrapper1(*args, **kwargs):
        print('Exec wrapper 1')
        result = func(*args, **kwargs)
        print('End wrapper 1')
        return result
    return wrapper1


@start_end_decorator_2
@start_end_decorator_1
def say_hello(name):
    greeting = f'Hello {name}'
    print(greeting)
    return greeting
"""
This is equivalent to
func = start_end_decorator_1(func)
At this point func is the wrapper1 function below
    def wrapper1(*args, **kwargs):
        print('Exec wrapper 1')
        result = func(*args, **kwargs)
        print('End wrapper 1')
        return result
Then the next decorator, start_end_decorator_2, is applied
and func is replaced once more
func = start_end_decorator_2(start_end_decorator_1(func))
"""

say_hello(name='Alex')

# exec result
"""
    Start decorator1
    Start decorator2
    Exec wrapper 2
    Exec wrapper 1
    Hello Alex
    End wrapper 1
    End wrapper 2
"""

When decorators run

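With decorators, import time and run time need to be distinguished.

Decoration happens at import time: as soon as the module is imported, the decorator runs and the decorated name is bound to whatever the decorator returns, which is often a different function.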

registry = []

def register(func):
    print("running register(%s)" % func)
    registry.append(func)
    return func

@register
def f1():
    print("running f1")

@register
def f2():
    print("running f2")

def f3():
    print("running f3")

def main():
    print("running main")
    print("registry ->", registry)
    f1()
    f2()
    f3()

if __name__ == "__main__":
    main()

# Output when the module is imported
"""
running register(<function f1 at 0x7f4105056af0>)
running register(<function f2 at 0x7f4105056c10>)
"""

# Output when the module is run as a script and main() executes
# The decorated functions themselves only run at run time
"""
running register(<function f1 at 0x7f65b62d70d0>)
running register(<function f2 at 0x7f65b62d7160>)
running main
registry -> [<function f1 at 0x7f65b62d70d0>, <function f2 at 0x7f65b62d7160>]
running f1
running f2
running f3
"""

Class decorators

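A class can also be used as a decorator. To do so, it must implement the magic method __call__ so that its instances are callable. A typical use of class decorators is keeping state, such as the number of times a function has been called. Here functools.update_wrapper() is used instead of functools.wraps to preserve the decorated function's metadata.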

import functools

class CountCalls:
    # the init needs to have the func as argument and stores it
    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.num_calls = 0

    # extend functionality, execute function, and return the result
    def __call__(self, *args, **kwargs):
        self.num_calls += 1
        print(f"Call {self.num_calls} of {self.func.__name__!r}")
        return self.func(*args, **kwargs)

@CountCalls
def say_hello(num):
    print("Hello!")

say_hello(5)
say_hello(5)

# result
"""
    Call 1 of 'say_hello'
    Hello!
    Call 2 of 'say_hello'
    Hello!
"""

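Because the decorated name is bound to an instance of the class, the stored state can be read as an ordinary attribute. A self-contained sketch of that idea, using the same class without the print call (the function add is made up for illustration):

```python
import functools

class CountCalls:
    def __init__(self, func):
        functools.update_wrapper(self, func)  # copies __name__, __doc__, etc. onto the instance
        self.func = func
        self.num_calls = 0

    def __call__(self, *args, **kwargs):
        self.num_calls += 1
        return self.func(*args, **kwargs)

@CountCalls
def add(a, b):
    """Return a + b."""
    return a + b

add(1, 2)
add(3, 4)
print(type(add).__name__)  # CountCalls
print(add.num_calls)       # 2
print(add.__name__)        # add
```
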
Context Managers

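Context managers are used for resource management; they let you allocate and release resources conveniently.

Python's built-in with keyword works with context managers. Typical uses include:

  • Opening and closing files
  • Opening and closing database connections
  • Acquiring and releasing locks

from threading import Lock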
lock = Lock()

# error-prone:
lock.acquire()
# do stuff
# lock should always be released!
lock.release()

# Better:
with lock:
    pass  # do stuff
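
Roughly speaking, the with form behaves like an acquire followed by try/finally. The sketch below shows both spellings and checks that the lock is released even when the body raises; the shared list and the ValueError are only there for illustration:

```python
from threading import Lock

lock = Lock()
shared = []

# manual version: the finally clause guarantees the release
lock.acquire()
try:
    shared.append(1)
finally:
    lock.release()

# with version: the lock is released even if the body raises
try:
    with lock:
        shared.append(2)
        raise ValueError('something went wrong')
except ValueError:
    pass

print(lock.locked())  # False
print(shared)         # [1, 2]
```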

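Implementing a context manager class

To support the with keyword, a class must implement the __enter__ and __exit__ methods. When Python reaches the with statement it calls __enter__, which should acquire the resource and return it. When execution leaves the context, __exit__ is called, which should release the resource.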

class ManagedFile:
    def __init__(self, filename):
        print('init', filename)
        self.filename = filename

    def __enter__(self):
        print('enter')
        self.file = open(self.filename, 'w')
        return self.file

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if self.file:
            self.file.close()
        print('exit')

with ManagedFile('notes.txt') as f:
    print('doing stuff...')
    f.write('some todo...')
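
Handling exceptions

When an exception occurs, Python passes the exception type, value, and traceback to the __exit__ method, which can handle the exception there. If __exit__ returns a false value (including None, the implicit default), the exception propagates out of the with statement; returning a true value suppresses it.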

class ManagedFile:
    def __init__(self, filename):
        print('init', filename)
        self.filename = filename

    def __enter__(self):
        print('enter')
        self.file = open(self.filename, 'w')
        return self.file

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if self.file:
            self.file.close()
        print('exc:', exc_type, exc_value)
        print('exit')

# No exception
with ManagedFile('notes.txt') as f:
    print('doing stuff...')
    f.write('some todo...')
print('continuing...')

print()

# Exception is raised, but the file can still be closed
with ManagedFile('notes2.txt') as f:
    print('doing stuff...')
    f.write('some todo...')
    f.do_something()
print('continuing...')

The exception can also be handled inside __exit__, which then returns True

class ManagedFile:
    def __init__(self, filename):
        print('init', filename)
        self.filename = filename

    def __enter__(self):
        print('enter')
        self.file = open(self.filename, 'w')
        return self.file

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if self.file:
            self.file.close()
        if exc_type is not None:
            print('Exception has been handled')
        print('exit')
        return True


with ManagedFile('notes2.txt') as f:
    print('doing stuff...')
    f.write('some todo...')
    f.do_something()
print('continuing...')
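
Returning True unconditionally hides every error, which is rarely what you want. A hedged sketch of a more selective variant follows; the class name SuppressOnly is hypothetical, and the standard library's contextlib.suppress offers the same behavior ready-made:

```python
class SuppressOnly:
    """Hypothetical helper: swallow only the given exception types."""
    def __init__(self, *exc_types):
        self.exc_types = exc_types

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # True -> the exception is suppressed; False -> it propagates
        return exc_type is not None and issubclass(exc_type, self.exc_types)

with SuppressOnly(AttributeError):
    object().do_something()      # AttributeError, swallowed
print('continuing...')

propagated = False
try:
    with SuppressOnly(AttributeError):
        1 / 0                    # ZeroDivisionError, not swallowed
except ZeroDivisionError:
    propagated = True
print('ZeroDivisionError propagated:', propagated)
```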

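Implementing a context manager with a generator

Instead of writing a class, you can also write a generator function and decorate it with contextlib.contextmanager.

To do this, the function must yield the resource inside a try block and release it in the finally clause, which plays the role of __exit__.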

from contextlib import contextmanager

@contextmanager
def open_managed_file(filename):
    f = open(filename, 'w')
    try:
        yield f
    finally:
        f.close()

with open_managed_file('notes.txt') as f:
    f.write('some todo...')

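The generator first acquires the resource, then suspends its execution and yields the resource to the caller for use. When the caller leaves the with block, the generator resumes and runs the finally clause, releasing the resource.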
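
This sequence can be traced by recording each step in a list. The sketch below uses a made-up managed_resource function and an events list purely for illustration, and shows that the finally clause runs even when the body of the with block raises:

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed_resource(name):
    events.append(f'acquire {name}')      # runs on entering the with block
    try:
        yield name                        # value bound by "as"
    finally:
        events.append(f'release {name}')  # runs on leaving, even after an error

try:
    with managed_resource('db') as res:
        events.append(f'use {res}')
        raise RuntimeError('boom')
except RuntimeError:
    events.append('error propagated')

print(events)
# ['acquire db', 'use db', 'release db', 'error propagated']
```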

Unpacking in Python (the * operator)

The asterisk (*) serves several purposes in Python:

  • Use *args for variable-length arguments
  • Use **kwargs for variable-length keyword arguments
  • Use *, followed by more function parameters to enforce keyword-only arguments

def my_function(*args, **kwargs):
    for arg in args:
        print(arg)
    for key in kwargs:
        print(key, kwargs[key])

my_function("Hey", 3, [0, 1, 2], name="Alex", age=8)

# Parameters after '*' or '*identifier' are keyword-only parameters and may only be passed using keyword arguments.
def my_function2(name, *, age):
    print(name)
    print(age)

# my_function2("Michael", 5) --> this would raise a TypeError
my_function2("Michael", age=5)
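
The asterisk also works in the opposite direction, unpacking sequences and dicts at the call site, in assignments, and inside literals. A short sketch with a made-up describe function:

```python
def describe(name, age, city='unknown'):
    return f'{name}, {age}, {city}'

args = ('Alex', 28)
kwargs = {'city': 'Berlin'}
print(describe(*args, **kwargs))   # Alex, 28, Berlin

# extended unpacking in assignments
first, *middle, last = [1, 2, 3, 4, 5]
print(first, middle, last)         # 1 [2, 3, 4] 5

# merging iterables and dicts inside literals
merged_list = [*args, *middle]
merged_dict = {**kwargs, 'age': 28}
print(merged_list)                 # ['Alex', 28, 2, 3, 4]
print(merged_dict)                 # {'city': 'Berlin', 'age': 28}
```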

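Function argument passing, deep copies, and shallow copies

In Python, the assignment obj_b = obj_a does not create a real copy of the object; it only creates a new variable that refers to the same object as obj_a. So take care when you want a real copy of a mutable object that you can modify without affecting the original.

The copy module can produce real copies. For compound or nested objects, however, there is an important difference between shallow and deep copies:

  • Shallow copy

Only one level deep. Nested objects below the first level are references to the objects in the source, so modifying them also changes the source object.

  • Deep copy

A fully independent copy that recursively copies every nested object in the source.

Assignment

Creates another reference to the source object, so modifications also change the source object.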

list_a = [1, 2, 3, 4, 5]
list_b = list_a

list_a[0] = -10
print(list_a)
print(list_b)

"""
    [-10, 2, 3, 4, 5]
    [-10, 2, 3, 4, 5]
"""
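
Shallow copy

A shallow copy is only one level deep, so modifying the first level does not affect the source object. Use copy.copy(), an object-specific copy method, or a copy constructor.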

import copy
list_a = [1, 2, 3, 4, 5]
list_b = copy.copy(list_a)

# does not affect the other list
list_b[0] = -10
print(list_a)
print(list_b)

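With nested objects, however, modifying data at the second level or deeper does affect the source object, because at that level the copy holds references rather than values.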

import copy
list_a = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
list_b = copy.copy(list_a)

# affects the other!
list_a[0][0] = -10
print(list_a)
print(list_b)

"""
    [[-10, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
    [[-10, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
"""

For lists, other ways to make a shallow copy are

# shallow copies
list_b = list(list_a)
list_b = list_a[:]
list_b = list_a.copy()
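
The sharing can be checked directly with the is operator. A minimal sketch (variable names are illustrative) that compares a shallow copy with a copy made by copy.deepcopy:

```python
import copy

list_a = [[1, 2, 3], [4, 5, 6]]
shallow = copy.copy(list_a)
deep = copy.deepcopy(list_a)

print(shallow is list_a)        # False: the outer list is a new object
print(shallow[0] is list_a[0])  # True: the inner lists are shared
print(deep[0] is list_a[0])     # False: the inner lists were copied too

# replacing a first-level element only rebinds a slot in the copy
shallow[0] = ['replaced']
print(list_a[0])                # [1, 2, 3]
```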

Deep copy

A deep copy is a fully independent clone, created with copy.deepcopy().

import copy
list_a = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
list_b = copy.deepcopy(list_a)

# does not affect the other
list_a[0][0] = -10
print(list_a)
print(list_b)

"""
    [[-10, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
"""
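
Deep and shallow copies of custom objects

The copy module can also make deep or shallow copies of instances of your own classes.

  • Assignment only copies the object reference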
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Only copies the reference
p1 = Person('Alex', 27)
p2 = p1
p2.age = 28
print(p1.age)
print(p2.age)

"""
    28
    28
"""
  • A shallow copy copies one level
# shallow copy
import copy
p1 = Person('Alex', 27)
p2 = copy.copy(p1)
p2.age = 28
print(p1.age)
print(p2.age)

"""
    27
    28
"""
  • A deep copy copies everything
class Company:
    def __init__(self, boss, employee):
        self.boss = boss
        self.employee = employee

# shallow copy will affect nested objects
boss = Person('Jane', 55)
employee = Person('Joe', 28)
company = Company(boss, employee)

company_clone = copy.copy(company)
company_clone.boss.age = 56
print(company.boss.age)
print(company_clone.boss.age)

"""
    56
    56
"""
# deep copy will not affect nested objects
boss = Person('Jane', 55)
employee = Person('Joe', 28)
company = Company(boss, employee)
company_clone = copy.deepcopy(company)
company_clone.boss.age = 56
print(company.boss.age)
print(company_clone.boss.age)
"""
    55
    56
"""
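
Passing arguments to functions

C distinguishes explicitly between passing by value and passing by address. Python always passes a reference to the object; whether the caller sees a change depends on whether the function mutates that object or merely rebinds its parameter name.

Python data types are either mutable or immutable.

For mutable types such as lists, the function receives a reference to the list, so the list can be modified inside the function. Immutable objects such as integers cannot be changed in place, so the caller's value stays the same.

  • Immutable objects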

# immutable objects -> no change
def foo(x):
    x = 5  # x += 5 would have no effect either: ints are immutable, so a new object is bound to the local name

var = 10
print('var before foo():', var)
foo(var)
print('var after foo():', var)

"""
    var before foo(): 10
    var after foo(): 10
"""
  • Mutable objects
# mutable objects -> change
def foo(a_list):
    a_list.append(4)

my_list = [1, 2, 3]
print('my_list before foo():', my_list)
foo(my_list)
print('my_list after foo():', my_list)
"""
    my_list before foo(): [1, 2, 3]
    my_list after foo(): [1, 2, 3, 4]
"""
  • Rebinding a reference to a mutable object
# Rebind a mutable reference -> no change
def foo(a_list):
    # the assignment rebinds the local name to a new list
    a_list = [50, 60, 70]  # a_list now refers to a new object that is local to the function
    a_list.append(50)

my_list = [1, 2, 3]
print('my_list before foo():', my_list)
foo(my_list)
print('my_list after foo():', my_list)

"""
    my_list before foo(): [1, 2, 3]
    my_list after foo(): [1, 2, 3]
"""
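
The difference between mutating and rebinding can be made visible with id(), which returns the identity of an object. The helper functions below are made up for illustration and are only a sketch of the idea:

```python
def show_id(obj):
    return id(obj)

def rebind(a_list):
    before = id(a_list)
    a_list = a_list + [4]      # builds a new list and rebinds the local name
    return before, id(a_list)

def mutate(a_list):
    before = id(a_list)
    a_list.append(4)           # changes the object the caller also refers to
    return before, id(a_list)

my_list = [1, 2, 3]
print(show_id(my_list) == id(my_list))   # True: the function receives the same object

before, after = rebind(my_list)
print(before == after, my_list)          # False [1, 2, 3]

before, after = mutate(my_list)
print(before == after, my_list)          # True [1, 2, 3, 4]
```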
  • Distinguishing += from =
# another example with rebinding references:
def foo(a_list):
    a_list += [4, 5] # this changes the caller's list in place

def bar(a_list):
    a_list = a_list + [4, 5] # this rebinds the reference to a new local variable

my_list = [1, 2, 3]
print('my_list before foo():', my_list)
foo(my_list)
print('my_list after foo():', my_list)

my_list = [1, 2, 3]
print('my_list before bar():', my_list)
bar(my_list)
print('my_list after bar():', my_list)

"""
    my_list before foo(): [1, 2, 3]
    my_list after foo(): [1, 2, 3, 4, 5]
    my_list before bar(): [1, 2, 3]
    my_list after bar(): [1, 2, 3]
"""
