00089-Python dataclasses --- Data Classes-windows10

Preface

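This module provides a decorator and functions for automatically adding generated special methods, such as __init__() and __repr__(), to user-defined classes. It was originally described in PEP 557.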

The member variables used in these generated methods are defined using PEP 526 type annotations. For example, this code:

from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

will add, among other things, an __init__() that looks like:

def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0):
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand

Note that this method is automatically added to the class: it is not directly specified in the InventoryItem definition shown above.
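To make the effect of the decorator concrete, the following minimal sketch exercises the generated __init__(), __repr__() and __eq__() on the InventoryItem class from above. The item name and numbers are made-up sample values, not part of the original example.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


# "widget", 2.5 and 4 are illustrative sample values.
item = InventoryItem("widget", 2.5, quantity_on_hand=4)

# The generated __repr__() lists the fields in definition order.
print(item)  # InventoryItem(name='widget', unit_price=2.5, quantity_on_hand=4)

# The generated __eq__() compares instances of the same class field by field.
same = item == InventoryItem("widget", 2.5, 4)

# Hand-written methods work as usual next to the generated ones.
total = item.total_cost()
```

None of the three special methods used here appears in the class body; they are all added by @dataclass.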

Operating system: Windows 10 Pro

Reference documentation

dataclasses --- Data Classes

dataclasses.field

Definition: dataclasses.field(*, default=MISSING, default_factory=MISSING, init=True, repr=True, hash=None, compare=True, metadata=None, kw_only=MISSING)

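When a field needs more per-field information than a plain default value can express, you can replace the default value with a call to the provided field() function. For example: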

from dataclasses import dataclass, field

@dataclass
class C:
    mylist: list[int] = field(default_factory=list)

c = C()
c.mylist += [1, 2, 3]
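As a small illustration of why default_factory matters, the sketch below checks that each instance gets its own list, and that a bare mutable default is rejected when the class is created. The class name Broken is hypothetical and exists only to trigger the error.

```python
from dataclasses import dataclass, field


@dataclass
class C:
    mylist: list[int] = field(default_factory=list)


first = C()
second = C()
first.mylist += [1, 2, 3]

# default_factory is called once per instance, so `second` is unaffected.
independent = second.mylist == []

# A bare list as the default is rejected while the class is being built.
try:
    @dataclass
    class Broken:  # hypothetical class, for illustration only
        mylist: list = []
except ValueError as exc:
    error = exc
else:
    error = None

print(first.mylist, second.mylist, type(error).__name__)
```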

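default: If provided, this will be the default value for this field. This parameter is needed because the field() call itself occupies the position where the default value would normally go.
default_factory: If provided, it must be a zero-argument callable that will be called when a default value is needed for this field. This solves the problems that arise when the default value is a mutable object, as discussed below. It is an error to specify both default and default_factory.
init: If true (the default), this field is included as a parameter to the generated __init__() method.
repr: If true (the default), this field is included in the string returned by the generated __repr__() method.
hash: This can be a bool or None. If true, this field is included in the generated __hash__() method. If None (the default), the value of compare is used: this is normally the expected behavior, since a field that is used for comparisons should also be considered in the hash. Setting this value to anything other than None is discouraged. One reasonable case for hash=False with compare=True is a field that is expensive to hash, is needed for equality testing, and sits alongside other fields that can contribute to the type's hash value. Such a field can be excluded from the hash while still being used for comparisons.
compare: If true (the default), this field is included in the generated equality and ordering methods (__eq__(), __gt__(), and so on).
metadata: This can be a mapping or None. None is treated as an empty dict. The value is wrapped in MappingProxyType() to make it read-only and is exposed on the Field object. Data classes do not use it at all; it is provided as a third-party extension mechanism. Multiple third parties can each have their own key, to use as a namespace in the metadata.
kw_only: If true, this field will be marked as keyword-only. This is used when the parameters of the generated __init__() method are computed. (This parameter was added in Python 3.10.)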

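If the default value of a field is specified by a call to field(), then the class attribute for this field will be replaced by the specified default value. If no default is provided, then the class attribute will be deleted. The intent is that after the dataclass() decorator runs, the class attributes all contain the default values for the fields, just as if the default value itself had been specified. For example, after running the following code: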

@dataclass
class C:
    x: int
    y: int = field(repr=False)
    z: int = field(repr=False, default=10)
    t: int = 20

The class attribute C.z will be 10, the class attribute C.t will be 20, and the class attributes C.x and C.y will not be set.
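This can be checked directly with hasattr(); a minimal sketch using the same class C:

```python
from dataclasses import dataclass, field


@dataclass
class C:
    x: int
    y: int = field(repr=False)
    z: int = field(repr=False, default=10)
    t: int = 20


# Fields with a default keep it as a class attribute.
z_default = C.z
t_default = C.t

# Fields without a default leave no class attribute behind.
has_x = hasattr(C, "x")
has_y = hasattr(C, "y")

print(z_default, t_default, has_x, has_y)  # 10 20 False False
```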

dataclasses.fields

Definition: dataclasses.fields(class_or_instance)

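Returns a tuple of Field objects that describe the fields of this dataclass. Accepts either a dataclass or an instance of a dataclass. Raises TypeError if it is not passed a dataclass or an instance of one. Does not return pseudo-fields such as ClassVar or InitVar.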
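A minimal sketch of this behavior follows. Config and its field names are hypothetical, introduced only to show that ClassVar and InitVar pseudo-fields are skipped and that a non-dataclass argument is rejected.

```python
from dataclasses import InitVar, dataclass, fields
from typing import ClassVar


@dataclass
class Config:  # hypothetical class, for illustration only
    host: str
    port: int = 8080
    instances: ClassVar[int] = 0   # pseudo-field: not returned by fields()
    debug: InitVar[bool] = False   # pseudo-field: not returned by fields()


# fields() accepts the class itself or an instance of it.
field_names = [f.name for f in fields(Config)]
instance_names = [f.name for f in fields(Config("localhost"))]

# Each item is a Field object carrying the name, type, default and so on.
port_default = {f.name: f.default for f in fields(Config)}["port"]

# Anything that is not a dataclass or dataclass instance raises TypeError.
try:
    fields(42)
except TypeError as exc:
    error = exc
else:
    error = None

print(field_names, port_default)  # ['host', 'port'] 8080
```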

dataclasses.asdict

Definition: dataclasses.asdict(obj, *, dict_factory=dict)

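Converts the dataclass obj to a dict (by using the factory function dict_factory). Each dataclass is converted to a dict of its fields, as name: value pairs. Dataclasses, dicts, lists, and tuples are recursed into. Other objects are copied with copy.deepcopy().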

Example of using asdict() on nested dataclasses:

from dataclasses import dataclass, asdict

@dataclass
class Point:
    x: int
    y: int

@dataclass
class C:
    mylist: list[Point]

p = Point(10, 20)
assert asdict(p) == {'x': 10, 'y': 20}

c = C([Point(0, 0), Point(10, 4)])
assert asdict(c) == {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}

To create a shallow copy, the following workaround may be used:

dict((field.name, getattr(obj, field.name)) for field in fields(obj))

asdict() raises TypeError if obj is not a dataclass instance.
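The difference between the recursive conversion and the shallow workaround can be sketched as follows, reusing the Point and C classes from the example above. Using OrderedDict as the dict_factory is just one illustrative choice of factory.

```python
from collections import OrderedDict
from dataclasses import asdict, dataclass, fields


@dataclass
class Point:
    x: int
    y: int


@dataclass
class C:
    mylist: list[Point]


c = C([Point(0, 0), Point(10, 4)])

# asdict() recurses: the list is copied and each Point becomes a dict.
deep = asdict(c)

# The shallow workaround keeps the very same list of Point instances.
shallow = {f.name: getattr(c, f.name) for f in fields(c)}

# dict_factory is called with a list of (name, value) pairs.
ordered = asdict(Point(1, 2), dict_factory=OrderedDict)

# Passing the class instead of an instance raises TypeError.
try:
    asdict(Point)
except TypeError as exc:
    error = exc
else:
    error = None

print(deep, shallow["mylist"] is c.mylist, type(ordered).__name__)
```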

Post-init processing

The generated __init__() code will call a method named __post_init__(), if __post_init__() is defined on the class.

Among other uses, this allows for initializing field values that depend on one or more other fields. For example:

@dataclass
class C:
    a: float
    b: float
    c: float = field(init=False)

    def __post_init__(self):
        self.c = self.a + self.b
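A short usage sketch of the class above: because c is declared with init=False, it is not a parameter of the generated __init__(), and its value comes entirely from __post_init__(). The numeric arguments are arbitrary sample values.

```python
from dataclasses import dataclass, field


@dataclass
class C:
    a: float
    b: float
    c: float = field(init=False)

    def __post_init__(self):
        self.c = self.a + self.b


obj = C(1.5, 2.0)
print(obj)  # C(a=1.5, b=2.0, c=3.5)

# c is not an __init__() parameter, so a third argument is rejected.
try:
    C(1.5, 2.0, 3.5)
except TypeError as exc:
    error = exc
else:
    error = None
```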

The __init__() method generated by dataclass() does not call base class __init__() methods. If the base class has an __init__() method that has to be called, it is common to call this method in a __post_init__() method:

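# Rectangle is a regular class with a hand-written __init__();
# the __init__() generated for Square does not call it automatically.
class Rectangle:
    def __init__(self, height, width):
        self.height = height
        self.width = width

@dataclass
class Square(Rectangle):
    side: float

    def __post_init__(self):
        super().__init__(self.side, self.side)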

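Note, however, that in general the dataclass-generated __init__() methods do not need to be called, since the derived dataclass takes care of initializing all fields of any base class that is itself a dataclass.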
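A minimal sketch of that last point, with hypothetical classes Base and Derived: the __init__() generated for the derived dataclass accepts and assigns the base class fields itself, so no call to the base __init__() is needed.

```python
from dataclasses import dataclass, fields


@dataclass
class Base:  # hypothetical class, for illustration only
    x: int
    y: int = 0


@dataclass
class Derived(Base):  # hypothetical class, for illustration only
    z: int = 0


# Base fields come first in the generated __init__(), then Derived's own.
d = Derived(1, 2, 3)
names = [f.name for f in fields(Derived)]

print(d, names)  # Derived(x=1, y=2, z=3) ['x', 'y', 'z']
```

Note that z needs a default here because the inherited field y already has one; a field without a default cannot follow a field with a default.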

See the section below on init-only variables for ways to pass parameters to __post_init__(). Also see the warning about how replace() handles init=False fields.