Python from Zero in 100 Days - Day 11: Sets

2026-07-02 16:50:36

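🐍 Sets: The Deduplication Expert and Set Algebra

🕐 Estimated time: 2-3 hours | 🎯 Goal: master creating sets, set operations, deduplication techniques, and frozenset

📖 Today's Contents

What is a set?
Creating sets
Basic set operations
Mathematical set operations
Set methods in full
Set comprehensions
frozenset: the immutable set
Hands-on projects
Summary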

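1. What is a set?

A set is Python's "deduplication expert": it automatically discards duplicate elements, and its elements have no defined order.

Think of a class roster: no matter how many times you read out "张三, 李四, 张三", the roster only ever lists "张三, 李四". That is a set.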

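Feature	list	tuple	dict	set
Literal	`[]`	`()`	`{}`	`{1, 2}` (empty: `set()`)
Ordered	✅	✅	✅ (insertion order, 3.7+)	❌
Mutable	✅	❌	✅	✅
Duplicates	✅ allowed	✅ allowed	keys are unique	❌ removed automatically
Index access	✅	✅	by key	❌ not supported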

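    # A list can contain duplicates
    fruits_list = ["apple", "banana", "apple", "orange", "apple"]
    print(len(fruits_list))  # 5 (five elements)

    # A set removes duplicates automatically
    fruits_set = {"apple", "banana", "apple", "orange", "apple"}
    print(fruits_set)         # {'apple', 'banana', 'orange'} (only 3 left; order may vary)
    print(len(fruits_set))    # 3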

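💡 The core value of a set: deduplication plus set algebra. Whenever you need to remove duplicates or compute intersections, unions, or differences, a set is usually the best choice.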
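One thing to watch when deduplicating: a set does not remember the order in which items arrived. The sketch below compares plain `set()` with `dict.fromkeys()`, which keeps the first occurrence of each item in order (this relies on dicts preserving insertion order, Python 3.7+). The `visits` data and variable names are made up for illustration.

```python
visits = ["home", "about", "home", "pricing", "about", "home"]

# Plain set: duplicates are gone, but the original order is not guaranteed
unique_any_order = set(visits)

# dict.fromkeys keeps the first occurrence of each item, in order
unique_in_order = list(dict.fromkeys(visits))

print(sorted(unique_any_order))  # ['about', 'home', 'pricing']
print(unique_in_order)           # ['home', 'about', 'pricing']
```

If order does not matter, `set()` is the simpler choice; if it does, the `dict.fromkeys()` form is a common alternative.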

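2. Creating sets

📖 Creating with curly braces {}

    # Create directly
    colors = {"red", "green", "blue", "red"}  # the duplicate "red" is dropped
    print(colors)  # {'red', 'green', 'blue'} (order not guaranteed)

    # Creating an empty set ⚠️ Careful!
    empty = {}        # ❌ this is a dict, not a set!
    empty = set()     # ✅ an empty set can only be created with set()
    print(type({}))     # <class 'dict'>
    print(type(set()))  # <class 'set'>

⚠️ Classic interview trap: {} is an empty dict, not an empty set! An empty set must be created with set(). This is historical baggage: the {} literal belonged to dict before sets had a literal syntax, so just remember it.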

🏭 Converting from other types with set()

    # From a list (duplicates removed)
    nums = [1, 2, 3, 2, 1, 4, 5, 4]
    unique_nums = set(nums)
    print(unique_nums)  # {1, 2, 3, 4, 5}

    # From a string (each character becomes an element)
    letters = set("hello")
    print(letters)  # {'h', 'e', 'l', 'o'} (the repeated 'l' is dropped)

    # From a list of tuples
    coords = set([(1, 2), (3, 4), (1, 2)])
    print(coords)  # {(1, 2), (3, 4)}

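⚠️ Requirements for set elements

    # ✅ Allowed in a set: hashable (typically immutable) values
    valid = {1, "hello", (1, 2), True, 3.14}
    print(len(valid))  # 4, not 5: True == 1 and they hash the same, so they count as one element

    # ❌ Not allowed in a set: mutable containers
    # invalid = {[1, 2], {"a": 1}}  # TypeError: unhashable type: 'list'

💡 Why this requirement? A set is implemented as a hash table, so where an element is stored is determined by its hash value. If an element could change after insertion (as a list can), its hash would change and the set could no longer find it. That is why only hashable values can go into a set. Immutability alone is not enough: a tuple is hashable only if everything inside it is hashable.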
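To see the hashability rule in action, here is a small sketch. The helper `is_hashable` is a hypothetical name introduced for this example; it simply calls the built-in `hash()` and reports whether that succeeds.

```python
def is_hashable(value):
    """Return True if value can be used as a set element."""
    try:
        hash(value)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2)))           # True
print(is_hashable((1, [2, 3])))      # False: the tuple contains a list
print(is_hashable([1, 2]))           # False
print(is_hashable(frozenset({1})))   # True

# 1, 1.0 and True compare equal and hash equally, so a set keeps only one
print(len({1, 1.0, True}))  # 1
```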

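3. Basic set operations

➕ Adding elements

    s = {1, 2, 3}

    # add(): add a single element
    s.add(4)
    print(s)  # {1, 2, 3, 4}
    s.add(2)  # adding an existing element has no effect
    print(s)  # {1, 2, 3, 4} (unchanged)

    # update(): add many elements at once (accepts lists, tuples, sets, or any other iterable)
    s.update([5, 6], {7, 8})
    print(s)  # {1, 2, 3, 4, 5, 6, 7, 8}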

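➖ Removing elements

    s = {"apple", "banana", "orange", "grape"}

    # remove(): delete the given element; raises KeyError if it is missing
    s.remove("banana")
    print(s)  # {'apple', 'orange', 'grape'}
    # s.remove("watermelon")  # ❌ KeyError: 'watermelon'

    # discard(): delete the given element; does nothing if it is missing
    s.discard("watermelon")   # ✅ no error, nothing happens
    s.discard("apple")
    print(s)  # {'orange', 'grape'}

    # pop(): remove and return an arbitrary element
    s = {10, 20, 30, 40}
    elem = s.pop()
    print(elem)  # some element (which one is not specified)
    print(s)     # the remaining 3

    # clear(): empty the set
    s.clear()
    print(s)  # set()

💡 remove or discard?
A missing element would be a bug → remove() (the KeyError makes the problem visible)
The element may or may not be present → discard() (no error is raised)
In everyday code discard() is often more convenient because you do not have to test for membership first.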
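The difference shows up most clearly in error handling. The following sketch uses a made-up `pending` task set; it catches the `KeyError` from `remove()`, shows that `discard()` stays silent, and drains the set with `pop()`, which also raises `KeyError` once the set is empty.

```python
pending = {"build", "test"}

# remove() signals a missing element with KeyError
try:
    pending.remove("deploy")
except KeyError as exc:
    print(f"not pending: {exc}")  # not pending: 'deploy'

# discard() is silent whether or not the element is present
pending.discard("deploy")

# Drain the set with pop(); the loop condition guards against popping from an empty set
done = []
while pending:
    done.append(pending.pop())

print(sorted(done))  # ['build', 'test']
print(pending)       # set()
```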

🔍 Membership and length

    s = {1, 2, 3, 4, 5}
    print(len(s))      # 5: number of elements
    print(3 in s)      # True: present
    print(6 not in s)  # True: absent
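Membership tests on a set are hash lookups, so `in` stays fast even for large sets. A common pattern built on this is a "seen" set. The function `find_duplicates` and the sample `order_ids` below are illustrative assumptions, not part of the lesson's own code.

```python
def find_duplicates(items):
    """Return the set of values that occur more than once in items."""
    seen = set()
    duplicates = set()
    for item in items:
        if item in seen:
            duplicates.add(item)   # already encountered earlier
        else:
            seen.add(item)
    return duplicates

order_ids = [101, 102, 103, 102, 104, 101]
print(sorted(find_duplicates(order_ids)))  # [101, 102]
```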

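4. Mathematical set operations

This is where sets are most powerful: intersection, union, difference, and symmetric difference.

Suppose we have the enrollment lists for two classes:

    math_class = {"张三", "李四", "王五", "赵六"}     # math class
    english_class = {"李四", "王五", "孙七", "周八"}   # English class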

🤝 Intersection: students enrolled in both classes

    # Operator: &
    both = math_class & english_class
    print(both)  # {'李四', '王五'} (students taking both classes)

    # Method form
    both = math_class.intersection(english_class)
    print(both)  # {'李四', '王五'}

🤝 Union: everyone enrolled in at least one class

    # Operator: |
    all_students = math_class | english_class
    print(all_students)  # {'张三', '李四', '王五', '赵六', '孙七', '周八'}

    # Method form
    all_students = math_class.union(english_class)
    print(all_students)

➖ Difference: students who take math but not English

    # Operator: -
    only_math = math_class - english_class
    print(only_math)  # {'张三', '赵六'}

    only_english = english_class - math_class
    print(only_english)  # {'孙七', '周八'}

💡 Difference is directional: A - B ≠ B - A. A - B is "in A but not in B"; B - A is "in B but not in A".

🔄 Symmetric difference: students enrolled in exactly one class

    # Operator: ^
    only_one = math_class ^ english_class
    print(only_one)  # {'张三', '赵六', '孙七', '周八'}

    # Equivalent to (A - B) | (B - A)
    only_one = (math_class - english_class) | (english_class - math_class)
    print(only_one)  # {'张三', '赵六', '孙七', '周八'}

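📐 Full overview

Operation	Operator	Method	Meaning
Intersection	`A & B`	`A.intersection(B)`	elements in both A and B
Union	`A | B`	`A.union(B)`	elements in A or B (or both)
Difference	`A - B`	`A.difference(B)`	elements in A but not in B
Symmetric difference	`A ^ B`	`A.symmetric_difference(B)`	elements in exactly one of A and B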
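The operator and method forms in the table are not fully interchangeable: the methods accept any iterable (and, for most of them, several arguments), while the operators require a set or frozenset on both sides. A short sketch with made-up data:

```python
enrolled = {"ana", "ben", "cho", "dev"}
dropped = ["ben", "dev"]              # a list, not a set

# Methods accept any iterable
still_enrolled = enrolled.difference(dropped)
print(sorted(still_enrolled))  # ['ana', 'cho']

# Operators require a set (or frozenset) on both sides
try:
    enrolled - dropped
except TypeError:
    print("operator needs a set: use enrolled - set(dropped)")

# intersection(), union() and difference() take several arguments at once
a, b, c = {1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}
print(sorted(a.intersection(b, c)))   # [3, 4]
print(sorted(a.union(b, c)))          # [1, 2, 3, 4, 5, 6]

# Symmetric difference equals union minus intersection
assert a ^ b == (a | b) - (a & b)
```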

5. Set methods in full

📋 Test methods

    A = {1, 2, 3, 4, 5}
    B = {1, 2, 3}
    C = {6, 7}

    # issubset(): is it a subset?
    print(B.issubset(A))       # True (B is a subset of A)
    print(B <= A)              # True (same thing, operator form)
    print(B < A)               # True (proper subset: B is a subset of A and B ≠ A)

    # issuperset(): is it a superset?
    print(A.issuperset(B))     # True (A contains B)
    print(A >= B)              # True

    # isdisjoint(): do they share no elements?
    print(A.isdisjoint(C))     # True (A and C have no common elements)
    print(B.isdisjoint(C))     # True

Method	Meaning	Operator
`A.issubset(B)`	A is a subset of B	`A <= B`
`A.issuperset(B)`	A is a superset of B	`A >= B`
`A.isdisjoint(B)`	A and B share no elements	none
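A typical use of the subset test is a permission check. The sketch below uses invented permission names; it also shows that `<=` allows equality while `<` does not, and that two sets which merely overlap are not ordered at all.

```python
required = {"read", "write"}
granted = {"read", "write", "delete"}

# Are all required permissions present?
print(required <= granted)        # True
# Which required permissions are missing?
print(required - granted)         # set()

# <= allows equality, < does not
same = {"read", "write"}
print(required <= same)           # True
print(required < same)            # False

# Overlapping sets are not comparable: each of these tests is False
x, y = {1, 2}, {2, 3}
print(x <= y, x >= y, x == y)     # False False False
```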

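🔄 In-place update methods

    s = {1, 2, 3}

    # intersection_update(): intersect in place
    s.intersection_update({2, 3, 4})
    print(s)  # {2, 3}

    # difference_update(): take the difference in place
    s = {1, 2, 3, 4}
    s.difference_update({3, 4, 5})
    print(s)  # {1, 2}

    # symmetric_difference_update(): take the symmetric difference in place
    s = {1, 2, 3}
    s.symmetric_difference_update({3, 4, 5})
    print(s)  # {1, 2, 4, 5}

💡 With or without "update":
A.union(B) → returns a new set; A is unchanged
A.update(B) → modifies A directly and returns None
Remember: a name ending in update means the set is modified in place, which avoids building a new set.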
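The in-place methods also have operator forms: `|=`, `&=`, `-=`, and `^=`. The sketch below (variable names are illustrative) shows why the distinction matters when two names refer to the same set object.

```python
base = {1, 2, 3}
alias = base            # a second name for the same set object

# | builds a new set; base and alias are untouched
merged = base | {4}
print(merged is base)   # False
print(sorted(alias))    # [1, 2, 3]

# |= (like update) mutates the existing object, so alias sees the change
base |= {4}
print(sorted(alias))    # [1, 2, 3, 4]

# In-place methods return None; do not assign their result
result = base.update({5})
print(result)           # None
print(sorted(base))     # [1, 2, 3, 4, 5]

# &= is the in-place intersection
base &= {1, 2, 9}
print(sorted(base))     # [1, 2]
```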

6. Set comprehensions

The syntax is almost identical to a list comprehension; just swap [] for {}.

    # Basic form
    nums = [1, 4, 9, 16, 25]
    roots = {int(n ** 0.5) for n in nums}
    print(roots)  # {1, 2, 3, 4, 5} (the result is a set, so any duplicates would be dropped)

    # With a condition
    evens = {n for n in range(20) if n % 2 == 0}
    print(evens)  # {0, 2, 4, 6, 8, 10, 12, 14, 16, 18}

    # Practical use: the distinct characters in a string
    text = "hello world"
    unique_chars = {c for c in text if c != ' '}
    print(unique_chars)  # {'h', 'e', 'l', 'o', 'w', 'r', 'd'}

    # The distinct values of a dict
    scores = {"张三": 90, "李四": 85, "王五": 90, "赵六": 85}
    unique_scores = {v for v in scores.values()}
    print(unique_scores)  # {85, 90}
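Set comprehensions are most useful when the expression maps several inputs to the same value, because the set then merges them. A small sketch with made-up email data, plus a reminder that `{key: value for ...}` builds a dict rather than a set:

```python
raw_emails = ["Ann@Example.com", "bob@example.com ", "ann@example.com", "", "BOB@example.com"]

# Normalize each address, skip blanks, and let the set merge the duplicates
emails = {e.strip().lower() for e in raw_emails if e.strip()}
print(sorted(emails))  # ['ann@example.com', 'bob@example.com']

# Squares of -3..3: seven inputs, four distinct results
squares = {n * n for n in range(-3, 4)}
print(sorted(squares))  # [0, 1, 4, 9]

# {} with "key: value" is a dict comprehension, not a set comprehension
lengths = {e: len(e) for e in emails}
print(type(lengths).__name__)  # dict
```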

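7. frozenset: the immutable set

frozenset is the "read-only version" of a set: once created, elements cannot be added, removed, or changed.

    # Create a frozenset
    fs = frozenset([1, 2, 3, 4])
    print(fs)        # frozenset({1, 2, 3, 4})
    print(type(fs))  # <class 'frozenset'>

    # It cannot be modified
    # fs.add(5)      # ❌ AttributeError
    # fs.remove(1)   # ❌ AttributeError

    # Set operations still work (and return a new frozenset)
    fs2 = frozenset([3, 4, 5, 6])
    print(fs & fs2)   # frozenset({3, 4}) (intersection)
    print(fs | fs2)   # frozenset({1, 2, 3, 4, 5, 6}) (union)

    # A frozenset can be a dict key or a set element
    d = {frozenset([1, 2]): "pair A"}
    print(d[frozenset([1, 2])])  # pair A

    # Nested sets (a plain set cannot be nested because it is mutable and therefore unhashable)
    nested = {frozenset([1, 2]), frozenset([3, 4])}
    print(nested)  # {frozenset({1, 2}), frozenset({3, 4})}

💡 When to use frozenset?
1. When a set has to serve as a dict key or as an element of another set
2. When the data must not be modified by accident
3. When immutable set data has to be shared between threads
It is not used much in everyday code, but it comes up in interviews and in specific scenarios.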
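One concrete case for a frozenset key is an unordered pair, where ("A", "B") and ("B", "A") should mean the same thing. The sketch below uses a hypothetical `distance_km` table and `distance` helper, both invented for illustration.

```python
# Hypothetical road distances; each key is an unordered pair of cities
distance_km = {
    frozenset({"A", "B"}): 12,
    frozenset({"B", "C"}): 7,
}

def distance(city_1, city_2):
    """Look up the distance regardless of argument order; None if unknown."""
    return distance_km.get(frozenset({city_1, city_2}))

print(distance("A", "B"))  # 12
print(distance("B", "A"))  # 12
print(distance("A", "C"))  # None

# A frozenset and a set compare equal when their elements match
print(frozenset({1, 2}) == {1, 2})  # True
```

With tuple keys, both orderings would have to be stored or the arguments sorted before every lookup; the frozenset key removes that step.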

8. Hands-on projects

🎯 Project 1: Vote deduplication counter

A class is electing a class monitor. Each student may vote for several candidates, but may not vote for the same candidate more than once. Count the valid votes.

    def count_votes(votes):
        """Collect each voter's choices, dropping duplicate votes."""
        results = {}
        for voter, candidate in votes:
            # Each voter gets at most one vote per candidate
            if voter not in results:
                results[voter] = set()
            results[voter].add(candidate)
        return results

    # Sample data as (voter, candidate)
    votes = [
        ("张三", "李四"), ("张三", "王五"), ("张三", "李四"),  # 张三 voted for 李四 twice
        ("李四", "王五"), ("李四", "赵六"),
        ("王五", "李四"), ("王五", "王五"),                     # 王五 cast a self-vote
        ("赵六", "李四"), ("赵六", "李四"),                     # 赵六 voted for 李四 twice
        ("孙七", "王五"),
    ]

    results = count_votes(votes)
    print("📊 Votes by voter:")
    print("-" * 40)
    for voter, candidates in results.items():
        # sorted() gives a stable display order; set iteration order is not guaranteed
        print(f"  {voter} voted for: {', '.join(sorted(candidates))} ({len(candidates)} votes)")

    # Total votes per candidate
    all_candidates = set()
    for candidates in results.values():
        all_candidates.update(candidates)

    print("\n🏆 Votes per candidate:")
    for candidate in sorted(all_candidates):
        count = sum(1 for cands in results.values() if candidate in cands)
        print(f"  {candidate}: {count} votes")
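The same tally can be sketched more compactly. Because (voter, candidate) tuples are hashable, a set can drop the repeated ballots directly, and `collections.Counter` can count what remains. This is an alternative under the same voting rule, not a replacement for the project code; the `ballots` data is made up.

```python
from collections import Counter

# Illustrative (voter, candidate) pairs, including one repeated ballot
ballots = [
    ("v1", "Lin"), ("v1", "Lin"), ("v1", "Mei"),
    ("v2", "Lin"),
    ("v3", "Mei"), ("v3", "Lin"),
]

# Tuples are hashable, so a set removes repeated (voter, candidate) pairs
valid_ballots = set(ballots)

# Count the remaining ballots per candidate
tally = Counter(candidate for _, candidate in valid_ballots)

print(len(valid_ballots))          # 5
print(tally["Lin"], tally["Mei"])  # 3 2
print(tally.most_common(1))        # [('Lin', 3)]
```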

🎯 Project 2: Mutual friend finder

In a social network, find two people's mutual friends, the friends only one of them has, and all of their friends combined.

    def analyze_friends(person_a, friends_a, person_b, friends_b):
        """Analyze the friend relationship between two people."""
        set_a = set(friends_a)
        set_b = set(friends_b)

        common = set_a & set_b           # mutual friends
        only_a = set_a - set_b           # friends only A has
        only_b = set_b - set_a           # friends only B has
        all_friends = set_a | set_b      # all friends combined
        non_mutual = set_a ^ set_b       # friends of exactly one of them

        print(f"👥 {person_a} vs {person_b} friend analysis")
        print("=" * 50)
        print(f"  {person_a} friends: {len(set_a)}")
        print(f"  {person_b} friends: {len(set_b)}")
        print(f"  Mutual friends: {len(common)} → {common or 'none'}")
        print(f"  Only {person_a}: {len(only_a)} → {only_a or 'none'}")
        print(f"  Only {person_b}: {len(only_b)} → {only_b or 'none'}")
        print(f"  All friends: {len(all_friends)}")

        # Similarity (Jaccard index)
        if all_friends:
            similarity = len(common) / len(all_friends) * 100
            print(f"  Similarity: {similarity:.1f}%")

        return {
            "common": common,
            "only_a": only_a,
            "only_b": only_b,
            "non_mutual": non_mutual,
            "all": all_friends
        }

    # Friend data
    alice_friends = ["Bob", "Charlie", "David", "Eve", "Frank"]
    bob_friends = ["Alice", "Charlie", "Grace", "Eve", "Henry"]

    result = analyze_friends("Alice", alice_friends, "Bob", bob_friends)
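The similarity figure in this project is the Jaccard index, the size of the intersection divided by the size of the union. Pulled out into a standalone helper it might look like the sketch below; the function name `jaccard` and the choice to return 0.0 for two empty inputs are assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| for two iterables; 0.0 if both are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)

alice = ["Bob", "Charlie", "David", "Eve", "Frank"]
bob = ["Alice", "Charlie", "Grace", "Eve", "Henry"]

print(jaccard(alice, bob))      # 0.25 (2 mutual friends out of 8 in total)
print(jaccard(alice, alice))    # 1.0
print(jaccard([], []))          # 0.0
```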

🎯 Project 3: Tag analyzer

Analyze how similar articles are by their tags, and find the hot tags and the tags unique to each article.

    def analyze_tags(articles):
        """Analyze article tags."""
        # Tag set for each article
        article_tags = {name: set(tags) for name, tags in articles.items()}

        # Every tag that appears anywhere
        all_tags = set()
        for tags in article_tags.values():
            all_tags.update(tags)

        # Count how many articles use each tag
        tag_count = {}
        for tag in all_tags:
            tag_count[tag] = sum(1 for tags in article_tags.values() if tag in tags)

        # Hot tags (used by >= 2 articles)
        hot_tags = {tag for tag, count in tag_count.items() if count >= 2}

        # Tags unique to each article
        unique_tags = {}
        for name, tags in article_tags.items():
            others = set()
            for other_name, other_tags in article_tags.items():
                if other_name != name:
                    others.update(other_tags)
            unique_tags[name] = tags - others

        print("🏷️ Tag analysis report")
        print("=" * 50)
        print(f"\n📊 Tag statistics ({len(all_tags)} tags in total):")
        # Sort by count (descending), then by name so ties print in a stable order
        for tag, count in sorted(tag_count.items(), key=lambda x: (-x[1], x[0])):
            bar = "█" * count
            print(f"  {tag:12s} | {bar} ({count} articles)")

        print(f"\n🔥 Hot tags: {hot_tags or 'none'}")
        print("\n✨ Tags unique to each article:")
        for name, tags in unique_tags.items():
            print(f"  {name}: {tags or 'none'}")

        # Pairwise tag similarity between articles (Jaccard index)
        print("\n🔗 Article similarity:")
        names = list(article_tags.keys())
        for i, name_a in enumerate(names):
            for name_b in names[i+1:]:
                tags_a = article_tags[name_a]
                tags_b = article_tags[name_b]
                common = tags_a & tags_b
                union = tags_a | tags_b
                sim = len(common) / len(union) * 100 if union else 0
                print(f"  {name_a} ↔ {name_b}: {sim:.0f}% (shared: {common or 'none'})")

    # Test data
    articles = {
        "Python Basics": ["python", "programming", "beginner", "tutorial"],
        "Advanced Python": ["python", "programming", "advanced", "decorators"],
        "Web Development": ["python", "web", "flask", "programming"],
        "Data Analysis": ["python", "pandas", "data", "analysis"],
        "Machine Learning": ["ml", "sklearn", "data", "model"],
    }

    analyze_tags(articles)
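The project counts each tag by scanning every article again. For larger inputs, one possible alternative is to build an inverted index (tag → set of article titles) in a single pass with `collections.defaultdict(set)`. The sketch below uses its own small `posts` data and names; it is a variation on the idea, not the project's method.

```python
from collections import defaultdict

# Illustrative data: article title -> list of tags
posts = {
    "intro": ["python", "basics"],
    "web": ["python", "flask"],
    "ml": ["sklearn", "data"],
}

# Build the inverted index in one pass: tag -> set of titles
titles_by_tag = defaultdict(set)
for title, tags in posts.items():
    for tag in tags:
        titles_by_tag[tag].add(title)

# Tags shared by two or more posts
hot = {tag for tag, titles in titles_by_tag.items() if len(titles) >= 2}

print(sorted(titles_by_tag["python"]))  # ['intro', 'web']
print(hot)                              # {'python'}
```

The count for any tag is then just `len(titles_by_tag[tag])`, and the index also answers "which articles carry this tag?" directly.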

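9. Summary

Topic	Key points
Set properties	unordered, duplicates removed automatically, elements must be hashable
Creation	`{1, 2}` (non-empty), `set()` (empty), `set(iterable)`
Add / remove	`add()` / `update()` / `remove()` / `discard()` / `pop()`
Set operations	`&` intersection, `|` union, `-` difference, `^` symmetric difference
Tests	`issubset()` / `issuperset()` / `isdisjoint()`
Comprehension	`{x for x in ...}`
frozenset	immutable set; can be used as a dict key

🧠 Mnemonic: intersection, union, difference, symmetric difference: four operations cover it. Dedup with set; immutable means frozenset. Curly braces create a set, but an empty one needs set(). discard, unlike remove, never raises.

🔮 Coming up: Day 12 is a review day! We will use the big four (lists, tuples, dicts, and sets) to build a number-guessing game and a simple calculator. After learning this much, it is time to put it to work!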
