
Random Forest in Python, Hands-On: Build a Prediction Model in 5 Lines of Code

2026-06-29 13:16:41

You don't need to understand the math. Type the code out once and you'll be able to make predictions with a random forest.

Environment Setup

# Install the required libraries
pip install scikit-learn pandas numpy

Full Code: Iris Classification

# Step 1: import libraries
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd

# Step 2: load the data
iris = load_iris()
X, y = iris.data, iris.target

# Split into training and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 3: create the model (this one call is the core)
model = RandomForestClassifier(
    n_estimators=100,  # 100 trees
    random_state=42    # makes the result reproducible
)

# Step 4: train the model
model.fit(X_train, y_train)

# Step 5: predict
predictions = model.predict(X_test)

# Check the accuracy on the test set
accuracy = model.score(X_test, y_test)
print(f"Accuracy: {accuracy:.2%}")

Output:

Accuracy: 100.00%
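The 100% figure comes from a single split with only 30 test flowers, so it may be optimistic. As a rough sketch that is not part of the original walkthrough, 5-fold cross-validation with scikit-learn's `cross_val_score` gives a steadier estimate. The choice of `cv=5` is an assumption made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

iris = load_iris()
X, y = iris.data, iris.target

model = RandomForestClassifier(n_estimators=100, random_state=42)

# 5-fold cross-validation: each fold is held out once as the test set
# (cv=5 is an illustrative choice, not taken from the article)
scores = cross_val_score(model, X, y, cv=5)

print(f"Fold accuracies: {scores.round(3)}")
print(f"Mean accuracy: {scores.mean():.2%} (std {scores.std():.2%})")
```

If the mean is noticeably below the single-split number, the single split was probably a lucky one.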

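Code Walkthrough

Key Parameters

RandomForestClassifier(
    n_estimators=100,      # number of trees; more is more stable but slower
    max_depth=None,        # maximum tree depth; None means no limit
    min_samples_split=2,   # minimum number of samples needed to split a node
    random_state=42        # random seed, makes the result reproducible
)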

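Parameter	Recommended	Notes
`n_estimators`	100-500	More trees give more stable results, but training is slower
`max_depth`	None (default)	Set it lower (e.g. 10) when the data is noisy
`random_state`	Any integer	Fixes the seed so results are reproducible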
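To see what "more trees, more stable" means in practice, the sketch below trains forests with different seeds and compares how much test accuracy moves around. The tree counts (5 and 100) and the ten seeds are illustrative assumptions; on a dataset as easy as iris the difference may be small.

```python
from statistics import pstdev

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

spread = {}
for n_trees in (5, 100):          # illustrative tree counts
    accuracies = []
    for seed in range(10):        # same data, different random seeds
        model = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        model.fit(X_train, y_train)
        accuracies.append(model.score(X_test, y_test))
    spread[n_trees] = accuracies
    print(
        f"{n_trees:>3} trees: min={min(accuracies):.3f} "
        f"max={max(accuracies):.3f} std={pstdev(accuracies):.4f}"
    )
```

A smaller standard deviation across seeds indicates a forest whose result depends less on the random draw.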

Inspecting Feature Importance

# Which features matter most for the prediction?
importance = pd.DataFrame({
    'feature': iris.feature_names,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)
print(importance)

Output:

             feature  importance
2  petal length (cm)        0.45
3   petal width (cm)        0.42
0  sepal length (cm)        0.08
1   sepal width (cm)        0.05

Petal length and petal width matter most; the sepal measurements contribute comparatively little.
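`feature_importances_` is computed from the training data (impurity reduction inside the trees). As a cross-check, and as an addition that the original article does not cover, scikit-learn's `permutation_importance` measures how much test accuracy drops when one feature is shuffled. The sketch below assumes `n_repeats=10`, which is an arbitrary choice.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Shuffle each feature on the held-out set and record the accuracy drop
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=42
)

perm = pd.DataFrame({
    "feature": iris.feature_names,
    "mean_accuracy_drop": result.importances_mean,
}).sort_values("mean_accuracy_drop", ascending=False)
print(perm)
```

If both rankings agree, that is some reassurance the ordering is not an artifact of how tree-based importance is calculated.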

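Predicting New Data

# A new flower arrives with these measurements:
new_flower = [[5.1, 3.5, 1.4, 0.2]]  # sepal length, sepal width, petal length, petal width

# Predict the class
prediction = model.predict(new_flower)
print(f"Prediction: {iris.target_names[prediction][0]}")

# Look at the predicted probabilities
# (cast to plain str/float so the dict prints the same across NumPy versions)
proba = model.predict_proba(new_flower)
class_proba = {str(name): float(p) for name, p in zip(iris.target_names, proba[0])}
print(f"Class probabilities: {class_proba}")

Output:

Prediction: setosa
Class probabilities: {'setosa': 1.0, 'versicolor': 0.0, 'virginica': 0.0}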

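Code for Regression Problems

To predict a continuous value (such as a house price), switch to this model:

from sklearn.ensemble import RandomForestRegressor

# The rest of the workflow is the same; only the model changes.
# Note: y_train must now hold continuous target values, not class labels.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)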
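The iris labels are classes, so the regressor needs a dataset with a continuous target. The sketch below swaps in scikit-learn's bundled diabetes dataset purely as a stand-in (the article does not name a regression dataset). Note that `score` on a regressor returns R², not accuracy.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Stand-in dataset with a continuous target (an assumption for illustration)
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

reg = RandomForestRegressor(n_estimators=100, random_state=42)
reg.fit(X_train, y_train)
predictions = reg.predict(X_test)

r2 = reg.score(X_test, y_test)                    # R^2 for regressors
mae = mean_absolute_error(y_test, predictions)    # average absolute error
print(f"R^2: {r2:.3f}")
print(f"Mean absolute error: {mae:.1f}")
```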

Full Workflow Diagram

Prepare data → Split train/test sets → Create model → Train → Predict → Evaluate
   ↑___________________________________________________________________________↓
                               (tune parameters)
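One way to automate the "tune parameters" loop in the diagram is scikit-learn's `GridSearchCV`. The article does not prescribe a tuning method, so treat this as one possible sketch; the grid values below are assumptions chosen to keep the run short.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate settings to try (illustrative values only)
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 3, 5],
}

# Each combination is scored with 5-fold cross-validation on the training set
search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, cv=5
)
search.fit(X_train, y_train)

print(f"Best parameters: {search.best_params_}")
print(f"Best cross-validated accuracy: {search.best_score_:.2%}")

# The test set is touched only once, after tuning is finished
test_accuracy = search.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2%}")
```

Keeping the test set out of the tuning loop avoids picking parameters that merely fit those 30 flowers.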

FAQ

Q: Do I need to standardize the data?
A: No. Random forests are insensitive to feature scaling.
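A quick way to check the scaling claim yourself is to train the same forest on raw and on standardized features and compare test accuracy. This is a hedged sketch using `StandardScaler`; tiny differences are possible because of floating-point rounding at split thresholds, so it compares accuracy rather than expecting bit-identical models.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler().fit(X_train)

raw_model = RandomForestClassifier(n_estimators=100, random_state=42)
raw_model.fit(X_train, y_train)
raw_acc = raw_model.score(X_test, y_test)

scaled_model = RandomForestClassifier(n_estimators=100, random_state=42)
scaled_model.fit(scaler.transform(X_train), y_train)
scaled_acc = scaled_model.score(scaler.transform(X_test), y_test)

print(f"Raw features:    {raw_acc:.2%}")
print(f"Scaled features: {scaled_acc:.2%}")
```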

Q: What if training is too slow?
A: Reduce n_estimators, or set n_jobs=-1 to use all CPU cores.
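The sketch below times training with one core versus `n_jobs=-1`. The synthetic dataset from `make_classification` and its size are assumptions made so the difference has a chance to show; on small data such as iris, the overhead of parallelism can outweigh the gain.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, large enough for timing to be meaningful (illustrative size)
X, y = make_classification(n_samples=3000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

timings = {}
accuracies = {}
for n_jobs in (1, -1):            # 1 = single core, -1 = all available cores
    model = RandomForestClassifier(
        n_estimators=100, random_state=42, n_jobs=n_jobs
    )
    start = time.perf_counter()
    model.fit(X_train, y_train)
    timings[n_jobs] = time.perf_counter() - start
    accuracies[n_jobs] = model.score(X_test, y_test)
    print(f"n_jobs={n_jobs:>2}: {timings[n_jobs]:.2f}s, "
          f"accuracy {accuracies[n_jobs]:.2%}")
```

The speedup depends on the machine's core count; the accuracy should be essentially unchanged because `n_jobs` only affects how the work is scheduled.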

Q: How do I save the model?
A: Use joblib.dump(model, 'model.pkl'), and load it back later with joblib.load('model.pkl').
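A minimal save-and-reload round trip might look like the following. Writing to a temporary directory is a choice made here to avoid leaving files behind; in real use you would pick your own path.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"

    joblib.dump(model, model_path)        # save the fitted model to disk
    restored = joblib.load(model_path)    # load it back

# The restored model should give the same predictions as the original
same = bool((restored.predict(X) == model.predict(X)).all())
print(f"Predictions match after reload: {same}")
```

Only load pickle files from sources you trust, and reload them with the same scikit-learn version that saved them where possible.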


Coming next: "Case Study: Predicting Customer Churn with Random Forest"
