
[Time Series Machine Learning] Python 00: Generating Hourly and Daily Weather and Disease Data, with Initial Visualization

2026-01-29 18:12:54

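Time Series Machine Learning

00 Generating Hourly and Daily Weather and Disease Data, with Initial Visualization

Python (standardized code)

Concept, Principle, Idea, Application

Concept: collect or generate hourly and daily weather data (temperature, humidity, air pressure, and so on) and disease data (case counts, death counts, and so on).

Principle: time series data come either from observation or from simulation. The script in this article simulates them, combining sinusoidal seasonal terms with random noise.

Idea: data are the foundation of any analysis, and visualization gives a first look at their distribution, trend, seasonality, and outliers.

Application: exploratory data analysis (EDA), in preparation for later modeling.

Visualization: line charts, bar charts, heat maps, scatter plots, and similar charts that show the trend, periodicity, and correlation of the series.

Public health significance: helps public health officials understand how disease relates to weather conditions, and provides a basis for early warning and intervention.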
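As a minimal sketch of the simulation route, the snippet below builds an hourly temperature-like series from an annual sine term, a daily sine term, and Gaussian noise, then aggregates it to daily means. The function name `simulate_hourly_temperature`, the seed, and the coefficients are illustrative assumptions rather than prescribed values.

```python
import numpy as np
import pandas as pd


def simulate_hourly_temperature(start, end, seed=0):
    """Return an hourly synthetic temperature series (illustrative values only)."""
    rng = np.random.default_rng(seed)  # seeded so the run is repeatable
    index = pd.date_range(start=start, end=end, freq="h")
    day_of_year = index.dayofyear.to_numpy()
    hour = index.hour.to_numpy()
    annual = 15 * np.sin(2 * np.pi * (day_of_year - 80) / 365)  # seasonal cycle
    diurnal = 8 * np.sin(2 * np.pi * (hour - 6) / 24)           # day/night cycle
    noise = rng.normal(0, 3, len(index))                        # random variation
    return pd.Series(10 + annual + diurnal + noise, index=index, name="temperature")


temps = simulate_hourly_temperature("2024-01-01 00:00", "2024-12-31 23:00")
daily_mean = temps.resample("D").mean()  # hourly values collapsed to one per day
print(len(temps), len(daily_mean))
```

Resampling to daily means averages out the diurnal term, which is why daily and hourly versions of the same variable are kept as separate tables.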

Workflow

- Data preprocessing:

- Model building:

- Training:

- Evaluation:

- Visualization:

- Saving results:

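Code, Walkthrough, and Function Notes

Time series machine learning models fall roughly into three groups: classical statistical models, traditional machine learning models, and deep learning models.

1. Classical statistical models

These models are built on the statistical properties of the series itself, such as autocorrelation, trend, and seasonality.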
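The autocorrelation these models rely on can be computed directly. The sketch below assumes a made-up 12-month sine series, and the helper name `autocorrelation` is hypothetical; in practice a library routine would normally be used instead.

```python
import numpy as np


def autocorrelation(x, lag):
    """Sample autocorrelation of x at a non-negative integer lag."""
    x = np.asarray(x, dtype=float)
    if lag == 0:
        return 1.0
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    # Products of each value with the value `lag` steps earlier
    return float(np.sum(centered[lag:] * centered[:-lag]) / denom)


# Hypothetical monthly series: 10 years of a pure 12-month cycle
t = np.arange(120)
series = np.sin(2 * np.pi * t / 12)

acf_6 = autocorrelation(series, 6)    # half a cycle apart
acf_12 = autocorrelation(series, 12)  # one full cycle apart
print(round(acf_6, 2), round(acf_12, 2))
```

A strongly positive value at lag 12 and a strongly negative value at lag 6 is the signature of a 12-step seasonal cycle.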

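2. Traditional machine learning models

These models recast the time series problem as a supervised learning problem and rely on feature engineering to capture temporal patterns.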
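One common way to do this recasting is with lag features. The sketch below is a minimal illustration under assumed inputs: the helper `make_lag_features`, the lag choices, and the case counts are all hypothetical.

```python
import pandas as pd


def make_lag_features(series, lags):
    """Build a supervised table: lagged values as features, current value as target."""
    frame = pd.DataFrame({"y": series})
    for lag in lags:
        frame[f"lag_{lag}"] = series.shift(lag)  # value observed `lag` steps earlier
    return frame.dropna()  # early rows lack a complete history


# Hypothetical daily case counts
cases = pd.Series(
    [12, 15, 14, 18, 21, 19, 25, 24, 28, 30],
    index=pd.date_range("2025-01-01", periods=10, freq="D"),
    name="cases",
)
table = make_lag_features(cases, lags=[1, 2, 7])
X, y = table.drop(columns="y"), table["y"]
print(table)
```

Each remaining row pairs a target with values from 1, 2, and 7 days earlier, so any tabular regressor can be trained on `X` and `y`.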

3. Deep learning models

These models learn complex temporal dependencies and nonlinear patterns directly from the raw series.
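Sequence models are usually fed fixed-length windows of the raw series. The sketch below shows only that windowing step, not a network; the helper `make_windows` and the toy series are assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(values, window, horizon=1):
    """Split a 1-D series into (input window, future target) pairs."""
    values = np.asarray(values, dtype=float)
    windows = sliding_window_view(values, window + horizon)
    inputs = windows[:, :window]   # the past `window` observations
    targets = windows[:, window:]  # the next `horizon` observations
    return inputs, targets


series = np.arange(10.0)  # hypothetical series 0, 1, ..., 9
X, y = make_windows(series, window=4, horizon=1)
print(X.shape, y.shape)
```

The resulting arrays can be reshaped to whatever layout a given framework expects, for example (samples, time steps, features).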

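Visualization Methods for Time Series Data

1. Line plot: the most basic and most important chart. Time is on the horizontal axis and the observed value on the vertical axis. It shows trend, seasonality, and outliers at a glance.

2. Autocorrelation and partial autocorrelation plots:

ACF: shows the correlation between the series and its own lags. It is used to identify the order `q` of an MA model and the periodicity of the series.

PACF: shows the correlation between the series and a given lag after controlling for the intermediate lags. It is used to identify the order `p` of an AR model.

3. Seasonal plot: overlays several years of data on one chart by seasonal period (month, week, and so on), making the seasonal pattern, and any change in it over time, easy to see.

4. Subseries plot: splits the series into subseries (for example, one per year) and draws them on the same chart so that patterns across periods can be compared.

5. Box plot: groups the data by period (month, day of week, and so on) and draws one box per group, showing the distribution within each period (median, quartiles, outliers).

6. Heat map: commonly used to show patterns across the hours of a day or the days of a week (for example, website traffic or electricity load).

7. Decomposition plot: splits the series into trend, seasonal, and residual components and plots each one, which helps explain what the data are made of.

8. Forecast comparison plot: draws the historical data, the true values, and the model's predictions on one chart. It is the most direct way to judge model performance.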
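A forecast comparison plot is usually read alongside error metrics. The sketch below computes them for a seasonal-naive baseline, a swapped-in technique chosen for illustration rather than a model from this article. The helper names, the toy series, and the train/test split are assumptions.

```python
import numpy as np


def seasonal_naive_forecast(history, horizon, period):
    """Forecast by repeating the last full seasonal cycle of `history`."""
    last_cycle = np.asarray(history, dtype=float)[-period:]
    repeats = int(np.ceil(horizon / period))
    return np.tile(last_cycle, repeats)[:horizon]


def mae(actual, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))


def rmse(actual, predicted):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2)))


# Hypothetical monthly series: a 12-month cycle plus a drift of 0.5 per month
t = np.arange(48)
series = 50 + 10 * np.sin(2 * np.pi * t / 12) + 0.5 * t
train, test = series[:36], series[36:]

pred = seasonal_naive_forecast(train, horizon=len(test), period=12)
print(f"MAE={mae(test, pred):.2f}  RMSE={rmse(test, pred):.2f}")
```

Because the drift adds 0.5 per month and the forecast reuses values from 12 months earlier, every prediction is low by 6, so both metrics come out at 6. On a comparison plot this appears as a forecast curve with the right shape sitting a constant distance below the truth.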

# pip install pandas numpy matplotlib scipy statsmodels
import os
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

warnings.filterwarnings('ignore')

# Font setup, needed only when labels contain Chinese characters
plt.rcParams['font.sans-serif'] = ['SimHei']  # render CJK labels correctly
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly

# Locate the desktop and create the results folder
desktop_path = os.path.join(os.path.expanduser("~"), "Desktop")
results_path = os.path.join(desktop_path, "Results时间")
os.makedirs(results_path, exist_ok=True)


def generate_weather_disease_data():
    """Generate synthetic weather and disease data."""
    print("Generating data...")

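    # 1. Hourly weather data
    # NOTE: the source omits the lines that create the hourly frame. The five
    # assignments below are an assumed reconstruction, based on the 1981-2025
    # range and on the names used later in this function.
    timestamps = pd.date_range(start='1981-01-01 00:00', end='2025-12-31 23:00', freq='h')
    hourly_weather = pd.DataFrame({'timestamp': timestamps})
    n = len(hourly_weather)
    day_of_year = timestamps.dayofyear.to_numpy()
    hour = timestamps.hour.to_numpy()

    # Temperature (°C): annual cycle + daily cycle + noise
    temperature = (10 + 15 * np.sin(2 * np.pi * (day_of_year - 80) / 365) +
                   8 * np.sin(2 * np.pi * (hour - 6) / 24) +
                   np.random.normal(0, 3, n))

    # Relative humidity (%), clipped to a plausible range
    humidity = (60 + 20 * np.sin(2 * np.pi * (day_of_year - 100) / 365) -
                0.5 * (temperature - 15) +
                np.random.normal(0, 10, n))
    humidity = np.clip(humidity, 20, 100)

    # Wind speed (m/s)
    wind_speed = (3 + 2 * np.sin(2 * np.pi * day_of_year / 365) +
                  np.abs(np.random.normal(0, 1.5, n)))

    # Air pressure (hPa)
    pressure = (1013 + 10 * np.sin(2 * np.pi * (day_of_year - 180) / 365) +
                np.random.normal(0, 2, n))

    # Precipitation (mm/h): rain in about 10% of hours
    precipitation = np.where(np.random.uniform(0, 1, n) < 0.1,
                             np.random.exponential(2, n), 0)

    # Sunshine: 1 for a dry daytime hour that passes a 70% random draw, else 0
    sunshine = np.where((hour >= 6) & (hour <= 18) &
                        (precipitation == 0) &
                        (np.random.uniform(0, 1, n) > 0.3), 1, 0)

    # Add the weather variables to the DataFrame
    hourly_weather['temperature'] = temperature
    hourly_weather['humidity'] = humidity
    hourly_weather['wind_speed'] = wind_speed
    hourly_weather['pressure'] = pressure
    hourly_weather['precipitation'] = precipitation
    hourly_weather['sunshine'] = sunshine

    # Save the hourly weather data
    hourly_weather.to_csv(os.path.join(results_path, 'hourly_weather_data.csv'), index=False)
    print(f"Hourly weather data saved to: {os.path.join(results_path, 'hourly_weather_data.csv')}")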
    # 2. Aggregate the hourly weather data by day
    print("Generating daily weather summary...")
    hourly_weather['date'] = hourly_weather['timestamp'].dt.normalize()
    daily_weather = hourly_weather.groupby('date').agg({
        'temperature': ['max', 'min', 'mean', 'median'],
        'humidity': ['max', 'min', 'mean', 'median'],
        'wind_speed': ['max', 'mean'],
        'pressure': ['max', 'min', 'mean'],
        'precipitation': 'sum',
        'sunshine': 'sum'
    }).reset_index()

    # Flatten the two-level column index into descriptive names
    daily_weather.columns = [
        'date', 'temp_max', 'temp_min', 'temp_mean', 'temp_median',
        'humidity_max', 'humidity_min', 'humidity_mean', 'humidity_median',
        'wind_max', 'wind_mean', 'pressure_max', 'pressure_min', 'pressure_mean',
        'precipitation_total', 'sunshine_hours'
    ]

    # Add calendar features
    daily_weather['year'] = daily_weather['date'].dt.year
    daily_weather['month'] = daily_weather['date'].dt.month
    daily_weather['day'] = daily_weather['date'].dt.day
    daily_weather['day_of_year'] = daily_weather['date'].dt.dayofyear
    daily_weather['week'] = daily_weather['date'].dt.isocalendar().week

    # Save the daily weather summary
    daily_weather.to_csv(os.path.join(results_path, 'daily_weather_summary.csv'), index=False)
    print(f"Daily weather summary saved to: {os.path.join(results_path, 'daily_weather_summary.csv')}")
    # 3. Generate daily disease data
    print("Generating daily disease data...")

    # Daily date sequence
    date_seq = pd.date_range(start='1981-01-01', end='2025-12-31', freq='D')
    n_days = len(date_seq)

    # Base frame for the disease data. The ISO week is converted to a plain
    # array so the frame keeps a default integer index.
    daily_diseases = pd.DataFrame({
        'date': date_seq,
        'year': date_seq.year,
        'month': date_seq.month,
        'day': date_seq.day,
        'day_of_year': date_seq.dayofyear,
        'week': date_seq.isocalendar().week.astype('int64').to_numpy()
    })

    # Respiratory diseases
    year_norm = daily_diseases['year'].values - 2000
    day_of_year = daily_diseases['day_of_year'].values

    # Influenza
    influenza = np.exp(3 +
                       1.5 * np.sin(2 * np.pi * (day_of_year - 10) / 365) +
                       0.3 * np.sin(2 * np.pi * (day_of_year - 200) / 365) +
                       0.1 * year_norm +
                       np.random.normal(0, 0.3, n_days)) * \
                (1 + 0.2 * np.sin(2 * np.pi * daily_diseases['year'].values / 3))

    # Common cold
    common_cold = np.exp(2.5 +
                         1.2 * np.sin(2 * np.pi * (day_of_year - 90) / 365) +
                         1.0 * np.sin(2 * np.pi * (day_of_year - 270) / 365) +
                         np.random.normal(0, 0.4, n_days))

    # Pneumonia
    pneumonia = np.exp(2 +
                       1.8 * np.sin(2 * np.pi * (day_of_year - 15) / 365) +
                       0.05 * (daily_diseases['year'].values - 1981) +
                       np.random.normal(0, 0.35, n_days))

    daily_diseases['influenza'] = np.round(influenza)
    daily_diseases['common_cold'] = np.round(common_cold)
    daily_diseases['pneumonia'] = np.round(pneumonia)

    # Intestinal infectious diseases
    year_from_1981 = daily_diseases['year'].values - 1981

    # Bacillary dysentery
    bacillary_dysentery = np.exp(2 +
                                 2.0 * np.sin(2 * np.pi * (day_of_year - 200) / 365) +
                                 0.02 * year_from_1981 +
                                 np.random.normal(0, 0.5, n_days)) * \
                          (1 - 0.005 * year_from_1981)

    # Hand, foot and mouth disease
    hand_foot_mouth = np.exp(2.2 +
                             1.5 * np.sin(2 * np.pi * (day_of_year - 150) / 365) +
                             0.8 * np.sin(2 * np.pi * (day_of_year - 330) / 365) +
                             np.random.normal(0, 0.4, n_days))

    # Infectious diarrhea
    infectious_diarrhea = np.exp(2.8 +
                                 1.0 * np.sin(2 * np.pi * (day_of_year - 180) / 365) +
                                 np.random.normal(0, 0.3, n_days))

    daily_diseases['bacillary_dysentery'] = np.round(bacillary_dysentery)
    daily_diseases['hand_foot_mouth'] = np.round(hand_foot_mouth)
    daily_diseases['infectious_diarrhea'] = np.round(infectious_diarrhea)

    # Natural-focus (zoonotic) diseases
    # Epidemic hemorrhagic fever
    hemorrhagic_fever = np.exp(1.5 +
                               1.2 * np.sin(2 * np.pi * (day_of_year - 80) / 365) +
                               1.0 * np.sin(2 * np.pi * (day_of_year - 280) / 365) +
                               0.01 * year_from_1981 +
                               np.random.normal(0, 0.6, n_days))

    # Lyme disease
    lyme_disease = np.exp(1.2 +
                          1.5 * np.sin(2 * np.pi * (day_of_year - 180) / 365) +
                          np.random.normal(0, 0.5, n_days))

    # Brucellosis
    brucellosis = np.exp(1.8 +
                         1.0 * np.sin(2 * np.pi * (day_of_year - 100) / 365) +
                         0.8 * np.sin(2 * np.pi * (day_of_year - 300) / 365) +
                         np.random.normal(0, 0.4, n_days))

    daily_diseases['hemorrhagic_fever'] = np.round(hemorrhagic_fever)
    daily_diseases['lyme_disease'] = np.round(lyme_disease)
    daily_diseases['brucellosis'] = np.round(brucellosis)

    # Make sure every disease count is non-negative
    disease_cols = ['influenza', 'common_cold', 'pneumonia',
                    'bacillary_dysentery', 'hand_foot_mouth', 'infectious_diarrhea',
                    'hemorrhagic_fever', 'lyme_disease', 'brucellosis']
    for col in disease_cols:
        daily_diseases[col] = np.maximum(daily_diseases[col], 0)

    # Save the daily disease data
    daily_diseases.to_csv(os.path.join(results_path, 'daily_disease_data.csv'), index=False)
    print(f"Daily disease data saved to: {os.path.join(results_path, 'daily_disease_data.csv')}")
    # 4. Merge weather and disease data
    print("Merging weather and disease data...")
    combined_data = daily_weather.merge(daily_diseases, on='date', suffixes=('', '_y'))

    # Keep the needed columns and rename the date column
    combined_data = combined_data[[
        'date', 'year', 'month', 'day', 'day_of_year', 'week',
        'temp_max', 'temp_min', 'temp_mean', 'temp_median',
        'humidity_max', 'humidity_min', 'humidity_mean', 'humidity_median',
        'wind_max', 'wind_mean', 'pressure_max', 'pressure_min', 'pressure_mean',
        'precipitation_total', 'sunshine_hours',
        'influenza', 'common_cold', 'pneumonia',
        'bacillary_dysentery', 'hand_foot_mouth', 'infectious_diarrhea',
        'hemorrhagic_fever', 'lyme_disease', 'brucellosis'
    ]].rename(columns={'date': 'timestamp'})

    # Save the merged data
    combined_data.to_csv(os.path.join(results_path, 'combined_weather_disease_data.csv'), index=False)
    print(f"Merged data saved to: {os.path.join(results_path, 'combined_weather_disease_data.csv')}")

    # 5. Print a data summary report
    print("\n=== Data generation complete ===")
    print(f"Date range: {combined_data['timestamp'].min()} to {combined_data['timestamp'].max()}")
    print(f"Total records: {len(combined_data)} days")
    print("Weather variables: 15")
    print("Disease variables: 9")
    print(f"Total columns: {len(combined_data.columns)}\n")

    print("First 6 rows of the merged data:")
    print(combined_data.head(6))

    print("\nAnnual mean reported cases by disease:")
    annual_summary = combined_data.groupby('year').agg({
        'influenza': 'mean',
        'common_cold': 'mean',
        'pneumonia': 'mean',
        'bacillary_dysentery': 'mean',
        'hand_foot_mouth': 'mean',
        'hemorrhagic_fever': 'mean'
    }).round(2)
    print(annual_summary.tail(10))

    print("\n=== Data quality check ===")
    print("Missing values:")
    print(combined_data.isnull().sum())

    print("\nWeather variable ranges:")
    temp_range = f"{combined_data['temp_min'].min():.1f} to {combined_data['temp_max'].max():.1f}°C"
    humidity_range = f"{combined_data['humidity_min'].min():.1f} to {combined_data['humidity_max'].max():.1f}%"
    max_precipitation = f"{combined_data['precipitation_total'].max():.1f}"
    print(f"Temperature range: {temp_range}")
    print(f"Humidity range: {humidity_range}")
    print(f"Maximum daily precipitation: {max_precipitation}")

    print("\nDisease report ranges:")
    influenza_range = f"{combined_data['influenza'].min()} to {combined_data['influenza'].max()}"
    dysentery_range = f"{combined_data['bacillary_dysentery'].min()} to {combined_data['bacillary_dysentery'].max()}"
    hemorrhagic_range = f"{combined_data['hemorrhagic_fever'].min()} to {combined_data['hemorrhagic_fever'].max()}"
    print(f"Influenza range: {influenza_range}")
    print(f"Dysentery range: {dysentery_range}")
    print(f"Hemorrhagic fever range: {hemorrhagic_range}")

    print(f"\nAll data files were saved to {results_path}.")
    print("Files:")
    print("1. hourly_weather_data.csv - hourly weather data")
    print("2. daily_weather_summary.csv - daily weather summary")
    print("3. daily_disease_data.csv - daily disease data")
    print("4. combined_weather_disease_data.csv - merged final dataset")

    return combined_data


def create_time_series_plots(combined_data):
    """Create the time series plots."""
    # Create the output directories
    plot_dirs = ['Daily_Original', 'Weekly_Aggregated', 'Monthly_Aggregated',
                 'Quarterly_Aggregated', 'Yearly_Individual', 'Facet_Matrix',
                 'Overlay_Plots', 'Seasonal_Decomposition']
    for dir_name in plot_dirs:
        os.makedirs(os.path.join(results_path, 'TimeSeries_Plots', dir_name), exist_ok=True)

    # Variable groups
    meteorological_vars = [
        'temp_max', 'temp_min', 'temp_mean', 'temp_median',
        'humidity_max', 'humidity_min', 'humidity_mean', 'humidity_median',
        'wind_max', 'wind_mean', 'pressure_max', 'pressure_min', 'pressure_mean',
        'precipitation_total', 'sunshine_hours'
    ]
    disease_vars = [
        'influenza', 'common_cold', 'pneumonia',
        'bacillary_dysentery', 'hand_foot_mouth', 'infectious_diarrhea',
        'hemorrhagic_fever', 'lyme_disease', 'brucellosis'
    ]
    all_vars = meteorological_vars + disease_vars

    # Make sure timestamp is a datetime column
    combined_data['timestamp'] = pd.to_datetime(combined_data['timestamp'])

    # Add the keys used for monthly and quarterly aggregation
    combined_data['year_month'] = combined_data['timestamp'].dt.to_period('M').astype(str)
    combined_data['quarter'] = combined_data['timestamp'].dt.quarter
    combined_data['year_quarter'] = combined_data['year'].astype(str) + '-Q' + combined_data['quarter'].astype(str)
    # 1. Daily raw series plots
    print("Generating daily raw series plots...")
    for var in all_vars:
        plt.figure(figsize=(12, 6))
        plt.plot(combined_data['timestamp'], combined_data[var],
                 linewidth=0.3, alpha=0.7, color='steelblue')
        plt.title(f'Daily {var} time series\n1981-2025 raw daily data')
        plt.xlabel('Date')
        plt.ylabel(var)
        plt.xticks(rotation=45)
        plt.tight_layout()
        plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Daily_Original', f'Daily_{var}.png'),
                    dpi=300, bbox_inches='tight')
        plt.close()

    # 2. Weekly plots
    print("Generating weekly plots...")
    # Imported once here rather than inside the loop; the monthly plots reuse it
    from scipy.ndimage import gaussian_filter1d

    # Aggregate by week. Each row is keyed by the Monday that starts its week,
    # which avoids parsing ISO week labels with the non-ISO %W directive.
    weekly_data = combined_data.copy()
    weekly_data['date'] = weekly_data['timestamp'].dt.to_period('W').dt.start_time
    weekly_agg = weekly_data.groupby('date')[all_vars].mean().reset_index()

    for var in all_vars:
        plt.figure(figsize=(12, 6))
        plt.plot(weekly_agg['date'], weekly_agg[var],
                 linewidth=0.5, color='darkorange', label='Weekly mean')
        # Add a smoothed curve
        smoothed = gaussian_filter1d(weekly_agg[var].values, sigma=2)
        plt.plot(weekly_agg['date'], smoothed,
                 linewidth=1, color='red', label='Smoothed')
        plt.title(f'Weekly mean {var} time series\nAggregated by week')
        plt.xlabel('Date')
        plt.ylabel(f'Mean {var}')
        plt.legend()
        plt.xticks(rotation=45)
        plt.tight_layout()
        plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Weekly_Aggregated', f'Weekly_Mean_{var}.png'),
                    dpi=300, bbox_inches='tight')
        plt.close()
    # 3. Monthly plots
    print("Generating monthly plots...")
    monthly_agg = combined_data.groupby('year_month').agg({var: 'mean' for var in all_vars}).reset_index()
    monthly_agg['date'] = pd.to_datetime(monthly_agg['year_month'] + '-01')

    for var in all_vars:
        plt.figure(figsize=(12, 6))
        plt.plot(monthly_agg['date'], monthly_agg[var],
                 linewidth=0.6, color='darkgreen', label='Monthly mean')
        # Add a smoothed curve
        smoothed = gaussian_filter1d(monthly_agg[var].values, sigma=1)
        plt.plot(monthly_agg['date'], smoothed,
                 linewidth=1, color='blue', label='Smoothed')
        plt.title(f'Monthly mean {var} time series\nAggregated by month')
        plt.xlabel('Date')
        plt.ylabel(f'Mean {var}')
        plt.legend()
        plt.xticks(rotation=45)
        plt.tight_layout()
        plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Monthly_Aggregated', f'Monthly_Mean_{var}.png'),
                    dpi=300, bbox_inches='tight')
        plt.close()

    # 4. Quarterly plots
    print("Generating quarterly plots...")
    quarterly_agg = combined_data.groupby('year_quarter').agg({var: 'mean' for var in all_vars}).reset_index()
    # Turn labels such as '1981-Q2' into the first day of that quarter.
    # (Parsing the quarter number as a month would place Q2 in February.)
    quarterly_agg['date'] = pd.PeriodIndex(
        quarterly_agg['year_quarter'].str.replace('-', '', regex=False), freq='Q'
    ).start_time

    for var in all_vars:
        plt.figure(figsize=(12, 6))
        plt.plot(quarterly_agg['date'], quarterly_agg[var],
                 linewidth=0.8, color='purple', marker='o', markersize=3, label='Quarterly mean')
        # Add a linear trend line
        x_numeric = np.arange(len(quarterly_agg))
        z = np.polyfit(x_numeric, quarterly_agg[var], 1)
        p = np.poly1d(z)
        plt.plot(quarterly_agg['date'], p(x_numeric),
                 linewidth=1, color='red', linestyle='--', alpha=0.7, label='Linear trend')
        plt.title(f'Quarterly mean {var} time series\nAggregated by quarter, with linear trend')
        plt.xlabel('Date')
        plt.ylabel(f'Mean {var}')
        plt.legend()
        plt.xticks(rotation=45)
        plt.tight_layout()
        plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Quarterly_Aggregated',
                                 f'Quarterly_Mean_{var}.png'),
                    dpi=300, bbox_inches='tight')
        plt.close()

    # 5. One line per year
    print("Generating per-year plots...")
    recent_years = range(2016, 2026)
    for var in all_vars:
        yearly_data = combined_data[combined_data['year'].isin(recent_years)].copy()
        yearly_agg = yearly_data.groupby(['year', 'day_of_year'])[var].mean().reset_index()
        plt.figure(figsize=(12, 8))
        for year in recent_years:
            year_data = yearly_agg[yearly_agg['year'] == year]
            plt.plot(year_data['day_of_year'], year_data[var],
                     linewidth=0.6, label=str(year))
        plt.title(f'{var} time series by year\nYears: 2016-2025')
        plt.xlabel('Day of year')
        plt.ylabel(var)
        plt.legend(bbox_to_anchor=(0.5, -0.2), loc='upper center', ncol=5)
        plt.tight_layout()
        plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Yearly_Individual', f'Yearly_Individual_{var}.png'),
                    dpi=300, bbox_inches='tight')
        plt.close()
    # 6. Facet plots (one panel per variable, monthly aggregates)
    print("Generating facet plots...")
    facet_specs = [
        # (variables, line color, y-label or None to use the variable name, title, file name)
        (meteorological_vars[:8], 'steelblue', None,
         'Weather variables, faceted time series\nAggregated by month',
         'Facet_Meteorological_Vars.png'),
        (disease_vars, 'firebrick', 'Cases',
         'Disease variables, faceted time series\nAggregated by month',
         'Facet_Disease_Vars.png'),
    ]
    for facet_vars, color, ylabel, title, filename in facet_specs:
        fig, axes = plt.subplots(len(facet_vars), 1, figsize=(14, 16))
        for i, var in enumerate(facet_vars):
            axes[i].plot(monthly_agg['date'], monthly_agg[var],
                         linewidth=0.3, color=color)
            axes[i].set_title(var, fontweight='bold')
            axes[i].set_ylabel(ylabel or var)
            if i == len(facet_vars) - 1:
                axes[i].set_xlabel('Date')
            axes[i].tick_params(axis='x', rotation=45)
        plt.suptitle(title, fontsize=16)
        plt.tight_layout()
        plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Facet_Matrix', filename),
                    dpi=300, bbox_inches='tight')
        plt.close()
    # 7. Overlay plots (related variables on one chart, monthly aggregates)
    print("Generating overlay plots...")
    overlay_specs = [
        # (variables, title, y-label, file name)
        (['temp_max', 'temp_min', 'temp_mean'],
         'Temperature variables overlaid\nMaximum, minimum, and mean temperature',
         'Temperature (°C)', 'Overlay_Temperature.png'),
        (['humidity_max', 'humidity_min', 'humidity_mean'],
         'Humidity variables overlaid\nMaximum, minimum, and mean humidity',
         'Relative humidity (%)', 'Overlay_Humidity.png'),
        (['influenza', 'common_cold', 'pneumonia'],
         'Respiratory diseases overlaid\nInfluenza, common cold, and pneumonia',
         'Cases', 'Overlay_Respiratory_Diseases.png'),
        (['bacillary_dysentery', 'hand_foot_mouth', 'infectious_diarrhea'],
         'Intestinal infectious diseases overlaid\n'
         'Bacillary dysentery, hand, foot and mouth disease, and infectious diarrhea',
         'Cases', 'Overlay_Intestinal_Diseases.png'),
    ]
    for overlay_vars, title, ylabel, filename in overlay_specs:
        plt.figure(figsize=(12, 6))
        for var in overlay_vars:
            plt.plot(monthly_agg['date'], monthly_agg[var], label=var, linewidth=0.6)
        plt.title(title)
        plt.xlabel('Date')
        plt.ylabel(ylabel)
        plt.legend(bbox_to_anchor=(0.5, -0.15), loc='upper center', ncol=3)
        plt.tight_layout()
        plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Overlay_Plots', filename),
                    dpi=300, bbox_inches='tight')
        plt.close()

    # 8. Seasonal decomposition plots
    print("Generating seasonal decomposition plots...")
    # Defined before the try block so later code can use it even if the import fails
    key_vars = ['temp_mean', 'influenza', 'bacillary_dysentery', 'hemorrhagic_fever']
    try:
        from statsmodels.tsa.seasonal import seasonal_decompose

        for var in key_vars:
            # Monthly series indexed by date
            ts_data = monthly_agg.set_index('date')[var]
            # Additive decomposition with a 12-month period
            decomposition = seasonal_decompose(ts_data, model='additive', period=12)
            # Plot the four components
            fig, axes = plt.subplots(4, 1, figsize=(12, 8))
            decomposition.observed.plot(ax=axes[0], title='Observed', color='blue')
            decomposition.trend.plot(ax=axes[1], title='Trend', color='red')
            decomposition.seasonal.plot(ax=axes[2], title='Seasonal', color='green')
            decomposition.resid.plot(ax=axes[3], title='Residual', color='black')
            plt.suptitle(f'{var} seasonal decomposition', fontsize=16)
            plt.tight_layout()
            plt.savefig(os.path.join(results_path, 'TimeSeries_Plots', 'Seasonal_Decomposition',
                                     f'Seasonal_Decomposition_{var}.png'),
                        dpi=300, bbox_inches='tight')
            plt.close()
    except ImportError:
        print("Warning: statsmodels is not installed; skipping seasonal decomposition plots")
    # 9. Write the summary reports
    print("Generating plot summary report...")
    # The facet and overlay sections write 2 and 4 files respectively
    plot_summary = pd.DataFrame({
        'chart_type': ['Daily raw series', 'Weekly', 'Monthly', 'Quarterly',
                       'Per year', 'Facet', 'Overlay', 'Seasonal decomposition'],
        'count': [len(all_vars), len(all_vars), len(all_vars), len(all_vars),
                  len(all_vars), 2, 4, len(key_vars)],
        'folder': ['Daily_Original', 'Weekly_Aggregated', 'Monthly_Aggregated', 'Quarterly_Aggregated',
                   'Yearly_Individual', 'Facet_Matrix', 'Overlay_Plots', 'Seasonal_Decomposition'],
        'description': ['Raw daily series', 'Weekly means', 'Monthly means', 'Quarterly means',
                        'One line per year', 'One panel per variable', 'Related variables overlaid',
                        'Seasonal pattern decomposition']
    })
    plot_summary.to_csv(os.path.join(results_path, 'TimeSeries_Plots', 'plot_summary.csv'), index=False)

    # Per-variable summary statistics
    variable_summary = pd.DataFrame({
        'variable': all_vars,
        'type': ['weather' if var in meteorological_vars else 'disease' for var in all_vars],
        'range': [f"{combined_data[var].min():.2f} - {combined_data[var].max():.2f}" for var in all_vars],
        'mean': [f"{combined_data[var].mean():.2f}" for var in all_vars]
    })
    variable_summary.to_csv(os.path.join(results_path, 'TimeSeries_Plots', 'variable_summary.csv'), index=False)

    print("\n=== Time series plots complete ===")
    print("Charts generated:")
    print(f"- Daily raw series plots: {len(all_vars)}")
    print(f"- Weekly plots: {len(all_vars)}")
    print(f"- Monthly plots: {len(all_vars)}")
    print(f"- Quarterly plots: {len(all_vars)}")
    print(f"- Per-year plots: {len(all_vars)}")
    print("- Facet plots: 2")
    print("- Overlay plots: 4")
    print(f"- Seasonal decomposition plots: {len(key_vars)}")
    print(f"Total: {len(all_vars) * 5 + 2 + 4 + len(key_vars)} chart files\n")

    print(f"Output location: {os.path.join(results_path, 'TimeSeries_Plots')}")
    print("Subfolders:")
    print("1. Daily_Original - daily raw series")
    print("2. Weekly_Aggregated - weekly aggregates")
    print("3. Monthly_Aggregated - monthly aggregates")
    print("4. Quarterly_Aggregated - quarterly aggregates")
    print("5. Yearly_Individual - one line per year")
    print("6. Facet_Matrix - facet plots")
    print("7. Overlay_Plots - overlay plots")
    print("8. Seasonal_Decomposition - seasonal decomposition\n")

    print("Summary files:")
    print("- plot_summary.csv - plot summary")
    print("- variable_summary.csv - variable statistics")

    print("\nSample charts:")
    print(f"Weather facet plot: "
          f"{os.path.join(results_path, 'TimeSeries_Plots', 'Facet_Matrix', 'Facet_Meteorological_Vars.png')}")
    print(f"Disease facet plot: "
          f"{os.path.join(results_path, 'TimeSeries_Plots', 'Facet_Matrix', 'Facet_Disease_Vars.png')}")
    print(f"Respiratory disease overlay: "
          f"{os.path.join(results_path, 'TimeSeries_Plots', 'Overlay_Plots', 'Overlay_Respiratory_Diseases.png')}")


def main():
    """Entry point."""
    print(f"Results will be saved to: {results_path}")
    # Generate the data
    combined_data = generate_weather_disease_data()
    # Create the time series plots
    create_time_series_plots(combined_data)


if __name__ == "__main__":
    main()
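
The script draws from NumPy's global random state without setting a seed, so each run writes different CSV files. If repeatable data are wanted, one option is a seeded generator. The sketch below is not part of the original script: the helper `simulate_counts`, its coefficients, and the seed value 42 are assumptions, chosen to mirror the log-scale seasonal form of the disease series.

```python
import numpy as np


def simulate_counts(day_of_year, seed=42):
    """Seasonal counts on a log scale with random noise, repeatable for a given seed."""
    rng = np.random.default_rng(seed)  # local generator, independent of global state
    log_mean = 2.0 + 1.5 * np.sin(2 * np.pi * (day_of_year - 10) / 365)
    counts = np.exp(log_mean + rng.normal(0, 0.3, len(day_of_year)))
    return np.round(counts).astype(int)


days = np.arange(1, 366)
run_a = simulate_counts(days)
run_b = simulate_counts(days)
print(np.array_equal(run_a, run_b))
```

Passing one generator through the whole data-generation function, instead of calling `np.random.*` directly, would make every output file reproducible from a single seed.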