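Learning Data Science: Cleaning and Analyzing Central Weather Administration Earthquake Data

Processing Central Weather Administration Earthquake Data

This tutorial takes an earthquake-data JSON file obtained from the Central Weather Administration (CWA) Open Data Platform, cleans it, and computes summary statistics.

API reference:

CWA Open Data Platform - Earthquake and Tsunami API

Example curl command (Pingtung County):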

curl -X 'GET' \
  'https://opendata.cwa.gov.tw/api/v1/rest/datastore/E-A0015-001?Authorization=YOUR_API_KEY&format=JSON&AreaName=%E5%B1%8F%E6%9D%B1%E7%B8%A3' \
  -H 'accept: application/json'
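
The same request URL can be assembled in Python, which avoids percent-encoding the county name by hand. This is a minimal sketch: the endpoint and query parameters come from the curl example, while the names build_request_url and download_to_file are hypothetical helpers and YOUR_API_KEY is a placeholder. The download helper is defined but not called, since it needs network access and a valid key.

```python
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl example; the helper names are illustrative.
BASE_URL = "https://opendata.cwa.gov.tw/api/v1/rest/datastore/E-A0015-001"


def build_request_url(api_key, area_name):
    """Return the request URL with percent-encoded query parameters."""
    query = urlencode({
        "Authorization": api_key,
        "format": "JSON",
        "AreaName": area_name,
    })
    return f"{BASE_URL}?{query}"


def download_to_file(url, filepath):
    """Fetch the URL and save the response body (needs network access)."""
    request = urllib.request.Request(url, headers={"accept": "application/json"})
    with urllib.request.urlopen(request) as response, open(filepath, "wb") as f:
        f.write(response.read())


# urlencode converts the UTF-8 county name into %E5%B1%8F... automatically.
url = build_request_url("YOUR_API_KEY", "屏東縣")
print(url)
```

Calling download_to_file(url, 'api_earthquake.json') with a real key would produce the kind of file the code in this tutorial reads.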

Python code:

import json
import pandas as pd

def process_earthquake_data(filepath):
    """
    Process an earthquake-data JSON file: clean the data and compute statistics.

    Args:
        filepath: Path to the JSON file.

    Returns:
        A dict containing the cleaned DataFrame and several summary statistics,
        or None if the file cannot be read or parsed.
    """
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            data = json.load(f)
    except FileNotFoundError:
        print(f"Error: file not found: {filepath}")
        return None
    except json.JSONDecodeError:
        print(f"Error: {filepath} is not valid JSON")
        return None

    # Extract the earthquake records
    earthquake_list = []
    for earthquake in data['records']['Earthquake']:
        earthquake_data = {
            'EarthquakeNo': earthquake['EarthquakeNo'],
            'ReportType': earthquake['ReportType'],
            'ReportColor': earthquake['ReportColor'],
            'ReportContent': earthquake['ReportContent'],
            'ReportImageURI': earthquake['ReportImageURI'],
            'ReportRemark': earthquake['ReportRemark'],
            'Web': earthquake['Web'],
            'ShakemapImageURI': earthquake['ShakemapImageURI'],
            'OriginTime': earthquake['EarthquakeInfo']['OriginTime'],
            'Source': earthquake['EarthquakeInfo']['Source'],
            'FocalDepth': earthquake['EarthquakeInfo']['FocalDepth'],
            'Location': earthquake['EarthquakeInfo']['Epicenter']['Location'],
            'EpicenterLatitude': earthquake['EarthquakeInfo']['Epicenter']['EpicenterLatitude'],
            'EpicenterLongitude': earthquake['EarthquakeInfo']['Epicenter']['EpicenterLongitude'],
            'MagnitudeType': earthquake['EarthquakeInfo']['EarthquakeMagnitude']['MagnitudeType'],
            'MagnitudeValue': earthquake['EarthquakeInfo']['EarthquakeMagnitude']['MagnitudeValue'],
        }

        # Flatten the Intensity data: one row per station
        for area in earthquake['Intensity']['ShakingArea']:
            for station in area.get("EqStation", []):
                earthquake_data_copy = earthquake_data.copy()
                earthquake_data_copy['AreaDesc'] = area['AreaDesc']
                earthquake_data_copy['CountyName'] = area['CountyName']
                earthquake_data_copy['AreaIntensity'] = area['AreaIntensity']
                earthquake_data_copy['StationName'] = station.get('StationName')
                earthquake_data_copy['StationID'] = station.get('StationID')
                earthquake_data_copy['SeismicIntensity'] = station.get('SeismicIntensity')
                earthquake_data_copy['WaveImageURI'] = station.get('WaveImageURI')
                earthquake_data_copy['BackAzimuth'] = station.get('BackAzimuth')
                earthquake_data_copy['EpicenterDistance'] = station.get('EpicenterDistance')
                earthquake_data_copy['StationLatitude'] = station.get('StationLatitude')
                earthquake_data_copy['StationLongitude'] = station.get('StationLongitude')

                if 'pga' in station:
                    earthquake_data_copy['pga_EWComponent'] = station['pga'].get('EWComponent')
                    earthquake_data_copy['pga_NSComponent'] = station['pga'].get('NSComponent')
                    earthquake_data_copy['pga_VComponent'] = station['pga'].get('VComponent')
                    earthquake_data_copy['pga_IntScaleValue'] = station['pga'].get('IntScaleValue')
                    earthquake_data_copy['pga_unit'] = station['pga'].get('unit')
                if 'pgv' in station:
                    earthquake_data_copy['pgv_EWComponent'] = station['pgv'].get('EWComponent')
                    earthquake_data_copy['pgv_NSComponent'] = station['pgv'].get('NSComponent')
                    earthquake_data_copy['pgv_VComponent'] = station['pgv'].get('VComponent')
                    earthquake_data_copy['pgv_IntScaleValue'] = station['pgv'].get('IntScaleValue')
                    earthquake_data_copy['pgv_unit'] = station['pgv'].get('unit')
                earthquake_list.append(earthquake_data_copy)

            if len(area.get("EqStation", [])) == 0:
                earthquake_data_copy = earthquake_data.copy()
                earthquake_data_copy['AreaDesc'] = area['AreaDesc']
                earthquake_data_copy['CountyName'] = area['CountyName']
                earthquake_data_copy['AreaIntensity'] = area['AreaIntensity']
                earthquake_list.append(earthquake_data_copy)

    # Build the DataFrame
    df = pd.DataFrame(earthquake_list)

    # Data cleaning
    # 1. Convert OriginTime to datetime
    df['OriginTime'] = pd.to_datetime(df['OriginTime'])

    # 2. Convert numeric columns to numeric dtypes
    numeric_cols = ['FocalDepth', 'EpicenterLatitude', 'EpicenterLongitude', 'MagnitudeValue',
                    'pga_EWComponent', 'pga_NSComponent', 'pga_VComponent', 'pga_IntScaleValue',
                    'pgv_EWComponent', 'pgv_NSComponent', 'pgv_VComponent', 'pgv_IntScaleValue',
                    'BackAzimuth','EpicenterDistance','StationLatitude','StationLongitude'
                   ]

    for col in numeric_cols:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors='coerce')

    # 3. Handle missing values
    # Assign the result back instead of calling fillna(inplace=True) on a
    # column selection, which may not update df in recent pandas versions.
    for col in numeric_cols:
        if col in df.columns:
            df[col] = df[col].fillna(df[col].mean())
    for col in df.select_dtypes(include=['object']).columns:
        modes = df[col].mode()
        if not modes.empty:  # mode() is empty when the whole column is missing
            df[col] = df[col].fillna(modes.iloc[0])

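    # Statistics
    # Each earthquake appears once per station row, so deduplicate by
    # EarthquakeNo before computing per-earthquake statistics.
    events = df.drop_duplicates(subset="EarthquakeNo")

    # 1. Average earthquake magnitude
    avg_magnitude = events['MagnitudeValue'].mean()

    # 2. Most frequent area intensity level
    #    (the mode of AreaIntensity, not the area with the highest intensity)
    max_intensity_area = df.groupby('AreaIntensity')['EarthquakeNo'].count().idxmax()

    # 3. Number of earthquakes per county (each earthquake counted once per county)
    earthquake_count_per_county = (
        df.drop_duplicates(subset=["EarthquakeNo", "CountyName"])['CountyName']
        .value_counts()
    )

    # 4. Total number of earthquakes
    total_earthquake_num = len(events)

    # 5. Station record with the largest PGA intensity-scale value
    if 'pga_IntScaleValue' in df.columns and df['pga_IntScaleValue'].notna().any():
        max_pga_station = df.loc[df['pga_IntScaleValue'].idxmax()]
    else:
        max_pga_station = None

    # 6. Station record with the largest PGV intensity-scale value
    if 'pgv_IntScaleValue' in df.columns and df['pgv_IntScaleValue'].notna().any():
        max_pgv_station = df.loc[df['pgv_IntScaleValue'].idxmax()]
    else:
        max_pgv_station = None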

    return {
        'cleaned_df': df,
        'average_magnitude': avg_magnitude,
        'max_intensity_area': max_intensity_area,
        'earthquake_count_per_county': earthquake_count_per_county,
        'total_earthquake_num':total_earthquake_num,
        'max_pga_station':max_pga_station,
        'max_pgv_station':max_pgv_station,
    }

# Example usage
filepath = 'api_earthquake.json'
results = process_earthquake_data(filepath)

if results:
    print("Cleaned DataFrame:")
    print(results['cleaned_df'])
    print("\nStatistics:")
    print(f"Average magnitude: {results['average_magnitude']:.2f}")
    print(f"Most frequent area intensity: {results['max_intensity_area']}")
    print(f"Earthquakes per county:\n{results['earthquake_count_per_county']}")
    print(f"Total earthquakes (unique EarthquakeNo): {results['total_earthquake_num']}")
    if results['max_pga_station'] is not None:
        print("\nStation with the largest PGA:")
        print(results['max_pga_station'])
    if results['max_pgv_station'] is not None:
        print("\nStation with the largest PGV:")
        print(results['max_pgv_station'])

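1. Import the libraries

import json
import pandas as pd

import json: imports Python's json module for working with JSON (JavaScript Object Notation) data. JSON is a lightweight data-interchange format commonly used to transfer data through web APIs.
import pandas as pd: imports the pandas library under the alias pd. pandas is a data-analysis library whose DataFrame structure makes it convenient to clean, transform, and analyze data.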

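2. Define the `process_earthquake_data` function

def process_earthquake_data(filepath):
    """
    Process an earthquake-data JSON file: clean the data and compute statistics.

    Args:
        filepath: Path to the JSON file.

    Returns:
        A dict containing the cleaned DataFrame and several summary statistics,
        or None if the file cannot be read or parsed.
    """
    # ... code ...

def process_earthquake_data(filepath):: defines a function named process_earthquake_data that takes a filepath parameter, the path to the earthquake-data JSON file.
"""...""": the function's docstring, which documents its purpose, parameters, and return value and makes the code easier to read and maintain.
- Args:: describes the parameters; filepath is the path to the JSON file.
- Returns:: describes the return value, a dict containing the cleaned DataFrame and the statistics (or None when the file cannot be read or parsed).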

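3. File reading and error handling

    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            data = json.load(f)
    except FileNotFoundError:
        print(f"Error: file not found: {filepath}")
        return None
    except json.JSONDecodeError:
        print(f"Error: {filepath} is not valid JSON")
        return None

try...except: Python's error-handling mechanism. The code in the try block runs first; if it raises an exception, execution jumps to the matching except block.
with open(filepath, 'r', encoding='utf-8') as f:: opens the file safely with the with open() syntax.
- filepath: the file path.
- 'r': opens the file in read mode.
- encoding='utf-8': reads the file as UTF-8 so that JSON files containing Chinese text are decoded correctly.
- as f: binds the file object to the variable f. The with statement closes the file automatically when the block ends, even if an error occurs.
data = json.load(f): parses the JSON content of file f with json.load(f) and stores the result in data.
except FileNotFoundError:: catches FileNotFoundError, raised when the file does not exist; prints an error message and returns None.
except json.JSONDecodeError:: catches json.JSONDecodeError, raised when the JSON is malformed; prints an error message and returns None.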
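
To see both except branches fire, the loading step can be tried in isolation against throwaway files. The sketch below is illustrative only: load_json is a hypothetical helper that mirrors the try/except pattern above, and the file names and contents are made up.

```python
import json
import os
import tempfile


def load_json(filepath):
    """Return the parsed JSON content, or None if the file is missing or malformed."""
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"Error: file not found: {filepath}")
        return None
    except json.JSONDecodeError as exc:
        print(f"Error: {filepath} is not valid JSON ({exc})")
        return None


with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.json")
    bad_path = os.path.join(tmp, "bad.json")
    with open(good_path, "w", encoding="utf-8") as f:
        f.write('{"records": {"Earthquake": []}}')
    with open(bad_path, "w", encoding="utf-8") as f:
        f.write('{"records": ')  # truncated on purpose

    good = load_json(good_path)                             # parsed dict
    bad = load_json(bad_path)                               # JSONDecodeError branch
    missing = load_json(os.path.join(tmp, "missing.json"))  # FileNotFoundError branch

print(good, bad, missing)
```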

4. Extract and transform the data

    # Extract the earthquake records
    earthquake_list = []
    for earthquake in data['records']['Earthquake']:
        earthquake_data = {
            'EarthquakeNo': earthquake['EarthquakeNo'],
            'ReportType': earthquake['ReportType'],
            'ReportColor': earthquake['ReportColor'],
            'ReportContent': earthquake['ReportContent'],
            'ReportImageURI': earthquake['ReportImageURI'],
            'ReportRemark': earthquake['ReportRemark'],
            'Web': earthquake['Web'],
            'ShakemapImageURI': earthquake['ShakemapImageURI'],
            'OriginTime': earthquake['EarthquakeInfo']['OriginTime'],
            'Source': earthquake['EarthquakeInfo']['Source'],
            'FocalDepth': earthquake['EarthquakeInfo']['FocalDepth'],
            'Location': earthquake['EarthquakeInfo']['Epicenter']['Location'],
            'EpicenterLatitude': earthquake['EarthquakeInfo']['Epicenter']['EpicenterLatitude'],
            'EpicenterLongitude': earthquake['EarthquakeInfo']['Epicenter']['EpicenterLongitude'],
            'MagnitudeType': earthquake['EarthquakeInfo']['EarthquakeMagnitude']['MagnitudeType'],
            'MagnitudeValue': earthquake['EarthquakeInfo']['EarthquakeMagnitude']['MagnitudeValue'],
        }

        # Flatten the Intensity data: one row per station
        for area in earthquake['Intensity']['ShakingArea']:
            for station in area.get("EqStation", []):
                earthquake_data_copy = earthquake_data.copy()
                earthquake_data_copy['AreaDesc'] = area['AreaDesc']
                earthquake_data_copy['CountyName'] = area['CountyName']
                earthquake_data_copy['AreaIntensity'] = area['AreaIntensity']
                earthquake_data_copy['StationName'] = station.get('StationName')
                earthquake_data_copy['StationID'] = station.get('StationID')
                earthquake_data_copy['SeismicIntensity'] = station.get('SeismicIntensity')
                earthquake_data_copy['WaveImageURI'] = station.get('WaveImageURI')
                earthquake_data_copy['BackAzimuth'] = station.get('BackAzimuth')
                earthquake_data_copy['EpicenterDistance'] = station.get('EpicenterDistance')
                earthquake_data_copy['StationLatitude'] = station.get('StationLatitude')
                earthquake_data_copy['StationLongitude'] = station.get('StationLongitude')

                if 'pga' in station:
                    earthquake_data_copy['pga_EWComponent'] = station['pga'].get('EWComponent')
                    earthquake_data_copy['pga_NSComponent'] = station['pga'].get('NSComponent')
                    earthquake_data_copy['pga_VComponent'] = station['pga'].get('VComponent')
                    earthquake_data_copy['pga_IntScaleValue'] = station['pga'].get('IntScaleValue')
                    earthquake_data_copy['pga_unit'] = station['pga'].get('unit')
                if 'pgv' in station:
                    earthquake_data_copy['pgv_EWComponent'] = station['pgv'].get('EWComponent')
                    earthquake_data_copy['pgv_NSComponent'] = station['pgv'].get('NSComponent')
                    earthquake_data_copy['pgv_VComponent'] = station['pgv'].get('VComponent')
                    earthquake_data_copy['pgv_IntScaleValue'] = station['pgv'].get('IntScaleValue')
                    earthquake_data_copy['pgv_unit'] = station['pgv'].get('unit')
                earthquake_list.append(earthquake_data_copy)

            if len(area.get("EqStation", [])) == 0:
                earthquake_data_copy = earthquake_data.copy()
                earthquake_data_copy['AreaDesc'] = area['AreaDesc']
                earthquake_data_copy['CountyName'] = area['CountyName']
                earthquake_data_copy['AreaIntensity'] = area['AreaIntensity']
                earthquake_list.append(earthquake_data_copy)

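earthquake_list = []: creates an empty list, earthquake_list, to hold the extracted rows.
for earthquake in data['records']['Earthquake']:: loops over every earthquake record in data['records']['Earthquake'].
earthquake_data = {...}: builds a dict, earthquake_data, holding the basic information for one earthquake.
- The dict keys are column names and the values are read from the JSON data. For example, 'EarthquakeNo': earthquake['EarthquakeNo'] stores the value of earthquake['EarthquakeNo'] under the key 'EarthquakeNo'.
for area in earthquake['Intensity']['ShakingArea']:: loops over each ShakingArea in the earthquake's Intensity; a ShakingArea holds the intensity information for one area.
for station in area.get("EqStation", []):: loops over each EqStation (seismic station) in the ShakingArea. .get("EqStation", []) returns an empty list when EqStation is absent, which avoids a KeyError.
earthquake_data_copy = earthquake_data.copy(): copies the earthquake-level dict into earthquake_data_copy so that station-specific fields can be added without modifying the original.
earthquake_data_copy['AreaDesc'] = area['AreaDesc'] … earthquake_data_copy['AreaIntensity'] = area['AreaIntensity']: adds the area description, county name, and area intensity to earthquake_data_copy.
earthquake_data_copy['StationName'] = station.get('StationName') … earthquake_data_copy['SeismicIntensity'] = station.get('SeismicIntensity'): reads the station name, ID, seismic intensity, and the other station fields from the station dict and adds them to earthquake_data_copy. .get() returns None for a missing key instead of raising an error.
if 'pga' in station: and if 'pgv' in station:: check whether the station dict contains 'pga' (peak ground acceleration) or 'pgv' (peak ground velocity) data.
earthquake_data_copy['pga_EWComponent'] = station['pga'].get('EWComponent') … earthquake_data_copy['pga_unit'] = station['pga'].get('unit'): if 'pga' is present, extracts the east-west, north-south, and vertical components, the intensity-scale value, and the unit, and adds them to earthquake_data_copy. .get() handles keys that may be missing.
earthquake_data_copy['pgv_EWComponent'] = station['pgv'].get('EWComponent') … earthquake_data_copy['pgv_unit'] = station['pgv'].get('unit'): if 'pgv' is present, extracts the same five fields for peak ground velocity and adds them to earthquake_data_copy.
earthquake_list.append(earthquake_data_copy): appends the earthquake_data_copy dict, now holding both earthquake-level and station-level fields, to earthquake_list.
if len(area.get("EqStation", [])) == 0:: checks whether the area has any seismic stations (EqStation). If it has none (length 0), the inner loop produces no rows for that area, so this branch adds a single row with the area description, county name, and area intensity and no station-level fields.
earthquake_list.append(earthquake_data_copy): appends the earthquake_data_copy dict, holding the earthquake-level and area-level fields, to earthquake_list.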
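
The flattening logic is easier to follow on a tiny input. The sketch below applies the same idea (one row per station, plus one row for an area without stations) to a single made-up event. flatten_event and all sample values are hypothetical, and only a few of the fields are carried over to keep it short.

```python
def flatten_event(event):
    """Return one row per station; an area without stations still yields one row."""
    base = {"EarthquakeNo": event["EarthquakeNo"]}
    rows = []
    for area in event["Intensity"]["ShakingArea"]:
        area_fields = {
            "CountyName": area["CountyName"],
            "AreaIntensity": area["AreaIntensity"],
        }
        stations = area.get("EqStation", [])
        if not stations:
            # Keep the area even though it has no station-level data.
            rows.append({**base, **area_fields})
            continue
        for station in stations:
            row = {**base, **area_fields, "StationName": station.get("StationName")}
            # Prefix nested pga values so they become flat column names.
            for key, value in station.get("pga", {}).items():
                row[f"pga_{key}"] = value
            rows.append(row)
    return rows


# Made-up sample: one area with two stations, one area with none.
sample_event = {
    "EarthquakeNo": 1,
    "Intensity": {"ShakingArea": [
        {"CountyName": "County A", "AreaIntensity": "4", "EqStation": [
            {"StationName": "S1", "pga": {"EWComponent": 1.2, "unit": "gal"}},
            {"StationName": "S2"},
        ]},
        {"CountyName": "County B", "AreaIntensity": "2"},
    ]},
}

rows = flatten_event(sample_event)
for row in rows:
    print(row)
```

The three resulting dicts have different key sets, which is why the later steps check whether a column exists before using it.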

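5. Build the DataFrame

    # Build the DataFrame
    df = pd.DataFrame(earthquake_list)

df = pd.DataFrame(earthquake_list): converts earthquake_list into a DataFrame with pandas.DataFrame(). A DataFrame is a tabular data structure: each dict becomes a row, each key becomes a column, and keys missing from a given dict become NaN in that row.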
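
A short illustration of how pandas treats dicts with different keys, using made-up rows shaped like the station and area-only rows built earlier:

```python
import pandas as pd

# Made-up rows: the second one has no station data, as for an area without EqStation.
rows = [
    {"EarthquakeNo": 1, "CountyName": "County A", "StationName": "S1"},
    {"EarthquakeNo": 1, "CountyName": "County B"},
]

df = pd.DataFrame(rows)
print(df)
print(df.isna().sum())  # number of missing values per column
```

The missing StationName shows up as NaN, which is what the cleaning step has to deal with.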

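6. Data cleaning

    # Data cleaning
    # 1. Convert OriginTime to datetime
    df['OriginTime'] = pd.to_datetime(df['OriginTime'])

    # 2. Convert numeric columns to numeric dtypes
    numeric_cols = ['FocalDepth', 'EpicenterLatitude', 'EpicenterLongitude', 'MagnitudeValue',
                    'pga_EWComponent', 'pga_NSComponent', 'pga_VComponent', 'pga_IntScaleValue',
                    'pgv_EWComponent', 'pgv_NSComponent', 'pgv_VComponent', 'pgv_IntScaleValue',
                    'BackAzimuth','EpicenterDistance','StationLatitude','StationLongitude'
                   ]

    for col in numeric_cols:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors='coerce')

    # 3. Handle missing values
    # Assign the result back instead of calling fillna(inplace=True) on a
    # column selection, which may not update df in recent pandas versions.
    for col in numeric_cols:
        if col in df.columns:
            df[col] = df[col].fillna(df[col].mean())
    for col in df.select_dtypes(include=['object']).columns:
        modes = df[col].mode()
        if not modes.empty:  # mode() is empty when the whole column is missing
            df[col] = df[col].fillna(modes.iloc[0])

df['OriginTime'] = pd.to_datetime(df['OriginTime']): converts the OriginTime column to datetime. pd.to_datetime() parses strings or numbers into datetime values, which makes time-based analysis easier.
numeric_cols = [...]: defines the list numeric_cols with the names of the columns to convert to numeric types.
for col in numeric_cols:: loops over each column name in numeric_cols.
- if col in df.columns:: checks that the column exists in the DataFrame.
- df[col] = pd.to_numeric(df[col], errors='coerce'): converts the column with pd.to_numeric(). errors='coerce' turns values that cannot be converted into NaN (missing).
for col in numeric_cols:: loops over the column names in numeric_cols a second time.
- if col in df.columns:: checks that the column exists in the DataFrame.
- df[col] = df[col].fillna(df[col].mean()): fills missing values (NaN) in the numeric column with the column mean. fillna() returns a new Series, which is assigned back to the column; calling fillna(..., inplace=True) on a column selection may leave the DataFrame unchanged in recent pandas versions.
for col in df.select_dtypes(include=['object']).columns:: loops over the object-dtype columns, which here hold strings.
- modes = df[col].mode(): computes the most frequent value or values of the column; the result is empty when every value in the column is missing.
- if not modes.empty: df[col] = df[col].fillna(modes.iloc[0]): fills missing values in the string column with the first mode, and skips columns that have no mode.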
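
The effect of the three cleaning steps can be checked on a small made-up table. The column names match the ones used above; the values, including the unparseable "n/a", are invented for illustration.

```python
import pandas as pd

# Made-up raw values: everything arrives as strings, one depth is unparseable.
df = pd.DataFrame({
    "OriginTime": ["2024-01-01 07:00:00", "2024-01-01 08:00:00", "2024-01-01 09:00:00"],
    "FocalDepth": ["10.0", "n/a", "20.0"],
})

# 1. Strings to datetime values.
df["OriginTime"] = pd.to_datetime(df["OriginTime"])

# 2. Strings to numbers; "n/a" cannot be parsed and becomes NaN.
df["FocalDepth"] = pd.to_numeric(df["FocalDepth"], errors="coerce")
missing_before = int(df["FocalDepth"].isna().sum())

# 3. Fill the gap with the mean of the remaining values (10.0 and 20.0).
df["FocalDepth"] = df["FocalDepth"].fillna(df["FocalDepth"].mean())

print(df)
print(df.dtypes)
```

Mean imputation keeps every row usable but invents a depth for the second record, so the filled value should not be read as a measurement.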

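7. Statistics

    # Statistics
    # Each earthquake appears once per station row, so deduplicate by
    # EarthquakeNo before computing per-earthquake statistics.
    events = df.drop_duplicates(subset="EarthquakeNo")

    # 1. Average earthquake magnitude
    avg_magnitude = events['MagnitudeValue'].mean()

    # 2. Most frequent area intensity level
    #    (the mode of AreaIntensity, not the area with the highest intensity)
    max_intensity_area = df.groupby('AreaIntensity')['EarthquakeNo'].count().idxmax()

    # 3. Number of earthquakes per county (each earthquake counted once per county)
    earthquake_count_per_county = (
        df.drop_duplicates(subset=["EarthquakeNo", "CountyName"])['CountyName']
        .value_counts()
    )

    # 4. Total number of earthquakes
    total_earthquake_num = len(events)

    # 5. Station record with the largest PGA intensity-scale value
    if 'pga_IntScaleValue' in df.columns and df['pga_IntScaleValue'].notna().any():
        max_pga_station = df.loc[df['pga_IntScaleValue'].idxmax()]
    else:
        max_pga_station = None

    # 6. Station record with the largest PGV intensity-scale value
    if 'pgv_IntScaleValue' in df.columns and df['pgv_IntScaleValue'].notna().any():
        max_pgv_station = df.loc[df['pgv_IntScaleValue'].idxmax()]
    else:
        max_pgv_station = None

events = df.drop_duplicates(subset="EarthquakeNo"): keeps the first row for each EarthquakeNo. The flattened DataFrame repeats every earthquake once per station, so per-earthquake statistics are computed on this deduplicated view.
avg_magnitude = events['MagnitudeValue'].mean(): computes the average magnitude. events['MagnitudeValue'] selects the MagnitudeValue column and .mean() averages it, with each earthquake counted once.
max_intensity_area = df.groupby('AreaIntensity')['EarthquakeNo'].count().idxmax(): finds the area intensity level that occurs most often. Despite the variable name, this is the most frequent AreaIntensity value across rows, not the area with the highest intensity.
- df.groupby('AreaIntensity'): groups the rows by the AreaIntensity column.
- ['EarthquakeNo'].count(): counts the EarthquakeNo entries in each group.
- .idxmax(): returns the label of the group with the largest count.
earthquake_count_per_county = (...): counts the earthquakes recorded in each county. drop_duplicates(subset=["EarthquakeNo", "CountyName"]) keeps one row per earthquake and county, and .value_counts() then counts how often each county appears.
total_earthquake_num = len(events): counts the earthquakes by unique EarthquakeNo.
- len(...): returns the number of rows in events, which is the number of distinct earthquakes.
if 'pga_IntScaleValue' in df.columns and df['pga_IntScaleValue'].notna().any():: checks that the pga_IntScaleValue column exists and holds at least one non-missing value.
- max_pga_station = df.loc[df['pga_IntScaleValue'].idxmax()]: selects the station record with the largest PGA value.
  - df['pga_IntScaleValue'].idxmax(): returns the index label of the largest pga_IntScaleValue.
  - df.loc[...]: selects the row with that index label, that is, the station record with the largest PGA.
- else: max_pga_station = None: if the column is absent or entirely missing, max_pga_station is set to None.
if 'pgv_IntScaleValue' in df.columns and df['pgv_IntScaleValue'].notna().any():: checks that the pgv_IntScaleValue column exists and holds at least one non-missing value.
- max_pgv_station = df.loc[df['pgv_IntScaleValue'].idxmax()]: selects the station record with the largest PGV value.
  - df['pgv_IntScaleValue'].idxmax(): returns the index label of the largest pgv_IntScaleValue.
  - df.loc[...]: selects the row with that index label, that is, the station record with the largest PGV.
- else: max_pgv_station = None: if the column is absent or entirely missing, max_pgv_station is set to None.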
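
Because the flattened table repeats each earthquake once per station row, row-level and earthquake-level statistics can differ. The sketch below uses a made-up four-row table to compare the two and to show what idxmax() with df.loc returns; all values and the variable names row_mean, event_mean, per_county, and strongest are illustrative.

```python
import pandas as pd

# Made-up data: earthquake 1 was recorded at three stations, earthquake 2 at one.
df = pd.DataFrame({
    "EarthquakeNo": [1, 1, 1, 2],
    "MagnitudeValue": [6.0, 6.0, 6.0, 4.0],
    "CountyName": ["A", "A", "B", "A"],
    "StationName": ["S1", "S2", "S3", "S1"],
    "pga_IntScaleValue": [3.0, 5.0, 2.0, 1.0],
})

# Averaging over rows weights earthquake 1 three times.
row_mean = df["MagnitudeValue"].mean()

# Averaging over distinct earthquakes counts each event once.
events = df.drop_duplicates(subset="EarthquakeNo")
event_mean = events["MagnitudeValue"].mean()

# Count each earthquake at most once per county.
per_county = (
    df.drop_duplicates(subset=["EarthquakeNo", "CountyName"])["CountyName"]
    .value_counts()
)

# idxmax() gives the index label of the largest value; loc fetches that row.
strongest = df.loc[df["pga_IntScaleValue"].idxmax()]

print(row_mean, event_mean)
print(per_county)
print(strongest)
```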

8. Return the results

    return {
        'cleaned_df': df,
        'average_magnitude': avg_magnitude,
        'max_intensity_area': max_intensity_area,
        'earthquake_count_per_county': earthquake_count_per_county,
        'total_earthquake_num':total_earthquake_num,
        'max_pga_station':max_pga_station,
        'max_pgv_station':max_pgv_station,
    }

return {...}: returns the cleaned DataFrame and the statistics as a dict. Each key names a result and each value holds the corresponding object.

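9. Example usage

# Example usage
filepath = 'api_earthquake.json'
results = process_earthquake_data(filepath)

if results:
    print("Cleaned DataFrame:")
    print(results['cleaned_df'])
    print("\nStatistics:")
    print(f"Average magnitude: {results['average_magnitude']:.2f}")
    print(f"Most frequent area intensity: {results['max_intensity_area']}")
    print(f"Earthquakes per county:\n{results['earthquake_count_per_county']}")
    print(f"Total earthquakes (unique EarthquakeNo): {results['total_earthquake_num']}")
    if results['max_pga_station'] is not None:
        print("\nStation with the largest PGA:")
        print(results['max_pga_station'])
    if results['max_pgv_station'] is not None:
        print("\nStation with the largest PGV:")
        print(results['max_pgv_station'])

filepath = 'api_earthquake.json': sets the path to the JSON file. Make sure it points to your own earthquake-data JSON file.
results = process_earthquake_data(filepath): calls process_earthquake_data() to process the data and stores the result in results.
if results:: checks that results is not None. The function returns None on failure and a non-empty dict on success, so a truthy value means processing succeeded and the results can be printed.
print("Cleaned DataFrame:") print(results['cleaned_df']): prints the cleaned DataFrame.
print("\nStatistics:"): prints the heading for the statistics.
print(f"Average magnitude: {results['average_magnitude']:.2f}"): prints the average magnitude. The f prefix marks an f-string, used for string formatting, and :.2f keeps two decimal places.
print(f"Most frequent area intensity: {results['max_intensity_area']}"): prints the most frequent area intensity level.
print(f"Earthquakes per county:\n{results['earthquake_count_per_county']}"): prints the number of earthquakes per county. \n inserts a line break.
print(f"Total earthquakes (unique EarthquakeNo): {results['total_earthquake_num']}"): prints the total number of earthquakes, counted by unique EarthquakeNo.
if results['max_pga_station'] is not None:: checks whether results['max_pga_station'] is None. A value other than None means a station with the largest PGA was found, so the following lines run.
- print("\nStation with the largest PGA:")
- print(results['max_pga_station']): prints the record of the station with the largest PGA.
if results['max_pgv_station'] is not None:: checks whether results['max_pgv_station'] is None. A value other than None means a station with the largest PGV was found, so the following lines run.
- print("\nStation with the largest PGV:")
- print(results['max_pgv_station']): prints the record of the station with the largest PGV.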
