
European Temperature Analysis with PCA

Climate classification of European cities using Principal Component Analysis

Project Summary

This project applies Principal Component Analysis (PCA) to monthly temperature data from European cities to identify climate patterns based on geographical location. The analysis reveals how European cities cluster according to temperature characteristics and geographical factors, reducing 16 dimensions to just 2 principal components that capture 93.9% of the variance.

  • 93.9%: Variance Explained
  • 16 → 2: Dimensionality Reduction
  • 4: Climate Groups
  • 78.0%: PC1 (North-South Gradient)

Dataset Description

Temperature Dataset Characteristics

The dataset contains monthly temperature data from multiple European cities. Each city is described by 12 monthly temperature measurements, two summary variables (average temperature and thermal amplitude), and two geographical coordinates.

Features

  • Monthly Temperatures: January through December
  • Average Temperature: Annual mean temperature
  • Thermal Amplitude: Annual temperature range
  • Geographical Coordinates: Latitude and Longitude
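
The two summary features can be derived from the twelve monthly values. The sketch below assumes that "thermal amplitude" means the warmest month minus the coldest month, which is one common reading of "annual temperature range"; the function name and the example values are hypothetical and not taken from the project's dataset.

```python
def annual_summary(monthly_temps):
    """Return (annual mean, thermal amplitude) for 12 monthly temperatures."""
    if len(monthly_temps) != 12:
        raise ValueError("expected 12 monthly values")
    mean = sum(monthly_temps) / 12
    # Assumption: amplitude = warmest month minus coldest month.
    amplitude = max(monthly_temps) - min(monthly_temps)
    return mean, amplitude


# Hypothetical monthly values in °C, invented for illustration only.
example_city = [-5.0, -4.0, 0.0, 5.0, 11.0, 15.0, 18.0, 16.0, 11.0, 6.0, 1.0, -2.0]
mean_temp, amplitude = annual_summary(example_city)
```

Because both features are functions of the monthly columns, they are strongly tied to them, which is consistent with the high share of variance captured by the first components.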

Data Characteristics

  • Multiple European Cities: Various geographical locations
  • 16 Total Variables: 12 monthly temperatures + average temperature + thermal amplitude + 2 coordinates (latitude, longitude)
  • Complete Records: No missing values
  • Standardized Data: Variables standardized before PCA

Key Insight: Monthly temperatures are strongly correlated with one another, and latitude emerges as the dominant factor influencing temperature patterns across Europe.

Methodology and Analysis

Exploratory Analysis

  • Correlation Matrix Analysis
  • Heatmap Visualization
  • Scatter Matrix Plots
  • Data Standardization
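
The standardization and correlation steps listed above can be sketched with numpy. The data here is a synthetic stand-in (30 hypothetical cities, 4 correlated variables), since the project's dataset is not reproduced on this page; the `standardize` helper is an illustrative name, not the project's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real table: 30 hypothetical cities x 4 variables
# that share one common signal, so the columns are strongly correlated.
base = rng.normal(size=(30, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(30, 1)) for _ in range(4)])


def standardize(X):
    """Center each column and scale it to unit (population) standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


Z = standardize(X)

# Correlation matrix of the variables (columns). For standardized data
# this equals Z.T @ Z / n.
R = np.corrcoef(X, rowvar=False)
```

A heatmap of `R` (for example with seaborn) would then show the blocks of strongly correlated months at a glance.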

PCA Implementation

  • Principal Component Analysis
  • Component Selection Criteria
  • Kaiser Rule Application
  • Elbow Method Analysis
  • Variance Thresholds
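
As a rough illustration of the selection criteria above, the sketch below runs PCA through an eigendecomposition of the correlation matrix, then applies the Kaiser rule (keep eigenvalues above 1) and a cumulative variance threshold. It uses plain numpy rather than scikit-learn so the arithmetic stays visible; the function names, the 90% threshold, and the synthetic data are assumptions made for this example. The elbow method is visual and is not coded here.

```python
import numpy as np


def pca_from_correlation(X):
    """PCA on standardized variables via the correlation matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    R = Z.T @ Z / len(Z)                      # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                      # observation coordinates
    return eigvals, eigvecs, scores


def select_components(eigvals, threshold=0.90):
    """Return variance ratios, the Kaiser count, and the threshold count."""
    ratio = eigvals / eigvals.sum()
    kaiser = int(np.sum(eigvals > 1.0))       # Kaiser rule: eigenvalue > 1
    cumulative = np.cumsum(ratio)
    by_threshold = int(np.searchsorted(cumulative, threshold) + 1)
    return ratio, kaiser, by_threshold


# Synthetic data: 40 hypothetical observations, 6 variables driven by 2 factors.
rng = np.random.default_rng(42)
latent = rng.normal(size=(40, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(40, 6))

eigvals, eigvecs, scores = pca_from_correlation(X)
ratio, kaiser_count, threshold_count = select_components(eigvals)
```

With standardized variables the eigenvalues sum to the number of variables, which is why an eigenvalue above 1 means a component carries more variance than any single original variable.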

Visualization

  • Observation Projection
  • Correlation Circles
  • Component Interpretation
  • Quality Measures (COS²)
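
The correlation circle and COS² measures can be computed directly from the PCA output. The sketch below is one standard way to do it, assuming PCA on standardized variables: a variable's coordinates on the circle are its correlations with PC1 and PC2, and an observation's COS² on a component is its squared score divided by its squared distance to the origin. The data is synthetic and the variable names are illustrative.

```python
import numpy as np

# Synthetic stand-in: 25 hypothetical observations, 5 correlated variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(25, 5)) @ rng.normal(size=(5, 5))

# PCA on the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(Z.T @ Z / len(Z))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Z @ eigvecs

# Variable-component correlations; the clip guards against tiny negative
# eigenvalues caused by floating-point error.
loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
circle_xy = loadings[:, :2]          # coordinates on the correlation circle

# COS²: share of each observation's squared distance carried by each component.
cos2 = scores**2 / (scores**2).sum(axis=1, keepdims=True)
plane_quality = cos2[:, :2].sum(axis=1)   # quality on the PC1-PC2 plane
```

Points with `plane_quality` close to 1 are well represented in the two-dimensional projection; points with low values should be interpreted with care.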

Results and Findings

PCA Results

Component | Eigenvalue | Variance Explained | Cumulative Variance | Interpretation
PC1       | 12.49      | 78.0%              | 78.0%               | North-South Temperature Gradient
PC2       | 2.54       | 15.9%              | 93.9%               | Continental vs Maritime Climate
PC3       | 0.64       | 4.0%               | 97.9%               | Secondary Climate Patterns
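
The percentages in the table follow from the eigenvalues: with 16 standardized variables the eigenvalues sum to 16, so each component's share is its eigenvalue divided by 16. The short check below recomputes the shares from the rounded eigenvalues reported above; small differences (under 0.1 percentage point) are expected because the eigenvalues themselves are rounded to two decimals.

```python
n_variables = 16  # total variance of 16 standardized variables

# Eigenvalues and percentages as reported in the results table.
eigenvalues = {"PC1": 12.49, "PC2": 2.54, "PC3": 0.64}
reported_share = {"PC1": 78.0, "PC2": 15.9, "PC3": 4.0}
reported_cumulative = {"PC1": 78.0, "PC2": 93.9, "PC3": 97.9}

share = {}
cumulative = {}
running = 0.0
for name, value in eigenvalues.items():
    share[name] = 100 * value / n_variables   # percent of total variance
    running += share[name]
    cumulative[name] = running
```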

Climate Classification

European cities naturally cluster into four distinct climate groups based on the two principal components:

  • Cold + Maritime: Northwestern cities (Dublin, London, Amsterdam)
  • Cold + Continental: Northeastern cities (Helsinki, Moscow, Stockholm)
  • Warm + Maritime: Southwestern cities (Lisbon, Barcelona, Bordeaux)
  • Warm + Continental: Southeastern cities (Athens, Rome, Madrid)
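
One simple way to read the four groups is as quadrants of the PC1-PC2 plane. The sketch below assumes that positive PC1 means warmer and positive PC2 means more continental, and that the groups split at zero; PCA signs are arbitrary, so the orientation must be checked against the loadings before applying this. The function name and the example scores are hypothetical and are not the project's results.

```python
def climate_group(pc1, pc2):
    """Label an observation by the quadrant of its (PC1, PC2) scores.

    Assumes positive PC1 = warmer and positive PC2 = more continental.
    """
    temperature = "Warm" if pc1 >= 0 else "Cold"
    regime = "Continental" if pc2 >= 0 else "Maritime"
    return f"{temperature} + {regime}"


# Hypothetical scores, invented for illustration only.
example_scores = {
    "city_a": (2.1, -0.8),
    "city_b": (-3.0, 1.5),
    "city_c": (-1.2, -1.1),
    "city_d": (1.7, 0.9),
}
groups = {city: climate_group(*pcs) for city, pcs in example_scores.items()}
```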

Key Determinants

The analysis revealed that PC1, interpreted as the north-south (latitude) gradient, accounts for 78.0% of the variance, while PC2, interpreted as continentality, explains an additional 15.9%.

Longitude showed minimal influence on annual temperature patterns, confirming that north-south position is far more significant than east-west position in Europe.

Key Findings

Dual Climate Classification

European cities can be classified simultaneously using two main axes: north-south (average temperature) and continental-maritime (thermal amplitude), creating a comprehensive climate classification system.

Geographical Patterns

Clear geographical distributions emerged: Northwest (cold+maritime), Northeast (cold+continental), Southwest (warm+maritime), and Southeast (warm+continental), with few exceptions.

Climate Theory Confirmation

The results are consistent with established climate theory: oceanic influence moderates temperatures, continentality widens the gap between summer and winter temperatures, and latitude is the dominant factor in average temperatures.

Technologies Used

Languages & Libraries

  • Python 3.8+
  • scikit-learn
  • pandas & numpy
  • matplotlib & seaborn

Algorithms

  • Principal Component Analysis
  • Dimensionality Reduction
  • Correlation Analysis
  • Standardization Techniques

Techniques

  • Component Selection Criteria
  • Variance Analysis
  • Exploratory Data Analysis
  • Multivariate Visualization

Conclusions

This project used Principal Component Analysis to reduce a 16-variable climate dataset to two interpretable components. The two-dimensional representation captures 93.9% of the variance and separates European cities along a north-south axis and a continental-maritime axis.

Main Contributions

  • Successful dimensionality reduction from 16 to 2 components
  • Clear interpretation of climate patterns across Europe
  • Confirmation of established geographical climate theories
  • Development of a dual-axis climate classification system

Key Learnings

  • PCA works well on this kind of climate data because the variables are strongly correlated
  • Latitude dominates European temperature variations
  • Continentality provides important secondary differentiation
  • Visualization techniques are crucial for interpreting PCA results