Clustering French Cities (based on Temperatures)

(This article was first published on R-english – Freakonometrics, and kindly contributed to R-bloggers)

In order to illustrate hierarchical clustering techniques and k-means, I did borrow François Husson‘s dataset, with monthly average temperature in several French cities.

> temp=read.table(
+ "http://freakonometrics.free.fr/FR_temp.txt",
+ header=TRUE,dec=",")

We have 15 cities, with monthly observations

> X=temp[,1:12]
> boxplot(X)

Since the variance seems to be rather stable, we will not ‘normalize’ the variables here,

> apply(X,2,sd)
    Janv     Fevr     Mars     Avri 
2.007296 1.868409 1.529083 1.414820 
     Mai     Juin     juil     Aout 
1.504596 1.793507 2.128939 2.011988 
    Sept     Octo     Nove     Dece 
1.848114 1.829988 1.803753 1.958449

In order to get a hierarchical cluster analysis, use for instance

> h <- hclust(dist(X), method = "ward")
> plot(h, labels = rownames(X), sub = "")

An alternative is to use

> library(FactoMineR)
> h2=HCPC(X)
> plot(h2)

Here, we visualise observations with a principal components analysis. We have here also an automatic selection of the number of classes, here 3. We can get the description of the groups using

> h2$desc.ind

or directly

> cah=hclust(dist(X))
> groups.3 <- cutree(cah,3)

We can also visualise those classes by ourselves,

> acp=PCA(X,scale.unit=FALSE)
> plot(acp$ind$coord[,1:2],col="white")
> text(acp$ind$coord[,1],acp$ind$coord[,2],
+ rownames(acp$ind$coord),col=groups.3)

It is possible to plot the centroïds of those clusters

> PT=aggregate(acp$ind$coord,list(groups.3),mean)
> points(PT$Dim.1,PT$Dim.2,pch=19)

If we add Voroid sets around those centroïds, here we do not see them (actually, we see the point – in the middle – that is exactly at the intersection of the three regions),

> library(tripack)
> V <- voronoi.mosaic(PT$Dim.1,PT$Dim.2)
> plot(V,add=TRUE)

To visualize those regions, use

> p=function(x,y){
+   which.min((PT$Dim.1-x)^2+(PT$Dim.2-y)^2)
+ }
> vx=seq(-10,12,length=251)
> vy=seq(-6,8,length=251)
> z=outer(vx,vy,Vectorize(p))
> image(vx,vy,z,col=c(rgb(1,0,0,.2),
+ rgb(0,1,0,.2),rgb(0,0,1,.2)))
> CL=c("red","black","blue")
> text(acp$ind$coord[,1],acp$ind$coord[,2],
+ rownames(acp$ind$coord),col=CL[groups.3])

Actually, those three groups (and those three regions) are also the ones we obtain using a k-mean algorithm,

> km=kmeans(acp$ind$coord[,1:2],3)
> km
K-means clustering 
with 3 clusters of sizes 3, 7, 5

(etc). But actually, since again we have some spatial data, it is possible to visualize them on a map

> library(maps)
> map("france")
> points(temp$Long,temp$Lati,col=groups.3,pch=19)

or, to visualize the regions, use e.g.

> library(car)
> for(i in 1:3) 
+ dataEllipse(temp$Long[groups.3==i],
+ temp$Lati[groups.3==i], levels=.7,add=TRUE,
+ col=i+1,fill=TRUE)

Those three regions actually make sense, geographically speaking.

To leave a comment for the author, please follow the link and comment on their blog: R-english – Freakonometrics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

Clustering French Cities (based on Temperatures)

Trending Articles

RAMAYAMPET Mandal Sarpanch | Upa-Sarpanch | Ward member Mobile Numbers Medak...

लड़कियां सेक्स के दौरान क्यों करती है उह! आह!लड़कियां सेक्स के दौरान क्यों करती...

Neem Baba Extra Questions Answer Class 6 English Poorvi

Throw Back: 4×4 — Sikilitele (Ft Castro) Prod by JQ

Rajasthan Board 10th Result 2016 Roll No wise & Name Wise

Lowe faces four theft charges

Practice Sheet of Right form of verbs for HSC Students

Mafia, Murder & Mayhem In The Motor City: Detroit Mob Hit Timeline (1937-2007)

The 10 Tennessee Cities With The Largest Black Population For 2021

Materials Around Us Class 6 Worksheet Science Chapter 6

デスクトップヒープの枯渇

Best Suvichar in Hindi |बेस्ट सुविचार |शुभ विचार हिंदी में

Kanulanu Thaake Lyrics and translation | Manam (2014)

Korean Sex Porn Videos: XXX Videos & Free Porn Movies

Teen Shot In Miami Drive-By Dies From Injuries

Download: IQ Muzatasha feat Shy D & Pmj – Ulesi NiFertilizer Yamavuto

Mahakal Attitude Status

Property developer set up cannabis factory to help pay off debts...

♡

KB: How to troubleshoot issues when adding a Hyper-V host in System Center...