Clusters of Texts
(This article was first published on R-english – Freakonometrics, and kindly contributed to R-bloggers) Another popular application of classification techniques is text mining (see e.g. an old post...
Clustering French Cities (based on Temperatures)
(This article was first published on R-english – Freakonometrics, and kindly contributed to R-bloggers) In order to illustrate hierarchical clustering techniques and k-means, I did borrow François...
Large scale eigenvalue decomposition and SVD with rARPACK
(This article was first published on Yixuan's Blog - R, and kindly contributed to R-bloggers) In January 2016, I was honored to receive an “Honorable Mention” of the John Chambers Award 2016. A Short...
Nairobi Data Science Meet Up: Finding deep structures in data with Chris Orwa
(This article was first published on R – Data Science Africa, and kindly contributed to R-bloggers) I sat down with a former rugby school captain whose rugby career was cut short by a shoulder injury...
Principal Component Analysis using R
(This article was first published on Data Perspective, and kindly contributed to R-bloggers) Curse of Dimensionality: One of the most commonly faced problems while dealing with data analytics problem...
Nairobi Data Science Meetup: Paradigm Shift in Research with Samuel Kamande
(This article was first published on R – Data Science Africa, and kindly contributed to R-bloggers) Samuel Kamande is a Data Scientist at Nielsen and his presentation will focus on “Paradigm Shift in...
Are you doing parallel computations in R? Then use BiocParallel
(This article was first published on Fellgernon Bit - rstats, and kindly contributed to R-bloggers) It’s the morning of the first day of oral conferences at #ENAR2016. I feel like I have a spidey...
Perform co-operations with the coop package
(This article was first published on R – librestats, and kindly contributed to R-bloggers) About: The coop package does co-operations: covariance, correlation, and cosine, and it does them quickly. The...
R benchmark for High-Performance Analytics and Computing (I)
(This article was first published on Blog – ParallelR, and kindly contributed to R-bloggers) Objectives of Experiments: R is increasingly popular in various fields, including the high-performance...
“Data Mining with R” Course | May 17-18
(This article was first published on MilanoR, and kindly contributed to R-bloggers) The two-day course Data Mining with R is organized by the R training and consulting company Quantide. Next live...
Principal Components Regression, Pt. 1: The Standard Method
(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers) In this note, we discuss principal components regression and some of the issues with it: The need for...
Principal Components Regression in R, an operational tutorial
(This article was first published on Revolutions, and kindly contributed to R-bloggers) John Mount Ph.D., Data Scientist at Win-Vector LLC. Win-Vector LLC's Dr. Nina Zumel has just started a two part...
IP string to integer conversion with Rcpp
(This article was first published on Opiate for the masses, and kindly contributed to R-bloggers) IP address conversion: At work I recently had to match data on IP addresses and some fuzzy timestamp...
RTCGA factory of R packages – Quick Guide
(This article was first published on r-addict.com, and kindly contributed to R-bloggers) Yesterday the new version of R, R 3.3.0 (codename Supposedly Educational), was released. This...
Installing WVPlots and “knitting R markdown”
(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers) Some readers have been having a bit of trouble using devtools to install WVPlots. I thought I would...
Tutorial: GitHub for Data Scientists without the Terminal
(This article was first published on R – Modern Data, and kindly contributed to R-bloggers) Git and GitHub are indispensable tools for anyone analysing data, developing software or disseminating...
Principal Components Regression, Pt. 2: Y-Aware Methods
(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers) In our previous note, we discussed some problems that can arise when using standard principal components...
Principal Components Regression, Pt. 3: Picking the Number of Components
(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers) In our previous note we demonstrated Y-Aware PCA and other y-aware approaches to dimensionality...
Building the Data Matrix for the Task at Hand and Analyzing Jointly the...
(This article was first published on Engaging Market Research, and kindly contributed to R-bloggers) Someone decided what data ought to go into the matrix. They placed the objects of interest in the...
What are the Best Machine Learning Packages in R?
Guest post by Khushbu Shah The most common question asked by prospective data scientists is – “What is the best programming language for Machine Learning?” The answer to this question always results...