Machine Learning

Iris Flowers

This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.

The example JASP file demonstrates the use of a K-means clustering analysis.

Data set included in R.

Penguins

Data contains measurements of penguins in the Palmer Archipelago: species, island, size (flipper length, body mass, bill dimensions), and sex.

The example JASP file demonstrates the use of a decision tree classification analysis.

Data set included in R package palmerpenguins.

Police Cadet Evaluation

Real-world data on police cadet evaluation, collected in the late 1990s in the Netherlands. The data were used to study the effects of lowering admission standards and to predict, from various attributes, whether cadets would pass the standard five-year evaluation.

The example JASP file demonstrates the use of a decision tree classification analysis.

Data set made available by the National Police Corps and the Ministry of Justice of the Netherlands.

Spiral

Data contains a spiral figure on a two-dimensional plane. Two classes are represented in the 200 samples.

The example JASP file demonstrates the use of a support vector machine classification analysis.

Data set included in R.

Student Grades

Data contains information on 357 high school students in Portugal (Cortez & Silva, 2008).

The example JASP file demonstrates the use of a boosting regression analysis.

Publicly available at https://www.kaggle.com/dipam7/student-grade-prediction.

Telco Customer Churn

Telecom data from 7043 customers of a phone provider. Variables range from personal descriptives to subscription fees and provide information on customer churning behaviour.

The example JASP file demonstrates the use of a K-nearest neighbors classification analysis.

Publicly available at https://www.kaggle.com/blastchar/telco-customer-churn.

Wine Types

The results of a chemical analysis of wines grown in a specific area of Italy. Three types of wine are represented in the 178 samples, with the results of 13 chemical analyses recorded for each sample.

The example JASP file demonstrates the use of a random forest clustering analysis.

Data set included in R.