In R, you can create various types of graphs, such as scatter plots, bar plots, line graphs, and histograms. These graphs help you visualize data points, trends, and distributions to better understand your data.
Key takeaways:
Plots are powerful tools for visualizing complex data. They help interpret data more easily and, hence, facilitate quicker decision-making by highlighting key trends and variations.
Each type of plot serves a unique purpose, allowing analysts to choose the most suitable visualization based on the data characteristics they wish to highlight.
A scatter plot effectively represents the relationship between two continuous variables, highlighting correlations and outliers.
Line plots connect ordered data points, offering a quick understanding of trends and values in a dataset.
Box plots illustrate the central tendency and dispersion of data, making it easier to identify outliers and understand the distribution.
Contour plots provide a two-dimensional representation of three-dimensional data, helping to visualize gradients and areas of interest in various fields like geography and meteorology.
Area plots illustrate cumulative values of multiple variables over a specific interval, enabling the identification of trends and patterns in the data.
Density plots reveal the distribution shape of a single variable, aiding in the identification of patterns such as symmetry, skewness, or multimodality.
A plot is a visual representation of data. It helps us understand and interpret trends in large, complex datasets. Data analysts use plots to help businesses grow by observing trends in the data. For instance, we can use plots in R to analyze how well Uber is performing in New York City or to explore stock market trends to make rational business decisions.
In this Answer, we’ll discuss several plot types in the R language, each with a code example.
Let’s look at each plot type with examples below:
A histogram is a graphical representation that displays the distribution of a continuous variable by dividing the data into intervals or bins and counting the frequency of data points within each bin. Each bar in the histogram represents the number of data points within a specific range, allowing you to see how data is spread across different values. Histograms are useful for identifying patterns in data, such as skewness, modality (e.g., unimodal or bimodal), and the presence of any outliers. Below is an example of how to create a histogram in the R language.
# Generate random data
data <- rnorm(100)

# Create a histogram with colorful bars
hist(data, main = "Histogram", xlab = "Values", col = rainbow(10), border = "black")
The result of this code will be similar to the one shown in the following graph:
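The binning described above can also be inspected directly. The following is a minimal sketch, assuming a fixed seed and the illustrative variable names values and h (neither is part of the original example). It uses hist() with plot = FALSE to return the bin edges and counts without drawing, then draws the same histogram.

```r
set.seed(42)                 # fixed seed so the sample is reproducible (an assumption)
values <- rnorm(100)

# Compute the bins without drawing anything
h <- hist(values, breaks = 10, plot = FALSE)
h$breaks                     # bin edges
h$counts                     # number of observations that fall in each bin

# Draw the histogram; `breaks = 10` is a suggestion that R rounds to "pretty" edges
hist(values, breaks = 10, main = "Histogram with About 10 Bins",
     xlab = "Values", col = "steelblue", border = "white")
```

Because breaks is treated as a suggestion when given as a single number, the number of bars drawn may differ slightly from 10.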
A scatter plot represents the relationship between two continuous variables. Each dot represents one observation, positioned by its value on the x-axis variable and its value on the y-axis variable. A scatter plot helps reveal correlations and outliers in the dataset. The code for a scatter plot in the R language is given below:
# Generate random data
x <- rnorm(100)
y <- rnorm(100)

# Create a scatter plot with colorful points
plot(x, y, main = "Scatter Plot", xlab = "X", ylab = "Y", col = rainbow(length(x)))
The result of this code is shown in the following graph:
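To make the idea of correlation concrete, here is a small sketch that is not part of the original example. It assumes synthetic data in which y depends linearly on x (the slope of 2 and the noise level are arbitrary choices), measures the correlation with cor(), and overlays a least-squares line with lm() and abline().

```r
set.seed(1)                            # fixed seed for reproducibility (an assumption)
x <- rnorm(100)
y <- 2 * x + rnorm(100, sd = 0.5)      # y depends on x, plus random noise

r   <- cor(x, y)                       # Pearson correlation coefficient
fit <- lm(y ~ x)                       # least-squares regression line

plot(x, y, main = "Scatter Plot with Trend Line",
     xlab = "X", ylab = "Y", pch = 19, col = "steelblue")
abline(fit, col = "red", lwd = 2)      # overlay the fitted line
```

With independent random x and y, as in the example above, the points form a shapeless cloud and the correlation is close to zero.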
A line plot (line graph) displays ordered data points connected by line segments. It is mainly used to show how a value changes across a sequence, such as time, and provides a quick visual summary of trends in the dataset. It should not be confused with a strip plot (dot plot), which places individual markers along a number line and stacks points that share the same value. The R code to draw a line plot is given below, along with its result.
# Generate example data
x <- 1:10                              # x-values
y <- c(3, 5, 8, 4, 6, 2, 7, 9, 5, 4)   # y-values

# Create a line plot
plot(x, y, type = "o", main = "Line Plot", xlab = "X", ylab = "Y")
The result of this code is shown in the following graph:
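For the markers-along-a-number-line view, in which points that share a value are stacked, base R offers stripchart(). The sketch below reuses the y-values from the example under the illustrative name scores (an assumption) and counts repeated values with table().

```r
# Same values as the line plot example, stored under an illustrative name
scores <- c(3, 5, 8, 4, 6, 2, 7, 9, 5, 4)

freq <- table(scores)        # how often each value occurs

# method = "stack" piles repeated values on top of each other
stripchart(scores, method = "stack", pch = 19, offset = 0.5,
           main = "Strip Chart", xlab = "Value")
```

Here the values 4 and 5 each occur twice, so they appear as stacks of two markers.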
A box plot (or box-and-whisker plot) visualizes the distribution of a dataset. It displays the median, the first and third quartiles, and the whiskers, and it marks outliers as individual points. A box plot summarizes the central tendency, spread, and skewness of the dataset. Let’s look at the code and its result below to understand it better.
# Generate example data
x <- rnorm(100)   # Random data

# Create a box plot
boxplot(x, main = "Box Plot", ylab = "Values")
The result of this code is shown in the following graph:
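The numbers behind a box plot can be read off with boxplot.stats(). The following sketch is an illustration only: it assumes a fixed seed and appends one deliberately extreme value (6) so that an outlier is present.

```r
set.seed(7)                    # fixed seed for reproducibility (an assumption)
x <- c(rnorm(100), 6)          # random data plus one deliberately extreme value

s <- boxplot.stats(x)
s$stats   # lower whisker, lower hinge (Q1), median, upper hinge (Q3), upper whisker
s$out     # points lying more than 1.5 * IQR beyond the hinges

boxplot(x, main = "Box Plot with an Outlier", ylab = "Values")
```

The appended value is drawn as a separate point above the upper whisker.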
A contour plot, or level plot, is a two-dimensional representation of three-dimensional data. It shows how a variable z varies over the x-y plane by drawing lines that connect points of equal value. Contour plots are mainly used in geography, meteorology, and calculus to visualize a function’s behavior. They let us locate regions of high or low values, gradients, and boundaries. The spacing between contour lines indicates the rate of change: closely spaced lines mark a steep change, and widely spaced lines mark a gradual one. Here’s an example of a contour plot of a function of two variables.
# Generate example data
x <- -20:20
y <- -20:20
z <- sqrt(outer(x ^ 2, y ^ 2, "+"))

# Create a contour plot
contour(x = x, y = y, z = z)
The result of this code is shown in the following graph:
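To see how contour heights relate to the surface, the levels can be chosen explicitly. This is a sketch using the same cone-shaped surface as above. The level values are arbitrary assumptions, and contourLines() from grDevices is used to extract the lines without drawing them.

```r
x <- -20:20
y <- -20:20
z <- sqrt(outer(x ^ 2, y ^ 2, "+"))    # distance from the origin: a cone

chosen <- seq(2.5, 17.5, by = 5)       # arbitrary heights at which to draw lines

# Extract the contour lines as coordinate lists, without plotting
cl <- contourLines(x, y, z, levels = chosen)
sapply(cl, function(line) line$level)  # the height each line belongs to

contour(x, y, z, levels = chosen,
        main = "Contour Plot at Chosen Levels", xlab = "x", ylab = "y")
```

Because the cone rises at a constant rate, equally spaced levels produce equally spaced circles. A surface that steepens would pull the lines closer together.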
An area plot visualizes multiple variables with their corresponding values or proportions over a specific interval. It is used to create a cumulative representation, where the value of each variable is added to the previous variable, making a stacked area. It is also used to identify the trends and patterns in the dataset.
# Create example data
x <- 1:10                                # x-axis values
y1 <- c(3, 5, 8, 4, 6, 2, 7, 9, 5, 4)    # Data for variable 1
y2 <- c(2, 4, 6, 3, 5, 1, 6, 8, 4, 3)    # Data for variable 2
y3 <- c(1, 3, 5, 2, 4, 1, 4, 6, 3, 2)    # Data for variable 3

# Create an empty plot tall enough to hold the stacked total
plot(x, y1, type = "n", ylim = c(0, max(y1 + y2 + y3)), xlab = "X", ylab = "Values", main = "Area Plot")

# Each polygon runs along its upper edge, then back along its lower edge
polygon(c(x, rev(x)), c(y1, rep(0, length(x))), col = "blue", border = NA)          # 0 to y1
polygon(c(x, rev(x)), c(y1 + y2, rev(y1)), col = "green", border = NA)              # y1 to y1 + y2
polygon(c(x, rev(x)), c(y1 + y2 + y3, rev(y1 + y2)), col = "red", border = NA)      # y1 + y2 to total
legend("topright", legend = c("Variable 1", "Variable 2", "Variable 3"), fill = c("blue", "green", "red"))
The result of this code is shown in the following graph:
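The cumulative stacking can also be computed rather than written out by hand. The following is a minimal sketch, assuming the series are stored as rows of a matrix; the names series, tops, and fills are illustrative. cumsum() produces the upper edge of each layer, and a loop draws one polygon per layer.

```r
x <- 1:10
series <- rbind(
  v1 = c(3, 5, 8, 4, 6, 2, 7, 9, 5, 4),
  v2 = c(2, 4, 6, 3, 5, 1, 6, 8, 4, 3),
  v3 = c(1, 3, 5, 2, 4, 1, 4, 6, 3, 2)
)

# Row i + 1 of `tops` is the running total of series 1..i; row 1 is the zero baseline
tops  <- rbind(0, apply(series, 2, cumsum))
fills <- c("blue", "green", "red")

plot(x, tops[nrow(tops), ], type = "n", ylim = c(0, max(tops)),
     xlab = "X", ylab = "Values", main = "Stacked Area Plot")

# Each layer spans from the previous running total up to the current one
for (i in seq_len(nrow(series))) {
  polygon(c(x, rev(x)), c(tops[i + 1, ], rev(tops[i, ])),
          col = fills[i], border = NA)
}
legend("topleft", legend = rownames(series), fill = fills)
```

Adding a fourth series then only requires another row in series and another color in fills.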
A density plot (kernel density plot) visualizes the distribution of a single continuous variable as a smooth curve. It is used to see whether the data is symmetric, skewed, or multimodal, and it helps identify unusual patterns in the dataset. The horizontal axis represents the variable’s values, whereas the vertical axis represents the estimated probability density, so the area under the curve, not its height, corresponds to probability.
# Generate example data
x <- rnorm(100)   # Random data

# Create a density plot
density_plot <- density(x)   # Estimate the density
plot(density_plot, main = "Density Plot", xlab = "X", ylab = "Density")
The result of this code is shown in the following graph:
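The shape of a density curve depends on the smoothing bandwidth. As an illustration (the seed and the adjust value of 2 are assumptions), the sketch below compares the default estimate with a smoother one and checks numerically that the area under the curve is approximately 1.

```r
set.seed(3)                            # fixed seed for reproducibility (an assumption)
x <- rnorm(100)

d_default <- density(x)                # default bandwidth
d_smooth  <- density(x, adjust = 2)    # twice the default bandwidth: smoother curve

# Approximate the area under the curve with a Riemann sum over the evaluation grid
area <- sum(d_default$y) * diff(d_default$x[1:2])

plot(d_default, main = "Density Plot with Two Bandwidths",
     xlab = "X", ylab = "Density")
lines(d_smooth, col = "red", lty = 2)
legend("topright", legend = c("default", "adjust = 2"),
       col = c("black", "red"), lty = c(1, 2))
```

A larger bandwidth hides small bumps, and a smaller one can exaggerate noise, so it is worth trying more than one value before judging a distribution to be multimodal.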
The following table compares the plot types from an R perspective: the purpose of each plot, the type of data it suits best, the R function used to draw it, and the key parameters passed to that function:
Plot Type | Purpose | Suitable Data | R Function | Key Parameters |
Histogram | Displays the distribution of a continuous variable by counting values within intervals (bins) | One continuous variable | hist() | x, main, xlab, col, border |
Scatter | Displays the relationship between two continuous variables | Two continuous variables | plot() | x, y, main, xlab, ylab, col |
Line | Shows trends over time or ordered data | Sequential data (e.g., time series) | plot() with type = "o" | x, y, type, main, xlab, ylab |
Box | Summarizes distribution through quartiles and outliers | Continuous variable | boxplot() | x, main, ylab |
Contour | Displays 3D data in 2D with contour lines | Two continuous variables (with z-values) | contour() | x, y, z |
Area | Shows cumulative data over time or categories | Stacked or time-based data | plot() with type = "n", then polygon() | ylim, col, border |
Density | Shows the distribution of a single continuous variable | Continuous variable | density(), then plot() | main, xlab, ylab |
In conclusion, plots give us a compelling visualization of the trends and patterns of the dataset. They allow us to understand variation across a given domain comprehensively and provide a visual representation of the data distribution by highlighting its shape, skewness, and multimodality. All this enables a deeper understanding of the underlying characteristics of the dataset.