Use this guide to learn about the different R data types, what each one is used for, which professionals rely on them, practical tips for working with them, and how you can get started on your journey with R.
The R programming language is a domain-specific language (DSL) used for statistical analysis, data visualization, machine learning, artificial intelligence (AI), and more. Created in 1993, R remains one of the most popular programming languages among professionals, and knowing how to use it can command a strong salary. Universities, organizations, and professionals worldwide still choose R for data science, statistical analysis, research, and other complex analysis.
R features capabilities to handle many data types, defined by the type of information contained in a variable. Grasping each data type and its potential uses allows you to operate effectively in R and manipulate data. It may also help you to learn the types of professions that utilize R data types, helpful tips associated with them, and the steps you can take to learn more about R data types to build your skill set.
R features a set of basic data types that provide the foundation to build data structures or objects. Understanding each data type and its differences allows you to work with them and utilize R efficiently and effectively. The common R data types include:
Character: The character data type holds text. Any combination of letters, numbers, or symbols becomes character data when you wrap it in quotes, so "13.2" is text rather than a number.
Numeric: Numeric is R's default data type for numbers and is stored as a double-precision decimal value. For example, R stores the number 13.2 as numeric, and it also stores 13 as numeric unless you explicitly ask for an integer.
Integer: Integers are whole numbers with no decimal part. Because R treats numbers as numeric by default, you must request an integer explicitly, either by appending L to the number (13L) or by converting a value with as.integer. For example, as.integer(13.2) drops the fractional part and returns just 13 when printed.
Logical: The logical data type holds the values TRUE and FALSE and most often appears as the result of a comparison. If you write 2 > 3, R returns FALSE. This response from R is Boolean and represents the logical data type.
Complex: The complex data type combines a real part and an imaginary part. R uses the suffix i to mark the imaginary part, where i is the imaginary unit (the square root of -1). If you enter a = 12 - 4i, a is complex, with a real part of 12 and an imaginary part of -4.
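As a quick illustration of the five basic types above, the following sketch creates one value of each and asks R what it is. The variable names are chosen for illustration only; 13L uses R's L suffix for integer literals.

```r
# One value of each basic data type (variable names are illustrative)
text_value    <- "13.2"            # character: the quotes make this text
numeric_value <- 13.2              # numeric: the default type for numbers
integer_value <- 13L               # integer: the L suffix requests an integer
truncated     <- as.integer(13.2)  # as.integer() drops the fractional part
logical_value <- 2 > 3             # logical: the result of a comparison
complex_value <- 12 - 4i           # complex: real part 12, imaginary part -4

class(text_value)     # "character"
class(numeric_value)  # "numeric"
class(integer_value)  # "integer"
truncated             # 13
logical_value         # FALSE
class(complex_value)  # "complex"
```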
In addition to these five common R data types, a few others commonly appear while using R, such as factor. The factor data type represents categorical (qualitative) data with a fixed set of possible values, called levels, and comes into play during statistical modeling. An example of this is assigning a rating to some topic, such as TV shows you watched. In this case, the levels of your factor are above average, average, and below average.
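The TV show rating example could be sketched as follows. The ratings themselves are made-up sample data, and the level order is an assumption chosen so that the categories sort from lowest to highest.

```r
# Hypothetical ratings for four TV shows
ratings <- c("average", "above average", "below average", "average")

# Build a factor with an explicit level order
rating_factor <- factor(
  ratings,
  levels = c("below average", "average", "above average")
)

levels(rating_factor)  # "below average" "average" "above average"
table(rating_factor)   # counts per level: 1, 2, 1
class(rating_factor)   # "factor"
typeof(rating_factor)  # "integer" (factors store integer codes internally)
```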
In the R programming language, the variables you create are objects, and these objects have data types, which then provide a building block for data structures to expand upon. Other programming languages, such as C++ and Java, handle variables differently: you declare each variable with a fixed data type up front, whereas in R the type belongs to the object the variable currently refers to. Data structures hold either one data type or a combination of data types, depending on the specific structure. Since you frequently create and manipulate data structures when using R, understanding their nuances and the data types each one accepts is crucial. The main data structures in R are vectors, lists, matrices, and data frames.
The vector data structure stores multiple values in a single one-dimensional sequence. Every value in a vector must share the same data type, so numeric and character values cannot coexist in one vector; if you try to combine them, R converts all of the values to a common type (in this case, character). An empty vector created with vector() has the logical type by default, but a vector otherwise takes on the type of the values you put into it, and you can convert it to numeric, character, or another data type. For example, storing the individual grades of all your students out of 100 is a good use of a vector. Every grade can be part of the vector under one variable instead of creating a new variable for each student.
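A minimal sketch of the grades example, using made-up scores, shows a vector under one variable and what happens when types are mixed:

```r
# Hypothetical grades out of 100, stored in a single vector
grades <- c(88, 92.5, 79, 95)

class(grades)  # "numeric"
mean(grades)   # 88.625
grades[2]      # 92.5 (R indexes from 1)

# Mixing types does not fail; R coerces every value to a common type
mixed <- c(88, "B+")
class(mixed)   # "character"
mixed          # "88" "B+"

# An empty vector from vector() is logical by default
class(vector())  # "logical"
```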
The list data structure is similar to a vector but allows you to store values of different data types together. Lists benefit you because they let you keep multiple unrelated values in one variable. You can even include other lists, vectors, or matrices within a list. For example, if you need to store all your students' grades, but some students have numeric grades and others have letter grades, you can use a list.
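The mixed-grades case might look like the sketch below. The student names and grades are hypothetical, and one element is itself a vector to show nesting.

```r
# Hypothetical students: numeric, letter, and multiple grades in one list
student_grades <- list(
  alice = 91,          # numeric grade
  ben   = "B+",        # letter grade stored as character
  carla = c(85, 90)    # a vector nested inside the list
)

class(student_grades)          # "list"
class(student_grades$alice)    # "numeric"
class(student_grades$ben)      # "character"
length(student_grades$carla)   # 2

# Each element keeps its own type
sapply(student_grades, class)
```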
Matrices must also hold data of a single type, but they build upon vectors by arranging the values in two dimensions. With two dimensions, you can create tables with rows and columns. A simple example is building a matrix with the names of all the states in the US in one column and the name of the most populated city in each state in the second column.
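A shortened version of the states example, limited to three states for brevity, could be sketched like this. The column names are an assumption added for readability.

```r
# matrix() fills column by column: first the states, then the cities
places <- matrix(
  c("California", "Texas", "Florida",
    "Los Angeles", "Houston", "Jacksonville"),
  ncol = 2,
  dimnames = list(NULL, c("state", "largest_city"))
)

dim(places)                 # 3 2 (three rows, two columns)
places[2, "largest_city"]   # "Houston"
typeof(places)              # "character" (one type for the whole matrix)
```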
Often described as R's most important data structure, especially for tabular data, a data frame is a list of columns (vectors) of equal length. A data frame looks like a table, with rows, columns, and headers. Unlike matrices, data frames can include various data types at the same time, although each individual column holds a single type. For example, your first column may be character, and the next two columns may be integers.
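The layout described above, one character column followed by two integer columns, can be sketched as follows. The show titles and counts are placeholder data.

```r
# Hypothetical data: one character column and two integer columns
shows <- data.frame(
  title    = c("Show A", "Show B", "Show C"),
  seasons  = c(3L, 5L, 1L),
  episodes = c(30L, 62L, 8L),
  stringsAsFactors = FALSE  # keep text as character in older R versions too
)

sapply(shows, class)  # title: "character", seasons and episodes: "integer"
nrow(shows)           # 3
ncol(shows)           # 2 integer columns plus 1 character column = 3

# Columns are vectors, so arithmetic works column-wise
shows$episodes / shows$seasons  # 10.0 12.4 8.0
```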
Due to the popularity of the R programming language in various industries and disciplines, you may utilize R and its data types in many professions to complete your job duties. As mentioned, jobs related to statistical analysis, creating visualizations or graphics from data, machine learning, and computer modeling deal with R and its data types.
Experience with R remains in high demand today. The US Bureau of Labor Statistics projects that jobs for computer and information research scientists, roles that commonly use R, will grow by 23 percent from 2022 to 2032 [1]. Types of jobs that involve using R include:
Data science: Data scientists utilize R to create visualizations and perform in-depth analysis through building statistical models and data mining techniques.
Machine learning: R enables the use of predictive modeling and other algorithms commonly used in machine learning-based roles.
Finance: Jobs within finance leverage R to create powerful visualizations to model risk and performance.
Research: Universities and other institutions perform statistical analysis in R to support their academic research in many different disciplines.
Social media: Many social media roles utilize R and its data types to track analytics related to their customers and advertising campaigns through sentiment analysis.
Manufacturing: Similar to social media roles, jobs in manufacturing involve sentiment analysis and collecting customer data through R.
If you are learning about R data types for the first time, you may find yourself in situations where you want to check a data type or change it. The following tips give you a resource to use when you become stuck.
As mentioned, you may find yourself in a situation where you need to check the data type of your variable. Luckily, R includes two general functions to check a data type: class() and typeof(). For example, if you write a = 12 - 4i and then class(a), R returns complex to tell you that it is the complex data type.
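The two functions often agree, but not always. The short sketch below uses the complex example from the text plus a second, illustrative variable b to show a case where they differ.

```r
a <- 12 - 4i
class(a)   # "complex"
typeof(a)  # "complex"

# For ordinary decimal numbers the two answers differ:
# class() reports the high-level type, typeof() the internal storage type
b <- 3.2
class(b)   # "numeric"
typeof(b)  # "double"
```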
In addition to checking a data type in general, you can ask R whether a value is specifically a number or text through is.numeric() and is.character(). The response from R is either TRUE or FALSE, depending on the data you check. If you define a as a = 3.2 and then ask R is.numeric(a), an output of TRUE appears because 3.2 is the numeric data type.
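A few such checks, sketched with the same value of a, plus two extra cases added here for contrast:

```r
a <- 3.2
is.numeric(a)      # TRUE
is.character(a)    # FALSE

# Quotes turn the same digits into text
is.numeric("3.2")  # FALSE

# Integers also count as numeric
is.numeric(5L)     # TRUE
is.integer(a)      # FALSE
```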
Through the use of as. functions, you can convert an object or variable to a different data type. You may use this option when R reads a vector of numbers as the character data type. For example, if you check the data type of your vector a by writing class(a), R may return character. To fix this and perform calculations with your vector, write a = as.numeric(a). The as.numeric function returns a converted copy rather than changing a in place, so you need to assign the result back to a.
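As a rough sketch of this conversion, assume a vector of numbers that arrived as text (the values are invented for illustration):

```r
# Numbers that were read in as text
a <- c("10", "20.5", "30")
class(a)  # "character"

# as.numeric() returns a new vector, so assign the result back
a <- as.numeric(a)
class(a)  # "numeric"
sum(a)    # 60.5

# Text that is not a number becomes NA (R also emits a warning)
suppressWarnings(as.numeric("ten"))  # NA
```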
The last helpful tip to remember is that R is case-sensitive. When referring to a variable you created, use exactly the same uppercase and lowercase letters as when you defined it. Returning to the complex data type example, you may define a = 12 - 4i and then try to check the data type of a through class(A). In this case, R returns an error stating that the object A was not found, because you defined your variable as a and not A.
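To see the error without stopping a script, one option is to wrap the call in tryCatch(), as in this sketch. It assumes no variable named A exists in your session; the exact wording of the message can vary with your R version and language settings.

```r
a <- 12 - 4i
class(a)  # "complex"

# A (uppercase) was never defined, so class(A) raises an error.
# tryCatch() captures the error message instead of halting.
result <- tryCatch(
  class(A),
  error = function(e) conditionMessage(e)
)
result  # an error message such as "object 'A' not found"
```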
To learn more about R data types and other topics about the R programming language in general, completing a course or receiving a relevant certificate is a great place to start. On Coursera, you can enroll in some of the top courses in the world.
Check out Data Analysis With R Programming by Google. Taught at a beginner level, this course introduces the R programming language and its programming environment. Additional features include learning fundamental concepts of R, such as functions, variables, data types, pipes, and vectors, and building your first visualizations.
Another relevant course worth checking out if you strive to learn intermediate-level content is R Programming by Johns Hopkins University. This course dives more in-depth into critical concepts within the R programming language, including using R profiler, loop functions, debugging tools, and statistical programming software.
1. US Bureau of Labor Statistics. “Computer and Information Research Scientists,” https://www.bls.gov/ooh/computer-and-information-technology/computer-and-information-research-scientists.htm#tab-1. Accessed March 20, 2024.
Editorial Team