How to Bin Continuous Variables Into Categories in R
How to create a column with a binary variable based on a condition on another variable in an R data frame?
Sometimes we need to create an extra variable that adds information to the data we already have. This is common in feature engineering: if we learn that something may affect the response, we want it as a variable in the data, so we derive it from the columns that already exist. For example, we can create a binary variable for goodness that depends on whether the frequency meets a certain criterion.
Example
Consider the following data frame:
set.seed(100)
Group <- rep(c("A", "B", "C", "D", "E"), times = 4)
Frequency <- sample(20:30, 20, replace = TRUE)
df1 <- data.frame(Group, Frequency)
df1
Output
   Group Frequency
1      A        29
2      B        26
3      C        25
4      D        22
5      E        28
6      A        29
7      B        26
8      C        25
9      D        25
10     E        23
11     A        26
12     B        25
13     C        21
14     D        26
15     E        26
16     A        26
17     B        30
18     C        27
19     D        21
20     E        22
Creating a column Category with two levels, Good and Bad, where Good is assigned to rows with Frequency greater than 25 and Bad to all other rows:
Example
df1$Category <- ifelse(df1$Frequency > 25, "Good", "Bad")
df1
Output
   Group Frequency Category
1      A        29     Good
2      B        26     Good
3      C        25      Bad
4      D        22      Bad
5      E        28     Good
6      A        29     Good
7      B        26     Good
8      C        25      Bad
9      D        25      Bad
10     E        23      Bad
11     A        26     Good
12     B        25      Bad
13     C        21      Bad
14     D        26     Good
15     E        26     Good
16     A        26     Good
17     B        30     Good
18     C        27     Good
19     D        21      Bad
20     E        22      Bad
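Note that ifelse() returns a character vector here, not a factor. If later steps need a factor or a 0/1 column, one possible follow-up is sketched below. The Frequency values are typed in from the output above so that the sketch does not depend on the random number generator; the column name IsGood is an assumption made for illustration and is not part of the original example.

```r
# Frequency values copied from the printed data frame above
Frequency <- c(29, 26, 25, 22, 28, 29, 26, 25, 25, 23,
               26, 25, 21, 26, 26, 26, 30, 27, 21, 22)
df1 <- data.frame(Group = rep(c("A", "B", "C", "D", "E"), times = 4),
                  Frequency)

# Same condition as before, stored as a factor with an explicit level order
df1$Category <- factor(ifelse(df1$Frequency > 25, "Good", "Bad"),
                       levels = c("Bad", "Good"))

# Hypothetical 0/1 version of the same condition (1 = Good, 0 = Bad)
df1$IsGood <- as.integer(df1$Frequency > 25)

# Count the rows in each category: 9 Bad and 11 Good for this data
table(df1$Category)
```

Fixing the level order means summaries and plots list Bad before Good rather than depending on alphabetical order.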
Let's have a look at another example:
Example
# No new seed is set here, so the sampled values depend on the current RNG state
Class <- rep(c("Lower", "Middle", "Upper Middle", "Higher"), times = 5)
Ratings <- sample(1:10, 20, replace = TRUE)
df2 <- data.frame(Class, Ratings)
df2
Output
          Class Ratings
1         Lower       3
2        Middle       8
3  Upper Middle       2
4        Higher       9
5         Lower       2
6        Middle       3
7  Upper Middle       4
8        Higher       4
9         Lower       4
10       Middle       5
11 Upper Middle       7
12       Higher       9
13        Lower       4
14       Middle       2
15 Upper Middle       6
16       Higher       7
17        Lower       1
18       Middle       6
19 Upper Middle       9
20       Higher       9
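Creating a column Group with two levels, Royal and Standard, where Royal is assigned to rows with Ratings greater than 5 and Standard to all other rows:
Example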
df2$Group <- ifelse(df2$Ratings > 5, "Royal", "Standard")
df2
Output
          Class Ratings    Group
1         Lower       3 Standard
2        Middle       8    Royal
3  Upper Middle       2 Standard
4        Higher       9    Royal
5         Lower       2 Standard
6        Middle       3 Standard
7  Upper Middle       4 Standard
8        Higher       4 Standard
9         Lower       4 Standard
10       Middle       5 Standard
11 Upper Middle       7    Royal
12       Higher       9    Royal
13        Lower       4 Standard
14       Middle       2 Standard
15 Upper Middle       6    Royal
16       Higher       7    Royal
17        Lower       1 Standard
18       Middle       6    Royal
19 Upper Middle       9    Royal
20       Higher       9    Royal
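When more than two categories are needed, nested ifelse() calls become hard to read, and base R's cut() may be a better fit. The sketch below first reproduces the Royal/Standard split with cut(), then adds a three-level column. The Ratings values are typed in from the output above; the column name Tier, its labels, and the break points 3 and 6 are assumptions chosen only for illustration.

```r
# Ratings values copied from the printed data frame above
Ratings <- c(3, 8, 2, 9, 2, 3, 4, 4, 4, 5,
             7, 9, 4, 2, 6, 7, 1, 6, 9, 9)
df2 <- data.frame(Class = rep(c("Lower", "Middle", "Upper Middle", "Higher"),
                              times = 5),
                  Ratings)

# Two bins: (-Inf, 5] becomes Standard and (5, Inf) becomes Royal,
# which matches ifelse(Ratings > 5, "Royal", "Standard")
df2$Group <- cut(df2$Ratings, breaks = c(-Inf, 5, Inf),
                 labels = c("Standard", "Royal"))

# Three bins with hypothetical labels: (0, 3], (3, 6] and (6, 10]
df2$Tier <- cut(df2$Ratings, breaks = c(0, 3, 6, 10),
                labels = c("Low", "Medium", "High"))

# Count the rows in each tier: 6 Low, 7 Medium and 7 High for this data
table(df2$Tier)
```

By default cut() uses intervals that are closed on the right, so a rating of exactly 5 falls into Standard, the same as with the condition Ratings > 5. Unlike ifelse(), cut() returns a factor.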
Updated on 09-Sep-2020 08:00:43
Source: https://www.tutorialspoint.com/how-to-create-a-column-with-binary-variable-based-on-a-condition-of-other-variable-in-an-r-data-frame