How to Bin Continuous Variables Into Categories in R

How to create a column with a binary variable based on a condition on another variable in an R data frame?


Sometimes we need to create an extra variable that adds information to the data we already have. This is common in feature engineering: if we learn that some condition may affect the response, we can derive a new variable for it from the existing columns. For example, we can apply a condition to a numeric column to create a binary variable, such as labelling each row Good or Bad depending on whether its frequency meets a certain threshold.

Example

Consider the data frame below:

set.seed(100)
Group <- rep(c("A", "B", "C", "D", "E"), times = 4)
Frequency <- sample(20:30, 20, replace = TRUE)
df1 <- data.frame(Group, Frequency)
df1

Output

   Group Frequency
1      A        29
2      B        26
3      C        25
4      D        22
5      E        28
6      A        29
7      B        26
8      C        25
9      D        25
10     E        23
11     A        26
12     B        25
13     C        21
14     D        26
15     E        26
16     A        26
17     B        30
18     C        27
19     D        21
20     E        22

Creating a column Category with two levels, Good and Bad, where Good marks the rows with Frequency greater than 25 and Bad marks all other rows:

Example

df1$Category <- ifelse(df1$Frequency > 25, "Good", "Bad")
df1

Output

   Group Frequency Category
1      A        29     Good
2      B        26     Good
3      C        25      Bad
4      D        22      Bad
5      E        28     Good
6      A        29     Good
7      B        26     Good
8      C        25      Bad
9      D        25      Bad
10     E        23      Bad
11     A        26     Good
12     B        25      Bad
13     C        21      Bad
14     D        26     Good
15     E        26     Good
16     A        26     Good
17     B        30     Good
18     C        27     Good
19     D        21      Bad
20     E        22      Bad
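
The column created by ifelse() is stored as plain character strings. If the variable will feed a model or a summary table, it may be more convenient to keep it as a factor with a fixed level order, or as a 0/1 indicator. The sketch below is only an illustration under assumptions: the data frame name demo and its values are hypothetical and not part of the example above. One Frequency is left missing on purpose to show that ifelse() returns NA for it instead of assigning either level.

```r
# Hypothetical stand-in data; one Frequency is missing on purpose
demo <- data.frame(Group = c("A", "B", "C", "D"),
                   Frequency = c(29, 25, NA, 26))

# Same condition as in the example, stored as a factor with a fixed level order
demo$Category <- factor(ifelse(demo$Frequency > 25, "Good", "Bad"),
                        levels = c("Bad", "Good"))

# 0/1 version of the same condition (TRUE becomes 1, FALSE becomes 0, NA stays NA)
demo$IsGood <- as.integer(demo$Frequency > 25)

demo

# Count each level, listing missing values separately if there are any
table(demo$Category, useNA = "ifany")
```

Whether to keep the labels or the 0/1 form depends on what consumes the column; the labels read better in printed tables, while the indicator can be summed or averaged directly.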

Let's have a look at another example:

Example

Class <- rep(c("Lower", "Middle", "Upper Middle", "Higher"), times = 5)
Ratings <- sample(1:10, 20, replace = TRUE)
df2 <- data.frame(Class, Ratings)
df2

Output

          Class Ratings
1         Lower       3
2        Middle       8
3  Upper Middle       2
4        Higher       9
5         Lower       2
6        Middle       3
7  Upper Middle       4
8        Higher       4
9         Lower       4
10       Middle       5
11 Upper Middle       7
12       Higher       9
13        Lower       4
14       Middle       2
15 Upper Middle       6
16       Higher       7
17        Lower       1
18       Middle       6
19 Upper Middle       9
20       Higher       9

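Creating a column Group with two levels, Royal and Standard, where Royal marks the rows with Ratings greater than 5 and Standard marks all other rows:

Example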

df2$Group <- ifelse(df2$Ratings > 5, "Royal", "Standard")
df2

Output

          Class Ratings    Group
1         Lower       3 Standard
2        Middle       8    Royal
3  Upper Middle       2 Standard
4        Higher       9    Royal
5         Lower       2 Standard
6        Middle       3 Standard
7  Upper Middle       4 Standard
8        Higher       4 Standard
9         Lower       4 Standard
10       Middle       5 Standard
11 Upper Middle       7    Royal
12       Higher       9    Royal
13        Lower       4 Standard
14       Middle       2 Standard
15 Upper Middle       6    Royal
16       Higher       7    Royal
17        Lower       1 Standard
18       Middle       6    Royal
19 Upper Middle       9    Royal
20       Higher       9    Royal
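
ifelse() handles the two-level case. When a numeric column should be split into more than two categories, base R's cut() is one common alternative. The sketch below assumes hypothetical ratings, break points, and labels (Low, Medium, High); none of them come from the example above, and the break points would need to be chosen to suit the actual data.

```r
# Hypothetical ratings on a 1-10 scale
ratings <- c(3, 8, 2, 9, 5, 7, 1, 10)

# Intervals are right-closed by default: (0,3], (3,7], (7,10]
band <- cut(ratings,
            breaks = c(0, 3, 7, 10),
            labels = c("Low", "Medium", "High"))

data.frame(ratings, band)
```

cut() returns a factor, and any value that falls outside the outermost breaks becomes NA, so the breaks should span the full range of the column.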

raja

Updated on 09-Sep-2020 08:00:43

Source: https://www.tutorialspoint.com/how-to-create-a-column-with-binary-variable-based-on-a-condition-of-other-variable-in-an-r-data-frame