Row-wise sort then concatenate across specific columns of data frame

C8H10N4O2

(Related question that does not include sorting. It's easy to just use paste when you don't need to sort.)

I have a less-than-ideally-structured table with generic character columns named "item1", "item2", etc. I would like to create a new character variable that is the alphabetized, comma-separated concatenation of these columns. For example, if row 5 has item1 = "milk", item2 = "eggs", and item3 = "butter", the new variable in row 5 should be "butter, eggs, milk".

I wrote a function f() below that works on two character variables. However, I am having trouble with:

  • Using mapply or other "vectorization" (I know it's really just a for loop)
  • Generalizing the function to an arbitrary number of columns

Any help much appreciated.

df <- data.frame(a = c("foo", "bar"),
                 b = c("baz", "qux"))
paste(df$a, df$b, sep = ", ")
# returns [1] "foo, baz" "bar, qux" ... but I want [1] "baz, foo" "bar, qux"

f <- function(a, b) paste(c(a, b)[order(c(a, b))], collapse = ", ")
f("foo", "baz")
# returns [1] "baz, foo" ... which is what I want ... how to vectorize?

df$new_var <- mapply(f, df$a, df$b)
df 
#     a   b new_var      <- new_var is not what I want
# 1 foo baz    1, 2
# 2 bar qux    1, 2

# Interestingly, data.table is smart enough to fix my bad mapply
library(data.table)
dt <- data.table(a = c("foo", "bar"),
                 b = c("baz", "qux"))
dt[, new_var := mapply(f, a, b)]
dt
#      a   b  new_var    <- new_var IS what I want
# 1: foo baz baz, foo
# 2: bar qux bar, qux
eddi

My first thought would've been to do this:

dt[, new_var := paste(sort(.SD), collapse = ", "), by = 1:nrow(dt)]
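If only some columns should go into the new variable, a hedged variant of the same idea is to restrict .SD with .SDcols and flatten each one-row .SD with unlist() before sorting. This is a minimal sketch: the id column and the item_cols vector are illustrative names that are not part of the question.

```r
library(data.table)

# Example data; `id` is a hypothetical extra column that should NOT be pasted.
dt <- data.table(id = 1:2,
                 a  = c("foo", "bar"),
                 b  = c("baz", "qux"))

# Hypothetical vector naming the columns to sort and concatenate.
item_cols <- c("a", "b")

# One group per row: .SD holds just the item columns for that row,
# unlist() turns it into a character vector, sort() alphabetizes it.
dt[, new_var := paste(sort(unlist(.SD)), collapse = ", "),
   by = seq_len(nrow(dt)), .SDcols = item_cols]
```

Grouping by row number runs the expression once per row, so this can be slow on very large tables.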

But you could make your function work with a couple of simple modifications:

f = function(...) paste(c(...)[order(c(...))], collapse = ", ")

dt[, new_var := do.call(function(...) mapply(f, ...), .SD)]
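For a plain data.frame, here is a minimal base R sketch of the same row-wise sort-and-paste, assuming the columns of interest are picked by name. The names cols and sort_paste are made up for illustration. The "1, 2" result in the question is most likely because data.frame() stored the strings as factors (its default before R 4.0.0), so c(a, b) ended up combining integer codes; apply() first converts the selected columns to a character matrix, so it should behave the same whether the columns are character or factor.

```r
# Character columns (the default since R 4.0.0).
df <- data.frame(a = c("foo", "bar"),
                 b = c("baz", "qux"),
                 stringsAsFactors = FALSE)

# Same data with factor columns, as older R versions created by default.
df_fac <- data.frame(a = c("foo", "bar"),
                     b = c("baz", "qux"),
                     stringsAsFactors = TRUE)

# Hypothetical vector of the columns to combine; extend it to any number.
cols <- c("a", "b")

# Sort one row's values and join them; note that sort() drops NA by default.
sort_paste <- function(x) paste(sort(x), collapse = ", ")

# apply() over rows (MARGIN = 1) of the selected columns.
df$new_var     <- apply(df[, cols, drop = FALSE], 1, sort_paste)
df_fac$new_var <- apply(df_fac[, cols, drop = FALSE], 1, sort_paste)
```

Because sort() silently removes NA values, rows with missing items get a shorter string rather than a literal "NA" in the result.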
