Welcome to Vigges Developer Community - Open, Learning, Share
Welcome to ask or share your answers for others


r - How to address several datasets in a loop to perform a binary classification using a threshold value?

I'm relatively new to the R community (using RStudio) and am working on a study measuring human gait. I'd like to set a threshold that tells whether a foot is on the floor (values > 150) or oscillating (values < 151). When a foot is on the ground, the value should be changed to 1; otherwise, I want to set it to 0. This should be done for four of the five attributes, pd, pi, td and ti (all integers, as you can see in the code), in each dataset: k1, k2, k3, k4.

I'd like to automate the process of classifying each measurement dataset. The classification itself works pretty well (the second for in the code below). The problem arises when trying to read each of the datasets (the first for in the code below). Specifically, my intention with k[j] is to switch between the datasets k1, k2, k3 and k4. I think that k = c(k1, k2, k3, k4) is not the appropriate way of addressing the different datasets, as k ends up as a "List of 20", and k1, k2, k3 and k4 each have a different length (I cannot combine them into one data frame because of the incompatible lengths).

Of course, one solution would be to avoid k = c(k1,k2,k3,k4); for(j in 1:4){n = length(k[j]$Tiempo)} and copy-paste the rest of the code for each dataset. But I assume there are more efficient ways of doing it!
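The "List of 20" described above can be reproduced with toy data. The sketch below uses two small made-up data frames (their values are hypothetical stand-ins, not the real measurements) to show that c() concatenates data frames column by column into one flat list, whereas list() keeps each dataset whole so that it can be addressed with double brackets. This would also explain why k[j]$ offers only a single attribute: k[j] is a one-column sub-list.

```r
# Toy stand-ins for k1 and k2 (hypothetical values, different row counts).
k1 <- data.frame(Tiempo = c("a", "b", "c"), pd = c(100L, 200L, 300L),
                 pi = c(10L, 20L, 30L), td = c(151L, 150L, 0L),
                 ti = c(900L, 900L, 900L))
k2 <- data.frame(Tiempo = c("a", "b"), pd = c(1L, 2L),
                 pi = c(3L, 4L), td = c(5L, 6L), ti = c(7L, 8L))

# c() drops the data-frame structure and returns a plain list of columns.
flat <- c(k1, k2)
length(flat)      # 5 columns x 2 datasets; four datasets would give 20
names(flat[1])    # single brackets return a one-element sub-list
flat[1]$td        # NULL: that sub-list only holds the Tiempo column

# list() keeps each data frame intact.
datasets <- list(k1 = k1, k2 = k2)
length(datasets)           # one element per dataset
class(datasets[[1]])       # double brackets extract the data frame itself
nrow(datasets[["k2"]])     # each dataset keeps its own length
```

With a list like this, a loop can run over seq_along(datasets) and use datasets[[j]] wherever the original code used k[j].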

Here is the code used to read each dataset (so that you can see the nature of its attributes)...

library(readr)  # provides read_delim(), cols(), col_*() and locale()

k1 <- read_delim("Csv/kike1.csv",
                 ";", escape_double = FALSE,
                 col_types = cols(Tiempo = col_character(),
                                  pd = col_integer(),
                                  pi = col_integer(),
                                  td = col_integer(),
                                  ti = col_integer()),
                 locale = locale(encoding = "WINDOWS-1252"),
                 trim_ws = TRUE)
# There are four of these, one for each measurement.

...and the code for the loops:

k = c(k1,k2,k3,k4)
for(j in 1:4){
  n = length(k[j]$Tiempo)
  for(i in 1:n){
    if (k[j]$td[i] < 151){
      k[j]$td[i] = 0
    }else{
      k[j]$td[i] = 1
    }
    if (k[j]$pd[i] < 151){
      k[j]$pd[i] = 0
    }else{
      k[j]$pd[i] = 1
    }
    if (k[j]$ti[i] < 151){
      k[j]$ti[i] = 0
    }else{
      k[j]$ti[i] = 1
    }
    if (k[j]$pi[i] < 151){
      k[j]$pi[i] = 0
    }else{
      k[j]$pi[i] = 1
    }
  }

PS: When I type k[j]$..., the autocomplete popup that lets me select an attribute (it appears after the $ sign, where I've written "...") shows only one attribute, and that attribute is a different one on each of the lines where k[j]$... appears.

Finally I would like to plot it, but that's sorted out as you can see below. Suggestions are very welcome!

  ggplot(k[j])+
    ggtitle("Muestra 1") +
    geom_path(mapping = aes(Tiempo, td, colour = "Talón Derecho", group = 5))+
    geom_path(mapping = aes(Tiempo, pd, colour = "Puntera Derecha", group = 5)) +
    geom_path(mapping = aes(Tiempo, ti, colour = "Talón Izquierdo", group = 5)) +
    geom_path(mapping = aes(Tiempo, pi, colour = "Puntera Izquierda", group = 5))
}
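One possible way to avoid both the nested loops and the copy-pasting is to keep the datasets in a list and threshold whole columns at once. The sketch below is only an illustration under assumptions: classify_contact is a hypothetical helper name, the two toy data frames stand in for k1 to k4, and the threshold of 150 is taken from the description above.

```r
# Hypothetical helper: returns the data frame with the chosen columns
# recoded to 1 (reading above the threshold, foot on the floor) or 0.
classify_contact <- function(df, cols = c("pd", "pi", "td", "ti"),
                             threshold = 150L) {
  # The comparison is vectorised, so no row-by-row loop is needed.
  df[cols] <- lapply(df[cols], function(x) as.integer(x > threshold))
  df
}

# Toy datasets of different lengths (made-up values, not real measurements).
datasets <- list(
  k1 = data.frame(Tiempo = c("t1", "t2", "t3"), pd = c(100L, 200L, 300L),
                  pi = c(10L, 20L, 30L), td = c(151L, 150L, 0L),
                  ti = c(900L, 900L, 900L)),
  k2 = data.frame(Tiempo = c("t1", "t2"), pd = c(1L, 999L),
                  pi = c(151L, 4L), td = c(5L, 6L), ti = c(700L, 8L))
)

# Apply the same classification to every dataset; lengths may differ.
classified <- lapply(datasets, classify_contact)
classified$k1
classified$k2
```

The same list can then drive the plotting: looping over names(classified) and passing classified[[name]] to ggplot() should work, keeping in mind that a ggplot object created inside a for loop has to be wrapped in print() to be drawn.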

I hope my first question has not already been answered elsewhere (I've spent quite a while trying to adapt answers to similar questions, without success so far) and that I have been clear enough :-).

UPDATE

As requested, here is the sample!

>dput(head(k1))
structure(list(Tiempo = c("13:10:37.927", "13:10:37.927", "13:10:37.927", 
"13:10:37.927", "13:10:37.927", "13:10:37.927"), td = c(46L, 
903L, 903L, 903L, 903L, 904L), pd = c(256L, 457L, 458L, 457L, 
455L, 454L), ti = c(954L, 954L, 954L, 954L, 954L, 954L), pi = c(895L, 
895L, 895L, 895L, 895L, 895L)), spec = structure(list(cols = list(
    Tiempo = structure(list(), class = c("collector_character", 
    "collector")), td = structure(list(), class = c("collector_integer", 
    "collector")), pd = structure(list(), class = c("collector_integer", 
    "collector")), ti = structure(list(), class = c("collector_integer", 
    "collector")), pi = structure(list(), class = c("collector_integer", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"), row.names = c(NA, 
6L), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"))
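As a quick check of the intended rule on the six sample rows above, the sketch below rebuilds them as a plain data frame (dropping the readr spec attribute, which is an assumption made for brevity) and applies the > 150 threshold with ifelse(). Only the first td reading (46) falls below the threshold.

```r
# The six sample rows from the dput() output, as a plain data frame.
sample_k1 <- data.frame(
  Tiempo = rep("13:10:37.927", 6),
  td = c(46L, 903L, 903L, 903L, 903L, 904L),
  pd = c(256L, 457L, 458L, 457L, 455L, 454L),
  ti = rep(954L, 6),
  pi = rep(895L, 6)
)

sensor_cols <- c("td", "pd", "ti", "pi")

# 1 = foot on the floor (reading > 150), 0 = oscillating.
binary_k1 <- sample_k1
binary_k1[sensor_cols] <- lapply(sample_k1[sensor_cols],
                                 function(x) ifelse(x > 150L, 1L, 0L))

binary_k1$td                      # 0 1 1 1 1 1
colSums(binary_k1[sensor_cols])   # td = 5, pd = 6, ti = 6, pi = 6
```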

Btw, happy new year!



