How to count observations per ID in R?

Question

I have a large amount of data in which I have to count measurements per ID. So far I have created a data frame from all the files and omitted the NAs; this part works properly. I wondered whether nrow() is the right function for this, but it returns a single number for the whole data frame, so it does not get me there.

What I am looking for: given entries like these, where the last column is the ID:

1155 2010-05-02  2.7200    1
1156 2010-05-05  2.6000    3
1157 2010-05-08  2.6700    1
1158 2010-05-11  3.5700    2

I would like to get a list like this:

ID          Number of observations
1           2
2           1
3           1

Just use table(Data$ID) or as.data.frame(table(Data$ID)) if you want a data.frame back. — David Arenburg, Aug 13 '15 at 06:12
I think this question should be better posted on other general programming spaces like Stack Overflow — German C M, May 20 '20 at 10:07

score 5 · Answer 1 · edited Aug 27 '15 at 07:42

Using the data.table structure (see the wiki),

library(data.table)
D <- data.table(x = c(1155, 1156, 1157, 1158),
                date = as.Date(c("2010-05-02", "2010-05-05", "2010-05-08", "2010-05-11")),
                y = c(2.7200, 2.6000, 2.6700, 3.5700),
                id = c(1, 3, 1, 2))
counts <- D[, .(rowCount = .N), by = id]
counts

This will return

counts
##    id rowCount
## 1:  1        2
## 2:  3        1
## 3:  2        1
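
If the result should come back ordered by ID, as in the list the question asks for, one option is keyby instead of by. A minimal sketch, assuming the same example table (rebuilt here so the block runs on its own); the name sorted_counts is only illustrative:

```r
library(data.table)

# Rebuild the example table so this block is self-contained
D <- data.table(x = c(1155, 1156, 1157, 1158),
                date = as.Date(c("2010-05-02", "2010-05-05", "2010-05-08", "2010-05-11")),
                y = c(2.72, 2.60, 2.67, 3.57),
                id = c(1, 3, 1, 2))

# keyby groups like by, but also sorts the result by the grouping column
sorted_counts <- D[, .(rowCount = .N), keyby = id]
sorted_counts
```

With by, groups appear in order of first occurrence (1, 3, 2); with keyby they are sorted (1, 2, 3).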

score 3 · Answer 2 · answered Aug 18 '15 at 10:11

Another way is simply to use the table() function.

ids<-c(1,3,1,2)
counts<-data.frame(table(ids))
counts
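
Applied to data shaped like the question's, the same idea might look like the sketch below. The data frame Data and its column names are assumptions modelled on the sample rows; table() skips NA values by default, and the column names are chosen to resemble the requested output.

```r
# Hypothetical data frame modelled on the sample rows in the question
Data <- data.frame(row = c(1155, 1156, 1157, 1158),
                   date = as.Date(c("2010-05-02", "2010-05-05", "2010-05-08", "2010-05-11")),
                   value = c(2.72, 2.60, 2.67, 3.57),
                   ID = c(1, 3, 1, 2))

# table() counts each distinct ID; as.data.frame() turns the counts into columns.
# Note that the ID column of the result is a factor, not a number.
counts <- as.data.frame(table(ID = Data$ID), responseName = "observations")
counts
```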

answered by jason.p.pickering

score 1 · Answer 3 · answered Aug 12 '15 at 19:58

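OK, if I understood correctly, you can do something like:

# Add a column of ones; summing it per ID gives the number of observations
df$observations <- 1
# 'by' must be a list; aggregating only the new column avoids summing dates or file names
new_data <- aggregate(df["observations"], by = list(ID = df$ID), FUN = sum)

Caution: this might not work exactly, since I am not sure what your data frame looks like; the column name ID is assumed.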
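
A self-contained variant of the aggregate() idea, as a sketch. The data frame df and its column names are assumed from the sample rows in the question. The formula interface with FUN = length counts rows per ID without a helper column, and by default it drops rows containing NA.

```r
# Hypothetical data frame with a measurement column and an ID column
df <- data.frame(value = c(2.72, 2.60, 2.67, 3.57),
                 ID = c(1, 3, 1, 2))

# Count the number of 'value' entries within each ID
obs_per_id <- aggregate(value ~ ID, data = df, FUN = length)

# Rename the counted column so the result reads like the requested list
names(obs_per_id)[2] <- "observations"
obs_per_id
```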

score 1 · Answer 4 · answered Aug 13 '15 at 04:21

aggregate() should work, as the previous answer suggests. Another option is with the plyr package:

library(plyr)
count(yourDF, c('id'))

Using more columns in the vector with 'id' will subdivide the count.

ddply() (also part of plyr) can do this as well when summarise is passed as its function, similar to aggregate().
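
A rough sketch of both plyr variants, assuming plyr is installed. The data frame contents and the second grouping column site are invented for illustration only.

```r
library(plyr)

# Hypothetical data: 'site' is an invented second grouping column
yourDF <- data.frame(id = c(1, 3, 1, 2, 1),
                     site = c("a", "a", "b", "a", "a"))

# One row per id, with the count in a column named 'freq'
per_id <- count(yourDF, "id")

# Adding a second column subdivides the count by id and site
per_id_site <- count(yourDF, c("id", "site"))

# A comparable per-id count with ddply() and summarise
per_id_ddply <- ddply(yourDF, "id", summarise, n = length(id))
```

If dplyr is loaded in the same session, its count() and summarise() mask the plyr versions, so calling plyr::count() explicitly may be safer.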

score 1 · Answer 5 · answered Aug 18 '15 at 19:21

This is similar to Jeremy's but using dplyr:

library(dplyr)
mytable <-
"a    date        b         id
 1155 2010-05-02  2.7200    1
 1156 2010-05-05  2.6000    3
 1157 2010-05-08  2.6700    1
 1158 2010-05-11  3.5700    2"

mytable <- read.delim(textConnection(mytable), header=TRUE,  sep="")
mytable %>% count(id)

score 0 · Answer 6 · edited May 20 '20 at 13:08

The base function rle() also works if you don't want to install dplyr. It counts runs of consecutive equal values, so sort the IDs first:

runs <- rle(sort(mytable$id))
data.frame(id = runs$values, n = runs$lengths)
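
To see why the order of the IDs matters for rle(), compare its output on unsorted and sorted input. A short base R sketch using the id values from the question; the variable names are illustrative.

```r
ids <- c(1, 3, 1, 2)

# Unsorted: four runs of length 1, because the two 1s are not adjacent
rle(ids)$lengths

# Sorted: equal IDs become adjacent, so run lengths equal counts per ID
runs <- rle(sort(ids))
per_id <- data.frame(id = runs$values, n = runs$lengths)
per_id
```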

answered May 20 '20 at 09:52 by user97523 · edited by Stephen Rauch
