Friday, 3 September 2010

SQL-OLAP in R

How to emulate SQL OLAP (window) functions in R:

d.frm <- data.frame( x=rep(1:4,3), g=gl(4,3,labels=letters[1:4]) )

# SQL-OLAP: sum(x) over (partition by g)
# (several grouping variables are simply listed one after another,
#  e.g. ave(x, g1, g2, g3, FUN=...); FUN must be passed by name):
d.frm$sum_g <- ave( d.frm$x, d.frm$g, FUN=sum )


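# same with rank (decreasing), i.e. rank() over (partition by g order by x desc);
# note that rank() averages tied ranks by default (use ties.method="min" for SQL-style ranks):
d.frm$rank_g <- ave( -d.frm$x, d.frm$g, FUN=rank )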
d.frm
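The comment above on several grouping variables can be sketched as follows. This is only a minimal illustration: the data frame d and its columns g1 and g2 are made up for the example and do not come from the data above.

```r
# Hypothetical example data: g1 and g2 are made-up grouping columns
d <- data.frame(
  x  = 1:8,
  g1 = rep(c("a", "b"), each = 4),
  g2 = rep(c("u", "v"), times = 4)
)

# SQL-OLAP: sum(x) over (partition by g1, g2)
# The grouping vectors are passed positionally, FUN has to be named.
d$sum_g1g2 <- ave(d$x, d$g1, d$g2, FUN = sum)
d
```

Each row receives the sum of its own (g1, g2) combination, so the result has as many rows as the input.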


# get some more data
d.frm <- data.frame(
  id=c("p1","p1","p2","p2","p2","p3","p2","p3","p1","p1","p2"),
  A=c(0,1,1,1,0,0,0,0,0,0,0),
  B=c(1,0,0,0,0,0,0,0,0,0,0),
  C=c(0,0,0,0,1,1,1,0,1,1,1)
)

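# get row number by group, based on the original row order
# (row_number() over (partition by id)); seq_along simply counts within each group
d.frm$rownr <- ave( seq_len(nrow(d.frm)), d.frm$id, FUN=seq_along )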
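The same pattern should carry over to other functions that return one value per row, for example a running total. The following is a small sketch under that assumption; the data frame d and the column name cum_C are chosen for illustration only.

```r
# Small made-up data set, kept in its original row order
d <- data.frame(
  id = c("p1", "p1", "p2", "p2", "p2", "p1"),
  C  = c(1, 0, 1, 1, 0, 1)
)

# row_number() over (partition by id), in original row order
d$rownr <- ave(seq_len(nrow(d)), d$id, FUN = seq_along)

# running total per group, roughly
# sum(C) over (partition by id rows between unbounded preceding and current row)
d$cum_C <- ave(d$C, d$id, FUN = cumsum)
d
```

Because ave() writes the results back into the original positions, the rows of a group do not have to be adjacent.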

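# groupwise aggregation over more than one column (one row per id);
# the columns are selected by name, so the result does not depend on column positions.
# The arguments after tapply are passed on positionally as INDEX and FUN.
d.frmby <- data.frame( lapply( d.frm[, c("A","B","C")], tapply, d.frm$id, "max", na.rm=TRUE ))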
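As an alternative to the lapply/tapply combination, base R's aggregate() can do the same kind of group-by. This is a sketch on a small made-up data frame d, not a replacement for the code above.

```r
# Made-up example data
d <- data.frame(
  id = c("p1", "p1", "p2", "p2", "p3"),
  A  = c(0, 1, 1, 0, 0),
  B  = c(1, 0, 0, 0, 0)
)

# SQL: select id, max(A), max(B) from d group by id
d.by <- aggregate(d[, c("A", "B")], by = list(id = d$id), FUN = max, na.rm = TRUE)
d.by
```

Here the group key comes back as an ordinary column named id, which is convenient if the result is to be merged with other tables.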

# (see also 'Split - Apply - Combine' post)
