Dienstag, 22. November 2011

Recode a factor

Recoding a factor seems not to be a frequent activity of Ripley & Co. as it is astonishingly laboriously (because of the "else" part of it...).
The shortest trick I found so far is:

x <- factor(sample(letters, 20))
y <- x

levels(y) <- list(
  "good"=c("a","b","c"),
  "bad"=c("d","e","f"),
  "ugly"=c("h","m"),
  "else"=setdiff( levels(y), c("a","b","c","d","e","f","h","m") )
)
data.frame(x,y)

Anyone out there a better idea?

Mittwoch, 19. Oktober 2011

Evaluation of text

Just to remember (once for all):
eval( parse( text="a <- c(1,2,3)" ) )

Montag, 19. September 2011

Mengen-Vergleiche

Interessanten Mengenvergleichs-Funktionen:

(A <- c(sort(sample(1:20, 9)),NA))

[1] 1 2 3 12 16 17 18 19 20 NA
(B <- c(sort(sample(3:23, 7)),NA))
[1] 8 10 15 16 18 21 22 NA


# alle Elemente aus A und B (KEINE Duplikate)
union(A, B)
[1] 1 2 3 12 16 17 18 19 20 NA 8 10 15 21 22

# alle Elemente die in A und in B vorkommen (Schnittmenge)
intersect(A, B)
[1] 16 18 NA

# Elemente in A, aber nicht in B vorkommen
setdiff(A, B)
[1] 1 2 3 12 17 19 20

# Elemente in B, aber nicht in A vorkommen
setdiff(B, A)
[1] 8 10 15 21 22


# Enthält A die gleichen Elemente wie B?
setequal(A, B)
[1] FALSE

Donnerstag, 15. September 2011

Function call with a list of arguments

Use do.call to call a function with a previously built list of arguments:

# Define the arguments

args.legend <- list(
  x = "topleft"
  , legend = c(1,2)
  , fill = c("red","blue")
  , xjust = 1, yjust = 1)

plot(1:10, 1:10)
# call function with the arguments' list
do.call("legend", args.legend)

Get information about invisible built-in functions

How can we see the code of invisible functions, i.e. mosaicplot? 

> mosaicplot

yields only:

function (x, ...)
UseMethod("mosaicplot")


but the functions "methods" and "getAnywhere" do the trick:

> methods(mosaicplot)

[1] mosaicplot.default* mosaicplot.formula*

Non-visible functions are asterisked

> getAnywhere("mosaicplot.formula")
A single object matching ‘mosaicplot.formula’ was found
It was found in the following places
registered S3 method for mosaicplot from namespace graphics
namespace:graphics
with value

function (formula, data = NULL, ..., main = deparse(substitute(data)),
subset, na.action = stats::na.omit)
{
main
m <- match.call(expand.dots = FALSE)
.....

Donnerstag, 8. September 2011

Layout-Muster

Split screens:

m <- matrix(0:3, 2, 2)
layout(m, c(1, 3), c(1, 3))
layout.show(3)


m <- matrix(1:4, 2, 2)
layout(m, widths=c(1, 3), heights=c(3, 1))
layout.show(4)

Date formats and functions

Date formats:


Remember the functions:

d <- as.Date(c("1937-01-10","1916-03-02","1913-09-19","1927-12-23","1947-07-28"))

quarters(d)
[1] "Q1" "Q1" "Q3" "Q4" "Q3"


# ... or as function
quarter <- function (x) {
  # Berechnet das Quartal eines Datums
  y <- as.numeric( format( x, "%Y") )
  paste(y, "Q", (as.POSIXlt(x)$mon)%/%3 + 1, sep = "")
}

# ... or alternatively with cut
cut(d, breaks="quarters")
[1] 1937-01-01 1916-01-01 1913-07-01 1927-10-01 1947-07-01
137 Levels: 1913-07-01 1913-10-01 1914-01-01 1914-04-01 ... 1947-07-01


months(d)
[1] "Januar" "März" "September" "Dezember" "Juli"

format(d,"%B") # months alternative
[1] "Januar" "März" "September" "Dezember" "Juli"

format(d,"%Y") # years
[1] "1937" "1916" "1913" "1927" "1947"

Montag, 4. Juli 2011

Legend variations



par(mar=c(5.1,4.1,4.1,11.1))
plot( x=1:5, y=1:5, type="n", xlab="x", ylab="y" )

legend( x=2, y=6, legend=c("A","B","C")
  , fill=c("red","blue","green")
  , density=30, bty="n", horiz=TRUE
  , xpd=TRUE )

legend( x=2, y=2, xjust=0.5, yjust=0
  , title=" My title:", title.col="grey40", title.adj=0
  , legend=c("A","B","C","D","E")
  , pch=c(22,22,22,45,45), pt.cex=c(1.2,1.2,1.2,2,2)
  , col=c(rep("black",3),"orange","red")
  , pt.bg=c("blue","green","yellow")
  , bg="grey95", cex=0.8
  , box.col="darkgrey", box.lwd=3, box.lty="dotted" )

legend("topright", inset=0.05, cex=0.8, bg="white"
  , legend=c("A-1","A-2","B-1", "B-2")
  , col=c("lightblue","blue","salmon","red"), pch=15, pt.cex=1.5
  , y.intersp=1.5, x.intersp=1.5 , ncol=2 )

windowsFonts("sans2"="Arial Black")
usr <- par(font=4, family="sans2" )
legend( x=5.5, y=3, legend=c("Label A","Label B","Label C")
  , fill=c("red","orange","yellow")
  , border="brown"
  , y.intersp=2, text.width=strwidth("Make larger")
  , text.col=c("red","orange","yellow")
  , xpd=TRUE )
par(usr)

legend( x="bottomleft", inset=0.02, legend=c("A","B","C","D")
  , lty=c("dashed","dotted",NA,"solid"), lwd=2, cex=0.8
  , pch=c(NA,NA,21,15)
  , col=c("red","blue","black","grey"), bg="white" )

Mittwoch, 9. März 2011

merge

A simple merge example:

a <- data.frame(
  "id"=c(1,2,3,6,7,8),
  "name"=c("Anna","Berta","Claudia","Dora","Eliane","Frida"))
b <- data.frame(

  "id"=c(1,2,3,4,5,9),
  "ort"=c("Zürich","Davos","Zermatt","Chamonix","Verbier","Sedrun"))
c.inner <- merge(a,b)
c.left <- merge(a,b, all.x=TRUE)
c.right <- merge(a,b, all.y=TRUE)
c.full <-  merge(a,b, all.x=TRUE, all.y=TRUE)

Freitag, 25. Februar 2011

Split - Apply - Combine

Another effort to make groupwise operations with a split - apply - combine solution (maybe we'll need it someday):

d.frm <- data.frame(
    name=c("Max","Max","Max","Max","Max","Moritz","Moritz","Moritz")
  , typ=c("rot","blau","grün","blau","grün","rot","rot","blau")
  , anz=c(5,4,5,8,3,2,9,1) )


# Split to list
groups <- split(d.frm, list(d.frm$name, d.frm$typ))

# Create result vector
results <- vector("list", length(groups))
# Apply
for(i in seq_along(groups)) {
  groups[[i]] <- transform(groups[[i]],
    rank = rank(-anz, ties.method = "first"))
  results[[i]] <- groups[[i]]
}


# Combine
result <- do.call("rbind", results)
result

     name  typ anz rank
2     Max blau   4    2
4     Max blau   8    1
8  Moritz blau   1    1
3     Max grün   5    1
5     Max grün   3    2
6     Max  rot   5    1
61 Moritz  rot   2    2
7  Moritz  rot   9    1


We might want to try here the elegant function ave as well:

d.frm$rank_g <- ave( -d.frm$anz, d.frm$name, d.frm$typ,
  FUN=function(x) rank(x, ties.method="first") )


... and not to forget the specially R-flavour:

split(x, g) <- lapply(split(x, g), FUN)