We are looking for the dummy coding of a factor:
d.frm <- data.frame(
name=c("Max","Max","Max","Max","Max","Moritz","Moritz","Moritz")
, typ=c("rot","blau","grün","blau","grün","rot","rot","blau")
, anz=c(5,4,5,8,3,2,9,1)
)
d.frm
    name  typ anz
1    Max  rot   5
2    Max blau   4
3    Max grün   5
4    Max blau   8
5    Max grün   3
6 Moritz  rot   2
7 Moritz  rot   9
8 Moritz blau   1
The desired result is produced by
model.matrix(~d.frm$typ)[,-1]
  d.frm$typgrün d.frm$typrot
1             0            1
2             0            0
3             1            0
4             0            0
5             1            0
6             0            1
7             0            1
8             0            0
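The call above drops the intercept column, which leaves one indicator per level except the reference level (here "blau"). As a minimal sketch of two variants: passing the data frame via the data argument gives shorter column names, and removing the intercept from the formula gives one indicator column per level. The names mm and full are only illustrative.

```r
d.frm <- data.frame(
  typ = c("rot", "blau", "grün", "blau", "grün", "rot", "rot", "blau")
)

# Treatment contrasts: the first level is the reference level,
# so only the intercept column has to be dropped.
mm <- model.matrix(~ typ, data = d.frm)[, -1]

# Without an intercept in the formula, every level gets its own column.
full <- model.matrix(~ typ - 1, data = d.frm)

colnames(mm)
colnames(full)
```

With the data argument the columns are named after the variable and the level (typgrün, typrot) rather than after the expression d.frm$typ.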
Alternatively, use the function class.ind from the nnet package:
library(nnet)
class.ind(d.frm$typ)
     blau grün rot
[1,]    0    0   1
[2,]    1    0    0
[3,]    0    1    0
[4,]    1    0    0
[5,]    0    1    0
[6,]    0    0    1
[7,]    0    0    1
[8,]    1    0    0
Still another brilliant Ripley solution:
ff <- factor(sample(letters[1:5], 25, replace=TRUE))
diag(nlevels(ff))[ff,]
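Why this works: diag(k) is the k x k identity matrix, and indexing its rows by a factor uses the factor's integer codes, so each observation picks the unit row belonging to its level. A small sketch that spells this out and adds column names; the conversion with as.integer and the name m are additions for illustration, not part of the original one-liner.

```r
ff <- factor(c("rot", "blau", "grün", "blau", "grün", "rot", "rot", "blau"))

# Row i of the identity matrix is the indicator vector of level i.
# as.integer(ff) makes the use of the level codes explicit.
m <- diag(nlevels(ff))[as.integer(ff), ]

# The result has no dimnames, so label the columns with the levels.
colnames(m) <- levels(ff)
m
```

Unlike the model.matrix call with the first column dropped, this keeps a column for every level, as class.ind does.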
Monday, 5 October 2009