Monday, 24 November 2014

p-values uniformly distributed

Are p-values really uniformly distributed when the null hypothesis is true? They are, as this small simulation shows:


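library(DescTools)

# population of 10000 standard normal values
gg <- rnorm(10000)
n <- 100
# draw n samples of size n (one per column), all from the same population
m <- sapply(1:n, function(x) sample(1:10000, size = n))
m[] <- gg[m]

# long format: one row per value, grp identifies the sample
d.frm <- data.frame(x=as.vector(m), grp=factor(rep(1:n, each=n)))
# unadjusted p-values of all pairwise t-tests between the samples
p <- na.omit(as.vector(pairwise.t.test(x = d.frm$x, g = d.frm$grp,
                                       p.adjust.method = "none", pool.sd = FALSE)$p.value))
# describe the distribution of the p-values
Desc(p)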



See: https://www.youtube.com/watch?v=5OL1RqHrZQ8
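The pairwise tests above reuse every group in 99 comparisons, so the resulting p-values are not independent of each other. As a cross-check, here is a minimal base R sketch that draws fresh, independent samples for every test. The names n_tests and p_ind, the sample size of 50 and the seed are assumptions chosen for illustration only.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n_tests <- 2000    # number of simulated tests (assumed value)

# each replicate compares two independent N(0, 1) samples, so H0 is true
p_ind <- replicate(n_tests, t.test(rnorm(50), rnorm(50))$p.value)

# under H0 roughly 5% of the p-values should fall below 0.05
mean(p_ind < 0.05)

# ten equal-width bins should hold roughly equal counts
table(cut(p_ind, breaks = seq(0, 1, by = 0.1)))

# Kolmogorov-Smirnov test against the uniform distribution on [0, 1]
ks.test(p_ind, "punif")
```

A histogram via hist(p_ind) gives the same picture graphically.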



Wednesday, 2 April 2014

Unique List, but with order

How to collapse runs of repeated values into single elements while keeping their original order:

(x <- structure(c(3L, 3L, 3L, 3L, 1L,  2L, 2L,
            2L, 2L, 3L, 2L, 1L, 1L, 2L, 2L,
            2L, 2L, 2L, 2L, 2L, 2L, 2L),
            .Label = c("B", "C", "D"), class = "factor"))

[1] D D D D B C C C C D C B B C C C C C C C C C
Levels: B C D


> x[c(TRUE, diff(as.numeric(x))!=0)]
[1] D B C D C B C
Levels: B C D
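The same result can probably be reached with rle(), which additionally reports how long each run was. This is only a sketch of an alternative, not the approach used above. The vector is rebuilt here with rep() so that the snippet runs on its own, and the names r and collapsed are made up for the example. Note that rle() does not accept a factor directly, hence the conversion to character.

```r
# rebuild the factor from the example above
x <- factor(rep(c("D", "B", "C", "D", "C", "B", "C"),
                times = c(4, 1, 4, 1, 1, 2, 9)),
            levels = c("B", "C", "D"))

# run-length encoding works on atomic vectors, so convert the factor first
r <- rle(as.character(x))

# one element per run, turned back into a factor with the original levels
collapsed <- factor(r$values, levels = levels(x))
collapsed

# length of each run, in the same order
r$lengths
```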
 

Monday, 27 January 2014

Elegant Summary

A really elegant solution for combining several summary statistics per group in one table:

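library(plyr)
# bin mpg into classes of width 5
mtcars$mpg_g <- cut(mtcars$mpg, breaks=seq(from=10, to=35, by=5))
# one row per cyl/am combination; anz is the number of cars in the group
d.res <- ddply(mtcars, c("cyl", "am"), summarize,
           anz = length(cyl),
           min.hp=min(hp), med.hp=median(hp), mean.hp=round(mean(hp),2), max.hp=max(hp),
           # frequency table of the mpg classes, stored as a one-row matrix
           mpg_g=matrix(table(mpg_g), nrow=1))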

d.res

Credits to Markus Naepflin (2014) for that.
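For readers who would rather avoid plyr, which has since been retired by its maintainers, a comparable table can be sketched in base R with split(), lapply() and rbind(). This is an assumption-laden rewrite and not the credited solution: the names d, groups, rows and d.base are invented for the example, and the class counts end up as ordinary columns instead of a matrix column.

```r
d <- mtcars
# bin mpg into classes of width 5, as in the plyr version
d$mpg_g <- cut(d$mpg, breaks = seq(from = 10, to = 35, by = 5))

# one data frame per cyl/am combination, empty combinations dropped
groups <- split(d, list(d$cyl, d$am), drop = TRUE)

rows <- lapply(groups, function(g) {
  counts <- table(g$mpg_g)
  data.frame(cyl = g$cyl[1], am = g$am[1], anz = nrow(g),
             min.hp = min(g$hp), med.hp = median(g$hp),
             mean.hp = round(mean(g$hp), 2), max.hp = max(g$hp),
             # class counts as a one-row matrix, one column per mpg class
             matrix(counts, nrow = 1, dimnames = list(NULL, names(counts))),
             check.names = FALSE)
})

# stack the per-group rows into one summary table
d.base <- do.call(rbind, rows)
d.base
```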

Tuesday, 21 January 2014

DescTools 0.99.6 released

The first version of DescTools can be downloaded from CRAN, so it is now also available as a Mac build:
http://cran.r-project.org/web/packages/DescTools/index.html

DescTools contains a bunch of basic statistical functions and convenience wrappers for efficiently describing data, creating specific plots, and writing reports with MS Word, Excel or PowerPoint. The package is intended as a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries and reporting the results. Many of the included functions can be found scattered across other packages and other sources, written partly by Titans of R. The main reason for collecting them here was to have them consolidated in ONE package instead of dozens (which might themselves depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. The Google style guides were used as naming rules (in the absence of convincing alternatives). The 'camel style' was consistently applied to functions borrowed from contributed R packages as well.