# Beginner: sample size, p-values (and how they relate), and data visualization with ggplot and the t test

We know that p-values (in the context of the t test, for example) are very sensitive to sample size. With everything else held constant, a larger sample produces a smaller p-value. Cohen's d effect size, on the other hand, stays the same. I was inspired by this code here, but I changed some parts so that the difference between the means is constant, instead of drawing a random variable from a normal distribution.
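To make the relationship explicit, here is a sketch of the arithmetic behind my code, assuming both groups have the same size. The symbols are my own notation ($n$ is the number of participants per group, $\bar{x}$ and $s$ are the fixed means and standard deviations of the control ($c$) and treated ($t$) groups), and the Cohen's d shown is the equal-$n$ pooled-SD version, which is an assumption on my part:

```latex
% t statistic used in the loop, with equal group size n
t = \frac{\bar{x}_c - \bar{x}_t}{\sqrt{\dfrac{s_c^2}{n} + \dfrac{s_t^2}{n}}}
  = \sqrt{n}\,\frac{\bar{x}_c - \bar{x}_t}{\sqrt{s_c^2 + s_t^2}},
\qquad \mathrm{df} = 2n - 2

% Cohen's d with a pooled SD (equal n): no n appears anywhere
d = \frac{\bar{x}_c - \bar{x}_t}{\sqrt{\dfrac{s_c^2 + s_t^2}{2}}}

% With the values used below (8 vs 7.9, SDs 1 and 1.2):
t \approx 0.064\,\sqrt{n}, \qquad d \approx 0.091
```

So with a fixed difference, $t$ grows with $\sqrt{n}$ and the p-value shrinks, while $d$ does not change.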

Although everything works, I imagine that some of the experts in this community could improve my syntax.

```r
library(tidyverse)

ctrl_mean <- 8
ctrl_sd <- 1

treated_mean <- 7.9
treated_sd <- 1.2

sample <- numeric()   # vector to collect the sample sizes
nsim <- 1000          # largest sample size (per group) to evaluate
t_result <- numeric() # vector to collect the t statistics

for (i in 1:nsim) {
  set.seed(123) # kept from my original code; nothing in this loop is random
  # manual t test with a fixed difference between means and i participants per group
  t_result[i] <- (mean(ctrl_mean) - mean(treated_mean)) /
    sqrt((ctrl_sd^2 / i) + (treated_sd^2 / i))
  sample[i] <- i # number of participants per group
}

ds <- data.frame(
  sample = sample,               # assign the sample size
  t_result = round(t_result, 3), # get the t test result
  degrees = sample * 2 - 2)      # compute the degrees of freedom

# two-sided p-value; sample == 1 is dropped (0 degrees of freedom)
# and comes back as NA after the join
ds %>%
  filter(sample > 1) %>%
  mutate(P_Value = 2 * pt(abs(t_result), df = degrees, lower.tail = FALSE)) %>%
  left_join(ds, .) -> ds

# plot
ggplot(ds, aes(x = sample, y = P_Value)) +
  geom_line() +
  annotate("segment", x = 1, xend = nsim, y = 0.05, yend = 0.05, color = "purple", linetype = "dotted") +
  annotate("segment", x = 1, xend = nsim, y = 0.01, yend = 0.01, color = "red", linetype = "dotted") +
  annotate("text", x = c(1, 1), y = c(.035, .001), label = c("p < 0.05", "p < 0.01"))
```