Homework10 For Loops and Randomization

1. Using a for loop, write a function to calculate the number of zeroes in a numeric vector. Before entering the loop, set up a counter variable counter <- 0. Inside the loop, add 1 to counter each time you have a zero in the vector. Finally, use return(counter) for the output.
2. Use subsetting instead of a loop to rewrite the function as a single line of code.

# Example vector containing six zeros
x <- c(0, 1, 2, 3, 0, 3, 7, 4, 0, 4, 0, 0, 0, 4)
# For loop: visit each element and count the zeros
counter <- 0
for (i in x) {
  if (i == 0) {
    counter <- counter + 1 
  }
}

print(counter)

## [1] 6
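
The prompt asks for a function, so the loop above can be wrapped in one. This is a minimal sketch: the name count_zeros is my own choice for illustration, and it assumes the vector has no NA values (the comparison value == 0 would fail inside if() for an NA).

```r
# Count the zeros in a numeric vector with a for loop.
# count_zeros is a hypothetical name, not part of the original script.
# Assumes the input contains no NA values.
count_zeros <- function(v) {
  counter <- 0
  for (value in v) {
    if (value == 0) {
      counter <- counter + 1
    }
  }
  return(counter)
}

x <- c(0, 1, 2, 3, 0, 3, 7, 4, 0, 4, 0, 0, 0, 4)
count_zeros(x)  # 6, the same count as the loop above
```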

# Alternative without a for loop: a single line that sums the logical vector x == 0.

print(sum(x == 0))

## [1] 6
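
The line sum(x == 0) counts zeros by summing a logical vector. A version that literally subsets the vector, written as a one-line function, might look like the sketch below. The name count_zeros_subset is hypothetical, and the sketch assumes the vector has no NA values (an NA would be kept by the subset and inflate the count).

```r
# Single-line version using subsetting: keep only the zeros, then count them.
# count_zeros_subset is a hypothetical name chosen for this sketch.
# Assumes the input contains no NA values.
count_zeros_subset <- function(v) length(v[v == 0])

x <- c(0, 1, 2, 3, 0, 3, 7, 4, 0, 4, 0, 0, 0, 4)
count_zeros_subset(x)  # 6, the same as sum(x == 0)
```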

3. Write a function that takes as input two integers representing the number of rows and columns in a matrix. The output is a matrix of these dimensions in which each element is the product of the row number x the column number.

rows <- 3
cols <- 4
matrix1 <- matrix_function(rows, cols)
print(matrix1)

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    2    4    6    8
## [3,]    3    6    9   12
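
The definition of matrix_function is not shown above. One way it could be written with nested for loops is sketched below; this is an assumption about its body, not the code that produced the output above.

```r
# Sketch of a possible matrix_function (the original definition is not shown).
# Each element [i, j] is filled with the product i * j.
matrix_function <- function(n_rows, n_cols) {
  m <- matrix(NA, nrow = n_rows, ncol = n_cols)
  for (i in seq_len(n_rows)) {      # loop over rows
    for (j in seq_len(n_cols)) {    # loop over columns
      m[i, j] <- i * j
    }
  }
  return(m)
}

matrix1 <- matrix_function(3, 4)
print(matrix1)
```

The same values can be produced without loops by outer(seq_len(3), seq_len(4)), which is a handy check on the loop version.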

4a. Simulate a dataset with 3 groups of data, each group drawn from a distribution with a different mean. The final data frame should have 1 column for group and 1 column for the response variable.

4b. Write a custom function that 1) reshuffles the response variable, and 2) calculates the mean of each group in the reshuffled data. Store the means in a vector of length 3.

4c. Use a for loop to repeat the function in b 100 times. Store the results in a data frame that has 1 column indicating the replicate number and 1 column for each new group mean, for a total of 4 columns.

4d. Use qplot() to create a histogram of the means for each reshuffled group. Or, if you want a challenge, use ggplot() to overlay all 3 histograms in the same figure. How do the distributions of reshuffled means compare to the original means?

# Use my function generate_random_data() to create 3 groups with different means, sample sizes, and CVs. The data frame is in long format.
my_data <- generate_random_data(n_groups = 3)
head(my_data)

##     Group    Value
## 1 Group_1 23.15628
## 2 Group_1 24.91941
## 3 Group_1 21.03801
## 4 Group_1 22.75493
## 5 Group_1 16.43838
## 6 Group_1 26.96913
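
The body of generate_random_data is not shown in this report. The sketch below is one way such a function could look. The argument names means, sizes, and sds and their default values are assumptions made for illustration, and normal draws via rnorm() are also an assumption.

```r
# Hypothetical sketch of generate_random_data (the original is not shown).
# means, sizes, and sds are assumed arguments with made-up defaults.
generate_random_data <- function(n_groups = 3,
                                 means = c(20, 25, 30),
                                 sizes = c(30, 40, 50),
                                 sds = c(3, 4, 5)) {
  stopifnot(length(means) == n_groups,
            length(sizes) == n_groups,
            length(sds) == n_groups)
  groups <- paste0("Group_", seq_len(n_groups))
  # Long format: one row per observation, with its group label
  data.frame(
    Group = rep(groups, times = sizes),
    Value = rnorm(sum(sizes),
                  mean = rep(means, times = sizes),
                  sd = rep(sds, times = sizes))
  )
}

set.seed(123)  # make the simulated values reproducible
demo_data <- generate_random_data(n_groups = 3)
head(demo_data)
```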

library(ggplot2)  # provides ggplot()
# Boxplot of the simulated values by group
w <- ggplot(my_data, aes(x = Group, y = Value)) +
  geom_boxplot() +
  labs(x = "Group", y = "Value")
print(w)

# Shuffle the response variable and return the mean of each group as a vector.

shuffle_mean <- shuffle_response(my_data)
print(shuffle_mean)

## [1] 20.73911 20.56692 21.23878
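
The definition of shuffle_response is not included here. A minimal sketch of what it might do is given below, assuming the data frame has the columns Group and Value shown earlier. The small demo data frame is simulated only to make the example runnable.

```r
# Hypothetical sketch of shuffle_response (the original is not shown).
# Assumes df has a Group column and a numeric Value column.
shuffle_response <- function(df) {
  shuffled <- sample(df$Value)               # permute the response, keep group labels fixed
  means <- tapply(shuffled, df$Group, mean)  # one mean per group, in sorted group order
  return(as.numeric(means))                  # plain numeric vector, one element per group
}

set.seed(1)
# Simulated stand-in data with 3 equal-sized groups
demo_data <- data.frame(
  Group = rep(c("Group_1", "Group_2", "Group_3"), each = 10),
  Value = rnorm(30, mean = rep(c(10, 20, 30), each = 10))
)
demo_means <- shuffle_response(demo_data)
print(demo_means)
```

Because sample() only reorders the values, the pooled mean is unchanged; only the assignment of values to groups is randomized.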

# Repeat this 100 times and store the results in a new data frame

# Number of iterations
N <- 100

# New data frame: the first column holds the iteration number; the other 3 columns are filled with NA, one row per iteration.
my_results <- data.frame(iteration = 1:N, matrix(NA, nrow = N, ncol = 3))
# Rename columns 2 to 4 after the groups.
colnames(my_results)[-1] <- c("Group_1", "Group_2", "Group_3")
head(my_results)

##   iteration Group_1 Group_2 Group_3
## 1         1      NA      NA      NA
## 2         2      NA      NA      NA
## 3         3      NA      NA      NA
## 4         4      NA      NA      NA
## 5         5      NA      NA      NA
## 6         6      NA      NA      NA

# Repeat the function 100 times and store the results
for (i in 1:N) { # run the function once per iteration, N times in total
  values <- shuffle_response(my_data)
  my_results[i, -1] <- values # store the 3 group means in columns 2 to 4
}
glimpse(my_results)

## Rows: 100
## Columns: 4
## $ iteration <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
## $ Group_1   <dbl> 20.95041, 21.93417, 20.83981, 20.74572, 20.36495, 20.83377, …
## $ Group_2   <dbl> 20.80930, 20.35667, 20.62473, 20.27474, 21.07885, 21.12405, …
## $ Group_3   <dbl> 20.61232, 20.12816, 21.02899, 21.68035, 20.90019, 20.26806, …

# Keep only the group-mean columns for plotting
new_results <- my_results[, c("Group_1", "Group_2", "Group_3")]

# Convert to long format for ggplot (pivot_longer() is from tidyr).
long_results <- pivot_longer(new_results, cols = starts_with("Group"), names_to = "Group", values_to = "Value")
head(long_results)

## # A tibble: 6 × 2
##   Group   Value
##   <chr>   <dbl>
## 1 Group_1  21.0
## 2 Group_2  20.8
## 3 Group_3  20.6
## 4 Group_1  21.9
## 5 Group_2  20.4
## 6 Group_3  20.1

# Boxplot of the reshuffled group means
x <- ggplot(long_results, aes(x = Group, y = Value)) +
  geom_boxplot() +
  labs(x = "Group", y = "Value")
print(x)
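
Part 4d asks for histograms, with overlaid histograms as the challenge version. A sketch of the overlay with ggplot() follows. To keep it runnable on its own, it uses a simulated stand-in named demo_long rather than long_results; the stand-in values are made up and are not the results shown above.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for long_results: 100 reshuffled means per group.
# These values are invented for the example only.
demo_long <- data.frame(
  Group = rep(c("Group_1", "Group_2", "Group_3"), each = 100),
  Value = rnorm(300, mean = 21, sd = 0.5)
)

# Overlay the 3 histograms: position = "identity" stops the bars from
# stacking, and alpha makes the overlapping bars semi-transparent.
p_hist <- ggplot(demo_long, aes(x = Value, fill = Group)) +
  geom_histogram(position = "identity", alpha = 0.5, bins = 30) +
  labs(x = "Reshuffled group mean", y = "Count", fill = "Group")
print(p_hist)
```

Replacing demo_long with long_results should give the same kind of figure for the actual reshuffled means.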

Comparing the original data set, my_data, with the reshuffled results: the original data were generated with a different mean for each group, so the original group means differ. Reshuffling assigns the values to groups independently of the group they came from, so each reshuffled group is effectively a random sample of the pooled values. As a result, the reshuffled means of all three groups cluster around the overall mean of the data and are much more homogeneous across groups than the original means.

Daniel Penados-Richter

2024-04-03