
I'm performing a small statistical analysis for an assignment. I'm working with a bivariate normal distribution for adult females and males, and I've simulated 100 heights and weights for each gender. I've combined the male and female heights into a single variable named "height", and I've done the same with weight. I'd like to create a dummy variable (1 for females, 0 for males) that lines up with the height and weight values, but I'm not sure how to do this.

Here is the code I have so far:

BivNormDist = function(n,m1,m2,s1,s2,r) {
  y = rnorm(n,m2,s2)
  x = rnorm(n,m1+r*s1*(y-m2)/s2,s1*sqrt(1-r^2))
  data.frame(x,y)
}

males = BivNormDist(100,167,60,7,9,0.60)
females = BivNormDist(100,177,76,8,11,0.55)

height = c(males$x, females$x)
weight = c(males$y, females$y)
gender = 

Any ideas? Thanks in advance!

Note that using numbers as dummy variables is not a good idea in R, because many functions treat them as continuous by default. Use factors instead; they are specifically designed to represent categories. Factors also make the code more readable, because they are represented by character strings (e.g. "male" and "female") rather than arbitrary numbers. – Konrad Rudolph
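
To illustrate the comment above, here is a minimal sketch of turning 0/1 codes into a factor. The name gender_num and the assumption that the 100 males come first, followed by the 100 females, are only for illustration and match the order used in the question's c(males$x, females$x).

```r
# Hypothetical 0/1 codes: 100 males (0) followed by 100 females (1)
gender_num = rep(c(0, 1), each = 100)

# Convert the numeric codes into a factor with readable labels
gender = factor(gender_num, levels = c(0, 1), labels = c("male", "female"))

table(gender)   # number of observations per category
levels(gender)  # the category labels
```

Modelling functions such as lm() should then treat gender as a categorical predictor rather than a continuous one.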

1 Answer


You can add a new argument, gender, to your function, as shown below:

BivNormDist = function(n,m1,m2,s1,s2,r, gender) {
  y = rnorm(n,m2,s2)
  x = rnorm(n,m1+r*s1*(y-m2)/s2,s1*sqrt(1-r^2))
  data.frame(x,y, gender)
}

males = BivNormDist(100,167,60,7,9,0.60, 0)      # or 'male' instead of 0
females = BivNormDist(100,177,76,8,11,0.55, 1)   # or 'female' instead of 1

head(males)
         x        y gender
1 174.5967 59.03692      0
2 166.8004 60.02427      0
3 166.3787 59.50217      0
4 171.4384 51.33848      0
5 165.3641 74.49850      0
6 169.6654 61.11999      0

head(females)
         x        y gender
1 172.0648 58.26547      1
2 173.0113 85.16080      1
3 200.1335 86.59496      1
4 184.2423 79.49594      1
5 183.0516 74.51125      1
6 196.1978 81.20334      1


height = c(males$x, females$x)
weight = c(males$y, females$y)
gender = c(males$gender, females$gender)
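
If you would rather leave the original function unchanged, one possible alternative is to build the dummy variable directly with rep(). This is only a sketch: the set.seed() call and the data frame name dat are my own additions for illustration, not part of the question.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Original function from the question, without a gender argument
BivNormDist = function(n, m1, m2, s1, s2, r) {
  y = rnorm(n, m2, s2)
  x = rnorm(n, m1 + r * s1 * (y - m2) / s2, s1 * sqrt(1 - r^2))
  data.frame(x, y)
}

males = BivNormDist(100, 167, 60, 7, 9, 0.60)
females = BivNormDist(100, 177, 76, 8, 11, 0.55)

height = c(males$x, females$x)
weight = c(males$y, females$y)

# 0 repeated once per male row, then 1 repeated once per female row,
# in the same order in which height and weight were concatenated
gender = rep(c(0, 1), times = c(nrow(males), nrow(females)))

# Hypothetical combined data frame, one row per simulated person
dat = data.frame(height, weight, gender)
```

Because gender is built in the same order as height and weight, row i of dat refers to the same simulated person in all three columns.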

Hope this addresses your question.