4
votes

Sorry about the confusing title, what I am trying to achieve is, getting the total applications for a job via the table below:

CREATE TABLE IF NOT EXISTS `applications` (
  `application_id` int(11) NOT NULL AUTO_INCREMENT,
  `application_user` varchar(100) NOT NULL,
  `application_date` datetime NOT NULL,
  `application_job` int(11) NOT NULL,
  `application_status` varchar(10) DEFAULT 'pending',
  `application_enabled` int(2) NOT NULL DEFAULT '1',
  `application_resume` int(11) NOT NULL,
  `application_description` text NOT NULL,
  PRIMARY KEY (`application_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

to get their ages, I am left joining user_personal_information on to application_user because they're the user who has applied to a job. My query:

SELECT count(*) as total,
       user_gender as gender,
       TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE()) AS age,
       application_date
FROM applications
LEFT JOIN user_personal_information
    ON user_personal_information_user = application_user
WHERE application_job = ?

My user's table with user_gender which can equal to male / female and user_birthdate which in the above statement I am converting it to an age.

I am trying to group all the applications with an age range of for example:

16 - 21
22 - 30
31 - 45
45 - 64
65+

And the male and female percentages for that age. To use for a datachart that needs data like so:

 "dataProvider": [
 {
   "age": "85+",
   "male": 25, //
   "female": 25
 }, {
   "age": "80-54",
   "male": 25,//percentage
   "female": 25//percentage
}]

So from the above, there's 25 % of males have applied aged 85 and older, and 25% of females. You get the gist, so that's how I am trying to get my select statement to work.

which will create a chart like so:

enter image description here

So just to clarify, I want to count the total applications and work out the percentage of applications based on gender and age group. How can I do this with the select statement above?

2

2 Answers

1
votes

A couple of nested group-by subqueries can do it for you. Note that the percentage calculated is gender specific:

select a.age, a.gender, a.cnt, 100*a.cnt/b.sm as percentage from
    (
        SELECT user_gender as gender,
               TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE()) AS age,
               count(*) as cnt,
        FROM applications
        LEFT JOIN user_personal_information
        ON user_personal_information_user = application_user
        WHERE application_job = ?
        GROUP BY user_gender, TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE())
    ) a,
    (
        SELECT TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE()) AS age,
               count(*) as sm
        FROM applications
        LEFT JOIN user_personal_information
        ON user_personal_information_user = application_user
        WHERE application_job = ?
        GROUP BY TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE())
    ) b
    where a.age = b.age;

If you are looking for a percentage specific to total applications, you need something like:

select a.age, 
       a.gender, 
       a.cnt, 
100*a.cnt/(
            select count(TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE()))
            from
            applications 
            LEFT JOIN user_personal_information
            ON user_personal_information_user = application_user
            WHERE application_job = ?
         ) as percentage 
from
    (
        SELECT user_gender as gender,
               TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE()) AS age,
               count(*) as cnt,
        FROM applications
        LEFT JOIN user_personal_information
        ON user_personal_information_user = application_user
        WHERE application_job = ?
        GROUP BY user_gender, TIMESTAMPDIFF(YEAR, user_birthdate, CURDATE())
    ) a;
0
votes

This should do the job:

select 
 sum(case when age between 16 and 21 and gender='male' then 1 end) as '[Male 16-21]',
 sum(case when age between 16 and 21 and gender='female' then 1 end) as 'Female [16-21]',
 sum(case when age between 22 and 30 and gender='male' then 1 end) as '[Male 22-30]',
 sum(case when age between 22 and 30 and gender='female' then 1 end) as '[Female 22-30]',
 sum(case when age between 31 and 45 and gender='male' then 1 end) as '[Male 31-45]',
 sum(case when age between 31 and 45 and gender='female' then 1 end) as '[Female 31-45]',
 sum(case when age between 46 and 64 and gender='male' then 1 end) as '[Male 46-64]',
 sum(case when age between 46 and 64 and gender='female' then 1 end) as '[Female 46-64]',
 sum(case when age > 64              and gender='male' then 1 end) as '[Male Over 64]',
 sum(case when age > 64              and gender='female' then 1 end) as '[Female Over 64]',
 sum(case when 1=1                   and gender='female' then 1 end) as '[Male TOTAL]',
 sum(case when 1=1                   and gender='male' then 1 end) as '[Female TOTAL]'

FROM applications