3
votes

Hi StackOverFlow members

reports = A table name.

Database

CREATE TABLE `reports` (
  `id` int(11) NOT NULL auto_increment,
  `report_day_name` varchar(20) NOT NULL,
  `report_day` varchar(20) NOT NULL,
  `report_month` varchar(20) NOT NULL,
  `report_year` varchar(20) NOT NULL,
  `report_result_number` varchar(20) NOT NULL,
  `report_result_text` varchar(20) NOT NULL,
  `report_since` varchar(20) NOT NULL,
  `report_date` varchar(20) NOT NULL,
  `catid` int(11) NOT NULL,
  `subjectid` int(11) NOT NULL,
  `userid` int(11) NOT NULL,
  `groupid` int(11) NOT NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=78 ;

INSERT INTO `reports` VALUES (73, 'day', '14', '1', '1434 h', '5', 'rate', '1234567890', '1434-1-14', 1, 132, 33, 35);
INSERT INTO `reports` VALUES (74, 'day', '12', '2', '1435 h', '4', 'rate', '1234567890', '1434-2-12', 2, 136, 36, 35);
INSERT INTO `reports` VALUES (75, 'day', '14', '1', '1434 h', '2', 'rate', '1354488730', '1434-1-14', 1, 132, 33, 35);
INSERT INTO `reports` VALUES (76, 'day', '12', '2', '1435 h', '4', 'rate', '1354488730', '1434-2-12', 2, 137, 36, 35);
INSERT INTO `reports` VALUES (77, 'day', '12', '2', '1435 h', '1', 'rate', '1354488730', '1434-2-12', 2, 134, 33, 35);

This is the database table:

id  report_result_number    subjectid   userid
73  5                       132         33
74  4                       136         36
75  2                       132         33
76  4                       137         36
77  1                       134         33

I want to SUM(reports.report_result_number) where (reports.subjectid) is DISTINCT

when i run this code..

SELECT
  users.user_id, users.user_name, users.user_country, SUM(reports.report_result_number) AS AllTotal, COUNT(DISTINCT reports.subjectid) AS TotalSubjects
FROM
  users
  INNER JOIN reports ON users.user_id = reports.userid
GROUP BY
  users.user_id
  ORDER BY
  AllTotal DESC LIMIT 4

it returns AllTotal

user_id user_name   user_country    AllTotal    TotalSubjects
36       name         country        8 (correct)        2
33        name        country        8 (not correct)    2
1
What should the value be if subjectid is not unique?Mr. Llama
Problem is, if you have multiple subjectid values for each user, and each has a different value for report_result_number, then which one should be summed?Eric Petroelje
Do you mean to do the summing only when the count of reports.subjectid = 1? If so, you could do:SELECT subjectid FROM reports GROUP BY subject_id HAVING COUNT(*) = 1. Then join to this by subjectid, and do your sum. Not sure if this is getting at what you want, your question is open to multiple interpretations.DWright
every user has subjectid rows..HasanAlyazidi
Adding the sample data helps, but what is the expected resultset? Do you want all report_result_number values for subjectid 132 for user 33 to be excluded from the SUM? My answer below does that. If you want something different, please provide the expected resultset.spencer7593

1 Answers

2
votes

The question is open to a couple of interpretations.

If what you want is the value of result_report_number to be included in the SUM aggregate ONLY if there is only ONE row for a given subjectid and userid, (if there is more than one row for the same subjectid, you want to EXCLUDE the report_result_number for all of those rows...

Then something like this will work:

SELECT u.user_id
     , u.user_name
     , u.user_country
     , SUM(s.report_result_number) AS AllTotal
     , COUNT(DISTINCT r.subjectid) AS TotalSubjects
  FROM users u
  JOIN reports r
    ON r.userid = u.user_id
  JOIN ( SELECT d.userid
              , d.subjectid
              , d.report_result_number
           FROM reports d
          GROUP
             BY d.userid
              , d.subjectid
         HAVING COUNT(1) = 1
       ) s
    ON s.userid = r.userid
 GROUP
    BY u.user_id
 ORDER
    BY AllTotal DESC
 LIMIT 4


That's only one (odd) interpretation of the requested result set. Sample data and an expected result set would go a long ways towards clarifying the specification.


For the data you added to your question, this query should be returning, e.g.

36 fee fi   8  2 
33 foo bar  1  2 

There are two rows with a subjectid value of 132 for user 33, so the report_result_number for those rows is excluded from the SUM. There are two distinct values for subjectid (132 and 134), so we're returning a :distinct: count of 2.


If you are asking for the SUM to return a value ONLY if there are no duplicate values for subjectid for a given user...

SELECT u.user_id
     , u.user_name
     , u.user_country
     , IF(COUNT(DISTINCT r.subjectid) = COUNT(r.subjectid)
         ,SUM(r.report_result_number)
         ,NULL
       ) AS AllTotal
     , COUNT(DISTINCT r.subjectid) AS TotalSubjects
  FROM users u
  JOIN reports r
    ON r.userid = u.user_id
 GROUP
    BY u.user_id
 ORDER
    BY AllTotal DESC
 LIMIT 4

Hasan said... "If there are duplicated values [of subjectid for a given userid], get one of them"

Just remove the HAVING clause from the inline view aliased as s. That will return the value of report_result_number for one row. (It will be arbitrary as to which "matching" row the value will be returned from:

SELECT u.user_id
     , u.user_name
     , u.user_country
     , SUM(r.report_result_number) AS AllTotal
     , COUNT(DISTINCT r.subjectid) AS TotalSubjects
  FROM users u
  JOIN ( SELECT d.userid
              , d.subjectid
              , d.report_result_number
           FROM reports d
          GROUP
             BY d.userid
              , d.subjectid
       ) r
    ON r.userid = u.user_id
 GROUP
    BY u.user_id
 ORDER
    BY AllTotal DESC
 LIMIT 4

To make the resultset repeatable, to always get the lowest or highest value, you could add an aggregate function to specify which value to return.

replace...

          , d.report_result_number

with...

          , MAX(d.report_result_number)  AS report_result_number

With the MAX() aggregate, this will return:

36 fee fi   8  2
33 foo bar  6  2

(The query will get the value of '5' for subjectid=132 userid=33, and will omit the value of '2' for the same subjectid.) Absent the MAX aggregate, the query could validly (and arbitrarily) return a '3' in place of the '6'. (It can include either the '5' or the '2', and omit the other.)

SQL Fiddle here

SQL Fiddle including the MAX aggregate here


Q: how can i use (where report_month = 'number') in your code?

A: add the WHERE clause in the inline view, after the FROM clause before the GROUP BY clause. Replace this:

       FROM reports d
      GROUP

with e.g.

       FROM reports d
      WHERE d.report_month = 'number'
      GROUP

Only rows that satisfy the specified predicate will be returned.