I have a table that contains duplicate names and emails but with different identifiers and other associated data. In this case, the identifier does not have to be unique. Here is an example of the data:
| course | user_code | First Name | Last Name | Email Address |
|---|---|---|---|---|
| econ | SKNKZ62HR28 | John | Doe | [email protected] |
| lang | C6NV4LQS5D4 | John | Doe | [email protected] |
| science | H9NNVNWVPM9 | Fred | Doe | [email protected] |
| govt | JVNH2BVTD3V | Jane | Doe | [email protected] |
| math | L2NK8S49N5G | Jane | Doe | [email protected] |
What I need to do is update the user_code so that John Doe has the same user_code for both courses:
| course | user_code | First Name | Last Name | Email Address |
|---|---|---|---|---|
| econ | SKNKZ62HR28 | John | Doe | [email protected] |
| lang | SKNKZ62HR28 | John | Doe | [email protected] |
| science | H9NNVNWVPM9 | Fred | Doe | [email protected] |
| govt | L2NK8S49N5G | Jane | Doe | [email protected] |
| math | L2NK8S49N5G | Jane | Doe | [email protected] |
I've been able to identify my duplicates with this code:
SELECT a.*
FROM attendees a
JOIN
(SELECT [Email Address], [Last Name], [First Name], COUNT(*) AS number
FROM attendees
GROUP BY [Email Address], [Last Name], [First Name]
HAVING count(*) > 1) b ON a.[Email Address] = b.[Email Address]
AND a.[Last Name] = b.[Last Name]
AND a.[First Name] = b.[First Name]
ORDER BY a.[Email Address]
I'm not sure of the next steps.