I've always struggled with mysql joins but have started incorporating more but struggling to understand despite reading dozens of tutorials and mysql manual.
My situation is I have 3 tables:
/* BASICALLY A TABLE THAT HOLDS FAN RECORDS */
CREATE TABLE `fans` ( `id` int(11) unsigned NOT NULL AUTO_INCREMENT, `first_name` varchar(255) DEFAULT NULL, `middle_name` varchar(255) DEFAULT NULL, `last_name` varchar(255) DEFAULT NULL, `email` varchar(255) DEFAULT NULL, `join_date` datetime DEFAULT NULL, `twitter` varchar(255) DEFAULT NULL, `twitterCrawled` datetime DEFAULT NULL, `twitterImage` varchar(255) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `email` (`email`) ) ENGINE=MyISAM AUTO_INCREMENT=20413 DEFAULT CHARSET=latin1; /* A TABLE OF OUR TWITTER FOLLOWERS */ CREATE TABLE `twitterFollowers` ( `id` int(11) unsigned NOT NULL AUTO_INCREMENT, `screenName` varchar(25) DEFAULT NULL, `twitterId` varchar(25) DEFAULT NULL, `customerId` int(11) DEFAULT NULL, `uniqueStr` varchar(50) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `unique` (`uniqueStr`) ) ENGINE=InnoDB AUTO_INCREMENT=13426 DEFAULT CHARSET=utf8; /* TABLE THAT SUGGESTS A LIKELY MATCH OF A TWITTER FOLLOWER BASED ON THE EMAIL / SCREEN NAME COMPARISON OF THE FAN vs OUR FOLLOWERS IF SOMEONE (ie. a moderator) CONFIRMS OR DENIES THAT IT'S A GOOD MATCH THEY PUT A DATESTAMP IN `dismissed` */ CREATE TABLE `contentSuggestion` ( `id` int(11) unsigned NOT NULL AUTO_INCREMENT, `userId` int(11) DEFAULT NULL, `fanId` int(11) DEFAULT NULL, `twitterAccountId` int(11) DEFAULT NULL, `contentType` varchar(50) DEFAULT NULL, `contentString` varchar(255) DEFAULT NULL, `added` datetime DEFAULT NULL, `dismissed` datetime DEFAULT NULL, `uniqueStr` varchar(255) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `unstr` (`uniqueStr`) ) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
What I'm trying to get is:
SELECT [fan columns] WHERE fan screen name IS IN twitterfollowers AND WHERE fan screen name IS NOT IN contentSuggestion (with a datestamp in dismissed)
My attempts so far:
~33 seconds
SELECT fans.id, tf.screenName as col1, tf.twitterId as col2 FROM fans LEFT JOIN twitterFollowers tf ON tf.screenName = fans.emailUsername LEFT JOIN contentSuggestion cs ON cs.contentString = tf.screenName WHERE dismissed IS NULL GROUP BY(fans.id) HAVING col1 != ''
~14 seconds
SELECT id, emailUsername FROM fans WHERE emailUsername IN(SELECT DISTINCT(screenName) FROM twitterFollowers) AND emailUsername NOT IN(SELECT DISTINCT(contentString) FROM contentSuggestion WHERE dismissed IS NULL) GROUP BY (fans.id);
9.53 seconds
SELECT fans.id, tf.screenName as col1, tf.twitterId as col2 FROM fans LEFT JOIN twitterFollowers tf ON tf.screenName = fans.emailUsername WHERE tf.uniqueStr NOT IN(SELECT uniqueStr FROM contentSuggestion WHERE dismissed IS NULL)
I hope there is a better way. I've been struggling to really use JOINS outside of a single LEFT JOIN which has already helped me speed up other queries by a significant amount.
Thanks for any help you can give me.