I have this table:
CREATE TABLE `property_ads_history` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `property_ad_id` int(10) unsigned NOT NULL, `advertiser_type` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `agency_reference_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `average_sale_price` double(8,2) NOT NULL DEFAULT '-1.00', `bathrooms` double(8,2) NOT NULL DEFAULT '-1.00', `bedrooms` double(8,2) NOT NULL DEFAULT '-1.00', `carports` double(8,2) NOT NULL DEFAULT '-1.00', `DELETE_country` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `created_reason` enum('Geocoded','Sanitized Parking','Sanitized Representation','Sanitized Address','Scraped','URL Inserted','QA Sanitized Address','QA Sanitized Representation','QA Sanitized Parkings') COLLATE utf8_unicode_ci DEFAULT NULL, `description` longtext COLLATE utf8_unicode_ci NOT NULL, `ensuite_bathrooms` double(8,2) NOT NULL DEFAULT '-1.00', `DELETE_ad_expired_at` datetime NOT NULL, `floor_area` double(8,2) NOT NULL DEFAULT '-1.00', `formatted_address` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `garages` double(8,2) NOT NULL DEFAULT '-1.00', `geocode_status` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, `is_represented` tinyint(1) DEFAULT NULL, `land_area` double(8,2) NOT NULL DEFAULT '-1.00', `latitude` double(10,6) NOT NULL, `location_id` int(10) unsigned DEFAULT NULL, `longitude` double(10,6) NOT NULL, `off_street_parkings` double(8,2) NOT NULL DEFAULT '-1.00', `official_property_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `page_url` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `parking` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `posted_at` datetime NOT NULL, `posted_at_string` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `postal_code` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `price` double(10,2) NOT NULL DEFAULT '-1.00', `primary_image` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `DELETE_property_ad_created_at` datetime NOT NULL, `property_type` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `rateable_value` double(10,2) NOT NULL DEFAULT '-1.00', `recent_sale_1` double(10,2) NOT NULL DEFAULT '-1.00', `recent_sale_2` double(10,2) NOT NULL DEFAULT '-1.00', `recent_sale_3` double(10,2) NOT NULL DEFAULT '-1.00', `reviewer_comments` longtext COLLATE utf8_unicode_ci NOT NULL, `route` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `source_id` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, `street_number` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `user_id` int(10) unsigned DEFAULT NULL, `user_provided_address` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', `deleted_at` timestamp NULL DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `unique_index` (`created_at`,`created_reason`,`source_id`), KEY `property_ads_history_property_ad_id_foreign` (`property_ad_id`), KEY `property_ads_history_location_id_foreign` (`location_id`), KEY `created_reason` (`created_reason`) );
This is my SQL query:
SELECT * FROM `property_ads_history` `t1`
WHERE `t1`.`created_at` >= '2016-04-13 00:00:00'
AND `t1`.`created_reason` = 'Scraped'
AND (`t1`.`price` > -1 OR `t1`.`price` <> 999999.99)
AND (
SELECT `t2`.`price` FROM `property_ads_history` `t2`
WHERE `t2`.`property_ad_id` = `t1`.`property_ad_id`
AND `t2`.`created_at` < `t1`.`created_at`
AND (`t2`.`price` > -1 OR `t2`.`price` <> 999999.99) ORDER BY
DATE(`t2`.`created_at`) DESC LIMIT 1
) <> `t1`.`price` GROUP BY `t1`.`property_ad_id`;
All I want in that query are:
I want to get the records that meet certain criteria, particularly that the
created_at
field is within the last 24 hoursI need to get the record that immediately precedes the records in #1
further filter results in #1 to records whose price column has had a different value in a record's history, other than -1
The problem I'm facing is that the query is too slow. I have a million records. It would take more or less 2 mins for the query to execute. I'm guessing GROUP BY makes the query slow. I think I need to make a composite index but I'm not sure how to do it.
Explain result:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | t1 | ref | unique_index,created_reason | created_reason | 2 | const | 590030 | Using where; Using temporary; Using filesort
2 | DEPENDENT SUBQUERY | t2 | ref | unique_index,property_ads_history_property_ad_id_foreign | property_ads_history_property_ad_id_foreign | 4 | reis_real_estate.t1.property_ad_id | 7 | Using where; Using filesort
(m,n)
onFLOAT
orDOUBLE
(eg,double(8,2)
), it causes an extra rounding, when you probably want no rounding. – Rick James