Speed up mysql SQL query but with a huge dataset

Question

I have a table that has over 2.5 million rows and I would like to run the following SQL Statment to get the

select count(*) 
  from workflow 
 where action_name= 'Workflow' 
   and release_date >= '2019-12-01 13:24:22' 
   and release_date <= '2019-12-31 13:24:22' 
   AND project_name= 'Web' 
 group 
    by page_id
     , headline
     , release_full_name
     , release_date

The problem is that it takes over 2.7 seconds to return 0 rows as expected. Is there a way to speed it up more? I have 6 more SQL Statements that are similiar so that will take almost (2.7 seconds * 6) = 17 seconds at least.

Here is my table schema

CREATE TABLE workflow (
 id int(11) NOT NULL AUTO_INCREMENT,
 action_name varchar(100) NOT NULL,
 project_name varchar(30) NOT NULL,
 page_id int(11) NOT NULL,
 headline varchar(200) NOT NULL,
 create_full_name varchar(200) NOT NULL,
 create_date datetime NOT NULL,
 change_full_name varchar(200) NOT NULL,
 change_date datetime NOT NULL,
 release_full_name varchar(200) NOT NULL,
 release_date datetime NOT NULL,
 reject_full_name varchar(200) NOT NULL,
 reject_date datetime NOT NULL,
 PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=2948271 DEFAULT CHARSET=latin1

What I'm looking for in this query is to get the count of the pages that were released last month. that have project_name = "web" and action_name = "Workflow"

These days, I think 2.5 million is generally considered modest. Questions about query performance ALWAYS require, as a minimum, CREATE TABLE statements for ALL relevant tables, and the results of the EXPLAIN for the given query. — Strawberry
Post the current table schema and index info. Also, have you obtained any execution plans that would indicate where the bottleneck(s) are? — underscore_d
Also, note that this returns results per (page_id, headline, release_full_name, start_date), but there's no way of knowing which count belongs to which result, which seems odd. — Strawberry
I have edited the post to include the structure of the table i'm using. Thanks! — Alan Smith

Ankit Bajpai Ankit Bajpai · Accepted Answer · 2020-01-08T15:12:59

This is bit bigger for comments

Using Group by with Count function doesn't make any sense. Usually you need to count actual rows in DB not after aggregation. Not sure if this is your actual requirement reason being GROUP BY causes slowness of the query.
Use composite Index on (Web, start_date) as column project seems highest selective.

For other information, Please share the explain plan.

Speed up mysql SQL query but with a huge dataset

2 Answers