0
votes

We have a pretty large wordpress mysql database (9.8 GB)

rows in key tables:

  • 770K wp_posts
  • 14K wp_terms
  • 4M wp_term_relationships
  • 14K wp_term_taxonomy
  • 16M wp_postmeta
  • 1M wp_options (damn those transients)

I have a query I would like to optimize. I am struggling with this one for quite some time now. It is been used a lot at woocommerce based websites. I was wondering if someone could optimize this query.

SELECT  COUNT( DISTINCT ID )
    FROM  wp_posts p
    LEFT JOIN  
        ( SELECT  object_id
            FROM  wp_term_relationships
            WHERE  term_taxonomy_id IN ( 8128 ) 
        ) AS exclude_join  ON exclude_join.object_id = p.ID
    INNER JOIN  
        ( SELECT  object_id
            FROM  wp_term_relationships
            INNER JOIN  wp_term_taxonomy using( term_taxonomy_id )
            WHERE  term_id IN ( 20,21,31,46,3591,47,99 ) 
        ) AS include_join  ON include_join.object_id = p.ID
    WHERE  1=1
      AND  p.post_status = 'publish'
      AND  p.post_type = 'product'
      AND  exclude_join.object_id IS NULL

It is the query that recounts the products and is being found at the wc-terms-functions.php.

function _wc_term_recount( $terms, $taxonomy, $callback = true, $terms_are_term_taxonomy_ids = true ) {
global $wpdb;

// Standard callback.
if ( $callback ) {
    _update_post_term_count( $terms, $taxonomy );
}

$exclude_term_ids            = array();
$product_visibility_term_ids = wc_get_product_visibility_term_ids();

if ( $product_visibility_term_ids['exclude-from-catalog'] ) {
    $exclude_term_ids[] = $product_visibility_term_ids['exclude-from-catalog'];
}

if ( 'yes' === get_option( 'woocommerce_hide_out_of_stock_items' ) && $product_visibility_term_ids['outofstock'] ) {
    $exclude_term_ids[] = $product_visibility_term_ids['outofstock'];
}

$query = array(
    'fields' => "
        SELECT COUNT( DISTINCT ID ) FROM {$wpdb->posts} p
    ",
    'join'   => '',
    'where'  => "
        WHERE 1=1
        AND p.post_status = 'publish'
        AND p.post_type = 'product'

    ",
);

if ( count( $exclude_term_ids ) ) {
    $query['join']  .= " LEFT JOIN ( SELECT object_id FROM {$wpdb->term_relationships} WHERE term_taxonomy_id IN ( " . implode( ',', array_map( 'absint', $exclude_term_ids ) ) . " ) ) AS exclude_join ON exclude_join.object_id = p.ID";
    $query['where'] .= " AND exclude_join.object_id IS NULL";
}

// Pre-process term taxonomy ids.
if ( ! $terms_are_term_taxonomy_ids ) {
    // We passed in an array of TERMS in format id=>parent.
    $terms = array_filter( (array) array_keys( $terms ) );
} else {
    // If we have term taxonomy IDs we need to get the term ID.
    $term_taxonomy_ids = $terms;
    $terms             = array();
    foreach ( $term_taxonomy_ids as $term_taxonomy_id ) {
        $term    = get_term_by( 'term_taxonomy_id', $term_taxonomy_id, $taxonomy->name );
        $terms[] = $term->term_id;
    }
}

// Exit if we have no terms to count.
if ( empty( $terms ) ) {
    return;
}

// Ancestors need counting.
if ( is_taxonomy_hierarchical( $taxonomy->name ) ) {
    foreach ( $terms as $term_id ) {
        $terms = array_merge( $terms, get_ancestors( $term_id, $taxonomy->name ) );
    }
}

// Unique terms only.
$terms = array_unique( $terms );

// Count the terms.
foreach ( $terms as $term_id ) {
    $terms_to_count = array( absint( $term_id ) );

    if ( is_taxonomy_hierarchical( $taxonomy->name ) ) {
        // We need to get the $term's hierarchy so we can count its children too
        if ( ( $children = get_term_children( $term_id, $taxonomy->name ) ) && ! is_wp_error( $children ) ) {
            $terms_to_count = array_unique( array_map( 'absint', array_merge( $terms_to_count, $children ) ) );
        }
    }

    // Generate term query
    $term_query          = $query;
    $term_query['join'] .= " INNER JOIN ( SELECT object_id FROM {$wpdb->term_relationships} INNER JOIN {$wpdb->term_taxonomy} using( term_taxonomy_id ) WHERE term_id IN ( " . implode( ',', array_map( 'absint', $terms_to_count ) ) . " ) ) AS include_join ON include_join.object_id = p.ID";

    // Get the count
    $count = $wpdb->get_var( implode( ' ', $term_query ) );

    // Update the count
    update_woocommerce_term_meta( $term_id, 'product_count_' . $taxonomy->name, absint( $count ) );
}

delete_transient( 'wc_term_counts' );

Could SELECT SQL_CALC_FOUND_ROWS be a better alternative? I am no query expert, please help.

2
Can you please provide an EXPLAIN output and the schema structure (relevant tables and the existing indexes)?Tomer Shay

2 Answers

0
votes

Start by improving the many:many schema, as discussed here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#speeding_up_wp_postmeta

SQL_CALC_FOUND_ROWS is mostly a convenience so that you don't have to scan the table twice. (And it is unclear how you might use it here.)

The query looks like a "explode-implode" situation where doing JOINs lead to more rows, then GROUP BY (or DISTINCT in your case) shrinks back.

I'm unclear on why you have LEFT JOIN in one case, but JOIN in the other.

Possibly each of the JOINs could be added to the WHERE clause in this format:

AND EXISTS (SELECT ... ) 

That would let you get rid of DISTINCT, thereby avoiding the explode-implode. At that point, a simple COUNT(*) should suffice.

0
votes

Since the function was called many many times every day and I there was no actual need to have a realtime update of the number of products I ended up recoding the function using a randomizer. Statistically out of 300 calls of the function it should run once (number 42), which is just perfect. Really managed to reduce the server load massively.

It is a WooCommerce core hack for now but I might write a plugin for this.

Here is the code :

function _wc_term_recount( $terms, $taxonomy, $callback = true, $terms_are_term_taxonomy_ids = true ) {
$number = rand(1,300);
if($number == 42) {
global $wpdb;

// Standard callback.
if ( $callback ) {
    _update_post_term_count( $terms, $taxonomy );
}

$exclude_term_ids            = array();
$product_visibility_term_ids = wc_get_product_visibility_term_ids();

if ( $product_visibility_term_ids['exclude-from-catalog'] ) {
    $exclude_term_ids[] = $product_visibility_term_ids['exclude-from-catalog'];
}

if ( 'yes' === get_option( 'woocommerce_hide_out_of_stock_items' ) && $product_visibility_term_ids['outofstock'] ) {
    $exclude_term_ids[] = $product_visibility_term_ids['outofstock'];
}

$query = array(
    'fields' => "
        SELECT COUNT( DISTINCT ID ) FROM {$wpdb->posts} p
    ",
    'join'   => '',
    'where'  => "
        WHERE 1=1
        AND p.post_status = 'publish'
        AND p.post_type = 'product'

    ",
);

if ( count( $exclude_term_ids ) ) {
    $query['join']  .= " LEFT JOIN ( SELECT object_id FROM {$wpdb->term_relationships} WHERE term_taxonomy_id IN ( " . implode( ',', array_map( 'absint', $exclude_term_ids ) ) . " ) ) AS exclude_join ON exclude_join.object_id = p.ID";
    $query['where'] .= " AND exclude_join.object_id IS NULL";
}

// Pre-process term taxonomy ids.
if ( ! $terms_are_term_taxonomy_ids ) {
    // We passed in an array of TERMS in format id=>parent.
    $terms = array_filter( (array) array_keys( $terms ) );
} else {
    // If we have term taxonomy IDs we need to get the term ID.
    $term_taxonomy_ids = $terms;
    $terms             = array();
    foreach ( $term_taxonomy_ids as $term_taxonomy_id ) {
        $term    = get_term_by( 'term_taxonomy_id', $term_taxonomy_id, $taxonomy->name );
        $terms[] = $term->term_id;
    }
}

// Exit if we have no terms to count.
if ( empty( $terms ) ) {
    return;
}

// Ancestors need counting.
if ( is_taxonomy_hierarchical( $taxonomy->name ) ) {
    foreach ( $terms as $term_id ) {
        $terms = array_merge( $terms, get_ancestors( $term_id, $taxonomy->name ) );
    }
}

// Unique terms only.
$terms = array_unique( $terms );

// Count the terms.
foreach ( $terms as $term_id ) {
    $terms_to_count = array( absint( $term_id ) );

    if ( is_taxonomy_hierarchical( $taxonomy->name ) ) {
        // We need to get the $term's hierarchy so we can count its children too
        if ( ( $children = get_term_children( $term_id, $taxonomy->name ) ) && ! is_wp_error( $children ) ) {
            $terms_to_count = array_unique( array_map( 'absint', array_merge( $terms_to_count, $children ) ) );
        }
    }

    // Generate term query
    $term_query          = $query;
    $term_query['join'] .= " INNER JOIN ( SELECT object_id FROM {$wpdb->term_relationships} INNER JOIN {$wpdb->term_taxonomy} using( term_taxonomy_id ) WHERE term_id IN ( " . implode( ',', array_map( 'absint', $terms_to_count ) ) . " ) ) AS include_join ON include_join.object_id = p.ID";

    // Get the count
    $count = $wpdb->get_var( implode( ' ', $term_query ) );

    // Update the count
    update_woocommerce_term_meta( $term_id, 'product_count_' . $taxonomy->name, absint( $count ) );
}

delete_transient( 'wc_term_counts' );
}
else {}

}