I have a query which uses ROW_NUMBER(). I have something like this:
ROW_NUMBER() OVER (ORDER BY publish_date DESC) rnum
The query runs pretty fast. However, if I add any reference to the "rnum" column, the query slows to a crawl. So, it appears that just having ROW_NUMBER() is not the issue, but when I use the "rnum" in the actual query, it crawls for like 30 seconds.
Any thoughts?
For Reference, here is the query:
WITH aquire AS (
SELECT rtnum, trans_id, source, provider, publish_date, story_link, industry_name, sector_name, subject, teaser, tickers
FROM (SELECT d.trans_id, d.source, 'AquireMedia' AS provider,
d.trans_time AS publish_date, '/research/get_news.php?id=' || d.trans_id AS story_link,
i.name AS industry_name, s.sector_name, d.headline AS subject, NULL AS teaser,
NEWS.NEWS_FUNCTIONS.CONCATENATE_TICKERS(d.trans_id,'AQUIREMEDIA') AS tickers,
ROW_NUMBER() OVER (PARTITION BY d.trans_id ORDER BY d.trans_time DESC) as rtnum
FROM story_descriptions_3m d, story_tickers_3m t, uber_master_mv m, industry i, ind_sector ix, sectors s, comp_ind c
WHERE d.trans_id = t.trans_id
AND t.m_ticker = m.m_ticker
AND t.m_ticker = c.m_ticker(+)
AND c.ind_code = i.ind_code(+)
AND i.ind_code = ix.ind_code(+)
AND ix.sector_id = s.sector_id(+) AND s.sector_id = 10 )
WHERE rtnum = 1),
partner AS (
SELECT rtnum, trans_id, source, provider, publish_date, story_link, industry_name, sector_name, subject, teaser, tickers
FROM (SELECT CAST(n.story_id AS VARCHAR2(20)) trans_id, n.provider AS source, 'Partner News' AS provider,
n.story_date AS publish_date, n.link AS story_link, i.name AS industry_name, s.sector_name, n.title AS subject,
CAST(substr(n.teaser,1,4000) AS VARCHAR2(4000)) AS teaser, NEWS.NEWS_FUNCTIONS.CONCATENATE_TICKERS(n.story_id,'OTHER') AS tickers,
ROW_NUMBER() OVER (PARTITION BY n.story_id ORDER BY n.story_date DESC) as rtnum
FROM news_stories_3m n, news_stories_lookup_3m t, comp_ind c, uber_master_mv m, industry i, ind_sector ix, sectors s
WHERE t.story_id = n.story_id
AND t.ticker = m.ticker
AND m.m_ticker = c.m_ticker(+)
AND c.ind_code = i.ind_code(+)
AND i.ind_code = ix.ind_code(+)
AND ix.sector_id = s.sector_id(+) AND s.sector_id = 10 )
WHERE rtnum = 1)
SELECT trans_id, source, provider,
TO_CHAR(publish_date,'MM/DD/YYYY HH24:MI:SS') AS publish_date,
UNIX_TIMESTAMP(publish_date) AS timestamp,
story_link, industry_name, sector_name, subject, teaser, tickers
FROM (SELECT trans_id, source, provider, publish_date, story_link, industry_name, sector_name, subject, teaser, tickers,
ROW_NUMBER() OVER (ORDER BY publish_date DESC) rnum
FROM (SELECT trans_id, source, provider, publish_date, story_link, industry_name, sector_name, subject, teaser, tickers
FROM aquire WHERE rtnum <= 5
UNION ALL
SELECT trans_id, source, provider, publish_date, story_link, industry_name, sector_name, subject, teaser, tickers
FROM partner WHERE rtnum <= 5))
WHERE rnum BETWEEN 1 AND 1 * 5;
rnum
asrow_number(...)
(or anything else for that matter) in a subquery, but then you don't reference it in the main query, the optimizer simply ignoresrnum
in the subquery. You are not having a hallucination - what you are noticing is the optimizer doing its job. Now, perhaps, you will change your question: "suppose I do needrnum
, is there any way to make the query faster"? The answer is "probably, but it depends on what your query is doing" - you didn't tell us that. - mathguyEXPLAIN
on the above query and find the bottlenecks. Once you have done this, then Stack Overflow is a good place to ask with some suggestions on how to improve query performance. - Tim Biegeleisenrnum
in conditions in a main query (for example a condition likewhere rnum = 1
) then there are ways to rewrite the query to use other means, notrow_number()
, to achieve the same result with better performance; but exactly how to do that depends on what the query must do, and again - we don't know what that is. - mathguy