Data Structures for implementing a song search functionality in mp3 player?

Question

I am creating my own mp3 player and need a song search facility like the one in VLC Media Player, Rhythmbox, and other media players, where one can search for a song by artist, track, or album name.

As an example, consider these 4 songs with their respective metadata:

Track                     Artist                            Album

Dear Agony                Breaking Benjamin                 Test Name
Radioactive               Imagine Dragons                   Billboard
Feel Good Drag            Anberlin                          Random
Khamaj                    Fuzon                             Tere Liye

Now suppose I give the search query ag; then the result should be this:

Dear Agony                Breaking Benjamin                 Test Name
Radioactive               Imagine Dragons                   Billboard
Feel Good Drag            Anberlin                          Random

because the first three songs have some occurrence of ag in their metadata, whereas the fourth track has none and so shouldn't be listed.

All the mp3 files will have all of this data present in them, and I know how to extract it. The real challenge is which data structure to use and how to use it to implement the search.

Especially if the user's playlist is very big, efficient retrieval of results is required. Please suggest some data structures I can implement to achieve this. By the way, I am using Python.

You can take an SQL database like Firebird or SQLite, make a table for the songs with a column for each field from the ID3/UD3 specifications, and add three more columns: a row ID (unique number, primary key), the song file name, and a total text string (ALL_TAGS COMPUTED BY TAG_ALBUM || ' ' || TAG_BAND || ' ' || TAG_TITLE || ...), and then do something like SELECT ID FROM SONGS WHERE ALL_TAGS LIKE '%ag%'. That would be a linear search reading from the HDD, so not very fast (about 100 MBytes/sec raw speed), but simple. And in memory everything would be fast... — Arioch 'The
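A minimal sketch of the SQL approach from the comment above, using Python's built-in sqlite3 module with an in-memory database. The table and column names (songs, file_name, title, artist, album) and the file names are made up for illustration, and the concatenation is done in the query rather than in a computed column. SQLite's LIKE is case-insensitive for ASCII letters by default, which is why "ag" also finds "Agony".

```python
import sqlite3

# Hypothetical schema and sample rows; none of these names come from an ID3 library.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE songs ("
    " id INTEGER PRIMARY KEY,"
    " file_name TEXT, title TEXT, artist TEXT, album TEXT)"
)
conn.executemany(
    "INSERT INTO songs (file_name, title, artist, album) VALUES (?, ?, ?, ?)",
    [
        ("dear_agony.mp3", "Dear Agony", "Breaking Benjamin", "Test Name"),
        ("radioactive.mp3", "Radioactive", "Imagine Dragons", "Billboard"),
        ("feel_good_drag.mp3", "Feel Good Drag", "Anberlin", "Random"),
        ("khamaj.mp3", "Khamaj", "Fuzon", "Tere Liye"),
    ],
)


def search(conn, query):
    """Return titles of songs whose combined tags contain `query`.

    This is a linear scan over the table. Note that '%' and '_' inside
    `query` are treated as LIKE wildcards, since they are not escaped here.
    """
    rows = conn.execute(
        "SELECT title FROM songs"
        " WHERE title || ' ' || artist || ' ' || album LIKE ?"
        " ORDER BY id",
        ("%" + query + "%",),
    )
    return [row[0] for row in rows]


print(search(conn, "ag"))
```

For a personal music library the whole table will usually fit in memory, so the linear scan may well be fast enough before anything more elaborate is needed.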
en.wikipedia.org/wiki/Inverted_index is the basis of all FTS engines, but you would have to compromise on the minimum reasonable query length. Many forums disable text search unless you key in at least FOUR letters, so I think 2- or 3-letter indexes become way too large to be efficiently maintained — Arioch 'The
@Arioch'The: I would like to know how I can use Inverted Index for searching — khirod
Most FTS engines are open-source: read and learn both the data structures and the algorithms operating on them. Some basic information can be googled on the net. Of course, the latest know-how of top-rank engines like Google, Yandex, etc. is probably unpublished or patent-covered. But you do not need that much. — Arioch 'The
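To illustrate the inverted-index idea from the comments, here is a rough sketch of one possible variant: an index keyed by 2-letter substrings (bigrams), so that substring queries such as "ag" can be answered. It is not how any particular FTS engine does it; the class name NgramIndex, the gram length N = 2, and the sample data are assumptions for this example. As the comments warn, short grams make the index larger, so N is a trade-off.

```python
from collections import defaultdict

N = 2  # gram length (an assumption; larger N gives a smaller index)


def ngrams(text, n=N):
    """Return the set of all length-n substrings of lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class NgramIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # gram -> ids of songs containing it
        self.texts = {}                   # song id -> lowercased combined tags

    def add(self, song_id, *fields):
        text = " ".join(fields).lower()
        self.texts[song_id] = text
        for gram in ngrams(text):
            self.postings[gram].add(song_id)

    def search(self, query):
        query = query.lower()
        grams = ngrams(query)
        if not grams:
            # Query shorter than N: no gram to look up, so scan everything.
            candidates = set(self.texts)
        else:
            # A match must contain every gram of the query.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Containing all grams is necessary but not sufficient, so verify.
        return sorted(i for i in candidates if query in self.texts[i])


SONGS = [
    ("Dear Agony", "Breaking Benjamin", "Test Name"),
    ("Radioactive", "Imagine Dragons", "Billboard"),
    ("Feel Good Drag", "Anberlin", "Random"),
    ("Khamaj", "Fuzon", "Tere Liye"),
]
index = NgramIndex()
for song_id, fields in enumerate(SONGS):
    index.add(song_id, *fields)

print([SONGS[i][0] for i in index.search("ag")])
```

The final substring check is needed because a song can contain all the bigrams of a longer query without containing the query itself.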

prmottajr · Accepted Answer · 2013-12-26T17:19:36

If you split all the metadata (track name, band, album) into words, you could have a hash table keyed by word, with a linked list as the value containing all the tracks in which that word occurs.
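A minimal sketch of that hash table in Python, assuming a dict as the hash table and plain Python lists in place of linked lists. The function name build_word_index and the sample data are illustrative only.

```python
from collections import defaultdict


def build_word_index(songs):
    """Map each lowercased word to the list of ids of songs that contain it."""
    index = defaultdict(list)
    for song_id, fields in enumerate(songs):
        words = set()
        for field in fields:
            words.update(field.lower().split())
        for word in words:  # a set, so each song is listed once per word
            index[word].append(song_id)
    return index


SONGS = [
    ("Dear Agony", "Breaking Benjamin", "Test Name"),
    ("Radioactive", "Imagine Dragons", "Billboard"),
    ("Feel Good Drag", "Anberlin", "Random"),
    ("Khamaj", "Fuzon", "Tere Liye"),
]
word_index = build_word_index(SONGS)

print(word_index.get("agony", []))  # ids of songs containing the word "agony"
print(word_index.get("ag", []))     # empty: "ag" is not a whole word
```

On its own this only answers whole-word queries; a partial query such as "ag" needs an extra structure over the keys.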

For searching, you could have a B+ tree that indexes the words to get to the keys of the hash table (more or less as a word processor does for autocorrection).
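Python has no B+ tree in its standard library, so the sketch below swaps in a sorted list of the hash table's keys searched with the bisect module, which supports the same kind of ordered prefix lookup. The small word_index dict and the function names are hypothetical, made up for this example.

```python
from bisect import bisect_left

# Hypothetical word -> song ids table, as a word-level hash table might hold.
word_index = {
    "agony": [0], "dear": [0],
    "dragons": [1], "imagine": [1],
    "drag": [2], "good": [2],
}
sorted_words = sorted(word_index)  # stand-in for the ordered B+ tree keys


def words_with_prefix(sorted_words, prefix):
    """Return all words starting with `prefix`, using binary search."""
    prefix = prefix.lower()
    i = bisect_left(sorted_words, prefix)  # first word >= prefix
    matches = []
    while i < len(sorted_words) and sorted_words[i].startswith(prefix):
        matches.append(sorted_words[i])
        i += 1
    return matches


def search_prefix(prefix):
    """Return sorted ids of songs with at least one word starting with `prefix`."""
    ids = set()
    for word in words_with_prefix(sorted_words, prefix):
        ids.update(word_index[word])
    return sorted(ids)


print(search_prefix("dra"))
```

Note that this finds words that start with the query, so "ag" reaches "agony" but not "imagine" or "drag"; matching in the middle of a word would need something like an n-gram or suffix-based index instead.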

Cheers
