Best for resume, document matching

Question

I have used three different ways to calculate the match between a resume and a job description. Can anyone tell me which method is best, and why?

1. NLTK for keyword extraction, then RAKE for keyword/keyphrase scoring, then cosine similarity.
2. scikit-learn for keyword extraction, tf-idf, and cosine similarity calculation.
3. The Gensim library with an LSA/LSI model to extract keywords and calculate cosine similarity between the documents and the query.

I think you are going to need to test with your data. Since they are different kinds of documents, I think you would be better off using resumes that are a match for the job. — paparazzo
@Paparazzi They are giving different results, so I am a little confused about which to use. Have you already done related work? — Khalid Usman
If you have all three then let it be up to the user which to use. — paparazzo
Actually, I am an iOS expert; this is my first project in information retrieval and machine learning, so I am just doing R&D without any guidance. Can you guide me if you have already done related work? — Khalid Usman
That is my guidance. Give users the options. Since a job description is not the same as a resume this is not going to be perfect. — paparazzo

alexis · Accepted Answer · 2016-11-02T14:58:43

Nobody here can give you the answer. The only way to decide which method works better is to have one or more humans independently match lots and lots of resumes and job descriptions, and compare what they do to what your algorithms do. Ideally you'd have a dataset of already matched resumes and job descriptions (companies must do this kind of thing when people apply), because it takes a lot of work to create a sufficiently large dataset.
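
As a rough sketch of what that comparison could look like: suppose humans have labeled each (resume, job description) pair as a match (1) or not (0), and each method has produced a similarity score for the same pairs. One possible measure, which is my choice here and not the only option, is the fraction of (match, non-match) pairs in which the match received the higher score (this equals ROC AUC). The labels, scores, method names, and the `ranking_agreement` helper below are all made up for illustration.

```python
from itertools import product


def ranking_agreement(human_labels, scores):
    """Fraction of (match, non-match) pairs where the match scored higher.

    Ties count as half a win. This is the same quantity as ROC AUC.
    """
    positives = [s for s, y in zip(scores, human_labels) if y == 1]
    negatives = [s for s, y in zip(scores, human_labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one match and one non-match")

    wins = 0.0
    for pos, neg in product(positives, negatives):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical human judgments for six resume/job pairs (1 = match).
human_labels = [1, 0, 1, 0, 0, 1]

# Hypothetical similarity scores from each method for the same six pairs.
method_scores = {
    "rake_cosine": [0.80, 0.30, 0.60, 0.65, 0.10, 0.70],
    "tfidf_cosine": [0.90, 0.20, 0.70, 0.40, 0.30, 0.80],
    "lsi_cosine": [0.50, 0.60, 0.40, 0.70, 0.20, 0.55],
}

results = {
    name: ranking_agreement(human_labels, scores)
    for name, scores in method_scores.items()
}
best = max(results, key=results.get)

for name, value in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

The numbers above say nothing about which method is actually better; only scores computed on your own labeled resumes and job descriptions can tell you that, and six pairs would be far too few in practice.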

Next time you take on this kind of project, start by considering how you are going to evaluate the performance of the solution you'll put together.
