I have a problem statement and need to know whether it can be solved with machine learning. It goes like this:

I have a system in which users can upload documents. Say we have a file named xxxZxxx.xxx.

The user navigates several levels into the system's folder structure and places the file at, say, A/B/C/D/Z/xxxZxxx.xxx.

We need to make a system that reads the file name and suggests the path where it is to be placed.

In this case the file name contains the last part of the path, which is a Business Object directory, but this is not always so. We have on the order of 10^5 such paths and documents.

New paths (i.e. business objects) may be added over time, which makes this a multi-class classification problem with approximately 10^5 classes, and the number keeps increasing.

Is this solvable?

I tried using a bag of characters (inspired by bag of words) as the feature vector, but it failed.
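For reference, a minimal sketch of the kind of bag-of-characters vector meant here. The function name and the sample file names are hypothetical, and lowercasing is an assumption. Because the representation discards character order, two different names built from the same characters map to the same vector, which may be part of why it performs poorly on this task.

```python
from collections import Counter


def bag_of_characters(filename: str) -> Counter:
    """Count each character in the file name, ignoring order and case."""
    return Counter(filename.lower())


# Hypothetical file names: same characters, different order.
a = bag_of_characters("report_AB.txt")
b = bag_of_characters("report_BA.txt")

# The two vectors are identical, so a classifier cannot tell the names apart.
print(a == b)
```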

Any comments on an approach that could work for this? Let me know if any other information is needed, and I will edit the question or change the tags.


1 Answer


To establish whether this is truly an ML problem, please answer the following:

1) Why can't you just read the filename and get the child folder where the file needs to be placed? Is it because, as you said, the user may not provide the name of the child folder as part of the filename? Or is it because there might be many directories with the name the user provided?

2) ML problems typically involve patterns that are statistical in nature and hard to identify by eye or with simple rules such as a regex. Here, can't you easily find the appropriate folder using a regular expression search?
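As a rough illustration of the regex route, here is a minimal sketch. It assumes you can list the known folder paths up front; the function names and the example paths are hypothetical. It indexes each path by its last directory name and returns every path whose leaf name appears in the file name.

```python
import re
from collections import defaultdict


def build_index(paths):
    """Map each leaf directory name (lowercased) to the full paths ending in it."""
    index = defaultdict(list)
    for path in paths:
        leaf = path.rstrip("/").split("/")[-1]
        if leaf:  # skip empty names, which would match every file
            index[leaf.lower()].append(path)
    return index


def suggest_paths(filename, index):
    """Return known paths whose leaf directory name occurs in the file name."""
    stem = filename.rsplit(".", 1)[0].lower()  # drop the extension
    hits = []
    for leaf, paths in index.items():
        if re.search(re.escape(leaf), stem):
            hits.extend(paths)
    return hits


# Hypothetical folder structure and file name.
index = build_index(["A/B/C/D/Invoices", "A/B/E/Contracts"])
print(suggest_paths("2020_invoices_q1.pdf", index))
```

If this returns more than one path, or none, you are in one of the two situations asked about in question 1, and that is where something beyond a plain lookup might be needed. Folders added later only require a new index entry, not retraining.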