
Team,

I am working on a project where I need to classify items into categories. My input is a single file in which each line contains the target variable followed by space-separated features. My training data looks like this:

Category Name [Tab] DataString

Plumbing [Tab] Pipe Tap Plastic Pipe PVC Pipe Cold Water Line Hot Water Line Tee outlet up Elbow turned up Elbow turned down Gate valve Globe valve

Paint [Tab] Ivory Black Burnt Umber Caput Mortuum Violet Earth Red Yellow Ochre Titanium White Cadmium Yellow Light Cadmium Yellow Deep

Cloths [Tab] Shirt T-Shirt Pent Jeans Tee Cargo
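
For illustration, here is a minimal sketch of how one line in this format could be read, assuming `[Tab]` stands for a literal tab character and every feature is a single whitespace-separated word. The function name `parse_line` is hypothetical, not part of any library.

```python
def parse_line(line):
    """Split one 'Category<TAB>DataString' line into (category, tokens)."""
    # Everything before the first tab is the target variable;
    # everything after it is the space-separated feature string.
    category, _, data = line.rstrip("\n").partition("\t")
    return category.strip(), data.split()


# Sample row copied from the training data above.
sample = "Cloths\tShirt T-Shirt Pent Jeans Tee Cargo"
category, tokens = parse_line(sample)
```

Note that splitting on whitespace turns multi-word features such as "Cold Water Line" into separate words, so this is a bag-of-words view of each row.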

I have a very large set of categories, and I have a couple of questions:

  1. Am I using the correct data for training? If not, what should I use?
  2. Once I have trained and tested my model, what is the next step? How can I use the output?

Please help me with this.

Thanks,

Nimesh

Do you have multiple entries for each category, e.g. several lines for the 'Paint' category each with a different but overlapping set of words? - Sicco
You can look at the tutorial at chimpler.wordpress.com/2013/03/13/… It's implementing something very similar to what you are trying to do. - Frederic Dang Ngoc

1 Answer


Yes. Once you have some output, you can use it to test the model: run a test dataset through it and look at the results. Some predictions will be good and some will not, so adjusting the model or the test dataset may be your next step.
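
The train, test, and use cycle could look roughly like the sketch below. It assumes a bag-of-words multinomial naive Bayes classifier with add-one smoothing, which is my choice for illustration; the thread does not name an algorithm or library. The function names (`train`, `predict`, `accuracy`) and the held-out rows are hypothetical; only the three training rows come from the question.

```python
import math
from collections import Counter, defaultdict


def train(rows):
    """rows: list of (category, tokens). Returns the counts used for prediction."""
    doc_counts = Counter()              # number of training rows per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()
    for category, tokens in rows:
        words = [t.lower() for t in tokens]
        doc_counts[category] += 1
        word_counts[category].update(words)
        vocab.update(words)
    return doc_counts, word_counts, vocab


def predict(model, tokens):
    """Return the category with the highest log-probability for the tokens."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / total_docs)  # class prior
        total_words = sum(word_counts[category].values())
        for t in tokens:
            # Add-one smoothing so unseen words do not zero out a category.
            count = word_counts[category][t.lower()]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = category, score
    return best


def accuracy(model, labelled_rows):
    """Fraction of labelled rows whose predicted category matches the label."""
    hits = sum(predict(model, toks) == cat for cat, toks in labelled_rows)
    return hits / len(labelled_rows)


# Training rows taken from the question.
training_rows = [
    ("Plumbing", "Pipe Tap Plastic Pipe PVC Pipe Cold Water Line Hot Water Line "
                 "Tee outlet up Elbow turned up Elbow turned down Gate valve "
                 "Globe valve".split()),
    ("Paint", "Ivory Black Burnt Umber Caput Mortuum Violet Earth Red Yellow "
              "Ochre Titanium White Cadmium Yellow Light Cadmium Yellow "
              "Deep".split()),
    ("Cloths", "Shirt T-Shirt Pent Jeans Tee Cargo".split()),
]

# Hypothetical held-out rows, invented here only to show the testing step.
test_rows = [
    ("Plumbing", ["PVC", "Pipe", "Elbow"]),
    ("Paint", ["Titanium", "White"]),
    ("Cloths", ["Jeans", "Shirt"]),
]

model = train(training_rows)
test_accuracy = accuracy(model, test_rows)      # step 1: test on labelled data
new_item = predict(model, ["Globe", "valve"])   # step 2: classify a new item
```

If the accuracy on held-out rows is poor, that is the signal to adjust the model or the data; once it is acceptable, `predict` is what you call on new, unlabelled items. With one row per category, as in the question, the estimates will be weak, so several rows per category would likely help.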