In my opinion the best dataset for training purposes if a sample of the real world data that your classifier will be used with.
This is true for all classifiers (but some of them are indeed not suitable to unbalanced training sets in which cases you don't really have a choice to skew the distribution), but particularly for probabilistic classifiers such as Naive Bayes. So the best sample should reflect the natural class distribution.
Note that this is important not only for the class priors estimates. Naive Bayes will calculate for each feature the likelihood of predicting the class given the feature. If your bayesian classifier is built specifically to classify texts, it will use global document frequency measures (the number of times a given word occurs in the dataset, across all categories). If the number of documents per category in the training set doesn't reflect their natural distribution, the global term frequency of terms usually seen in unfrequent categories will be overestimated, and that of frequent categories underestimated. Thus not only the prior class probability will be incorrect, but also all the P(category=c|term=t)
estimates.