Monday, February 20, 2012

Question on attribute selection for unsupervised and supervised algorithms

Hi, all,

Thanks for your kind attention.

I just wonder whether there are any good approaches for selecting attributes when training models, both for unsupervised algorithms such as Association Rules and Clustering, and for supervised algorithms such as decision trees.

It would be very interesting to hear about any best practices and popular methods for dealing with this issue.

I am looking forward to hearing from you, and thank you for your advice.

With best regards,

Hi,

I assume that you are trying to select a subset of all your attributes to train the models. SQL Server Data Mining algorithms have built-in feature selection methods. For example, the Microsoft Decision Trees algorithm supports the following attribute scoring methods: Entropy, Bayesian with K2 Prior, and Bayesian Dirichlet Equivalent with Uniform Prior (which is used by default). When feature selection is necessary, the algorithm calculates a score for each attribute and only trains the trees with the selected features (those with the top scores, of course). Other algorithms have similar feature selection mechanisms.
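
To give a rough sense of how entropy-based attribute scoring works in general (this is only a minimal Python sketch of the idea, not the actual SQL Server Data Mining implementation; the toy data, attribute names, and helper functions are purely illustrative):

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attribute, target):
    # Reduction in target entropy after splitting the rows on an attribute.
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += (len(subset) / len(rows)) * entropy(subset)
    return base - remainder

def top_attributes(rows, attributes, target, k):
    # Score every candidate attribute and keep the k highest-scoring ones.
    scores = {a: information_gain(rows, a, target) for a in attributes}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy usage: rank two attributes against a 'buys' target.
data = [
    {"age": "young", "income": "high", "buys": "no"},
    {"age": "young", "income": "low",  "buys": "no"},
    {"age": "old",   "income": "high", "buys": "yes"},
    {"age": "old",   "income": "low",  "buys": "yes"},
]
print(top_attributes(data, ["age", "income"], "buys", k=1))  # ['age']

The product's Bayesian scoring methods use different formulas, but the overall flow is the same: score every candidate input attribute against the target and train only on the top-scoring subset.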

Thanks,
