Showing posts with label attributes. Show all posts

Monday, March 12, 2012

Question on Naive Bayes Viewer

Hi, guys,

I ran into a very strange problem with the Naive Bayes viewer: one of the attributes does not appear in it. The original data type of the attribute is int, but within the mining structure I changed it to Discrete with Text as its data type. After I trained the model, that attribute does not appear in the Naive Bayes viewer at all. Why is that?

I have set the dependency value very low so that all attributes should appear, but that one attribute is still missing.

I am looking forward to hearing from you shortly and thanks a lot in advance.

With best regards,

Yours sincerely

When you change a column's content type or data type (e.g., from Integer to Text), BI Developer Studio might mark the column as Ignore in the mining models that use it, which excludes it from the model.

Could this be the issue you are seeing?

|||

No, the column is still labelled as input.

|||Was there a specific reason you needed to change the data type to Text? As far as the model is concerned, there's no difference between text and int for discrete attributes.

|||

Hi, Raman,

Thank you very much.

I think I understand now: set the content type to Discrete, but leave the data type as Int? When I left the data type as Int and the content type as Continuous, the model treated all the data as continuous. Is that right? Thank you.

With best regards,

Yours sincerely,

|||Correct - if you want ints to be treated as discrete, you just need to make sure that the content type is set correctly (to Discrete or Discretized).
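To illustrate why the data type does not matter for a discrete attribute, here is a minimal Python sketch (not the Analysis Services implementation). It assumes a hypothetical `Region` attribute and `Buyer` target, and computes the per-class state frequencies that a Naive Bayes style model relies on. The distributions come out the same whether the states are stored as ints or as text, because discrete states are only labels.

```python
from collections import Counter, defaultdict

def conditional_counts(rows, attribute, target):
    """Count how often each attribute state occurs for each target value."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[target]][row[attribute]] += 1
    return counts

def as_distribution(counts):
    """Turn raw counts into P(state | target), keyed by the state's label."""
    return {
        target_value: {
            str(state): n / sum(state_counts.values())
            for state, n in state_counts.items()
        }
        for target_value, state_counts in counts.items()
    }

# Hypothetical training rows; Region is stored as an int.
rows_int = [
    {"Region": 1, "Buyer": "Yes"},
    {"Region": 1, "Buyer": "Yes"},
    {"Region": 2, "Buyer": "Yes"},
    {"Region": 2, "Buyer": "No"},
    {"Region": 3, "Buyer": "No"},
    {"Region": 3, "Buyer": "No"},
]

# The same rows with Region stored as text.
rows_text = [{"Region": str(r["Region"]), "Buyer": r["Buyer"]} for r in rows_int]

dist_int = as_distribution(conditional_counts(rows_int, "Region", "Buyer"))
dist_text = as_distribution(conditional_counts(rows_text, "Region", "Buyer"))

print(dist_int == dist_text)
```

What changes the model's behavior is the content type: a Continuous column would be summarized numerically instead of being counted state by state.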

|||

Hi, Raman,

Thanks.

With best regards,

Yours sincerely,

Wednesday, March 7, 2012

Question on filtering itemsets by more than one attribute in Association Rules

Hi, all experts here,

Thanks for your kind attention.

I want to filter the itemsets or rules based on two or more attributes. How can we achieve that? At the moment I can only filter them by a single attribute. Is it possible at all?

Thanks a lot and I am looking forward to hearing from you shortly.

With best regards,

Yours sincerely,

The filter can actually take a regular expression, written in the .NET regular expression syntax (the Regex class).

Assume that you want any itemset that includes both the Helmets and the Fenders attributes, in any order.

Here is an example of such a filter:

((.*Helmets.*Fenders.*)|(.*Fenders.*Helmets.*))

|||
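As a quick sanity check of the pattern quoted in this thread, here is a small Python sketch. The viewer itself uses the .NET Regex class, so Python's `re` module is only a stand-in here, although this particular pattern behaves the same way in both. The itemset captions are hypothetical and made up for illustration.

```python
import re

# Pattern quoted from the thread: both names must appear, in either order.
PATTERN = re.compile(r"((.*Helmets.*Fenders.*)|(.*Fenders.*Helmets.*))")

# Hypothetical itemset captions, not taken from a real model.
itemsets = [
    "Helmets = Existing, Fenders = Existing",
    "Fenders = Existing, Gloves = Existing, Helmets = Existing",
    "Helmets = Existing, Gloves = Existing",
    "Gloves = Existing",
]

def filter_itemsets(captions, pattern=PATTERN):
    """Keep only the captions that the pattern matches."""
    return [caption for caption in captions if pattern.search(caption)]

matches = filter_itemsets(itemsets)
for caption in matches:
    print(caption)
```

Only the first two captions pass the filter, because they are the only ones that mention both Helmets and Fenders. To require a third attribute the same way, every ordering would have to be listed as an alternative, so the pattern grows quickly.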

Hi, are you saying this can be achieved in the association rules viewer?

Thanks.

|||Yes. Type the regular expression directly into the filter box.|||

Hi, Bogdan,

Thanks a lot.

Best regards,

Monday, February 20, 2012

Question on attribute selection for unsupervised and supervised algorithms

Hi, all,

Thanks for your kind attention.

I just wonder whether there are any good approaches for selecting attributes for training models, both for unsupervised algorithms such as Association Rules and Clustering and for supervised algorithms such as decision trees.

It would be very interesting to hear about any best practices and popular methods for dealing with this issue.

I am looking forward to hearing from you, and thanks for your advice.

With best regards,

Yours sincerely,

Hi,

I assume that you are trying to select a subset of all your attributes to train the models. The SQL Server Data Mining algorithms have built-in feature selection methods. For example, Microsoft Decision Trees supports the following attribute scoring methods: Entropy, Bayesian with K2 Prior, and Bayesian Dirichlet Equivalent with Uniform Prior (which is used by default). When feature selection is necessary, the algorithm calculates a score for each attribute and trains trees using only the selected features (those with the top scores, of course). Other algorithms have similar feature selection mechanisms.

Thanks,
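To make the idea of score-based feature selection concrete, here is a minimal Python sketch of an entropy score (information gain) used to keep the top-scoring attributes. It is a generic illustration, not the code that SQL Server runs. The function names and the `Commute`, `Owner`, `Coin`, and `Buyer` columns are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in target entropy after splitting the rows on one attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def select_top_attributes(rows, attributes, target, k):
    """Score every attribute and keep the k with the highest scores."""
    ranked = sorted(
        attributes,
        key=lambda a: information_gain(rows, a, target),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical training data: Commute predicts Buyer perfectly,
# Owner helps a little, and Coin carries no information.
columns = ("Commute", "Owner", "Coin", "Buyer")
rows = [dict(zip(columns, values)) for values in [
    ("Short", "Yes", "H", "Yes"),
    ("Short", "Yes", "H", "Yes"),
    ("Short", "Yes", "T", "Yes"),
    ("Short", "No",  "T", "Yes"),
    ("Long",  "Yes", "H", "No"),
    ("Long",  "No",  "H", "No"),
    ("Long",  "No",  "T", "No"),
    ("Long",  "No",  "T", "No"),
]]

selected = select_top_attributes(rows, ["Coin", "Owner", "Commute"], "Buyer", k=2)
print(selected)
```

In this toy data set the uninformative `Coin` attribute is dropped, which mirrors the behavior described above: attributes are scored first, and only the top-scoring ones are used for training.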
