Using Rules and Decision Trees Settings
At the very beginning of work on a new database, it is hard to guess which
settings will suit your case best. The settings also depend on what your
aim is: your aim can be to obtain few rules, but with highest
Example of using settings
Suppose you want to analyse a database with 40 000 records, and as the class field you have selected a field that contains the values True/False. In this case, if you set "Minimum Number of Cases for a Rule" to 5, you would probably get thousands of rules, each describing a small data pattern. With such settings you will create overfitted profiles.

If it is hard to decide what value to set for "Minimum Number of Cases for a Rule", check "Classes Statistics", select the smallest value in the "Met In" column and use it as the "Minimum Number of Cases for a Rule". Of course, if that minimum is very small in comparison to the number of records, it is better to use a higher value (for example, if you have 40 000 records in a table and the smallest value in "Met In" equals 1, it is better to ignore that value).

The "Minimum rule probability" setting also has a direct influence on the number of rules and, as a result, on the time needed to create and output them. It is recommended to start with higher values for this setting as well, for example 50%-90%. This value can also be correlated with the value in the "Met In %" column on the "Classes Statistics" page.

For decision tree creation it is better to start with the minimum values in the settings and then keep adjusting them, in particular the minimum number of cases for a rule. Try repeating the query with different settings until you obtain all the necessary combinations of data.
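
The "Classes Statistics" heuristic described above can be sketched outside the program. The Python snippet below is only an illustration, not part of ESTARD Data Miner: the function names and the `min_share` cut-off (what counts as "very small in comparison to the number of records") are assumptions made for this example. It counts how often each class value occurs ("Met In"), computes its share ("Met In %"), and suggests a starting value for the minimum number of cases for a rule.

```python
from collections import Counter


def class_statistics(class_values):
    """Return {class: (met_in, met_in_percent)} for the class-field values."""
    total = len(class_values)
    counts = Counter(class_values)
    return {cls: (n, 100.0 * n / total) for cls, n in counts.items()}


def suggest_min_cases(class_values, min_share=0.01):
    """Suggest a starting minimum number of cases for a rule.

    Uses the smallest "Met In" count, ignoring classes rarer than
    min_share of all records. min_share is an assumed cut-off chosen
    for this sketch, not a value taken from the program.
    """
    total = len(class_values)
    counts = Counter(class_values)
    usable = [n for n in counts.values() if n >= min_share * total]
    # If every class is rarer than the cut-off, fall back to the cut-off itself.
    return min(usable) if usable else max(1, int(min_share * total))


# Hypothetical class field: 40 000 records, one stray value met only once.
values = ["True"] * 30000 + ["False"] * 9999 + ["Unknown"]
print(class_statistics(values))
print(suggest_min_cases(values))
```

Here the single "Unknown" record is ignored, as the text recommends, and the suggestion comes from the smaller of the two remaining classes. A different `min_share` may suit other databases better.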