Data Mining Assignment 2
Submission Deadline: Thursday, February 16, at
Experiments with Weka
- Download a copy of the Weka software from http://www.cs.waikato.ac.nz/ml/weka/ and then set the
CLASSPATH environment variable to /bin/weka/weka-3-6-12 so that
Java can find this installation. With tcsh, add the following line to
your ~/.cshrc file:
setenv CLASSPATH /bin/weka/weka-3-6-12
- Copy the data files from /bin/weka/weka-3-6-12/data to your home
directory.
- Run the Apriori algorithm on at least seven databases, and
J48 and Cobweb on at least two databases. For example:
java weka.associations.Apriori -t weather.nominal.arff
java weka.classifiers.trees.J48 -t iris.arff
java weka.clusterers.Cobweb -t weather.arff
- With Apriori, try the -N, -C, and -M switches.
-N <required number of rules> specifies the number of rules.
-C <miniconf> specifies the minimum confidence.
-M <minisup> specifies the minimum support.
- Submit a report in plain text by e-mail documenting the
differences that the -N, -C, and -M switches make on the soybean.arff file.
You should try at least two different values for each of these
switches: one greater and one smaller than the default value (the
value used when the switch is omitted).
- Include in the report the trees generated by J48 and Cobweb on a small
database (smaller than the soybean.arff file, such as weather).
Please compare and contrast these two trees.
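
One possible way to organize the Apriori runs on soybean.arff is sketched
below as a small sh script. It is only an illustration, not a required
procedure. It assumes that CLASSPATH is set as described above, that
soybean.arff is in the current directory, and that the defaults are
-N 10, -C 0.9, and -M 0.1 (worth checking against the output of a run
without any switches). The RUN variable, the apriori helper, and the output
file names are hypothetical choices made for this sketch.

```sh
#!/bin/sh
# Sketch: print (or run) one Apriori command per switch value on soybean.arff.
# Assumed defaults: -N 10, -C 0.9, -M 0.1. Verify them before relying on this.

DATA=soybean.arff
RUN=${RUN:-0}   # 0 = only print the commands; 1 = also execute them

apriori() {
    # $1 = switch (-N, -C, or -M), $2 = value to try
    out="apriori$1_$2.txt"          # hypothetical output file name
    cmd="java weka.associations.Apriori -t $DATA $1 $2"
    echo "$cmd > $out"
    if [ "$RUN" = 1 ]; then
        $cmd > "$out"
    fi
}

# One value below and one value above each assumed default.
apriori -N 5
apriori -N 20
apriori -C 0.8
apriori -C 0.95
apriori -M 0.05
apriori -M 0.2
```

If the script is saved as sweep.sh (a hypothetical name), "sh sweep.sh"
prints the six commands and "RUN=1 sh sweep.sh" also executes them, leaving
one output file per switch value to compare in the report.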
Please e-mail questions to firstname.lastname@example.org.