HTML tag usage from 1996 to 2003. Each line shows the usage
profile for one HTML tag. The height of each point on the line
represents the percentage of pages in which the tag appeared in
a given year. For instance, the three straw-colored lines at
the top are the HTML, HEAD, and BODY tags. The teal lines
rising up in the middle are TABLE, TR, and TD. Full results
from this study will be available in the near future.
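The per-year percentage described in the caption could be computed roughly as follows. This is a minimal sketch, not the study's actual pipeline: the `tag_usage_by_year` function, the `TagCollector` class, and the toy `sample_pages` corpus are all hypothetical names and data introduced for illustration.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the set of distinct tag names seen in one page."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; store them uppercase to match the caption.
        self.tags.add(tag.upper())


def tags_in_page(html):
    """Return the distinct tags that occur at least once in the page."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def tag_usage_by_year(pages):
    """pages: iterable of (year, html) pairs.

    Returns {year: {tag: percentage of that year's pages containing the tag}}.
    """
    page_counts = Counter()
    tag_counts = defaultdict(Counter)
    for year, html in pages:
        page_counts[year] += 1
        # A tag counts once per page, however often it is repeated.
        for tag in tags_in_page(html):
            tag_counts[year][tag] += 1
    return {
        year: {tag: 100.0 * n / page_counts[year] for tag, n in tag_counts[year].items()}
        for year in page_counts
    }


# Hypothetical toy corpus, not data from the study.
sample_pages = [
    (1996, "<html><head></head><body><p>hi</p></body></html>"),
    (1996, "<html><body><table><tr><td>x</td></tr></table></body></html>"),
    (2003, "<html><head></head><body><table><tr><td>y</td></tr></table></body></html>"),
]
usage = tag_usage_by_year(sample_pages)
```

Each inner dictionary holds one year's point for every tag, so plotting `usage[year][tag]` against `year` for a fixed tag gives that tag's usage profile.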
Data Mining and Graph Algorithms
Sparse Matrix Ordering Algorithms
Started as a project to generate protein families (see
Bioinformatics),
this has evolved into a more thorough study of how sparse
matrix ordering algorithms can be used to visualize large
graphs. In the process, we have introduced a new ordering algorithm
that takes into account domain knowledge in the form of
dissimilarity scores. This work has not been published, but
some results from the first paper have been added to the graph mining package
Pajek.
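To illustrate what a sparse matrix ordering does for graph visualization, here is a minimal sketch of the classic Cuthill-McKee ordering. It is a standard textbook algorithm swapped in for illustration, not the new ordering algorithm mentioned above, and it does not use dissimilarity scores. The function names and the small example graph are hypothetical.

```python
from collections import deque


def cuthill_mckee(adjacency):
    """Return a vertex ordering via breadth-first Cuthill-McKee.

    adjacency: dict mapping each vertex to the set of its neighbours (undirected).
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    visited = set()
    order = []
    # Start each connected component from its lowest-degree unvisited vertex.
    for start in sorted(adjacency, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unvisited neighbours in order of increasing degree.
            for w in sorted(adjacency[v] - visited, key=lambda u: (degree[u], u)):
                visited.add(w)
                queue.append(w)
    return order


def bandwidth(adjacency, order):
    """Largest |pos(u) - pos(v)| over all edges under the given ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adjacency for v in adjacency[u]), default=0)


# Hypothetical example: a path 0-3-1-4-2 whose labels are scrambled.
graph = {0: {3}, 3: {0, 1}, 1: {3, 4}, 4: {1, 2}, 2: {4}}
natural = sorted(graph)
reordered = cuthill_mckee(graph)
print(bandwidth(graph, natural), bandwidth(graph, reordered))  # prints "3 1"
```

After reordering, connected vertices sit next to each other, so the nonzero entries of the adjacency matrix cluster near the diagonal. That clustering is what makes structure visible when the matrix of a large graph is drawn as an image.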
The preliminary results from two studies are available here:
Data Mining Algorithm Presentations and Posters
Presentations, posters, and reports on existing data mining
algorithms.
- Data Clustering Overview Poster (formatted for two 11x17 sheets of paper for printing)
- K-means for Large Data slide
- Support Vector Machines overview poster
- OPTICS (density-based clustering algorithm) presentation
- Fastmap vs. MDS comparison (html)

