Hyatt Regency Bethesda
Bethesda, Maryland

April 22, 2006

to be held in conjunction with

Sixth SIAM International Conference on Data Mining (SDM 2006)

and also in conjunction with SIAM's Link Analysis, Counterterrorism, Security Workshop,
which is also being held on April 22, 2006*.

* IMPORTANT NOTE: Because of the overlapping and continuing interest in the Enron data set, the Text Mining Workshop (co-chaired by Malu Castellanos and Michael W. Berry) will work with the Link Analysis, Counterterrorism, Security Workshop (co-chaired by Kathleen Carley, Carnegie Mellon, and Ankur Teredesai, Rochester Institute of Technology) to coordinate sessions so that attendees can participate in the areas of most interest to them. The Counterterrorism Workshop may contain work on the Enron data set, but will also include papers and presentations on other topics such as link analysis and graph mining. Likewise, the Text Mining Workshop is not limited to the Enron data set but will include the topics listed below.



General Topics

The proliferation of digital computing devices and their use in communication has resulted in an increased demand for systems and algorithms capable of mining textual data. Thus, the development of techniques for mining unstructured, semi-structured, and fully structured textual data has become quite important in both academia and industry. As a result, this Workshop will survey the emerging field of Text Mining - the application of techniques of machine learning in conjunction with natural language processing, information extraction and algebraic/mathematical approaches to computational information retrieval. Many issues are being addressed in this field ranging from the development of new learning approaches to the parallelization of existing algorithms. The goal of this workshop is to provide a venue for researchers to share initial approaches and preliminary results of recent research in Text Mining. Through the careful selection and review of submitted workshop papers, we hope to provide a suitable selection of topics that will both generate interest and provide insight into the state of the field of Text Mining.

Special Topic - Text Mining with the Enron Data Set

Because of the interest generated by the availability of the Enron data set of 1.3 million email messages (see Enron Email Dataset) and its versatility in terms of potential research topics (link analysis, pattern matching), researchers are encouraged to submit either to this workshop or, if they feel it more appropriate, to the aforementioned Counterterrorism Workshop.

Other Specific Topics of Interest Include:


Registration

Attendees are required to register for SDM 2006; no separate registration is needed for this workshop.


Submission Requirements

To submit a paper for consideration, email either a PostScript or PDF file as an attachment to the Program Chair, Malu Castellanos at Hewlett-Packard (malu.castellanos@hp.com). Papers should be printable on 8.5 × 11 inch paper only and be roughly 10 pages in length, using an 11pt font in two-column format with 1-inch margins. To guarantee consideration, manuscripts must be received by January 16, 2006 (extended from January 9, 2006). Submission of work in progress is also encouraged.


Important Dates

Papers Due: January 16, 2006 (extended deadline; originally January 9, 2006). Deadline passed.

Notifications sent: February 9, 2006 (originally February 1, 2006). Deadline passed.

Camera ready: Final papers due to workshop chairs February 20, 2006 (originally February 15, 2006). Deadline passed.


Workshop Program (PDF) as of February 23, 2006
Welcoming slides (PDF) by M. Berry, posted on April 23, 2006.

The Keynote speaker is Dr. Ashok Srivastava, NASA Ames Research Center,
and the title of his talk is "The Needle in the Haystack Problem: Discovering Anomalies in Text Documents"

(Handouts (PDF), Animation)

Sponsors: PureDiscovery of Dallas, Texas and SAS Institute Inc. of Cary, NC.


Program Committee

Co-Chairs: Malu Castellanos, HP Labs and Michael W. Berry, University of Tennessee
Chris Ding, Lawrence Berkeley National Lab. (NERSC)
William Ferng, Boeing
Kyle Gallivan, Florida State University
Mei Kobayashi, IBM Tokyo Research Lab
Rosie Jones, Yahoo Research Labs
Stephen Soderland, University of Washington
Haesun Park, Georgia Tech

Peg Howland, Utah State University
April Kontostathis, Ursinus College

Padma Raghavan, Penn State University
Efstratios Gallopoulos, University of Patras, Greece
Lakshminarayan Choudur, Hewlett-Packard Laboratories
Pierre Senellart, INRIA (France)


Organizational Committee

Co-Chairs:
Malu Castellanos
Intelligent Enterprise Technologies Laboratory
Hewlett-Packard Laboratories, Palo Alto, CA
Phone: (650) 857-3074
Fax: (650) 852-8137
malu.castellanos@hp.com


Michael W. Berry
Department of Computer Science
204 Claxton Complex
University of Tennessee
Knoxville, TN 37996-3450
Phone: (865) 974-3838
Fax: (865) 974-4404
berry@cs.utk.edu

Publicity and Coordination

Murray Browne
Department of Computer Science
317 Claxton Complex
University of Tennessee
Knoxville, TN 37996-3450
(865) 974-3510
mbrowne@cs.utk.edu


Last modified on February 23, 2006