
To be held in conjunction with the Third SIAM International Conference on Data Mining (SDM 2003)
The proliferation of digital computing devices and their use in communication has increased the demand for systems and algorithms capable of mining textual data. The development of techniques for mining unstructured, semi-structured, and fully structured textual data has therefore become important in both academia and industry. This Workshop will survey the emerging field of Text Mining - the application of machine learning techniques in conjunction with natural language processing, information extraction, and algebraic/mathematical approaches to computational information retrieval. Many issues are being addressed in this field, ranging from the development of new learning approaches to the parallelization of existing algorithms. The goal of this Workshop is to provide a venue for researchers to share initial approaches and preliminary results of recent research in Text Mining. Through the careful selection and review of submitted workshop papers, we hope to provide a selection of topics that will both generate interest and provide insight into the state of the field of Text Mining.
A one-day workshop on Text Mining is being held in conjunction with SDM 2003 in San Francisco, CA (May '03) to bring together researchers from a variety of disciplines to present their current approaches and results in text mining.
Algorithms and Models
- Bayesian Models
- Concept Decomposition
- Orthogonal Decomposition
- Probabilistic Models
- Vector Space Models
- Latent Semantic Indexing
- Graph-based Models
- Text Streaming Models
Applications
- Clustering
- Factor Analysis
- Visualization Techniques
- Metadata Generation
- Text Classification
- Text Purification
- Text Segmentation
- Text Summarization
- Query Structures
- Trend Detection
- Distributed Storage and Retrieval
Attendees are required to register for SDM 2003; no separate registration is needed for this workshop.
To submit a paper for consideration, email either a PostScript or PDF file as an attachment to berry@cs.utk.edu.
Papers should be printable on 8.5 × 11 inch paper and be roughly 10 pages in length, using an 11pt font in two-column format with 1-inch margins.
To guarantee consideration, manuscripts must be received by February 18, 2003 (originally February 15, 2003).
The deadline has now passed.
Submission of work in progress is also encouraged.
Keynote Speaker: Hinrich Schütze (Enkata Technologies)
Keynote Presentation Slides (PowerPoint)
Schedule (PDF)
Katy Börner, Indiana University
Malu Castellanos, HP Labs, Palo Alto
Chris Ding, Lawrence Berkeley National Lab. (NERSC)
William Ferng, Boeing
Rick Fierro, California State University at San Marcos
Kyle Gallivan, Florida State University
David Icove, Tennessee Valley Authority
Anne Kao, Boeing
Mei Kobayashi, IBM Tokyo Research Lab
Haesun Park, University of Minnesota
Dan Phelps, Kodak
Padma Raghavan, Pennsylvania State University
Flavio Sartoretto, Univ. of Venezia (Italy)
Pierre Senellart, Ecole Normale Supérieure (Paris)
Organizers

Michael W. Berry
Department of Computer Science
203 Claxton Complex
University of Tennessee
Knoxville, TN 37996-3450
Phone: (865) 974-3838
Fax: (865) 974-4404
berry@cs.utk.edu

William M. Pottenger
Department of Computer Science and Engineering
Lehigh University
Bethlehem, PA 18015
Phone: (610) 758-3454
Fax: (610) 758-6279
billp@cse.lehigh.edu

Eastman Kodak Company
Telcordia Technologies