To be Held in Conjunction with the
Second SIAM International Conference on Data Mining (SDM 2002)
Extracting content from text continues to be an important research problem for information processing and management. Approaches to capturing the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory. We have clearly moved beyond the bag-of-words notion of text documents to exploit not only the patterns but also the structure of term usage. As digital libraries and the World Wide Web continue to add to the enormous volume of online textual material, effective yet scalable approaches to text mining will be needed. How can we know what a document is about without having to read it? How do we automatically cluster or categorize documents from diverse sources? What are the best ways to clean text? Can semantic models of text be adequately visualized? These are some of the fundamental yet simple questions we must be able to address.
A one-day workshop on Text Mining is being held in conjunction with SDM 2002 in Arlington, VA (April '02) to bring together researchers from a variety of disciplines to present their current approaches and results in text mining. One of the main themes of the workshop will be clustering unstructured collections of documents and the attendant challenges of high dimensionality and sparsity.
Algorithms and Models
- Bayesian Models
- Concept Decomposition
- Orthogonal Decomposition
- Probabilistic Models
- Vector Space Models
- Latent Semantic Indexing
- Graph-based Models
- Mining Continuous Text Streams
- Software and Toolkits
Applications
- Clustering
- Factor Analysis
- Mapping Science/Domain Visualization
- Metadata Generation
- Text Classification
- Text Parsing
- Text Purification
- Text Segmentation
- Text Summarization
- Query Structures
- Software and Toolkits
- Stemming
Attendees are required to register for SDM 2002; no separate registration is needed for this workshop.
To submit a paper for consideration, send 4 copies of the manuscript to
Ms. Peggy Stewart (see address below). Electronic submissions (PostScript or
PDF versions printable on 8.5 x 11 paper only) are strongly encouraged. To
guarantee consideration, manuscripts must be received by
December 28, 2001 (extended from December 21, 2001),
and must be no more than 12 pages excluding
figures, tables, and references. A two-column format
with 1-inch margins should be used, and all selected papers will need
to be converted to PDF format for online posting (by SIAM).
Submission of work in progress is also encouraged.
Send all submissions to:
Ms. Peggy Stewart
Attn: Text Mining Workshop
Army High Performance Computing Research Center
1100 South Washington Avenue
Minneapolis, MN 55415
Tel: (612) 626-8079
Fax: (612) 626-1596
stewart@cs.umn.edu

Select either PostScript or PDF format (updated on April 4, 2002).
Katy Börner, Indiana University
Malu Castellanos, HP Labs, Palo Alto
Chris Ding, Lawrence Berkeley National Lab. (NERSC)
Rick Fierro, California State University at San Marcos
Efim Gendler, iBoogie.tv
Kyle Gallivan, Florida State
Liz Jessup, University of Colorado
Haesun Park, University of Minnesota
Dulce Ponceleon, IBM Almaden
Bill Pottenger, Lehigh University
Padma Raghavan, Pennsylvania State University
Flavio Sartoretto, Univ. of Venezia (Italy)
Malcolm Slaney, IBM Almaden
Marc Teboulle, Tel-Aviv University (Israel)
Layne Watson, Virginia Tech
Jason Wu, Boeing
Organizers

Michael W. Berry
Department of Computer Science
203 Claxton Complex
University of Tennessee
Knoxville, TN 37996-3450
Phone: (865) 974-3838
Fax: (865) 974-4404
berry@cs.utk.edu

Inderjit Dhillon
Department of Computer Science
University of Texas
Austin, TX 78712-1188
Phone: (512) 471-9725
Fax: (512) 471-8885
inderjit@cs.utexas.edu

Jacob Kogan
Department of Mathematics and Statistics
Univ. of Maryland, Baltimore County
Baltimore, MD 21250
Phone: (410) 455-3297
Fax: (410) 455-1066
kogan@math.umbc.edu

Assistant

Justin T. Giles
Department of Computer Science
203 Claxton Complex
University of Tennessee
Knoxville, TN 37996-3450
Phone: (865) 974-4196
Fax: (865) 974-4404
Last modified on Jan. 8, 2002.