Publications
These are my publications listed from most recent to oldest.
2005
Distributed Data Management for Large Volume Visualization
Jinzhu Gao, Jian Huang, C. Ryan Johnson, Scott Atchley, James Arthur Kohl, Distributed Data Management for Large Volume Visualization, in the proceedings of IEEE Visualization 2005, Minneapolis, MN, Oct 2005.
Abstract: We propose a distributed data management scheme for large data visualization that emphasizes efficient data sharing and access. To minimize data access time and support users with a variety of local computing capabilities, we introduce an adaptive data selection method based on an Enhanced Time-Space Partitioning (ETSP) tree that assists with effective visibility culling, as well as multiresolution data selection. By traversing the tree, our data management algorithm can quickly identify the visible regions of data, and, for each region, adaptively choose the lowest resolution satisfying user-specified error tolerances. Only necessary data elements are accessed and sent to the visualization pipeline. To further address the issue of sharing large-scale data among geographically distributed collaborative teams, we have designed an infrastructure for integrating our data management technique with a distributed data storage system supported by Logistical Networking (LoN). Data sets at different resolutions are generated and uploaded to the LoN for wide-area access. We describe a parallel volume rendering system that verifies the effectiveness of our data storage, selection and access scheme.
2004
High Performance Threaded Data Streaming for Large Scale Simulations
Viraj Bhat, Scott Klasky, Scott Atchley, Micah Beck, Doug McCune, Manish Parashar, High Performance Threaded Data Streaming for Large Scale Simulations, in the proceedings of 5th IEEE/ACM International Workshop on Grid Computing, Pittsburgh, PA, Nov 2004. Download paper.
Abstract: We have developed a threaded parallel data streaming approach using Logistical Networking (LN) to transfer multi-terabyte simulation data from computers at NERSC to our local analysis/visualization cluster, as the simulation executes, with negligible overhead. Data transfer experiments show that this concurrent data transfer approach is more favorable compared with writing to local disk and later transferring this data to be post-processed. Our algorithms are network aware, and can stream data at up to 97 Mb/s on a 100 Mb/s link from CA to NJ during a live simulation, using less than 5% CPU overhead at NERSC. This method is the first step in setting up a pipeline for simulation workflow and data management.
Advancements in Text Mining Algorithms and Software (Book Chapter)
S.Y. Mironova, M.W. Berry, S. Atchley, M. Beck, T. Wu, L.E. Holzman, W.M. Pottenger, and D.J. Phelps. In "Data Mining: Next Generation Challenges and Future Directions", H. Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha (Eds.), AAAI/MIT Press, Menlo Park, CA, October 1, 2004. Available at AAAI Press and Amazon.
2003
Data Management for the IBP-based Data Grid
Ming Tang, Bu-Sung Lee, Yeo Chai Kiat, Micah Beck, James S. Plank, Scott Atchley. "Data Management for the IBP-based Data Grid", Second International Workshop on Grid and Cooperative Computing, Shanghai, China, December 2003.
Information Security on the Logistical Network: An End-to-End Approach
Micah Beck, James S. Plank, Jeremy Millar, Scott Atchley, Stephen Soltesz, Alessandro Bassi, Huadong Liu, Information Security on the Logistical Network: An End-to-End Approach, in the proceedings of IEEE Security in Storage Workshop 2003, Washington, DC, Oct 2003. Download paper.
Abstract: We describe the information security aspects of logistical networking. The security model adopted by logistical networking is an end-to-end model that provides tunable security levels while maintaining the scalability of the network as a whole.
Video IBPster
Scott Atchley, Stephen Soltesz, James S. Plank, Micah Beck, Video IBPster, Future Generation Computer Systems, Vol. 19, Issue 6, August, 2003, pp. 861-870. View online or download paper.
Abstract: At iGrid 2002, members of the Logistical Computing and Internetworking Lab (LoCI) had two goals. The first was to present an application, Video IBPster, built using the tools of the Network Storage Stack that delivers DVD-quality video without dropping frames, without losing data and without specialized multi-media streaming servers. The Video IBPster demo easily played MPEG-2 video files encoded at bit-rates up to 15 Megabits/second (Mbs). The second goal was to determine performance limits when using multiple, untuned TCP streams to retrieve a striped and replicated file across a long network. Since tools built using the Network Storage Stack allow striped downloads from multiple servers in parallel and since the client machines were all connected to Gigabit Ethernet (GigE), we hoped that we would observe a linear scale up of throughput when downloading from multiple servers. Although we did see increased throughput, it was not linear.
2002
Next Generation Content Distribution Using the Logistical Networking Testbed
Scott Atchley, Micah Beck, Hunter Hagewood, Jeremy Millar, Terry Moore, James S. Plank, and Stephen Soltesz, Next Generation Content Distribution Using the Logistical Networking Testbed, Technical Report CS-02-498, University of Tennessee Department of Computer Science, December 30, 2002. Download paper.
Abstract: We describe the difficulties of content distribution and building an ad hoc content distribution network using the Network Storage Stack and a publicly available testbed. The testbed uses the Network Storage Stack, developed at the University of Tennessee, which allows for flexible sharing and utilization of writable storage as a network resource. The ad hoc content distribution network improves resource utilization and user throughput without highly centralized control. The networking testbed provides over 10 TB of shared storage around the world that is available to all for research.
The Logistical Networking Testbed
Scott Atchley, Micah Beck, Jeremy Millar, Terry Moore, James S. Plank, and Stephen Soltesz, The Logistical Networking Testbed, Technical Report CS-02-496, University of Tennessee Department of Computer Science, December 13, 2002. Download paper.
Abstract: We describe the Logistical Networking Testbed, built using the Network Storage Stack, and sample applications that use the testbed. The testbed uses the Network Storage Stack, developed at the University of Tennessee, which allows for flexible sharing and utilization of writable storage as a network resource. The sample applications include content delivery, multimedia streaming, overlay routing, checkpointing, and an integrated example. This networking testbed provides over 10 TB of shared storage that is available to all for research.
Algorithms for High Performance, Wide-Area, Distributed File Downloads
J. S. Plank, S. Atchley, Y. Ding, M. Beck, Algorithms for High Performance, Wide-Area, Distributed File Downloads, Parallel Processing Letters, Volume 13, Number 2, June 2003, pages 207-224. Download paper.
Abstract: This paper explores three algorithms for high-performance downloads of wide-area, replicated data. The storage model is based on the Network Storage Stack, which allows for flexible sharing and utilization of writable storage as a network resource. The algorithms assume that data is replicated in various storage depots in the wide area, and the data must be delivered to the client either as a downloaded file or as a stream to be consumed by an application, such as a media player. The algorithms are threaded and adaptive, attempting to get good performance from nearby replicas, while still utilizing the faraway replicas. After defining the algorithms, we explore their performance downloading a 50 MB file replicated on six storage depots in the U.S., Europe and Asia, to two clients in different parts of the U.S. One algorithm, called progress-driven redundancy, exhibits excellent performance characteristics for both file and streaming downloads.
Improving Performance in the Network Storage Stack
Scott Atchley, James S. Plank, Zheng Yong, Ding Jin, Long Zhou, Stephen Soltesz, Micah Beck, and Terry Moore, Improving Performance in the Network Storage Stack, April 2002. Download paper.
Abstract: This paper addresses the issue of improving performance when using multi-threading in network storage applications. An abstraction of network storage called the Network Storage Stack is detailed along with the software layers (IBP, the L-Bone, exNode, and Logistical Tools) that have been developed to implement it. These layers have been implemented so that applications use single connections to access and utilize network storage. In this paper, we explore the benefits of adding multi-threading to the applications at various points. We perform experiments utilizing network storage both on local-area clusters and on the wide-area. As expected, multi-threading improves performance, but also leads to other challenges in implementation and performance tuning.
Fault-Tolerance in the Network Storage Stack
Scott Atchley, Stephen Soltesz, James S. Plank, Micah Beck, and Terry Moore, Fault-Tolerance in the Network Storage Stack, IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, Ft. Lauderdale, FL, April, 2002. Download paper.
Abstract: This paper addresses the issue of fault-tolerance in applications that make use of network storage. A network storage abstraction called the Network Storage Stack is presented, along with its constituent parts. In particular, a data type called the exNode is detailed, along with tools that allow it to be used to implement a wide-area, striped and replicated file. Using these tools, we evaluate the fault-tolerance of several exNode files, composed of variable-size blocks stored on 14 different machines at five locations throughout the United States. The results demonstrate that while failures in using network storage occur frequently, the tools built on the Network Storage Stack tolerate them gracefully, and with good performance.