Accepted Papers
We received 19 high-quality research and application papers; six papers (32%) were accepted as full papers and six papers (32%) as poster papers. The abstracts of the accepted full papers and poster papers are listed below.
FULL PAPERS:
- Title: Fusion of Vision Inertial data for Automatic Geo-referencing
Authors: D.I.B. Randeniya, S. Sarkar and M. Gunaratne (University of South Florida, Tampa, FL, USA)
Abstract: Intermittent loss of the GPS signal is a common problem encountered in intelligent land navigation based on GPS integrated inertial systems. This issue emphasizes the need for an alternative technology that would ensure smooth and reliable inertial navigation during GPS outages. This paper presents the results of an effort where data from vision and inertial sensors are integrated to achieve the above goal. However, for such integration one has to first obtain the necessary navigation parameters from the available sensors. Information from a sequence of images captured by a monocular camera attached to a survey vehicle at a maximum frequency of 3 frames per second was used in upgrading the inertial system installed in the same vehicle for its inherent error accumulation. Specifically, the rotations and translations estimated from point correspondences tracked through a sequence of images were used in the integration. Also a pre-filter is utilized to smoothen out the noise associated with the vision sensor (camera) measurements. Finally, the position locations based on the vision sensor are integrated with the inertial system in a decentralized format using a Kalman filter. The vision/inertial integrated position estimates are successfully compared with those from inertial/GPS system output. This successful comparison demonstrates that vision can be used successfully to supplement the inertial measurements during potential GPS outages.
- Title: Electricity Load Forecast using Data Streams Techniques
Authors: Joao Gama and Pedro Pereira Rodrigues (Univ. do Porto, Porto, Portugal)
Abstract: Sensors distributed all around electrical-power distribution networks produce streams of data at high-speed. From a data mining perspective, this sensor network problem is characterized by a large number of variables (sensors), producing a continuous flow of data, in a dynamic non-stationary environment. Companies make decisions to buy or sell energy based on load profiles and forecast. In this work we analyze the most relevant data mining problems and issues: continuously learning clusters and predictive models, model adaptation in large domains, and change detection and adaptation. We propose an architecture based on an online clustering algorithm where each cluster (group of sensors with high correlation) contains a neural-network based predictive model. The goal is to maintain in real-time a clustering model and a predictive model able to incorporate new information at the speed data arrives, detecting changes and adapting the decision models to the most recent information. We present results illustrating the advantages of the proposed architecture, on several temporal horizons, and its competitiveness with another predictive strategy.
- Title: Requirements for Clustering Streaming Sensors
Authors: Pedro Pereira Rodrigues, Joao Gama and Luis Lopes (University of Porto, Porto, Portugal)
Abstract: Most of the work in incremental clustering of data streams has been widely concentrated on example clustering rather than variable clustering. The data stream paradigm imposes that variable clustering should be also addressed as an online procedure, not only due to the dynamics inherent to streams but also because the relations between them can change over time. Moreover, streams may be produced in a distributed environment. The task that emerges from this setting is better known as clustering of streaming sensors, since the data is often produced in wide sensor networks. We overview previous attempts to address this problem and clarify where, in our perspective, these attempts may have failed to deal with it. We try to summarize the requirements that systems addressing this task should observe and its implications for future research. The main goal of this work is to promote discussion on the definition of clear requirements for an emerging task in machine learning.
- Title: Anomaly Detection in Transportation Corridors using Manifold Embedding
Authors: Amrudin Agovic, Arindam Banerjee (University of Minnesota, Twin Cities, MN, USA), Auroop R. Ganguly and Vladimir Protopopescu (Oak Ridge National Laboratory, Oak Ridge, TN, USA)
Abstract: The formation of secure transportation corridors, where cargoes and shipments from points of entry can be dispatched safely to highly sensitive and secure locations, is a high national priority. One of the key tasks of the program is the detection of anomalous cargo based on sensor readings in truck weigh stations. Due to the high variability, dimensionality, and/or noise content of sensor data in transportation corridors, appropriate feature representation is crucial to the success of anomaly detection methods in this domain. In this paper, we empirically investigate the usefulness of manifold embedding methods for feature representation in anomaly detection problems in the domain of transportation corridors. We focus on both linear methods, such as multi-dimensional scaling (MDS), as well as nonlinear methods, such as locally linear embedding (LLE) and isometric feature mapping (ISOMAP). Our study indicates that such embedding methods provide a natural mechanism for keeping anomalous points away from the dense/normal regions in the embedding of the data. We illustrate the efficacy of manifold embedding methods for anomaly detection through experiments on simulated data as well as real truck data from weigh stations.
- Title: Spatio-Temporal Analysis on FEMA Situation Updates with Automated Information Extraction
Authors: Chi-Chun Pan, Prasenjit Mitra (Penn State University, University Park, PA, USA) and Auroop R. Ganguly (Oak Ridge National Laboratory, Oak Ridge, TN, USA)
Abstract: With the advent of the World-Wide-Web, there is an over-abundance of textual information. Information present in digital documents can be utilized better if it can be extracted automatically and scalably from text and visualized using visualization tools. In this paper, we present an automated information extraction and visualization tool for human sensor data. Our system consists of three main components: FactXtractor, GeoTagger, and FEMARepViz. Named entities and entity relations are extracted using FactXtractor. We have proposed a novel stripped dependency tree kernel for a Support Vector Machine (SVM) based classifier to identify semantic relationships among entities. GeoTagger disambiguates location entities. Built on top of the first two system components, FemaRepViz is an application that segments text documents, identifies the topic of the segments, extracts location entities, disambiguates them, and visualizes the extracted information on Google Earth or Google Map. Our empirical evaluation shows that the system achieves reasonable accuracy.
- Title: TAG: A Framework for the Discovery of Spatio-Temporal Patterns in Sensor Data
Authors: Betsy George, James M. Kang, and Shashi Shekhar (University of Minnesota, Minneapolis, MN, USA)
Abstract: With rapidly growing use of sensor data in many application domains such as environmental science, there is a potential to use the data collected by the sensors in trend analysis and the discovery of interesting spatio-temporal patterns. With sensors reporting data at a very high frequency, there is a need for a model that is memory efficient, simple and expressive. In addition, the model must support the design of efficient algorithms for data analysis. Since the data collected from sensors are spatio-temporal in nature, the model must support the time dependence of the data. Though spatio-temporal networks can be modeled using time expanded graphs, this approach replicates the entire graph across time instants, resulting in high storage overhead and computationally expensive algorithms. In this paper, we propose to use time aggregated graphs (TAG) to efficiently model sensor data, which allow the properties of edges and nodes to be modeled as a time series. Also, we present several case studies where we propose methods to find interesting patterns (e.g., emerging hotspots) in sensor data.
POSTER PAPERS:
- Title: Mixtures of Probabilistic Principal Component Analyzers for Anomaly Detection
Authors: Yi Fang and Auroop R. Ganguly (Oak Ridge National Laboratory, Oak Ridge, TN, USA)
Abstract: Anomaly detection tools have been increasingly used in recent years to generate predictive insights on rare events. The typical challenges encountered in such applications include a large number of data dimensions and absence of labeled data. An anomaly detection strategy for these scenarios is dimensionality reduction followed by clustering in the reduced space, with the degree of anomaly of an event or observation quantified by statistical distance from the clusters. However, most research efforts so far are focused on single abrupt anomalies, while the correlation between observations is completely ignored. In this paper, we address the problem of detection of both abrupt and sustained anomalies with high dimensions. The task becomes more challenging than only detecting abrupt outliers because of the gradual and indiscriminate changes in sustained anomalies. We utilize a mixture model of probabilistic principal component analyzers to quantify each observation by probabilistic measures. A statistical process control method is then used to monitor both abrupt and gradual changes. On the other hand, the mixture model can be regarded as a trade-off strategy between linear and nonlinear dimensionality reductions in terms of computational efficiency. This compromise is particularly important in real-time deployment. The proposed method is evaluated on simulated and benchmark data, as well as on data from wide-area sensors at a truck weigh station test-bed.
- Title: Correlation-based Feature Partitioning for Rare Event Detection in Wireless Sensor Networks
Authors: Sitaram Asur and Srinivasan Parthasarathy (The Ohio State University, Columbus, OH, USA)
Abstract: Wireless sensor networks are becoming ubiquitous in their use in security, defense, monitoring and tracking applications. Intrusion detection is an important problem for wireless sensor networks in defense and security applications. Since intrusions are rare, they need to be handled efficiently. This involves: 1) continuous monitoring for threats and intrusions, 2) rapid detection, and possibly even classification and tracking, of intrusions, and 3) rapid decision making. Furthermore, sensor networks are burdened by limited battery power, which creates the need for energy-efficient classification models to address this issue.
Our goal in this work is to build local classification models in clustered sensor networks to perform efficient detection of rare events, while also improving the lifetime of the network by reducing energy losses. We propose a correlation-based scheme to partition the features observed by the sensor nodes into disjoint, mutually uncorrelated feature subsets. An ensemble of local classifiers is then trained on these subsets. We implement our model on a cluster-based sensor network architecture (LEACH). To reduce energy losses, we provide an energy-efficient routing scheme designed for the above model. Our experimental results on real and synthetic data show that the proposed technique provides benefits both in terms of accuracy of detection and energy savings of the network.
- Title: Unsupervised and supervised compression with principal component analysis in wireless sensor networks
Authors: Yann-Ael Le Borgne and Gianluca Bontempi (Universite Libre de Bruxelles, Brussels, Belgium)
Abstract: This paper shows that Principal Component Analysis, a compression method widely used in statistical analysis and image processing, can be efficiently implemented in a network of wireless sensors. The proposed scheme proves to be particularly suitable to sensor networks, as it reduces the network load while retaining a maximum amount of variance from sensor measurements. We present two operating modes, unsupervised and supervised, which make it possible (i) to extract a maximum of variance while keeping the network load bounded, and (ii) to reduce the network load while keeping the approximation error bounded, respectively. We assess the efficiency of the proposed approach in a realistic wireless sensor network deployment for temperature monitoring.
- Title: Trade-offs in the Use of Bayesian Filtering for Sensor Fusion
Authors: Anatole Gershman (Carnegie Mellon University, Pittsburgh, PA, USA), Rayid Ghani, Gang Wei (Accenture Technology Labs) and Damian Roqueiro (University of Illinois, Chicago, IL, USA)
Abstract: Robust identification and localization of moving objects via sensor networks depends on the quality of sensors, sensor coverage and the fusion of the information obtained from the sensors. Sensor fusion algorithms use domain and context knowledge to calculate the most plausible interpretation of all available data. Little research has been done on the trade-offs between these factors. This paper presents an empirical study of these trade-offs for a classic sensor fusion algorithm based on Bayesian forward and backward propagation. To test the robustness of this algorithm under various conditions, we created “virtual sensors” whose performance characteristics were based on the data gathered by tracking 31 people in 112 locations using a set of 34 cameras and 70 badge readers. Our results show that even a relatively simple domain model enables robust performance of the algorithm even in the presence of poor sensors, thus providing a promising alternative to the expensive practice of installing better sensors and calibration procedures in order to improve surveillance systems.
- Title: Prediction of Missing Events in Sensor Data Streams Using Kalman Filters
Authors: Nithya N. Vijayakumar and Beth Plale (Indiana University, Bloomington, IN, USA)
Abstract: Sensors and instruments are an important source of real time data. However, sensor networks and instruments and their delivery systems can fail due to intrusion attacks, node failures, link failures, or problems in the measuring instruments. Missing data can cause prediction inaccuracies or problems in the continuous events processing process.
Estimation techniques can approximate missing data in a stream, thus enabling a continuous flow of data when the stream goes down temporarily. We propose Kalman filters for predicting missing events in sensor streams, specifically, with the dynamic linear model. Our study compares the Kalman filter based approach to reservoir sampling and histogram based approaches. We show that Kalman filtering is promising and has the least root mean squared error for most cases. We introduce a novel solution for inserting this approximation technique into an SQL-based events processing system as a new query operator. Our experimental analysis shows that the prediction operator has low overhead and is effective in estimating missing events in weather data streams, specifically, the METAR streams.
- Title: Mining Sensor Data in Smart Environment for Temporal Activity Prediction
Authors: Vikramaditya Jakkula and Diane J. Cook (Washington State University, Pullman, WA, USA)
Abstract: Technological enhancements aid development and advanced research in smart homes and intelligent environments. The temporal nature of data collected in a smart environment provides us with a better understanding of patterns over time. Prediction using temporal relations is a complex and challenging task. To solve this problem, we suggest a solution using a probability-based model on temporal relations. Temporal pattern discovery based on modified Allen’s temporal relations has helped discover interesting patterns and relations in smart home datasets. This paper describes a method of discovering temporal relations in smart home datasets and applying them to perform activity prediction on the frequently occurring events. We also include experimental results obtained on real and synthetic datasets.