Publications

Below are my publications and preprints. You can also find them on my Google Scholar profile.

WearableMil: An End-to-End Framework for Military Activity Recognition and Performance Monitoring

Accepted for publication in the 2025 13th International Conference on Healthcare Informatics (ICHI).

Musculoskeletal injuries during military training significantly impact readiness, making prevention through activity monitoring crucial. While Human Activity Recognition (HAR) using wearable devices offers promising solutions, it faces challenges in processing continuous data streams and recognizing diverse activities without predefined sessions. This paper introduces an end-to-end framework for preprocessing, analyzing, and recognizing activities from wearable data in military training contexts. Using data from 135 soldiers wearing Garmin 55 smartwatches over six months, we develop a hierarchical deep learning approach that achieves 93.8% accuracy in temporal splits and 83.8% in cross-user evaluation. Our framework addresses missing data through physiologically informed methods, reducing unknown sleep states from 40.38% to 3.66%. We demonstrate that while longer time windows (45-60 minutes) improve basic state classification, they present trade-offs in detecting fine-grained activities. Additionally, we introduce an intuitive visualization system that enables real-time comparison of individual performance against group metrics across multiple physiological indicators. This approach to activity recognition and performance monitoring provides military trainers with actionable insights for optimizing training programs and preventing injuries.

Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis

Under Review

QUIC, an increasingly adopted transport protocol, addresses limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. However, these enhancements also introduce challenges for network operators in monitoring and analyzing web traffic, especially due to QUIC’s encryption. Existing datasets are inadequate: they are often outdated, lack diversity, anonymize critical information, or exclude essential features like SSL keys, which limits comprehensive research and development in this area. We introduce VisQUIC, a publicly available dataset of over 100,000 labeled QUIC traces with corresponding SSL keys, collected from more than 40,000 websites over four months. By generating visual representations of the traces, we facilitate advanced machine learning (ML) applications and in-depth analysis of encrypted QUIC traffic. To demonstrate the dataset’s potential, we estimate the number of HTTP/3 request-response pairs in a QUIC connection using only encrypted traffic, achieving up to 92% accuracy. This estimation provides insights into server behavior, client-server interactions, and connection load, all of which are crucial for tasks like load balancing and intrusion detection. Our dataset enables comprehensive studies on QUIC and HTTP/3 protocols and supports the development of tools for encrypted traffic analysis.

Beyond the Alphabet: Deep Signal Embedding for Enhanced DNA Clustering

Under Review

The exponential growth of digital data has fueled interest in DNA as a storage medium due to its unmatched density and durability. However, clustering the billions of reads required for error correction and data reconstruction remains a major bottleneck, as traditional edit-distance-based methods are both computationally expensive and prone to data loss. This paper introduces a novel signal-model that processes raw Nanopore signals, bypassing the error-prone basecalling step. By directly leveraging analog signal information, the signal-model reduces computation time by up to three orders of magnitude compared to edit-distance approaches, while delivering superior accuracy. It also outperforms DNA sequence embedding methods in both accuracy and efficiency. Furthermore, our experiments show that the signal-model achieves higher clustering accuracy than existing strand-based algorithms, saving days of computation without compromising quality. Overall, this work represents a significant breakthrough in DNA data storage, highlighting how signal-based analysis can drastically improve both accuracy and scalability.

Data-Driven Cellular Network Selector for Vehicle Teleoperations

Published in the 2024 15th International Conference on Network of the Future (NoF).

The effectiveness of video-based teleoperation systems is heavily influenced by the quality of the cellular network and, in particular, its packet loss rate and latency. To optimize these parameters, an autonomous vehicle can be connected to multiple cellular networks and determine in real time over which cellular network each video packet will be transmitted. We present an algorithm, called Active Network Selector (ANS), which uses a time series machine learning approach for solving this problem. We compare ANS to a baseline algorithm, which is used today in commercial systems, and show that ANS performs much better, with respect to both packet loss and packet latency.

Download Paper

Using Deep Reinforcement Learning for mmWave Real-Time Scheduling

Published in the 2023 14th International Conference on Network of the Future (NoF).

We study the problem of real-time scheduling in a multi-hop millimeter-wave (mmWave) mesh. We develop a model-free deep reinforcement learning algorithm called Adaptive Activator RL (AARL), which determines the subset of mmWave links that should be activated during each time slot and the power level for each link. The most important property of AARL is its ability to make scheduling decisions within the strict time frame constraints of typical 5G mmWave networks.

Download Paper

Robusta

Available upon request.

This paper presents Robusta, a hybrid recoverable cache leveraging PMem and DRAM to get the best of the two: DRAM-like low latency for very frequent items, reduced tail latency due to PMem’s large capacity, and warm start on failure recovery. Robusta is implemented as a wrapper around Caffeine, a state-of-the-art Java cache that is integrated into a range of production systems, including HBase, Druid, Solr, and Cassandra.