Compression and Profiling of Machine Learning Models in Biomedical Embedded Sensing

Sebastian Cruz Romero, Capicú Technologies LLC

0000-0003-3892-2273

ACCESS Allocation Request CIS250603

Abstract: The project focuses on reducing the computational requirements of machine learning models for real-time biomedical inference on low-power embedded devices. We will explore model compression techniques (e.g., quantization-aware training, structured pruning, low-rank factorization) to train and deploy neural networks while preserving task-specific performance (e.g., anomaly detection, feature extraction) under strict memory and latency constraints. ACCESS resources will be used for high-throughput model training, hyperparameter tuning, and resource profiling using CPU and GPU nodes on SDSC Expanse. We will also leverage project storage for managing clinical waveform datasets and generated model artifacts. Key software tools include PyTorch, TensorRT, TVM, TensorFlow Lite Micro, and custom profiling utilities for embedded deployment. The goal is to benchmark model configurations across performance metrics and produce a suite of deployable models for resource-constrained biomedical applications.
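As a rough illustration of the quantization idea named in the abstract, the sketch below applies symmetric per-tensor int8 quantization to a handful of weights in plain Python. It is not the project's method: the function names, the example weights, and the per-tensor scaling scheme are all assumptions chosen for brevity. Quantization-aware training would simulate this same rounding step during training so that the network learns to tolerate it.

```python
# Minimal sketch (assumed scheme): symmetric per-tensor int8 quantization.
# All names and values here are hypothetical and for illustration only.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] so that w ~= scale * q."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [scale * v for v in q]


weights = [0.82, -1.27, 0.05, 0.0, 0.64, -0.31]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The worst-case rounding error is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("codes:", q)
print("scale:", scale)
print("max abs error:", max_error)
```

Storing each weight as one int8 code instead of a 32-bit float cuts weight memory by roughly a factor of four, at the cost of a per-weight error of at most half the scale. That is the kind of trade-off the memory and latency constraints in the abstract imply.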

Allocations:

2025 Indiana Jetstream2 CPU 62,647.0 SUs
2025 Indiana Jetstream2 GPU 40,282.0 SUs
2025 MATCH Services Yes
The estimated value of these awarded resources is $9,622.70. The allocation of these resources represents a considerable investment by the NSF in advanced computing infrastructure for the U.S. The dollar value of the allocation is estimated from the NSF awards supporting the allocated resources.
There are no other allocations for this project.

Other Titles:

There are no prior titles for this project.