

GitHub Code

Deep Learning, Computer Vision, and Image Processing

  • Uncertainty Estimation of the Graphical Neural Networks (UEG). I developed new metrics and methods to estimate the uncertainty and robustness of ANN systems, going beyond standard performance metrics such as accuracy.
    • [preprint — pending]
  • Object identification, tracking, and re-identification via Deep SORT for activity and event analysis. The methods use Matterport’s Mask R-CNN, feature matching, and unscented Kalman filtering.
Figure from Matterport’s Mask R-CNN repo.
  • Carbump. An activity analysis project that uses Active Learning Based Scalable Representations, published at ICASSP 2020. In particular, we (the team) devised pair-wise groupings for activity and event representation. The activities and events of interest include car collisions in synthetic (video game) and real (CCTV camera) videos.
Carbump framework for representing and analyzing car collisions via pair-wise reasoning (in ICASSP 2020).
  • Multimodal scene analysis. In this project I developed automated data collection and masking methods that use multimodal sensors and camera networks for object detection, scene analysis, and validation applications.
Multimodal registration and semantics (thesis work)
  • Smart Distributed Multimodal Sensor Networks. In this project I developed methods that use affordable and low-power sensors and systems (i.e., smart sensor networks) for robust and ubiquitous activity and event analysis and logging.
Ten sample activities observed from a single view and three modalities
(RGB/EO, depth, and thermography)

Data Sciences & Machine Learning

  • ABCA: Activity-Based Churn Analysis (2018): Profiles users’ behavior to estimate the probability of churn. The complete profile integrates data from sales representatives, customer support representatives, and usage of various cloud-based tools (frequency, rate, and range). The requirement is to identify actionable items (that may help prevent churn) at least three months before the event, since effective retention actions require three months. This project is an evolution of the Account Health project (below).
  • Account Health (2017). A cohort segmentation and analysis system for data correlation and account event estimations (churn, downgrade, renew, and upgrade). The project involves time-series analysis to attribute trends (i.e., sequences) and gains using large decision trees to determine attribute entropy and hierarchical relevance.
  • DARE (2017): Document Analysis via Response and Entropy (DARE) is a system based on n-gram entropy statistics to represent and classify text content of technical drawings. The methods are modular and the algorithms are highly adaptive. These can be used to represent and classify various document elements including titles, reference (i.e., revision) numbers, and authorship among other applications without language limitations. The prototype system is optimized using multiprocessing techniques.
  • PCAS (2017): Process Estimation via Contextual Analysis and Sentiment (PCAS) is a framework to process, represent, classify, and profile submittal data in construction management applications. The objective is to identify elements that correlate with process approval (approved or rejected) and timeliness (early, on-time, or late) outcomes.
  • Smokr (2016): A cloud-based web-footprint system to predict smoker likelihood. The system collects data from multiple sources, including membership, text content, and frequency. The content data is scraped, mapped, and quantized to effectively represent the various elements. Text is tokenized and processed to compute sentiment vectors and extract meaningful keywords. The model is currently used in production to fast-track life-insurance underwriting.
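To illustrate the kind of n-gram entropy statistic that DARE is described as using, here is a minimal sketch in Python. It assumes character-level n-grams and Shannon entropy in bits; the function names `char_ngrams` and `ngram_entropy` are hypothetical, and this is not the DARE implementation itself.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return the list of overlapping character n-grams in `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_entropy(text, n=2):
    """Shannon entropy (in bits) of the n-gram distribution of `text`.

    Assumption: character-level n-grams; the original project may have
    used a different tokenization.
    """
    grams = char_ngrams(text, n)
    if not grams:
        return 0.0
    counts = Counter(grams)
    total = len(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumptions, highly repetitive fields (for example a zero-padded revision number) would tend to score lower than free-form text such as a title, which is one way such a statistic could help separate document elements.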


Android

  • Pandroid: High-Definition Panorama Generator for Android Devices using Java and the Android SDK. Students: Marco Rodriguez-Suarez and Carlos Torres. Class: Mobile Systems by Professor Tim Cheng.
  • Viewfinder Alignment: “Using video to generate FAST panoramas”. A highly involved project that uses image-keypoint detection and matching, image registration, and Random Sample Consensus (RANSAC) techniques. Project prototyped in Python and deployed on Android. Student: Carlos Torres. Class: Adv. Topics in Image and Video Processing by Professor B. S. Manjunath.
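The RANSAC step mentioned for Viewfinder Alignment can be sketched as follows. This is a simplified illustration, not the project's code: it assumes the frames differ only by a 2-D translation (real panorama registration typically fits a homography), and the function name `ransac_translation` and its parameters are hypothetical.

```python
import random


def ransac_translation(matches, n_iters=200, tol=2.0, seed=0):
    """Estimate a 2-D translation (dx, dy) from point matches with RANSAC.

    `matches` is a non-empty list of ((x1, y1), (x2, y2)) keypoint pairs,
    some of which may be wrong (outliers).
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        # Minimal sample: one match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Consensus set: matches whose residual is within `tol` pixels.
        inliers = [
            ((ax, ay), (bx, by))
            for (ax, ay), (bx, by) in matches
            if ((bx - ax - dx) ** 2 + (by - ay - dy) ** 2) ** 0.5 <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier displacements.
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers
```

The same sample, score, and refit loop carries over to richer motion models; only the minimal sample size and the residual change.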


  • R3DT (2017): A distributed multimodal, multiview system of smart sensors for Interaction Monitoring and Validation. The nodes are controlled by Raspberry Pi 3s running Ubuntu with customized sensor drivers for RGB-depth and thermal device control and data collection. The devices communicate and are synchronized using in-house software written in Python and C++ with OpenCV, Pandas, HDF5, OpenNI/PCL, TCP/IP, and Scikit-Learn.
    • Update: Currently being expanded to run onboard CNN analysis via Movidius.
    • Repo [R3DT]
  • Eye-CU. A distributed system of smart cameras and sensor nodes to autonomously and robustly monitor hospital environments. The ARM devices (PandaBoard ES, BeagleBoard-xM, and Raspberry Pi 2) run Ubuntu, customized sensor drivers, and in-house software for Wi-Fi server-client communication and synchronization between nodes (TCP/IP), distributed data acquisition (OpenNI, OpenCV, and HDF5), and distributed data analysis (Scikit-Learn, CVX, and OpenCV). Implemented in C++ and Python.
    • Update: Now using Raspberry Pi 3 and Pandas. Faster and cleaner, with a threaded server and clients.
    • Repo [MICU]

Computer Vision & Image Processing

  • Blur Measure (2015). Quantification of image blur in video frames. Implemented Marziliano’s 2002 (Sobel) and Pech’s 2000 (Laplacian) methods. The methods were modified to work together over an overlapping grid and generate a blur signature. Prototyped in Python and implemented in C++.
  • Object Recognition (2010): Implementation and application of the SIFT descriptor and object classification with large vocabulary trees for efficient object detection. Eigenvalues for Eigenfaces: Face recognition class project, which required implementing Turk and Pentland’s eigenfaces algorithm. The code was used to recognize faces of classmates using the eigenvector contribution ratio. Both projects were implemented in MATLAB. Students: Rodrigo Perez-Odeh, Marco Rodriguez-Suarez, and Carlos Torres. Class: Intro to Computer Vision by Professor B. S. Manjunath.
  • Saliency Detection (2011): Comparing Local Features to Human Saliency and Gaze Patterns. Uses the SR-Research EyeLink to collect user data and evaluate human scans on random images as an exploratory user study. Implemented using SR-Experiment Builder and Python. Students: Rodrigo Perez-Odeh and Carlos Torres. Class: Adv. Topics in Computer Vision by Professor B. S. Manjunath.
  • Viewfinder Alignment (2012): Panoramas from Video Streams (see Android for more details)
  • Pandroid Panorama Generation (see Android for more details)
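The grid-based blur signature described for Blur Measure can be sketched roughly as below. This is an illustration under stated assumptions, not the original code: it covers only a Laplacian-based measure (variance of the Laplacian per cell) and omits the Sobel-based edge-width part; the names `laplacian`, `variance`, and `blur_signature`, and the `cell`/`step` defaults, are hypothetical. It uses plain Python lists to stay dependency-free, whereas a real implementation would likely use NumPy or OpenCV.

```python
def laplacian(img):
    """4-neighbour Laplacian of a 2-D list of gray values (interior pixels only)."""
    h, w = len(img), len(img[0])
    return [
        [
            img[r - 1][c] + img[r + 1][c] + img[r][c - 1] + img[r][c + 1]
            - 4 * img[r][c]
            for c in range(1, w - 1)
        ]
        for r in range(1, h - 1)
    ]


def variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def blur_signature(img, cell=8, step=4):
    """Variance of the Laplacian per grid cell, in row-major order.

    Choosing step < cell makes neighbouring cells overlap. Low values
    suggest a blurred (or textureless) region.
    """
    lap = laplacian(img)
    h, w = len(lap), len(lap[0])
    signature = []
    for r in range(0, h - cell + 1, step):
        for c in range(0, w - cell + 1, step):
            patch = [lap[i][j]
                     for i in range(r, r + cell)
                     for j in range(c, c + cell)]
            signature.append(variance(patch))
    return signature
```

Comparing signatures across consecutive frames is one way a per-frame vector like this could flag where and when a video loses focus.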


  • Singing FPGA. Class project using Xilinx Spartan 3 boards and PicoBlaze. The development board’s I/O ports were used to read commands from the user (e.g., select a song from a limited playlist). Based on the user selection, the system would play a song using a dictionary of notes, frequencies, and pitches. Implemented in Verilog 2001. Student: Carlos Torres. Class: Digital Design with FPGAs by Eric Crabil (Sr. Designer at Xilinx, Inc.).

Game Theory

  • Potential Games for Consensus (2012): Probabilistic Approach to Consensus Using Game Theory’s Multi-Player Potential Games. Implemented in MATLAB. Students: Rush Patel and Carlos Torres. Class: Game Theory by Professor Joao P. Hespanha.
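For readers unfamiliar with potential games, the sketch below shows consensus emerging from best-response dynamics. It is an assumption-laden illustration, not the project's method: it uses deterministic best responses rather than the probabilistic approach named above, each agent's utility is simply the number of neighbours sharing its action, and the names `potential` and `best_response_consensus` are hypothetical.

```python
import random


def potential(actions, edges):
    """Number of edges whose endpoints agree (the game's potential function)."""
    return sum(actions[i] == actions[j] for i, j in edges)


def best_response_consensus(n_agents, edges, choices, max_rounds=1000, seed=0):
    """Run best-response updates until no agent can strictly improve."""
    rng = random.Random(seed)
    neighbours = {i: set() for i in range(n_agents)}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    actions = [rng.choice(choices) for _ in range(n_agents)]
    for _ in range(max_rounds):
        changed = False
        for i in range(n_agents):
            # Utility of agent i: how many neighbours share the action `a`.
            def utility(a):
                return sum(actions[j] == a for j in neighbours[i])

            best = max(choices, key=utility)
            if utility(best) > utility(actions[i]):
                actions[i] = best  # strict improvement also raises the potential
                changed = True
        if not changed:
            break  # no agent can improve: a pure Nash equilibrium
    return actions
```

Because every strict improvement by one agent raises the potential by the same amount, the updates must terminate. On sparse graphs the equilibrium reached may be only a local agreement rather than full consensus, which is one motivation for randomized update rules.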


  • Rapidly-Exploring Random Trees (RRTs) (2011): Robotics Motion and Navigation using random seeding to detect a path and avoid collisions. Student: Carlos Torres. Class: Intro to Robot Kinematics by Professor Francesco Bullo.
  • Autonomously Placing Objects on Flat Surfaces using Force and Torque Sensors (2007): Undergraduate research project for the Healthcare Robotics Laboratory’s custom-built El-E robot. Used the sensors on El-E’s arm and hand/gripper to place objects on flat surfaces by (1) detecting a collision and (2) measuring when the object was stable (necessary for object release). Implemented in Python. Student: Carlos Torres. Summer Undergraduate Research Experience (SURE) program at the Healthcare Robotics Lab under the supervision of Professor Charles C. Kemp.
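The random-seeding idea behind the RRT project listed above can be sketched in a few lines. This is a minimal 2-D illustration under assumptions of my own: a point robot, circular obstacles, a small goal bias, and collision checks on nodes only (not on the edges between them). The function name `rrt` and all parameters are hypothetical.

```python
import math
import random


def rrt(start, goal, obstacles, bounds=(0.0, 10.0), step=0.5,
        goal_tol=0.5, max_iters=5000, seed=0):
    """Grow a tree from `start` by random sampling; return a path or None.

    `obstacles` is a list of (cx, cy, radius) circles.
    """
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}

    def collides(p):
        return any(math.dist(p, (cx, cy)) <= r for cx, cy, r in obstacles)

    for _ in range(max_iters):
        # Sample a random point, with a small bias toward the goal.
        if rng.random() < 0.05:
            sample = goal
        else:
            sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        # Extend the nearest tree node one step toward the sample.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        scale = min(step, d) / d
        new = (near[0] + (sample[0] - near[0]) * scale,
               near[1] + (sample[1] - near[1]) * scale)
        if collides(new):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol:
            # Walk parent links back to the root to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```

A call such as `rrt((1.0, 1.0), (9.0, 9.0), [(5.0, 5.0, 1.5)])` returns a list of waypoints from the start to within `goal_tol` of the goal, or `None` if the iteration budget runs out.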

Speech Processing & Recognition

  • Gaussian Mixture Models or GMMs (2012): “Who is talking?”. Speaker identification project based on probabilistic methods and human speech processing techniques. Implemented in MATLAB. Students: Victor Fragoso, Rahul Kidambi, and Carlos Torres. Class: Digital Speech Processing by Professor L. Rabiner.
  • Hidden Markov Models (HMMs, 2010): “Uncovering the Mystery”. This project used fundamental speech processing methods (the forward, backward, and Viterbi algorithms) for linear and statistical prediction of speech. Implemented in MATLAB. Student: Carlos Torres. Class: Speech Recognition by Professor L. Rabiner.
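Of the HMM algorithms named above, Viterbi decoding is the easiest to show compactly. The sketch below is a generic textbook version in Python (the original project was in MATLAB); the function name `viterbi` and the dictionary-based model layout are assumptions made for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence.

    start_p[s], trans_p[r][s], and emit_p[s][o] are plain probabilities.
    """
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the state that precedes s on that path)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        row = {}
        for s in states:
            p, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            row[s] = (p * emit_p[s][o], arg)
        best.append(row)
    # Backtrack from the most probable final state.
    prob, last = max((best[-1][s][0], s) for s in states)
    path = [last]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    return prob, path[::-1]
```

For long observation sequences, such as speech frames, the products underflow, so practical implementations usually work with log-probabilities instead.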
