|
YOLOv3 for Object Detection
TensorFlow implementation of YOLOv3 object detection for both inference and training. A ready-to-use pre-trained model converted from the official implementation is provided (80 object classes, trained on the COCO dataset).
[details][code]
- Provided instructions for converting a trained model stored in the official .weights file to a .npy file.
- Provided customized training blocks, including bounding box clustering, data augmentation and multi-scale training.
- Trained on the PASCAL VOC dataset to detect 20 object classes in natural images.
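The bounding box clustering step listed above can be illustrated with a minimal sketch. It assumes the common YOLO practice of running k-means on (width, height) pairs with distance 1 - IoU; the function names, the deterministic initialization and the sample boxes are illustrative assumptions, not taken from the repository.

```python
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height); position is ignored for anchor clustering


def iou_wh(a: Box, b: Box) -> float:
    """IoU of two boxes aligned at the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes: List[Box], k: int, iters: int = 100) -> List[Box]:
    """Cluster box sizes with distance d = 1 - IoU (hypothetical helper)."""
    # Deterministic init (an assumption): pick k boxes spread over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    centers = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iters):
        groups: List[List[Box]] = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda j: iou_wh(box, centers[j]))
            groups[best].append(box)
        new_centers = []
        for j, group in enumerate(groups):
            if group:
                mean_w = sum(b[0] for b in group) / len(group)
                mean_h = sum(b[1] for b in group) / len(group)
                new_centers.append((mean_w, mean_h))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


boxes = [(10, 10), (12, 11), (11, 12), (50, 60), (55, 58), (52, 62)]
anchors = kmeans_anchors(boxes, k=2)
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which is the usual motivation for this choice.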
|
|
Person Re-Identification with Triplet Loss
TensorFlow implementation of person re-identification using triplet loss with a batch-hard mining training strategy. Re-ranking was used during person image retrieval.
[code]
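As a rough illustration of the batch-hard mining mentioned above: for each anchor in a batch, the loss uses its farthest same-identity sample and its closest different-identity sample. The sketch below is plain Python rather than TensorFlow, and the hinge form, the margin value and the function names are assumptions for illustration only.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(
    embeddings: List[Sequence[float]], labels: List[int], margin: float = 0.2
) -> float:
    """Mean hinge loss over anchors, using the hardest positive and negative."""
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [euclidean(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        neg = [euclidean(anchor, e)
               for e, l in zip(embeddings, labels) if l != label]
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        # Hardest positive = farthest same-label sample,
        # hardest negative = closest different-label sample.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


easy = batch_hard_triplet_loss([(0, 0), (1, 0), (5, 0), (6, 0)], [0, 0, 1, 1], margin=0.5)
hard = batch_hard_triplet_loss([(0, 0), (3, 0), (1, 0), (4, 0)], [0, 0, 1, 1], margin=0.5)
```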
|
|
Implementations of Generative Adversarial Networks (GANs)
Implementations of various GAN models for comparing and testing the training behavior of different GANs. Applied to the MNIST dataset and the CelebA human face dataset.
[details][code]
- Currently, DCGAN, LSGAN and InfoGAN are implemented and successfully trained on both datasets.
- Generated face images with controlled context, such as emotion, hairstyle and azimuth, in an unsupervised manner by using InfoGAN.
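To make the difference between the listed models a little more concrete, here is a small sketch of the least-squares objective that distinguishes LSGAN from the cross-entropy loss of DCGAN. It assumes the common target coding (real = 1, fake = 0) and operates on plain lists of discriminator outputs; the function names are hypothetical and not taken from the repository.

```python
from typing import Sequence


def lsgan_discriminator_loss(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """Push D(x) toward 1 for real samples and toward 0 for generated ones."""
    real_term = sum((d - 1.0) ** 2 for d in d_real) / len(d_real)
    fake_term = sum(d ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(d_fake: Sequence[float]) -> float:
    """Push D(G(z)) toward 1, the label the generator wants D to assign."""
    return 0.5 * sum((d - 1.0) ** 2 for d in d_fake) / len(d_fake)


perfect_d = lsgan_discriminator_loss([1.0, 1.0], [0.0, 0.0])
fooled_g = lsgan_generator_loss([1.0])
rejected_g = lsgan_generator_loss([0.0, 0.0])
```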
|
|
Adversarial Autoencoders for Variational Inference and Semi-Supervised Learning
Provided an implementation of adversarial autoencoders (AAE), which use the GAN framework as a variational inference algorithm. Applied to semi-supervised learning and to disentangling the style and content of images.
[code]
|
|
Image to Image Translation with Conditional GANs
Reconstructed building facade photos from label maps and generated shoes photos from sketches using pix2pix conditional GANs.
[code]
|
|
Visualization CNN for Interpretation of Trained Models
Provided interpretation of trained CNN models by visualizing the learned features and the image regions the models attend to.
[details][code]
|
|
Image Classification using Recurrent Attention Model
Implementation of the recurrent visual attention model for image classification. The model reduces computational cost by focusing only on a sequence of small image regions, which are selected by an RNN.
[details][code]
- Achieved 97.82% accuracy on 60 x 60 translated MNIST.
- Provided interpretation of the classification results by visualizing the attention regions during inference.
|
|
VGG and GoogleNet for Image Classification and Feature Extraction
Implemented VGG and GoogleNet (Inceptionv1) image classification for training, inference and feature extraction.
[details][VGG code] [GoogleNet code]
- Modified VGG into a fully convolutional network by using a global average pooling layer, so that it accepts images of arbitrary size during inference.
- Designed a modified Inception network for training on low-resolution datasets from scratch (achieved 93.64% accuracy on the CIFAR-10 test set).
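The reason global average pooling lets a fully convolutional network accept arbitrary input sizes can be shown with a small sketch: whatever the spatial size of the final feature map, pooling yields one value per channel. The nested-list representation and function name below are illustrative assumptions, not the repository's TensorFlow code.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # height x width x channels


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Average each channel over all spatial positions."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [t / (height * width) for t in totals]


small = [[[1.0, 2.0]]]                                  # 1 x 1 x 2
large = [[[1.0, 2.0] for _ in range(3)] for _ in range(2)]  # 2 x 3 x 2
pooled_small = global_average_pool(small)
pooled_large = global_average_pool(large)
```

Both inputs produce a vector of length 2, so the classifier that follows sees a fixed-size input regardless of image size.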
|
|
Image and Video Style Transfer using Fast Style Transfer and Neural Style
Implemented fast style transfer to render images and videos in a specific artistic style in near real time, and implemented neural style transfer for image style transfer.
[fast style code] [neural style code]
|
|
A Visual System for Autonomous Foraminifera Identification
Foraminifera are single-celled organisms with shells which are useful in petroleum exploration, biostratigraphy, paleoecology and paleobiogeography.
We developed an automated system for identifying foraminifera species to reduce the human effort of manually picking thousands of samples from ocean sediments.
We also created a foraminifera image dataset and proposed novel robust edge detection algorithms on this dataset.
[details] [project page]
- Led the creation of a foraminifera image dataset containing 1437 samples and 457 manually segmented samples.
- Created synthetic images refined by GANs for data augmentation.
- Developed a coarse-to-fine fully convolutional edge detection network which iteratively applies edge detection modules on predicted edge maps.
- Achieved 0.91 edge F1 score on the foraminifera dataset for finding vague edges between foraminifera chambers with similar texture.
- Designed a topology-based metric to measure the structural difference between two edge maps.
- Developed a training process utilizing the topological metric to train an edge detection network which focuses on preserving topological structures of edges.
- Improved edge F1 score from 0.91 to 0.93 and segmentation IoU from 0.80 to 0.82.
- Built a transfer learning process for identification of six foraminifera species using features extracted from pre-trained VGG, Inception and ResNet.
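For readers unfamiliar with the two scores reported above, the following sketch computes F1 and IoU on binary masks. It assumes exact pixel-wise matching; edge benchmarks often allow a small localization tolerance, and the evaluation protocol actually used in the project is not stated here. Function names and the toy masks are illustrative.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary image, 1 = edge (or region) pixel


def _counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def f1_score(pred: Mask, truth: Mask) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, fn = _counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = _counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0


pred = [[1, 1, 0], [0, 0, 0]]
truth = [[1, 0, 0], [1, 0, 0]]
f1_value = f1_score(pred, truth)
iou_value = iou(pred, truth)
```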
|
|
Robust Traffic Scenes Obstacle Detection and Image Segmentation
We proposed a persistent homology based image segmentation framework which is robust to image quality and parameter selection. The application areas for this framework include autonomous driving systems and segmentation of natural and biological images.
[details] [presentation]
- Proposed a persistent homology based image segmentation framework which is robust to image quality and parameter selection.
- Designed a pipeline for traffic scene obstacle detection based on this framework by extracting persistence regions in occupancy grids computed from disparity maps.
- Demonstrated through experiments on the KITTI dataset that the traffic scene obstacle detection pipeline is robust to input image quality.
- Designed a consensus-based image segmentation method based on this framework for robustly extracting consensus information from a set of segmentation results generated by different segmentation algorithms.
- Achieved better performance over a wide range of parameters than any input algorithm with its best parameter setting on the Berkeley Segmentation Database.
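The idea of extracting persistent regions instead of picking a single threshold can be sketched in one dimension. The example below computes 0-dimensional persistence of the superlevel sets of a 1-D signal with a union-find structure: each local maximum is born at its own height and dies when its component merges into one with a higher peak. This is a simplified illustration under stated assumptions (1-D input, hypothetical function name); the framework above operates on 2-D occupancy grids and probability maps.

```python
from typing import Dict, List, Optional


def peak_persistence(values: List[float]) -> Dict[int, float]:
    """Map each surviving local maximum index to its persistence."""
    n = len(values)
    order = sorted(range(n), key=lambda i: -values[i])  # sweep threshold downward
    parent: List[Optional[int]] = [None] * n            # None = not yet above threshold
    peak: Dict[int, int] = {}                           # component root -> index of its peak
    result: Dict[int, float] = {}

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in order:
        parent[i] = i
        peak[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # The component with the lower peak dies at the current level.
                if values[peak[ri]] > values[peak[rj]]:
                    hi, lo = ri, rj
                else:
                    hi, lo = rj, ri
                if peak[lo] != i:  # skip the point that was just added (zero persistence)
                    result[peak[lo]] = values[peak[lo]] - values[i]
                parent[lo] = hi
    # The global maximum never merges into a higher component.
    result[peak[find(order[0])]] = float("inf")
    return result


persistence = peak_persistence([0, 5, 1, 3, 0])
```

Here the peak at index 3 (height 3) merges with the higher peak at level 1, so its persistence is 2; regions with large persistence are the ones that survive across many thresholds.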
|
|
Exploring Victorian Illustrated Newspaper Data through Computer Vision Techniques
The aim of this project is to explore how computer vision and image processing techniques can be adapted for large-scale interpretation of historical materials.
We applied several computer vision techniques to a set of nineteenth-century illustrated British newspapers to test the feasibility of these techniques for analyzing large collections of historical illustrations.
[details] [project page]
- Created a Victorian newspaper illustration dataset by extracting illustration regions from scanned newspaper pages with high accuracy.
- Developed a Fourier transform based feature to distinguish line engravings and halftone images for tracking the presence of halftone images in late nineteenth-century British newspapers.
- Extracted specific scenes such as portraits, crowds, buildings and weather charts using k-means, KNN and SVM based on the GIST descriptor.
|
|
Non-Rigid Image Registration with Uncertainty Analysis
We proposed a novel non-rigid image registration methodology which can be applied to medical images as well as natural images.
We also provided the uncertainty bounds to characterize the registration accuracy over the entire image domain.
[details] [poster]
- Developed a topology-based correspondence point matching algorithm under a Lipschitz non-rigid deformation with zero false negative rate and high precision.
- Extended the point matching to region registration by solving a graph matching problem with geometric constraints.
- Developed an approach to quantify the uncertainty of the region registration.
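One way a Lipschitz bound can prune candidate matches without ever discarding a true one is sketched below. It assumes that both the deformation and its inverse are L-Lipschitz, so distances between corresponding points can grow or shrink by at most a factor L; this bi-Lipschitz assumption, the function names and the sample points are illustrative and are not claimed to be the algorithm listed above.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def consistent(p_a: Point, q_a: Point, p_b: Point, q_b: Point, lipschitz: float) -> bool:
    """Can the matches p_a -> q_a and p_b -> q_b coexist under the bound?"""
    d_src = math.dist(p_a, p_b)
    d_dst = math.dist(q_a, q_b)
    return d_dst <= lipschitz * d_src and d_src <= lipschitz * d_dst


def prune_candidates(
    known_src: Point, known_dst: Point, point: Point,
    candidates: List[Point], lipschitz: float,
) -> List[Point]:
    """Keep only candidates for `point` that agree with a known correspondence."""
    return [q for q in candidates
            if consistent(known_src, known_dst, point, q, lipschitz)]


kept = prune_candidates(
    known_src=(0.0, 0.0), known_dst=(0.0, 0.0), point=(1.0, 0.0),
    candidates=[(1.1, 0.0), (3.0, 0.0), (0.2, 0.0)], lipschitz=1.5,
)
```

Because a true correspondence always satisfies the bound, this kind of test can only remove false candidates, which is consistent with a zero false negative rate.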
|
Automated species-level identification of planktic foraminifera using convolutional neural networks, with comparison to human performance
R. Mitra, T. Marchitto, Q. Ge, B. Zhong, B. Kanakiya, M.S. Cook, J.S. Fehrenbacher, J.D. Ortiz, A. Tripati and E. Lobaton
Marine Micropaleontology 2019
[1]
[abs] [link]
Picking foraminifera from sediment samples is an essential, but repetitive and low-reward task that is well-suited for automation. The first step toward building a picking robot is the development of an automated identification system. We use machine learning techniques to train convolutional neural networks (CNNs) to identify six species of extant planktic foraminifera that are widely used by paleoceanographers, and to distinguish the six species from other taxa. We employ CNNs that were previously built and trained for image classification. Foraminiferal training and identification use reflected light microscope digital images taken at 16 different illumination angles using a light-emitting diode (LED) ring. Overall machine accuracy, as a combination of precision and recall, is better than 80% even with limited training. We compare machine performance to that of human pickers (six experts and five novices) by tasking each with the identification of 540 specimens based on images. Experts achieved comparable precision but poorer recall relative to the machine, with an average accuracy of 63%. Novices scored lower than experts on both precision and recall, for an overall accuracy of 53%. The machine achieved fairly uniform performance across the six species, while participants’ scores were strongly species-dependent, commensurate with their past experience and expertise. The machine was also less sensitive to specimen orientation (umbilical versus spiral views) than the humans. These results demonstrate that our approach can provide a versatile ‘brain’ for an eventual automated robotic picking system.
|
Image Analytics and the Nineteenth-Century Illustrated Newspaper
P. Fyfe and Q. Ge
Journal of Cultural Analytics 2018
[2]
[link]
|
Obstacle Detection in Outdoor Scenes based on Multi-Valued Stereo Disparity Maps
Q. Ge and E. Lobaton
IEEE Symp. Series Comput. Intell. (SSCI) 2017
[3]
[abs] [pdf]
In this paper, we propose a methodology for robust obstacle detection in outdoor scenes for autonomous driving applications using a multi-valued stereo disparity approach. Traditionally, disparity maps computed from stereo pairs only provide a single estimated disparity value for each pixel. However, disparity computation suffers heavily from reflections, lack of texture and repetitive patterns of objects. This may lead to wrong estimates, which can introduce some bias on obstacle detection approaches that make use of the disparity map. To overcome this problem, instead of a single-valued disparity estimation, we propose making use of multiple candidates per pixel. The candidates are selected from a statistical analysis that characterizes the performance of the underlying matching cost function based on two metrics: The number of candidates extracted, and the distance from these candidates to the true disparity value. Then, we construct an aggregate occupancy map in u-disparity space from which obstacle detection is obtained. Experiments show that our approach can recover the correct structure of obstacles on the scene when traditional estimation approaches fail.
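The u-disparity space mentioned in the abstract can be illustrated with a minimal sketch: for every image column u, count how many pixels take each disparity value, so that vertical obstacles show up as high counts at one disparity. The sketch assumes a single integer disparity per pixel with negative values marking invalid pixels; the paper aggregates multiple candidates per pixel, which is not reproduced here, and the function name is hypothetical.

```python
from typing import List


def u_disparity(disparity: List[List[int]], max_disp: int) -> List[List[int]]:
    """Return a (max_disp + 1) x width histogram indexed as hist[d][u]."""
    width = len(disparity[0])
    hist = [[0] * width for _ in range(max_disp + 1)]
    for row in disparity:
        for u, d in enumerate(row):
            if 0 <= d <= max_disp:  # skip invalid or out-of-range disparities
                hist[d][u] += 1
    return hist


disparity_map = [
    [1, 0],
    [1, 2],
    [1, -1],  # -1 marks an invalid pixel in this toy example
]
occupancy = u_disparity(disparity_map, max_disp=2)
```

Column 0 has three pixels at disparity 1, which is the kind of vertical structure an obstacle produces.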
|
Coarse-to-Fine Foraminifera Image Segmentation through 3D and Deep Features
Q. Ge, B. Zhong, B. Kanakiya, R. Mitra, T. Marchitto, and E. Lobaton
IEEE Symp. Series Comput. Intell. (SSCI) 2017
[4]
[abs] [pdf]
Foraminifera are single-celled marine organisms, which are usually less than 1 mm in diameter. One of the most common tasks associated with foraminifera is the species identification of thousands of foraminifera contained in rock or ocean sediment samples, which can be a tedious manual procedure. Thus an automatic visual identification system is desirable. Some of the primary criteria for foraminifera species identification come from the characteristics of the shell itself. As such, segmentation of chambers and apertures in foraminifera images would provide powerful features for species identification. Nevertheless, none of the existing image-based, automatic classification approaches make use of segmentation, partly due to the lack of accurate segmentation methods for foraminifera images. In this paper, we propose a learning-based edge detection pipeline, using a coarse-to-fine strategy, to extract the vague edges from foraminifera images for segmentation using a relatively small training set. The experiments demonstrate our approach is able to segment chambers and apertures of foraminifera correctly and has the potential to provide useful features for species identification and other applications such as morphological study of foraminifera shells and foraminifera dataset labeling.
|
A Comparative Study of Image Classification Algorithms for Foraminifera Identification
B. Zhong, Q. Ge, B. Kanakiya, R. Mitra, T. Marchitto, and E. Lobaton
IEEE Symp. Series Comput. Intell. (SSCI) 2017
[5]
[abs] [pdf]
Identifying Foraminifera (or forams for short) is essential for oceanographic and geoscience research as well as petroleum exploration. Currently, this is mostly accomplished using trained human pickers, routinely taking weeks or even months to accomplish the task. In this paper, a foram identification pipeline is proposed to automatically identify forams based on computer vision and machine learning techniques. A microscope based image capturing system is used to collect a labelled image data set. Various popular image classification algorithms are adapted to this specific task and evaluated under various conditions. Finally, the potential of a weighted cross-entropy loss function in adjusting the trade-off between precision and recall is tested. The classification algorithms provide competitive results when compared to human experts' labeling of the data set.
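How a weighted cross-entropy loss shifts the precision/recall trade-off can be seen in a small binary sketch: raising the weight on positive examples makes a missed positive more costly than a false alarm, which tends to raise recall at the expense of precision. The binary form, the parameter name pos_weight and the values below are assumptions for illustration; the paper's exact loss is not given in the abstract.

```python
import math
from typing import Sequence


def weighted_bce(
    labels: Sequence[int], probs: Sequence[float],
    pos_weight: float = 1.0, eps: float = 1e-12,
) -> float:
    """Mean binary cross-entropy with the positive term scaled by pos_weight."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


missed_positive = weighted_bce([1], [0.5], pos_weight=2.0)
false_alarm = weighted_bce([0], [0.5], pos_weight=2.0)
```

With pos_weight = 2, the same uncertain prediction costs twice as much on a positive example as on a negative one.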
|
Consensus-Based Image Segmentation via Topological Persistence
Q. Ge and E. Lobaton
IEEE Conf. on Comput. Vis. Pattern Recognit. Workshops (CVPRW) 2016
[6]
[abs] [pdf]
Image segmentation is one of the most important low-level operations in image processing and computer vision. It is unlikely for a single algorithm with a fixed set of parameters to segment various images successfully due to variations between images. However, it can be observed that the desired segmentation boundaries are often detected more consistently than other boundaries in the output of state-of-the-art segmentation results. In this paper, we propose a new approach to capture the consensus of information from a set of segmentations generated by varying parameters of different algorithms. The probability of a segmentation curve being present is estimated based on our probabilistic image segmentation model. A connectivity probability map is constructed and persistent segments are extracted by applying topological persistence to the probability map. Finally, a robust segmentation is obtained with the detection of certain segmentation curves guaranteed. The experiments demonstrate our algorithm is able to consistently capture the curves present within the segmentation set.
|
Robust Multi-Target Tracking in Outdoor Traffic Scenarios via Persistence Topology based Robust Motion Segmentation
S. Chattopadhyay, Q. Ge, C. Wei, and E. Lobaton
IEEE Global Conf. Signal Inf. Process. (GlobalSIP) 2015
[7]
[abs] [pdf]
In this paper, we present a motion segmentation based robust multi-target tracking technique for on-road obstacles. Our approach uses depth imaging information, and integrates persistence topology for segmentation and min-max network flow for tracking. To reduce time as well as computational complexity, the max flow problem is solved using a dynamic programming algorithm. We classify the sensor readings into stationary and moving regions by aligning occupancy maps obtained from the disparity images, and then incorporate a Kalman filter into the network flow algorithm to track the moving objects robustly. Our algorithm has been tested on several real-life stereo datasets and the results show that there is an improvement by a factor of three on robustness when comparing performance with and without the topological persistent detections. We also measure the accuracy of our algorithm using popular evaluation metrics for segmentation and tracking, and the results look promising.
|
Robust Obstacle Segmentation based on Topological Persistence in Outdoor Traffic Scenes
C. Wei, Q. Ge, S. Chattopadhyay, and E. Lobaton
IEEE Symp. Series Comput. Intell. (SSCI) 2014
[8]
[abs] [pdf]
In this paper, a new methodology for robust segmentation of obstacles from stereo disparity maps in an on-road environment is presented. We first construct a probability of the occupancy map using the UV-disparity methodology. Traditionally, a simple threshold has been applied to segment obstacles from the occupancy map based on the connectivity of the resulting regions; however, this outcome is sensitive to the choice of parameter value. In our proposed method, instead of simple thresholding, we perform a topological persistence analysis on the constructed occupancy map. The topological framework hierarchically encodes all possible segmentation results as a function of the threshold, thus we can identify the regions that are most persistent. This leads to a more robust segmentation. The approach is analyzed using real stereo image pairs from standard datasets.
|
Manifold Learning Approach to Curve Identification with Applications to Footprint Segmentation
N. Lokare, Q. Ge, W. Snyder, Z. Jewell, S. Allibhai, and E. Lobaton
IEEE Symp. Series Comput. Intell. (SSCI) 2014
[9]
[abs] [pdf]
|
Non-Rigid Image Registration under Non-Deterministic Deformation Bounds
Q. Ge, N. Lokare, and E. Lobaton
10th International Symposium on Medical Information Processing and Analysis 2014
[10]
[abs] [pdf]
Image registration aims to identify the mapping between corresponding locations in an anatomic structure. Most traditional approaches solve this problem by minimizing some error metric. However, they do not quantify the uncertainty behind their estimates and the feasibility of other solutions. In this work, it is assumed that two images of the same anatomic structure are related via a Lipschitz non-rigid deformation (the registration map). An approach for identifying point correspondences with zero false-negative rate and high precision is introduced under this assumption. This methodology is then extended to registration of regions in an image which is posed as a graph matching problem with geometric constraints. The outcome of this approach is a homeomorphism with uncertainty bounds characterizing its accuracy over the entire image domain. The method is tested by applying deformation maps to the LPBA40 dataset.
|