1 MOTIVATION, RESEARCH QUESTIONS, and OBJECTIVES
1.1 Motivation
The motivation behind this research stems from the need to address extreme cases of keratoconus and the significance of the demarcation line in gauging treatment effectiveness. Traditional methods for demarcation line determination may be subjective and time-consuming, prompting the exploration of automated solutions. Our project seeks to contribute to the efficiency and accuracy of identifying the demarcation line through the application of advanced deep learning techniques.
1.2 Research Questions and Objectives
Our research is guided by the following question:
- How can deep learning, specifically CNNs, be leveraged to automate the identification of the demarcation line in cornea scans taken after surgery?
To answer it, we pursue the following objectives:
- Explore and adapt state-of-the-art deep learning architectures, specifically the U-Net, to efficiently identify and analyze the demarcation line in post-CXL images.
- Develop a robust and automated system for demarcation line identification, enhancing the speed and accuracy of post-surgical evaluations for keratoconus patients.
2 INTRODUCTION
Keratoconus, a condition characterized by corneal thinning leading to visual impairment, has prompted the development of various surgical interventions, with corneal cross-linking (CXL) emerging as a standard treatment. Post-CXL, the visibility of the demarcation line serves as a critical indicator of the surgery’s success. This report explores the significance of the demarcation line and the challenges associated with its identification, presenting an overview of how recent advancements in deep learning, particularly Convolutional Neural Networks (CNNs), can enhance efficiency in automating this process.
As we delve into the research surrounding the demarcation line in keratoconus patients post-CXL, our project aims to contribute to the growing body of knowledge on automated image analysis in ophthalmology. By leveraging advancements in deep learning, we anticipate enhancing the precision and efficiency of demarcation line identification, ultimately improving the evaluation process for the success of cross-linking surgery in treating keratoconus.
3 DATA COLLECTION AND INITIAL ANALYSIS
In pursuit of advancing automated demarcation line identification in keratoconus patients post-cross-linking surgery, our team set out to obtain relevant data for analysis, which involved reaching out to Dr. Jad Assaf, a medical expert and deep learning researcher.
3.1 Data Source and Collaboration with Jad Assaf
Leveraging Dr. Assaf's expertise in the field of deep learning, the collaboration resulted in a dataset comprising eye scans from two notable medical institutions: the Elza Institute in Switzerland and AUBMC in Lebanon.
3.2 Data Characteristics and Size
The dataset comprises a total of 939 scanned images obtained from 61 anonymized patients. Notably, each patient contributed multiple scans, adding depth and variability to the dataset. The inclusion of scans from different hospitals introduces diversity in imaging conditions and practices, enhancing the robustness of our analysis.
3.3 Initial Analysis: Inputs and labeled outputs
Upon obtaining the dataset, we conducted an initial analysis to understand its composition. The input data, denoted as X, consists of eye scans, capturing the variations inherent in keratoconus cases. These scans exhibit differences in the visibility of the demarcation line, reflecting the diverse nature of post-surgical outcomes.
The output data, denoted as y, is a binary mask: a matrix of the same spatial dimensions as the scan, in which each pixel takes the value 1 if it belongs to the demarcation line and 0 otherwise. This binary encoding frames the task as pixel-wise segmentation of the demarcation line.
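To make the input-output pairing concrete, the following is a minimal sketch of how such (X, y) pairs might be loaded, assuming a PyTorch pipeline with scans and masks stored as image files; the directory layout, filenames, and the CorneaScanDataset class itself are illustrative, not our actual setup.

```python
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class CorneaScanDataset(Dataset):
    """Pairs each grayscale cornea scan X with its binary demarcation-line mask y."""

    def __init__(self, scan_dir, mask_dir):
        self.scan_dir = scan_dir
        self.mask_dir = mask_dir
        self.names = sorted(os.listdir(scan_dir))  # assumes matching filenames in both folders

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        # Load the scan as a single-channel float array scaled to [0, 1].
        scan = np.array(Image.open(os.path.join(self.scan_dir, name)).convert("L"),
                        dtype=np.float32) / 255.0
        # Load the mask and binarize: 1 = demarcation-line pixel, 0 = background.
        mask = np.array(Image.open(os.path.join(self.mask_dir, name)).convert("L"),
                        dtype=np.float32)
        mask = (mask > 127).astype(np.float32)
        # Add a leading channel dimension so shapes are (1, H, W).
        return torch.from_numpy(scan)[None], torch.from_numpy(mask)[None]
```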
3.4 Visual Representation: Diverse Eye Scans
To provide a visual representation of the dataset, we present a subset of eye scans (X) that showcase diversity in demarcation line visibility. Some scans exhibit clear and easily identifiable demarcation lines, while others pose challenges due to factors such as image quality or post-surgical variations. This visual variety emphasizes the need for a robust automated identification system.
3.5 Dataset Split and Composition
To facilitate the training, validation, and testing of our model, we performed an 80-10-10 split, allocating 80% of the data for training, 10% for validation, and 10% for testing. This ensures a balanced distribution of images across these subsets and allows for comprehensive model evaluation.

The collaboration with Dr. Jad Assaf, a medical expert, has been instrumental in acquiring a diverse and substantial dataset for our project. The initial analysis has provided insights into the structure of the data, emphasizing the binary, pixel-wise nature of the demarcation line task. As we move forward, this dataset will serve as the foundation for training and evaluating our deep learning model, ultimately contributing to the advancement of automated demarcation line identification in keratoconus patients.
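As a concrete illustration, the 80-10-10 split described above could be realized as follows, assuming PyTorch's random_split over the CorneaScanDataset sketched earlier; this is one possible procedure, not necessarily the exact one we used.

```python
import torch
from torch.utils.data import random_split

# Hypothetical dataset object holding all 939 scan/mask pairs.
dataset = CorneaScanDataset("scans/", "masks/")

n = len(dataset)
n_train = int(0.8 * n)
n_val = int(0.1 * n)
n_test = n - n_train - n_val  # remainder goes to test

# Fixed seed so the split is reproducible across runs.
train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42),
)
```

One design note: because each patient contributed multiple scans, splitting at the patient level rather than the image level would prevent scans of the same eye from leaking across subsets; the sketch above splits at the image level for simplicity.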
4 METHODS
Our baseline model is based on U-Net, a deep convolutional neural network designed specifically for biomedical image segmentation. The U-Net architecture consists of a contracting path and an expansive path. The contracting path resembles a typical CNN architecture: each step applies two 3x3 convolutions, each followed by a rectified linear unit (ReLU), and a 2x2 max pooling operation with stride 2 for downsampling. At each downsampling step, the number of feature channels is doubled. The contracting path mainly serves to capture context while reducing the spatial resolution of the input.
The contracting path is followed by a bottleneck, a set of convolutional layers that process the feature maps at the network's lowest spatial resolution. U-Net also introduces skip connections, which concatenate feature maps from the contracting path onto the corresponding layers in the expansive path. These skip connections reintroduce the fine-grained details lost during downsampling into the upsampling phase, which is crucial for pixel-wise segmentation.
The expansive path, or decoder, follows the bottleneck. Each step upsamples the feature map with a 2x2 up-convolution (transpose convolution) that halves the number of feature channels, progressively restoring the spatial dimensions and enabling precise localization. This is in contrast to typical CNNs, which generally do not expand the feature maps back to the original image size. Finally, a 1x1 convolution maps the learned features to the desired number of output channels; in segmentation tasks this typically corresponds to the number of classes, and in our case a single channel suffices for the binary demarcation-line mask.
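To ground this description, below is a compact sketch of a U-Net-style network in PyTorch; the depth and channel widths are illustrative and smaller than a full U-Net, and our actual configuration may differ.

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions, each followed by ReLU (padding keeps sizes aligned).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=1):
        super().__init__()
        # Contracting path: channels double at each downsampling step.
        self.enc1 = double_conv(in_ch, 64)
        self.enc2 = double_conv(64, 128)
        self.pool = nn.MaxPool2d(2)            # 2x2 max pooling, stride 2
        self.bottleneck = double_conv(128, 256)
        # Expansive path: 2x2 up-convolutions halve the channel count.
        self.up2 = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)
        self.dec2 = double_conv(256, 128)      # 256 = 128 (skip) + 128 (upsampled)
        self.up1 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec1 = double_conv(128, 64)
        # 1x1 convolution maps features to output channels (1 for a binary mask).
        self.head = nn.Conv2d(64, out_ch, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        # Skip connections: concatenate encoder features with decoder features.
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)  # raw logits; apply sigmoid for probabilities
```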
Additionally, we utilized model checkpointing and learning rate scheduling to aid our training. Model checkpointing involves saving the state of the model at specific intervals, allowing us not only to recover the model's state in case of interruptions but also to select the best-performing model for further use and analysis. In addition, we implemented learning rate scheduling to optimize the training phase of our model, improve accuracy, and reduce overfitting.
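A minimal sketch of how these two mechanisms fit into a training loop follows, assuming PyTorch's ReduceLROnPlateau scheduler and a checkpoint keyed on validation loss; the train_one_epoch and evaluate helpers are hypothetical stand-ins, and our actual training script may differ in detail.

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = MiniUNet()  # the sketch from the previous section
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Halve the learning rate when validation loss plateaus for 5 epochs.
scheduler = ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=5)

best_val_loss = float("inf")
for epoch in range(200):
    train_one_epoch(model, optimizer)   # hypothetical helper: one pass over train_set
    val_loss = evaluate(model)          # hypothetical helper: mean loss on val_set
    scheduler.step(val_loss)            # learning rate scheduling
    if val_loss < best_val_loss:        # model checkpointing: keep the best model
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best_model.pt")
```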
5 RESULTS
In training our model, we used the Weights & Biases (W&B) platform to aid our model tracking. Weights & Biases is a robust tool that supports end-to-end machine learning development and is widely used by machine learning researchers to track model performance, compare models, and tune hyperparameters.
We made use of Weights & Biases' visualization tools to track our model's progress. Despite using a learning rate scheduler, we observed an oscillating training loss across epochs. The validation loss curve was decreasing overall, with the lowest validation loss reached at epoch 117. Overall, the model is learning, but it is not yet fully optimized and will require further tuning to achieve our objective.
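For reference, experiment tracking of this kind takes only a few lines with the wandb Python client; the project name, config values, and helpers below are illustrative rather than our exact setup.

```python
import wandb

# Initialize a run; project and config names are illustrative.
wandb.init(project="demarcation-line-unet",
           config={"lr": 1e-3, "epochs": 200, "architecture": "U-Net"})

for epoch in range(200):
    train_loss = train_one_epoch(model, optimizer)  # hypothetical helper
    val_loss = evaluate(model)                      # hypothetical helper
    # Log metrics per epoch; W&B renders these as loss curves in the dashboard.
    wandb.log({"epoch": epoch, "train_loss": train_loss, "val_loss": val_loss})

wandb.finish()
```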
The model has not yet been finalized, and the final results still need to be validated by medical personnel to confirm that it performs well on unseen images. Below is a visual representation of the results we have so far.
The model's prediction is close to the true mask, but it allocates slightly more pixels to class 1 (part of the demarcation line) than the ground truth does. Moving forward, we will try to either decrease the loss weight assigned to positive (white) pixels during training or apply post-processing to the output. As of now, we can say with reasonable confidence that the model is approaching our objective.
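Both remedies are straightforward to sketch. Assuming a PyTorch setup, the positive-pixel weight can be lowered via the pos_weight argument of BCEWithLogitsLoss, and a simple morphological opening can remove spurious positives; the specific values below are illustrative, not tuned.

```python
import torch
import torch.nn as nn
from scipy.ndimage import binary_opening  # simple morphological post-processing

# pos_weight < 1 down-weights positive (white) pixels in the loss,
# discouraging the model from over-predicting class 1.
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(0.5))  # illustrative value

def postprocess(logits, threshold=0.5):
    """Binarize a single predicted mask (H, W), then erase small spurious positives."""
    probs = torch.sigmoid(logits).squeeze()
    mask = (probs > threshold).cpu().numpy().astype(bool)
    return binary_opening(mask, iterations=1)
```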
6 CONCLUSION
Overall, this project was a highlight of our graduate studies, serving as an ideal segue into the realm of computer vision. The complexity involved in comprehending images as tensors, the intricacies of the network architecture and its loss functions, and the nuances of the encoder-decoder framework, coupled with the use of a real dataset, made this a challenging yet immensely rewarding endeavor.
As we move forward, our focus will be on enhancing the model to its fullest potential and exploring the possibility of publishing our work related to this project.