Deep learning to identify COVID-19 lesions in lung CT scans

An efficient methodology for experimentation, transfer learning, and continuing optimization of AI models.

The COVID-19 pandemic is waging an unparalleled assault on human health.

Mass hospitalizations and the high levels of critical care required by many patients can push healthcare institutions and staff to their limits. By April 2020, the consensus was that—although not generally recommended for initial COVID-19 diagnosis—chest imaging is indicated in patients with worsening respiratory symptoms.

COVID pneumonia (viral infection in the lungs), which is detected by chest x-rays or CT scans, can predict the need for more advanced inpatient care. However, large numbers of CT scans might have to be carefully analyzed and compared with earlier scans of the same patient.

A busy hospital might perform many lung CTs per day, potentially affecting the service levels that radiology teams are able to deliver. Artificial intelligence (AI) can serve as a valuable diagnostic aid, augmenting the capabilities of radiology teams and enabling them to make optimum use of available resources.

By prescreening the CT scans of COVID-19 patients, an accurate AI model can quickly reveal critical results. Care teams can then zero in on patients at higher risk for severe complications.

Modern deep learning models—based on convolutional neural networks (CNNs) and trained on up-to-date patient data—can identify COVID lung lesions with a high level of performance and accuracy at scale. However, model tuning, testing, and ongoing training are necessary to create and sustain an optimized AI model.

Careful attention to traceability, reproducibility, and patient privacy is essential. NetApp and SFL Scientific have developed technology for high-performing COVID-19 lung segmentation that uses a state-of-the-art model and transfer learning.

Our methodology delivers an accurate, trained model in a short time and supports ongoing training and optimization with complete traceability. Running on fast and efficient NetApp® storage infrastructure, the model takes an average of just 6 seconds to identify the COVID lesions on each patient scan (hundreds of images). This speed is on par with other advanced models and much faster than a typical human analysis of a chest CT.

Rapid prototyping of an AI model for COVID-19 lung CT scans

Transfer learning approach

To deliver a highly effective COVID lesion detection and quantification model in a short time, we used a transfer learning approach. Transfer learning is the process of fine-tuning a previously trained neural network for a similar or new use case. NetApp and SFL started with a pretrained deep learning model and tuned it for our needs, optimizing for performance.
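
As a rough illustration of the idea, the sketch below keeps a "pretrained" layer frozen and trains only a small new output head on data for the new task. This is one common variant of transfer learning; the source does not say which layers of the pretrained model were retrained. The weights, dataset, and function names are all made up for illustration and are not taken from the NetApp and SFL Scientific pipeline.

```python
import math

# Stand-in for pretrained weights; in practice these come from the pretrained model.
PRETRAINED_WEIGHTS = [[0.8, -0.3], [0.1, 0.9]]


def frozen_features(x):
    """Apply the pretrained layer (ReLU activation). Its weights are never updated."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_WEIGHTS]


def predict(head, x):
    """Logistic output of the trainable head on top of the frozen features."""
    features = frozen_features(x)
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(head, data):
    """Average binary cross-entropy over the dataset."""
    total = 0.0
    for x, y in data:
        p = predict(head, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def fine_tune(head, data, learning_rate=0.5, epochs=200):
    """Update only the head with gradient descent; the pretrained layer stays fixed."""
    for _ in range(epochs):
        for x, y in data:
            features = frozen_features(x)
            error = predict(head, x) - y  # gradient of the loss w.r.t. the logit
            head["weights"] = [w - learning_rate * error * f
                               for w, f in zip(head["weights"], features)]
            head["bias"] -= learning_rate * error
    return head


# Tiny made-up dataset for the new task: (input, label).
new_task_data = [([1.0, 0.0], 1), ([0.9, 0.2], 1), ([0.0, 1.0], 0), ([0.1, 0.8], 0)]
head = {"weights": [0.0, 0.0], "bias": 0.0}
loss_before = mean_loss(head, new_task_data)
head = fine_tune(head, new_task_data)
loss_after = mean_loss(head, new_task_data)
```

Comparing loss_before with loss_after shows the effect of fine-tuning on the new data, while PRETRAINED_WEIGHTS remains unchanged throughout.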

NVIDIA Clara COVID-19 lung CT lesion segmentation model

NVIDIA Clara, NVIDIA's healthcare suite, includes a COVID-19 lung CT lesion segmentation model developed in conjunction with the U.S. National Institutes of Health (NIH). The Clara model was trained on a dataset of 913 independent subjects from around the globe, annotated by human experts. The model is designed to be deployed in a scalable manner.

State-of-the-art accuracy

The Clara algorithm identifies lung lesions with state-of-the-art accuracy. Because the clinical presentation of COVID-19 lung lesions can vary from one region or population to another, the algorithm can be further enhanced by training with additional patient data, such as data from a particular geography or patient demographic.

Fine-tuning with transfer learning

Fine-tuning with additional COVID data

The pretrained NVIDIA model was fine-tuned through training with additional COVID data. As a proof of concept of the use case, we used the dataset from the COVID-19 Lung CT Lesion Segmentation Challenge—2020, which provides 199 annotated scans. We were able to measurably improve model performance in the first transfer-learning experiment.
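
The source does not say which metric was used to measure the improvement. A common choice for segmentation work is the Dice coefficient, which scores the overlap between a predicted lesion mask and the expert annotation. The following is a minimal sketch, assuming binary masks flattened to lists of 0s and 1s. The function name and the sample masks are illustrative only and are not results from the experiment described above.

```python
def dice_coefficient(predicted, reference):
    """Return the Dice overlap between two equal-length binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Two empty masks agree perfectly; treat that case as a score of 1.0.
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


# Illustrative masks only: 1 marks a voxel labeled as lesion.
reference_mask = [0, 1, 1, 1, 0, 0, 1, 0]
baseline_pred = [0, 1, 0, 0, 0, 1, 1, 0]
finetuned_pred = [0, 1, 1, 0, 0, 0, 1, 0]

baseline_score = dice_coefficient(baseline_pred, reference_mask)
finetuned_score = dice_coefficient(finetuned_pred, reference_mask)
```

A score of 1.0 means the masks match exactly and 0.0 means they share no lesion voxels, so a higher score after fine-tuning indicates closer agreement with the annotation.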

Iterative model tuning experiments

Typically, our data scientists experiment with a range of model configurations and data transformation methods. To maximize model performance, they run multiple model tuning experiments, making adjustments during each iteration and selecting the best-performing model. After they select the model, they usually retrain it at regular intervals by using the latest data. This retraining helps them minimize errors, continue to increase accuracy, reduce bias, and ultimately help save lives. They might also continue experimentation, seeking further optimization.
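
The experiment loop described here can be sketched as a simple search over configurations. The search space below is hypothetical, because the source does not list which settings were tuned, and train_and_score is a placeholder that returns fixed, made-up numbers so that the example runs without a GPU or dataset.

```python
from itertools import product

# Hypothetical search space; the source does not list the settings that were tuned.
LEARNING_RATES = [1e-3, 1e-4]
AUGMENTATIONS = ["none", "flip", "flip+rotate"]


def train_and_score(learning_rate, augmentation):
    """Placeholder for one fine-tuning run that returns a validation score.

    A real run would train the model and evaluate it on held-out scans.
    This stand-in returns fixed, made-up numbers so the loop is runnable.
    """
    base = {"none": 0.60, "flip": 0.65, "flip+rotate": 0.70}[augmentation]
    bonus = 0.05 if learning_rate == 1e-4 else 0.0
    return base + bonus


def run_experiments():
    """Try every configuration, keep each result, and return the best one."""
    results = []
    for learning_rate, augmentation in product(LEARNING_RATES, AUGMENTATIONS):
        score = train_and_score(learning_rate, augmentation)
        results.append({"learning_rate": learning_rate,
                        "augmentation": augmentation,
                        "score": score})
    best = max(results, key=lambda r: r["score"])
    return best, results


best_config, all_results = run_experiments()
```

Keeping every result, not only the winner, makes it possible to revisit the comparison when the model is retrained on newer data.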

Intelligent data management

During experimentation and retraining, it's important to make data efficient to manage and easy to trace. AI training workflows are often complex. Data scientists and data engineers might need to pull data from multiple data sources, and data sources aren't always compatible with one another. Data scientists need the right tools to solve these problems. They need to unify data that comes from different sources, environments, platforms, and protocols.

NetApp AI Control Plane and Data Science Toolkit

Although there are other tools for iterative experimentation, model training, and deployment, most of them don't streamline data management. The NetApp AI Control Plane pairs machine learning operations (MLOps) tools with NetApp technology to simplify the management of AI data and facilitate experimentation. The NetApp Data Science Toolkit makes it easier to manage the large volumes of data required for training deep learning models. Used together or separately, they can significantly speed up AI projects. Using these tools, we're able to quickly set up and clone the volumes needed for training, perform experiments, evaluate results, and iterate quickly. All of these tasks are fully traceable so that they're reliable and compliant, and can be reproduced.
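
To make the traceability requirement concrete, the sketch below records each training run together with a fingerprint of the exact data it used. It is a generic illustration that uses only the Python standard library; it does not use or represent the NetApp AI Control Plane or NetApp Data Science Toolkit APIs, and the file names, run ID, and score are made up.

```python
import hashlib
import json


def dataset_fingerprint(files):
    """Hash file names and contents so a run can be tied to the exact data used.

    `files` maps a file name to its bytes; sorting makes the hash order-independent.
    """
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(hashlib.sha256(files[name]).digest())
    return digest.hexdigest()


def experiment_record(run_id, config, files, score):
    """Build a JSON record linking configuration, data version, and result."""
    record = {
        "run_id": run_id,
        "config": config,
        "dataset_sha256": dataset_fingerprint(files),
        "score": score,
    }
    return json.dumps(record, sort_keys=True)


# Made-up stand-ins for scan files and settings.
training_files = {"scan_001.nii": b"voxels-a", "scan_002.nii": b"voxels-b"}
record = experiment_record("run-01", {"learning_rate": 1e-4}, training_files, 0.71)
```

If any training file changes, the fingerprint changes too, so two runs with the same fingerprint and configuration can be treated as repeats of the same experiment.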

Optimized infrastructure for AI

NetApp ONTAP AI

This experimentation and training benefit from the robust data pipeline and parallel processing capability of NetApp ONTAP® AI. ONTAP AI consolidates a data center's worth of analytics, training, and inferencing power into a single system. From preprocessing to feeding data to neural networks to model training and retraining, ONTAP AI removes performance bottlenecks and speeds up AI workloads. Data scientists and data engineers can accomplish more work in less time. In healthcare, these advantages can translate to improved patient outcomes.

Possible clinical applications

Deployment options

You can deploy the model to on-premises servers by using the NVIDIA Clara Deploy SDK. Or you can deploy directly to embedded devices by using Clara AGX. Our COVID-19 segmentation approach can be extended to address other research and clinical needs, including the following use cases:
  • Automatic CT scan monitoring. All chest CT scans that pass through a hospital system can be automatically and routinely screened as part of the radiology workflow. This monitoring has the potential to identify asymptomatic patients.
  • Clinical study treatment monitoring. By automating the comparison of scans through time, the model can help researchers evaluate the efficacy of a drug or treatment.
  • Outcome prediction. Additional models can help predict disease progression for patients and optimize treatment. Outcome prediction can help hospitals manage capacity and tailor treatment plans to patient needs.
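
For the treatment-monitoring use case, comparing scans through time could be as simple as tracking the share of lung volume labeled as lesion from one scan to the next. The sketch below assumes that the segmentation output has already been reduced to voxel counts per scan. The function names and the patient data are hypothetical, and the source does not specify how scans are compared.

```python
def lesion_burden(lesion_voxels, lung_voxels):
    """Share of lung volume labeled as lesion, as a percentage."""
    if lung_voxels <= 0:
        raise ValueError("lung_voxels must be positive")
    return 100.0 * lesion_voxels / lung_voxels


def burden_changes(scans):
    """Return the change in lesion burden between consecutive scans.

    `scans` is a list of (label, lesion_voxels, lung_voxels) tuples that is
    already ordered by acquisition date.
    """
    burdens = [(label, lesion_burden(lesion, lung)) for label, lesion, lung in scans]
    changes = []
    for (_, previous), (label, current) in zip(burdens, burdens[1:]):
        changes.append((label, current - previous))
    return changes


# Made-up voxel counts for one hypothetical patient.
patient_scans = [
    ("day 0", 120_000, 1_000_000),
    ("day 7", 200_000, 1_000_000),
    ("day 14", 90_000, 1_000_000),
]
changes = burden_changes(patient_scans)
```

A positive change means that the lesion burden grew since the previous scan, and a negative change means that it shrank.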

More AI opportunities

Generalized image segmentation

The methodology we used to quickly create a COVID-19 lung segmentation model can be generalized and applied to almost any image segmentation task. With the appropriate data, we can help you create useful AI segmentation models for any organ system. The models can encompass imaging methods that range from simple 2D X-rays to 3D CT and MRI scans or ultrasound. Similar methods can also be applied to digital pathology.

Beyond medical imaging

Beyond medical imaging, the same approach can be applied to a wide range of computer vision, natural language processing (NLP), and other use cases in healthcare and other industries. The approach would combine transfer learning, experimentation, iterative fine-tuning, intelligent data management, and production deployment with regular retraining. NetApp and SFL Scientific help you get your AI project to production more quickly with fewer missteps.

About our partnership

NetApp and SFL Scientific

The partnership between NetApp and SFL Scientific brings together SFL's proven data science and data engineering expertise and NetApp's industry-leading AI hardware and software.
Technical Specifications

Key figures and components cited in this overview.

  • Average inference time per patient scan
    6 seconds (hundreds of images)
  • Pretrained model dataset size
    913 independent subjects (annotated by human experts)
  • Transfer learning dataset
    COVID-19 Lung CT Lesion Segmentation Challenge—2020 (199 annotated scans)
  • Model architecture
    Convolutional neural networks (CNNs)

  • Pretrained model source
    NVIDIA Clara (developed with U.S. National Institutes of Health)
  • On-premises deployment
    NVIDIA Clara Deploy SDK
  • Embedded device deployment
    Clara AGX
  • Data management tools
    NetApp AI Control Plane, NetApp Data Science Toolkit
  • MLOps integration
    Machine learning operations (MLOps) tools paired with NetApp technology

  • AI infrastructure platform
    NetApp ONTAP® AI
  • Storage
    NetApp® storage infrastructure
  • Data Fabric scope
    Edge, Core, Cloud

Authorized NetApp Partner SANDataWorks · a division of BlueAlly