AI Diffusion Model-Based Technology for Automating the Multi-Class Labeling of Electron Microscopy Datasets of Brain Cell Organelles for Their Augmentation and Synthetic Generation

A technology for the automatic multi-class labeling of brain electron microscopy (EM) objects needed to create large synthetic datasets, which could be used for brain cell segmentation tasks, is proposed. The main research tools were a generative diffusion AI model and a U-Net-like segmentation mode...

Bibliographic Details
Main Authors: Nikolay Sokolov, Alexandra Getmanskaya, Vadim Turlapov
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Technologies
Subjects:
Online Access: https://www.mdpi.com/2227-7080/13/4/127
collection DOAJ
description A technology is proposed for the automatic multi-class labeling of brain electron microscopy (EM) objects, which is needed to create large synthetic datasets for brain cell segmentation tasks. The main research tools were a generative diffusion AI model and a U-Net-like segmentation model. The technology was studied on the task of segmenting up to six brain organelles. The initial dataset was the popular EPFL dataset, labeled for the mitochondria class, whose training and test parts contain 165 layers each. Our markup of the EPFL dataset, named EPFL6, contains six classes. The technology was implemented and studied in a two-step experiment: (1) dataset synthesis using a diffusion model trained on EPFL6; (2) evaluation of the labeling accuracy of a multi-class synthetic dataset through the segmentation accuracy on the test part of EPFL6. It was found that (1) the segmentation accuracy for the mitochondria class obtained with diffusion synthetic datasets was comparable to that obtained with the original datasets; (2) augmentation with geometric synthetics improved accuracy for underrepresented classes; (3) naturalizing geometric synthetics with the diffusion model had a positive effect; (4) augmenting the 165 layers of the original EPFL dataset with diffusion synthetics made it possible to reach and surpass the record accuracy of Dice = 0.948, previously achieved using 3D estimation in Hive-net (2021).
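The abstract reports segmentation accuracy as a Dice score. As a reading aid, here is a minimal sketch of a per-class Dice computation on flattened label maps. The function name `dice_per_class`, the integer class encoding, and the toy data are assumptions made for illustration and are not taken from the article, whose evaluation code may differ.

```python
def dice_per_class(pred, target, num_classes):
    """Dice = 2*|A and B| / (|A| + |B|) for each class label 0..num_classes-1.

    pred and target are flat sequences of integer class labels of equal length
    (an assumed encoding, one label per pixel).
    """
    scores = []
    for c in range(num_classes):
        p = [x == c for x in pred]      # pixels predicted as class c
        t = [y == c for y in target]    # pixels labeled as class c
        inter = sum(a and b for a, b in zip(p, t))
        denom = sum(p) + sum(t)
        # Class absent from both maps: Dice is undefined, report NaN.
        scores.append(2.0 * inter / denom if denom else float("nan"))
    return scores


# Toy 2x3 label maps, flattened row by row (hypothetical data).
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 1, 0, 0, 0]
scores = dice_per_class(pred, target, num_classes=2)
print(scores)
```

For a six-class markup such as EPFL6, the same function would be called with `num_classes=6`, giving one score per class so that underrepresented classes can be inspected separately.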
id doaj-art-a6acb543eb9741c8b0158b7282cecf6a
institution DOAJ
issn 2227-7080
spelling doaj-art-a6acb543eb9741c8b0158b7282cecf6a
Timestamp: 2025-08-20T03:13:32Z
Language: eng
Publisher: MDPI AG
Journal: Technologies
ISSN: 2227-7080
Published: 2025-03-01
Volume 13, Issue 4, Article 127
DOI: 10.3390/technologies13040127
Authors: Nikolay Sokolov, Alexandra Getmanskaya, Vadim Turlapov
Affiliation (all authors): Research Center for Artificial Intelligence, Institute of Information Technologies, Mathematics, and Mechanics, Lobachevsky University, 603022 Nizhny Novgorod, Russia
URL: https://www.mdpi.com/2227-7080/13/4/127
topic diffusion neural network
automatic multi-class labeling
electron microscopy
synthetic dataset
dataset augmentation
geometric augmentation