A novel machine learning based approach for iPS progenitor cell identification.

Bibliographic Details
Main Authors: Haishan Zhang, Ximing Shao, Yin Peng, Yanning Teng, Konda Mani Saravanan, Huiling Zhang, Hongchang Li, Yanjie Wei
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2019-12-01
Series: PLoS Computational Biology
Online Access: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007351&type=printable
collection DOAJ
description Identifying induced pluripotent stem (iPS) progenitor cells, the iPS-forming cells in the early stage of reprogramming, could provide valuable information for studying the origin and underlying mechanism of iPS cells. However, they are very difficult to identify experimentally: no biomarkers are known for early progenitor cells, and iPS cells can be determined experimentally via fluorescent probes only about 6 days after reprogramming initiation. Moreover, the ratio of progenitor cells during the early reprogramming period is below 5%, which is too low to capture experimentally at that stage. In this paper, we propose a novel computational approach for identifying iPS progenitor cells based on machine learning and microscopic image analysis. First, we record the reprogramming process using a live-cell imaging system after 48 hours of infection with retroviruses expressing Oct4, Sox2 and Klf4. iPS progenitor cells and normal murine embryonic fibroblasts (MEFs) within 3 to 5 days after infection are then labeled by retrospectively tracing the time-lapse microscopic images. Next, we calculate 11 types of cell morphological and motion features, such as area and speed, select the best time windows for modeling, and perform feature selection. Finally, we build a prediction model using XGBoost based on the six selected types of features and the best time windows. The model tolerates several missing values/frames in the sample datasets, so it is applicable to a wide range of scenarios. Cross-validation, holdout validation and independent test experiments show that the minimum precision is above 52%; that is, the ratio of predicted progenitor cells within 3 to 5 days after viral infection is above 52%. The results also confirm that the morphology and motion pattern of iPS progenitor cells differ from those of normal MEFs, which helps machine learning methods identify iPS progenitor cells.
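As a rough illustration of two ideas in the description, the sketch below computes a motion feature (mean centroid speed) over a time window in which some frames are missing, and evaluates precision for a set of predictions. It is a minimal sketch, not the authors' implementation: the function names `mean_speed` and `precision`, the frame interval `dt_hours`, and the use of `None` to mark a missing frame are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def mean_speed(track: Sequence[Optional[Point]], dt_hours: float) -> float:
    """Mean centroid speed over a time window.

    `track` holds one (x, y) centroid per frame; a missing frame is None
    (a hypothetical convention). Speed is measured between consecutive
    observed frames, dividing by the time that actually elapsed, so gaps
    do not inflate the estimate.
    """
    observed = [(i, p) for i, p in enumerate(track) if p is not None]
    if len(observed) < 2:
        return math.nan  # not enough data; left as a missing value
    speeds = []
    for (i0, p0), (i1, p1) in zip(observed, observed[1:]):
        distance = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
        speeds.append(distance / ((i1 - i0) * dt_hours))
    return sum(speeds) / len(speeds)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predicted positives (label 1) that are true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else math.nan


# Usage: the second frame is missing, so speed spans two frame intervals.
window = [(0.0, 0.0), None, (6.0, 8.0)]
speed = mean_speed(window, dt_hours=0.5)
prec = precision([1, 0, 1, 0], [1, 1, 1, 0])
```

Returning NaN when a window has too few observed frames leaves the feature as a missing value, the kind of gap the description says the model can tolerate.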
id doaj-art-82dcaa32c6a14e979f24cd2db4a2f804
institution OA Journals
issn 1553-734X
1553-7358
citation PLoS Computational Biology, vol. 15, no. 12, e1007351 (2019-12-01). doi:10.1371/journal.pcbi.1007351