NLP-enhanced inflation measurement using BERT and web scraping
In this research note, we explore the integration of natural language processing (NLP) and web scraping techniques to develop a custom price index for measuring inflation. Using the Harmonized Index of Consumer Prices (HICP) as a benchmark, we created a database of consumer electronics product data...
| Main Authors: | Martin Berki, Vanesa Andicsova, Milos Oravec |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-04-01 |
| Series: | Frontiers in Artificial Intelligence |
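| Subjects: | inflation measurement, natural language processing, web scraping, BERT, price index, economic analysis |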
| Online Access: | https://www.frontiersin.org/articles/10.3389/frai.2025.1520659/full |
| _version_ | 1849702798761918464 |
|---|---|
| author | Martin Berki, Vanesa Andicsova, Milos Oravec |
| collection | DOAJ |
| description | In this research note, we explore the integration of natural language processing (NLP) and web scraping techniques to develop a custom price index for measuring inflation. Using the Harmonized Index of Consumer Prices (HICP) as a benchmark, we created a database of consumer electronics product data through web scraping. Using the BERT model for classification, we achieved a high-performance classification of approximately 10,000 items into COICOP categories, with an accuracy of 94.56 %, macro precision of 79.41 %, and weighted precision of 94.07 % on validation data. Our custom index, particularly with weighted and median methodologies, demonstrated closer alignment with the official HICP while capturing more detailed price fluctuations within the market. Monthly inflation trends revealed variability that reflects price changes in the COICOP 091 category, contrasting with the relative stability of the official HICP. This work provides an alternative perspective on inflation measurement, highlighting the potential of computational approaches to enhance economic analysis. |
| format | Article |
| id | doaj-art-38557265a0f54aefa69ae2b6540ea286 |
| institution | DOAJ |
| issn | 2624-8212 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Artificial Intelligence |
| spelling | record updated 2025-08-20T03:17:31Z; volume 8; DOI 10.3389/frai.2025.1520659; article number 1520659 |
| title | NLP-enhanced inflation measurement using BERT and web scraping |
| topic | inflation measurement, natural language processing, web scraping, BERT, price index, economic analysis |
| url | https://www.frontiersin.org/articles/10.3389/frai.2025.1520659/full |
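
The description reports a macro precision (79.41 %) well below the weighted precision (94.07 %). A gap of this kind is consistent with imbalanced categories, where a rare class that is classified poorly pulls the unweighted average down but barely moves the support-weighted one. The sketch below illustrates the two averages on toy data. The labels `A` and `B`, the sample lists, and the function names are hypothetical and are not taken from the article or its code.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)  # how often each label was predicted
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    # A label that was never predicted gets precision 0.0 by convention here.
    return {
        lab: (true_pos[lab] / predicted[lab]) if predicted[lab] else 0.0
        for lab in labels
    }


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision: every class counts equally."""
    prec = per_class_precision(y_true, y_pred)
    return sum(prec.values()) / len(prec)


def weighted_precision(y_true, y_pred):
    """Mean of per-class precision weighted by each class's true count."""
    prec = per_class_precision(y_true, y_pred)
    support = Counter(y_true)
    return sum(prec[lab] * support[lab] for lab in prec) / len(y_true)


# Hypothetical toy data: class "A" is common, class "B" is rare.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 7 + ["B"] + ["B", "A"]

print(per_class_precision(y_true, y_pred))  # A: 7/8, B: 1/2
print(macro_precision(y_true, y_pred))      # (0.875 + 0.5) / 2
print(weighted_precision(y_true, y_pred))   # (0.875 * 8 + 0.5 * 2) / 10
```

In this toy case the rare class `B` has precision 0.5, so the macro average falls to 0.6875 while the weighted average stays at 0.8.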