On the Effectiveness of Automatic Code Generation for Synthetic Dataset Creation

This paper compares synthetic and real-world code datasets for machine learning applications in cybersecurity by examining the relationships between machine code and Low-Level Virtual Machine Intermediate Representation (LLVM IR). This study analyzes 1000 randomly generated programs from a compiler...

Bibliographic Details
Main Authors: Josh Mitchell, Varghese Mathew Vaidyan, Yong Wang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11072675/