Adversarial Data Augmentation With Vision Transformer for Image Classification Tasks
This work analyzes a new end-to-end hybrid model that combines adversarial data augmentation using C-GANs with a Vision Transformer (ViT) to improve image classification. Through multi-head self-attention, the ViT captures both the local and the global context of an image, which improves the accuracy of digit classification. The original MNIST images are used together with images generated by a C-GAN model to improve image quality and expand the dataset. The ViT is trained with tuned hyperparameters, including the number of epochs, weight decay, learning rate, and batch size, to improve classification outcomes. The model's performance is then assessed to investigate the influence of synthetic data and data augmentation. Combining original and synthesized data in the ViT framework contributes to a model that is trained on more diverse data and generalizes better, reaching an accuracy of 0.98.
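The self-attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it uses a single head, pure Python, and assumes identity projections (queries, keys, and values are the patch embeddings themselves), whereas a real ViT learns separate projections for each head. The function names and the toy `patches` values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections). A trained ViT would use learned projections.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights = []
    for query in tokens:
        # Similarity of this token to every token, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights.append(softmax(scores))
    # Each output token is a weighted average of all value vectors.
    outputs = [
        [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Three toy patch embeddings of dimension 2 (hypothetical values).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(patches)
```

Because every token attends to every other token, each output row mixes information from the whole image, which is the sense in which self-attention covers both local and global context.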
| Year of publication: | 2025 |
|---|---|
| Authors: | Kumar, Satrughan ; Kumar, Munish ; Mahapatra, Ranjan Kumar ; Gupta, Sumit ; Baronia, Arpita |
| Published in: | Exploring Generative Adversarial Networks and Meta-Learning Synergies. - IGI Global Scientific Publishing, ISBN 9798369375778. - 2025, p. 73-100 |
Similar items by person
- The Role of Generative Models in Modern Healthcare: Applications, Challenges, and Implications
  Kumar, Munish (2025)
- Impact of Drug Abuse on the Relationship with Families in India
  Kumar, Munish (2019)
- GANS and Meta-Learning: Identifying Key Challenges and Uncovering Limitations in AI Research
  Kumar, Munish (2025)