As gravitational-wave detectors become more sensitive, the number of signals from merging compact objects recorded by Advanced LIGO and Virgo is expected to rise dramatically. This surge in detection rates calls for adaptable, scalable, and efficient tools that can address a wide range of tasks in gravitational-wave astronomy. Foundational AI models offer a transformative opportunity in this context: they provide a unified framework that can be fine-tuned for diverse applications while leveraging large-scale pre-training. In this work, we explore how advanced transformer models, specifically OpenAI’s Whisper, can be adapted as a foundational model for gravitational-wave data analysis. By fine-tuning Whisper’s encoder, which was originally trained on extensive audio data, and combining it with neural networks for specialized tasks, we achieve reliable results in detecting astrophysical signals and classifying transient noise artifacts, known as glitches. This represents the first application of open-source transformer models, pre-trained on unrelated tasks, to gravitational-wave research, and it demonstrates their potential to enable versatile and efficient data analysis in an era of rapidly increasing detection rates.