FFmpeg is a free and open-source command-line tool for processing media: it can trim, crop, concatenate, mux, and transcode almost any type of media file you throw at it.
It's also a robust foundation for video automation, and we use it extensively in our own video editing API. This tutorial uses FFmpeg 5.1.2, but any recent version will do.
Here's how to get the audio track out of a video file and convert it to MP3:
$ ffmpeg -i video.mp4 audio.mp3
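The one-line command above leaves the encoder settings to FFmpeg's defaults for the .mp3 extension. As a minimal sketch, assuming your FFmpeg build includes the libmp3lame encoder, the same conversion can be spelled out explicitly. The function name to_mp3 and the DRY_RUN switch are illustrative helpers, not part of FFmpeg:

```shell
# Hypothetical helper: transcode the audio of a video file to MP3.
# Assumes FFmpeg was built with the libmp3lame encoder.
to_mp3() {
  input=$1; output=$2
  quality=${3:-2}   # libmp3lame VBR quality: 0 (best) to 9 (smallest file)
  # -vn drops the video stream; -c:a and -q:a choose the encoder and quality.
  set -- ffmpeg -i "$input" -vn -c:a libmp3lame -q:a "$quality" "$output"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"       # only print the command
  else
    "$@"            # run it
  fi
}
```

Running DRY_RUN=1 to_mp3 video.mp4 audio.mp3 prints the command without executing it, which is handy for checking the flags before processing a large file.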
As you might know, audio quality deteriorates every time a file is re-encoded with a lossy codec such as MP3 or AAC. Fortunately, you can also extract the audio without re-encoding the stream. The examples below skip the re-encoding step and copy the stream as-is, retaining the original audio quality.
First, we must know what format the audio track is in. We can do this with FFprobe, a command-line tool included with FFmpeg. The following command will show you how the file is encoded:
$ ffprobe video.mp4 -hide_banner

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 1
    compatible_brands: isomavc1
    creation_time   : 2013-12-16T17:59:32.000000Z
    title           : Big Buck Bunny, Sunflower version
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    genre           : Animation
    composer        : Sacha Goedegebure
  Duration: 00:10:34.53, start: 0.000000, bitrate: 4486 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 4001 kb/s, 60 fps, 60 tbr, 60k tbn (default)
    Metadata:
      creation_time   : 2013-12-16T17:59:32.000000Z
      handler_name    : GPAC ISO Video Handler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](und): Audio: mp3 (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 160 kb/s (default)
    Metadata:
      creation_time   : 2013-12-16T17:59:37.000000Z
      handler_name    : GPAC ISO Audio Handler
      vendor_id       : [0][0][0][0]
  Stream #0:2[0x3](und): Audio: ac3 (ac-3 / 0x332D6361), 48000 Hz, 5.1(side), fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2013-12-16T17:59:37.000000Z
      handler_name    : GPAC ISO Audio Handler
      vendor_id       : [0][0][0][0]
    Side data:
      audio service type: main
As the two Audio lines in the ffprobe output show, the video contains two audio streams: an MP3 stream (#0:1, stored under the mp4a tag) and a Dolby AC-3 stream (#0:2). The MP3 stream can be extracted into a separate audio file without converting it to another format. Because the file holds more than one audio stream, -map 0:a:0 selects the first one explicitly. Here's how:
$ ffmpeg -i video.mp4 -vn -map 0:a:0 -c:a copy audio.mp3
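When a file holds several audio streams, as the ffprobe output for video.mp4 shows, FFmpeg chooses one by default, so it can help to name the stream you want with -map. The sketch below assumes the stream layout from that output, where the second audio stream is the AC-3 track. The helper name copy_audio_stream and the DRY_RUN switch are hypothetical:

```shell
# Hypothetical helper: copy the N-th audio stream (counting from 0)
# into its own file without re-encoding.
copy_audio_stream() {
  input=$1; n=$2; output=$3
  # -map 0:a:N selects the N-th audio stream of the first input,
  # which also leaves out video and any other audio streams.
  set -- ffmpeg -i "$input" -map "0:a:$n" -c:a copy "$output"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"   # only print the command
  else
    "$@"        # run it
  fi
}

# Example (assumes video.mp4 from the ffprobe output, AC-3 as audio stream 1):
# copy_audio_stream video.mp4 1 audio.ac3
```

The output extension still has to suit the codec being copied; .ac3 is used here because the selected stream is AC-3.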
As another example, the following file contains a single audio stream, encoded as MP3:
$ ffprobe video.mov -hide_banner

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 512
    compatible_brands: qt
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    title           : Big Buck Bunny, Sunflower version
    encoder         : Lavf59.27.100
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    genre           : Animation
  Duration: 00:00:09.57, start: 0.000000, bitrate: 1715 kb/s
  Stream #0:0[0x1]: Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1579 kb/s, 60 fps, 60 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : GPAC ISO Video Handler
      vendor_id       : FFMP
      encoder         : Lavc59.37.100 libx264
  Stream #0:1[0x2]: Audio: mp3 (.mp3 / 0x33706D2E), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : GPAC ISO Audio Handler
      vendor_id       : [0][0][0][0]
This is how you extract the audio stream into a separate MP3 file:
$ ffmpeg -i video.mov -vn -c:a copy audio.mp3
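The probe-then-copy routine can be scripted so that the output extension follows the detected codec. This is only a sketch under stated assumptions: the function names ext_for_codec and extract_first_audio are invented for this example, the codec-to-extension table is a small starting point rather than a complete list, and only the first audio stream is handled:

```shell
# Hypothetical mapping from an ffprobe codec name to a file extension
# whose format can hold that codec as-is. Extend as needed.
ext_for_codec() {
  case $1 in
    aac)    echo m4a ;;
    mp3)    echo mp3 ;;
    ac3)    echo ac3 ;;
    flac)   echo flac ;;
    vorbis) echo ogg ;;
    opus)   echo opus ;;
    *)      echo mka ;;   # fall back to Matroska audio for anything else
  esac
}

# Probe the first audio stream, then copy it into a matching file
# named after the input (video.mov -> video.mp3, for example).
extract_first_audio() {
  input=$1
  codec=$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$input") || return 1
  ffmpeg -i "$input" -map 0:a:0 -c:a copy \
    "${input%.*}.$(ext_for_codec "$codec")"
}
```

Keeping the mapping in its own function makes it easy to adjust when you come across a codec the table does not cover.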