Updated: June 17, 2019
Over the years, there were many an occasion when I need to do some form of multimedia editing. Audio, video, you name it. Whether it's creating my unfunny clips for my Youtube channel, extracting or converting music, embedding subtitles, time and again, I would find myself using ffmpeg on the command-line and always enjoying the process while being ever so subtly amazed by this unassuming program's capabilities.
And so I thought, maybe I should write a more comprehensive guide on ffmpeg, one which lists the variety of tasks and functions, and then also wrap them around real-life examples and use cases. While I first mentioned ffmpeg in anger back in my Flash editing tutorial in 2008, it's time for a fresh, up-to-date article. It's going to be command-line, it's going to be nerdy, but it should also be fun. Follow me.
Basics
Ffmpeg is available in pretty much all and any Linux distribution, and there are also builds for Windows and Mac. If you want to know what it can do, you can check its capabilities by running ffmpeg without arguments or with an optional -hide_banner flag. You can also use the help option for the main program as well as any of its many sub-commands. The dazzling array of features can be intimidating, so we shall proceed slowly but surely. Then, don't forget the actual documentation, which is pretty neat.
Conversion and transcoding
One of the most common tasks is converting files from one format to another. Remember my example of how to extract audio files from an FLV clip? Same thing. But you may want to convert a video recorded on a camera to something like MP4, or maybe convert your OGG files to MP3, or similar. There's an endless variety of scenarios, and an equally endless number of supported modes and formats in ffmpeg. Typically, what you will do is:
- Specify input file.
- Specify the desired video and audio codecs.
- Specify output file.
If you don't want to manipulate the video stream, you can use the copy codec. The same is true for audio. You can use this option if you want to change the container, like going from MOV to MP4 without changing the actual encoding.
ffmpeg -i MVI_1348.MOV -c:v copy -c:a copy new.mp4
Note: not all formats support all codecs and options, so this won't necessarily work in all cases. Like:
[mp4 @ 0x55a21519fb20] Could not find tag for codec vp8 in stream #0, codec not
currently supported in container
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #0:1 -> #0:1 (copy)
Last message repeated 1 time
If you only want to extract a certain stream - audio, video, subtitles, or data - you can suppress the processing of other streams (to save time and CPU cycles needed for the operation). For example, if you only want audio from a combined audio-video media file, then you can use -vn (no video) option.
ffmpeg -i MVI_1348.MOV -vn -ar 44100 -ab 192k sound-only.mp3
This superb utility gives you a lot of flexibility in how you do things. For example, codec:v, vcodec and c:v are identical, and you can use any one you want. If you only want an audio output, like above, ffmpeg will guess and assume you also meant -vn. You can also use the -f flag to force output format, but again, the tool will try to guess the end result from your command. For instance, you could use -f mp3 or acodec mp3.
Different codecs will have different options. For instance, for an audio file, you could specify the audio sampling rate and audio bitrate, the number of channels, and more. If you skip some options, the command may fail, or the defaults will be used.
Scale and quality
Another common task is changing the size of the video (resolution) and the quality of streams. Ffmpeg lets you do this in different ways. You can force ratio (and break the default aspect ratio), or preserve it by specifying either the vertical or the horizontal value plus a magic qualifier. You can crop video borders, and you can re-encode streams. Examples are warranted at this stage.
ffmpeg -i input.avi -c:v mpeg4 -vtag xvid -qscale:v 3 -c:a libmp3lame -qscale:a 4 output.avi
What do we have here? We're transcoding an AVI input file by using the MPEG4 video stream with the quality set to 3 (qscale). For some of the codecs, the quality ranges from 1-31, with 1 being the highest. I tested on several generic input files, and the highest quality typically resulted in about double the size, whereas 3 or 4 gave about 20-30% reduction. Vtag xvid is an MPEG-4 specific option. Similarly, we're using LAME for audio.
ffmpeg -i MVI_1348.MOV -vfilters scale=1280:-1 -c:a copy scaled.mp4
This example shows video scaling (you can use -vf or -vfilters), with the aspect ratio being preserved through the -1 qualifier. We don't change the quality of the stream. The audio stream is simply being copied into the output container. If your original video has a higher resolution than the stated one, then you will be downscaling, and if it's lower, then you will be upscaling.
Image extraction, combination, GIFs
Ffmpeg can also manipulate pictures, even though it's a media utility. But hey, pictures are media, and essentially, videos are moving pictures, right! Some of the tasks you'd want - extract frames from a video in order to create thumbnails (high-quality ones, too) rather than manually pausing video or creating hectic screenshots. You can also do the opposite - combine stills into a stopmotion video - or a full one. You can also convert videos to animated GIFs, which can be handy for dank memeism around the Web. Once again, we shall demonstrate with some ... dank examples.
ffmpeg -i input.mp4 -r 1 -f image2 thumb-%3d.png
We specify an input file, we specify the rate (in this case we want 1 image per 1 second of video), we specify the format (image2), and we specify the output format. In this case, we're doing something clever. If you've ever seen C language code, this will look familiar. We want the numbering sequence for the thumbnails to have three digits. You can choose any amount of digits, but ideally, you want more than the expected number of extracted images.
Then, combining images into a video is very similar:
ffmpeg -f image2 -i thumb-%3d.png stopmotion.mp4
You can also specify the rate (you want your video to be something like 24 or 30 FPS), but we will touch on this a bit more later in the guide, when we talk about making video playback slower and/or faster. And since we're talking about images, you can resize them too:
ffmpeg -i thumb-004.png -vf scale=400:-1 test.png
Animated GIFs are really interesting. Also rather simple to create. Just specify an input file and set the output to have a gif extension. Ffmpeg will auto-guess the rest and create nice, high-quality animated GIF for you.
ffmpeg -i MVI_6654.MOV test.gif
Finally, the opposite of GIFs, you may also want to have a video composed of a single image - a constant preview but then have something like music or a podcast playing in the background. Ffmpeg allows you to do this with a clever trick.
ffmpeg -loop 1 -i thumb-001.png -i sound-only.mp3 -c:v libx264 -shortest video-static-image.mp4
We provide two inputs, with the first being an image set to loop. We're also re-encoding the video. But since the image is set to loop indefinitely, we also need the shortest option, so the operation ends when the shortest clip ends, which would be the video in this case.
Splitting and joining videos
This is another useful scenario. If you have a very big clip, but you only need a few minutes or seconds, ffmpeg can cut out the necessary pieces. Likewise, you can combine multiple files into one stream - both video and audio. Ffmpeg lets you choose the desired start and end timestamps, so you have a lot of leeway in how you do handle the task.
-t duration record or transcode
"duration" seconds of audio/video
-to time_stop record or transcode stop time
-fs limit_size set the limit file size in bytes
-ss time_off set the start time offset
-sseof time_off set the start time offset relative to EOF
-seek_timestamp enable/disable seeking by timestamp with -ss
-timestamp time set the recording timestamp ('now' to set the
current time)
So, you may want to try something like:
ffmpeg -i MVI_1348.MOV -ss 4 -t 4 spliced.mp4
This command will extract 4 seconds from the input clip starting at time offset of 4 seconds into the clip. You can use short notation (4 indicates seconds) or use the full one hh:mm:ss.mmm. This gives you accuracy down to one millisecond, which ought to be more than enough for all practical usecases.
Joining files is a bit more complicated. Technically termed concatenation (concat), it works differently from the cutting option. You do not specify the desired parts you want to join on the command line, you put them into a file, using a very specific notation, and then read that file with ffmpeg. For example:ffmpeg -f concat -i list.txt -codec copy joined.mp4
The contents of the list.txt file will be:
file 'part-1.mp4'
file 'part-2.mp4'
file 'part-3.mp4'
Please note the usage of the word file and single quotation marks. You can use both absolute and relative paths. If you make a mistake in the file format, you will see an error:
[concat @ 0x555ca89c38c0] Line 1: unknown keyword 'part-1.mp4'
list.txt: Invalid data found when processing input
If you try to provide the list of files on the command line (rather than the text file):
part-1.mp4: Invalid data found when processing input
Embed and extract subtitles
Remember my VLC & subtitles article? Well, if you're not intimidated by the command line, you can do the same thing with ffmpeg. As a trivial example, say you have separate audio and subtitle files. Without any additional processing, you can just use the video filtering option for this:
ffmpeg -i input.avi -vf subtitles=subtitle.srt output.avi
A more complicated use case, the command below will take separate MP4 and subtitle files, and combine them into an output file with video transcoding. The -map options (showing twice, not an error) define the order the streams will be mapped into the output. But then there's a clever and somewhat confusing listing of codecs. Because we have multiple streams, we only want to manipulate the video part - but not audio or subtitles.
To that end, -c copy specifies all codecs should be copied, but using -c:v provides an override to this option, and the video stream will be transcoded. Since we're using the libx264 codec, we're also using Constant Rate Factor (CRF) option for variable bitrate, set to 23 (ranges from lossless 0 to lowest-quality 51) and the veryfast preset, which creates a relatively small output.
ffmpeg -i input.mp4 -i subtitle.srt -map 0 -map 1 -c copy -c:v libx264 -crf 23 -preset veryfast output.mp4
Extracting subtitles takes the following form:
ffmpeg -txt_format text -i input.file out.srt
Speed up and slow videos
This is another interesting usecase. Like any human being, you may want to take any video, speed it up 2x and then add Benny Hill music, and there should be much rejoicing. You could also slow down action, and then, to spice things up, also add smoothing so the slowed sequence does not feel stilted.
For example, we may have our stop motion video - with only say 15-20 thumbnails, it takes only a second to play. So you may want to stretch it to say 5 seconds, but that means each frame will show for five times as long. Instead you could potentially "insert" extra frames to make it look like a smooth slo-mo. We did this with an interior design gallery back in the day. We use AviSynth and VirtualDub for this task.
The video speed filter works by changing the presentation timestamp (PTS) of each video frame. Simply put, fractional values speed up the video, multipliers higher than 1 slow it down. When you speed it up, ffmpeg will drop frames, but you can specify a higher framerate. To preserve the original, you can multiply the FPS in your input by the 1/PTS you're using. So if you double the speed of your 24FPS clip, you should use 48FPS rate.
ffmpeg -i input.mp4 -filter:v "setpts=0.5*PTS" output.mp4
ffmpeg -i input.mp4 -r 96 -filter:v "setpts=0.25*PTS" output.mp4
ffmpeg -i input.mp4 -filter:v "setpts=2*PTS" output.mp4
Smoothing videos is more complicated. I did have to consult the documentation. Essentially, for a video that's slowed down 5 times (and 5 times longer than the original), I'm using video interpolation to create the smoothing effect. Here, too, you need to add FPS, so that ffmpeg has sufficient data to guess the frames in between. In my example, 5x slowed down, you need 5x24 FPS (original speed), ergo 120 FPS.
ffmpeg -i input.mp4 -filter:v "minterpolate='mi_mode=mci:mc_mode=aobmc:vsbmc=1:fps=120'" output.mp4
This is a CPU-intensive task, and it can take a while to finish. I used ffplay to preview videos, and I found the results reasonable, but you can't really magic something from nothing, so while this works, it works well when there's a small amount of data missing (not one frame per second every two seconds or such).
For audio, you can use atempo. This option has a range of 0.5-2.0, which means half or double the speed at most, however, it is possible to chain multiple atempo filters to get additional speed changes. Something like the example below:
ffmpeg -i input.mp4 -filter:a "atempo=2.0,atempo=2.0" -vn output.mp4
And that brings us to the end of this tutorial.
Conclusion
I hope you found this fruitful, enjoyable and relevant. Ffmpeg is the workhorse - no, it's a whole cavalry - of audio and video tweaking in Linux, and it has so many practical tools, options and configurations that getting started can be confusing and maybe even intimidating. Those emotions should have evaporated by now as you've read through this tutorial.
Anyway, we talked about conversion and transcoding, scaling and quality, image extraction, combination and GIF creation, splitting and joining of videos, subtitle extraction and embedding, video speedup and slowdown, and then some. That should keep you happy and busy for a while. If you have any ffmpeg-related requests or questions, fire away, the email is only a click away. Or something. Finally, if this tutorial was truly useful, do give some love back, because why not.
Cheers.