FFmpeg can take input from "directshow" devices on your Windows computer. See the FFmpeg dshow input device documentation for the official reference. It can accept input from audio devices, video devices, video capture devices, and analog TV tuner devices.
Example to list dshow input devices:
c:\> ffmpeg -list_devices true -f dshow -i dummy
ffmpeg version N-45279-g6b86dd5... --enable-runtime-cpudetect
  libavutil 51. 74.100 / 51. 74.100
  libavcodec 54. 65.100 / 54. 65.100
  libavformat 54. 31.100 / 54. 31.100
  libavdevice 54. 3.100 / 54. 3.100
  libavfilter 3. 19.102 / 3. 19.102
  libswscale 2. 1.101 / 2. 1.101
  libswresample 0. 16.100 / 0. 16.100
[dshow @ 03ACF580] DirectShow video devices
[dshow @ 03ACF580]  "Integrated Camera"
[dshow @ 03ACF580]  "screen-capture-recorder"
[dshow @ 03ACF580] DirectShow audio devices
[dshow @ 03ACF580]  "Internal Microphone (Conexant 2"
[dshow @ 03ACF580]  "virtual-audio-capturer"
dummy: Immediate exit requested
Example to use a dshow device as an input:
c:\> ffmpeg -f dshow -i video="Integrated Camera" out.mp4
Note: "Integrated Camera" is the device name used as an input to dshow, and it may differ on other hardware. You must always use the device name listed on your own hardware. "screen-capture-recorder" is a third-party downloadable dshow capture source, listed here as an example.
Example to use audio and video dshow device as an input:
c:\> ffmpeg -f dshow -i video="Integrated Camera":audio="Microphone name here" out.mp4
You can also pass the device certain parameters that it needs, for instance a webcam might allow you to capture it in 1024x768 at up to max 5 fps, or allow you to capture at 640x480 at 30 fps.
Example to print a list of options from a selected device:
c:\> ffmpeg -f dshow -list_options true -i video="Integrated Camera"
ffmpeg version N-45279-g6b86dd5 Copyright (c) 2000-2012 the FFmpeg developers
  built on Oct 10 2012 17:30:47 with gcc 4.7.1 (GCC)
  configuration:...
  libavutil 51. 74.100 / 51. 74.100
  libavcodec 54. 65.100 / 54. 65.100
  libavformat 54. 31.100 / 54. 31.100
  libavdevice 54. 3.100 / 54. 3.100
  libavfilter 3. 19.102 / 3. 19.102
  libswscale 2. 1.101 / 2. 1.101
  libswresample 0. 16.100 / 0. 16.100
[dshow @ 01D4F3E0] DirectShow video device options
[dshow @ 01D4F3E0]  Pin "Capture"
[dshow @ 01D4F3E0]   pixel_format=yuyv422 min s=640x480 fps=15 max s=640x480 fps=30
[dshow @ 01D4F3E0]   pixel_format=yuyv422 min s=1280x720 fps=7.5 max s=1280x720 fps=7.5
[dshow @ 01D4F3E0]   vcodec=mjpeg min s=640x480 fps=15 max s=640x480 fps=30
[dshow @ 01D4F3E0]   vcodec=mjpeg min s=1280x720 fps=15 max s=1280x720 fps=30
video=Integrated Camera: Immediate exit requested
You can see in this particular instance that it can either stream it to you in a "raw pixel_format" (yuyv422 in this case), or as an mjpeg stream.
ffmpeg -f dshow -video_size 1280x720 -framerate 7.5 -pixel_format yuyv422 -i video="Integrated Camera" out.avi
You can specify the type (mjpeg), the size (1280x720), and the frame rate (15 fps) that the device should give you. Note that, in this instance, the camera can deliver a higher frame rate at this size if you specify mjpeg:
ffmpeg -f dshow -video_size 1280x720 -framerate 15 -vcodec mjpeg -i video="Integrated Camera" out.avi
You can specify "-vcodec copy" to stream copy the video instead of re-encoding, if you can receive the data in some type of pre-encoded format, like mjpeg in this instance.
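The option listing above can also be read by a program to pick a capture mode. The following is a minimal sketch, assuming the log line layout shown on this page (other FFmpeg builds may print it differently). The names `parse_video_modes` and `fastest_mode` are hypothetical helpers, not part of FFmpeg.

```python
import re

# Matches one mode line from "-list_options true", in the layout shown on this page, e.g.
#   vcodec=mjpeg min s=1280x720 fps=15 max s=1280x720 fps=30
_MODE = re.compile(
    r"(pixel_format|vcodec)=(\S+)\s+"
    r"min\s+s=(\d+)x(\d+)\s+fps=([\d.]+)\s+"
    r"max\s+s=(\d+)x(\d+)\s+fps=([\d.]+)"
)

# Sample text copied from the listing on this page.
SAMPLE_OPTIONS = """
[dshow @ 01D4F3E0]   pixel_format=yuyv422 min s=640x480 fps=15 max s=640x480 fps=30
[dshow @ 01D4F3E0]   pixel_format=yuyv422 min s=1280x720 fps=7.5 max s=1280x720 fps=7.5
[dshow @ 01D4F3E0]   vcodec=mjpeg min s=640x480 fps=15 max s=640x480 fps=30
[dshow @ 01D4F3E0]   vcodec=mjpeg min s=1280x720 fps=15 max s=1280x720 fps=30
"""


def parse_video_modes(log_text):
    """Return one dict per mode line found in the log text."""
    modes = []
    for match in _MODE.finditer(log_text):
        kind, name, min_w, min_h, min_fps, max_w, max_h, max_fps = match.groups()
        modes.append({
            "kind": kind,  # "pixel_format" (raw) or "vcodec" (pre-encoded)
            "name": name,
            "min_size": (int(min_w), int(min_h)),
            "min_fps": float(min_fps),
            "max_size": (int(max_w), int(max_h)),
            "max_fps": float(max_fps),
        })
    return modes


def fastest_mode(modes, width, height):
    """Return the mode with the highest max fps at the given size, or None."""
    candidates = [m for m in modes if m["max_size"] == (width, height)]
    return max(candidates, key=lambda m: m["max_fps"], default=None)
```

For the sample above, `fastest_mode(parse_video_modes(SAMPLE_OPTIONS), 1280, 720)` selects the mjpeg entry, which matches the observation that the camera offers a higher frame rate at 1280x720 when delivering mjpeg.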
Example audio list options:
c:\> ffmpeg -f dshow -list_options true -i audio=virtual-audio-capturer
ffmpeg version N-50911-g9efcfbe Copyright (c) 2000-2013 the FFmpeg developers
  built on Mar 13 2013 21:26:48 with gcc 4.7.2 (GCC)
  configuration:...
  libavutil 52. 19.100 / 52. 19.100
  libavcodec 55. 0.100 / 55. 0.100
  libavformat 55. 0.100 / 55. 0.100
  libavdevice 54. 4.100 / 54. 4.100
  libavfilter 3. 45.103 / 3. 45.103
  libswscale 2. 2.100 / 2. 2.100
  libswresample 0. 17.102 / 0. 17.102
  libpostproc 52. 2.100 / 52. 2.100
[dshow @ 0215bc00] DirectShow audio device options
[dshow @ 0215bc00]  Pin "Capture Pin"
[dshow @ 0215bc00]   min ch=2 bits=16 rate= 44100 max ch=2 bits=16 rate= 44100
audio=virtual-audio-capturer: Immediate exit requested
Also note that the input string is in the format video=<video device name>:audio=<audio device name>. It is possible to have two separate inputs (like -f dshow -i audio=foo -f dshow -i video=bar), though some limited tests have shown a difference in synchronization between the two approaches at times. You can possibly overcome it with the "-copyts" flag. The reason this works is that FFmpeg, by default, normalizes each input so that its first timestamp means "0.0 seconds." Because ffmpeg is using two different dshow inputs, it starts one up and then starts the second *after* it, so the second might begin sending packets a fraction of a second later. FFmpeg happily treats those later-starting timestamps as 0.0 as well, so mixing the two doesn't work well when they start offset. With -copyts the inputs keep their "relative to machine start time" timestamps, which should mix accurately in theory. Ping me if you want support for more than one audio and one video stream in a single input, which would avoid the need for these workarounds: email@example.com
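The timestamp effect described above can be illustrated with a small calculation. This is a simplified model of the explanation on this page, not FFmpeg's actual code; the timestamp values and the function names are made up for illustration.

```python
def normalize(timestamps):
    """Shift a stream so its first timestamp becomes 0.0 (the default described above)."""
    first = timestamps[0]
    return [t - first for t in timestamps]


def skew_at_last_sample(video_ts, audio_ts):
    """Difference between the two streams' timestamps for their last samples."""
    return video_ts[-1] - audio_ts[-1]


# Hypothetical "machine time" timestamps in seconds. The audio input starts
# 0.25 s after the video input, but both end on the same real-world instant.
video_wall = [1000.00, 1000.50, 1001.00]
audio_wall = [1000.25, 1000.50, 1001.00]

# Default behaviour in this model: each input is rebased to start at 0.0,
# so the same instant ends up at 1.00 s in video but 0.75 s in audio.
default_skew = skew_at_last_sample(normalize(video_wall), normalize(audio_wall))

# Keeping the original timestamps (the idea behind -copyts) leaves no skew.
copyts_skew = skew_at_last_sample(video_wall, audio_wall)
```

In this model `default_skew` equals the start-up delay between the two inputs, while `copyts_skew` is zero.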
Also note that you can only at most have 2 streams at once (one audio and one video, like -i video=XX:audio=YY). Ask if you want this improved. You can have multiples one after the other, however, like
ffmpeg -f dshow -i video=XX:audio=ZZ -f dshow -i video=ZZ:audio=QQ
FFmpeg can also "merge/combine" multiple audio inputs, like the above using its amix filter (it can also combine video inputs of course, or record them as separate streams).
See the FFmpeg dshow input device documentation for a list of more dshow options you can specify. For instance you can decrease latency on audio devices, or specify a video by "index" if two have the same name, etc. etc.
Specifying input framerate
You can set framerate like ffmpeg -f dshow -framerate 7.5 -i video=XXX. This instructs the device itself to send you frames at 7.5 fps [if it can].
Be careful *not* to specify framerate with the "-r" parameter, like this: ffmpeg -f dshow -r 7.5 -i video=XXX. This actually specifies that the device's incoming PTS timestamps be *ignored* and replaced as if the device were running at 7.5 fps [so it runs at its default fps, but its timestamps are treated as if it were 7.5 fps]. This can cause the recording to appear to have "video slower than audio" or, under high cpu load (if video frames are dropped), it will cause the video to fall "behind" the audio [after playback of the recording is done, audio continues on and gets highly out of sync; video appears to go into "fast forward" mode during high cpu scenes].
If you want, say, 10 fps, and your device only supports 7.5 and 15 fps, then run it at 15 fps and "downsample" to 10 fps. There are a few ways to do this. You could specify your output to be 10 fps, like this: ffmpeg -f dshow -framerate 15 -i video=XXX -r 10 output.mp4, or insert a filter to do the same thing for you: ffmpeg -f dshow -framerate 15 -i video=XXX -vf fps=10 output.mp4.
By default FFmpeg captures frames from the input, and then does whatever you told it to do, for instance, re-encoding them and saving them to an output file. By default, if it receives a video frame "too early" (while the previous frame isn't finished yet), it will discard that frame so that it can keep up with the real-time input. You can adjust this by setting the -rtbufsize parameter, though note that if your encoding process can't keep up, eventually you'll still start losing frames just the same (and using it at all can introduce a bit of latency). It may still be helpful to specify some buffer size, however; otherwise frames may be dropped needlessly.
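Option placement matters here: -framerate must come before -i so that it applies to the device, while the output rate comes after -i. A minimal sketch of a helper that keeps this order when launching ffmpeg from a program follows; `build_capture_command` is a hypothetical name, and the device name is only an example.

```python
def build_capture_command(video_device, device_fps, output_fps, output_path):
    """Build an ffmpeg argument list that captures at device_fps and writes output_fps."""
    return [
        "ffmpeg",
        "-f", "dshow",
        "-framerate", str(device_fps),  # input option: asks the device for this rate
        "-i", f"video={video_device}",  # no extra quotes needed in a list of arguments
        "-r", str(output_fps),          # output option: rate of the written file
        output_path,
    ]


# Example: capture at 15 fps from the device and write a 10 fps file.
command = build_capture_command("Integrated Camera", 15, 10, "output.mp4")
```

The resulting list can be passed to `subprocess.run(command)` on a machine where ffmpeg is on the PATH.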
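To get a feel for what a given -rtbufsize means, it helps to work out how large raw frames are. The sketch below is a rough estimate only: it assumes uncompressed yuyv422 input (16 bits per pixel) and ignores any per-packet overhead, and the 100,000,000 byte buffer is an arbitrary illustrative value, not a recommendation.

```python
def raw_frame_bytes(width, height, bits_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bits_per_pixel // 8


def buffered_seconds(buffer_bytes, width, height, bits_per_pixel, fps):
    """Roughly how many seconds of raw video fit in the buffer."""
    bytes_per_second = raw_frame_bytes(width, height, bits_per_pixel) * fps
    return buffer_bytes / bytes_per_second


# 1280x720 yuyv422 at 7.5 fps, as in the device listing on this page.
frame_size = raw_frame_bytes(1280, 720, 16)                      # bytes per frame
seconds = buffered_seconds(100_000_000, 1280, 720, 16, 7.5)      # hypothetical buffer
```

A buffer that holds only a few seconds of video will absorb short encoder stalls, but it cannot help if the encoder is consistently slower than the capture rate.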
See StreamingGuide for some tips on tweaking encoding (sections latency and cpu usage). For instance, you could save it to a very fast codec, then re-encode it later.
There is also an option audio_buffer_size. Basically if you're capturing from a live mic, the default behavior for this hardware device is to "buffer" 500ms (or 1000ms) worth of data, before it starts sending it down the pipeline. This can introduce startup latency, so setting this to 50ms (msdn suggests 80ms) may be a better idea here. The timestamps on the data will be right, it will just have added (unneeded) latency if you don't specify this.
The "-copyts" flag might be useful for helping streams keep their input timestamps. Especially if you have multiple "-f dshow -i XXX -f dshow -i YYY" style inputs, the latter capture graph might get started up slightly after the former. If you would like support for more than "2 inputs, one audio, one video" in a single input to improve synchronization, please request it.
Some capture devices have "multiple inputs." For this type of capture device, you'll want to specify the "input pin" of the video you want and the "input pin" of the audio you want. See the FFmpeg dshow input device documentation.
You can preview it using ffplay, ex:
ffplay -f dshow -video_size 1280x720 -rtbufsize 702000k -framerate 60 -i video="XX":audio="YY"
To "preview while you record" however you'd need to use the "SDL out" filter or output to a jpeg file and read that with some other application (tricky though as you'd have to avoid conflicting with ffmpeg re-writing the same file, recommend rename it first or something).
If you only get "one packet" at times, you may need/want to add the "-vsync" flag.
Using DirectShow with libav*
You can use dshow input via the libavXXX libraries (i.e. directly into your own program) instead of calling out to ffmpeg.exe. See Using libav* for an intro to using libav. See also http://ffmpeg.zeranoe.com/forum/viewtopic.php?f=15&t=274&p=902&hilit=dictionary#p902
How to programmatically enumerate devices
FFmpeg does not provide a native way to do this yet, but you can lookup the devices yourself or just parse standard out from FFmpeg: http://ffmpeg.zeranoe.com/forum/viewtopic.php?f=15&t=651&p=2963&hilit=enumerate#p2963
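Parsing the device listing can be sketched as follows. This assumes the log layout shown at the top of this page; newer FFmpeg builds may format the list differently, so treat the pattern as an assumption to adapt. Note that ffmpeg writes this log to stderr. The function names are hypothetical.

```python
import re
import subprocess

# Matches lines such as: [dshow @ 03ACF580]  "Integrated Camera"
_DEVICE_LINE = re.compile(r'^\[dshow @ [0-9A-Fa-f]+\]\s+"(.*)"\s*$')

# Sample text copied from the listing on this page.
SAMPLE_LOG = """
[dshow @ 03ACF580] DirectShow video devices
[dshow @ 03ACF580]  "Integrated Camera"
[dshow @ 03ACF580]  "screen-capture-recorder"
[dshow @ 03ACF580] DirectShow audio devices
[dshow @ 03ACF580]  "Internal Microphone (Conexant 2"
[dshow @ 03ACF580]  "virtual-audio-capturer"
dummy: Immediate exit requested
"""


def parse_dshow_devices(log_text):
    """Return {"video": [...], "audio": [...]} from -list_devices log text."""
    devices = {"video": [], "audio": []}
    current = None  # which section of the listing we are in
    for line in log_text.splitlines():
        line = line.strip()
        if "DirectShow video devices" in line:
            current = "video"
        elif "DirectShow audio devices" in line:
            current = "audio"
        else:
            match = _DEVICE_LINE.match(line)
            if match and current is not None:
                devices[current].append(match.group(1))
    return devices


def list_devices_via_ffmpeg(ffmpeg_path="ffmpeg"):
    """Run ffmpeg and parse its stderr; requires ffmpeg with dshow support."""
    result = subprocess.run(
        [ffmpeg_path, "-list_devices", "true", "-f", "dshow", "-i", "dummy"],
        stderr=subprocess.PIPE,
        text=True,
    )
    # ffmpeg exits with an error here ("Immediate exit requested"); that is expected.
    return parse_dshow_devices(result.stderr)
```

Only `parse_dshow_devices` is exercised against the sample text; `list_devices_via_ffmpeg` needs a Windows machine with ffmpeg installed.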
For details on capturing the screen (which sometimes can use a dshow device) see Capture/Desktop#Windows.
FFmpeg can also take arbitrary DirectShow input by creating an AviSynth file (.avs file) that itself gets input from a graphedit file; the graphedit file exposes a pin of your capture source, or of any filter really. For example, a file yo.avs with this content:
DirectShowSource("push2.GRF", fps=35, audio=False, framecount=1000000)
By default dshow just creates a graph with a couple of source filters. So AviSynth could be used to get input from more complex graphs (ping roger if you'd like anything more complex in the dshow source).
Running ffmpeg.exe without opening a console window
If you want to run your ffmpeg "from a gui" without having it pop up a console window which spits out all of ffmpeg's output, a few things may help:
- If you can start your program like rubyw.exe or javaw.exe then all command line output (including child processes') is basically not attached to a console.
- If your program has an option to run a child program "hidden" or the like, that might work. If you redirect stderr and stdout to something you receive, that might work (but might be tricky because you may need to read from both pipes in different threads, etc.)
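As one illustration of the "hidden child process" approach, here is a minimal sketch in Python. It assumes a Windows host for the no-window behaviour (`subprocess.CREATE_NO_WINDOW` only exists there); elsewhere the flag falls back to 0 so the sketch still runs. `run_hidden` is a hypothetical helper name. `communicate()` drains both pipes for you, which avoids the deadlock that can occur when reading stdout and stderr one after the other.

```python
import subprocess
import sys

# CREATE_NO_WINDOW is Windows-only; use 0 (no special flags) on other platforms.
NO_WINDOW = getattr(subprocess, "CREATE_NO_WINDOW", 0)


def run_hidden(args):
    """Run a child process without a console window and collect its output."""
    proc = subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,   # the child gets no interactive input
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,     # ffmpeg writes its log here
        creationflags=NO_WINDOW,
    )
    out, err = proc.communicate()   # reads both pipes until the child exits
    return proc.returncode, out, err


if __name__ == "__main__":
    # Stand-in child process; replace with an ffmpeg argument list in practice.
    code, out, err = run_hidden([sys.executable, "-c", "print('hello from the child')"])
    print(code, out, err)
```

For a long-running capture you would likely want to read the pipes incrementally rather than wait for the process to exit; this sketch only covers the simple case.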
ffdshow tryouts is a separate project that basically wraps FFmpeg's core source (libavcodec, etc.) and then presents them as filter wrappers that your normal Windows applications can use for decoding video, etc. It's not related to FFmpeg itself, per se, though it uses it internally. See also https://github.com/Nevcairiel/LAVFilters
Known Bugs/Feature Requests
- Do you have a feature request? Anything you want added? digital capture card support added? analog tv tuner support added? email me (see above). Want any of the below fixed? email me...
- currently there is no ability to "push back" against upstream sources if ffmpeg is unable to encode fast enough, this might be nice to have in certain circumstances, instead of dropping it when the rtbuffer is full.
- currently no ability to select "i420" from various yuv options [screen capture recorder] meh
- could use an option "trust video timestamps" today it just uses wall clock time...
- cannot take more than 2 inputs [today] per invocation. this can be arranged, please ask if it is a desired feature
- no device enumeration API as of yet (for libav users). At least do the name!
- my large list
- libav "input" to a directshow filter...could use my recycled screen-capture-recorder filter, woot LOL.
- 3D audio support'ish (non dshow but mentioning it here)
- passthrough so it can use locally installed dshow codecs on windows
- echo cancelling support (non dshow but mentioning it here)