Opened 4 months ago
Last modified 2 months ago
#11148 open defect
DirectShow (dshow) support lacking beyond device defaults
Reported by: | blurbdust | Owned by: | Diederick Niehorster |
---|---|---|---|
Priority: | normal | Component: | avdevice |
Version: | git-master | Keywords: | |
Cc: | Blocked By: | ||
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
Hello,
First and foremost, I would like to thank the dev team as a whole for the amazing software and continued support over the years. ffmpeg really is a fantastic tool that doesn't get as much credit as it should.
I have a few main requested changes surrounding DirectShow. I normally provide PRs for my bug reports but unfortunately I have not been able to determine the root cause of the main issue.
A quick bug is the sample_size
argument within libavdevice/dshow.c
caps at 16 bits when the max DirectShow supports is 32 bits.
https://learn.microsoft.com/en-us/windows/win32/directshow/bitdepth-attribute#possible-values
When testing the above quick fix, I ended up stumbling into ffmpeg's DirectShow support ignoring the provided -sample_size
field (it appears to solely use the device defaults). When digging in with WinDbg, the device capabilities is correctly enumerated to support up to 24 bits when checking acaps
on https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/refs/heads/master:/libavdevice/dshow.c#l990. However, for an unknown reason, requested_sample_size
is set to an invalid memory region according to WinDbg and the value cannot be read at this line https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/refs/heads/master:/libavdevice/dshow.c#l1008. ctx->sample_size
is still valid and set to the expected 24. I traced the write to corrupt requested_sample_size
back to immediately following the return of IPin_QueryInterface
within dshow_cycle_formats
https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/refs/heads/master:/libavdevice/dshow.c#l823.
I did look into the main previous bug reports https://trac.ffmpeg.org/ticket/9420 and as a temporary workaround of reverting 436-440 within commit d9a9b4c877b85fea5a5bad74c3d592a756047f79, revert line 1007 from d2d8b9b972ba2df6b2a2ebe29f5307cbb7a69c33, keep requested_sample_size
defined but not used, the capture does work as expected. I initially thought the write of fx->wBitsPerSample = ctx->sample_size
would be the answer but after rebuilding ffmpeg with support for more third party libraries, 24 bit capture is no longer working. A sample patch of what I expected to work can be found at https://gist.githubusercontent.com/blurbdust/5fa38b6e854d4cfe251a466131bd9b0c/raw/69287eb13d88b86b1cda06016c6438b86707133d/fix-dshow-requested-variables.patch. I popped open the generated ffmpeg_g.exe
in IDA Home (output attached at bottom and search for // start of workaround
) and requested_sample_size
is in fact missing when doing stack rebuilding of dshow_cycle_formats
. It feels like maybe some stack corruption is occurring or maybe WinDbg is leading me astray with broken symbols after the IPin_QueryInterface
call?
I (or one other person with the same capture card and issue) would be happy to help debug in any way we can.
Please let me know if you need any more information from me.
Thank you for your time
void __fastcall dshow_cycle_formats(AVFormatContext *avctx, dshowDeviceType devtype, IPin *pin, int *pformat_set) { int *priv_data; // rdi bool v5; // zf AVPixelFormat v6; // eax __int64 v8; // rbx void (__stdcall *v9)(LPVOID); // rbp unsigned int *v10; // r12 _DWORD *v11; // rax int v12; // esi dshow_format_info *v13; // rax BYTE *v14; // r8 AM_MEDIA_TYPE *v15; // rdx BYTE *v16; // rcx __int64 *v17; // r9 AM_MEDIA_TYPE *v18; // rcx BYTE *v19; // rcx dshow_format_info *format_info; // rax dshow_format_info *v21; // rbx BYTE *pbFormat; // r8 HRESULT v23; // eax AM_MEDIA_TYPE *v24; // rcx BYTE *v25; // rcx unsigned int v26; // eax unsigned int v27; // eax unsigned int v28; // eax BYTE *v29; // rcx AVPixelFormat pix_fmt; // ecx const char *pix_fmt_name; // rax const char *v32; // rax const char *v33; // r8 const char *v34; // rdx const char *v35; // rcx const char *v36; // r9 const char *v37; // r9 const AVCodec *decoder; // rax const char *v39; // r9 __int64 v40; // [rsp+20h] [rbp-E8h] __int64 v41; // [rsp+28h] [rbp-E0h] __int64 v42; // [rsp+30h] [rbp-D8h] __int64 v43; // [rsp+38h] [rbp-D0h] __int64 v44; // [rsp+40h] [rbp-C8h] AM_MEDIA_TYPE *previous_match_type; // [rsp+50h] [rbp-B8h] __int64 requested_framerate; // [rsp+58h] [rbp-B0h] int requested_width; // [rsp+64h] [rbp-A4h] int requested_height; // [rsp+68h] [rbp-A0h] AVCodecID requested_video_codec_id; // [rsp+6Ch] [rbp-9Ch] AVPixelFormat requested_pixel_format; // [rsp+70h] [rbp-98h] int wait_for_better; // [rsp+74h] [rbp-94h] bool v52; // [rsp+7Ah] [rbp-8Eh] int v53; // [rsp+7Ch] [rbp-8Ch] const char *prim; // [rsp+80h] [rbp-88h] const char *prima; // [rsp+80h] [rbp-88h] const char *space; // [rsp+88h] [rbp-80h] const char *range; // [rsp+90h] [rbp-78h] const char *chroma; // [rsp+98h] [rbp-70h] int n; // [rsp+A0h] [rbp-68h] BYREF int size; // [rsp+A4h] [rbp-64h] BYREF IAMStreamConfig *config; // [rsp+A8h] [rbp-60h] BYREF AM_MEDIA_TYPE *previous_match_type_0; // [rsp+B0h] [rbp-58h] BYREF dshow_format_info *arg; // [rsp+B8h] [rbp-50h] BYREF priv_data = (int *)avctx->priv_data; v5 = *((_QWORD *)priv_data + 39) == 0LL; requested_video_codec_id = priv_data[77]; v6 = priv_data[76]; config = 0LL; previous_match_type_0 = 0LL; requested_pixel_format = v6; requested_framerate = 0LL; if ( !v5 ) requested_framerate = 10000000LL * priv_data[83] / priv_data[82]; requested_width = priv_data[80]; requested_height = priv_data[81]; LODWORD(v8) = pin->lpVtbl->QueryInterface(pin, refptr_IID_IAMStreamConfig, (void **)&config); if ( (_DWORD)v8 ) return; if ( config->lpVtbl->GetNumberOfCapabilities(config, &n, &size) || (v10 = (unsigned int *)av_malloc(size)) == 0LL ) { previous_match_type = 0LL; v9 = CoTaskMemFree; v10 = 0LL; LABEL_6: v9(previous_match_type); goto LABEL_7; } v11 = avctx->priv_data; if ( devtype ) { if ( !v11[86] && !*((_QWORD *)v11 + 42) ) goto LABEL_49; } else if ( !*((_QWORD *)v11 + 39) && (!v11[80] || !v11[81]) && *((_QWORD *)v11 + 38) == 0xDFFFFFFFFLL ) { LABEL_49: if ( !pformat_set ) { if ( n <= 0 ) { CoTaskMemFree(0LL); config->lpVtbl->Release(config); av_free(v10); return; } v53 = 0; v9 = CoTaskMemFree; v52 = 0; goto LABEL_15; } dshow_get_default_format(pin, config, devtype, &previous_match_type_0); if ( previous_match_type_0 ) { format_info = dshow_get_format_info(previous_match_type_0); v21 = format_info; if ( format_info ) { if ( format_info->devtype == dshowDeviceType::VideoDevice ) { requested_video_codec_id = format_info->codec_id; requested_pixel_format = format_info->pix_fmt; requested_framerate = format_info->framerate; requested_width = format_info->width; requested_height = format_info->height; } av_free(format_info); if ( !previous_match_type_0 ) { v9 = CoTaskMemFree; CoTaskMemFree(0LL); previous_match_type_0 = 0LL; goto LABEL_57; } pbFormat = previous_match_type_0->pbFormat; if ( !pbFormat ) { v9 = CoTaskMemFree; CoTaskMemFree(previous_match_type_0); previous_match_type_0 = 0LL; goto LABEL_57; } } else { if ( !previous_match_type_0 ) { LODWORD(v8) = 0; CoTaskMemFree(0LL); previous_match_type_0 = 0LL; CoTaskMemFree(0LL); goto LABEL_63; } pbFormat = previous_match_type_0->pbFormat; if ( !pbFormat ) { LODWORD(v8) = 0; CoTaskMemFree(previous_match_type_0); previous_match_type_0 = 0LL; CoTaskMemFree(0LL); goto LABEL_63; } } v9 = CoTaskMemFree; CoTaskMemFree(pbFormat); CoTaskMemFree(previous_match_type_0); previous_match_type_0 = 0LL; if ( !v21 ) { LODWORD(v8) = 0; goto LABEL_62; } LABEL_57: v53 = 0; v52 = 1; if ( n <= 0 ) { LABEL_58: dshow_get_default_format(pin, config, devtype, &previous_match_type_0); v23 = config->lpVtbl->SetFormat(config, previous_match_type_0); v24 = previous_match_type_0; v8 = v23 == 0; if ( previous_match_type_0 && previous_match_type_0->pbFormat ) { v9(previous_match_type_0->pbFormat); v24 = previous_match_type_0; } v9(v24); previous_match_type_0 = 0LL; goto LABEL_62; } goto LABEL_15; } LABEL_93: LODWORD(v8) = 0; CoTaskMemFree(0LL); goto LABEL_63; } v52 = pformat_set != 0LL; if ( n <= 0 ) { if ( pformat_set ) goto LABEL_93; LODWORD(v8) = 0; CoTaskMemFree(0LL); LABEL_7: config->lpVtbl->Release(config); av_free(v10); if ( pformat_set ) goto LABEL_8; return; } v53 = 1; v9 = CoTaskMemFree; LABEL_15: wait_for_better = 0; previous_match_type = 0LL; v12 = 0; while ( 1 ) { arg = 0LL; LODWORD(v8) = config->lpVtbl->GetStreamCaps(config, v12, &previous_match_type_0, (BYTE *)v10); if ( (_DWORD)v8 ) { LABEL_76: LODWORD(v8) = 0; goto next; } v13 = dshow_get_format_info(previous_match_type_0); arg = v13; if ( v13 ) { if ( devtype ) { v15 = previous_match_type_0; if ( *(_OWORD *)refptr_FORMAT_WaveFormatEx != *(_OWORD *)&previous_match_type_0->formattype ) goto next; v25 = previous_match_type_0->pbFormat; if ( !pformat_set ) { LODWORD(v44) = v10[11]; LODWORD(v43) = v10[8]; LODWORD(v42) = v10[5]; LODWORD(v41) = v10[10]; LODWORD(v40) = v10[7]; av_log( avctx, 32, " min ch=%lu bits=%lu rate=%6lu max ch=%lu bits=%lu rate=%6lu\n", v10[4], v40, v41, v42, v43, v44); goto next; } v26 = priv_data[84]; // start of workaround if ( v26 ) { if ( v26 > v10[11] || v26 < v10[10] ) goto next; *((_DWORD *)v25 + 1) = v26; } v27 = priv_data[85]; if ( v27 ) { if ( v27 > v10[8] || v27 < v10[7] ) goto LABEL_76; *((_WORD *)v25 + 7) = v27; } v28 = priv_data[86]; if ( v28 ) { if ( v28 > v10[5] || v28 < v10[4] ) goto LABEL_76; *((_WORD *)v25 + 1) = v28; // end of workaround } LABEL_37: if ( wait_for_better ) { if ( !previous_match_type ) { previous_match_type_0 = 0LL; previous_match_type = v15; } } else { v8 = ((unsigned int (__fastcall *)(IAMStreamConfig *, AM_MEDIA_TYPE *, BYTE *))config->lpVtbl->SetFormat)( config, v15, v14) == 0; } goto next; } if ( v13->devtype ) goto next; v15 = previous_match_type_0; if ( *(_OWORD *)&previous_match_type_0->formattype - *(_OWORD *)refptr_FORMAT_VideoInfo == 0LL ) { v16 = previous_match_type_0->pbFormat; wait_for_better = 1; v17 = (__int64 *)(v16 + 40); v14 = v16 + 48; } else if ( *(_OWORD *)refptr_FORMAT_VideoInfo2 == *(_OWORD *)&previous_match_type_0->formattype ) { v29 = previous_match_type_0->pbFormat; wait_for_better = 0; v17 = (__int64 *)(v29 + 40); v14 = v29 + 72; } else { v17 = 0LL; v14 = 0LL; } if ( pformat_set ) { if ( requested_video_codec_id != AVCodecID::AV_CODEC_ID_RAWVIDEO && requested_video_codec_id != v13->codec_id || requested_pixel_format != AVPixelFormat::AV_PIX_FMT_NONE && v13->pix_fmt != requested_pixel_format ) { goto next; } if ( requested_framerate ) { if ( *((_QWORD *)v10 + 14) < requested_framerate || *((_QWORD *)v10 + 13) > requested_framerate ) goto next; *v17 = requested_framerate; } if ( requested_height != 0 && requested_width != 0 ) { if ( (int)v10[17] < requested_width || (int)v10[15] > requested_width || (int)v10[18] < requested_height || (int)v10[16] > requested_height ) { goto next; } *((_DWORD *)v14 + 1) = requested_width; *((_DWORD *)v14 + 2) = requested_height; } goto LABEL_37; } prim = (const char *)v14; chroma = av_chroma_location_name(v13->chroma_loc); pix_fmt = arg->pix_fmt; if ( pix_fmt == AVPixelFormat::AV_PIX_FMT_NONE ) { decoder = avcodec_find_decoder(arg->codec_id); if ( arg->codec_id && decoder ) av_log(avctx, 32, " vcodec=%s", decoder->name); else av_log(avctx, 32, " unknown compression type 0x%X", *((unsigned int *)prim + 4)); } else { pix_fmt_name = av_get_pix_fmt_name(pix_fmt); av_log(avctx, 32, " pixel_format=%s", pix_fmt_name); } LODWORD(v43) = v10[18]; LODWORD(v42) = v10[17]; LODWORD(v40) = v10[16]; av_log( avctx, 32, " min s=%ldx%ld fps=%g max s=%ldx%ld fps=%g", v10[15], v40, 10000000.0 / (double)(int)*((_QWORD *)v10 + 14), v42, v43, 10000000.0 / (double)(int)*((_QWORD *)v10 + 13)); if ( *(_QWORD *)&arg->col_range == 0x200000000LL && *(_QWORD *)&arg->col_prim == 0x200000002LL ) { if ( arg->chroma_loc ) { v39 = "unknown"; if ( chroma ) v39 = chroma; av_log(avctx, 32, "(%s)", v39); } } else { range = av_color_range_name(arg->col_range); space = av_color_space_name(arg->col_space); prima = av_color_primaries_name(arg->col_prim); v32 = av_color_transfer_name(arg->col_trc); v33 = prima; v34 = space; v35 = v32; v36 = range; if ( !v32 ) v35 = "unknown"; if ( !prima ) v33 = "unknown"; if ( !space ) v34 = "unknown"; if ( !range ) v36 = "unknown"; av_log(avctx, 32, " (%s, %s/%s/%s", v36, v34, v33, v35); if ( arg->chroma_loc ) { v37 = chroma; if ( !chroma ) v37 = "unknown"; av_log(avctx, 32, ", %s", v37); } av_log(avctx, 32, ")"); } av_log(avctx, 32, "\n"); } next: av_freep(&arg); v18 = previous_match_type_0; if ( previous_match_type_0 && previous_match_type_0->pbFormat ) { v9(previous_match_type_0->pbFormat); v18 = previous_match_type_0; } v9(v18); ++v12; previous_match_type_0 = 0LL; if ( n <= v12 ) break; if ( (_DWORD)v8 ) goto end; } if ( (v8 & 1) != 0 || !v52 ) { end: if ( !previous_match_type ) goto LABEL_6; goto LABEL_45; } if ( previous_match_type ) { v8 = ((unsigned int (__fastcall *)(IAMStreamConfig *))config->lpVtbl->SetFormat)(config) == 0; LABEL_45: v19 = previous_match_type->pbFormat; if ( v19 ) v9(v19); goto LABEL_6; } if ( !v53 ) goto LABEL_58; LODWORD(v8) = 0; LABEL_62: v9(0LL); LABEL_63: config->lpVtbl->Release(config); av_free(v10); LABEL_8: *pformat_set = v8; }
Change History (7)
comment:1 by , 4 months ago
Owner: | set to |
---|---|
Status: | new → open |
comment:2 by , 4 months ago
I apologize, I grabbed the wrong link from my notes. https://learn.microsoft.com/en-us/windows/win32/api/mmreg/ns-mmreg-waveformatex
shouldn't be too device specific since the code jumps to next
if the capture type isn't refptr_FORMAT_WaveFormatEx
. This at least shows > 16 is supported by DirectShow and the limit within ffmpeg should be raised.
wBitsPerSample Specifies the number of bits per sample for the format type specified by wFormatTag. If wFormatTag = WAVE_FORMAT_PCM, then wBitsPerSample should be set to either 8 or 16. If wFormatTag = WAVE_FORMAT_IEEE_FLOAT, wBitsPerSample should be set to 32. For non-PCM formats, set the value of this member according to the manufacturer's specification for the format tag. Some compression schemes cannot define a value for wBitsPerSample. In this case, set wBitsPerSample to zero.
comment:3 by , 4 months ago
ffmpeg should be raised.
Raised to what? 32 bit or 24 bit? Also, why would you need support for non-defaults? Would not it possibly break the device? Is it possible that sample_size only allowed value is 16 since that is used to overwrite default of 24, while overwriting with 24 bit on unsupported devices can lead to defects?
comment:4 by , 4 months ago
Raised to what? 32 bit or 24 bit?
32 bit as this is the maximum that I can find that is supported by DirectShow for audio devices.
Also, why would you need support for non-defaults? Would not it possibly break the device? Is it possible that sample_size only allowed value is 16 since that is used to overwrite default of 24, while overwriting with 24 bit on unsupported devices can lead to defects?
My particular device is configurable to 16 bit, 24 bit, or 32 bit audio capture. The default for DirectShow for my device is 16 bit 48kHz for audio capture. The vendor created basically a DirectShow shim to their own software for use with third-party software. I would strongly prefer to use ffmpeg to not be locked into the vendor's software thus would like the ability to configure the capture beyond the offered defaults. If I use an older version of ffmpeg from before 2021-11-03, not only does the device report support for > 16 bit within ffmpeg, if I add the small hack of + { "sample_size", "set audio sample size", OFFSET(sample_size), AV_OPT_TYPE_INT, {.i64 = 0}, 0, 32, DEC },
then configuration and capture from the device works from ffmpeg.
comment:5 by , 4 months ago
My particular device is configurable to 16 bit, 24 bit, or 32 bit audio capture
What is 32 bit capture, is that 32 bit float or 32 bit integer? Almost nothing supports 32 bit int, besides recently flac and wavpack
comment:6 by , 2 months ago
After doing some testing, the device actually upscales from 24 bit to 32 bit int when doing a 32 bit capture in the vendor's software. I will be staying away from 32 bit capture from this device to avoid the upscaling. As for the requested updates within ffmpeg, if 32 bit int is too niche, then I suggest upping the limit to 24 bit int instead. That would change the suggested line update within the patch to + { "sample_size", "set audio sample size", OFFSET(sample_size), AV_OPT_TYPE_INT, {.i64 = 0}, 0, 24, DEC },
.
Wait a second, 32 bits are for video? It says
If this is correct then I imagine it should be indeed.