Opened 2 hours ago

Last modified 2 hours ago

#11230 new enhancement

Feature request: A file header search function

Reported by: Damole-wer
Owned by:
Priority: normal
Component: ffprobe
Version: git-master
Keywords: header search metadata
Cc: MasterQuestionable
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Greetings, FFmpeg Team.

I'm requesting this feature because FFmpeg currently treats a file as invalid when any data precedes the expected file header. For the same reason, it also skips any additional files that are embedded in a single file.

Here are two examples:

First example:

A PNG from a Godot project has the text "GDIM png" before the valid ‰PNG signature. If I try to open it with mpv-master, I get an FFmpeg error: "Invalid PNG signature 0x4744494D03000000". If I remove the "GDIM png" data, the image opens successfully and no FFmpeg errors are logged. This suggests that ffprobe could not find the file signature because it only expects it at the very start of the file.
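
To illustrate the manual fix described above, here is a minimal Python sketch that looks for the first PNG signature in a byte string and drops everything before it. The function name `strip_to_png_signature` is made up for this ticket, and the sketch assumes the data after the signature is an otherwise valid PNG; it is not how FFmpeg handles the file.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the standard 8-byte PNG file signature


def strip_to_png_signature(data: bytes) -> bytes:
    """Return data starting at the first PNG signature.

    Hypothetical helper: raises ValueError if no signature is present.
    """
    offset = data.find(PNG_SIGNATURE)
    if offset < 0:
        raise ValueError("no PNG signature found")
    return data[offset:]


if __name__ == "__main__":
    # Synthetic input: the 8 leading bytes from the error message,
    # followed by a PNG signature and placeholder payload bytes.
    prefixed = bytes.fromhex("4744494D03000000") + PNG_SIGNATURE + b"payload"
    print(len(prefixed) - len(strip_to_png_signature(prefixed)), "bytes skipped")
```

With real files, the input would be the raw contents of the Godot file, and the result could be written to a new .png file.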

Second example:

A certain game stores the audio for each level inside a single .wav file.
One such file contains 3 WAVE audio streams, each starting with a RIFF signature. When I play the first one to the end using mpv-master, FFmpeg reports no errors and playback finishes. But that means the 2 remaining audio streams were never probed.
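
As a rough illustration of how such a file could be split by hand, the Python sketch below walks RIFF chunks laid end to end, using the 32-bit little-endian size field that follows each "RIFF" tag. It assumes the audio files are stored back to back with correct size fields, which I have not verified for the game in question; `split_riff_chunks` is a made-up name.

```python
import struct


def split_riff_chunks(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for each top-level RIFF chunk stored back to back.

    Hypothetical helper: stops at the first position that does not start
    with a RIFF tag.
    """
    chunks = []
    pos = 0
    while pos + 8 <= len(data) and data[pos:pos + 4] == b"RIFF":
        # The size field counts the bytes that follow the 8-byte chunk header.
        (size,) = struct.unpack_from("<I", data, pos + 4)
        length = 8 + size + (size & 1)  # RIFF chunks are padded to an even size
        length = min(length, len(data) - pos)  # tolerate a truncated last chunk
        chunks.append((pos, length))
        pos += length
    return chunks


def make_riff(payload: bytes) -> bytes:
    """Build a tiny synthetic RIFF/WAVE container for demonstration only."""
    return b"RIFF" + struct.pack("<I", 4 + len(payload)) + b"WAVE" + payload


if __name__ == "__main__":
    combined = make_riff(b"aaaa") + make_riff(b"bbbbbb") + make_riff(b"cc")
    for offset, length in split_riff_chunks(combined):
        print(offset, length)
```

Each (offset, length) pair could then be written out as its own .wav file and opened normally.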


With the examples above in mind,
an implementation of a file header search function with 4 optional command-line arguments is requested:

  1. -header-search-[file-format]: this argument enables the search function for the headers associated with the file format named in the argument. The offset of each header found is recorded in a list, which can be written out using the -header-search-output command-line argument. Once ffmpeg has the list, it probes the file at each recorded offset. Example: -header-search-wav will look for every instance of RIFF in the whole file and record the results. From there, ffmpeg can treat the data as if multiple WAVE audio files were being input.
  2. -header-search-ext: same as the previous argument, but the format to search for is chosen automatically from the file extension. Example: -i landscape.jpg with this argument works the same as -header-search-jpg.
  3. -header-search-all: this argument enables the search function for all supported file headers. The search should work like this: there is a list of "supported headers" -> the function takes the first header -> it searches the whole file for every instance and appends each offset to the "found headers" list. The process repeats for every other entry in the "supported headers" list. This is obviously a very slow process, but it ensures that no useful data is missed and no manual searching has to be done.
  4. -header-search-output: this argument makes the search function write its results to a file. Example: -header-search-output "path\to\output.txt" will record the search function output at the specified location.
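
The search loop described in the list can be sketched in Python as follows. This is only an illustration of the requested behaviour, not existing FFmpeg code: the signature table, `search_headers` and `format_report` are all hypothetical names, and the table holds only three well-known signatures. Note that a bare "RIFF" tag also appears in other RIFF-based containers (AVI, WebP), so a real implementation would need to check more than the first 4 bytes to avoid false positives.

```python
# Hypothetical "supported headers" table: format name -> leading signature.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "wav": b"RIFF",
    "jpg": b"\xff\xd8\xff",
}


def search_headers(data: bytes, formats=None):
    """Return a sorted list of (offset, format) for every signature hit.

    formats=None searches every entry in SIGNATURES (the "-all" case);
    otherwise only the named formats are searched.
    """
    names = list(SIGNATURES) if formats is None else list(formats)
    found = []
    for name in names:
        signature = SIGNATURES[name]
        pos = data.find(signature)
        while pos != -1:
            found.append((pos, name))
            pos = data.find(signature, pos + 1)  # keep scanning past this hit
    return sorted(found)


def format_report(found) -> str:
    """Render the found-headers list as one 'offset<TAB>format' line per hit."""
    return "\n".join(f"{offset}\t{name}" for offset, name in found)


if __name__ == "__main__":
    sample = b"junk" + b"RIFF" + b"\x00" * 4 + SIGNATURES["png"] + b"RIFF"
    print(format_report(search_headers(sample)))
```

Passing a single format name corresponds to the per-format argument, and writing the report string to a file corresponds to the output argument.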

In conclusion,

I believe that the requested feature would provide robust file probing that allows all useful data to be extracted from any type of file using a simple command-line argument, with no manual searching needed.

I hope that I made everything clear.
Thank you for considering.

Attachments (2)

index_palette.png (3.4 KB) - added by Damole-wer 2 hours ago.
Godot PNG file
index_palette_fixed.png (3.4 KB) - added by Damole-wer 2 hours ago.
Fixed PNG


Change History (3)

by Damole-wer, 2 hours ago

Attachment: index_palette.png added

Godot PNG file

by Damole-wer, 2 hours ago

Attachment: index_palette_fixed.png added

Fixed PNG

comment:1 by MasterQuestionable, 2 hours ago

Cc: MasterQuestionable added