Opened 10 years ago

Last modified 10 years ago

#3706 new enhancement

support header row per section in ffprobe csv writer

Reported by: dave rice
Owned by:
Priority: wish
Component: ffprobe
Version: git-master
Keywords: csv
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the enhancement:

I propose an option in the csv writer of ffprobe to write out a header row per section. Currently the csv output carries no field names:

ffprobe -v 0 -f lavfi smptebars=r=3:d=1 -of csv -show_streams -show_format -show_frames
frame,video,1,0,0.000000,0,0.000000,0,0.000000,1,0.333333,N/A,N/A,320,240,yuv420p,1:1,I,0,0,0,0,0
frame,video,1,1,0.333333,1,0.333333,1,0.333333,1,0.333333,N/A,N/A,320,240,yuv420p,1:1,I,0,0,0,0,0
frame,video,1,2,0.666667,2,0.666667,2,0.666667,1,0.333333,N/A,N/A,320,240,yuv420p,1:1,I,0,0,0,0,0
stream,0,rawvideo,raw video,unknown,video,1/3,I420,0x30323449,320,240,0,1:1,4:3,yuv420p,-99,N/A,N/A,3/1,3/1,1/3,0,0.000000,N/A,N/A,N/A,N/A,3,N/A,0,0,0,0,0,0,0,0,0,0,0
format,smptebars=r=3:d=1,1,0,lavfi,Libavfilter virtual input device,0.000000,N/A,N/A,N/A,25

With the proposed header option, the output could be:

frame,media_type,key_frame,pkt_pts,pkt_pts_time,pkt_dts,pkt_dts_time,best_effort_timestamp,best_effort_timestamp_time,pkt_duration,pkt_duration_time,pkt_pos,pkt_size,width,height,pix_fmt,sample_aspect_ratio,pict_type,coded_picture_number,display_picture_number,interlaced_frame,top_field_first,repeat_pict
frame,video,1,0,0.000000,0,0.000000,0,0.000000,1,0.333333,N/A,N/A,320,240,yuv420p,1:1,I,0,0,0,0,0
frame,video,1,1,0.333333,1,0.333333,1,0.333333,1,0.333333,N/A,N/A,320,240,yuv420p,1:1,I,0,0,0,0,0
frame,video,1,2,0.666667,2,0.666667,2,0.666667,1,0.333333,N/A,N/A,320,240,yuv420p,1:1,I,0,0,0,0,0
stream,index,codec_name,codec_long_name,profile,codec_type,codec_time_base,codec_tag_string,codec_tag,width,height,has_b_frames,sample_aspect_ratio,display_aspect_ratio,pix_fmt,level,timecode,id,r_frame_rate,avg_frame_rate,time_base,start_pts,start_time,duration_ts,duration,bit_rate,nb_frames,nb_read_frames,nb_read_packets,DISPOSITION:default,DISPOSITION:dub,DISPOSITION:original,DISPOSITION:comment,DISPOSITION:lyrics,DISPOSITION:karaoke,DISPOSITION:forced,DISPOSITION:hearing_impaired,DISPOSITION:visual_impaired,DISPOSITION:clean_effects,DISPOSITION:attached_pic
stream,0,rawvideo,raw video,unknown,video,1/3,I420,0x30323449,320,240,0,1:1,4:3,yuv420p,-99,N/A,N/A,3/1,3/1,1/3,0,0.000000,N/A,N/A,N/A,N/A,3,N/A,0,0,0,0,0,0,0,0,0,0,0
format,filename,nb_streams,nb_programs,format_name,format_long_name,start_time,duration,size,bit_rate,probe_score
format,smptebars=r=3:d=1,1,0,lavfi,Libavfilter virtual input device,0.000000,N/A,N/A,N/A,25
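
A minimal Python sketch of how a consumer might read the layout proposed above. It assumes, as the proposal implies but no existing ffprobe option guarantees, that the first row seen for each section name is that section's header row. The function name `parse_sectioned_csv` is made up for illustration.

```python
import csv
import io


def parse_sectioned_csv(text):
    """Parse csv where the first row of each section is a header row.

    Assumption (a proposal, not an existing ffprobe feature): the first row
    seen for a given section name (column 1) lists that section's field names.
    """
    headers = {}   # section name -> list of field names
    records = []   # list of (section name, {field name: value})
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        section, values = row[0], row[1:]
        if section not in headers:
            # First row for this section: treat it as the header.
            headers[section] = values
            continue
        records.append((section, dict(zip(headers[section], values))))
    return records


# The two "format" rows from the proposed output above.
sample = """\
format,filename,nb_streams,nb_programs,format_name,format_long_name,start_time,duration,size,bit_rate,probe_score
format,smptebars=r=3:d=1,1,0,lavfi,Libavfilter virtual input device,0.000000,N/A,N/A,N/A,25
"""

records = parse_sectioned_csv(sample)
print(records[0][1]["format_name"])  # lavfi
```

With headers present, the reader looks fields up by name, so it keeps working if ffprobe later adds or reorders columns.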

I think the advantage is that the output becomes much more self-descriptive and stays usable as the output of ffprobe changes or expands during development. I know there are other self-descriptive formats such as json and xml, but these are close to 10x the data rate of csv. A csv output with per-section headers would have a low data rate but still be self-descriptive.
Dave Rice

Change History (4)

comment:1 by Carl Eugen Hoyos, 10 years ago

Component: undetermined → ffprobe
Keywords: ffprobe removed
Priority: normal → wish

comment:2 by Carl Eugen Hoyos, 10 years ago

If the file contains audio and video, the 14th column contains either width (video) or sample_fmt (audio). What should the header show for the 14th column?

comment:3 by dave rice, 10 years ago

In this case I think video:pix_fmt would be the closest semantic equivalent of audio:sample_fmt, if it's feasible to make them appear in the same column, but this would involve restructuring the output rather than just adding headers.

Since the set of stream values depends on the stream type, there could be one header per stream type rather than one for all streams, such as:

stream:video,index,codec_name,codec_long_name,profile,codec_type,codec_time_base,codec_tag_string,codec_tag,width,height,has_b_frames,sample_aspect_ratio,display_aspect_ratio,pix_fmt,level,timecode,id,r_frame_rate,avg_frame_rate,time_base,start_pts,start_time,duration_ts,duration,bit_rate,max_bit_rate,nb_frames,nb_read_frames,nb_read_packets,DISPOSITION:default,DISPOSITION:dub,DISPOSITION:original,DISPOSITION:comment,DISPOSITION:lyrics,DISPOSITION:karaoke,DISPOSITION:forced,DISPOSITION:hearing_impaired,DISPOSITION:visual_impaired,DISPOSITION:clean_effects,DISPOSITION:attached_pic,TAG:creation_time,TAG:language,TAG:handler_name
stream:audio,index,codec_name,codec_long_name,profile,codec_type,codec_time_base,codec_tag_string,codec_tag,sample_fmt,sample_rate,channels,channel_layout,bits_per_sample,id,r_frame_rate,avg_frame_rate,time_base,start_pts,start_time,duration_ts,duration,bit_rate,max_bit_rate,nb_frames,nb_read_frames,nb_read_packets,DISPOSITION:default,DISPOSITION:dub,DISPOSITION:original,DISPOSITION:comment,DISPOSITION:lyrics,DISPOSITION:karaoke,DISPOSITION:forced,DISPOSITION:hearing_impaired,DISPOSITION:visual_impaired,DISPOSITION:clean_effects,DISPOSITION:attached_pic,TAG:creation_time,TAG:language,TAG:handler_name
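
One way a reader could consume such per-type headers, sketched in Python. This is only an illustration under assumptions: header rows are tagged `stream:<type>`, data rows are tagged plain `stream`, and every header contains a `codec_type` field whose value in a data row names the matching header. The sample rows are shortened, made-up rows, not real ffprobe output, and `parse_typed_stream_csv` is a hypothetical name.

```python
import csv
import io


def parse_typed_stream_csv(text):
    """Map each "stream" row to the "stream:<type>" header that fits it."""
    headers = {}   # e.g. "video" -> field names from the "stream:video" row
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        tag, values = row[0], row[1:]
        if tag.startswith("stream:"):
            headers[tag.split(":", 1)[1]] = values
            continue
        if tag != "stream":
            continue
        # Choose the header whose codec_type column matches this row's value.
        for stream_type, fields in headers.items():
            pos = fields.index("codec_type")  # assumed present in every header
            if pos < len(values) and values[pos] == stream_type:
                records.append(dict(zip(fields, values)))
                break
    return records


# Shortened, illustrative rows (not real ffprobe output).
sample = """\
stream:video,index,codec_name,codec_type,width,height
stream:audio,index,codec_name,codec_type,sample_fmt,sample_rate
stream,0,h264,video,320,240
stream,1,aac,audio,fltp,48000
"""

streams = parse_typed_stream_csv(sample)
print(streams[0]["width"], streams[1]["sample_fmt"])  # 320 fltp
```

This only works because `codec_type` sits at the same position in both headers. A generic csv application that expects a single header row would not handle this layout without custom code.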

If the csv header for stream stays as one line, then the 14th column header could be something like:

video:width|audio:sample_fmt

though the column position for each metadata value would change based on the metadata of the stream, right?
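
A small Python sketch of how a combined header cell such as `video:width|audio:sample_fmt` might be resolved for one row. The `type:name|type:name` cell syntax is the hypothetical format suggested above, and `resolve_header` is a made-up helper. It does not address the concern that column positions may shift with the stream's metadata.

```python
def resolve_header(cells, media_type):
    """Return the field names that apply to media_type.

    Cells containing "|" are treated as alternatives of the form
    "video:width|audio:sample_fmt" (hypothetical syntax); all other cells
    are passed through unchanged.
    """
    names = []
    for cell in cells:
        if "|" not in cell:
            names.append(cell)
            continue
        options = dict(part.split(":", 1) for part in cell.split("|"))
        # None marks a column that has no meaning for this media type.
        names.append(options.get(media_type))
    return names


header = ["media_type", "key_frame", "video:width|audio:sample_fmt"]
print(resolve_header(header, "video"))  # ['media_type', 'key_frame', 'width']
print(resolve_header(header, "audio"))  # ['media_type', 'key_frame', 'sample_fmt']
```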

comment:4 in reply to: 3 by Carl Eugen Hoyos, 10 years ago

Replying to dericed:

In this case I think video:pix_fmt would be the closest semantic equivalent of audio:sample_fmt, if it's feasible to make them appear in the same column, but this would involve restructuring the output rather than just adding headers.

The order could (in theory) be changed, but this wouldn't fix the problem, or am I missing something?

Since the set of stream values depends on the stream type, there could be one header per stream type rather than one for all streams, such as:

stream:video,index,codec_name,codec_long_name,profile,codec_type,codec_time_base,codec_tag_string,codec_tag,width,height,has_b_frames,sample_aspect_ratio,display_aspect_ratio,pix_fmt,level,timecode,id,r_frame_rate,avg_frame_rate,time_base,start_pts,start_time,duration_ts,duration,bit_rate,max_bit_rate,nb_frames,nb_read_frames,nb_read_packets,DISPOSITION:default,DISPOSITION:dub,DISPOSITION:original,DISPOSITION:comment,DISPOSITION:lyrics,DISPOSITION:karaoke,DISPOSITION:forced,DISPOSITION:hearing_impaired,DISPOSITION:visual_impaired,DISPOSITION:clean_effects,DISPOSITION:attached_pic,TAG:creation_time,TAG:language,TAG:handler_name
stream:audio,index,codec_name,codec_long_name,profile,codec_type,codec_time_base,codec_tag_string,codec_tag,sample_fmt,sample_rate,channels,channel_layout,bits_per_sample,id,r_frame_rate,avg_frame_rate,time_base,start_pts,start_time,duration_ts,duration,bit_rate,max_bit_rate,nb_frames,nb_read_frames,nb_read_packets,DISPOSITION:default,DISPOSITION:dub,DISPOSITION:original,DISPOSITION:comment,DISPOSITION:lyrics,DISPOSITION:karaoke,DISPOSITION:forced,DISPOSITION:hearing_impaired,DISPOSITION:visual_impaired,DISPOSITION:clean_effects,DISPOSITION:attached_pic,TAG:creation_time,TAG:language,TAG:handler_name

How would you parse this?
I mean: Would this really help with any application reading csv?

If the csv header for stream stays as one line, then the 14th column header could be something like:

video:width|audio:sample_fmt

This would work at least for some cases.

though the column position for each metadata value would change based on the metadata of the stream, right?

It is possible to show only frames without metadata.
