Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#9228 closed defect (invalid)

multiple output in single C/libavcodec xcode process offsets "start" in output files

Reported by: Ray
Owned by:
Priority: normal
Component: documentation
Version: git-master
Keywords: AAC
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description (last modified by Ray)

[solved]: the issue was in my own code, which was based on ffmpeg's transcode_aac example and inherited its global static pts variable for frame timestamps.

Summary of the bug:

I have a C/libavcodec program that transcodes multiple input files to mp3 or m4a using ffmpeg's libmp3lame or built-in aac encoders respectively. No libavcodec constructs are reused (a new AVCodecContext, AVFormatContext and AVAudioFifo are used for every transcode).

When generating .m4a output and examining it with ffprobe, I observe that the "start" is incrementally offset. The first m4a file starts at 00:00:00, but each subsequent m4a file has a "start" that is offset by a time very close to the total length of the files generated before it. This does not happen when transcoding to mp3.

I have been able to reproduce this using the examples/transcode_aac.c and modified it so that it creates multiple files.

For example, transcoding the same 5sec wav file to m4a several times within the same process generates a set of files whose start time is offset further and further as each new file is generated. Note the "start" time:

$ for i in /tmp/a/foo*m4a; do ffprobe -hide_banner -i $i; done
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/a/foo00.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: isomiso2
    encoder         : Lavf58.29.100
  Duration: 00:00:05.02, start: 0.000000, bitrate: 79 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 76 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/a/foo01.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: isomiso2
    encoder         : Lavf58.29.100
  Duration: 00:00:05.02, start: 4.976009, bitrate: 79 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 76 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/a/foo02.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: isomiso2
    encoder         : Lavf58.29.100
  Duration: 00:00:05.02, start: 9.976009, bitrate: 79 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 76 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

I would expect ALL files to be identical. I tested this with the built-in aac encoder and also with libfdk_aac, and both exhibit this behaviour.

How to reproduce:

# have a single process that will transcode multiple files to aac.
# see attached patched transcode_aac.c

$ ./transcode_aac  input.wav output m4a 3

# generates 3x m4a encoded by inbuilt "aac" encoder called output{00,01,02}.m4a

# examine the output files, observing the start time via ffprobe

Attached are the wav input, the m4a outputs and the sample source code that demonstrates this. The code is doc/examples/transcode_aac.c with a loop around the old main() function that uses a new output file name on each iteration.
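For readers without the attachment, the loop described above might look roughly like the following. This is only a sketch: build_output_name, transcode_many and the transcode_fn callback (standing in for the old main() body) are hypothetical names, not the contents of the attached file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<prefix><NN>.<ext>" into buf,
 * e.g. "output00.m4a". Returns 0 on success, -1 if the name did not fit. */
int build_output_name(char *buf, size_t size, const char *prefix,
                      int index, const char *ext)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%s%02d.%s", prefix, index, ext);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical stand-in for the body of the example's original main():
 * transcodes one input file to one output file, returns 0 on success. */
typedef int (*transcode_fn)(const char *input, const char *output);

/* Runs the transcode `count` times in one process, each time with a
 * new output file name. */
int transcode_many(const char *input, const char *prefix, const char *ext,
                   int count, transcode_fn transcode_one)
{
    char name[1024];
    int i;

    for (i = 0; i < count; i++) {
        if (build_output_name(name, sizeof(name), prefix, i, ext) < 0)
            return -1;
        if (transcode_one(input, name) != 0)
            return -1;
    }
    return 0;
}
```

Any state that lives outside transcode_one (for example a file-scope variable) survives from one iteration to the next in such a loop.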

Attachments (1)

bug.tar.gz (138.0 KB ) - added by Ray 3 years ago.
sample input, output and src code


Change History (9)

by Ray, 3 years ago

Attachment: bug.tar.gz added

sample input, output and src code

comment:1 by Carl Eugen Hoyos, 3 years ago

Keywords: transcoding removed
Version: 4.2.4 → unspecified

I don't think this has many similarities with a valid ticket (are you just reporting that aac audio has a delay?), please at least confirm that the issue is reproducible with current FFmpeg git head.

in reply to:  1 comment:2 by Ray, 3 years ago

Version: unspecified → git-master

Replying to Carl Eugen Hoyos:

I don't think this has many similarities with a valid ticket (are you just reporting that aac audio has a delay?), please at least confirm that the issue is reproducible with current FFmpeg git head.

Confirmed same behaviour on git head f8d910e90f599f338438833dfc92e2f1915ce414

Yes I am reporting that, after the first transcoded file, all subsequent aac encoded files have an apparent delayed "start" offset that is proportional to the sum of previous transcoded audio lengths. Input file for xcode is 5secs, 1st xcode'd m4a start=0, 2nd xcode'd m4a start=~5sec, 3rd xcode'd m4a start=~10sec ..

When playing the 3rd generated .m4a file in mpv, mpv believes it is 14secs long and starts at 10secs (instead of at 00secs), although the original .wav used for transcoding is 5 secs long. ffprobe on the .m4a reports:

Duration: 00:00:14.99, start: 9.976009, bitrate: 26 kb/s

Interestingly, git head produces the files where the duration (~15sec, start at ~10sec) is in line with the reported start (see output above)

The sample code compiled and run against 4.2.4 and 4.3.2 reports the duration as 5secs (not ~15secs), with start at ~10sec:

Duration: 00:00:05.02, start: 9.976009, bitrate: 79 kb/s
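The arithmetic behind these numbers can be sketched as follows. It assumes the timestamps are counted in samples with a time base of 1/sample_rate, and that the 5sec input holds 220500 samples at 44100 Hz (inferred from 9.976009 - 4.976009 = 5.000000, not measured). The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a timestamp counted in samples to seconds, assuming the
 * time base is 1/sample_rate. */
double pts_to_seconds(int64_t pts, int sample_rate)
{
    assert(sample_rate > 0);
    return (double)pts / (double)sample_rate;
}

/* First timestamp of output file number file_index (0-based) if the
 * timestamp counter keeps running across files, i.e. it has already
 * advanced by samples_per_file for every earlier file. */
int64_t first_pts_without_reset(int file_index, int64_t samples_per_file)
{
    assert(file_index >= 0);
    return (int64_t)file_index * samples_per_file;
}
```

Under those assumptions the first timestamps of files 0, 1 and 2 come out at 0, 5 and 10 seconds. The reported starts are about 24 ms lower (4.976009 and 9.976009); nothing in this ticket explains that remainder, and encoder delay is only a guess.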


comment:3 by Carl Eugen Hoyos, 3 years ago

Can you confirm that the issue is not reproducible with ffmpeg, the application?

in reply to:  3 comment:4 by Ray, 3 years ago

Replying to Carl Eugen Hoyos:

Can you confirm that the issue is not reproducible with ffmpeg, the application?

Confirmed: ffmpeg itself (git head and other versions) does not have this problem. Both generated files have "start" at 0secs.

$ ffmpeg -i foo.wav \
  -b:a 96k -c:a aac bar00.m4a \
  -b:a 96k -c:a aac bar01.m4a

$ for i in bar*.m4a; do ffprobe -hide_banner -i "$i"; done
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'bar00.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: M4A isomiso2
    encoder         : Lavf58.45.100
  Duration: 00:00:05.02, start: 0.000000, bitrate: 79 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 76 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'bar01.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: M4A isomiso2
    encoder         : Lavf58.45.100
  Duration: 00:00:05.02, start: 0.000000, bitrate: 80 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 78 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

comment:5 by mkver, 3 years ago

Resolution: invalid
Status: new → closed

The reason for what you are experiencing is the use of the static pts variable which does not get reset when switching to a new output file. The reason you are not experiencing this with mp3 is that the mp3 muxer does not retain the pts (probably because the file format does not allow this).
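The effect can be reproduced without libavcodec. The sketch below is a simplified model, not the example's code: shared_pts stands in for the file-scope pts variable, and stamp_file_shared / stamp_file_owned are hypothetical names for "assign a pts to every frame of one output file".

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the example's file-scope timestamp (hypothetical name).
 * It is initialized once, when the program starts, not once per file. */
static int64_t shared_pts = 0;

/* Stamps every frame of one output file using the shared counter and
 * returns the pts that the first frame of the file received. */
int64_t stamp_file_shared(int64_t total_samples, int frame_size)
{
    int64_t first_pts = shared_pts;

    assert(frame_size > 0);
    while (total_samples > 0) {
        int64_t nb_samples = total_samples < frame_size ? total_samples
                                                        : frame_size;
        /* In the real example: frame->pts = pts; pts += frame->nb_samples; */
        shared_pts += nb_samples;
        total_samples -= nb_samples;
    }
    return first_pts;
}

/* Same loop, but the counter belongs to the caller, who can start a
 * fresh one at 0 for every output file. */
int64_t stamp_file_owned(int64_t total_samples, int frame_size, int64_t *pts)
{
    int64_t first_pts = *pts;

    assert(frame_size > 0);
    while (total_samples > 0) {
        int64_t nb_samples = total_samples < frame_size ? total_samples
                                                        : frame_size;
        *pts += nb_samples;
        total_samples -= nb_samples;
    }
    return first_pts;
}
```

Three consecutive calls to stamp_file_shared(220500, 1024) return 0, 220500 and 441000, so the second and third "files" begin at the point where the previous one ended. Calling stamp_file_owned with a counter that is zero-initialized for each file returns 0 every time.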

in reply to:  5 comment:6 by Ray, 3 years ago

Replying to mkver:

The reason for what you are experiencing is the use of the static pts variable which does not get reset when switching to a new output file. The reason you are not experiencing this with mp3 is that the mp3 muxer does not retain the pts (probably because the file format does not allow this).

Thank you! I had based my application on the example, which has this. I will submit a patch to correct this, but for anyone who might search for this in future:

diff --git a/doc/examples/transcode_aac.c b/doc/examples/transcode_aac.c
index 711076b5a5..7f6148fa15 100644
--- a/doc/examples/transcode_aac.c
+++ b/doc/examples/transcode_aac.c
@@ -648,8 +648,6 @@ static int init_output_frame(AVFrame **frame,
     return 0;
 }
 
-/* Global timestamp for the audio frames. */
-static int64_t pts = 0;
 
 /**
  * Encode one frame worth of audio to the output file.
@@ -663,6 +661,7 @@ static int64_t pts = 0;
 static int encode_audio_frame(AVFrame *frame,
                               AVFormatContext *output_format_context,
                               AVCodecContext *output_codec_context,
+			      int64_t* pts,
                               int *data_present)
 {
     /* Packet used for temporary storage. */
@@ -675,8 +674,8 @@ static int encode_audio_frame(AVFrame *frame,
 
     /* Set a timestamp based on the sample rate for the container. */
     if (frame) {
-        frame->pts = pts;
-        pts += frame->nb_samples;
+        frame->pts = *pts;
+        *pts += frame->nb_samples;
     }
 
     /* Send the audio frame stored in the temporary packet to the encoder.
@@ -685,7 +684,6 @@ static int encode_audio_frame(AVFrame *frame,
     /* The encoder signals that it has nothing more to encode. */
     if (error == AVERROR_EOF) {
         error = 0;
-        goto cleanup;
     } else if (error < 0) {
         fprintf(stderr, "Could not send packet for encoding (error '%s')\n",
                 av_err2str(error));
@@ -735,7 +733,8 @@ cleanup:
  */
 static int load_encode_and_write(AVAudioFifo *fifo,
                                  AVFormatContext *output_format_context,
-                                 AVCodecContext *output_codec_context)
+                                 AVCodecContext *output_codec_context,
+				  int64_t* pts)
 {
     /* Temporary storage of the output samples of the frame written to the file. */
     AVFrame *output_frame;
@@ -760,7 +759,7 @@ static int load_encode_and_write(AVAudioFifo *fifo,
 
     /* Encode one frame worth of audio samples. */
     if (encode_audio_frame(output_frame, output_format_context,
-                           output_codec_context, &data_written)) {
+                           output_codec_context, pts, &data_written)) {
         av_frame_free(&output_frame);
         return AVERROR_EXIT;
     }
@@ -791,6 +790,8 @@ int main(int argc, char **argv)
     SwrContext *resample_context = NULL;
     AVAudioFifo *fifo = NULL;
     int ret = AVERROR_EXIT;
+    /* timestamp for the audio frames. */
+    int64_t pts = 0;
 
     if (argc != 3) {
         fprintf(stderr, "Usage: %s <input file> <output file>\n", argv[0]);
@@ -851,7 +852,7 @@ int main(int argc, char **argv)
             /* Take one frame worth of audio samples from the FIFO buffer,
              * encode it and write it to the output file. */
             if (load_encode_and_write(fifo, output_format_context,
-                                      output_codec_context))
+                                      output_codec_context, &pts))
                 goto cleanup;
 
         /* If we are at the end of the input file and have encoded
@@ -862,7 +863,7 @@ int main(int argc, char **argv)
             do {
                 data_written = 0;
                 if (encode_audio_frame(NULL, output_format_context,
-                                       output_codec_context, &data_written))
+                                       output_codec_context, &pts, &data_written))
                     goto cleanup;
             } while (data_written);
             break;

comment:7 by Ray, 3 years ago

Component: avcodec → documentation
Description: modified

comment:8 by Cigaes, 3 years ago

The point of an example is not to be the exact code that you need for your particular task, it's to teach you the basics that you need to know to use the API. As such, they need to be as simple as possible.

If you saw this pts variable and neglected to wonder “why is it there? what is it used for? why is it initialized?”, or even worse if you never saw the variable, then you've been using the examples wrong.
