WebVTT decoder doesn't handle HTML escapes
Reported by: RiCON
Owned by:
Blocking:
Reproduced by developer: no
Analyzed by developer: no
The WebVTT spec specifies a dozen HTML escapes that should be handled, including '&gt;', '&lt;' and '&amp;'. The decoder doesn't convert these back to the proper characters ('>', '<' and '&'), so the escape sequences appear verbatim in the output.
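To illustrate the expected behaviour, here is a minimal Python sketch of the unescaping step. It is not FFmpeg code: the function name `unescape_cue_text` and the table `WEBVTT_ESCAPES` are hypothetical, and the table only covers the six escapes I believe the WebVTT cue text syntax names explicitly (`&amp;`, `&lt;`, `&gt;`, `&lrm;`, `&rlm;`, `&nbsp;`), so the list should be checked against the spec.

```python
import re

# Assumed subset of escapes from the WebVTT cue text syntax; verify against the spec.
WEBVTT_ESCAPES = {
    "amp": "&",
    "lt": "<",
    "gt": ">",
    "lrm": "\u200e",   # left-to-right mark
    "rlm": "\u200f",   # right-to-left mark
    "nbsp": "\u00a0",  # non-breaking space
}

# Matches a named escape such as "&lt;" and captures the name.
_ESCAPE_RE = re.compile(r"&([A-Za-z]+);")


def unescape_cue_text(text):
    """Replace known WebVTT escapes in one pass; leave unknown ones untouched."""
    def replace(match):
        return WEBVTT_ESCAPES.get(match.group(1), match.group(0))

    return _ESCAPE_RE.sub(replace, text)


if __name__ == "__main__":
    print(unescape_cue_text("1 &lt; 2 &amp;&amp; 3 &gt; 2"))
```

The replacement is done in a single pass so that already-decoded text is not decoded twice: `&amp;lt;` becomes the literal text `&lt;` rather than `<`.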
% ffmpeg -i htmlescapes.vtt out.srt
ffmpeg version N-75818-g8135b1e Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (Rev4, Built by MSYS2 project)
Attached are an example VTT file, the result produced by this build, and the expected result.
Real-world examples of these HTML escapes can be found in the subtitles of any video on Comedy Central's site, which can be downloaded with a tool such as youtube-dl. For example:
% youtube-dl --all-subs "http://www.cc.com/video-clips/52dpzm/the-daily-show-with-trevor-noah-terrible--unending-national-tragedies"