Changes between Version 4 and Version 5 of swscale

Mar 31, 2016, 6:17:30 PM (3 years ago)



  • swscale

    v4 v5  
    3737  * these should ideally not be exposed in the common code (i.e. a grep for x86 in libswscale/*.c should be mostly empty)
    3838  * simd should be allowed to set constraints. E.g., over-reading and -writing of buffers should be enabled in some scenarios where that makes the assembly easier to write. This should be conveyed to the user so they can choose fast conversion by having padded buffers, or a slower conversion by not having padded buffers
    39   * a "cleaner" SIMD API, see libavcodec/libavfilter/libswresample for examples
     39  * a "cleaner" SIMD API, see libavcodec/libavfilter/libswresample for examples, particularly for the "fast" versions
     40  * bringing some sanity into the choice of which function is assigned to which pointer under which conditions, or at least documenting it in human-understandable form, would also help a lot
    4041- documentation
    4142  * API docs
    254255[10:34am] <michaelni> i just saw that gimp caching code a long time ago (which used squares)
    255256[10:34am] <michaelni> or rectangles dont remember
     257[10:36am] <michaelni> also some kind of square / slice /tiling for multithreading is probably "important"
     258[10:37am] <ubitux> yeah
     259[10:37am] <ubitux> BBB: about the poor api, let's talk about the options...
     260[10:38am] <ubitux> dithering configuration is broken
     261[10:38am] <BBB> hm, dithering
     262[10:38am] <ubitux> like, you haven't a dither=none
     263[10:38am] <ubitux> it's just enabled, sometimes
     264[10:38am] <ubitux> (0=auto)
     265[10:38am] <BBB> yeah, nobody knows when it’s enabled
     266[10:38am] <michaelni> btw, iam not sure though how important/useful such non memory squares are ...
     267[10:38am] <ubitux> also the scaler selection mixed within flags
     269[10:39am] <BBB> the avscale blueprint is pretty good at explaining that, the filter selection should be an enum
     270[10:39am] <BBB> (AVOption)
     271[10:39am] <BBB> and the rest of the flags can just be bools or whatevers as separate options
     272[10:39am] <BBB> I guess pre-AVOption extending api was hard so this made sense, but with AVOption we really don’t need it anymore
     273[10:39am] <ubitux> small nit: wth are param0 and param1
     274[10:40am] <BBB> hahaha right
     275[10:40am] <BBB> they are options for some filters
     276[10:40am] <BBB> wasn’t it alpha/beta for one of the larger filters?
     277[10:40am] <ubitux> error_diffusion belongs to dithering btw
     278[10:40am] <ubitux> it's currently an sws flag
     279[10:40am] <ubitux> what are the implication of accurate_rnd too
     280[10:40am] » michaelni  doesnt remember exactly what param0 and 1 did for each filter but yes they were filter params
     281[10:41am] <BBB> For SWS_BICUBIC param[0] and [1] tune the shape of the basis function, param[0] tunes f(1) and param[1] f´(1) | For SWS_GAUSS param[0] tunes the exponent and thus cutoff frequency | For SWS_LANCZOS param[0] tunes the width of the window function
     282[10:41am] <ubitux> no other comment so far
     283[10:41am] <michaelni> these all should be AVOptions
     284[10:42am] <BBB> there’s also SWS_FULL_CHR_H_INT/INP
     285[10:42am] <BBB> in fact, I just noticed SWS_DIRECT_BGR, ...
     286[10:43am] <BBB> I believe int/inp had to do with non-420p support
     287[10:43am] <BBB> (I mean, practically speaking)
     288[10:43am] <BBB> doesn’t accurate_rnd increase precision of some internal codepath/something?
     289[10:43am] <ubitux> small note: there are a few dithering algorithms in vf paletteuse filter
     290[10:43am] <michaelni> BBB, yes accurate_rnd uses more accurate code
     291[10:44am] <michaelni> IIRC no pmulhw
     292[10:44am] <michaelni> which would lose the lsbs
     293[10:44am] <ubitux> it has different meaning depending on the codepath
     294[10:44am] <BBB> I think it’s totally fine to basically eliminate all these flags and go back to “let’s just make it do the right thing”
     295[10:44am] <ubitux> it's not well defined
     296[10:44am] <BBB> I can see how pmulh(u)w was critically important for performance in the mplayer era
     297[10:45am] <BBB> I think it’s totally fair to say that with avx2, it is not all that relevant anymore
     298[10:45am] <ubitux> it's used between 32 vs 16 in some rgb code iirc
     299[10:45am] <BBB> I also wonder if half of the filters should be deleted
     300[10:45am] <BBB> like sinc, gauss
     301[10:46am] <BBB> maybe even fast-bilinear
     302[10:46am] <BBB> (that would clean up the code so much)
     303[10:46am] <michaelni> the filters like sinc gauss should have nearly no complexity as its just different numbers
     304[10:46am] <BBB> it’s user complexity
     305[10:46am] <BBB> we should expose the ideal configuration settings to our user
     306[10:47am] <BBB> when to use spline or lanczos: when upscaling
     307[10:47am] <BBB> (and caring about quality)
     308[10:47am] <BBB> when to use bicublin: when speed is critical and you’re downscaling
     309[10:47am] <BBB> that’s very helpful to end users
     310[10:47am] <BBB> when to use gaussian?
     311[10:47am] <BBB> I don’t know… I don’t think anyone knows
     312[10:47am] <michaelni> with scaling different people will want different options and some people just like to have the choice
     313[10:47am] <ubitux> i'd keep the different filters
     314[10:47am] <ubitux> it's useful to make various visual comparison
     315[10:48am] <michaelni> sinc is something that some people "know" is best until they try it
     316[10:48am] <ubitux> :-D
     317[10:48am] <BBB> but doesn’t that mean we should remove it?
     318[10:48am] <BBB> why keep the option there
     319[10:48am] <ubitux> people will bug you to implement it because it's the perfect filter
     320[10:48am] <nevcairiel> many of the various filters are just different kernels over the same kind of filter, so preserving them costs you practically nothing
     321[10:48am] <BBB> do you know how many people thought x264 was the best encoder in the world but they were using it with default ffmpeg parameters (instead of presets)?
     322[10:49am] <ubitux> so you have it to show them it's shit, or just as a visual demonstration (educative purpose, experiment, ...)
     323[10:49am] <michaelni> its very important that the defaults are good
     324[10:49am] <BBB> I guess as long as defaults and documentation is good, I don’t mind
     325[10:49am] <BBB> but documentation is not good right now
     326[10:50am] <michaelni> the docs need some love, i could probably help with that if theres a list of what needs new/better docs
     327[10:51am] <fritsch> michaelni: "sinc" <- one still learns that in university, that's why
     328[10:51am] <av500> avscale
     329[10:52am] <av500> /undo
     330[10:52am] <Shiz> BBB: alternatively, pseudo-filter names like 'upscale' and 'downscale' that are just aliases for whatever is best
     331[10:52am] <nevcairiel> a perfect sinc filter is perfect - a windowed sinc is just an approximation
     332[10:52am] <BBB> hm, filter presets
     334[10:53am] <michaelni> fritsch, yes, its true what one learns but sinc results from some axioms and these dont apply that way to images
     335[10:53am] <ubitux> what's the filter window used in sws? is there such concept?
     336[10:55am] <fritsch> the raspberry pi people that implemented one from scratch decided to go for a special weighted bicubic filter
     337[10:55am] <fritsch> cause of implementation details / performance quality
     338[10:56am] <fritsch> kodi's lanczos3 filter needs quite a bit oomph and for example a hsw gpu is too slow to do 50 fps from 1080 to 4k
     339[10:57am] <wm4> fritsch: does kodi's scale width and height separately?
     340[10:58am] <fritsch> pi uses mitchell-netravali iirc, so the default most likely also ffmpeg uses
     341[10:59am] <fritsch> wm4: i need to look in detail, we use a pseudo separated filter
     342[10:59am] <fritsch> so most likely no
     343[10:59am] <fritsch> but wait a mo
     344[10:59am] <fritsch> it's from a time where you did not have a float intermediate buffer in the gpu
     345[10:59am] <fritsch> or where that extension was "patented" by someone
     346[11:01am] <BBB> so this may just be me, but I tend to think that swscale should be software. I’m all for doing things in hardware, but I don’t know if we should make swscale more complex for that
     348[11:01am] <BBB> or, rather, I wouldn’t know how to do it so it makes no sense for me to design it
     349[11:01am] <BBB> I don’t even know if the concept makes any sense at all
     350[11:01am] <fritsch> wm4: it uses a 4x4 convolution shader at the end
     352[11:01am] <fritsch> wm4: so "no" to your question
     354[11:02am] <fritsch> BBB: shaders are really, really mighty especially for convolution
     355[11:02am] <fritsch> i don't see a point doing that on the cpu
     356[11:02am] <BBB> I know shaders, I love them
     357[11:03am] <BBB> but my point is more about “do you want to use the swscale api if you’re going to scale stuff in hardware?”
     358[11:03am] <nevcairiel> shaders work fine if you already have the image on the gpu
     359[11:03am] <nevcairiel> if you dont, its a lot of overhead and potentially not worth it
     360[11:03am] <BBB> I’m not saying you shouldn’t scale in hw; you should, totally!
     361[11:03am] <BBB> I’m just wondering if swscale is the ideal place to serve as an intermediate layer
     362[11:03am] <wm4> fritsch: separating them makes it quite a bit faster
     363[11:03am] <fritsch> jep
     364[11:04am] <fritsch> but you need a float intermediate buffer
     365[11:04am] <fritsch> to do so
     366[11:04am] <fritsch> which we did not have (on all gpus) at this time
     367[11:04am] <fritsch> iirc gwenole also implemented his lanczos3 in libva without separated kernels
     369[11:04am] <wm4> fritsch: no you don't
     370[11:05am] <fritsch> wm4: then you lose information
     371[11:05am] <wm4> nonsense
     373[11:05am] <fritsch> nonsense
     374[11:05am] <fritsch> come on, doing a float multiplication and storing in non float intermediate buffer
     376[11:05am] <fritsch> drives the separation nuts
     378[11:06am] <wm4> a 16 bit fixed point buffer preserves more information than a 16 bit float buffer
     380[11:07am] <fritsch> wm4: then you need to scale appropriately twice
     382[11:07am] <fritsch> e.g. scale the filter weights
     383[11:07am] <fritsch> and inverse at the end
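
A note on the last exchange: the fixed-point approach wm4 describes can be sketched in C as a separable filter (horizontal pass, then vertical pass) with a 16-bit integer intermediate. The weights are scaled by 2^14 and the scaling is undone once at the end, which is the "scale the filter weights and inverse at the end" step fritsch mentions. An IEEE half float carries 11 significant bits, while the intermediate below keeps 14 magnitude bits (8 sample bits plus 6 fractional bits) plus sign. The constants COEF_BITS and MID_BITS, the 4-tap weights and all function names are assumptions chosen for illustration; they are not taken from swscale or Kodi.

```c
#include <stdint.h>

#define TAPS       4
#define COEF_BITS 14   /* weights stored as round(w * 2^14); illustrative choice */
#define MID_BITS   6   /* fractional bits kept in the intermediate; illustrative choice */

/* Hypothetical 4-tap bicubic-like kernel; the weights sum to 1 << COEF_BITS. */
static const int16_t demo_coef[TAPS] = { -1024, 9216, 9216, -1024 };

/* Rounding right shift that is well defined for negative values too. */
static int32_t round_shift(int32_t v, int s)
{
    int32_t half = 1 << (s - 1);
    return v >= 0 ? (v + half) >> s : -((-v + half) >> s);
}

static int32_t clamp(int32_t v, int32_t lo, int32_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Pass 1: 8-bit samples -> signed 16-bit intermediate with MID_BITS of fraction. */
static int16_t hpass(const uint8_t *src, const int16_t *coef)
{
    int32_t acc = 0;
    for (int i = 0; i < TAPS; i++)
        acc += (int32_t)src[i] * coef[i];
    return (int16_t)clamp(round_shift(acc, COEF_BITS - MID_BITS),
                          INT16_MIN, INT16_MAX);
}

/* Pass 2: intermediates -> 8-bit output; both scalings are removed here. */
static uint8_t vpass(const int16_t *mid, const int16_t *coef)
{
    int32_t acc = 0;
    for (int i = 0; i < TAPS; i++)
        acc += (int32_t)mid[i] * coef[i];
    return (uint8_t)clamp(round_shift(acc, COEF_BITS + MID_BITS), 0, 255);
}
```

For a flat area of value 200, hpass returns 200 << MID_BITS and vpass maps a column of such intermediates back to 200, so the two rounding steps do not shift flat regions. The overshoot of the negative lobes stays inside the int16_t range with these particular weights; a different kernel or a larger MID_BITS would rely on the clamp.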