     The official guide to swscale for confused developers.
    ========================================================
 
 Current (simplified) Architecture:
 ---------------------------------
                         Input
                           v
                    _______OR_________
                  /                   \
                /                       \
        special converter     [Input to YUV converter]
               |                         |
               |          (8bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
               |                         |
               |                         v
               |                  Horizontal scaler
               |                         |
               |      (15bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
               |                         |
               |                         v
               |          Vertical scaler and output converter
               |                         |
               v                         v
                          output
 
 
 Swscale has 2 scaler paths. Each path must be capable of handling
 slices, that is, consecutive non-overlapping rectangles of dimension
 (0,slice_top) - (picture_width, slice_bottom).
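
 As an illustration of the slice definition above, here is a minimal sketch
 that cuts a picture into consecutive, non-overlapping slices. The names
 Slice and split_into_slices() are hypothetical and are not part of
 libswscale; each slice spans the full picture width, so only the top and
 bottom rows are stored.

```c
#include <stddef.h>

/* Hypothetical slice descriptor: rows [top, bottom) of the picture. */
typedef struct {
    int top;
    int bottom;
} Slice;

/*
 * Split a picture of picture_height rows into consecutive, non-overlapping
 * slices of at most slice_height rows each. Returns the number of slices
 * written to out, which never exceeds max_slices.
 */
static size_t split_into_slices(int picture_height, int slice_height,
                                Slice *out, size_t max_slices)
{
    size_t n = 0;

    if (picture_height <= 0 || slice_height <= 0)
        return 0;

    for (int top = 0; top < picture_height && n < max_slices;
         top += slice_height) {
        int bottom = top + slice_height;
        if (bottom > picture_height)
            bottom = picture_height;   /* last slice may be shorter */
        out[n].top    = top;
        out[n].bottom = bottom;
        n++;
    }
    return n;
}
```

 A caller would then hand the slices to the scaler one after another, in
 order from the top of the picture to the bottom.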
 
 special converter
     These generally are unscaled converters of common
     formats, like YUV 4:2:0/4:2:2 -> RGB12/15/16/24/32. In principle, this
     path could also contain scalers optimized for specific common cases.
 
 Main path
     The main path is used when no special converter can be used. The code
     is designed as a destination line pull architecture. That is, for each
     output line the vertical scaler pulls lines from a ring buffer. When
     the ring buffer does not contain the wanted line, then it is pulled from
     the input slice through the input converter and horizontal scaler.
     The result is also stored in the ring buffer to serve future vertical
     scaler requests.
     When no more output can be generated because lines from a future slice
     would be needed, then all remaining lines in the current slice are
     converted, horizontally scaled and put in the ring buffer.
     [This is done for luma and chroma, each with possibly different numbers
      of lines per picture.]
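
     The pull architecture described above can be modelled with a toy ring
     buffer. This is only a sketch under simplifying assumptions (a fixed
     2-tap vertical average, a trivial stand-in for the horizontal scaler,
     one plane only); LineRing, ring_pull() and the other names are
     hypothetical and do not correspond to the real swscale data structures.

```c
#include <string.h>

#define RING_LINES 4   /* capacity of the toy ring buffer, in lines */
#define LINE_WIDTH 8   /* samples per line */

typedef struct {
    int   line_no[RING_LINES];           /* source line held per slot, -1 = empty */
    short data[RING_LINES][LINE_WIDTH];  /* horizontally scaled samples */
    int   hscale_calls;                  /* how often a line had to be produced */
} LineRing;

static void ring_init(LineRing *r)
{
    memset(r, 0, sizeof(*r));
    for (int i = 0; i < RING_LINES; i++)
        r->line_no[i] = -1;
}

/* Stand-in for "input converter + horizontal scaler": widen 8 to 15 bits. */
static void hscale_line(const unsigned char *src, short *dst)
{
    for (int x = 0; x < LINE_WIDTH; x++)
        dst[x] = (short)(src[x] << 7);
}

/* Return scaled source line y; produce and cache it only on a miss. */
static const short *ring_pull(LineRing *r, const unsigned char *src,
                              int stride, int y)
{
    int slot = y % RING_LINES;

    if (r->line_no[slot] != y) {
        hscale_line(src + y * stride, r->data[slot]);
        r->line_no[slot] = y;
        r->hscale_calls++;
    }
    return r->data[slot];
}

/* Vertical scaler: average source lines y0 and y0 + 1 into one output line. */
static void vscale_line(LineRing *r, const unsigned char *src, int stride,
                        int y0, unsigned char *dst)
{
    const short *a = ring_pull(r, src, stride, y0);
    const short *b = ring_pull(r, src, stride, y0 + 1);

    for (int x = 0; x < LINE_WIDTH; x++)
        dst[x] = (unsigned char)((a[x] + b[x] + 128) >> 8);
}
```

     Producing output lines 0 and 1 from a 3-line input runs the horizontal
     stand-in only three times, because source line 1 is served from the
     ring buffer on the second request.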
 
 Input to YUV Converter
     When the input to the main path is not planar 8 bits per component YUV or
     8-bit gray, it is converted to planar 8-bit YUV. Two sets of converters
     exist for this currently: One performs horizontal downscaling by 2
     before the conversion, the other leaves the full chroma resolution,
     but is slightly slower. The scaler will try to preserve full chroma
     when the output uses it. It is possible to force full chroma with
     SWS_FULL_CHR_H_INP even for cases where the scaler thinks it is useless.
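
     The difference between the two converter sets can be sketched for the
     U plane of packed RGB24 input. The coefficients below are a commonly
     used integer approximation of limited-range BT.601 and are an
     assumption for illustration; they are not taken from the swscale
     tables, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Convert one RGB triple to an 8-bit U sample. The 128 << 8 offset centers
 * U at 128 and keeps the sum positive, so the shift is well defined.
 */
static uint8_t rgb_to_u(int r, int g, int b)
{
    return (uint8_t)((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
}

/* Full chroma: one U sample per input pixel. */
static void rgb24_to_u_full(const uint8_t *rgb, uint8_t *u, int width)
{
    for (int x = 0; x < width; x++)
        u[x] = rgb_to_u(rgb[3 * x], rgb[3 * x + 1], rgb[3 * x + 2]);
}

/*
 * Half chroma: average each horizontal pixel pair first, then convert once.
 * Writes width / 2 samples; width is assumed to be even.
 */
static void rgb24_to_u_half(const uint8_t *rgb, uint8_t *u, int width)
{
    for (int x = 0; x < width / 2; x++) {
        const uint8_t *p = rgb + 6 * x;
        int r = (p[0] + p[3] + 1) >> 1;
        int g = (p[1] + p[4] + 1) >> 1;
        int b = (p[2] + p[5] + 1) >> 1;
        u[x] = rgb_to_u(r, g, b);
    }
}
```

     The half-chroma variant does one conversion per two pixels, which is
     why it is cheaper; the full-chroma variant keeps every chroma sample
     for outputs that can use them.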
 
 Horizontal scaler
     There are several horizontal scalers. A special case worth mentioning is
     the fast bilinear scaler that is made of runtime-generated MMXEXT code
     using specially tuned pshufw instructions.
     The remaining scalers are specially tuned for various filter lengths.
     They scale 8-bit unsigned planar data to 16-bit signed planar data.
     Supporting inputs with more than 8 bits per component in the future will
     require a new horizontal scaler that preserves the input precision.
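
     A generic filter-based horizontal scaler of this kind might look like
     the following plain C sketch. It assumes coefficients whose 1.0 point
     is 1 << 14, so shifting the accumulated sum right by 7 yields a 15-bit
     result stored in 16-bit signed samples. The function name, argument
     layout and the clamp at 0 are assumptions made for this example.

```c
#include <stdint.h>

/*
 * For each output sample i, apply filter_size taps starting at source
 * position filter_pos[i]. Taps for sample i are stored contiguously at
 * filter[filter_size * i]. Coefficients use 1 << 14 as 1.0.
 */
static void hscale_8_to_15(int16_t *dst, int dst_w, const uint8_t *src,
                           const int16_t *filter, const int32_t *filter_pos,
                           int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        int val = 0;

        for (int j = 0; j < filter_size; j++)
            val += src[filter_pos[i] + j] * filter[filter_size * i + j];

        /* Clamp first so that a negative value is never right-shifted. */
        if (val < 0)
            val = 0;
        val >>= 7;                       /* 8 + 14 - 7 = 15 bits */
        if (val > (1 << 15) - 1)
            val = (1 << 15) - 1;         /* filters with overshoot may exceed 15 bits */
        dst[i] = (int16_t)val;
    }
}
```

     With a single coefficient of 1 << 14 per sample the input 255 maps to
     255 << 7, and a 2-tap filter of 8192, 8192 averages two neighbours.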
 
 Vertical scaler and output converter
     There is a large number of combined vertical scalers + output converters.
     Some are:
     * unscaled output converters
     * unscaled output converters that average 2 chroma lines
     * bilinear converters                (C, MMX and accurate MMX)
     * arbitrary filter length converters (C, MMX and accurate MMX)
     And
     * Plain C  8-bit 4:2:2 YUV -> RGB converters using LUTs
     * Plain C 17-bit 4:4:4 YUV -> RGB converters using multiplies
     * MMX     11-bit 4:2:2 YUV -> RGB converters
     * Plain C 16-bit Y -> 16-bit gray
       ...
 
     RGB with less than 8 bits per component uses dither to improve the
     subjective quality and low-frequency accuracy.
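
     To illustrate why dither helps low-frequency accuracy, here is a small
     ordered-dither sketch that reduces an 8-bit component to 5 bits. The
     2x2 matrix and the function name are assumptions chosen for brevity;
     the dither tables actually used by swscale are different.

```c
#include <stdint.h>

/* 2x2 ordered-dither offsets covering one 5-bit quantization step (8). */
static const uint8_t dither_2x2[2][2] = {
    { 0, 4 },
    { 6, 2 },
};

/* Quantize an 8-bit value to 5 bits, adding a position-dependent offset. */
static uint8_t quantize_5bit_dithered(uint8_t v, int x, int y)
{
    int t = v + dither_2x2[y & 1][x & 1];

    if (t > 255)
        t = 255;                 /* keep the sum inside the 8-bit range */
    return (uint8_t)(t >> 3);    /* drop the 3 low bits */
}
```

     For a flat area of value 100, plain truncation always gives 12 (that
     is, 96 after scaling back), while the four dithered positions give 12,
     13, 13 and 12, whose average of 12.5 corresponds to exactly 100.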
 
 
 Filter coefficients:
 --------------------
 There are several different scalers (bilinear, bicubic, lanczos, area,
 sinc, ...). Their coefficients are calculated in initFilter().
 Horizontal filter coefficients have a 1.0 point at 1 << 14, vertical ones at
 1 << 12. The 1.0 points have been chosen to maximize precision while leaving
 a little headroom for convolutional filters like sharpening filters and
 minimizing SIMD instructions needed to apply them.
 It would be trivial to use a different 1.0 point if some specific scaler
 would benefit from it.
 Also, as already hinted at, initFilter() accepts an optional convolutional
 filter as input that can be used for contrast, saturation, blur, sharpening,
 shift, chroma vs. luma shift, ...
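
 The meaning of the 1.0 point can be sketched as follows: floating-point
 filter weights are scaled so that the integer coefficients of one output
 sample sum to exactly the chosen 1.0 value (1 << 14 for horizontal,
 1 << 12 for vertical filters). This is a simplified illustration, not the
 method used by initFilter(); normalize_filter() is a hypothetical name, and
 the weights are assumed to have a nonzero sum and to fit into int16_t after
 scaling.

```c
#include <stdint.h>

/*
 * Convert n floating-point weights into fixed-point coefficients that sum
 * to exactly `one`. The rounding error is added to the largest coefficient.
 */
static void normalize_filter(const double *w, int16_t *coeff, int n, int one)
{
    double sum = 0.0;
    int total = 0;
    int largest = 0;

    for (int i = 0; i < n; i++)
        sum += w[i];

    for (int i = 0; i < n; i++) {
        double v = w[i] / sum * one;
        int c = (int)(v >= 0 ? v + 0.5 : v - 0.5);  /* round to nearest */

        coeff[i] = (int16_t)c;
        total += c;
        if (coeff[i] > coeff[largest])
            largest = i;
    }

    /* Make the coefficients sum to exactly 1.0 in fixed point. */
    coeff[largest] = (int16_t)(coeff[largest] + (one - total));
}
```

 Keeping the sum exact means a flat input area stays flat after scaling,
 and negative weights (as in bicubic or sharpening filters) are handled as
 long as the headroom of the 16-bit coefficients is not exceeded.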