Builder ffmpeg64-solaris10-sparc Build #13610

Results:

Failed shell_2 shell_3 shell_4 shell_5

SourceStamp:

Project ffmpeg
Repository https://git.ffmpeg.org/ffmpeg.git
Branch master
Revision cc3ca1712760ae53957a3a5987cb7c61c290a451
Got Revision cc3ca1712760ae53957a3a5987cb7c61c290a451
Changes 18 changes

BuildSlave:

unstable10s

Reason:

The SingleBranchScheduler scheduler named 'schedule-ffmpeg64-solaris10-sparc' triggered this build

Steps and Logfiles:

  1. git update ( 10 secs )
    1. stdio
  2. shell 'gsed -i ...' ( 0 secs )
    1. stdio
  3. shell_1 'gsed -i ...' ( 0 secs )
    1. stdio
  4. shell_2 'gsed -i ...' failed ( 0 secs )
    1. stdio
  5. shell_3 './configure --samples="../../../ffmpeg/fate-suite" ...' failed ( 8 secs )
    1. stdio
    2. config.log
  6. shell_4 'gmake fate-rsync' failed ( 0 secs )
    1. stdio
  7. shell_5 '../../../ffmpeg/fate.sh ../../../ffmpeg/fate_config_64.sh' failed ( 0 secs )
    1. stdio
    2. configure.log
    3. compile.log
    4. test.log

Build Properties:

Name Value Source
branch master Build
builddir /export/home/buildbot-unstable10s/slave/ffmpeg64-solaris10-sparc slave
buildername ffmpeg64-solaris10-sparc Builder
buildnumber 13610 Build
codebase Build
got_revision cc3ca1712760ae53957a3a5987cb7c61c290a451 Git
project ffmpeg Build
repository https://git.ffmpeg.org/ffmpeg.git Build
revision cc3ca1712760ae53957a3a5987cb7c61c290a451 Build
scheduler schedule-ffmpeg64-solaris10-sparc Scheduler
slavename unstable10s BuildSlave
workdir /export/home/buildbot-unstable10s/slave/ffmpeg64-solaris10-sparc slave (deprecated)

Forced Build Properties:

Name Label Value

Responsible Users:

  1. Andreas Rheinhardt

Timing:

Start Thu Apr 30 11:22:52 2026
End Thu Apr 30 11:23:12 2026
Elapsed 19 secs

All Changes:

  1. Change #265959

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:32
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision d46414b46becc927e89b7824424df9e34d05c8e7

    Comments

    avcodec/x86/qpeldsp: Simplify resetting output pointer
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
  2. Change #265960

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:32
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision d3bd1318b3ef38c34af52ee65fedc27e183f06d9

    Comments

    avcodec/x86/qpeldsp: Don't zero unnecessarily
    This value is write-only.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
  3. Change #265961

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:32
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 69906d31c51306f2b87868fce234c8385bf06cd7

    Comments

    avcodec/x86/qpeldsp_init: Don't use unnecessarily big stack buffer
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp_init.c
  4. Change #265962

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:32
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision cf79d8052d84c08dde399ee9c6bd1ce8e1ff47b7

    Comments

    avcodec/x86/qpeldsp_init: Specify alignment properly
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp_init.c
  5. Change #265963

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:32
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision c2685234a650d7a533cb7a72229caf2d48cab2e2

    Comments

    avcodec/x86/qpeldsp_init: Deduplicate 8x8 and 16x16 code
    Also split the big macro into smaller ones for the pure horizontal vs
    the pure vertical and the mixed directions.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp_init.c
  6. Change #265964

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:32
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 7b56259dd5bc3eb08c1dc2dfe6b31f71db160378

    Comments

    avcodec/x86/constants: Move ff_pw_{15,20} to qpeldsp.asm
    Only used there.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/constants.c
    • libavcodec/x86/constants.h
    • libavcodec/x86/qpeldsp.asm
  7. Change #265965

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision bcf7293a211b7c06e54d1c16a85f6c8e3826a7e8

    Comments

    avcodec/x86/qpeldsp: Remove unused declaration
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
  8. Change #265966

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 188df9549c5c120728b43f953ec281e67e4bb3c3

    Comments

    avcodec/x86/qpeldsp: Don't use too much stack
    We only need (SIZE+1)*SIZE words.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
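
The bound quoted in the comment for Change #265966, (SIZE+1)*SIZE words, can be checked with simple arithmetic. The sketch below assumes that SIZE is the block width (8 or 16) and that a "word" is 2 bytes, as in the usual x86 assembler convention; neither is stated in the commit message, and the helper name `scratch_words` is hypothetical.

```python
WORD_BYTES = 2  # assumption: an x86 "word" is 16 bits

def scratch_words(size: int) -> int:
    """Number of words in a (size + 1) x size scratch buffer."""
    return (size + 1) * size

for size in (8, 16):
    words = scratch_words(size)
    print(f"SIZE={size}: {words} words = {words * WORD_BYTES} bytes")
```

Under these assumptions the buffer comes to 72 words for SIZE=8 and 272 words for SIZE=16.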
  9. Change #265967

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 405465700cf8fd529f2dbbc0317eab6a9ede23f2

    Comments

    avcodec/x86/qpeldsp: Don't allocate stack unnecessarily
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
  10. Change #265968

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 9beecb26704e8d9a4a27c07fd8da05eb94cf45ed

    Comments

    avcodec/x86/qpeldsp: Add SSE2 vertical lowpass functions
    Benchmarks ([4], [8] and [12] are pure vertical functions
    and therefore show the biggest improvements):
    
    avg_qpel_pixels_tab[0][4]_c:                           844.5 ( 1.00x)
    avg_qpel_pixels_tab[0][4]_mmxext:                      225.5 ( 3.74x)
    avg_qpel_pixels_tab[0][4]_sse2:                        146.6 ( 5.76x)
    avg_qpel_pixels_tab[0][5]_c:                          1915.9 ( 1.00x)
    avg_qpel_pixels_tab[0][5]_mmxext:                      499.6 ( 3.83x)
    avg_qpel_pixels_tab[0][5]_sse2:                        405.5 ( 4.72x)
    avg_qpel_pixels_tab[0][6]_c:                          1775.9 ( 1.00x)
    avg_qpel_pixels_tab[0][6]_mmxext:                      484.9 ( 3.66x)
    avg_qpel_pixels_tab[0][6]_sse2:                        385.4 ( 4.61x)
    avg_qpel_pixels_tab[0][7]_c:                          1937.0 ( 1.00x)
    avg_qpel_pixels_tab[0][7]_mmxext:                      501.3 ( 3.86x)
    avg_qpel_pixels_tab[0][7]_sse2:                        403.6 ( 4.80x)
    avg_qpel_pixels_tab[0][8]_c:                           976.7 ( 1.00x)
    avg_qpel_pixels_tab[0][8]_mmxext:                      216.9 ( 4.50x)
    avg_qpel_pixels_tab[0][8]_sse2:                        113.1 ( 8.64x)
    avg_qpel_pixels_tab[0][9]_c:                          1971.8 ( 1.00x)
    avg_qpel_pixels_tab[0][9]_mmxext:                      494.9 ( 3.98x)
    avg_qpel_pixels_tab[0][9]_sse2:                        388.3 ( 5.08x)
    avg_qpel_pixels_tab[0][10]_c:                         1900.8 ( 1.00x)
    avg_qpel_pixels_tab[0][10]_mmxext:                     476.4 ( 3.99x)
    avg_qpel_pixels_tab[0][10]_sse2:                       362.4 ( 5.24x)
    avg_qpel_pixels_tab[0][11]_c:                         2003.3 ( 1.00x)
    avg_qpel_pixels_tab[0][11]_mmxext:                     496.5 ( 4.04x)
    avg_qpel_pixels_tab[0][11]_sse2:                       385.9 ( 5.19x)
    avg_qpel_pixels_tab[0][12]_c:                          841.8 ( 1.00x)
    avg_qpel_pixels_tab[0][12]_mmxext:                     226.7 ( 3.71x)
    avg_qpel_pixels_tab[0][12]_sse2:                       143.3 ( 5.87x)
    avg_qpel_pixels_tab[0][13]_c:                         1929.0 ( 1.00x)
    avg_qpel_pixels_tab[0][13]_mmxext:                     499.6 ( 3.86x)
    avg_qpel_pixels_tab[0][13]_sse2:                       412.1 ( 4.68x)
    avg_qpel_pixels_tab[0][14]_c:                         1777.9 ( 1.00x)
    avg_qpel_pixels_tab[0][14]_mmxext:                     484.8 ( 3.67x)
    avg_qpel_pixels_tab[0][14]_sse2:                       385.9 ( 4.61x)
    avg_qpel_pixels_tab[0][15]_c:                         1914.8 ( 1.00x)
    avg_qpel_pixels_tab[0][15]_mmxext:                     501.8 ( 3.82x)
    avg_qpel_pixels_tab[0][15]_sse2:                       405.0 ( 4.73x)
    avg_qpel_pixels_tab[1][4]_c:                           203.4 ( 1.00x)
    avg_qpel_pixels_tab[1][4]_mmxext:                       64.7 ( 3.14x)
    avg_qpel_pixels_tab[1][4]_sse2:                         40.3 ( 5.05x)
    avg_qpel_pixels_tab[1][5]_c:                           488.8 ( 1.00x)
    avg_qpel_pixels_tab[1][5]_mmxext:                      134.6 ( 3.63x)
    avg_qpel_pixels_tab[1][5]_sse2:                        108.5 ( 4.50x)
    avg_qpel_pixels_tab[1][6]_c:                           448.2 ( 1.00x)
    avg_qpel_pixels_tab[1][6]_mmxext:                      128.8 ( 3.48x)
    avg_qpel_pixels_tab[1][6]_sse2:                        102.5 ( 4.37x)
    avg_qpel_pixels_tab[1][7]_c:                           489.6 ( 1.00x)
    avg_qpel_pixels_tab[1][7]_mmxext:                      134.5 ( 3.64x)
    avg_qpel_pixels_tab[1][7]_sse2:                        108.8 ( 4.50x)
    avg_qpel_pixels_tab[1][8]_c:                           223.8 ( 1.00x)
    avg_qpel_pixels_tab[1][8]_mmxext:                       57.5 ( 3.89x)
    avg_qpel_pixels_tab[1][8]_sse2:                         36.3 ( 6.16x)
    avg_qpel_pixels_tab[1][9]_c:                           496.6 ( 1.00x)
    avg_qpel_pixels_tab[1][9]_mmxext:                      129.8 ( 3.82x)
    avg_qpel_pixels_tab[1][9]_sse2:                        105.1 ( 4.72x)
    avg_qpel_pixels_tab[1][10]_c:                          466.1 ( 1.00x)
    avg_qpel_pixels_tab[1][10]_mmxext:                     123.2 ( 3.78x)
    avg_qpel_pixels_tab[1][10]_sse2:                        99.1 ( 4.70x)
    avg_qpel_pixels_tab[1][11]_c:                          497.9 ( 1.00x)
    avg_qpel_pixels_tab[1][11]_mmxext:                     129.9 ( 3.83x)
    avg_qpel_pixels_tab[1][11]_sse2:                       105.4 ( 4.72x)
    avg_qpel_pixels_tab[1][12]_c:                          203.5 ( 1.00x)
    avg_qpel_pixels_tab[1][12]_mmxext:                      63.8 ( 3.19x)
    avg_qpel_pixels_tab[1][12]_sse2:                        38.8 ( 5.25x)
    avg_qpel_pixels_tab[1][13]_c:                          487.9 ( 1.00x)
    avg_qpel_pixels_tab[1][13]_mmxext:                     134.7 ( 3.62x)
    avg_qpel_pixels_tab[1][13]_sse2:                       108.4 ( 4.50x)
    avg_qpel_pixels_tab[1][14]_c:                          447.4 ( 1.00x)
    avg_qpel_pixels_tab[1][14]_mmxext:                     128.2 ( 3.49x)
    avg_qpel_pixels_tab[1][14]_sse2:                       102.4 ( 4.37x)
    avg_qpel_pixels_tab[1][15]_c:                          487.5 ( 1.00x)
    avg_qpel_pixels_tab[1][15]_mmxext:                     134.0 ( 3.64x)
    avg_qpel_pixels_tab[1][15]_sse2:                       109.9 ( 4.44x)
    
    put_no_rnd_qpel_pixels_tab[0][4]_c:                    825.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][4]_mmxext:               242.5 ( 3.40x)
    put_no_rnd_qpel_pixels_tab[0][4]_sse2:                 136.0 ( 6.07x)
    put_no_rnd_qpel_pixels_tab[0][5]_c:                   1837.4 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][5]_mmxext:               542.5 ( 3.39x)
    put_no_rnd_qpel_pixels_tab[0][5]_sse2:                 446.5 ( 4.11x)
    put_no_rnd_qpel_pixels_tab[0][6]_c:                   1766.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][6]_mmxext:               493.6 ( 3.58x)
    put_no_rnd_qpel_pixels_tab[0][6]_sse2:                 394.6 ( 4.48x)
    put_no_rnd_qpel_pixels_tab[0][7]_c:                   1877.4 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][7]_mmxext:               541.9 ( 3.46x)
    put_no_rnd_qpel_pixels_tab[0][7]_sse2:                 447.6 ( 4.19x)
    put_no_rnd_qpel_pixels_tab[0][8]_c:                    785.1 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][8]_mmxext:               206.2 ( 3.81x)
    put_no_rnd_qpel_pixels_tab[0][8]_sse2:                 101.6 ( 7.73x)
    put_no_rnd_qpel_pixels_tab[0][9]_c:                   1772.2 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][9]_mmxext:               489.5 ( 3.62x)
    put_no_rnd_qpel_pixels_tab[0][9]_sse2:                 394.8 ( 4.49x)
    put_no_rnd_qpel_pixels_tab[0][10]_c:                  1711.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][10]_mmxext:              461.2 ( 3.71x)
    put_no_rnd_qpel_pixels_tab[0][10]_sse2:                357.9 ( 4.78x)
    put_no_rnd_qpel_pixels_tab[0][11]_c:                  1815.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][11]_mmxext:              490.8 ( 3.70x)
    put_no_rnd_qpel_pixels_tab[0][11]_sse2:                394.0 ( 4.61x)
    put_no_rnd_qpel_pixels_tab[0][12]_c:                   824.8 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][12]_mmxext:              242.9 ( 3.40x)
    put_no_rnd_qpel_pixels_tab[0][12]_sse2:                135.3 ( 6.10x)
    put_no_rnd_qpel_pixels_tab[0][13]_c:                  1843.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][13]_mmxext:              545.4 ( 3.38x)
    put_no_rnd_qpel_pixels_tab[0][13]_sse2:                444.9 ( 4.14x)
    put_no_rnd_qpel_pixels_tab[0][14]_c:                  1758.1 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][14]_mmxext:              497.7 ( 3.53x)
    put_no_rnd_qpel_pixels_tab[0][14]_sse2:                393.5 ( 4.47x)
    put_no_rnd_qpel_pixels_tab[0][15]_c:                  1861.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][15]_mmxext:              545.0 ( 3.42x)
    put_no_rnd_qpel_pixels_tab[0][15]_sse2:                445.7 ( 4.18x)
    put_no_rnd_qpel_pixels_tab[1][4]_c:                    198.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][4]_mmxext:                64.3 ( 3.08x)
    put_no_rnd_qpel_pixels_tab[1][4]_sse2:                  39.8 ( 4.98x)
    put_no_rnd_qpel_pixels_tab[1][5]_c:                    460.7 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][5]_mmxext:               137.2 ( 3.36x)
    put_no_rnd_qpel_pixels_tab[1][5]_sse2:                 113.5 ( 4.06x)
    put_no_rnd_qpel_pixels_tab[1][6]_c:                    441.4 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][6]_mmxext:               126.7 ( 3.49x)
    put_no_rnd_qpel_pixels_tab[1][6]_sse2:                 103.7 ( 4.26x)
    put_no_rnd_qpel_pixels_tab[1][7]_c:                    465.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][7]_mmxext:               137.7 ( 3.38x)
    put_no_rnd_qpel_pixels_tab[1][7]_sse2:                 114.0 ( 4.09x)
    put_no_rnd_qpel_pixels_tab[1][8]_c:                    193.8 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][8]_mmxext:                52.1 ( 3.72x)
    put_no_rnd_qpel_pixels_tab[1][8]_sse2:                  27.8 ( 6.97x)
    put_no_rnd_qpel_pixels_tab[1][9]_c:                    450.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][9]_mmxext:               126.2 ( 3.57x)
    put_no_rnd_qpel_pixels_tab[1][9]_sse2:                 104.3 ( 4.32x)
    put_no_rnd_qpel_pixels_tab[1][10]_c:                   436.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][10]_mmxext:              118.1 ( 3.69x)
    put_no_rnd_qpel_pixels_tab[1][10]_sse2:                 92.4 ( 4.73x)
    put_no_rnd_qpel_pixels_tab[1][11]_c:                   453.6 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][11]_mmxext:              128.7 ( 3.52x)
    put_no_rnd_qpel_pixels_tab[1][11]_sse2:                103.6 ( 4.38x)
    put_no_rnd_qpel_pixels_tab[1][12]_c:                   201.2 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][12]_mmxext:               64.2 ( 3.13x)
    put_no_rnd_qpel_pixels_tab[1][12]_sse2:                 39.6 ( 5.08x)
    put_no_rnd_qpel_pixels_tab[1][13]_c:                   461.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][13]_mmxext:              137.6 ( 3.36x)
    put_no_rnd_qpel_pixels_tab[1][13]_sse2:                113.4 ( 4.07x)
    put_no_rnd_qpel_pixels_tab[1][14]_c:                   442.6 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][14]_mmxext:              127.0 ( 3.49x)
    put_no_rnd_qpel_pixels_tab[1][14]_sse2:                102.2 ( 4.33x)
    put_no_rnd_qpel_pixels_tab[1][15]_c:                   462.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][15]_mmxext:              139.5 ( 3.32x)
    put_no_rnd_qpel_pixels_tab[1][15]_sse2:                113.3 ( 4.09x)
    
    put_qpel_pixels_tab[0][4]_c:                           824.6 ( 1.00x)
    put_qpel_pixels_tab[0][4]_mmxext:                      220.1 ( 3.75x)
    put_qpel_pixels_tab[0][4]_sse2:                        137.8 ( 5.98x)
    put_qpel_pixels_tab[0][5]_c:                          1892.0 ( 1.00x)
    put_qpel_pixels_tab[0][5]_mmxext:                      508.0 ( 3.72x)
    put_qpel_pixels_tab[0][5]_sse2:                        408.6 ( 4.63x)
    put_qpel_pixels_tab[0][6]_c:                          1758.0 ( 1.00x)
    put_qpel_pixels_tab[0][6]_mmxext:                      476.7 ( 3.69x)
    put_qpel_pixels_tab[0][6]_sse2:                        381.4 ( 4.61x)
    put_qpel_pixels_tab[0][7]_c:                          1924.3 ( 1.00x)
    put_qpel_pixels_tab[0][7]_mmxext:                      495.1 ( 3.89x)
    put_qpel_pixels_tab[0][7]_sse2:                        417.2 ( 4.61x)
    put_qpel_pixels_tab[0][8]_c:                           772.1 ( 1.00x)
    put_qpel_pixels_tab[0][8]_mmxext:                      197.5 ( 3.91x)
    put_qpel_pixels_tab[0][8]_sse2:                        118.4 ( 6.52x)
    put_qpel_pixels_tab[0][9]_c:                          1778.2 ( 1.00x)
    put_qpel_pixels_tab[0][9]_mmxext:                      476.7 ( 3.73x)
    put_qpel_pixels_tab[0][9]_sse2:                        379.6 ( 4.68x)
    put_qpel_pixels_tab[0][10]_c:                         1714.6 ( 1.00x)
    put_qpel_pixels_tab[0][10]_mmxext:                     460.7 ( 3.72x)
    put_qpel_pixels_tab[0][10]_sse2:                       386.8 ( 4.43x)
    put_qpel_pixels_tab[0][11]_c:                         1819.1 ( 1.00x)
    put_qpel_pixels_tab[0][11]_mmxext:                     474.9 ( 3.83x)
    put_qpel_pixels_tab[0][11]_sse2:                       404.5 ( 4.50x)
    put_qpel_pixels_tab[0][12]_c:                          829.7 ( 1.00x)
    put_qpel_pixels_tab[0][12]_mmxext:                     221.5 ( 3.75x)
    put_qpel_pixels_tab[0][12]_sse2:                       138.7 ( 5.98x)
    put_qpel_pixels_tab[0][13]_c:                         1892.8 ( 1.00x)
    put_qpel_pixels_tab[0][13]_mmxext:                     494.4 ( 3.83x)
    put_qpel_pixels_tab[0][13]_sse2:                       413.9 ( 4.57x)
    put_qpel_pixels_tab[0][14]_c:                         1763.1 ( 1.00x)
    put_qpel_pixels_tab[0][14]_mmxext:                     473.4 ( 3.72x)
    put_qpel_pixels_tab[0][14]_sse2:                       377.8 ( 4.67x)
    put_qpel_pixels_tab[0][15]_c:                         1896.4 ( 1.00x)
    put_qpel_pixels_tab[0][15]_mmxext:                     492.5 ( 3.85x)
    put_qpel_pixels_tab[0][15]_sse2:                       399.0 ( 4.75x)
    put_qpel_pixels_tab[1][4]_c:                           198.6 ( 1.00x)
    put_qpel_pixels_tab[1][4]_mmxext:                       60.9 ( 3.26x)
    put_qpel_pixels_tab[1][4]_sse2:                         40.1 ( 4.95x)
    put_qpel_pixels_tab[1][5]_c:                           471.4 ( 1.00x)
    put_qpel_pixels_tab[1][5]_mmxext:                      131.8 ( 3.58x)
    put_qpel_pixels_tab[1][5]_sse2:                        107.2 ( 4.40x)
    put_qpel_pixels_tab[1][6]_c:                           440.3 ( 1.00x)
    put_qpel_pixels_tab[1][6]_mmxext:                      126.3 ( 3.49x)
    put_qpel_pixels_tab[1][6]_sse2:                        100.6 ( 4.38x)
    put_qpel_pixels_tab[1][7]_c:                           469.2 ( 1.00x)
    put_qpel_pixels_tab[1][7]_mmxext:                      131.7 ( 3.56x)
    put_qpel_pixels_tab[1][7]_sse2:                        106.9 ( 4.39x)
    put_qpel_pixels_tab[1][8]_c:                           194.2 ( 1.00x)
    put_qpel_pixels_tab[1][8]_mmxext:                       52.9 ( 3.67x)
    put_qpel_pixels_tab[1][8]_sse2:                         28.0 ( 6.95x)
    put_qpel_pixels_tab[1][9]_c:                           464.6 ( 1.00x)
    put_qpel_pixels_tab[1][9]_mmxext:                      125.1 ( 3.71x)
    put_qpel_pixels_tab[1][9]_sse2:                        100.9 ( 4.60x)
    put_qpel_pixels_tab[1][10]_c:                          433.8 ( 1.00x)
    put_qpel_pixels_tab[1][10]_mmxext:                     118.2 ( 3.67x)
    put_qpel_pixels_tab[1][10]_sse2:                        94.5 ( 4.59x)
    put_qpel_pixels_tab[1][11]_c:                          463.9 ( 1.00x)
    put_qpel_pixels_tab[1][11]_mmxext:                     125.5 ( 3.70x)
    put_qpel_pixels_tab[1][11]_sse2:                       102.6 ( 4.52x)
    put_qpel_pixels_tab[1][12]_c:                          199.2 ( 1.00x)
    put_qpel_pixels_tab[1][12]_mmxext:                      63.7 ( 3.12x)
    put_qpel_pixels_tab[1][12]_sse2:                        36.2 ( 5.50x)
    put_qpel_pixels_tab[1][13]_c:                          475.6 ( 1.00x)
    put_qpel_pixels_tab[1][13]_mmxext:                     139.5 ( 3.41x)
    put_qpel_pixels_tab[1][13]_sse2:                       107.3 ( 4.43x)
    put_qpel_pixels_tab[1][14]_c:                          441.9 ( 1.00x)
    put_qpel_pixels_tab[1][14]_mmxext:                     126.9 ( 3.48x)
    put_qpel_pixels_tab[1][14]_sse2:                       101.3 ( 4.36x)
    put_qpel_pixels_tab[1][15]_c:                          475.9 ( 1.00x)
    put_qpel_pixels_tab[1][15]_mmxext:                     131.9 ( 3.61x)
    put_qpel_pixels_tab[1][15]_sse2:                       107.0 ( 4.45x)
    
    The new functions (in qpeldsp.asm) occupy 8244B (the MMXEXT functions
    which they will replace occupy only 6720B).
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c
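
Each benchmark line in the comment for Change #265968 has the form `function_impl: time ( ratio x)`, where the ratio appears to be the `_c` timing divided by the timing of the listed implementation. A minimal sketch for parsing such lines and recomputing the ratios follows; the regular expression and the helper names `parse` and `speedups` are illustrative assumptions, not part of FFmpeg or Buildbot. Recomputed ratios can differ from the listed ones in the last digit, presumably because the printed timings are already rounded.

```python
import re

# Matches lines such as:
#   "avg_qpel_pixels_tab[0][4]_sse2:      146.6 ( 5.76x)"
LINE_RE = re.compile(
    r"^\s*(?P<func>\w+(?:\[\d+\])+)_(?P<impl>[a-z0-9]+):"
    r"\s+(?P<time>\d+(?:\.\d+)?)\s+\(\s*(?P<ratio>\d+(?:\.\d+)?)x\)\s*$"
)

def parse(lines):
    """Return {function: {implementation: time}} for every matching line."""
    table = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            table.setdefault(m["func"], {})[m["impl"]] = float(m["time"])
    return table

def speedups(table, baseline="c"):
    """Speedup of each implementation relative to the baseline timing."""
    return {
        func: {impl: round(times[baseline] / t, 2) for impl, t in times.items()}
        for func, times in table.items()
        if baseline in times
    }

sample = [
    "avg_qpel_pixels_tab[0][4]_c:                           844.5 ( 1.00x)",
    "avg_qpel_pixels_tab[0][4]_mmxext:                      225.5 ( 3.74x)",
    "avg_qpel_pixels_tab[0][4]_sse2:                        146.6 ( 5.76x)",
]
result = speedups(parse(sample))
print(result)
```

Lines that carry a suffix such as `(old)` after the implementation name are deliberately not matched by this pattern and would need an extended expression.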
  11. Change #265969

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision dad0c010761cfaf98ddebdb188300f117c370295

    Comments

    avcodec/x86/qpeldsp: Remove vertical MMXEXT mc functions
    Superseded by SSE2.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c
  12. Change #265970

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision a3d747f3446e828c5e03880d0459d51994f6ec15

    Comments

    avcodec/x86/qpeldsp{,_init}: Use SSE2 pixels16x16_l2 functions
    put and avg versions have been added and used in H264
    in b91081274f5a5b5f0f1ce820331f702378a425e8. This commit
    adds the size 16 version of put_no_rnd and uses all three
    of them in the SSE2 size 16 qpel functions (i.e. it uses
    them in the ones that have a vertical component); it also
    removes the 16x17 MMXEXT versions (which are no longer used).
    
    This is particularly beneficial for put_no_rnd:
    avg_qpel_pixels_tab[0][5]_c:                          1910.9 ( 1.00x)
    avg_qpel_pixels_tab[0][5]_sse2 (old):                  405.1 ( 4.72x)
    avg_qpel_pixels_tab[0][5]_sse2:                        392.9 ( 4.86x)
    avg_qpel_pixels_tab[0][6]_c:                          1778.9 ( 1.00x)
    avg_qpel_pixels_tab[0][6]_sse2 (old):                  385.5 ( 4.61x)
    avg_qpel_pixels_tab[0][6]_sse2:                        374.9 ( 4.75x)
    avg_qpel_pixels_tab[0][7]_c:                          1935.3 ( 1.00x)
    avg_qpel_pixels_tab[0][7]_sse2 (old):                  403.1 ( 4.80x)
    avg_qpel_pixels_tab[0][7]_sse2:                        391.6 ( 4.94x)
    avg_qpel_pixels_tab[0][9]_c:                          1969.0 ( 1.00x)
    avg_qpel_pixels_tab[0][9]_sse2 (old):                  384.1 ( 5.13x)
    avg_qpel_pixels_tab[0][9]_sse2:                        380.3 ( 5.18x)
    avg_qpel_pixels_tab[0][11]_c:                         2014.9 ( 1.00x)
    avg_qpel_pixels_tab[0][11]_sse2 (old):                 385.6 ( 5.23x)
    avg_qpel_pixels_tab[0][11]_sse2:                       380.2 ( 5.30x)
    avg_qpel_pixels_tab[0][13]_c:                         1925.7 ( 1.00x)
    avg_qpel_pixels_tab[0][13]_sse2 (old):                 406.1 ( 4.74x)
    avg_qpel_pixels_tab[0][13]_sse2:                       390.4 ( 4.93x)
    avg_qpel_pixels_tab[0][14]_c:                         1793.0 ( 1.00x)
    avg_qpel_pixels_tab[0][14]_sse2 (old):                 389.6 ( 4.60x)
    avg_qpel_pixels_tab[0][14]_sse2:                       377.1 ( 4.75x)
    avg_qpel_pixels_tab[0][15]_c:                         1913.0 ( 1.00x)
    avg_qpel_pixels_tab[0][15]_sse2 (old):                 404.2 ( 4.73x)
    avg_qpel_pixels_tab[0][15]_sse2:                       390.8 ( 4.89x)
    put_no_rnd_qpel_pixels_tab[0][5]_c:                   1864.1 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][5]_sse2 (old):           425.6 ( 4.38x)
    put_no_rnd_qpel_pixels_tab[0][5]_sse2:                 396.2 ( 4.71x)
    put_no_rnd_qpel_pixels_tab[0][6]_c:                   1767.1 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][6]_sse2 (old):           388.4 ( 4.55x)
    put_no_rnd_qpel_pixels_tab[0][6]_sse2:                 377.7 ( 4.68x)
    put_no_rnd_qpel_pixels_tab[0][7]_c:                   1874.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][7]_sse2 (old):           427.6 ( 4.38x)
    put_no_rnd_qpel_pixels_tab[0][7]_sse2:                 400.0 ( 4.69x)
    put_no_rnd_qpel_pixels_tab[0][9]_c:                   1759.7 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][9]_sse2 (old):           393.0 ( 4.48x)
    put_no_rnd_qpel_pixels_tab[0][9]_sse2:                 379.7 ( 4.63x)
    put_no_rnd_qpel_pixels_tab[0][11]_c:                  1820.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][11]_sse2 (old):          392.7 ( 4.64x)
    put_no_rnd_qpel_pixels_tab[0][11]_sse2:                377.4 ( 4.82x)
    put_no_rnd_qpel_pixels_tab[0][13]_c:                  1841.2 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][13]_sse2 (old):          427.1 ( 4.31x)
    put_no_rnd_qpel_pixels_tab[0][13]_sse2:                395.9 ( 4.65x)
    put_no_rnd_qpel_pixels_tab[0][14]_c:                  1761.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][14]_sse2 (old):          392.3 ( 4.49x)
    put_no_rnd_qpel_pixels_tab[0][14]_sse2:                375.9 ( 4.69x)
    put_no_rnd_qpel_pixels_tab[0][15]_c:                  1869.1 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][15]_sse2 (old):          425.6 ( 4.39x)
    put_no_rnd_qpel_pixels_tab[0][15]_sse2:                397.3 ( 4.70x)
    put_qpel_pixels_tab[0][5]_c:                          1888.2 ( 1.00x)
    put_qpel_pixels_tab[0][5]_sse2 (old):                  396.5 ( 4.76x)
    put_qpel_pixels_tab[0][5]_sse2:                        382.5 ( 4.94x)
    put_qpel_pixels_tab[0][6]_c:                          1760.4 ( 1.00x)
    put_qpel_pixels_tab[0][6]_sse2 (old):                  377.0 ( 4.67x)
    put_qpel_pixels_tab[0][6]_sse2:                        372.1 ( 4.73x)
    put_qpel_pixels_tab[0][7]_c:                          1927.6 ( 1.00x)
    put_qpel_pixels_tab[0][7]_sse2 (old):                  396.5 ( 4.86x)
    put_qpel_pixels_tab[0][7]_sse2:                        383.4 ( 5.03x)
    put_qpel_pixels_tab[0][9]_c:                          1775.9 ( 1.00x)
    put_qpel_pixels_tab[0][9]_sse2 (old):                  377.9 ( 4.70x)
    put_qpel_pixels_tab[0][9]_sse2:                        372.3 ( 4.77x)
    put_qpel_pixels_tab[0][11]_c:                         1809.0 ( 1.00x)
    put_qpel_pixels_tab[0][11]_sse2 (old):                 374.6 ( 4.83x)
    put_qpel_pixels_tab[0][11]_sse2:                       380.3 ( 4.76x)
    put_qpel_pixels_tab[0][13]_c:                         1893.2 ( 1.00x)
    put_qpel_pixels_tab[0][13]_sse2 (old):                 399.2 ( 4.74x)
    put_qpel_pixels_tab[0][13]_sse2:                       384.7 ( 4.92x)
    put_qpel_pixels_tab[0][14]_c:                         1756.2 ( 1.00x)
    put_qpel_pixels_tab[0][14]_sse2 (old):                 377.9 ( 4.65x)
    put_qpel_pixels_tab[0][14]_sse2:                       374.4 ( 4.69x)
    put_qpel_pixels_tab[0][15]_c:                         1922.8 ( 1.00x)
    put_qpel_pixels_tab[0][15]_sse2 (old):                 399.0 ( 4.82x)
    put_qpel_pixels_tab[0][15]_sse2:                       387.8 ( 4.96x)
    
    The purely vertical size 16 mc functions now no longer use any MMX.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpel.asm
    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c
  13. Change #265971

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision c0e1c1d6b3245a5bf46b5cb5c22cd16a9138a21b

    Comments

    avcodec/x86/qpeldsp: Add SSSE3 size 16 horizontal filter
    Beats the mmxext version by a lot (in the following,
    [0][1-3] refers to horizontal-only size 16 mc;
    the _sse2 comparators for the other cases use mmxext
    horizontal mc coupled with vertical SSE2 mc):
    
    avg_qpel_pixels_tab[0][1]_c:                           945.5 ( 1.00x)
    avg_qpel_pixels_tab[0][1]_mmxext:                      262.6 ( 3.60x)
    avg_qpel_pixels_tab[0][1]_ssse3:                       110.4 ( 8.57x)
    avg_qpel_pixels_tab[0][2]_c:                          1042.1 ( 1.00x)
    avg_qpel_pixels_tab[0][2]_mmxext:                      245.1 ( 4.25x)
    avg_qpel_pixels_tab[0][2]_ssse3:                        91.7 (11.37x)
    avg_qpel_pixels_tab[0][3]_c:                           941.8 ( 1.00x)
    avg_qpel_pixels_tab[0][3]_mmxext:                      260.1 ( 3.62x)
    avg_qpel_pixels_tab[0][3]_ssse3:                       110.1 ( 8.56x)
    avg_qpel_pixels_tab[0][5]_c:                          1939.5 ( 1.00x)
    avg_qpel_pixels_tab[0][5]_sse2:                        394.3 ( 4.92x)
    avg_qpel_pixels_tab[0][5]_ssse3:                       247.4 ( 7.84x)
    avg_qpel_pixels_tab[0][6]_c:                          1785.8 ( 1.00x)
    avg_qpel_pixels_tab[0][6]_sse2:                        380.6 ( 4.69x)
    avg_qpel_pixels_tab[0][6]_ssse3:                       221.1 ( 8.08x)
    avg_qpel_pixels_tab[0][7]_c:                          1932.5 ( 1.00x)
    avg_qpel_pixels_tab[0][7]_sse2:                        393.4 ( 4.91x)
    avg_qpel_pixels_tab[0][7]_ssse3:                       238.8 ( 8.09x)
    avg_qpel_pixels_tab[0][9]_c:                          1976.9 ( 1.00x)
    avg_qpel_pixels_tab[0][9]_sse2:                        380.8 ( 5.19x)
    avg_qpel_pixels_tab[0][9]_ssse3:                       223.3 ( 8.85x)
    avg_qpel_pixels_tab[0][10]_c:                         1911.9 ( 1.00x)
    avg_qpel_pixels_tab[0][10]_sse2:                       366.9 ( 5.21x)
    avg_qpel_pixels_tab[0][10]_ssse3:                      207.0 ( 9.24x)
    avg_qpel_pixels_tab[0][11]_c:                         2046.9 ( 1.00x)
    avg_qpel_pixels_tab[0][11]_sse2:                       385.5 ( 5.31x)
    avg_qpel_pixels_tab[0][11]_ssse3:                      227.9 ( 8.98x)
    avg_qpel_pixels_tab[0][13]_c:                         1940.8 ( 1.00x)
    avg_qpel_pixels_tab[0][13]_sse2:                       389.7 ( 4.98x)
    avg_qpel_pixels_tab[0][13]_ssse3:                      244.2 ( 7.95x)
    avg_qpel_pixels_tab[0][14]_c:                         1778.4 ( 1.00x)
    avg_qpel_pixels_tab[0][14]_sse2:                       379.2 ( 4.69x)
    avg_qpel_pixels_tab[0][14]_ssse3:                      223.5 ( 7.96x)
    avg_qpel_pixels_tab[0][15]_c:                         1905.9 ( 1.00x)
    avg_qpel_pixels_tab[0][15]_sse2:                       398.9 ( 4.78x)
    avg_qpel_pixels_tab[0][15]_ssse3:                      238.3 ( 8.00x)
    put_no_rnd_qpel_pixels_tab[0][1]_c:                    922.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][1]_mmxext:               275.0 ( 3.35x)
    put_no_rnd_qpel_pixels_tab[0][1]_ssse3:                108.4 ( 8.51x)
    put_no_rnd_qpel_pixels_tab[0][2]_c:                    889.7 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][2]_mmxext:               236.7 ( 3.76x)
    put_no_rnd_qpel_pixels_tab[0][2]_ssse3:                 86.8 (10.25x)
    put_no_rnd_qpel_pixels_tab[0][3]_c:                    915.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][3]_mmxext:               274.3 ( 3.34x)
    put_no_rnd_qpel_pixels_tab[0][3]_ssse3:                108.2 ( 8.46x)
    put_no_rnd_qpel_pixels_tab[0][5]_sse2:                 400.0 ( 4.63x)
    put_no_rnd_qpel_pixels_tab[0][5]_ssse3:                246.0 ( 7.53x)
    put_no_rnd_qpel_pixels_tab[0][6]_c:                   1753.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][6]_sse2:                 382.5 ( 4.59x)
    put_no_rnd_qpel_pixels_tab[0][6]_ssse3:                226.4 ( 7.75x)
    put_no_rnd_qpel_pixels_tab[0][7]_c:                   1854.6 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][7]_sse2:                 393.5 ( 4.71x)
    put_no_rnd_qpel_pixels_tab[0][7]_ssse3:                248.6 ( 7.46x)
    put_no_rnd_qpel_pixels_tab[0][9]_c:                   1794.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][9]_sse2:                 382.2 ( 4.70x)
    put_no_rnd_qpel_pixels_tab[0][9]_ssse3:                228.0 ( 7.87x)
    put_no_rnd_qpel_pixels_tab[0][10]_c:                  1724.7 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][10]_sse2:                353.8 ( 4.88x)
    put_no_rnd_qpel_pixels_tab[0][10]_ssse3:               206.5 ( 8.35x)
    put_no_rnd_qpel_pixels_tab[0][11]_c:                  1796.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][11]_sse2:                378.1 ( 4.75x)
    put_no_rnd_qpel_pixels_tab[0][11]_ssse3:               227.1 ( 7.91x)
    put_no_rnd_qpel_pixels_tab[0][13]_c:                  1834.4 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][13]_sse2:                400.7 ( 4.58x)
    put_no_rnd_qpel_pixels_tab[0][13]_ssse3:               244.2 ( 7.51x)
    put_no_rnd_qpel_pixels_tab[0][14]_c:                  1755.7 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][14]_sse2:                387.2 ( 4.53x)
    put_no_rnd_qpel_pixels_tab[0][14]_ssse3:               226.8 ( 7.74x)
    put_no_rnd_qpel_pixels_tab[0][15]_c:                  1847.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[0][15]_sse2:                400.6 ( 4.61x)
    put_no_rnd_qpel_pixels_tab[0][15]_ssse3:               246.1 ( 7.51x)
    put_qpel_pixels_tab[0][1]_c:                           919.6 ( 1.00x)
    put_qpel_pixels_tab[0][1]_mmxext:                      255.5 ( 3.60x)
    put_qpel_pixels_tab[0][1]_ssse3:                       108.3 ( 8.49x)
    put_qpel_pixels_tab[0][2]_c:                           883.9 ( 1.00x)
    put_qpel_pixels_tab[0][2]_mmxext:                      238.1 ( 3.71x)
    put_qpel_pixels_tab[0][2]_ssse3:                        86.7 (10.19x)
    put_qpel_pixels_tab[0][3]_c:                           921.9 ( 1.00x)
    put_qpel_pixels_tab[0][3]_mmxext:                      258.9 ( 3.56x)
    put_qpel_pixels_tab[0][3]_ssse3:                       108.1 ( 8.53x)
    put_qpel_pixels_tab[0][5]_c:                          1907.5 ( 1.00x)
    put_qpel_pixels_tab[0][5]_sse2:                        384.2 ( 4.96x)
    put_qpel_pixels_tab[0][5]_ssse3:                       234.8 ( 8.13x)
    put_qpel_pixels_tab[0][6]_c:                          1757.4 ( 1.00x)
    put_qpel_pixels_tab[0][6]_sse2:                        382.8 ( 4.59x)
    put_qpel_pixels_tab[0][6]_ssse3:                       217.6 ( 8.08x)
    put_qpel_pixels_tab[0][7]_c:                          1927.5 ( 1.00x)
    put_qpel_pixels_tab[0][7]_sse2:                        384.6 ( 5.01x)
    put_qpel_pixels_tab[0][7]_ssse3:                       231.2 ( 8.34x)
    put_qpel_pixels_tab[0][9]_c:                          1832.1 ( 1.00x)
    put_qpel_pixels_tab[0][9]_sse2:                        374.8 ( 4.89x)
    put_qpel_pixels_tab[0][9]_ssse3:                       219.4 ( 8.35x)
    put_qpel_pixels_tab[0][10]_c:                         1710.3 ( 1.00x)
    put_qpel_pixels_tab[0][10]_sse2:                       384.5 ( 4.45x)
    put_qpel_pixels_tab[0][10]_ssse3:                      202.9 ( 8.43x)
    put_qpel_pixels_tab[0][11]_c:                         1825.0 ( 1.00x)
    put_qpel_pixels_tab[0][11]_sse2:                       369.6 ( 4.94x)
    put_qpel_pixels_tab[0][11]_ssse3:                      216.8 ( 8.42x)
    put_qpel_pixels_tab[0][13]_c:                         1898.4 ( 1.00x)
    put_qpel_pixels_tab[0][13]_sse2:                       384.9 ( 4.93x)
    put_qpel_pixels_tab[0][13]_ssse3:                      238.6 ( 7.96x)
    put_qpel_pixels_tab[0][14]_c:                         1779.1 ( 1.00x)
    put_qpel_pixels_tab[0][14]_sse2:                       373.3 ( 4.77x)
    put_qpel_pixels_tab[0][14]_ssse3:                      218.1 ( 8.16x)
    put_qpel_pixels_tab[0][15]_c:                         1918.2 ( 1.00x)
    put_qpel_pixels_tab[0][15]_sse2:                       385.3 ( 4.98x)
    put_qpel_pixels_tab[0][15]_ssse3:                      236.8 ( 8.10x)
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c
  14. Change #265972

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 1d040c527d5877ede8e54e0e4bb3f737b74f1f21

    Comments

    avcodec/x86/qpeldsp: Add SSSE3 size 8 horizontal filter
    Beats the mmxext version by a lot (in the following,
    [1][1-3] refers to horizontal-only size 8 mc;
    the _sse2 comparators for the other cases use mmxext
    horizontal mc coupled with vertical SSE2 mc):
    
    avg_qpel_pixels_tab[1][1]_c:                           223.9 ( 1.00x)
    avg_qpel_pixels_tab[1][1]_mmxext:                       66.2 ( 3.38x)
    avg_qpel_pixels_tab[1][1]_ssse3:                        36.8 ( 6.08x)
    avg_qpel_pixels_tab[1][2]_c:                           251.0 ( 1.00x)
    avg_qpel_pixels_tab[1][2]_mmxext:                       58.5 ( 4.29x)
    avg_qpel_pixels_tab[1][2]_ssse3:                        25.5 ( 9.84x)
    avg_qpel_pixels_tab[1][3]_c:                           226.9 ( 1.00x)
    avg_qpel_pixels_tab[1][3]_mmxext:                       66.3 ( 3.42x)
    avg_qpel_pixels_tab[1][3]_ssse3:                        35.8 ( 6.34x)
    avg_qpel_pixels_tab[1][5]_c:                           473.9 ( 1.00x)
    avg_qpel_pixels_tab[1][5]_sse2:                        110.7 ( 4.28x)
    avg_qpel_pixels_tab[1][5]_ssse3:                        76.0 ( 6.24x)
    avg_qpel_pixels_tab[1][6]_c:                           440.9 ( 1.00x)
    avg_qpel_pixels_tab[1][6]_sse2:                        102.1 ( 4.32x)
    avg_qpel_pixels_tab[1][6]_ssse3:                        67.1 ( 6.58x)
    avg_qpel_pixels_tab[1][7]_c:                           473.8 ( 1.00x)
    avg_qpel_pixels_tab[1][7]_sse2:                        108.0 ( 4.39x)
    avg_qpel_pixels_tab[1][7]_ssse3:                        74.6 ( 6.35x)
    avg_qpel_pixels_tab[1][9]_c:                           492.9 ( 1.00x)
    avg_qpel_pixels_tab[1][9]_sse2:                        102.1 ( 4.83x)
    avg_qpel_pixels_tab[1][9]_ssse3:                        67.1 ( 7.35x)
    avg_qpel_pixels_tab[1][10]_c:                          465.6 ( 1.00x)
    avg_qpel_pixels_tab[1][10]_sse2:                        94.9 ( 4.91x)
    avg_qpel_pixels_tab[1][10]_ssse3:                       57.5 ( 8.10x)
    avg_qpel_pixels_tab[1][11]_c:                          492.8 ( 1.00x)
    avg_qpel_pixels_tab[1][11]_sse2:                       102.4 ( 4.81x)
    avg_qpel_pixels_tab[1][11]_ssse3:                       68.7 ( 7.17x)
    avg_qpel_pixels_tab[1][13]_c:                          476.6 ( 1.00x)
    avg_qpel_pixels_tab[1][13]_sse2:                       108.6 ( 4.39x)
    avg_qpel_pixels_tab[1][13]_ssse3:                       74.7 ( 6.38x)
    avg_qpel_pixels_tab[1][14]_c:                          434.9 ( 1.00x)
    avg_qpel_pixels_tab[1][14]_sse2:                       102.2 ( 4.25x)
    avg_qpel_pixels_tab[1][14]_ssse3:                       66.6 ( 6.53x)
    avg_qpel_pixels_tab[1][15]_c:                          474.1 ( 1.00x)
    avg_qpel_pixels_tab[1][15]_sse2:                       107.9 ( 4.39x)
    avg_qpel_pixels_tab[1][15]_ssse3:                       74.3 ( 6.38x)
    put_no_rnd_qpel_pixels_tab[1][1]_c:                    222.1 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][1]_mmxext:                66.0 ( 3.37x)
    put_no_rnd_qpel_pixels_tab[1][1]_ssse3:                 35.2 ( 6.31x)
    put_no_rnd_qpel_pixels_tab[1][2]_c:                    212.2 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][2]_mmxext:                56.8 ( 3.74x)
    put_no_rnd_qpel_pixels_tab[1][2]_ssse3:                 25.0 ( 8.48x)
    put_no_rnd_qpel_pixels_tab[1][3]_c:                    224.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][3]_mmxext:                65.8 ( 3.41x)
    put_no_rnd_qpel_pixels_tab[1][3]_ssse3:                 35.8 ( 6.26x)
    put_no_rnd_qpel_pixels_tab[1][5]_c:                    460.1 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][5]_sse2:                 114.6 ( 4.01x)
    put_no_rnd_qpel_pixels_tab[1][5]_ssse3:                 83.1 ( 5.53x)
    put_no_rnd_qpel_pixels_tab[1][6]_c:                    438.6 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][6]_sse2:                 104.2 ( 4.21x)
    put_no_rnd_qpel_pixels_tab[1][6]_ssse3:                 67.5 ( 6.50x)
    put_no_rnd_qpel_pixels_tab[1][7]_c:                    458.0 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][7]_sse2:                 113.8 ( 4.02x)
    put_no_rnd_qpel_pixels_tab[1][7]_ssse3:                 79.9 ( 5.73x)
    put_no_rnd_qpel_pixels_tab[1][9]_c:                    439.0 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][9]_sse2:                 103.7 ( 4.23x)
    put_no_rnd_qpel_pixels_tab[1][9]_ssse3:                 68.9 ( 6.37x)
    put_no_rnd_qpel_pixels_tab[1][10]_c:                   427.0 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][10]_sse2:                 93.2 ( 4.58x)
    put_no_rnd_qpel_pixels_tab[1][10]_ssse3:                57.9 ( 7.37x)
    put_no_rnd_qpel_pixels_tab[1][11]_c:                   439.9 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][11]_sse2:                104.0 ( 4.23x)
    put_no_rnd_qpel_pixels_tab[1][11]_ssse3:                69.2 ( 6.36x)
    put_no_rnd_qpel_pixels_tab[1][13]_c:                   459.3 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][13]_sse2:                113.2 ( 4.06x)
    put_no_rnd_qpel_pixels_tab[1][13]_ssse3:                83.8 ( 5.48x)
    put_no_rnd_qpel_pixels_tab[1][14]_c:                   439.5 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][14]_sse2:                103.3 ( 4.25x)
    put_no_rnd_qpel_pixels_tab[1][14]_ssse3:                67.9 ( 6.47x)
    put_no_rnd_qpel_pixels_tab[1][15]_c:                   453.6 ( 1.00x)
    put_no_rnd_qpel_pixels_tab[1][15]_sse2:                113.7 ( 3.99x)
    put_no_rnd_qpel_pixels_tab[1][15]_ssse3:                80.0 ( 5.67x)
    put_qpel_pixels_tab[1][1]_c:                           229.0 ( 1.00x)
    put_qpel_pixels_tab[1][1]_mmxext:                       65.5 ( 3.50x)
    put_qpel_pixels_tab[1][1]_ssse3:                        33.8 ( 6.77x)
    put_qpel_pixels_tab[1][2]_c:                           212.5 ( 1.00x)
    put_qpel_pixels_tab[1][2]_mmxext:                       56.6 ( 3.75x)
    put_qpel_pixels_tab[1][2]_ssse3:                        23.4 ( 9.08x)
    put_qpel_pixels_tab[1][3]_c:                           227.5 ( 1.00x)
    put_qpel_pixels_tab[1][3]_mmxext:                       64.4 ( 3.53x)
    put_qpel_pixels_tab[1][3]_ssse3:                        33.5 ( 6.79x)
    put_qpel_pixels_tab[1][5]_c:                           466.5 ( 1.00x)
    put_qpel_pixels_tab[1][5]_sse2:                        106.8 ( 4.37x)
    put_qpel_pixels_tab[1][5]_ssse3:                        71.8 ( 6.50x)
    put_qpel_pixels_tab[1][6]_c:                           438.7 ( 1.00x)
    put_qpel_pixels_tab[1][6]_sse2:                        102.0 ( 4.30x)
    put_qpel_pixels_tab[1][6]_ssse3:                        65.3 ( 6.72x)
    put_qpel_pixels_tab[1][7]_c:                           466.0 ( 1.00x)
    put_qpel_pixels_tab[1][7]_sse2:                        106.3 ( 4.38x)
    put_qpel_pixels_tab[1][7]_ssse3:                        70.9 ( 6.57x)
    put_qpel_pixels_tab[1][9]_c:                           456.0 ( 1.00x)
    put_qpel_pixels_tab[1][9]_sse2:                        100.1 ( 4.55x)
    put_qpel_pixels_tab[1][9]_ssse3:                        64.0 ( 7.13x)
    put_qpel_pixels_tab[1][10]_c:                          425.1 ( 1.00x)
    put_qpel_pixels_tab[1][10]_sse2:                        92.6 ( 4.59x)
    put_qpel_pixels_tab[1][10]_ssse3:                       55.1 ( 7.71x)
    put_qpel_pixels_tab[1][11]_c:                          452.7 ( 1.00x)
    put_qpel_pixels_tab[1][11]_sse2:                        99.6 ( 4.55x)
    put_qpel_pixels_tab[1][11]_ssse3:                       63.8 ( 7.09x)
    put_qpel_pixels_tab[1][13]_c:                          471.2 ( 1.00x)
    put_qpel_pixels_tab[1][13]_sse2:                       106.4 ( 4.43x)
    put_qpel_pixels_tab[1][13]_ssse3:                       71.4 ( 6.60x)
    put_qpel_pixels_tab[1][14]_c:                          439.7 ( 1.00x)
    put_qpel_pixels_tab[1][14]_sse2:                       101.8 ( 4.32x)
    put_qpel_pixels_tab[1][14]_ssse3:                       64.8 ( 6.79x)
    put_qpel_pixels_tab[1][15]_c:                          467.8 ( 1.00x)
    put_qpel_pixels_tab[1][15]_sse2:                       106.1 ( 4.41x)
    put_qpel_pixels_tab[1][15]_ssse3:                       72.6 ( 6.44x)
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c
  15. Change #265973

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision f946cac2d9fd8a816a72fe4fad81587b14af53fc

    Comments

    avcodec/x86/qpeldsp: Remove horizontal mmxext mc functions
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpel.asm
    • libavcodec/x86/qpel.h
    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c
  16. Change #265974

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 23d3116af93027866689e52e50533cd3121679ab

    Comments

    avcodec/x86/qpeldsp: Add combination of h_lowpass + l2
    If the subpel part of the horizontal component of
    the motion vector is 1/4 or 3/4, the MPEG-4 qpel motion compensation
    first computes the mc for the corresponding motion vector
    with 1/2 horizontal subpel part and then averages this
    with the left (for 1/4) or the right (for 3/4) source pixel.
    These two stages are currently performed in two different functions,
    involving a stack buffer as intermediate.
    
    This means that horizontal prediction for every function with
    a 1/4 or 3/4 horizontal subpel mv is more expensive in code size
    (and also in performance), as it involves two calls. Given that
    the horizontal lowpass functions are not that long, adding combinations
    of h_lowpass+l2 actually reduces binary size: an increase of 1136B
    in the asm files is more than offset by size reductions in
    the wrappers (1968B here when not using stack protection,
    2256B when using stack protection).
    
    Of course it also improves performance. Old benchmarks:
    avg_qpel_pixels_tab[0][1]_ssse3:                       106.9 ( 8.69x)
    avg_qpel_pixels_tab[0][3]_ssse3:                       105.5 ( 8.84x)
    avg_qpel_pixels_tab[0][5]_ssse3:                       226.9 ( 8.57x)
    avg_qpel_pixels_tab[0][7]_ssse3:                       231.1 ( 8.38x)
    avg_qpel_pixels_tab[0][9]_ssse3:                       217.8 ( 9.04x)
    avg_qpel_pixels_tab[0][11]_ssse3:                      214.9 ( 9.32x)
    avg_qpel_pixels_tab[0][13]_ssse3:                      227.1 ( 8.48x)
    avg_qpel_pixels_tab[0][15]_ssse3:                      236.1 ( 8.02x)
    
    New benchmarks:
    avg_qpel_pixels_tab[0][1]_ssse3:                        96.7 ( 9.65x)
    avg_qpel_pixels_tab[0][3]_ssse3:                        96.6 ( 9.73x)
    avg_qpel_pixels_tab[0][5]_ssse3:                       225.8 ( 8.61x)
    avg_qpel_pixels_tab[0][7]_ssse3:                       228.4 ( 8.51x)
    avg_qpel_pixels_tab[0][9]_ssse3:                       217.1 ( 9.05x)
    avg_qpel_pixels_tab[0][11]_ssse3:                      217.8 ( 9.32x)
    avg_qpel_pixels_tab[0][13]_ssse3:                      227.2 ( 8.54x)
    avg_qpel_pixels_tab[0][15]_ssse3:                      220.5 ( 8.72x)
    
    Note: The l2 functions are also used for the vertical lowpass
    functions, yet given that the latter are much bigger, duplicating
    them would lead to a massive code size increase.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpel.asm
    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c
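
    The two-stage scheme and the combined h_lowpass+l2 variant described in
    the commit message above can be sketched in plain C as follows. This is
    only an illustration, not the assembly from the commit: the function names
    are hypothetical, the 8-tap (-1, 3, -6, 20, 20, -6, 3, -1)/32 kernel and the
    rounded average are assumed here as the usual MPEG-4 qpel operations, and
    block-edge handling is left out (the caller must supply 3 readable pixels
    to the left and 4 to the right of each row).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-pel sample between s[0] and s[1] (assumed 8-tap kernel, sum 32).
 * Simplification: no edge mirroring, s[-3]..s[4] must be readable. */
static uint8_t h_lowpass_px(const uint8_t *s)
{
    int v = 20 * (s[0] + s[1]) - 6 * (s[-1] + s[2])
          +  3 * (s[-2] + s[3]) -     (s[-3] + s[4]) + 16;
    if (v < 0)
        return 0;                      /* clip before shifting */
    v >>= 5;
    return v > 255 ? 255 : (uint8_t)v;
}

/* Rounded average of two pixels: the per-pixel work of an "l2" step. */
static uint8_t avg2(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Two stages: lowpass into a stack buffer, then average with the source.
 * right = 0 selects the left pixel (1/4), right = 1 the right pixel (3/4). */
static void qpel_h_two_stage(uint8_t *dst, const uint8_t *src,
                             ptrdiff_t stride, int w, int h, int right)
{
    uint8_t half[16 * 16];             /* intermediate buffer on the stack */
    assert(w <= 16 && h <= 16);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            half[y * w + x] = h_lowpass_px(src + y * stride + x);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * stride + x] = avg2(half[y * w + x],
                                       src[y * stride + x + right]);
}

/* Combined variant: one pass, no intermediate buffer, one call. */
static void qpel_h_fused(uint8_t *dst, const uint8_t *src,
                         ptrdiff_t stride, int w, int h, int right)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            const uint8_t *s = src + y * stride + x;
            dst[y * stride + x] = avg2(h_lowpass_px(s), s[right]);
        }
}
```

    Both functions produce the same output; the combined one simply avoids
    the second call and the round trip through the stack buffer.
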
  17. Change #265975

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision ca43bc6202382518137b938f681c57614087f5c8

    Comments

    avcodec/x86/qpeldsp_init: Mark functions as hidden
    This allows 32-bit PIC code to call the underlying
    assembly functions directly, without loading
    the GOT first; this saves 1245B of .text here
    (for 32-bit PIC code).
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpel.h
    • libavcodec/x86/qpeldsp_init.c
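
    As a rough illustration of the hidden-visibility idea in the commit
    message above: on ELF targets, a function declared with hidden visibility
    is known to resolve inside the same shared object, so a compiler building
    32-bit x86 PIC code can emit a direct call instead of a PLT call that
    needs the GOT address in a register. The macro and function names below
    are hypothetical and chosen for this sketch; the commit itself may use a
    different mechanism to apply the attribute.

```c
/* HIDDEN is a hypothetical helper macro for this sketch. The attribute is a
 * GCC/Clang extension; other compilers fall back to default visibility. */
#if defined(__GNUC__) || defined(__clang__)
#define HIDDEN __attribute__((visibility("hidden")))
#else
#define HIDDEN
#endif

/* Stand-in for a routine that would normally live in an assembly file.
 * The hidden declaration tells the compiler the symbol is not exported. */
HIDDEN int demo_avg_rounded(int a, int b);

int demo_avg_rounded(int a, int b)
{
    return (a + b + 1) >> 1;
}

/* Wrapper in the style of an init file: in PIC builds the call below can
 * be a direct call because the callee is hidden. */
int demo_wrapper(int a, int b)
{
    return demo_avg_rounded(a, b);
}
```

    Comparing the disassembly of demo_wrapper built with -m32 -fPIC with and
    without the attribute is one way to check whether the GOT setup disappears
    on a given toolchain.
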
  18. Change #265976

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Thu 30 Apr 2026 10:39:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision cc3ca1712760ae53957a3a5987cb7c61c290a451

    Comments

    avcodec/x86/qpeldsp{,_init}: Use proper prefix
    E.g. rename ff_put_mpeg4_qpel8_h_lowpass_ssse3 to
    ff_mpeg4_put_qpel8_h_lowpass_ssse3.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/qpeldsp.asm
    • libavcodec/x86/qpeldsp_init.c