They are also slow when using 256 bit wide registers
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
@@ -182,13 +182,11 @@ int ff_get_cpu_flags_x86(void)
 
     /* Similar to the above but for AVX functions on AMD processors.
        This is necessary only for functions using YMM registers on Bulldozer
-       based CPUs as they lack 256-bits execution units. SSE/AVX functions
-       using XMM registers are always faster on them.
+       and Jaguar based CPUs as they lack 256-bits execution units. SSE/AVX
+       functions using XMM registers are always faster on them.
        AV_CPU_FLAG_AVX and AV_CPU_FLAG_AVXSLOW are both set so that AVX is
-       used unless explicitly disabled by checking AV_CPU_FLAG_AVXSLOW.
-       TODO: Confirm if Excavator is affected or not by this once it's
-       released, and update the check if necessary. Same for btver2. */
-    if (family == 0x15 && (rval & AV_CPU_FLAG_AVX))
+       used unless explicitly disabled by checking AV_CPU_FLAG_AVXSLOW. */
+    if ((family == 0x15 || family == 0x16) && (rval & AV_CPU_FLAG_AVX))
         rval |= AV_CPU_FLAG_AVXSLOW;
     }
 
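
As a rough illustration of the comment in the hunk above, the sketch below shows how a caller might tell XMM-based AVX code (fine on these CPUs) apart from YMM-based AVX code (slow on them) once both flags are set. Everything prefixed with sketch_ or SKETCH_ is hypothetical and exists only for this example; the flag values are placeholders, not FFmpeg's real constants, and this is not the dispatch code FFmpeg itself uses.

```c
#include <stdbool.h>

/* Placeholder flag bits for illustration only (not FFmpeg's real values). */
enum {
    SKETCH_FLAG_AVX     = 1 << 0,
    SKETCH_FLAG_AVXSLOW = 1 << 1
};

/* Mirrors the patched condition: AMD family 0x15 (Bulldozer) and
   family 0x16 (Jaguar) get AVXSLOW in addition to AVX. */
int sketch_mark_avxslow(int family, int flags)
{
    if ((family == 0x15 || family == 0x16) && (flags & SKETCH_FLAG_AVX))
        flags |= SKETCH_FLAG_AVXSLOW;
    return flags;
}

/* AVX functions that only touch XMM registers just need AVX. */
bool sketch_use_avx_xmm(int flags)
{
    return (flags & SKETCH_FLAG_AVX) != 0;
}

/* AVX functions that use YMM registers are skipped when AVXSLOW is set. */
bool sketch_use_avx_ymm(int flags)
{
    return (flags & SKETCH_FLAG_AVX) && !(flags & SKETCH_FLAG_AVXSLOW);
}
```

Under these assumptions, a Jaguar CPU (family 0x16) that reports AVX would still pass sketch_use_avx_xmm() but fail sketch_use_avx_ymm(), which matches the intent described in the comment: AVX stays enabled unless a function explicitly opts out by checking the slow flag.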