[AArch64] Failure to fold `lsr` or `asr` into `cmp` #122380

Kmeakin · 2025-01-09T22:06:04Z

Instead of performing an `lsr`/`asr` and then comparing the result against zero, the shift can be performed as part of a `cmp` against `xzr`:

src:
        lsr     x8, x0, #32
        cmp     x8, #0
        cset    w0, ne
        ret

tgt:
        cmp     xzr, x0, lsr 32
        cset    w0, ne
        ret

LLVM does perform this fold for shifts <= 31, and for `lsl` it is able to find a different way of doing the comparison in one instruction using `tst`. It also seems to already perform the fold if comparing against a variable instead of 0. I guess the fold fails when comparing against 0 because a comparison against zero can be represented either as `cmp x0, #0` or `cmp xzr, x0`.
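For readers without the Godbolt link open, here is a minimal C sketch of the kind of source that would plausibly lower to the `src` sequence above. The function names are hypothetical and not taken from the issue, and the exact instructions emitted depend on the compiler version and flags. The third function shows the compare-against-a-variable shape that is reported to fold already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reproducer: logical shift right by 32, then compare with 0. */
bool high_half_nonzero_lsr(uint64_t x) {
    return (x >> 32) != 0;
}

/* asr variant. Right-shifting a negative signed value is
 * implementation-defined in C; an arithmetic shift is assumed here. */
bool high_half_nonzero_asr(int64_t x) {
    return (x >> 32) != 0;
}

/* Same shift, but compared against a variable instead of the constant 0. */
bool shifted_differs_from(uint64_t x, uint64_t y) {
    return (x >> 32) != y;
}

/* A few spot checks that pin down the intended semantics. */
void check_reproducers(void) {
    assert(high_half_nonzero_lsr(UINT64_C(1) << 32));
    assert(!high_half_nonzero_lsr(UINT64_C(0xffffffff)));
    assert(high_half_nonzero_asr(-1));
    assert(!high_half_nonzero_asr(INT64_C(0x7fffffff)));
    assert(shifted_differs_from(UINT64_C(1) << 32, 0));
    assert(!shifted_differs_from(UINT64_C(1) << 32, 1));
}
```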

llvmbot · 2025-01-09T22:06:20Z

@llvm/issue-subscribers-backend-aarch64

Author: Karl Meakin (Kmeakin)

https://godbolt.org/z/rMMbbfMdW

efriedma-quic · 2025-01-09T22:16:20Z

`tst x0, #0xffffffff00000000` is better than `cmp xzr, x0, lsr 32`; the latter has an extra cycle of latency on many chips.
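The `tst` form works because shifting right by 32 and testing for nonzero asks the same question as masking bits 63..32 and testing for nonzero. A small C sketch of that equivalence follows; the helper names and the sample values are illustrative assumptions, and the code says nothing about latency, only that the two predicates agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask selecting bits 63..32, the immediate used in the tst suggestion. */
#define HIGH_HALF_MASK UINT64_C(0xffffffff00000000)

/* Shift-then-compare form, as in the original src sequence. */
bool via_shift(uint64_t x) { return (x >> 32) != 0; }

/* Mask-then-compare form, corresponding to the tst suggestion. */
bool via_mask(uint64_t x) { return (x & HIGH_HALF_MASK) != 0; }

/* Checks that both forms agree on boundary values.
 * Returns the number of samples checked. */
int check_shift_mask_equivalence(void) {
    static const uint64_t samples[] = {
        UINT64_C(0),
        UINT64_C(1),
        UINT64_C(0xffffffff),         /* largest value with a zero high half */
        UINT64_C(0x100000000),        /* smallest value with a nonzero high half */
        UINT64_C(0x8000000000000000), /* only the top bit set */
        UINT64_MAX,
    };
    int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++) {
        assert(via_shift(samples[i]) == via_mask(samples[i]));
    }
    return n;
}
```

The same mask also covers the `asr` case: an arithmetic shift by 32 yields zero exactly when bits 63..32 are all zero, so sign extension does not change the outcome of the comparison against zero.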

Kmeakin added backend:AArch64 missed-optimization labels Jan 9, 2025