hochreitersepp.bsky.social
8 posts 238 followers 179 following

Exploring imitation learning architectures: Transformer, Mamba, xLSTM: arxiv.org/abs/2502.12330 *LIBERO: “xLSTM shows great potential” *RoboCasa: “xLSTM models, we achieved success rate of 53.6%, compared to 40.0% of BC-Transformer” *Point Clouds: “xLSTM model achieves a 60.9% success rate”

xLSTM for time series with Granger causality: arxiv.org/abs/2502.09981 xLSTM again shows superb performance at time series analysis. "Our experimental evaluations on three datasets demonstrate the overall efficacy of our proposed GC-xLSTM model."
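The post does not spell out how Granger-causal links are extracted, so here is only a rough sketch of the usual neural Granger recipe (one prediction model per target series plus a sparsity penalty on per-input weights), with a plain LSTM standing in for the xLSTM block; all names and shapes are illustrative assumptions, not a reconstruction of GC-xLSTM.

```python
import torch
import torch.nn as nn

class GCSeriesModel(nn.Module):
    """Predict one target series from all p input series.

    A learnable per-input gate weights each input series; a sparsity penalty
    on these gates prunes irrelevant inputs, and surviving inputs are read as
    Granger-causal for the target. The recurrent core here is a placeholder
    LSTM standing in for an xLSTM block.
    """
    def __init__(self, p_series: int, hidden: int = 32):
        super().__init__()
        self.input_weights = nn.Parameter(torch.ones(p_series))  # one gate per input series
        self.rnn = nn.LSTM(input_size=p_series, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, time, p_series)
        gated = x * self.input_weights         # soft-select input series
        out, _ = self.rnn(gated)
        return self.head(out[:, -1])           # one-step-ahead prediction

def loss_fn(pred, target, model, lam=1e-2):
    # Prediction error plus an L1 penalty that drives the gates of
    # irrelevant input series toward zero.
    return nn.functional.mse_loss(pred, target) + lam * model.input_weights.abs().sum()
```

After training, input series whose gate stays clearly away from zero are read as Granger-causing the target series.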

xLSTM shines at tumor segmentation: arxiv.org/abs/2502.00314 “evaluated state-of-the-art segmentation methods, including U-Net and its enhanced variants with Transformers and Mamba. Our proposed ViLU-Net [vision xLSTM-Net] model achieved superior performance with reduced complexity.” Cool.

Don't miss this workshop. Super interesting.

xLSTM for molecular property prediction: arxiv.org/abs/2501.18439 "AUROC improvement of 3.18% for classification tasks and an RMSE reduction of 3.83% across regression datasets compared to the baseline methods." Again xLSTM excels in a life science. Compare Bio-xLSTM: arxiv.org/abs/2411.04165.

xLSTM for knowledge tracing: arxiv.org/abs/2501.14256 xLSTM excels at another task. “DKT2 [the xLSTM] consistently outperforms 17 baseline models in various prediction tasks” Baseline models include Transformers, Mamba and graph-based methods. DKT2 exploits exponential gating and matrix memory.
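For readers unfamiliar with those two ingredients: exponential gating and matrix memory come from the mLSTM cell of the xLSTM paper (arxiv.org/abs/2405.04517). Below is a minimal NumPy sketch of one stabilized mLSTM step; shapes and names are illustrative, not DKT2's actual code.

```python
import numpy as np

def mlstm_step(C, n, m, q, k, v, i_pre, f_pre, o_gate):
    """One mLSTM step: exponential gating with a matrix memory C.

    C: (d, d) matrix memory, n: (d,) normalizer, m: scalar stabilizer state.
    q, k, v: (d,) query/key/value; i_pre, f_pre: pre-activation gate scalars;
    o_gate: (d,) output gate in (0, 1). Shapes are illustrative.
    """
    d = q.shape[0]
    # Stabilized exponential gating: track the running max of the log-gates
    # so the exponentials never overflow.
    m_new = max(f_pre + m, i_pre)
    f = np.exp(f_pre + m - m_new)   # forget gate, exponential
    i = np.exp(i_pre - m_new)       # input gate, exponential
    # Matrix memory update: outer-product write of the key/value pair.
    C_new = f * C + i * np.outer(v, k / np.sqrt(d))
    n_new = f * n + i * (k / np.sqrt(d))
    # Read-out: query the matrix memory, normalize, apply the output gate.
    h_tilde = C_new @ q / max(abs(n_new @ q), 1.0)
    h = o_gate * h_tilde
    return h, C_new, n_new, m_new
```

The exponential input gate lets the cell strongly upweight important new key/value pairs, while the matrix memory stores associations that are later retrieved by a query instead of a single vector state.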

EMBL & ELLIS deepen partnership to drive #AI in European #LifeSciences. 🤝 We aim to unlock new insights, train the next gen of AI researchers, & develop innovative solutions for human & planetary health. Read the full article here: ellis.eu/news/embl-an...

xLSTM outperforms Conformers (Transformers) and Mamba on Speech Enhancement: arxiv.org/abs/2501.06146 “xLSTM-SENet2, outperforms state-of-the-art Mamba- and Conformer-based systems on the Voicebank+DEMAND.” New xLSTM Triton kernels are faster than FlashAttention 3 & Mamba for training and inference.

Often LLMs hallucinate because of semantic uncertainty due to missing factual training data. We propose a method to detect such uncertainties using only one generated output sequence. A super efficient method to detect hallucinations in LLMs.