An Optimized Framework for Contextual Sentence Classification in Biomedical Abstracts

May 14, 2025

Muhammad Yousaf Rana

Muhammad Hammad

Abstract

We propose a novel multi-input architecture for sentence classification of biomedical abstracts. By combining BERT-based contextual embeddings, character-level BiLSTM processing, and structural features such as sentence position within an abstract, our model achieves 90.57% accuracy on the PubMed 20k RCT dataset. The architecture balances performance with computational efficiency and is well suited to low-resource environments.

Type

Journal article

Publication

In ResearchGate

This paper presents an efficient and accurate model for classifying sentences in biomedical abstracts using a fusion of token-level, character-level, and positional features.

The model integrates:

BERT embeddings to capture sentence-level semantics.
Character-level BiLSTM layers to handle biomedical terminology and subword patterns.
Positional encodings based on line number and abstract structure.
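
To make the positional and character-level inputs concrete, here is a minimal sketch of how they might be prepared. It is not the authors' code. It assumes the plain-text layout in which PubMed RCT data is commonly distributed (a `###<id>` line opens each abstract, followed by one `LABEL<TAB>sentence` line per sentence). The helper names `parse_abstracts` and `one_hot`, the one-hot depth of 5, and the sample sentences are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def parse_abstracts(raw_text: str) -> List[Dict]:
    """Return one record per sentence, annotated with its position in the abstract."""
    records: List[Dict] = []
    current: List[str] = []  # sentence lines of the abstract being read

    def flush() -> None:
        # Emit a record for every sentence collected for the current abstract.
        for i, line in enumerate(current):
            label, _, text = line.partition("\t")
            records.append({
                "label": label,
                "text": text,                 # input to the token-level (BERT) branch
                "chars": " ".join(text),      # space-separated characters for the char branch
                "line_number": i,             # zero-based position within the abstract
                "total_lines": len(current),  # number of sentences in the abstract
            })
        current.clear()

    for line in raw_text.splitlines():
        line = line.strip()
        if line.startswith("###"):   # assumed abstract delimiter
            flush()
        elif line:
            current.append(line)
    flush()  # the last abstract has no trailing delimiter
    return records


def one_hot(index: int, depth: int) -> List[float]:
    """One-hot encode a position; indices >= depth become all zeros (an assumed policy)."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


# Made-up sample in the assumed layout, not taken from the dataset.
sample = "\n".join([
    "###1",
    "BACKGROUND\tAsthma is common .",
    "METHODS\tWe enrolled 40 patients .",
    "RESULTS\tSymptoms improved .",
    "",
    "###2",
    "OBJECTIVE\tTo test a drug .",
    "CONCLUSIONS\tThe drug was safe .",
])

records = parse_abstracts(sample)
for r in records:
    print(r["label"], r["line_number"], r["total_lines"], one_hot(r["line_number"], 5))
```

Each record then carries the three kinds of input listed above: the raw sentence for the BERT branch, the character sequence for the BiLSTM branch, and the position features that locate the sentence within its abstract.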

Evaluation on the PubMed 20k RCT dataset shows a classification accuracy of 90.57%, outperforming earlier baselines such as Forward ANN, CRF, and logistic regression.

This work is a step forward in making large-scale biomedical literature more searchable, structured, and interpretable.

Last updated on May 14, 2025