Abstractive Style Summarizer

This model is a fine-tuned version of google/flan-t5-base using PEFT (LoRA). It is designed to generate abstractive summaries in three distinct styles: Harsh (concise), Balanced (standard), and Detailed (comprehensive).

Model Details

Model Description

  • Model type: Sequence-to-Sequence Transformer (T5)
  • Language(s): English
  • License: MIT
  • Finetuned from model: google/flan-t5-base
  • Training Method: PEFT (LoRA)

Uses

Direct Use

The model determines the summary style from a prefix in the input prompt.

  • Harsh: Generates very short, punchy summaries (approx. 35% of input length).
  • Balanced: Generates standard news summaries (approx. 50% of input length).
  • Detailed: Generates in-depth summaries (approx. 70% of input length).

Prompt Format

The input text should be prefixed with the desired style:

Summarize {Style}: {Input Text}

Example: Summarize Harsh: The Walt Disney Co. announced...
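
A minimal inference sketch follows. The adapter id lityops/Abstractive-Style-Summarizer is taken from this repository; the generation settings (beam count, token limit) are illustrative assumptions, not the card's prescribed values.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

# Load the base model and attach the LoRA adapter from this repository.
base_id = "google/flan-t5-base"
adapter_id = "lityops/Abstractive-Style-Summarizer"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = PeftModel.from_pretrained(AutoModelForSeq2SeqLM.from_pretrained(base_id), adapter_id)
model.eval()

article = "The Walt Disney Co. announced ..."  # your input text
prompt = f"Summarize Harsh: {article}"         # or "Summarize Balanced:" / "Summarize Detailed:"

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)  # illustrative settings
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```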

Training Details

Training Data

The model was trained on a combined dataset of 12,000 samples, split into 80% Train, 10% Validation, and 10% Test.

| Style | Source Dataset | Size |
|---|---|---|
| Harsh | XSum | 4,000 |
| Balanced | CNN/DailyMail | 4,000 |
| Detailed | Multi-News | 4,000 |
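
The preprocessing script is not included in this card; the sketch below shows one way the combined, prefix-tagged dataset could be assembled. Dataset ids and column names follow the public Hugging Face Hub versions of XSum, CNN/DailyMail, and Multi-News; the shuffle seed and field mapping are assumptions.

```python
from datasets import load_dataset, concatenate_datasets

def tag(ds, style, text_col, summary_col, n=4000):
    # Take n samples and rewrite them as (input, target) pairs with a style prefix.
    ds = ds.shuffle(seed=42).select(range(n))
    return ds.map(
        lambda ex: {"input": f"Summarize {style}: {ex[text_col]}",
                    "target": ex[summary_col]},
        remove_columns=ds.column_names,
    )

harsh    = tag(load_dataset("xsum", split="train"), "Harsh", "document", "summary")
balanced = tag(load_dataset("cnn_dailymail", "3.0.0", split="train"), "Balanced", "article", "highlights")
detailed = tag(load_dataset("multi_news", split="train"), "Detailed", "document", "summary")

combined = concatenate_datasets([harsh, balanced, detailed]).shuffle(seed=42)
splits = combined.train_test_split(test_size=0.2, seed=42)            # 80% train
val_test = splits["test"].train_test_split(test_size=0.5, seed=42)    # 10% validation / 10% test
```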

Training Procedure

Training Hyperparameters

  • Learning Rate: 5e-4
  • Batch Size: 4 per device
  • Gradient Accumulation Steps: 2
  • Num Epochs: 5
  • Optimizer: AdamW
  • LR Scheduler: Linear with warmup (ratio 0.05)
  • Mixed Precision: BF16
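
Expressed as Seq2SeqTrainingArguments, the hyperparameters above map roughly to the following sketch; the output path and anything not listed above are placeholders.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="style-summarizer-lora",  # placeholder path
    learning_rate=5e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,       # effective batch of 8 per device
    num_train_epochs=5,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    bf16=True,
)
```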

LoRA Configuration

  • r: 32
  • lora_alpha: 64
  • lora_dropout: 0.05
  • target_modules: ["q", "k", "v", "o"]
  • bias: "none"
  • task_type: "SEQ_2_SEQ_LM"
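
These settings map directly onto peft.LoraConfig; wrapping the base model then trains only the injected adapter weights:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q", "k", "v", "o"],  # T5 attention projections
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM,
)

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = get_peft_model(base, lora_config)
```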

Evaluation Results

Evaluated on the held-out test set (1,200 samples) at training step 6,000.

| Metric | Score |
|---|---|
| ROUGE-1 | 0.3925 |
| ROUGE-2 | 0.1608 |
| ROUGE-L | 0.2776 |
| Validation Loss | 0.7824 |
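
The ROUGE scores can be recomputed with the evaluate library. In the sketch below, the prediction and reference strings are illustrative stand-ins for decoded model outputs and gold summaries from the test split.

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["disney announces new streaming deal"]                  # decoded model outputs (illustrative)
references = ["The Walt Disney Co. announced a new streaming deal."]  # gold summaries (illustrative)
scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v, 4) for k, v in scores.items()})
```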

Environmental Impact

  • Hardware Type: CUDA-enabled GPU
  • Compute: LoRA fine-tuning only (≈7M trainable of ≈254M total parameters, ≈2.8%)
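
The trainable/total parameter split can be checked with PEFT's built-in helper. This repeats the LoRA configuration from above so the snippet runs on its own; the printed figures are approximate.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = get_peft_model(base, LoraConfig(
    r=32, lora_alpha=64, lora_dropout=0.05,
    target_modules=["q", "k", "v", "o"], bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM,
))
model.print_trainable_parameters()
# prints roughly: trainable params: ~7M || all params: ~254M || trainable%: ~2.8
```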

Framework Versions

  • datasets==3.6.0
  • torch>=2.5.1
  • transformers>=4.36.0
  • peft>=0.8.0