This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper "Language Models are Unsupervised Multitask Learners." It is intended as a starting point for researchers and engineers to experiment with GPT-2: generating text, fine-tuning on custom datasets, exploring model behavior, or studying its internal phenomena. The repository includes scripts for sampling, training, and downloading pre-trained models, along with utilities for tokenization and model handling.
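The sampling scripts mentioned above generate text one token at a time, feeding each sampled token back in as context. The loop below is a minimal sketch of that autoregressive process, using a hypothetical toy bigram table in place of GPT-2's actual transformer; the table, function names, and `<eos>` marker are illustrative assumptions, not part of the repository.

```python
import random

# Hypothetical stand-in for GPT-2's next-token distribution:
# a tiny bigram table instead of a transformer forward pass.
BIGRAMS = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"ran": 1.0},
    "sat": {"<eos>": 1.0},
    "ran": {"<eos>": 1.0},
}

def sample_next(token, rng):
    """Draw the next token from the (toy) conditional distribution."""
    dist = BIGRAMS.get(token, {"<eos>": 1.0})
    tokens, probs = zip(*dist.items())
    return rng.choices(tokens, weights=probs, k=1)[0]

def generate(prompt, max_tokens=10, seed=0):
    """Conditional sampling: extend `prompt` token by token until <eos>."""
    rng = random.Random(seed)
    out = list(prompt)
    for _ in range(max_tokens):
        nxt = sample_next(out[-1], rng)
        if nxt == "<eos>":
            break
        out.append(nxt)
    return out

print(generate(["the"]))
```

Unconditional sampling is the same loop started from a special beginning-of-text token instead of a user prompt, and interactive sampling simply reads the prompt from stdin.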
Features
- Pretrained model weights for multiple GPT-2 sizes (e.g. 117M, 345M, up to 1.5B parameters)
- Sampling / generation scripts (conditional, unconditional, interactive)
- Tokenizer and encoding / decoding utilities
- Training / fine-tuning script support (for smaller models)
- Support for memory-saving gradient techniques / optimizations during training
- Utilities to download / manage model checkpoints via script
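GPT-2's tokenizer is a byte-level byte-pair encoder: text is first mapped to UTF-8 bytes, then frequently adjacent token pairs are merged into new vocabulary entries. The sketch below illustrates that idea with a single merge step; it is a simplified toy, not the repository's actual encoder, and the helper names are assumptions.

```python
from collections import Counter

def encode_bytes(text: str) -> list[int]:
    """Byte-level base encoding: map text to UTF-8 byte values (0-255)."""
    return list(text.encode("utf-8"))

def decode_bytes(tokens: list[int]) -> str:
    """Invert encode_bytes() for byte-level tokens."""
    return bytes(tokens).decode("utf-8")

def most_common_pair(tokens: list[int]) -> tuple[int, int]:
    """Find the adjacent token pair that occurs most often."""
    return Counter(zip(tokens, tokens[1:])).most_common(1)[0][0]

def merge(tokens: list[int], pair: tuple[int, int], new_id: int) -> list[int]:
    """Replace every occurrence of `pair` with a single new token id."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

toks = encode_bytes("aaab")          # [97, 97, 97, 98]
pair = most_common_pair(toks)        # (97, 97)
toks = merge(toks, pair, 256)        # [256, 97, 98]
```

The real encoder applies thousands of such learned merges in a fixed order, so any byte sequence tokenizes deterministically and round-trips losslessly.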
Categories
- Artificial Intelligence

License
- MIT License