transformer free download

Megatron

Ongoing research training transformer models at scale

Megatron is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This repository is for ongoing research on training large transformer language models at scale. We developed efficient model-parallel (tensor, sequence, and pipeline) and multi-node pre-training of transformer-based models such as GPT, BERT, and T5 using mixed precision.
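
As a rough illustration of the tensor-parallel idea mentioned above, the sketch below splits a weight matrix by columns across simulated "devices", multiplies each shard separately, and concatenates the partial outputs. This is a toy in plain Python, not Megatron's code or API; the function names (matmul, split_columns, column_parallel_matmul) are hypothetical and chosen only for this example.

```python
# Toy column-parallel linear layer. Each "device" is simulated by one
# weight shard; no real GPUs or communication are involved.

def matmul(x, w):
    """Multiply a (rows x inner) matrix by an (inner x cols) matrix (nested lists)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, parts):
    """Split a weight matrix into `parts` contiguous column shards of equal width."""
    cols = len(w[0])
    assert cols % parts == 0, "column count must divide evenly across shards"
    width = cols // parts
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(parts)]

def column_parallel_matmul(x, w, parts):
    """Compute x @ w by multiplying x with each column shard, then concatenating."""
    shards = split_columns(w, parts)
    # In a real system each of these products would run on its own device.
    partials = [matmul(x, shard) for shard in shards]
    # Concatenate along the column dimension (the role an all-gather would play).
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
full = matmul(x, w)
sharded = column_parallel_matmul(x, w, parts=2)
print(sharded == full)  # True
```

Because every shard needs only its own slice of the weights, no single device has to hold the full matrix, which is the property that makes this style of parallelism useful for large models.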

Downloads: 0 This Week

Last Update: 2026-03-16

Colossal-AI

Making large AI models cheaper, faster and more accessible

The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing. With better performance come larger model sizes, which run up against the memory wall of current accelerator hardware such as GPUs. It is never ideal to train large models such as Vision Transformer, BERT, and GPT on a single GPU or a single machine.
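
To make the memory-wall point concrete, here is a back-of-the-envelope sketch of how much memory the training state alone can take. It assumes mixed-precision training with the Adam optimizer at 16 bytes per parameter (fp16 weights and gradients, plus fp32 master weights and two fp32 Adam moments) and ignores activations entirely. The function names and the example sizes are hypothetical and are not taken from Colossal-AI.

```python
import math

# Assumed per-parameter cost for mixed-precision Adam training:
# 2 (fp16 weights) + 2 (fp16 grads) + 4 (fp32 master weights)
# + 4 (Adam momentum) + 4 (Adam variance) = 16 bytes.
BYTES_PER_PARAM = 2 + 2 + 4 + 4 + 4

def training_state_gib(num_params):
    """Memory for weights, gradients, and optimizer state, in GiB (no activations)."""
    return num_params * BYTES_PER_PARAM / 2**30

def min_devices(num_params, device_gib):
    """Lower bound on devices needed just to hold the training state."""
    return math.ceil(training_state_gib(num_params) / device_gib)

# A hypothetical 1.5-billion-parameter model on hypothetical 16 GiB devices.
print(round(training_state_gib(1_500_000_000), 2))
print(min_devices(1_500_000_000, 16))
```

Under these assumptions the training state already exceeds one 16 GiB device before any activations are counted, which is why the work has to be spread across several devices.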

Downloads: 0 This Week

Last Update: 2025-05-28

ONNX Runtime

ONNX Runtime: cross-platform, high performance ML inferencing

...ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. ONNX Runtime training can accelerate model training on multi-node NVIDIA GPUs for transformer models with a one-line addition to existing PyTorch training scripts. Support for a variety of frameworks, operating systems and hardware platforms. Built-in optimizations that deliver up to 17X faster inferencing and up to 1.4X faster training.

Downloads: 71 This Week

Last Update: 2026-03-17

GPT-NeoX

Implementation of model parallel autoregressive transformers on GPUs

...We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and to accelerate research into large-scale training. For those looking for a TPU-centric codebase, we recommend Mesh Transformer JAX. If you are not looking to train models with billions of parameters from scratch, this is likely the wrong library to use. For generic inference needs, we recommend you use the Hugging Face transformers library instead, which supports GPT-NeoX models.

Downloads: 0 This Week

Last Update: 2023-03-23

Deep learning time series forecasting

Deep learning PyTorch library for time series forecasting

...It provides all the latest state-of-the-art models (transformers, attention models, GRUs) and cutting-edge concepts with easy-to-understand interpretability metrics, cloud provider integration, and model serving capabilities. Flow Forecast was the first time series framework to feature support for transformer-based models and remains the only true end-to-end deep learning framework for time series forecasting. Currently, Task-TS from CoronaWhy primarily maintains this repository. Pull requests are welcome. Historically, this repository provided open-source benchmarks and code for flash flood and river flow forecasting. Full transformer (SimpleTransformer in model_dict): The full original transformer with all 8 encoder and decoder blocks. ...

Downloads: 0 This Week

Last Update: 2022-08-19

MoCo v3

PyTorch implementation of MoCo v3

MoCo v3 is a PyTorch reimplementation of Momentum Contrast v3 (MoCo v3), Facebook Research’s state-of-the-art self-supervised learning framework for visual representation learning using ResNet and Vision Transformer (ViT) backbones. Originally developed in TensorFlow for TPUs, this version faithfully reproduces the paper’s results on GPUs while offering an accessible and scalable PyTorch interface. MoCo v3 introduces improvements for training self-supervised ViTs by combining contrastive learning with transformer-based architectures, achieving strong linear and end-to-end fine-tuning performance on ImageNet benchmarks. ...

Downloads: 0 This Week

Last Update: 6 days ago

Trax

Deep learning with clear code and speed

Trax is an end-to-end library for deep learning that focuses on clear code and speed. It is actively used and maintained by the Google Brain team. Run a pre-trained Transformer, create a translator in a few lines of code. Features and resources, API docs, where to talk to us, how to open an issue and more. Walkthrough, how Trax works, how to make new models and train on your own data. Trax includes basic models (like ResNet, LSTM, Transformer) and RL algorithms (like REINFORCE, A2C, PPO). It is also actively used for research and includes new models like the Reformer and new RL algorithms like AWR. ...

Downloads: 1 This Week

Last Update: 2021-10-26

Search Results for "transformer"

Showing 7 open source projects for "transformer"
