Showing 51 open source projects for "data file"

View related business solutions
  • Get full visibility and control over your tasks and projects with Wrike. Icon
    Get full visibility and control over your tasks and projects with Wrike.

    A cloud-based collaboration, work management, and project management software

    Wrike offers world-class features that empower cross-functional, distributed, or growing teams take their projects from the initial request stage all the way to tracking work progress and reporting results.
    Learn More
  • Data management solutions for confident marketing Icon
    Data management solutions for confident marketing

    For companies wanting a complete Data Management solution that is native to Salesforce

    Verify, deduplicate, manipulate, and assign records automatically to keep your CRM data accurate, complete, and ready for business.
    Learn More
  • 1
    supabase-py

    supabase-py

    Python Client for Supabase. Query Postgres from Flask, Django

    Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming. Good first issue.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    Pandas Profiling

    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    File sizes, creation dates, dimensions, indication of truncated images and existance of EXIF metadata. Mostly global details about the dataset (number of records, number of variables, overall missigness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, constant values, between others).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    OpenBB

    OpenBB

    Investment Research for Everyone, Everywhere

    Customize and speed up your analysis, bring your own data, and create instant reports to gain a competitive edge. Whether it’s a CSV file, a private endpoint, an RSS feed, or even embed an SEC filing directly. Chat with financial data using large language models. Don’t waste time reading, create summaries in seconds and ask how that impacts investments. Create your dashboard with your favorite widgets.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 4
    Lance

    Lance

    Modern columnar data format for ML and LLMs implemented in Rust

    Lance is a columnar data format that is easy and fast to version, query and train on. It’s designed to be used with images, videos, 3D point clouds, audio and of course tabular data. It supports any POSIX file systems, and cloud storage like AWS S3 and Google Cloud Storage.
    Downloads: 6 This Week
    Last Update:
    See Project
  • The AI workplace management platform Icon
    The AI workplace management platform

    Plan smart spaces, connect teams, manage assets, and get insights with the leading AI-powered operating system for the built world.

    By combining AI workflows, predictive intelligence, and automated insights, OfficeSpace gives leaders a complete view of how their spaces are used and how people work. Facilities, IT, HR, and Real Estate teams use OfficeSpace to optimize space utilization, enhance employee experience, and reduce portfolio costs with precision.
    Learn More
  • 5
    NeuralNote

    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. NeuralNote supports polyphonic...
    Downloads: 80 This Week
    Last Update:
    See Project
  • 6
    Ludwig

    Ludwig

    A codeless platform to train and test deep learning models

    Ludwig is a toolbox built on top of TensorFlow that allows to train and test deep learning models without the need to write code. All you need to provide is a CSV file containing your data, a list of columns to use as inputs, and a list of columns to use as outputs, Ludwig will do the rest. Simple commands can be used to train models both locally and in a distributed way, and to use them to predict on new data.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    python-small-examples

    python-small-examples

    Focus on creating classic Python small examples and cases

    python-small-examples is an open-source educational repository that contains hundreds of concise Python programming examples designed to illustrate practical coding techniques. The project focuses on teaching programming concepts through small, focused scripts that demonstrate common tasks in data processing, visualization, and general programming. Each example highlights a specific function or programming pattern so that learners can quickly understand how to apply Python features in real-world scenarios. The repository includes examples covering topics such as file processing, JSON manipulation, data visualization, and library usage. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8

    LightGBM

    Gradient boosting framework based on decision tree algorithms

    LightGBM or Light Gradient Boosting Machine is a high-performance, open source gradient boosting framework based on decision tree algorithms. Compared to other boosting frameworks, LightGBM offers several advantages in terms of speed, efficiency and accuracy. Parallel experiments have shown that LightGBM can attain linear speed-up through multiple machines for training in specific settings, all while consuming less memory. LightGBM supports parallel and GPU learning, and can handle...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    pycm

    pycm

    Multi-class confusion matrix library in Python

    PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix, and a proper tool for post-classification model evaluation that supports most classes and overall statistics parameters. PyCM is the swiss-army knife of confusion matrices, targeted mainly at data scientists that need a broad array of metrics for predictive models and an accurate evaluation of large variety of classifiers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • The full-stack observability platform that protects your dataLayer, tags and conversion data Icon
    The full-stack observability platform that protects your dataLayer, tags and conversion data

    Stop losing revenue to bad data today. and protect your marketing data with Code-Cube.io.

    Code-Cube.io detects issues instantly, alerts you in real time and helps you resolve them fast. No manual QA. No unreliable data. Just data you can trust and act on.
    Learn More
  • 10
    ONNX

    ONNX

    Open standard for machine learning interoperability

    ...It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Currently we focus on the capabilities needed for inferencing (scoring). ONNX is widely supported and can be found in many frameworks, tools, and hardware. Enabling interoperability between different frameworks and streamlining the path from research to production helps increase the speed of innovation in the AI community.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 11
    DeepVariant

    DeepVariant

    DeepVariant is an analysis pipeline that uses a deep neural networks

    DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file. DeepTrio is a deep learning-based trio variant caller built on top of DeepVariant. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Pedalboard

    Pedalboard

    A Python library for audio

    pedalboard is a Python library for working with audio: reading, writing, rendering, adding effects, and more. It supports the most popular audio file formats and a number of common audio effects out of the box and also allows the use of VST3® and Audio Unit formats for loading third-party software instruments and effects. pedalboard was built by Spotify’s Audio Intelligence Lab to enable using studio-quality audio effects from within Python and TensorFlow. Internally at Spotify, pedalboard is used for data augmentation to improve machine learning models and to help power features like Spotify’s AI DJ and AI Voice Translation. pedalboard also helps in the process of content creation, making it possible to add effects to audio without using a Digital Audio Workstation.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 13
    deepjazz

    deepjazz

    Deep learning driven jazz generation using Keras & Theano

    ...It uses the Keras and Theano libraries to build a two-layer Long Short-Term Memory network capable of learning temporal patterns in music. The system analyzes musical sequences from an input MIDI file and then generates new musical notes that follow similar stylistic patterns. The project was originally created during a hackathon and was designed to show how neural networks can emulate creative tasks traditionally associated with human musicians. The repository includes preprocessing scripts for preparing MIDI data, training scripts for building the neural network model, and code for generating new compositions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    HPCC Systems

    HPCC Systems

    End-to-end big data in a massively scalable supercomputing platform.

    HPCC Systems® (www.hpccsystems.com) from LexisNexis® Risk Solutions is a proven, open source solution for Big Data insights that can be implemented by businesses of all sizes. With HPCC Systems, developers can design applications with Big Data at their core, enabling businesses to better analyze and understand data at scale, improving business time to results and decisions. HPCC Systems offers a consistent data-centric programming language, two processing platforms and a single, complete...
    Downloads: 57 This Week
    Last Update:
    See Project
  • 15
    Zylthra

    Zylthra

    Zylthra: A PyQt6 app to generate synthetic datasets with DataLLM.

    Welcome to Zylthra, a powerful Python-based desktop application built with PyQt6, designed to generate synthetic datasets using the DataLLM API from data.mostly.ai. This tool allows users to create custom datasets by defining columns, configuring generation parameters, and saving setups for reuse, all within a sleek, dark-themed interface.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    Uranie

    Uranie

    Uranie is CEA's uncertainty analysis platform, based on ROOT

    ...Note : if you have downloaded version 3.12 before the 8th of february, a patch exists for a minor bug on TOutputFileKey file, don't hesitate to ask us.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    CometAnalyser

    CometAnalyser

    CometAnalyser, for quantitative comet assay analysis.

    Description: Comet assay provides an easy solution to estimate DNA damage in single cells through microscopy assessment. To obtain reproducible and reliable quantitative data, we developed an easy-to-use tool named CometAnalyser. CometAnalyser is an open-source deep-learning tool designed for the analysis of both fluorescent and silver-stained wide-field microscopy images. Once the comets are segmented and classified, several intensity/morphological features are automatically exported as a spreadsheet file. ...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 18
    GNNPCSAFT

    GNNPCSAFT

    Smart Thermodynamic Modeling with Graph Neural Networks

    ...We developed this app so the scientific community can access the model's results easily. In this app, the estimated pure-component parameters can be used to calculate thermodynamic properties and compare them with experimental data from the ThermoML Archive. To install the GNNPCSAFT app, download the appropriate latest release from the Files, unzip the file, and run the executable for your operating system (Linux or Windows). More info on github repository.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19

    entity-metadata

    Lists of people, churches, and other entities

    Here are lists of entities, such as people, businesses, and churches. These are large files related to this repository https://github.com/az0/entity-metadata
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Pigo

    Pigo

    Fast face detection, pupil/eyes localization

    Fast face detection, pupil/eyes localization and facial landmark points detection library in pure Go. Pigo is a pure Go face detection, pupil/eyes localization and facial landmark points detection library based on the Pixel Intensity Comparison-based Object detection paper. The reason why Pigo has been developed is because almost all of the currently existing solutions for face detection in the Go ecosystem are purely bindings to some C/C++ libraries like OpenCV or dlib, but calling a C...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 21
    ModelFox

    ModelFox

    ModelFox makes it easy to train, deploy, and monitor ML models

    ...Train a machine learning model by running modelfox train with the path to a CSV file and the name of the column you want to predict. The CLI automatically transforms your data into features, trains a number of linear and gradient boosted decision tree models to predict the target column, and writes the best model to a .modelfox file. If you want more control, you can provide a config file.
    Downloads: 45 This Week
    Last Update:
    See Project
  • 22
    YOLO ROS

    YOLO ROS

    YOLO ROS: Real-Time Object Detection for ROS

    ...You'll have to have an Nvidia GPU and you'll have to install CUDA. The CMakeLists.txt file automatically detects if you have CUDA installed or not. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Semantic Segmentation in PyTorch

    Semantic Segmentation in PyTorch

    Semantic segmentation models, datasets & losses implemented in PyTorch

    Semantic segmentation models, datasets and losses implemented in PyTorch. PyTorch and Torchvision needs to be installed before running the scripts, together with PIL and opencv for data-preprocessing and tqdm for showing the training progress. PyTorch v1.1 is supported (using the new supported tensoboard); can work with earlier versions, but instead of using tensoboard, use tensoboardX. Poly learning rate, where the learning rate is scaled down linearly from the starting value down to zero...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    wav2letter++

    wav2letter++

    Facebook AI research's automatic speech recognition toolkit

    ...After installing, run export KENLM_ROOT_DIR=... so that wav2letter++ can find it. This is needed because KenLM doesn't support a make install step.wav2letter++ expects audio and transcription data to be prepared in a specific format so that they can be read from the pipelines. Each dataset (test/valid/train) needs to be in a separate file with one sample per line. A sample is specified using 4 columns separated by space (or tabs).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    VoTT

    VoTT

    Visual Object Tagging Tool, an electron app for building models

    Visual Object Tagging Tool: An electron app for building end-to-end Object Detection Models from Images and Videos. An open source annotation and labeling tool for image and video assets. VoTT is a React + Redux Web application, written in TypeScript. This project was bootstrapped with Create React App. VoTT can be installed as a native application or run from source. VoTT is also available as a stand-alone Web application and can be used in any modern Web browser. VoTT is available for...
    Downloads: 7 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB