Showing 31 open source projects for "python programming language"

  • 1

    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the...
    Downloads: 12 This Week
  • 2

    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process. It can be customized for specific tasks (e.g.,...
    Downloads: 1 This Week
  • 3

    Argos Translate

    Open-source offline translation library written in Python

    Argos Translate uses OpenNMT for translations and can be used as either a Python library, command-line, or GUI application. Argos Translate supports installing language model packages which are zip archives with a ".argosmodel" extension containing the data needed for translation. LibreTranslate is an API and web-app built on top of Argos Translate. Argos Translate also manages automatically pivoting through intermediate languages to translate between languages that don't have a direct translation between them installed. ...
    Downloads: 124 This Week
  • 4

    Safe Harbor Deidentification

    Safe Harbor Deidentification for medical documents

    Phalanx - Deidentify: the Safe Harbor Deidentification Mode of Phalanx is an abridged pipeline of NLP annotators, culminating in NER annotators that output text offsets. It uses the Safe Harbor deidentification method.
    Downloads: 0 This Week
  • 5

    UnsupervisedMT

    Phrase-Based & Neural Unsupervised Machine Translation

    Unsupervised Machine Translation is a research repository that implements both phrase-based SMT and neural MT approaches for translation without parallel corpora. The neural component supports multiple architectures—seq2seq, biLSTM with attention, and Transformer—and allows extensive parameter sharing across languages to improve data efficiency. Training relies on denoising auto-encoding and back-translation, with on-the-fly, multithreaded generation of synthetic parallel data to continually...
    Downloads: 1 This Week
  • 6

    Arabic Corpus

    Text categorization, arabic language processing, language modeling

    The Arabic Corpus, compiled by Dr. Mourad Abbas ( http://sites.google.com/site/mouradabbas9/corpora ). The corpus Khaleej-2004 contains 5690 documents divided into 4 topics (categories). The corpus Watan-2004 contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora should mention the two main references: (1) For the Watan-2004 corpus: M. Abbas, K. Smaili, D. Berkani, (2011) Evaluation of Topic Identification Methods on...
    Downloads: 14 This Week
  • 7

    Presage

    the intelligent predictive text entry platform

    Presage (formerly Soothsayer) is an intelligent predictive text entry system. Presage generates predictions by modelling natural language as a combination of redundant information sources. Presage computes probabilities for words which are most likely to be entered next by merging predictions generated by the different predictive algorithms. Presage's modular and extensible architecture allows its language model to be extended and customized to utilize statistical, syntactic, and semantic...
    Downloads: 220 This Week
  • 8

    HermeneutiX

    Your graphical tool for Syntactic/Semantic Structure Analysis of texts

    HermeneutiX is a tool for diagramming syntactic and semantic structures of complex (not necessarily foreign-language) texts (e.g. bible or other historical excerpts). HermeneutiX is now part of SciToS (the scientific tool set). Starting with version 2.0.0, HermeneutiX can be found on GitHub. Please check out the release summary: https://github.com/scientific-tool-set/scitos/releases For an introduction, check out this video: https://youtu.be/uQjewyG0Ad8 PS: To run a Java...
    Downloads: 0 This Week
  • 9
    Helsinki Finite-State Technology
    The Helsinki Finite-State Transducer toolkit is intended for processing natural language morphologies. The toolkit is demonstrated by wide-coverage implementations of a number of languages of varying morphological complexity.
    Downloads: 0 This Week
  • 10

    Speakable Programming for Every Language

    Your language to speak with all.

    This project holds the language data for spel; the main new codebase is at https://gitlab.com/liberit/pyac A computer programming language that uses human-language syntax for high-precision human-to-human and human-to-computer communication, supporting many languages.
    Downloads: 0 This Week
  • 11

    TEES

    Turku Event Extraction System

    Turku Event Extraction System (TEES) is a free and open source natural language processing system developed for the extraction of events and relations from biomedical text. It is written mostly in Python, and should work in generic Unix/Linux environments. Currently, the TEES source code repository still remains on GitHub at http://jbjorne.github.com/TEES/ where there is also a wiki with more information.
    Downloads: 0 This Week
  • 12

    RDRPOSTagger

    A Rule-based Part-of-Speech and Morphological Tagging Toolkit

    RDRPOSTagger is a robust, easy-to-use and language-independent rule-based toolkit for Part-of-Speech (POS) and morphological tagging. RDRPOSTagger is fast in both the learning and tagging processes. It also achieves very competitive accuracy in comparison to state-of-the-art results. RDRPOSTagger now supports pre-trained POS and morphological tagging models for Bulgarian, Czech, Dutch, English, French, German, Hindi, Italian, Portuguese, Spanish, Swedish, Thai...
    Downloads: 0 This Week
  • 13
    Arramooz Alwaseet Arabic Dictionary
    Arramooz Alwaseet is an open Arabic dictionary for morphological analysis, intended for use in Arabic language processing. This dictionary is derived from the Ayaspell Arabic spell checker.
    Downloads: 5 This Week
  • 14

    poliqarp2

    natural language corpora search engine

    This project aims at building an efficient indexer and search engine for natural language corpora with multilevel annotations.
    Downloads: 0 This Week
  • 15

    BioC

    We describe a simple XML format to share text documents and annotations

    A minimalist approach to share text documents and data annotations. Allows a large number of different annotations to be represented.

    Project files contain:
    - simple code to hold/read/write data and perform sample processing
    - BioC-formatted corpora
    - BioC tools that work with BioC corpora

    BioC goals:
    - simplicity
    - interoperability
    - broad use
    - reuse

    There should be little investment required to learn to use a format or a software module to process that format. We are...
    Downloads: 5 This Week
  • 16
    Part-of-speech tagging is the task of assigning symbols from a particular set to words in a natural language text. ACOPOST implements and extends well-known machine learning techniques and provides a uniform environment for testing.
    Downloads: 0 This Week
  • 17
    This project concerns the development of human language technology resources by sharing or recycling resources between closely related languages. http://gerhard.pro/closely-related-languages/
    Downloads: 0 This Week
  • 18

    mwetoolkit

    THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

    ...Even though it focuses on multiword expressions, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics. The mwetoolkit can be applied to virtually any text collection, language, and MWE type. It is a command-line tool written mostly in Python. Its development started in 2010 as a PhD thesis, but the project remains active (see the SVN logs). Up-to-date documentation and details about the tool can be found on the mwetoolkit website: http://mwetoolkit.sourceforge.net/
    Downloads: 0 This Week
  • 19

    Aelius Brazilian Portuguese POS-Tagger

    Python, NLTK-based package for shallow parsing of Brazilian Portuguese

    Aelius is an ongoing open source project aiming at developing a suite of Python, NLTK-based modules and interfaces to external freely available tools for shallow parsing of Brazilian Portuguese. It also includes language resources such as language models, sample texts, and gold standards. Presently, Aelius already offers facilities for POS-tagging and chunking corpora and outputting annotations in different formats, such as in XML in the TEI P5 encoding scheme.
    Downloads: 0 This Week
  • 20
    Board Game Language
    Board Game Language (BGL, pronounced "bagel") is a natural language syntax programming language for first-time programmers. It uses board games as a metaphor for programming concepts, with the goal of teaching users the foundations of programming.
    Downloads: 0 This Week
  • 21

    AnnotationsForStatements-plugin

    Write annotations for statements

    ...This plug-in will automatically extract the written code to a method, including its annotations, without changing the way you program. The plug-in is built for the Java programming language and is currently able to transform statements that support comments.
    Downloads: 0 This Week
  • 22

    The Magda language

    Magda language resource site

    Magda is a programming language introduced in Jarek Kusmierek's PhD (http://www.mimuw.edu.pl/~jdk/mixiny.pdf) and continued in Mauro Mulatero's thesis (http://www.tesionline.it/default/tesi.asp?idt=45612). Magda's goal is to allow a programmer to write well-modularized, reusable code. Magda is based upon the core notion of mixin as the only unit of reuse.
    Downloads: 1 This Week
  • 23

    Language Constructor

    Complete tool for constructing/manipulating languages in digital form

    With this tool you can easily design a new language, digitize an existing one or incrementally reconstruct an ancient language. It allows for free experimentation of all aspects of the language, so it does not have to be made consistent on paper first. You can edit script, syntax, grammar, morphology, lexicon and phonology, as well as write documents in the language, as it might be too complex to be handled by current font technology. The information is stored in xml format for easy...
    Downloads: 1 This Week
  • 24

    Automatic Compound Processing (AuCoPro)

    Automatic compound splitting and semantic analysis of compounds

    The central problem addressed in this project is a multidisciplinary (linguistics and computational linguistics) investigation into sharing knowledge and resources between closely-related languages, in particular for the automatic processing of compounds. We will explore the possibility of creating new knowledge about closely-related languages, and of efficiently developing additional, more advanced resources for (a) compound segmentation; and (b) the semantic...
    Downloads: 0 This Week
  • 25
    Redundancy due to cut-paste operations in text creates bias in machine learning for NLP. This module takes a directory and produces a subset of the files in that directory (in a list) with an upper bound on similarity between two files.
    Downloads: 0 This Week
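
The similarity-bounded file selection described in entry 25 could look roughly like the sketch below. This is not the project's code: the function name `select_dissimilar`, the `max_similarity` parameter, the greedy strategy, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from pathlib import Path


def select_dissimilar(directory, max_similarity=0.8):
    """Greedily pick files whose pairwise similarity stays at or below max_similarity.

    Hypothetical helper: the similarity measure (difflib ratio) and the
    greedy, name-ordered strategy are illustrative assumptions.
    """
    kept = []  # (path, text) pairs accepted so far
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # Accept the file only if it is not too close to any file already kept.
        if all(
            SequenceMatcher(None, text, other).ratio() <= max_similarity
            for _, other in kept
        ):
            kept.append((path, text))
    return [path for path, _ in kept]
```

Calling `select_dissimilar("corpus_dir", max_similarity=0.8)` returns a list of paths in which no two files exceed the chosen similarity bound. The greedy pass depends on file order and compares every candidate against every kept file, so it is only practical for small directories.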