Showing 9 open source projects for "image processing in java"

View related business solutions
  • New Relic provides the most powerful cloud-based observability platform built to help companies create more perfect software. Icon
    New Relic provides the most powerful cloud-based observability platform built to help companies create more perfect software.

    Get a live and in-depth view of your network, infrastructure, applications, end-user experience, machine learning models and more.

    Correlate issues across your stack. Debug and collaborate from your IDE. AI assistance at every step. All in one connected experience - not a maze of charts.
    Start for Free
  • JS7 JobScheduler is an open source workload automation solution. Icon
    JS7 JobScheduler is an open source workload automation solution.

    JS7 offers cross-platform job execution, managed file transfer, complex no-code job dependencies and a real REST API.

    JS7 JobScheduler is an open source workload automation solution. It is used to run executable files, shell scripts etc. and database procedures.
    Learn More
  • 1
    Depth Anything 3

    Depth Anything 3

    Recovering the Visual Space from Any Views

    Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 2
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    ...It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. It supports local deployment, enabling organizations concerned about privacy or latency to run the pipeline on-premises rather than send sensitive documents to third-party cloud services. The codebase is written in Python with a focus on modularity: you can swap preprocessing, recognition, and post-processing components as needed for custom workflows.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 3
    HunyuanDiT

    HunyuanDiT

    Diffusion Transformer with Fine-Grained Chinese Understanding

    HunyuanDiT is a high-capability text-to-image diffusion transformer with bilingual (Chinese/English) understanding and multi-turn dialogue capability. It trains a diffusion model in latent space using a transformer backbone and integrates a Multimodal Large Language Model (MLLM) to refine captions and support conversational image generation. It supports adapters like ControlNet, IP-Adapter, LoRA, and can run under constrained VRAM via distillation versions. LoRA, ControlNet (pose, depth,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Depth Pro

    Depth Pro

    Sharp Monocular Metric Depth in Less Than a Second

    Depth Pro is a foundation model for zero-shot metric monocular depth estimation, producing sharp, high-frequency depth maps with absolute scale from a single image. Unlike many prior approaches, it does not require camera intrinsics or extra metadata, yet still outputs metric depth suitable for downstream 3D tasks. Apple highlights both accuracy and speed: the model can synthesize a ~2.25-megapixel depth map in around 0.3 seconds on a standard GPU, enabling near real-time applications. The...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Intelligent Appointment Reminders Icon
    Intelligent Appointment Reminders

    For doctors, clinics and hospitals

    DoctorConnect provides industry leading patient engagement. In business for over 25 years, we provide highly customizable services to thousands of doctors, clinics and hospitals. Appointment Reminders, After Care Surveys, Automated No-Show and Recall Messaging, and more. We can directly interface with hundreds of EMR and PM systems. We'd love to hear from you and show you how we increase your revenue and your patient satisfaction.
    Get a Demo
  • 5
    DreamCraft3D

    DreamCraft3D

    Official implementation of DreamCraft3D

    DreamCraft3D is DeepSeek’s generative 3D modeling framework / model family that likely extends their earlier 3D efforts (e.g. Shap-E or Point-E style models) with more capability, control, or expression. The name suggests a “dream crafting” metaphor—users probably supply textual or image prompts and generate 3D assets (point clouds, meshes, scenes). The repository includes model code, inference scripts, sample prompts, and possibly dataset preparation pipelines. It may integrate rendering or post-processing modules (e.g. mesh smoothing, texturing) to make the outputs more output-ready. Because 3D generation is hardware‐intensive, the repository likely also includes optimizations like quantization, pruning, or inference accelerations (e.g. using FlashMLA or DeepEP) to make the generation pipeline faster or more efficient. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    GLM-4.5V

    GLM-4.5V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.5V is the preceding iteration in the GLM-V series that laid much of the groundwork for general multimodal reasoning and vision-language understanding. It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    MiniMax-01

    MiniMax-01

    Large-language-model & vision-language-model based on Linear Attention

    MiniMax-01 is the official repository for two flagship models: MiniMax-Text-01, a long-context language model, and MiniMax-VL-01, a vision-language model built on top of it. MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Free AI Watermark Remover - FreeRepair

    Free AI Watermark Remover - FreeRepair

    AI-powered tool to quickly remove watermarks from images flawlessly

    AI Watermark Remover (Free And Open-Source) & Make Blurry Images Clearer Or Larger Tool - FreeRepair, Simulation IOPaint Based On The Django Of Python With No Sign-Up. As a free, open-source, AI-powered tool, FreeRepair makes it easy to remove watermarks, logos, text or clutter from images, and blurry images can be made clearer or larger. No installation, no internet connection, it works out of the box, safe and secure, unlimited.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 9
    Warlock-Studio

    Warlock-Studio

    AI Suite for upscaling, interpolating & restoring images/videos

    v6.0. Warlock-Studio is a Windows application that uses Real-ESRGAN, BSRGAN, IRCNN, GFPGAN, RealESRNet, RealESRAnime and RIFE Artificial Intelligence models to upscale, restore faces, interpolate frames and reduce noise in images and videos. the application supports GPU acceleration (including multi-GPU setups) and offers batch processing for large workloads. It includes drag-and-drop handling for single or multiple files, optional pre-resize functions, and an automatic tiling system...
    Downloads: 24 This Week
    Last Update:
    See Project
  • AI-powered SAST and AppSec platform that helps companies find and fix vulnerabilities. Icon
    AI-powered SAST and AppSec platform that helps companies find and fix vulnerabilities.

    Trusted by 750+ companies and performing 200k+ code scans monthly.

    ZeroPath (YC S24) is an AI-native application security platform that delivers comprehensive code protection beyond traditional SAST. Founded by security engineers from Tesla and Google, ZeroPath combines large language models with advanced program analysis to find and automatically fix vulnerabilities.
    Learn More
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB