CodeZen

A GitHub project discovery site focused on the Chinese-speaking community

fluxgym

Dead simple FLUX LoRA training UI with LOW VRAM support

3.1k stars
Python

LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

large-language-models multimodal-large-language-models speech-interaction speech-language-model speech-to-speech
3.1k stars
Python

AI-Guide-and-Demos-zh_CN

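A step-by-step guide to getting started with AI and LLMs, with tutorials and demo code that take you from API usage to local model deployment and fine-tuning. Code files come with online Kaggle or Colab versions, so you can follow along even without a GPU. The project also includes a small code playground 🎡 where you can experiment with some interesting AI scripts, plus a complete Chinese mirror of the assignments from Hung-yi Lee's 2024 Introduction to Generative AI course.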

3.1k stars
Python

seed-vc

Seed-VC (real-time demo: https://github.com/user-attachments/assets/86325c5e-f7f6-4a04-8695-97275a5d046c). The currently released model supports zero-shot voice conversion 🔊, zero-shot real-time voice conversion 🗣️ and zero-shot singing voice

singing-voice-conversion voice-conversion
3.4k stars
Python

mochi

The best OSS video generation models, created by Genmo

3.5k stars
Python

Nugget

Unlock the fullest potential of your device

4.0k stars
Python

g1

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

managed-by-terraform
4.2k stars
Python

aci

ACI.dev is the open source tool-calling platform that hooks up 600+ tools into any agentic IDE or custom AI agent through direct function calling or a unified MCP server. The birthplace of VibeOps.

agents ai ai-agents api developer-tools
4.7k stars
Python

PlugNPlay-Modules

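The most complete and up-to-date collection of plug-and-play modules on the web. Current progress: 70%. It covers convolution, attention mechanisms, downsampling, feature fusion modules and more, and is continuously updated. For detailed paper walkthroughs, follow the WeChat official account 【ai缝合大王】 and the Bilibili channel 【ai缝合大王】. For module sharing and discussion, join QQ group 994264161. More specialized groups: ① object detection ② image classification ③ semantic segmentation ④ face recognition ⑤ 3D reconstruction ⑥ multimodal fusion ⑦ pose estimation ⑧ super-resolution ⑨ autonomous driving ⑩ image generation ⑪ remote sensing imagery ⑫ medical imaging ⑬ low-level vision ⑭ YOLO series ⑮ new architectures such as Mamba ⑯ video processing ⑰ 3D ⑱ large models ⑲ re-identification (ReID) ⑳ image deraining/denoising/deblurring. The specialized groups are WeChat groups; scan the QR code to add the WeChat contact and reply with 1-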

4.8k stars
Python

minimind-v

🚀 "Large models": train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour!

artificial-intelligence chatgpt vision-language-model
5.2k stars
Python

podcastfy

An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI

elevenlabs gemini genai notebooklm openai
5.6k stars
Python

GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

8.0k stars
Python