CodeZen

A GitHub project discovery site focused on the Chinese-speaking community

minimind-v

🚀 [Large models] Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour! 🌏

artificial-intelligence chatgpt vision-language-model
5.2k stars
Python

KAG

KAG: Knowledge Augmented Generation. KAG is a logical reasoning and Q&A framework based on the OpenSPG (https://github.com/OpenSPG/openspg) engine and lar…

knowledge-graph large-language-model logical-reasoning multi-hop-question-answering trustfulness
8.2k stars
Python

ha_xiaomi_home

Xiaomi Home Integration for Home Assistant. Xiaomi Home Integration is a Home Assistant integration component officially supported by Xiaomi. It allows you to use Xiaomi IoT smart devices in Home Assistant. Installation: Home Assistant version require…

home-assistant home-assistant-integration miot miot-devices smart-home
20.9k stars
Python

PDFMathTranslate

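[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats. AI-based full-text bilingual translation of PDF documents that fully preserves the layout; supports services such as Google, DeepL, Ollama, and OpenAI, and provides CLI/GUI/MCP/Docker/Zotero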

chinese document edit english japanese
29.7k stars
Python

Show-o

One Single Transformer to Unify Multimodal Understanding and Generation. Authors: Jinheng Xie (https://sierkinhane.github.io/), Weijia Mao (https://scholar.google.com/citations?hl=zh-CN&user=S7bGBmkyNtEC&view_op=list_works&sortby=pubdate), Zechen Bai (https://www.baizechen.site/), David…

diffusion-models large-language-models multimodal
1.8k stars
Python

LongWriter

LongWriter: Unleashing the 10,000+ word generation capability of long-context LLMs. https://github.com/user-attachments/assets/c7eedeca-98ed-43ec-8619-25137987bcde (left: LongWriter-glm4-9b; right: GLM-4-9B-chat). 🔥 Update, August 18, 2024: you can now use vllm (https://github.com/vllm-…

fine-tuning llm long-context long-text
1.8k stars
Python

ProxyCat

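A tunnel proxy pool middleware that can be deployed in the cloud or locally. It turns static proxy IPs into flexibly usable tunnel IPs and provides a fixed request address: deploy once, use for life.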

cyber-security cyber-security-tool proxy proxypool security
2.3k stars
Python

VITA

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction. 📖 VITA-1.5 Paper: https://arxiv.org/pdf/2501.01957 🤖 Basic Demo: https://modelscope.cn/studios/modelscope/VITA1.5_demo 🍎 VITA-1.0: https://vita-home.github.io/ 💬 WeChat: ./asset/wechat-group.jpg 📽 VITA-1.5 De…

large-multimodal-models multimodal-large-language-models omni-language-model omni-modal-video-understanding omni-model
2.4k stars
Python

llm_related

Reproductions of algorithms related to large models, plus some study notes

2.5k stars
Python

D-FINE

D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement. Blog: src/zoo/dfine/blog.md (English), src/zoo/dfine/blog_cn.md (Chinese)

d-fine detr object-detection
2.8k stars
Python

LiYing

LiYing is an automated photo processing program that handles the ID-photo post-processing workflow of a typical photo studio.

background-replacement image-compression image-cropping photo-layout photo-processing
3.0k stars
Python

NarratoAI

Uses large AI models to automatically narrate and edit videos with a single click.

aiagent aiops gemini-api llm moviepy
7.1k stars
Python