Creative General Intelligence
Exploring the boundaries of machine creativity: starting from an understanding of human creation, we aim to give AI top-level design and aesthetic reasoning capabilities.
We aim to move past the limitation of traditional AI, which can generate content mechanically but does not understand the creative process. This direction studies human intent, process, and cognitive mechanisms in artistic creation, and it also works to give AI a "Design Thinking" capability comparable to that of a human director or designer.

1. Cross-modal Aesthetic Alignment: Outstanding artworks stem from a holistic imagination of sound, visuals, and narrative. Our core research focuses on the deep alignment of cross-modal design elements such as text, music, color, emotion, and motion. The goal is to give AI top-level aesthetic perception and the ability to conceive a work as a whole, so that it can not only generate assets but also understand the aesthetic logic behind a creation.

2. Interpretability and Lightweighting of Large Models: We analyze the inner workings of large generative models at three scales: the micro level (neuron activation representations), the meso level (coordination among sub-networks and attention mechanisms), and the macro level (overall generalization and emergent capabilities). By uncovering the regularities inside these "black boxes," we guide algorithmic optimization and architectural simplification, aiming for efficient, lightweight computation that does more with less.