Vision Encoder/Decoder Model

RynnVLA-002: A Unified Vision-Language-Action and World Model

RynnVLA-002 is an autoregressive action world model that unifies action and image understanding and generation. RynnVLA-002 integrates a Vision-Language-Action (VLA) model (action model) and world ...

GitHub

GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model

Official implementation of 'GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model'. The paper has been accepted by ICML 2025. Pre-trained 3D vision models have gained significant attention ...

EurekAlert!

Insilico Medicine launches science MMAI gym to train frontier LLMs into pharmaceutical-grade scientific engines

New “AI GYM for Science” dramatically boosts the biological and chemical intelligence of any causal or frontier LLM, ...

Scientific Research Publishing

Geo-Refined Point Transformer: Coordinate-Aware Excitation and Positional Upsampling for 3D Scene Segmentation

The proposed Coordinate-Aware Feature Excitation (CAFE) module and Position-Aware Upsampling (Pos-Up) module both adhere to ...

New Apple model combines vision understanding and image generation with impressive results

Manzano combines visual understanding and text-to-image generation, while significantly reducing performance or quality trade-offs.

AZoRobotics on MSN

Combining AI and X-ray physics to overcome tomography data gaps

With PFITRE, Brookhaven scientists achieve breakthrough 3D imaging in nanoscale X-ray tomography, combining AI and physics for superior clarity and precision.

Tech Xplore

Novel AI method sharpens 3D X-ray vision

X-ray tomography is a powerful tool that enables scientists and engineers to peer inside of objects in 3D, including computer chips and advanced battery materials, without performing anything invasive ...

IEEE

A Novel Length Controllable Encoder-Decoder Transformer Model for Abstractive Summarization of Scientific Documents

Abstract: A growing number of scientific publications are available today. As this data grows, it becomes increasingly important to use semantic density to convey the most essential information as ...

IEEE

Attention-Based Deep Learning Method for Rotor Temperature Estimation in Permanent Magnet Synchronous Motors

Abstract: Accurate real-time temperature estimation in permanent magnet synchronous motors is critical for safe and efficient operation. This article presents an attention-based deep learning ...

EurekAlert!

Beyond bigger models: How efficient multimodal AI is redefining the future of intelligence

A generalized architectural blueprint for building efficient MLLMs. This template achieves efficiency through a combination of component choices and data flow optimization. Key strategies include: (1) ...
