1. Direct3D-S2: a movie-grade 3D generative model trainable on just 8 GPUs, outperforming many closed-source commercial models
Direct3D-S2, an open-source 3D generative model jointly launched by DreamTech, Nanjing University, Fudan University, and the University of Oxford, has performed strongly on the Hugging Face trending list. It can be trained with only 8 GPUs, surpasses many closed-source commercial models, and achieves film-and-television-grade precision. Its core innovation, the Spatial Sparse Attention (SSA) mechanism, significantly improves generation efficiency and detail quality, easing the computational burden and complexity of traditional high-resolution 3D modeling.
In Direct3D-S2, the DreamTech team proposed a core innovation: the Spatial Sparse Attention (SSA) mechanism. SSA is designed specifically to address the low efficiency and poor precision of current Diffusion Transformers (DiT) on high-resolution 3D generation, and can be regarded as an efficiency engine for the 3D generation field.
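The details of SSA are specific to the Direct3D-S2 paper, but the general idea behind block-sparse attention can be illustrated with a toy sketch (assumptions: NumPy, a simple mean-pooled block-relevance score, and a `top_k` block budget; this is not the authors' implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def block_sparse_attention(q, k, v, block_size=4, top_k=2):
    """Toy block-sparse attention: each query block attends only to the
    top_k key/value blocks with the highest coarse similarity, instead
    of all keys. Illustrative only -- not Direct3D-S2's actual SSA."""
    n, d = q.shape
    nb = n // block_size
    qb = q.reshape(nb, block_size, d)
    kb = k.reshape(nb, block_size, d)
    vb = v.reshape(nb, block_size, d)
    # Coarse block-level relevance: mean query block vs. mean key block.
    coarse = qb.mean(axis=1) @ kb.mean(axis=1).T        # (nb, nb)
    out = np.zeros_like(qb)
    for i in range(nb):
        # Keep only the top_k most relevant key/value blocks for this query block.
        sel = np.argsort(coarse[i])[-top_k:]
        ks = kb[sel].reshape(-1, d)
        vs = vb[sel].reshape(-1, d)
        attn = softmax(qb[i] @ ks.T / np.sqrt(d))       # dense attention within the sparse subset
        out[i] = attn @ vs
    return out.reshape(n, d)
```

Since each query block touches only `top_k` of the `nb` key blocks, the attention cost drops from O(n²) toward O(n · top_k · block_size), which is what makes high-resolution token grids tractable.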
2. Neuralink and Grok collaborate to restore speech to an ALS patient via a brain-computer chip
Recently, a case shared by Musk on X showed that Neuralink and Grok are collaborating to help an ALS patient regain his voice.

Through brain-computer interface technology, the patient successfully output text using his thoughts alone; with the help of AI, sentence completion and voice cloning were performed, and he ultimately spoke in a voice close to his own. The breakthrough rests on Neuralink's brain-chip implantation technology and Grok's natural language processing capabilities.

Specifically, the patient only needs to think in order to move a cursor and generate text, while the Grok assistant automatically corrects and completes the text in an almost "mind-reading" fashion; finally, AI clones the patient's original voice, making communication more natural.

Mario Nawfal, the original source of the post Musk forwarded, had earlier explained that the patient, Bradford Smith, lost the ability to move and speak due to ALS. Neuralink enables him to generate text through thought, Grok provides the "mind-reading" automatic correction, and another AI "clones" his real voice, so that when he "speaks" he sounds like himself.
In May of this year, Neuralink's brain-computer interface device, Link, received "Breakthrough Device" designation from the US FDA, specifically aimed at helping patients with severe speech impairments restore their ability to communicate.
3. Open-source framework Rowboat: quickly build intelligent assistants, with support for MCP and the OpenAI Agents SDK
Rowboat, a Y Combinator-backed open-source multi-agent development framework, has made its debut, with support for MCP and the OpenAI Agents SDK. The framework consists of three major modules, Agents, Playground, and Copilot, which let users quickly build, test, and deploy intelligent assistants.
Agents: responsible for handling specific parts of the conversation and executing tasks with tools according to their instructions. Their highlight is that they can be configured through natural-language instructions, arranged graphically into flows between agents, and given access to tools and RAG.

Playground: an interactive environment for testing an assistant conversationally while building it. It offers real-time testing and debugging, lets you inspect the parameters and results of tool calls directly in the interface, and supports talking to an individual agent or the whole assistant.

Copilot: an AI-driven helper that creates and updates agents and tools on the user's behalf. It is aware of the context of all components, including the Playground, optimizes agents based on conversation and feedback, and understands requests made in natural language.
Users can create multi-agent systems, such as a credit card assistant, in which agents collaborate on tasks. Rowboat also provides an HTTP API and a Python SDK to fit various development scenarios. Rowboat currently has over 2,000 stars on GitHub.
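Rowboat's real HTTP API and Python SDK are documented in its repository; purely as an illustration of the multi-agent pattern it implements, here is a minimal sketch of a coordinator routing a user message to specialized agents (all names and the keyword-based routing are hypothetical, not Rowboat's actual SDK):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    """Minimal stand-in for an agent: a name, natural-language instructions,
    and a handler. Hypothetical sketch -- not Rowboat's real API."""
    name: str
    instructions: str
    handle: Callable[[str], str]

def route(agents: dict, topic: str, message: str) -> str:
    # A real coordinator (Rowboat's Copilot) would pick the agent with an LLM;
    # here we route by a topic keyword purely for illustration.
    agent = agents.get(topic)
    if agent is None:
        return "No agent available for this topic."
    return agent.handle(message)

# Example: a toy credit-card assistant composed of two cooperating agents.
agents = {
    "limits": Agent("LimitAgent", "Answer credit-limit questions",
                    lambda m: "Your credit limit is shown in the banking app."),
    "fraud": Agent("FraudAgent", "Handle fraud reports",
                   lambda m: "Your card is blocked; a replacement is on the way."),
}
```

The point of the pattern is the separation of concerns: each agent owns one slice of the conversation, and the coordinator only decides who speaks next.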
1. Real-time translation in Apple Intelligence: fully on-device, spanning multiple apps, and open to third-party developers
In Apple's newly released iOS 26, Apple Intelligence supports real-time translation across three communication apps: Phone, Messages, and FaceTime. When you receive a message in a foreign language, the system automatically translates it into yours; the feature is built into Messages, Phone, and other apps, translating text and audio in real time and helping users overcome language barriers.

Likewise, what you send is translated into the other person's language in real time, making cross-language communication smoother than ever.

Real-time translation runs entirely on-device, so your conversation content never flows anywhere unauthorized.

The real-time translation capability driven by Apple Intelligence will be opened to all third-party developers through an API, so developers can integrate real-time translation into any communication app.
Over the past year, Apple has rolled out AI features such as Genmoji and Image Playground overseas to help users express themselves more freely and playfully. However, the much-anticipated AI-powered Siri still has no concrete timeline; no date was given at this year's WWDC.
Language support is also progressing: before the end of this year, Apple Intelligence will add Danish, Dutch, Norwegian, Portuguese, Swedish, Turkish, Traditional Chinese, and Vietnamese.
Apple also announced the Foundation Models framework, a brand-new API that lets third-party developers call the large language model (LLM) at the core of Apple Intelligence and integrate it into their own applications.

Developers no longer need to build their own AI models or rely on cloud services; they can call a powerful, responsive, privacy-conscious intelligent assistant inside their apps. Better still, it does not depend on network connectivity and can run offline.
2. Talking Tours: Google releases an AI tour guide that supports real-time conversational interaction
Open the Talking Tours page and you will see an interactive map covering multiple cultural landmarks and natural landscapes around the world, divided into multiple themes: cultural institutions (museums, libraries, theaters), landmark buildings, historic sites, and natural landscapes (forests, caves, deserts, gardens, oceans).
Click a coordinate on the map to enter an immersive Street View of the corresponding location. The AI tour guide explains the location's background by voice, such as a museum's architectural style and historical anecdotes, even the design inspiration for the wallpaper in an exhibition hall.
After switching views, click the "take a snapshot" button and the AI will generate a new explanation based on the new view; from a different angle, the same location may tell a different story. You can also click the 🙋 icon to ask the AI tour guide questions.
1. Ren Zhengfei: AI may be the last technological transformation in human society
On June 10th, the front page of People's Daily reported that, at Huawei's Shenzhen headquarters, a group of People's Daily reporters recently had a face-to-face conversation with Huawei founder Ren Zhengfei on hot topics of public concern. During the exchange, Ren Zhengfei said that when facing external blockades and suppression and running into many difficulties, he firmly believed that "if you only dwell on the difficulties, you're finished; take it step by step and keep moving forward."
When asked about the future prospects of artificial intelligence (AI), Ren Zhengfei said that AI may be the last technological transformation in human society. He explained:

The development of artificial intelligence will take decades, even centuries. There is no need to worry; China also has many advantages. Ren Zhengfei further emphasized that the key to artificial intelligence lies in having sufficient electricity and a developed information network. AI development requires a secure power supply; China's power generation and grid transmission are very good, and its communication network is the most developed in the world, so the vision of the "Eastern Data, Western Computing" initiative can be realized.
In addition, Ren Zhengfei mentioned other advantages: there is no need to worry about chips, since methods such as stacking and clustering can deliver computing results comparable to the most advanced level. On the software side, thousands of open-source projects will emerge to meet the needs of the whole society. (@APPSO)
2. Former OpenAI chief scientist: AI will be able to do everything we can do
Recently, Ilya Sutskever, former chief scientist of OpenAI, returned to his alma mater, the University of Toronto, and delivered a speech upon receiving an honorary doctorate.
Ilya opened by sharing his personal mindset: accept reality, try not to regret the past, and strive to improve the present. He went on to say that everyone is living in a truly extraordinary era because of the emergence of AI.
Ilya acknowledged that today's AI has greatly changed what it means to be a "student", and its reach goes far beyond that. What AI can do is far beyond imagination, he said; our immediate challenge is how AI will affect our work and careers, but there is also a deeper one: the future development of AI will be unprecedented and extremely intense.
He also emphasized: "Anything I can learn, anything any one of you can learn, AI can learn. So why am I so confident? How do we know AI will be able to do these things? The reason is that each of our brains is a biological computer. If a biological computer can do these things, why can't a digital computer, a digital brain, do the same? That is why I believe AI can ultimately do everything we can."
What happens when AI can do all of our work? Ilya believes this question deserves serious attention. He reminded the audience: "You may not care about AI, but AI will care about you."

Therefore, Ilya suggests that in the AI era, simply by starting to use AI and observing what the most advanced systems can do today, you will gradually build an intuition. As AI continues to improve over the next one, two, or three years, that intuition will sharpen. Slowly, we will form a clearer picture of AI's development, stop fearing it, learn to steer it, and harness the power the new technology brings us.
Finally, Ilya emphasized:
The challenge brought by AI is the biggest challenge in human history. But if we respond appropriately, the rewards we receive will also be the greatest in human history.