Blog

Understanding the Full Picture of the Transformer in One Article (An Illustrated Transformer)

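A quick plug ☻: the content of the Zhihu column 《大模型前沿应用》 (Frontier Applications of Large Models) has been collected in the new book 《揭秘大模型:从原理到实战》 (Demystifying Large Models: From Principles to Practice). If you are interested, please consider buying a copy. Thanks for your support! ♥♥

Since Google introduced the Transformer in 2017, language models built on its architecture have sprung up one after another. BERT and T5 drew wide attention, and more recently the large models ChatGPT and LLaMA have taken the world by storm. There are already a great many articles online explaining the Transformer; this one aims to explain its technical core in plain, accessible language.

Preface

The Transformer was proposed by Google in the 2017 paper "Attention Is All You Need" for a wide range of NLP tasks, and it is now a reference model recommended for Google Cloud TPU. Many introductions to how the Transformer works are available online; in this article we simplify the model as much as possible so that general readers can follow along.

In machine translation, the Transformer translates one language into another. Treated as a black box, its structure looks like the figure below:

Figure: Translating French into English

Opening up the black box, we see that the Transformer consists of several encoders and decoders, as shown below:

Opening up the Encoder and Decoder in turn reveals the complete structure, as shown below:

Figure: Overall structure of the Transformer (from the Google paper)

The Encoder contains one Multi-Head Attention module, which is made up of several Self-Attention units, while the Decoder contains two Multi-Head Attention modules. Above each Multi-Head Attention module sits an Add & Norm layer. Add denotes a residual connection (Residual Connection), which prevents network degradation. Norm denotes Layer Normalization, which normalizes the activations of each layer.

Suppose our input contains two words. The overall structure of the Transformer then looks like this:

Figure: Overall structure of the Transformer (example with a two-word input)

To get a rough idea of the Transformer's workflow, let us walk through a simple example. Continuing with the earlier one, we translate the French sentence "Je suis étudiant" into English.

Figure: Transformer input representation
Figure: The input X passes through the Encoder, which outputs the encoding matrix C
Figure: Transformer Decoder prediction

In the figure above, the Decoder receives the Encoder's encoding matrix. It is first fed a start token "" and predicts the first word, outputting "I". It is then fed the start token "" together with the word "I" and predicts the second word, outputting "am", and so on. That is the overall workflow of the Transformer. Next we look at the details of each part.

2. Transformer input representation

In the Transformer, the input representation of each word is obtained by adding the word Embedding and the position Embedding (Positional Encoding).

Figure: Transformer input representation

2.1 Word Embedding

Word Embeddings can be obtained by pretraining with a model such as Word2vec, or by adding an Embedding layer to the Transformer.

2.2 Position Embedding

Besides the word Embedding, the Transformer also needs a position Embedding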
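The input representation and the Add & Norm step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the article's own code: it assumes the sinusoidal positional encoding from "Attention Is All You Need" (the excerpt ends before saying which form it uses), the word embeddings are made-up numbers, Layer Normalization is shown without its learnable scale and shift, and all function names are hypothetical.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sin on even dimensions, cos on odd dimensions."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share the same frequency.
            angle = pos / (10000 ** ((i - i % 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def input_representation(word_embeddings):
    """Add the position encoding to each word embedding, element by element."""
    seq_len, d_model = len(word_embeddings), len(word_embeddings[0])
    pe = positional_encoding(seq_len, d_model)
    return [[w + p for w, p in zip(w_row, p_row)]
            for w_row, p_row in zip(word_embeddings, pe)]

def layer_norm(vec, eps=1e-5):
    """Normalize one position's activations to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def add_and_norm(x, sublayer_out):
    """Residual connection (Add) followed by Layer Normalization (Norm)."""
    return [layer_norm([a + b for a, b in zip(x_row, s_row)])
            for x_row, s_row in zip(x, sublayer_out)]

# Hypothetical 4-dimensional embeddings for a two-word input.
words = [[0.1, 0.2, 0.3, 0.4],
         [0.5, 0.6, 0.7, 0.8]]
x = input_representation(words)

# Stand-in for a sublayer output: a real layer would pass in the
# Multi-Head Attention result instead of x itself.
y = add_and_norm(x, x)
```

Because the position encoding is simply added to the word embedding, both must have the same dimension, which is why `input_representation` reads `d_model` from the embeddings themselves.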

Heron Financial rolls out AI ethics committee and training program

Heron Financial takes a bold step into the future, unveiling an AI ethics committee and a pioneering training program, setting new standards in responsible AI use.

World Bank Country and Lending Groups – World Bank Data Help Desk

For the current 2026 fiscal year, low-income economies are defined as those with a GNI per capita, calculated using the World Bank Atlas method, of $1,135 or less in 2024; lower middle-income economies are those with a GNI per capita between $1,136 and $4,495; upper middle-income economies are those with a GNI per capita between
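The thresholds quoted above amount to a simple lookup, sketched here in Python. The function name is hypothetical, and because the excerpt is cut off before the upper middle-income bounds, the sketch deliberately does not split values above $4,495 any further.

```python
def classify_income_group(gni_per_capita_usd):
    """Classify an economy by 2024 GNI per capita (Atlas method, USD),
    using only the fiscal-year-2026 thresholds quoted in the excerpt."""
    if gni_per_capita_usd <= 1135:
        return "low-income"
    if gni_per_capita_usd <= 4495:
        return "lower middle-income"
    # The excerpt ends before the upper middle-income range is stated,
    # so higher values are reported without a finer split.
    return "above lower middle-income"

print(classify_income_group(900))
print(classify_income_group(2500))
print(classify_income_group(20000))
```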

Smart Tech Must-Haves!


In a world that thrives on rapid communication and instantaneous information, the right tech tools can significantly enhance our productivity and enrich our daily interactions. Enter 2025’s lineup of smart devices that are not just gadgets but powerful allies in both our personal and professional lives. From seamless translations to AI-powered recordings, these innovations are redefining convenience.

1. 2025 AI Wireless Mouse for PC Laptop
Imagine a mouse that doesn’t just help you navigate your desktop but also acts as your personal AI assistant: meet the ChatGPT Enabled Bluetooth Mouse. This sleek, ergonomic presenter doubles as a remote laser pointer and features voice recording capabilities, allowing you to capture notes without lifting a pen. It can summarize your meetings, and its USB-rechargeable battery means you’ll never be caught off guard. Say goodbye to clunky devices and hello to high-performance functionality!

2. AI Translation Earbuds
For globetrotters and business mavens alike, the Real-Time AI Translation Earbuds represent the ultimate trifecta of travel convenience. Supporting an astonishing 144 languages, these 3-in-1 translation headphones will bridge the communication gap whether you’re sipping espresso in Paris or negotiating deals in Tokyo. With a sleek charging case, they make language barriers a relic of the past. Just pop them in, and let the world talk!

3. Plaud Note AI Voice Recorder
What if we told you that capturing ideas and lectures could be as straightforward as pressing a button? The Plaud Note AI Voice Recorder is here to revolutionize your note-taking game. It transcribes and summarizes conversations in real time and supports 112 languages, making it an invaluable tool for students, professionals, or anyone who wants to ensure they never miss a detail. Plus, with 64GB of storage, you can record hours of content without worrying about running out of space.

4. rabbit r1 Voice-Activated AI Assistant Device
Last but certainly not least, the rabbit r1 is like having a personal assistant that never takes a day off. This voice-activated AI device is not only an effective voice recorder but also offers transcription, summarization, and a magic camera feature, perfect for capturing both audio and visuals effortlessly. With more than 100 language translations at your disposal and zero subscription fees, it’s a smart investment for anyone seeking to boost productivity.

As we edge further into 2025, the integration of smart technology in our daily lives is not merely advantageous; it’s essential. Each of these must-have devices offers unique features tailored to meet varied needs, whether streamlining office tasks, easing communication barriers, or simplifying note-taking. Choose the ones that resonate with your lifestyle and step into a future that’s not just smart but also seamless!

This AI Startup Is Training The Next Generation Of Accountants

In a world where AI is reshaping industries, a groundbreaking startup is harnessing its power to train the accountants of tomorrow. This innovative venture is revolutionizing the financial sector, making complex calculations and financial forecasting as easy as a click of a button. Stay tuned as we delve into how this startup is making waves in the world of finance.

Mind-Blowing | A Detailed Look at Apple's M4, M4 Pro, and M4 Max

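Last week, Apple released three Macs powered by the latest M4 chip.

The M4 is a chip designed by Apple itself, based on the ARM architecture and manufactured on TSMC's second-generation 3nm process.

In fact, it first shipped in the iPad Pro back in May 2024 and has only now made its way into the Mac.

Like the previous-generation M3, this generation uses a 3nm process. The M4's 3nm, however, is TSMC's second-generation 3nm process, which took another big stride in power efficiency and, surprisingly, did not pull a muscle doing it.

As a result, this generation of Macs is 30% faster than the last, and MacBook Pro battery life has grown substantially, reaching 24 hours, bringing us workhorses one step closer to a 7×24 work schedule.

The M4 comes in three versions: M4, M4 Pro, and M4 Max. In other words, the so-called medium, large, and extra-large cups.

The full version of the regular M4 has 10 CPU cores, 10 GPU cores, and 16 NPU cores. The 10 CPU cores comprise 4 performance cores and 6 efficiency cores.

This chip ships in the base MacBook Pro, the Mac mini, and the iMac. Note, however, that the M4 in the lowest-spec iMac is a cut-down version, with only 8 cores each for the CPU and GPU. The future entry-level M4 MacBook Air will probably get the cut-down M4 as well.

Here is the Geekbench score of the full 10-core M4:

Thanks to the process improvement, the M4's single-core score is far ahead of the M3's 3100 and is the highest on record. The multi-core score is impressive too: the 10-core M4 reaches the level of the 12-core M2 Max.

A quick word on what single-core and multi-core scores mean.

The single-core score reflects the performance of one CPU core. The higher it is, the better the chip handles single-threaded tasks and the smoother it feels in use. Much everyday work is of this kind, such as browsing the web or working on spreadsheets.

So the regular M4 is a good choice for light users. If you don't program or edit video, I think you could even wait for next year's cheaper MacBook Air.

The multi-core score reflects the combined output of all CPU cores working together. For professional users doing things like video editing or compiling code, the multi-core score matters a lot.

So, on CPU performance alone, the M4 is definitely more satisfying to use than the M2 Max.

II. Large cup: M4 Pro

Now let's look at the mid-tier M4 Pro.

The M4 Pro actually comes in two variants, 1,500 yuan apart:
12-core CPU, 16-core GPU
14-core CPU, 20-core GPU

The full version has a 14-core CPU and a 20-core GPU. That is 4 more CPU cores than the regular M4, and double the GPU cores.

The M4 Pro's memory bandwidth is high, reaching 273GB/s, a 75% increase over the M3 Pro.

High bandwidth means more data can be read from and written to memory in a short time, which is an advantage for heavy workloads such as video rendering and large-model training.

In addition, the M4 Pro supports Thunderbolt 5 ports, while the regular M4 supports only Thunderbolt 4.

Thunderbolt is Intel's technology and uses the USB Type-C connector. Thunderbolt 5 can reach transfer rates of 120 Gbit/s, which is very fast: three times that of Thunderbolt 4.

If you have Thunderbolt 5 devices, transferring data will feel great. But Thunderbolt 5 devices are very expensive. Even a single cable can cost more than a hundred yuan, so your wallet may not feel so great.

Let's look at the M4 Pro's scores:

I'll skip the single-core score, since it is the same for all M4 chips.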
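The Thunderbolt comparison above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the 120 Gbit/s figure and the "three times" ratio come from the article, the 100 GB file size is a made-up example, and the result is an ideal upper bound that ignores protocol overhead and drive speed. The function name is hypothetical.

```python
TB5_GBIT_PER_S = 120                  # rate quoted in the article
TB4_GBIT_PER_S = TB5_GBIT_PER_S / 3   # article: Thunderbolt 5 is three times Thunderbolt 4

def transfer_seconds(file_gigabytes, link_gbit_per_s):
    """Ideal transfer time: gigabytes * 8 bits per byte / link rate."""
    return file_gigabytes * 8 / link_gbit_per_s

# Hypothetical 100 GB video project.
tb5_time = transfer_seconds(100, TB5_GBIT_PER_S)
tb4_time = transfer_seconds(100, TB4_GBIT_PER_S)
print(f"Thunderbolt 5: {tb5_time:.1f} s, Thunderbolt 4: {tb4_time:.1f} s")
```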

100 Cool Tech Gadgets We Recommend in 2024


1. A Beautiful Smart Wallet: Ekster RFID Blocking Leather Wallet
The Ekster Parliament is a smart bifold wallet with RFID coating (to protect against identity theft) and a patented mechanism that ejects cards from its aluminum storage pocket with the press of a button. It has space for at least ten cards, as well as


Prompt Caching with Claude 3.5 Sonnet

Introducing prompt caching with Claude 3.5 Sonnet. Prompt caching lets repeated prompt content be reused across requests instead of being reprocessed each time, which reduces latency and cost for prompts that share a long common prefix.

