My reading used to go roughly like this: with a paper book, I would just read, taking a phone photo of any passage that seemed especially good or important; with an e-book, I would swipe to highlight the important parts. After finishing the whole book, if it had left a deep impression, or if the mood struck, I would write up my reflections. Writing such a review is a kind of consolidation: beyond recording the key points, it also sparks some new ideas, so the books I reviewed stuck with me better than the ones I did not.
This approach has two problems, though. While reading a book, we often come up with ideas and questions every few pages or every chapter, and the method above 1) fails to capture that understanding and those ideas promptly, at a fine enough granularity, and 2) misses opportunities for deeper thinking, because the questions each part raises never get answered in time.
Recently I have been trying a new reading method. Each day, right after finishing a section, while it is still fresh, I close the book and first run a quick mental recall of what I just read, jotting down a few short notes (the notes app I use is UpNote). These can be summary points of the section or new ideas it sparked; the note granularity shrinks from "one book" to "one chapter". Then, for the questions that arose while reading or note-taking (these tend to go a level deeper and cannot be answered directly from the book), I ask GPT-4, and copy both my questions and its answers back into the notes. That completes the day's three-step routine: read, note, ask.
For example, I have recently been studying the similarities and differences between human learning and machine learning, and am reading Stanislas Dehaene's *How We Learn: Why Brains Learn Better Than Any Machine . . . for Now*. One day I got to the first part of Chapter 5, "Nurture's Share", which describes how the brain basically works: core concepts such as neurons, axons, dendrites, synapses, and the hippocampus; how information flows from the sensory nerves into the brain; how chemical and electrical signals pass back and forth and convert into each other; and how these components change during learning.
After finishing this part, I first watched a three-minute YouTube video on the makeup of the human brain, whose diagrams deepened my grasp of the brain's major components and how they relate, and then took the following notes:
- Neuron
- Axon
- Dendrite
- Nucleus
- Synapse
  - a synapse is where two neurons meet
    - one neuron's axon meets another neuron's dendrite
- Learning happens when external stimuli produce
  - changes in the synapses (analogous to the weights in an ANN, where each weight connects two neurons; see the sketch after this list)
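To make that last note concrete, here is a minimal Python sketch of the analogy (my own illustration, not from the book): a weight matrix in which each entry plays the role of one synapse connecting a sending neuron to a receiving one.

```python
import numpy as np

# Toy "network": 3 presynaptic neurons feeding 2 postsynaptic neurons.
# Each entry weights[i, j] is the analog of one synapse: the strength of
# the connection from sending neuron i to receiving neuron j.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2))

pre_activity = np.array([0.9, 0.1, 0.5])  # firing levels of the 3 senders
post_input = pre_activity @ weights       # input each receiver accumulates
print(post_input)
```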
After taking these notes, I wanted to reinforce my summary of how these major components of the brain change during learning, so I turned to GPT-4 with a question.
Question 1: explain briefly how the signal flows between neurons in the human brain through axons, dendrites and synapses, as well as what happens to these elements during the learning process
GPT-4 gave a good answer (see Answer 1 in the appendix) describing the process. Then I found I could not quite remember which part of the signal is electrical and which is chemical, and flipping back through the e-book for the answer would have taken a while, so I asked GPT-4 again:
Question 2: can you elaborate on during which stage the signal is chemical, and which stage the signal is electric?
GPT-4 gave a clear and concise answer (see Answer 2 in the appendix), and I updated my notes accordingly:
- When a stimulus arrives:
  - the signal is electrical inside a neuron and chemical when crossing between neurons
  - once the electrical signal exceeds a certain threshold, it converts into a chemical signal and passes to the neighboring neuron
- Flow path: from the current cell's axon, across a synapse, to the neighboring neuron's dendrites
  - inside the neighboring cell the chemical signal converts back into an electrical one; if that signal is strong enough to cross the threshold, it again becomes a chemical signal passed to the next neighbor (see the toy sketch after these notes)
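A toy sketch of the threshold idea in these notes (a cartoon for intuition, not a biophysical model): a neuron only passes its signal onward once the summed input crosses a firing threshold.

```python
# Cartoon of the threshold behavior: a neuron sums its (electrical) inputs
# and only "fires", releasing a (chemical) signal across the synapse to its
# neighbor, if the total crosses a threshold. Not a biophysical model.
THRESHOLD = 1.0

def fires(inputs: list[float]) -> bool:
    """True if the summed input exceeds the firing threshold."""
    return sum(inputs) > THRESHOLD

print(fires([0.3, 0.4]))  # False: below threshold, the signal dies out
print(fires([0.6, 0.7]))  # True: threshold crossed, passed to the neighbor
```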
Having learned the basic structure of the brain's neural network, I compared it with the Artificial Neural Networks (ANNs) used in deep learning, and a question came to mind: the brain's network is far more complex than an ANN; it is closer to a graph, while an ANN is a stack of layers. Why don't ANNs imitate the brain's network structure more faithfully? I asked GPT-4:
Question 3: why doesn't ANN mimic the human brain by making the structure or architecture of neurons a graph, instead of a layer by layer structure?
GPT-4's answer (see Answer 3 in the appendix) gave reasons mainly around computational simplicity (back-propagation), training efficiency, and interpretability, and also mentioned relatively more complex architectures such as Recurrent Neural Networks (RNNs) and Graph Neural Networks (GNNs).
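To illustrate the "layer by layer" point (my own sketch, not from GPT-4's answer): in a feed-forward ANN the forward pass is just a chain of matrix multiplications applied in order, which is exactly the regularity that makes back-propagation mechanical; an arbitrary graph of connections would not factor this neatly.

```python
import numpy as np

def forward(x: np.ndarray, layers: list[np.ndarray]) -> np.ndarray:
    """Feed-forward pass: apply each layer's weight matrix in order."""
    for w in layers:
        x = np.tanh(x @ w)  # one linear step plus a nonlinearity, per layer
    return x

rng = np.random.default_rng(0)
layers = [rng.normal(size=(4, 8)), rng.normal(size=(8, 3))]  # two stacked layers
print(forward(rng.normal(size=(1, 4)), layers))
```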
Following this thread, I went on to ask about the architecture of InstructGPT, and then about the attention mechanism in the Transformer (questions 4 and 5 omitted).
I then thought: synapses in the brain are the structures connecting two neurons, so aren't they the counterpart of the weights in an ANN? Suddenly I realized that the usual comparison between the parameter count of today's Large Language Models (LLMs), such as GPT-3's 175 billion, and the number of neurons in the human brain (86 billion) is flawed! It is really the synapses connecting the neurons that correspond to ANN weights, and the number of synapses (neuron-to-neuron connections) should far exceed the number of neurons. So I kept asking:
Question 6: are synapses in human brain kind of like weights in ANN? since they both carry the strength of connectivity between two neurons?
The answer was yes (see Answer 6 in the appendix).
Question 7: how many neurons and how many synapses in human brain?
GPT-4's answer (see Answer 7 in the appendix): about 86 billion neurons, and roughly 100 trillion to 1 quadrillion synapses. Sure enough, in a rough order-of-magnitude comparison between ANNs and the brain (their structures are really quite different), GPT-3's parameter count is still far smaller than the brain's. Interestingly, GPT-4's parameter count is rumored to be 100 trillion, which is right around the lower bound of the brain's synapse count.
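Here is the back-of-the-envelope arithmetic spelled out (the brain figures are GPT-4's estimates from the appendix, and the GPT-4 parameter count is only a rumor):

```python
# Rough order-of-magnitude comparison; the structures are not really comparable.
neurons_in_brain = 86e9     # ~86 billion neurons
synapses_in_brain = 100e12  # lower bound: ~100 trillion synapses
gpt3_weights = 175e9        # GPT-3: 175 billion parameters

print(f"synapses per neuron: ~{synapses_in_brain / neurons_in_brain:,.0f}")          # ~1,163
print(f"brain synapses vs GPT-3 weights: ~{synapses_in_brain / gpt3_weights:,.0f}x")  # ~571x
```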
So, as this example shows, even before finishing a single chapter of a book, this new reading method led me to a surprising amount of extended knowledge!
I have been using this method for two weeks now. Retention has improved dramatically, and reading has become more fun, because it is now an interactive process: during the reading itself I already start thinking about what is worth noting down and what questions I have. It genuinely avoids "learning without thinking is labor lost", and beyond the large gain in retention, it spawns a great deal of extended learning and thinking.
Appendix: selected GPT-4 answers
Answer 1:
In the human brain, neurons communicate with each other through a complex process involving axons, dendrites, and synapses. The flow of information can be briefly described as follows:
- Signal initiation: When a neuron receives input (from external stimuli or other neurons), it processes the information, and if the input is strong enough, it generates an electrical signal called an action potential.
- Axon: The action potential travels down the axon, a long, slender projection of the neuron that transmits the signal to the next neuron. The speed of the signal can be increased by the presence of a myelin sheath, a fatty layer that insulates the axon.
- Synapse: At the end of the axon, the signal reaches a small gap called a synapse, which separates the transmitting neuron (presynaptic neuron) from the receiving neuron (postsynaptic neuron).
- Neurotransmitters: When the action potential reaches the synapse, it causes the release of neurotransmitter molecules from the presynaptic neuron. These neurotransmitters diffuse across the synaptic cleft and bind to specific receptors on the postsynaptic neuron’s dendrites.
- Dendrites: Dendrites are the branched, tree-like structures of a neuron that receive incoming signals from other neurons. Once the neurotransmitters bind to the receptors on the dendrites, the signal is converted back into an electrical impulse and passed on to the cell body of the postsynaptic neuron.
During the learning process, the brain undergoes changes at the synaptic level in a process called synaptic plasticity. Two primary processes that contribute to learning are long-term potentiation (LTP) and long-term depression (LTD):
- LTP: Long-term potentiation strengthens the connection between two neurons. When neurons are repeatedly activated together, the efficiency of their communication increases, leading to an increased likelihood that the presynaptic neuron’s signal will elicit a response from the postsynaptic neuron.
- LTD: Long-term depression weakens the connection between two neurons. It occurs when there is a decrease in the activity of the presynaptic neuron or when the neurons are activated asynchronously. This results in a reduced likelihood that the presynaptic neuron’s signal will elicit a response from the postsynaptic neuron.
These changes in synaptic strength allow the brain to adapt and learn from new experiences, forming the basis of memory and learning.
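A toy Hebbian-style weight update is one common way to caricature the LTP/LTD idea above in code (my sketch, not part of GPT-4's answer): co-activation strengthens a synapse, and out-of-sync activity weakens it.

```python
# Toy Hebbian-style rule, a caricature of LTP/LTD (not a biological model):
# the product of pre- and postsynaptic activity signs the weight change.
def update_synapse(weight: float, pre: float, post: float, lr: float = 0.1) -> float:
    """pre/post are activity levels in [-1, 1]."""
    return weight + lr * pre * post

w = 0.5
w = update_synapse(w, pre=1.0, post=1.0)   # fired together: LTP-like increase
w = update_synapse(w, pre=1.0, post=-1.0)  # out of sync: LTD-like decrease
print(w)  # back to 0.5 in this toy example
```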
Answer 2:
Certainly! The neuronal communication process involves both electrical and chemical signals at different stages:
- Electrical signal: The communication within a single neuron primarily involves electrical signals. When a neuron receives input, it generates an electrical signal called an action potential if the input is strong enough. This action potential, or nerve impulse, travels down the axon of the neuron towards the synapse.
- Chemical signal: At the synapse, the communication between two neurons shifts from electrical to chemical signaling. When the action potential reaches the synaptic terminal at the end of the axon, it triggers the release of neurotransmitter molecules from the presynaptic neuron. These neurotransmitters diffuse across the synaptic cleft, a small gap between the neurons, and bind to specific receptors on the postsynaptic neuron’s dendrites. This process is entirely chemical in nature.
- Electrical signal: Once the neurotransmitters bind to the receptors on the dendrites of the postsynaptic neuron, the chemical signal is converted back into an electrical signal. The binding of neurotransmitters to their receptors generates a new electrical impulse, called a postsynaptic potential, which travels towards the cell body of the postsynaptic neuron. If this new electrical signal is strong enough, it can generate an action potential in the postsynaptic neuron, continuing the communication process.
In summary, the signal is electrical within individual neurons (during action potentials and postsynaptic potentials), while it is chemical when transferring information between neurons at the synapse (through neurotransmitters).
Answer 3:
Artificial neural networks (ANNs) are inspired by the human brain, but they are not exact replicas. The layer-by-layer architecture of ANNs is a simplified model that allows for efficient computation and easier training of the network. There are, however, several reasons why ANNs do not use a graph structure like the human brain:
- Computational simplicity: Layered architectures are more straightforward to implement and train. By organizing neurons into layers, it’s easier to perform computations and propagate information through the network. Additionally, the backpropagation algorithm, which is widely used for training ANNs, is well-suited for layer-by-layer structures.
- Convergence and training efficiency: In layered architectures, the training process is more likely to converge to a solution, and it generally does so more quickly than in more complex structures. Graph-based structures can create complex dependencies between nodes, which can make learning more challenging and less efficient.
- Interpretability: Layered architectures provide a certain level of interpretability, making it easier to understand how information is processed within the network. In contrast, graph-based architectures can be more challenging to interpret and visualize.
However, it’s worth noting that there are alternative neural network architectures that incorporate more complex and graph-like structures. For example:
- Recurrent Neural Networks (RNNs): RNNs include loops that allow information to persist and be fed back into the network. This architecture can model sequential data and is often used for natural language processing and time-series prediction.
- Graph Neural Networks (GNNs): GNNs explicitly model graph structures and are designed to handle data with irregular connectivity patterns. GNNs are particularly useful for tasks like social network analysis, molecular structure prediction, and recommendation systems.
While these more complex architectures can better model certain types of problems, they also come with increased computational complexity and can be more challenging to train. In many cases, the layer-by-layer structure of traditional ANNs strikes a balance between simplicity, computational efficiency, and expressive power.
Answer 6:
Yes, synapses in the human brain can be thought of as somewhat analogous to the weights in an artificial neural network (ANN). Both synapses and weights represent the strength of the connection between two neurons, influencing how signals are transmitted and processed within their respective networks.
In the human brain, synapses are the junctions where neurons communicate with each other. The strength of a synapse, also known as synaptic weight or efficacy, determines how effectively a signal from one neuron (the presynaptic neuron) will influence the activity of the connected neuron (the postsynaptic neuron). This synaptic strength can change over time through processes like long-term potentiation (LTP) and long-term depression (LTD), which are critical for learning and memory formation.
In an ANN, weights are numerical values assigned to the connections between artificial neurons (also called nodes or units). These weights determine how much influence a signal from one neuron will have on the connected neuron. During the training process, these weights are adjusted using optimization algorithms like gradient descent to minimize the error between the network’s output and the target output. This weight adjustment process is the primary mechanism through which ANNs learn and adapt to the input data.
While there are similarities between synapses in the human brain and weights in an ANN, it is essential to remember that these are still simplified abstractions. The biological processes underlying synaptic communication in the brain are much more complex and involve intricate interactions between neurotransmitters, receptors, and other molecules. Nevertheless, the concept of connection strength is a fundamental aspect of both biological and artificial neural networks, allowing them to process and transmit information in a dynamic and adaptive manner.
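For contrast with the biological side, here is a minimal sketch of the gradient-descent weight adjustment described above, for a single artificial "synapse" (again my illustration, not GPT-4's):

```python
# Minimal gradient descent on one weight: nudge w to shrink the squared error
# between the unit's output (w * x) and a target value.
w, lr = 0.0, 0.1
x, target = 2.0, 3.0  # toy input and desired output

for _ in range(20):
    output = w * x
    grad = 2 * (output - target) * x  # derivative of (w*x - target)**2 w.r.t. w
    w -= lr * grad                    # move the weight against the gradient

print(round(w, 3))  # converges toward 1.5, since 1.5 * 2.0 == 3.0
```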
Answer 7:
The human brain is an incredibly complex organ, containing a vast number of neurons and synapses.
Neurons: The number of neurons in the human brain is estimated to be around 86 billion. However, this is an approximation, as the exact number can vary between individuals.
Synapses: The number of synapses in the human brain is much larger than the number of neurons. It is estimated that there are approximately 100 trillion to 1 quadrillion synapses, allowing for extensive connectivity and communication between neurons.
These numbers highlight the immense complexity of the human brain and the intricate network of connections that enable a wide range of cognitive functions and the processing of information.