Critique of CNN
CNN only generates one answer according to one input, it cannot resolve one-to-many and many-to-one sequence-related problems
Problems to be solved:
- Sequence input, one output (videos)
- One input (CNN), sequence output (photo to text)
- Sequence input, sequence output (translation)
- Sequence input, sequence output about the original sequence ( per-frame video classification )
[!TIP] RNNs for non-sequential tasks Actually, RNNs can perform sequential processing on non-sequential data RNNs take multiple glimpses at the image, and the position RNN is focused on depends on the information extracted from previous glimpses When generating images, RNNs can output a series of color painting over time, building up the entire image
RNNs
The whole pipeline of NLP with Encoder——Decoder Models
flowchart LD A[Input Text] -->|Tokenizer} B(Input Number Sequence) B -->|Encoder by RNNs| C(Input Vector Sequence) C --> |Decoder by RNNs| D(Output Vector Sequence) D -->|