(§3.1) independently, printing every shape so a student can verify that the dimensions line up with Fig. 1 of the paper.
encoder = Encoder(num_layers, d_model, num_heads, d_ff, dropout=0.0)
decoder = ...
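The shape check described above can be sketched without any deep-learning framework. The following is a minimal illustration, not the snippet's own code: it assumes the standard Transformer encoder layout (multi-head self-attention followed by a position-wise feed-forward block), and the helper name `encoder_layer_shapes` and the `batch` and `seq_len` arguments are hypothetical. It only tracks tensor shapes, so it can be compared by eye against an architecture diagram.

```python
def encoder_layer_shapes(batch, seq_len, d_model, num_heads, d_ff):
    """Return (stage, shape) pairs for one encoder layer.

    Hypothetical helper: assumes the standard multi-head attention plus
    feed-forward layout; no real tensors are created.
    """
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_k = d_model // num_heads  # per-head dimension
    return [
        ("input x", (batch, seq_len, d_model)),
        ("Q, K, V projections", (batch, seq_len, d_model)),
        ("split into heads", (batch, num_heads, seq_len, d_k)),
        ("attention scores", (batch, num_heads, seq_len, seq_len)),
        ("weighted values", (batch, num_heads, seq_len, d_k)),
        ("concatenated heads", (batch, seq_len, d_model)),
        ("feed-forward hidden", (batch, seq_len, d_ff)),
        ("layer output", (batch, seq_len, d_model)),
    ]


# Print every stage so the dimensions can be checked one by one.
for stage, shape in encoder_layer_shapes(
    batch=2, seq_len=10, d_model=512, num_heads=8, d_ff=2048
):
    print(f"{stage:<22} {shape}")
```

The input and output shapes of the layer match, which is what allows `num_layers` identical layers to be stacked.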
"""Self-decoder component of Gemma 4.