When the wave amplitudes are small, the nonlinearity is weak, and the wave periods, which are set by the linear dynamics, are much shorter than the characteristic time over which different wave modes exchange energy. In other words, weak nonlinearity produces a separation of timescales, and WT exploits this separation to describe the slowly evolving wave statistics by averaging over the fast linear oscillations.
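
The timescale separation can be written schematically as follows. This is only a sketch under stated assumptions: the complex mode amplitude $a_k$, the linear frequency $\omega_k$, the small nonlinearity parameter $\epsilon$, the nonlinear term $N_k$, and the times $T_L$ and $T_{NL}$ are illustrative symbols introduced here and are not defined in the text above.

```latex
% Assumed generic weakly nonlinear wave equation for mode k (illustrative form):
%   linear oscillation at frequency \omega_k plus a small nonlinear coupling.
i\,\dot a_k = \omega_k a_k + \epsilon\, N_k(a), \qquad \epsilon \ll 1 .

% Factor out the fast linear oscillation (interaction representation):
a_k(t) = b_k(t)\, e^{-i\omega_k t}
\quad\Longrightarrow\quad
i\,\dot b_k = \epsilon\, e^{i\omega_k t}\, N_k\!\left(b\, e^{-i\omega t}\right).

% The envelope b_k changes only because of the O(\epsilon) term, so
T_L = \frac{2\pi}{\omega_k} \;\ll\; T_{NL},
\qquad \frac{T_L}{T_{NL}} \to 0 \ \text{ as } \ \epsilon \to 0 .
```

In this form the phase factor $e^{-i\omega_k t}$ carries the fast linear oscillations, while the envelope $b_k$ carries the slow energy exchange between modes. Averaging over the fast phase leaves an evolution equation for the slow statistics of $b_k$; the precise scaling of $T_{NL}$ with $\epsilon$ depends on the type of wave interaction and is not specified here.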