That R at the start effectively allows only rows where Y is measured to have non-zero contributions to the estimating equations.
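To make the role of R concrete, here is a minimal sketch in Python. It assumes a linear outcome model purely for illustration; the function name `ee_outcome_model` and the toy data are hypothetical and do not come from the original analysis.

```python
import numpy as np

def ee_outcome_model(beta, X, y, r):
    """Per-row estimating-function contributions for a linear model of Y,
    multiplied by the observation indicator r (1 = Y measured, 0 = missing).
    Hypothetical helper for illustration only."""
    # Swap missing outcomes for a placeholder so NaN does not propagate.
    # Those rows are multiplied by r = 0 below, so the placeholder never matters.
    y_filled = np.where(r == 1, y, 0.0)
    resid = y_filled - X @ beta
    # n x p matrix: rows with r = 0 contribute exactly zero.
    return (r * resid)[:, None] * X

# Toy data (assumed): intercept plus one covariate, two outcomes missing.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, np.nan, 5.0, np.nan])
r = np.array([1, 0, 1, 0])

contrib = ee_outcome_model(np.array([0.5, 1.0]), X, y, r)
print(contrib)
```

Only the first and third rows, where Y is measured, have non-zero entries; the rows with missing Y drop out of the sum without being deleted from the data set.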
This covers Step 1.
Comments
The reason is that the M-estimator also gives us a way to estimate the variance directly. We can simply use the sandwich variance.
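As a rough illustration of the empirical sandwich variance, the sketch below computes the bread as the negative average derivative of the estimating functions (by central differences) and the meat as the average outer product of the contributions, then combines them as bread⁻¹ · meat · bread⁻ᵀ / n. The helper `sandwich_variance` and the toy estimating function for a mean are assumptions made for this example, not the code used in the original analysis.

```python
import numpy as np

def sandwich_variance(psi, theta_hat, eps=1e-6):
    """Empirical sandwich variance for an M-estimator.
    psi(theta) must return an n x p array of per-row contributions.
    Hypothetical helper for illustration only."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    contrib = psi(theta_hat)
    n, p = contrib.shape

    # Bread: negative average derivative of the estimating functions,
    # approximated column by column with central differences.
    bread = np.zeros((p, p))
    for j in range(p):
        step = np.zeros(p)
        step[j] = eps
        diff = psi(theta_hat + step).sum(axis=0) - psi(theta_hat - step).sum(axis=0)
        bread[:, j] = -diff / (2 * eps) / n

    # Meat: average outer product of the per-row contributions.
    meat = contrib.T @ contrib / n

    bread_inv = np.linalg.inv(bread)
    return bread_inv @ meat @ bread_inv.T / n

# Toy example (assumed): the mean, with estimating function psi = y - mu.
y = np.array([1.0, 2.0, 4.0, 7.0])
psi_mean = lambda theta: (y - theta[0])[:, None]

var_hat = sandwich_variance(psi_mean, [y.mean()])
print(var_hat)
```

For the mean, this reduces to the (population-style) sample variance divided by n, which is a convenient sanity check before applying the same function to stacked estimating equations.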