The former will show up among the top recommended items, spoiling the results. In other words, RMSE doesn't tell the true story, and we need metrics specifically crafted for ranking.
The two most popular ranking metrics are MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain).
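As a concrete illustration, here is a minimal sketch of both metrics at a cutoff k. The function names and signatures are illustrative, not from any particular library: AP@k averages precision at each relevant hit, while NDCG@k discounts graded relevance scores logarithmically by position and normalizes by the ideal ordering.

```python
import math

def average_precision(relevant, ranked, k=10):
    """AP@k for binary relevance: average of precision at each hit position."""
    hits, score = 0, 0.0
    for i, item in enumerate(ranked[:k]):
        if item in relevant:
            hits += 1
            score += hits / (i + 1)  # precision at this rank
    return score / min(len(relevant), k) if relevant else 0.0

def ndcg(relevances, k=10):
    """NDCG@k for graded relevance scores, listed in ranked order."""
    def dcg(scores):
        # each score is discounted by log2 of its (1-based) rank + 1
        return sum(s / math.log2(i + 2) for i, s in enumerate(scores[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Note that `ndcg` accepts any non-negative relevance scale, whereas `average_precision` only asks whether an item is in the relevant set, which mirrors the binary-versus-graded distinction between the two metrics.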
MAP is a metric for binary feedback only, while NDCG can be used whenever you can assign a relevance score (binary, integer, or real) to a recommended item.

Items with few ratings contribute little to the loss, so predictions for them will be off: some will get scores much higher, and some much lower, than the actual values.

We can divide users (and items) into two groups: those present in the training set and those not. Our working assumption is a pre-trained model used without updates, so we need a way to account for previously unseen users. Some algorithms are better suited to this scenario than others. This method fits the model by keeping user factors fixed while adjusting item factors, and then keeping item factors fixed while adjusting user factors. At test time, when we get input from a new user, we can keep the item factors fixed, fit that user's factors, and then proceed with recommendations.
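The test-time step for a new user amounts to one half of an alternating update: with item factors frozen, the user's factors are the solution of a small regularized least-squares problem. The sketch below assumes a matrix-factorization model with latent factors; the function name and the `reg` parameter are illustrative, not from the original text.

```python
import numpy as np

def fold_in_user(item_factors, rated_ids, ratings, reg=0.1):
    """Fit a new user's latent factors with item factors held fixed.

    item_factors : (n_items, k) array of trained item factors
    rated_ids    : indices of the items the new user has rated
    ratings      : the corresponding rating values
    """
    Q = item_factors[rated_ids]          # factors of the rated items only
    r = np.asarray(ratings, dtype=float)
    k = item_factors.shape[1]
    # Solve (Q^T Q + reg * I) p = Q^T r  -- a ridge-regularized least squares
    p = np.linalg.solve(Q.T @ Q + reg * np.eye(k), Q.T @ r)
    return p

# Usage: score every item for the new user and rank by predicted rating.
# user_p = fold_in_user(item_factors, rated_ids, ratings)
# order = np.argsort(-(item_factors @ user_p))
```

The regularization term keeps the solve well-posed when the user has rated fewer items than there are latent dimensions, which is exactly the cold-ish-start situation this fold-in is meant for.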