Users generate vast amounts of data on social media platforms, and some of it is harmful. Such text appears in forums, online discussions, and other online communication channels, which makes it difficult to filter out the information that is actually meaningful. Detecting harmful content enables online moderation, helping to maintain safe discussions and prevent problems such as cyberbullying. Using the Kaggle Jigsaw dataset of comments annotated with toxicity labels, we implement deep learning models that implicitly extract textual features from the comments to solve this supervised learning problem. This paper applies several variants of recurrent neural networks, with a primary focus on bidirectional gated recurrent units, and evaluates their performance against one another. © 2020, Springer Nature Singapore Pte Ltd.
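To make the approach concrete, the following is a minimal NumPy sketch of a bidirectional GRU classifier of the kind described above, not the authors' implementation: a forward and a backward GRU read an embedded comment, their final hidden states are concatenated, and a sigmoid output layer produces independent probabilities for the six Jigsaw toxicity labels. All weights are random and the input is a stand-in for embedded tokens; dimensions, variable names, and the gate formulation (Cho et al.'s GRU equations) are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRU:
    """Minimal single-direction GRU (Cho et al. gate equations), random weights."""
    def __init__(self, input_dim, hidden_dim, rng):
        def mat(rows, cols):
            return rng.standard_normal((rows, cols)) / np.sqrt(cols)
        self.Wz, self.Uz = mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim)
        self.Wr, self.Ur = mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim)
        self.Wh, self.Uh = mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def run(self, xs):
        """Process a sequence xs of shape (seq_len, input_dim); return final state."""
        h = np.zeros(self.hidden_dim)
        for x in xs:
            z = sigmoid(self.Wz @ x + self.Uz @ h)              # update gate
            r = sigmoid(self.Wr @ x + self.Ur @ h)              # reset gate
            h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h))  # candidate state
            h = (1.0 - z) * h + z * h_tilde
        return h

def bigru_classify(xs, fwd, bwd, W_out, b_out):
    # Concatenate final states of the forward pass and the pass over the
    # reversed sequence, then apply a sigmoid layer: one independent
    # probability per toxicity label (multi-label classification).
    h = np.concatenate([fwd.run(xs), bwd.run(xs[::-1])])
    return sigmoid(W_out @ h + b_out)

rng = np.random.default_rng(0)
EMB, HID, LABELS = 8, 16, 6  # toy sizes; the Jigsaw task has 6 toxicity labels
fwd, bwd = GRU(EMB, HID, rng), GRU(EMB, HID, rng)
W_out = rng.standard_normal((LABELS, 2 * HID)) / np.sqrt(2 * HID)
b_out = np.zeros(LABELS)

comment = rng.standard_normal((12, EMB))  # stand-in for 12 embedded tokens
probs = bigru_classify(comment, fwd, bwd, W_out, b_out)
```

In a trained model the embeddings and weight matrices would be learned from the labeled comments (e.g. by minimizing binary cross-entropy over the six labels); here they only demonstrate the data flow through the bidirectional recurrence.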