Brianna White

Administrator
Staff member
Jul 30, 2019
Data labelling is a key process in machine learning. It facilitates the training of machine learning models and accelerates the development of artificial intelligence. Data annotation is frequently outsourced to data labelling firms, which annotate images, video, audio, and text. In addition to providing outsourced data annotation services, data labelling companies have also partnered with firms to enable research and innovation in data annotation and AI. This article presents the top five data labelling projects of 2021.
Scale AI and Oxford University’s Reddit Data Set 
Scale AI, a data annotation platform, has collaborated with Oxford University to build a comprehensive dataset on online debates and discourse. Natural language processing is still at an early stage, and NLP models often struggle to understand the context of online exchanges. For example, out of the box they tend to mishandle slang, sarcasm, context-specific jokes, and the wide variety of online interactions.
Scale AI and Oxford University created a dataset, ‘Debagreement’, containing comment-reply interactions across five subreddits: Democrats, Republicans, Black Lives Matter, Brexit, and Climate. Each comment-reply interaction is annotated as “agree,” “disagree,” “neutral,” or “unsure” by at least three raters, so that ML models trained on it can learn to detect the stance Redditors take in online discourse. The collaborative project has been viewed as a first step towards training socially aware language models.
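To make the annotation scheme a little more concrete, here is a minimal Python sketch of how one comment-reply interaction with several rater labels might be represented and reduced to a single label. The article does not say how the raters' labels are combined or what the dataset's fields are called, so the `Interaction` class, its field names, the `majority_label` helper, and the choice to fall back to "unsure" on a tie are all assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

# The four labels named in the article.
LABELS = {"agree", "disagree", "neutral", "unsure"}


@dataclass
class Interaction:
    """One comment-reply pair (hypothetical schema, not the dataset's own)."""
    subreddit: str
    comment: str
    reply: str
    rater_labels: List[str]


def majority_label(interaction: Interaction, min_raters: int = 3) -> str:
    """Return the most common rater label.

    Assumption: ties between the top labels are reported as "unsure".
    """
    labels = interaction.rater_labels
    if len(labels) < min_raters:
        raise ValueError(f"expected at least {min_raters} rater labels")
    unknown = set(labels) - LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    ranked = Counter(labels).most_common()
    top_label, top_count = ranked[0]
    # If the runner-up has the same count, there is no clear majority.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return "unsure"
    return top_label


example = Interaction(
    subreddit="Brexit",
    comment="(parent comment text)",
    reply="(reply text)",
    rater_labels=["disagree", "disagree", "neutral"],
)
print(majority_label(example))
```

A simple majority vote is only one possible way to merge rater labels; a real pipeline might weight raters differently or keep the full label distribution instead.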
Continue reading: https://analyticsindiamag.com/5-data-labelling-projects-that-impacted-the-ai-industry-the-most/
 
