Facebook Responds To Reports Of Indian Workers Labelling Private User Data


SUMMARY

Wipro is one of the companies contracted to label user-generated content

Facebook said it prioritises the privacy of users even when labelling content

As many as 260 third-party workers scan through user content on Facebook

After the Cambridge Analytica scandal and numerous allegations of ignoring user privacy and rampant data collection, Facebook is once again caught in a controversy. This time around, the problem is due to the company not informing users about what it actually does with their posts, videos and photos.

Facebook has long admitted that it analyses and processes user content to improve the main feed and the kind of posts that users see. A new report this week says that it is not only machines reading your posts.

According to a Reuters report, at least 260 third-party workers scan through user content on Facebook, and that’s just from one company in one location in India. These workers label images, status updates and other posts to understand the context of the post and surface related ads and improve the main News Feed algorithm.

The social network also gets these workers to label private posts, according to the report. Labelling, or data annotation, is one of the fastest-growing industries as companies race to train AI and machine learning systems. Employees label images, text, logos, and symbols to help computers understand the context and contents of an image or text. This is then used to develop consumer-facing AI features such as text OCR or object recognition inside camera apps.

Facebook’s Year of Controversies

Facebook has not yet recovered from the Cambridge Analytica scandal, which broke out in early 2018 but had been brewing in the background since mid-2017.

Before 2016, British marketing analytics firm Cambridge Analytica allegedly mined data without user consent through an innocuous Facebook app, which was subsequently used by Russia’s Internet Research Agency (IRA) and the 2016 presidential campaign of Donald Trump to run ads, promote disinformation, and spread fake news and misleading content. Trump’s Facebook campaigns proved instrumental in leading him to the White House.

Many aspects of Facebook’s involvement with Cambridge Analytica are still under investigation. Facebook and CEO Mark Zuckerberg were grilled by lawmakers in the US and UK over their involvement with the British company.

Facebook has tried everything to recover from the PR nightmare that was the Cambridge Analytica scandal. Zuckerberg, at the F8 conference last week, said that Facebook is going to emphasise privacy in its features and services from this year onwards. However, the Reuters report suggests the company has not revealed the full scale of what it does with user data.

How Is Facebook User Data Being Labelled?

The report says that Indian company Wipro has a contract with Facebook for the labelling operations. Wipro runs this operation from its Hyderabad office and 260 ‘workers’ have been hired by Facebook, through Wipro, to manually tag or label photos, posts, links shared on timeline, stories and videos into several category items according to the ‘five dimensions’ that Facebook considers key for its AI datasets.

The datasets are used to train the AI and machine learning algorithms on Facebook’s platform to improve content and ad suggestions. However, Facebook is running human labelling with no consent from the user. Although it may fall in a legal grey area, annotation/labelling of personal posts on Facebook is certainly unethical on the grounds that a user has no idea their private posts are being read by other humans, who may also be Facebook users.

A Facebook spokesperson told Inc42 that “We’re building AI systems that help people across a variety of Facebook products, from reducing policy violating content to helping people with visual impairments connect better with their friends and family. Labelled data is important to train the models that make this possible.”

What does Facebook’s Privacy Policy Say?

Facebook uses the information any user uploads to personalise features, content, and to improve suggestions and recommendations. Its privacy policy states that “[Facebook uses the information a user provides] to create personalized products that are unique and relevant to you, we use your connections, preferences, interests and activities based on the data we collect and learn from you and others.”

Further, on who has access to this data, it says “We provide information and content to vendors and service providers who support our business, such as by providing technical infrastructure services, analyzing how our products are used, providing customer service, facilitating payments or conducting surveys.”

To clarify to what extent data is shared with the likes of Wipro, the Facebook spokesperson further told Inc42, “We treat the privacy of our users with utmost importance when labelling content for the purposes of improving the user experience for all that use our products.”

What Is Data Annotation?

As more and more companies embrace AI, the data which goes into teaching the AI system is increasingly becoming proprietary. In such a scenario, the need for such ‘data annotation’ companies is only set to increase with time.

One such company is iMerit, a data-training startup which counts eBay, Getty Images and Microsoft as its clients. Over 1,400 employees working at iMerit around the world are trained to label photos on behalf of the clients in a way which eliminates bias. 90% of iMerit’s clients are US-based. Another company, Kerala-based Infolks, which started just three years ago with an investment of INR 25K, now has enough cash to employ 200 people.

The industry, still in its early stage, is creating many employment and semi-employment opportunities in tier 2 and tier 3 cities, as most of the annotation/labelling work is outsourced by major companies developing AI.

In March, Sangeeta Gupta, senior vice president and chief strategy officer at tech industry body Nasscom, said, “This is an emerging sector… in India and everybody has begun to realise the humongous opportunity it presents.”

Commenting on the huge opportunity annotation presents in India, she further added, “AI requires properly annotated, classified and anonymised data. For this, whether you like it or not, you will use automation but you will also have to use skilled human workforce, and that is the opportunity it presents for India.”

(This story has been written by Ankur Bhardwaj and Nikhil Subramanium)
