Industry Insights

Top AI Apps Training on Your Data - June 2024

July 9, 2024

Last month, we published our first “Top AI Apps Training on Your Data” blog, which focused on 10 GenAI-enabled apps whose content declarations indicated they might be training on customer data. 

This topic continued to garner attention in June. Despite many calls to cease training on customer data, Meta announced plans to continue training their foundation models on customer data. While this excludes users under 18 and offers an opt-out for EU citizens, it’s another example of data-hungry SaaS providers scrambling to find datasets to train their models. 

Figma, too, has recently announced AI capabilities that will soon begin training on user data unless users opt out (keep reading for some handy links to opt-out forms!).

Given this continued attention, we decided to revisit the subject and analyze GenAI traffic trends for June.

GenAI versus GenAI-Enabled

When we speak about employees using “GenAI”, we immediately jump to Microsoft Copilot and ChatGPT. However, with every SaaS company now adding GenAI capabilities, the scope of GenAI has exploded. 


To this end, our previous blog focused solely on GenAI-enabled applications that train on customer data, such as Grammarly and DocuSign. This time, we’ve opted to include GenAI apps themselves as well. 

Quick Overview: Top 5 Used GenAI Apps, June 2024

This month, we analyzed the top web-based GenAI apps that have content training declarations. As you might expect, ChatGPT dominated, followed by Perplexity and Google Gemini.

  1. ChatGPT 
  2. Perplexity 
  3. Google Gemini 
  4. Quillbot 
  5. Claude

If you want to drill into the differences between ChatGPT and Gemini privacy policies, you can check out our blog from a few months back: https://www.harmonic.security/blog-posts/gemini-vs-chatgpt-comparing-data-privacy-policies

Quick Overview: Top 10 Used GenAI-Enabled Apps, June 2024

Let’s now turn our focus to GenAI-enabled applications (side note: are there any SaaS apps left not incorporating AI?!). As a quick recap of May’s activity, check out our previous blog: https://www.harmonic.security/blog-posts/top-ai-apps-training-on-your-data---may-2024

Again, we’ve ordered them by usage. 

  1. LinkedIn. Minimizes personal data in training datasets, using privacy-enhancing technologies. More details.
  2. ADP. Minimizes access to identifiable information, using only the personal data needed to generate insights.
  3. Yelp. AI content is trained on platform content and third-party services, aiming to improve user experience. More details.
  4. Drift. Uses personal information for service improvement and development, often anonymized. More details.
  5. Squarespace 
  6. Calendly. Performs research and analysis on user interactions to improve products, with vague specifics on data use. More details.
  7. BambooHR. Uses performance data and feedback for AI development, specifically for providing AI features to the same customer. More details.
  8. DocuSign. Implements role-based access controls and security measures to minimize privacy impacts, but does allow for the training of AI models on personal information if customer consent is given. More details.
  9. Kayak. May process user input and the generated recommendations for quality and product improvement purposes. More details.
  10. Yellow.ai. Uses customer data to improve and enhance the operation of the yellow.ai services. More details.

Best Practices for Organizations

When it comes to data privacy and AI, simply being aware of the risks isn't enough. Organizations need to take concrete steps to protect their valuable data assets. Here are some key best practices to implement:

  1. Regular Audits: Conduct regular audits of the apps used within your organization to understand their data practices.
  2. Clear Policies: Develop and enforce clear data usage and AI policies. You can use our free policy generator to create one! <Link>
  3. User Training: Educate employees about the risks and best practices for using AI tools safely.
  4. Opt-Out: Where possible, opt out of content training to avoid exposing your intellectual property.
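As a minimal illustration of the first practice, auditing which GenAI apps are in use, here is a hedged Python sketch that tallies visits to GenAI domains from a web proxy log. The domain list, the log format (a CSV with timestamp, user, and domain columns), and the function name are assumptions for illustration; adapt them to your own logging setup and app inventory.

```python
import csv
from collections import Counter
from io import StringIO

# Example set of GenAI app domains to flag; tailor this to your own inventory.
GENAI_DOMAINS = {
    "chat.openai.com",
    "www.perplexity.ai",
    "gemini.google.com",
    "quillbot.com",
    "claude.ai",
}

def audit_genai_usage(log_file):
    """Count visits to known GenAI domains in a proxy log
    with columns: timestamp, user, domain."""
    hits = Counter()
    for row in csv.DictReader(log_file):
        if row["domain"] in GENAI_DOMAINS:
            hits[row["domain"]] += 1
    return hits

# Usage with an in-memory log snippet standing in for a real export:
sample_log = StringIO(
    "timestamp,user,domain\n"
    "2024-06-03T09:14:00,alice,chat.openai.com\n"
    "2024-06-03T09:20:00,bob,www.perplexity.ai\n"
    "2024-06-03T10:02:00,alice,chat.openai.com\n"
)
print(audit_genai_usage(sample_log))
```

A recurring report built on something like this gives you the evidence base for the other three practices: you know which policies to write, which teams to train, and which opt-out forms to prioritize.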

Opting Out of Content Training

In some instances, companies provide options to opt out of their training. As you might expect, some make this easier than others, and some have no options whatsoever.

Here’s a list of some helpful links to guide you!

OpenAI Privacy Center: https://privacy.openai.com/policies?modal=take-control

Meta (EU Opt-Out): https://www.facebook.com/help/contact/1500272623934875 

Grammarly: https://account.grammarly.com/security/privacy

Gemini: https://myactivity.google.com/product/gemini?utm_source=gemini 

Perplexity: https://www.perplexity.ai/settings/account 

Figma (steps): https://help.figma.com/hc/en-us/articles/17725942479127-Control-AI-features-and-content-training-settings 

X (steps): https://help.x.com/en/using-x/about-grok

More options are listed in this helpful article: https://www.linkedin.com/pulse/regain-control-your-data-how-opt-out-ai-training-jenish-pithadiya-v5tpc/

Conclusion

As AI continues to evolve, so do the challenges it brings. By staying informed and adopting smart, data-centric security strategies, organizations can harness the power of GenAI while safeguarding their data.

If you’re concerned about how GenAI-enabled apps are handling your data, reach out to Harmonic Security. We’re here to help you navigate this new frontier with confidence.

Request a demo

Team Harmonic